KATIE: A System for Key Attributes Identification in Product Knowledge Graph Construction

Btissam Er-Rahmadi, Arturo Oncevay, Yuanyi Ji, Jeff Z. Pan

Research output: Chapter in book/report/conference proceeding, Conference contribution, peer-reviewed

Abstract

We present part of Huawei's efforts in building a Product Knowledge Graph (PKG). We want to identify which product attributes (i.e., properties) are relevant and important to shopping decisions for each product category (i.e., class). This is particularly challenging when the attributes and their values are mined from online product catalogues, i.e., HTML pages. These web pages contain semi-structured data that do not follow a common format and use diverse vocabulary to designate the same features. We propose a system for key attribute identification (KATIE) based on fine-tuning pre-trained models (e.g., DistilBERT) to predict the applicability and importance of an attribute to a category. We also propose an attribute synonym identification module that discovers synonymous attributes by considering not only the similarity of their labels but also the similarity of their value sets. We evaluated our approach on Huawei's category taxonomy and a set of attributes mined internally from web pages. KATIE achieves promising performance compared with the most recent baselines.
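As an illustration of the synonym identification idea only, the following minimal sketch combines a label similarity with a value-set similarity. It is not the authors' implementation: the function names, the character-level label comparison, the Jaccard overlap, and the `label_weight` parameter are all assumptions chosen for the example, since the abstract does not specify how either similarity is computed.

```python
from difflib import SequenceMatcher


def label_similarity(a: str, b: str) -> float:
    """Character-level similarity of two normalised labels, in [0, 1] (assumed measure)."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def value_set_similarity(a: set, b: set) -> float:
    """Jaccard overlap of two attribute value sets, in [0, 1] (assumed measure)."""
    if not a and not b:
        return 0.0  # no evidence either way when both sets are empty
    return len(a & b) / len(a | b)


def synonym_score(label_a, values_a, label_b, values_b, label_weight=0.5):
    """Weighted mix of label and value-set similarity (hypothetical scoring rule)."""
    return (label_weight * label_similarity(label_a, label_b)
            + (1.0 - label_weight) * value_set_similarity(values_a, values_b))


# Hypothetical attributes mined from two product pages.
colour = ("Colour", {"black", "white", "blue"})
color = ("Color", {"black", "white", "red"})
weight = ("Weight", {"1 kg", "2 kg"})

likely_synonyms = synonym_score(*colour, *color)
unlikely_synonyms = synonym_score(*colour, *weight)
```

With these toy inputs, "Colour" and "Color" score higher than "Colour" and "Weight", because both the labels and the value sets overlap. Any threshold for declaring two attributes synonymous would need to be tuned on real data.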

Original language: English
Title of host publication: SIGIR 2023 - Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 3320-3324
Number of pages: 5
ISBN (electronic): 9781450394086
DOI
Status: Published - 19 Jul 2023
Published externally
Event: 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2023 - Taipei, Taiwan
Duration: 23 Jul 2023 to 27 Jul 2023

Publication series

Name: SIGIR 2023 - Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference: 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2023
Country/Territory: Taiwan
City: Taipei
Period: 23/07/23 to 27/07/23
