An Unsupervised Learning Approach for Automatically to Categorize Potential Suicide Messages in Social Media

Jorge Parraga-Alava, Roberto Wellington Acuña Caicedo, Jose Manuel Gomez, Mario Inostroza-Ponta

Research output: Chapter in book/report/conference proceeding › Conference contribution › Peer-reviewed

17 Citations (Scopus)


In this paper, we present an approach based on unsupervised learning to categorize potential suicide messages in social media. Our approach has five phases. The first two correspond to data acquisition and pre-processing, in which texts available in a corpus for suicide detection were taken and converted into a structured format. In the third phase, similarity between texts was computed using semantic similarity measures. In the fourth phase, traditional clustering algorithms were used to identify categories of potential suicide messages. In the last phase, we used validation metrics to verify how well our approach replicates the allocation of texts into categories found in the original corpus data. Computational results showed that our approach is able to replicate the grouping of messages labeled as 'No risk' and 'Risk' at average rates of 79 % and 87 %, and at rates of up to 13 % and 9 % in alert levels, for English and Spanish, respectively.
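The pipeline outlined in the abstract (pre-process, compute pairwise similarity, cluster, validate against the corpus labels) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the specific similarity measures, clustering algorithms, or validation metrics, so lexical Jaccard similarity stands in for the semantic measures, average-linkage agglomerative clustering stands in for the "traditional clustering algorithms", and cluster purity stands in for the validation metrics. All function names and the toy messages are hypothetical.

```python
import re
from itertools import combinations


def preprocess(text):
    """Phase 2 (sketch): lowercase the text and reduce it to a set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Phase 3 stand-in: lexical Jaccard similarity (an assumption, NOT a semantic measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def average_linkage(texts, k):
    """Phase 4 (sketch): merge the two most similar clusters until only k remain."""
    tokens = [preprocess(t) for t in texts]
    clusters = [[i] for i in range(len(texts))]
    while len(clusters) > k:
        best = None
        for x, y in combinations(range(len(clusters)), 2):
            sims = [jaccard(tokens[i], tokens[j])
                    for i in clusters[x] for j in clusters[y]]
            score = sum(sims) / len(sims)  # average pairwise similarity
            if best is None or score > best[0]:
                best = (score, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected
    return clusters


def purity(clusters, labels):
    """Phase 5 (sketch): share of items that carry their cluster's majority label."""
    correct = 0
    for cluster in clusters:
        counts = {}
        for i in cluster:
            counts[labels[i]] = counts.get(labels[i], 0) + 1
        correct += max(counts.values())
    return correct / len(labels)


# Toy data invented for illustration only; not taken from the corpus.
texts = [
    "I feel hopeless and alone",
    "so hopeless and alone tonight",
    "I feel alone and hopeless again",
    "great football match today",
    "watching the football match today",
    "what a great match today",
]
labels = ["Risk", "Risk", "Risk", "No risk", "No risk", "No risk"]

clusters = average_linkage(texts, k=2)
score = purity(clusters, labels)
print(clusters, score)
```

On this toy input the two groups share no tokens, so the clustering separates them cleanly. Real social media text would call for a genuine semantic similarity measure in place of the Jaccard stand-in, since messages with similar meaning often share few exact words.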
Original language: Spanish
Title of host publication: Proceedings - International Conference of the Chilean Computer Science Society, SCCC
Status: Published - 1 Nov 2019
Externally published
