An Unsupervised Learning Approach for Automatically to Categorize Potential Suicide Messages in Social Media

Jorge Parraga-Alava, Roberto Wellington Acuña Caicedo, Jose Manuel Gomez, Mario Inostroza-Ponta

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

18 Scopus citations

Abstract

In this paper, we present an approach based on unsupervised learning to categorize potential suicide messages in social media. Our approach has five phases: the first two correspond to data acquisition and pre-processing, where texts available in a corpus for suicide detection were taken and converted into a structured format; in the third phase, the similarity between texts is computed using semantic similarity measures; in the fourth phase, traditional clustering algorithms are used to identify categories of potential suicide messages; and, in the last phase, validation metrics are used to verify how well our approach replicates the allocation of texts into categories found in the original corpus data. Computational results showed that our approach is able to replicate the grouping of messages labeled as 'No risk' and 'Risk' at average rates of 79 % and 87 %, and at rates of up to 13 % and 9 % in alert levels, for English and Spanish, respectively.
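
The five phases can be illustrated with a minimal sketch. It is not the authors' implementation, and every concrete choice in it is an assumption made for illustration: the toy messages and their labels are invented, token-level Jaccard overlap stands in for the semantic similarity measures, average-linkage agglomerative clustering stands in for the traditional clustering algorithms, and cluster purity stands in for the validation metrics. The function names (`preprocess`, `jaccard`, `cluster`, `purity`) are hypothetical.

```python
import re
from itertools import combinations


def preprocess(text):
    """Phase 2 (assumed form): lowercase a message and reduce it to a set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Phase 3 stand-in: token-overlap similarity in [0, 1], not a true semantic measure."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(token_sets, k=2):
    """Phase 4 stand-in: average-linkage agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(token_sets))]
    while len(clusters) > k:
        best = None
        for x, y in combinations(range(len(clusters)), 2):
            sims = [jaccard(token_sets[i], token_sets[j])
                    for i in clusters[x] for j in clusters[y]]
            score = sum(sims) / len(sims)
            if best is None or score > best[0]:
                best = (score, x, y)
        _, x, y = best
        # x < y always holds here, so deleting y does not shift index x.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def purity(clusters, labels):
    """Phase 5 stand-in: share of messages that carry their cluster's majority label."""
    correct = 0
    for members in clusters:
        counts = {}
        for i in members:
            counts[labels[i]] = counts.get(labels[i], 0) + 1
        correct += max(counts.values())
    return correct / len(labels)


# Phase 1 stand-in: a tiny invented corpus with hypothetical labels.
messages = [
    "great day at the beach with friends",
    "had a great day with friends at the park",
    "i feel hopeless and alone tonight",
    "feeling so hopeless and alone again",
]
labels = ["No risk", "No risk", "Risk", "Risk"]

token_sets = [preprocess(m) for m in messages]
clusters = cluster(token_sets, k=2)
score = purity(clusters, labels)
print(clusters, score)
```

Purity is only one way to check whether the clusters reproduce the corpus labels; the abstract does not say which validation metrics were used, so the choice here should be read as a placeholder.
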
Original language: Spanish
Title of host publication: Proceedings - International Conference of the Chilean Computer Science Society, SCCC
Volume: 2019-November
State: Published - 1 Nov 2019
Externally published: Yes
