TY - JOUR
T1 - Assessment of supervised classifiers for the task of detecting messages with suicidal ideation
AU - Acuña Caicedo, Roberto Wellington
AU - Gómez Soriano, José Manuel
AU - Melgar Sasieta, Héctor Andrés
N1 - Publisher Copyright:
© 2020 The Author(s)
PY - 2020/8
Y1 - 2020/8
N2 - According to the World Health Organization (WHO) close to 800,000 people worldwide die by suicide each year, and many more attempts to do it. In consequence, the WHO recognizes suicide as a global public health priority, which affects not only rich countries but poor and middle-income countries as well. This study makes a systematic analysis of 28 supervised classifiers using different features of the corpus Life to detect messages with suicidal ideation and depression to know if these can be used in an automatic prevention online system. The Life Corpus, used in this research, is a bilingual text corpus (English and Spanish) oriented to the detection of suicide ideation. This corpus was constructed retrieving texts from several social networks and its quality was measured using mutual annotation agreement. The different experiments determined that the classifier with the best performance was KStar, with the corpus features POS-SYNSETS-NUM, achieving the best results with the ROC Area metrics of 0,81036 and F-measure of 0,7148. The present research fulfilled the objective of discovering which supervised classifiers and which features are the most suitable for the automatic classification of messages with suicidal ideation using the Life Corpus. Also, given the imbalance of the results, a new precision measure was developed called the Two-dimensional Accuracy and Recovery Index (GDP), which can provide better results, in unbalanced systems, than the usual measures to assess the quality of the results (measure F, Area ROC), and thus increase the number of messages at risk of suicidal ideation, detected at the cost of receiving more messages that are not related to suicide or vice versa.
AB - According to the World Health Organization (WHO) close to 800,000 people worldwide die by suicide each year, and many more attempts to do it. In consequence, the WHO recognizes suicide as a global public health priority, which affects not only rich countries but poor and middle-income countries as well. This study makes a systematic analysis of 28 supervised classifiers using different features of the corpus Life to detect messages with suicidal ideation and depression to know if these can be used in an automatic prevention online system. The Life Corpus, used in this research, is a bilingual text corpus (English and Spanish) oriented to the detection of suicide ideation. This corpus was constructed retrieving texts from several social networks and its quality was measured using mutual annotation agreement. The different experiments determined that the classifier with the best performance was KStar, with the corpus features POS-SYNSETS-NUM, achieving the best results with the ROC Area metrics of 0,81036 and F-measure of 0,7148. The present research fulfilled the objective of discovering which supervised classifiers and which features are the most suitable for the automatic classification of messages with suicidal ideation using the Life Corpus. Also, given the imbalance of the results, a new precision measure was developed called the Two-dimensional Accuracy and Recovery Index (GDP), which can provide better results, in unbalanced systems, than the usual measures to assess the quality of the results (measure F, Area ROC), and thus increase the number of messages at risk of suicidal ideation, detected at the cost of receiving more messages that are not related to suicide or vice versa.
KW - Automatic classification
KW - Computer science
KW - Machine learning
KW - Social networks
KW - Suicidal ideation
KW - Suicide
KW - Supervised classifiers
UR - http://www.scopus.com/inward/record.url?scp=85088837682&partnerID=8YFLogxK
U2 - 10.1016/j.heliyon.2020.e04412
DO - 10.1016/j.heliyon.2020.e04412
M3 - Article
AN - SCOPUS:85088837682
SN - 2405-8440
VL - 6
JO - Heliyon
JF - Heliyon
IS - 8
M1 - e04412
ER -