An LDA-lexical syntactical approach for events and features extraction of earthquakes from Spanish and English tweets

Enrique Valeriano Loli, Juanjosé Tenorio Peña, Rodrigo López Condori

Research output: Chapter in book/report/conference proceedings › Conference contribution › Peer-reviewed


In recent years, social networks such as Twitter have proved a useful resource for tracking events before, during, and after an earthquake. Several studies have applied techniques such as clustering or temporal models to extract these events from Twitter. In this paper, we propose a new approach that extracts not only the events that occurred during the earthquake but also some of its most prominent features, such as intensity, epicenter, and affected places. We perform a lexical-syntactic analysis of Spanish and English tweets to find the events, together with a semantic analysis that uses statistical metrics and models such as Pointwise Mutual Information (PMI) and Latent Dirichlet Allocation (LDA) to extract the features of the earthquake. Our results show that, by considering both the semantics and the syntax of the tweets, we can extract important events and features of an earthquake, which can be used for online detection and tracking of similar disasters.
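As a rough illustration of the PMI metric named in the abstract, the sketch below scores word pairs by how much more often they co-occur in the same tweet than independence would predict, using PMI(a, b) = log2(P(a, b) / (P(a) P(b))). It is not the authors' implementation: the function name `pmi_scores`, the tweet-level co-occurrence window, and the toy tokens are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(docs):
    """Return PMI for every word pair that co-occurs in at least one document.

    Hypothetical helper: probabilities are estimated as document frequencies,
    i.e. the share of documents (tweets) containing the word or the pair.
    """
    n = len(docs)
    word_df = Counter()
    pair_df = Counter()
    for tokens in docs:
        vocab = sorted(set(tokens))  # count each word once per document
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))  # pairs in alphabetical order

    scores = {}
    for (a, b), count in pair_df.items():
        p_ab = count / n
        p_a = word_df[a] / n
        p_b = word_df[b] / n
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Toy, made-up tokenized tweets (not data from the paper).
tweets = [
    ["earthquake", "lima"],
    ["earthquake", "lima"],
    ["rain", "lima"],
    ["rain", "sun"],
]
scores = pmi_scores(tweets)
```

A positive score, as for `("earthquake", "lima")` here, suggests the two words appear together more often than chance, which is the kind of signal that could associate an earthquake with a place name; pairs that never co-occur are simply absent from the result.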
Original language: Spanish
Title of host publication: CEUR Workshop Proceedings
Number of pages: 8
Status: Published - 1 Jan 2017
