An LDA-lexical syntactical approach for events and features extraction of earthquakes from Spanish and English tweets

Enrique Valeriano Loli, Juanjosé Tenorio Peña, Rodrigo López Condori

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In the last few years, social networks such as Twitter have become a very useful resource for tracking the events that happen before, during, and after an earthquake. Several studies on this topic have applied techniques such as clustering or temporal models to extract these events from Twitter. In this paper, however, we propose a new approach that extracts not only the events that occurred during the earthquake but also some of its most prominent features, such as intensity, epicenter, and affected places. We performed a lexical-syntactic analysis of Spanish and English tweets to find the events that happened, together with a semantic analysis using statistical metrics and models such as Pointwise Mutual Information (PMI) and Latent Dirichlet Allocation (LDA) to extract the features of the earthquake. Our results show that, by considering the semantics and syntax of the tweets, we can extract important events and features of an earthquake, which can be used for online detection and tracking of similar disasters.
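
The abstract names Pointwise Mutual Information (PMI) as one of the statistical metrics used. As a rough illustration only, and not the authors' implementation, the sketch below computes document-level PMI between word pairs, PMI(a, b) = log2(P(a, b) / (P(a) P(b))), where probabilities are estimated as the fraction of tweets containing the word or pair. The function name `pmi_scores`, the choice of tweet-level co-occurrence, and the toy tweets are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(docs):
    """Return PMI for every word pair that co-occurs in at least one document.

    Hypothetical helper: probabilities are document frequencies divided by
    the number of documents (an assumption, not taken from the paper).
    """
    n = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing each word pair
    for tokens in docs:
        vocab = sorted(set(tokens))  # sorted so each pair has one canonical key
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))
    scores = {}
    for (a, b), joint in pair_df.items():
        p_ab = joint / n
        p_a = word_df[a] / n
        p_b = word_df[b] / n
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Made-up, already tokenized tweets used purely for illustration.
tweets = [
    ["earthquake", "magnitude", "lima"],
    ["earthquake", "magnitude", "epicenter"],
    ["traffic", "lima"],
    ["earthquake", "epicenter"],
]
scores = pmi_scores(tweets)
for pair, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(value, 3))
```

A positive score means two words appear together more often than independence would predict, which is the property that makes PMI a candidate for associating terms such as a place name with an earthquake feature.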
Original language: Spanish
Title of host publication: CEUR Workshop Proceedings
Pages: 190-197
Number of pages: 8
Volume: 2029
State: Published - 1 Jan 2017
