IMDEA Networks Institute Publications Repository

Graph-based Techniques for Topic Classification of Tweets in Spanish

Cordobés de la Calle, Héctor and Fernández Anta, Antonio and Chiroque, Luis F. and Pérez, Fernando and Redondo, Teófilo and Santos, Agustín (2014) Graph-based Techniques for Topic Classification of Tweets in Spanish. [Journal Articles]

PDF (Graph-based Techniques for Topic Classification of Tweets in Spanish) - Published Version

Abstract

Topic classification of texts is one of the most interesting challenges in Natural Language Processing (NLP). Topic classifiers commonly use a bag-of-words approach, in which the classifier uses (and is trained with) selected terms from the input texts. In this work we present techniques based on graph similarity to classify short texts by topic. In our classifier we build graphs from the input texts, and then use properties of these graphs to classify them. We have tested the resulting algorithm by classifying Twitter messages in Spanish among a predefined set of topics, achieving more than 70% accuracy.
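
The abstract does not say how the graphs are built or which graph properties are compared, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a word co-occurrence graph (an edge joins adjacent words), one merged graph per topic, and an edge-overlap score as the similarity measure. All function names and the toy training data are hypothetical.

```python
from collections import Counter


def text_to_graph(text, window=2):
    """Build an undirected co-occurrence graph as a Counter of edges.

    Assumption: two words are linked when they appear within `window`
    tokens of each other (window=2 links adjacent words only).
    """
    tokens = text.lower().split()
    edges = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + window]:
            if left != right:
                edges[frozenset((left, right))] += 1
    return edges


def build_topic_graphs(labeled_texts):
    """Merge the graphs of all training texts that share a topic."""
    graphs = {}
    for text, topic in labeled_texts:
        graphs.setdefault(topic, Counter()).update(text_to_graph(text))
    return graphs


def similarity(text_graph, topic_graph):
    """Fraction of the text's edges that also occur in the topic graph."""
    if not text_graph:
        return 0.0
    shared = sum(1 for edge in text_graph if edge in topic_graph)
    return shared / len(text_graph)


def classify(text, topic_graphs):
    """Return the topic whose graph is most similar to the text's graph."""
    graph = text_to_graph(text)
    return max(topic_graphs, key=lambda topic: similarity(graph, topic_graphs[topic]))


# Hypothetical toy training set; not data from the paper.
training = [
    ("el equipo gana el partido de liga", "deportes"),
    ("el partido de liga termina en empate", "deportes"),
    ("el gobierno aprueba la nueva ley", "politica"),
    ("el congreso debate la nueva ley", "politica"),
]
topic_graphs = build_topic_graphs(training)
print(classify("gana el partido de liga", topic_graphs))
print(classify("debate la nueva ley", topic_graphs))
```

Unlike a plain bag-of-words model, the edges keep some information about which words occur together, which is one plausible reason a graph representation could help on texts as short as tweets.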

Item Type: Journal Articles
Uncontrolled Keywords: Classification, Graphs, Happiness, NLP, Text Classification, Topic Classification
Subjects: UNSPECIFIED
Divisions: UNSPECIFIED
Depositing User: Hector Cordobes
Date Deposited: 10 Mar 2014 08:28
Last Modified: 30 Jan 2015 16:37
URI: http://eprints.networks.imdea.org/id/eprint/723
