Aspect Based Sentiment Analysis Dataset
ABSA systems are usually evaluated on the same dataset they were developed on. But how would a system trained on the SemEval-2016 dataset perform on new data from different sources? To assess the real-world performance of ABSA, we manually annotated a completely new dataset drawn from about 215K English Foursquare user reviews of restaurants all over the world. From these reviews, we randomly selected 585 samples, containing 1,006 sentences, and annotated them following the SemEval-2016 annotation guidelines for the restaurant domain.
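Annotations in the SemEval-2016 restaurant guidelines are typically distributed as XML, with each sentence carrying a list of opinions (target span, aspect category, polarity). As a hedged illustration of how such a file can be read, here is a minimal Python sketch; the sample review text and identifiers are invented, and the layout follows the published SemEval-2016 Task 5 format rather than any file we distribute:

```python
# Minimal reader for SemEval-2016-style ABSA annotations (restaurant domain).
# The XML below is an invented example in the published Task 5 layout.
import xml.etree.ElementTree as ET

SAMPLE = """<Reviews>
  <Review rid="1">
    <sentences>
      <sentence id="1:0">
        <text>The pasta was great but the service was slow.</text>
        <Opinions>
          <Opinion target="pasta" category="FOOD#QUALITY"
                   polarity="positive" from="4" to="9"/>
          <Opinion target="service" category="SERVICE#GENERAL"
                   polarity="negative" from="28" to="35"/>
        </Opinions>
      </sentence>
    </sentences>
  </Review>
</Reviews>"""

def load_opinions(xml_string):
    """Return (sentence text, target, category, polarity) tuples."""
    root = ET.fromstring(xml_string)
    records = []
    for sentence in root.iter("sentence"):
        text = sentence.findtext("text")
        for opinion in sentence.iter("Opinion"):
            records.append((text,
                            opinion.get("target"),
                            opinion.get("category"),
                            opinion.get("polarity")))
    return records

records = load_opinions(SAMPLE)
```

The `from`/`to` attributes give character offsets of the target span inside the sentence text, so systems can be scored on exact aspect-term extraction as well as category and polarity prediction.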
If you use the data, please cite the following paper:
Aspect Based Sentiment Analysis into the Wild
ABSTRACT: In this paper, we test state-of-the-art Aspect Based Sentiment Analysis systems trained on a widely used dataset against "real" data. We created a new manually annotated dataset of user-generated content from the same domain as the training dataset but from other sources, and analyse the differences between the new dataset and the standard ABSA dataset. We then analyse the performance of different versions of the same system on both datasets. We also propose light adaptation methods to increase system robustness.
WASSA 2018 will be held in conjunction with the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), Brussels, Belgium, 2nd November – 4th November 2018.