Ngoc-Phuoc-An Vo, Octavian Popescu
KDIR, Porto, Portugal, 9-11 November 2016.
Building a system able to cope with the various phenomena that fall under the umbrella of semantic similarity
is far from trivial. It is almost always the case that the performance of a system does not vary consistently or
predictably from corpus to corpus. The contribution of this paper consists of two parts: (1) we analyze
the sources of this variation, and (2) we present a novel system for detecting the semantic
similarity between pairs of sentences. The system consistently achieves an accuracy that is very close to the
state of the art, or that sets a new state of the art. The system is based on a multi-layer architecture and is able to deal with heterogeneous corpora that may not have been generated by the same distribution.