Hybrid Feature Factored System for Scoring Extracted Passage Relevance in Regulatory Filings
Denys Proux, Claude Roux, Agnes Sandor, Julien Perez
DSMM (Data Science for Macro-Modeling with Financial and Economic Datasets) workshop, part of SIGMOD 2017, Chicago, USA, 14 - 19 May 2017
This paper reports our contribution to the FEIII 2017 challenge, which addresses relevance ranking of passages extracted from 10-K and 10-Q regulatory filings. We leveraged our previous work on document structure and content analysis for regulatory filings to train hybrid text analytics and decision-making models. We designed and trained several layers of classifiers fed with linguistic and semantic features to improve relevance prediction. We present our experiments and the results obtained on the competition data set.