David Hull
Proceedings of the TREC-8 Conf.
This report describes the Xerox work on the TREC-8 Question Answering
Track. We linked together a few basic NLP components (a question parser, a sentence
boundary identifier, and a proper noun tagger) with a sentence scoring function and an answer
presentation function built specifically for the TREC Q&A task. Our system found a correct
50-byte answer among its top 5 responses for 45% of the questions, a respectable performance
that still leaves considerable room for improvement. Based on the failure analysis presented in this
paper, we conclude that the system would benefit from access to a broad range of other NLP
technologies, including robust parsing and coreference analysis, or some good heuristic
approximations thereof. The system also has a clear need for some semantic resources to help with
certain difficult problems, such as finding answers that match the semantic class X in "What X?" questions.