Meal Finder: fine-grained information extraction from a web-scale image database
Does a restaurant have your favourite dish? What is the price of this dish? Do they have vegetarian options? Do they have a kids' menu?
When you are looking for a restaurant, these common questions are not necessarily answered by your favourite search application or social web portal. So, as a user, once you have found a nice restaurant, you usually end up consulting additional sources of information to answer them, because the data is not directly or easily available. A good place to look is the restaurant's own website, to see if it provides additional information (such as an updated menu).
The goal of this internship is to investigate how to facilitate this Information Retrieval task by automatically enriching a restaurant database with unstructured information available on the Internet.
The student will help design and implement an information extraction tool for restaurant menus. More precisely, starting from an image of a menu, the tool should be able to structure the menu into its main sections, first extracting information on dishes (name, description, price(s)) and on the restaurant itself (address, opening hours, phone number).
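To make the target output more concrete, here is a minimal Python sketch of what a structured menu record could look like, together with a toy parser for a single line of OCR text. All names (`Dish`, `MenuSection`, `Menu`, `parse_dish_line`), the price pattern and the "name: description" convention are assumptions made for illustration only. They are not part of an existing system, and a real tool would rely on layout analysis and learned models rather than a regular expression.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Dish:
    name: str
    description: str = ""
    prices: List[float] = field(default_factory=list)  # a dish may list several prices


@dataclass
class MenuSection:
    title: str  # e.g. "Starters", "Desserts"
    dishes: List[Dish] = field(default_factory=list)


@dataclass
class Menu:
    sections: List[MenuSection] = field(default_factory=list)
    # Restaurant-level fields mentioned in the project description
    address: Optional[str] = None
    opening_hours: Optional[str] = None
    phone_number: Optional[str] = None


# Assumed price format: digits, a dot or comma, then two decimals ("9.50", "12,00").
PRICE_RE = re.compile(r"\d+[.,]\d{2}")


def parse_dish_line(line: str) -> Dish:
    """Split one OCR'd text line into name, description and prices (toy heuristic)."""
    prices = [float(p.replace(",", ".")) for p in PRICE_RE.findall(line)]
    # Remove the prices, then trim leftover separators and currency signs at the ends.
    text = PRICE_RE.sub("", line).strip(" ./-€$")
    # Assumed convention: "name: description"; without a colon, everything is the name.
    name, _, description = text.partition(":")
    return Dish(name=name.strip(), description=description.strip(), prices=prices)


dish = parse_dish_line("Margherita: tomato, mozzarella, basil 9.50 / 12,00")
menu = Menu(sections=[MenuSection(title="Pizzas", dishes=[dish])])
```

Keeping `prices` as a list reflects the "price(s)" in the description above: a single dish can appear with several prices, for example one per portion size.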
The student will work with a multi-disciplinary team with expertise in Computer Vision, OCR, Document Structure Analysis and Information Extraction. A production database gathering more than 100 million places, together with the associated user reviews, images and menus, will be available to develop and test the system. At Naver Labs Europe we encourage participation in the academic community, and our researchers publish regularly at top venues in NLP, machine learning and computer vision.
● Student at Master (research-oriented) or PhD level
● Knowledge of neural network models and Conditional Random Fields
● Good coding skills in Python