Computer Science Colloquium: Tom Kwiatkowski, Google

Tuesday, February 23, 2021 at 4:30 PM Pacific Standard Time

What is a Correct Answer? - Designing Methods of Evaluating Computational Question Answering Systems

Computers that can answer arbitrary user questions are one of the holy grails of Artificial Intelligence. Recently, there has been great progress in question answering technology, fueled by advances in deep learning, and computers can now consistently find good answers to simple questions from the web. One of our biggest challenges now is working out how to measure how well these systems perform on the much broader range of questions for which the answer is not always clear.

In this talk I will present a deep dive into the design decisions that went into creating Natural Questions (NQ), a question answering benchmark from Google. NQ contains real user questions, which require an understanding of the questioner's intent. NQ also requires systems to read entire Wikipedia pages to decide whether they fully answer a question. This is much harder than finding an answer given the knowledge that one is present. I will convince you that the question 'when was the last time a hurricane hit massachusetts?' is under-specified with many reasonable answers, and I will tell you how we developed robust evaluation metrics to deal with this ambiguity.

Tom is a Research Scientist in Google's New York office. His focus is on building representations of the knowledge that is expressed in text, and he has a particular interest in modeling the ways in which different texts agree and disagree with each other. Tom has worked on question answering products at Google and has also engaged with academia. Before joining Google, he did a PhD in Edinburgh with Sharon Goldwater and Mark Steedman, and a post-doc at the University of Washington with Luke Zettlemoyer. Both his PhD and his post-doc focused on semantic parsing and grammar induction.

