A long-term vision is to develop algorithms that automatically, and in a query-driven manner, retrieve materials from the Web and compose comprehensive articles akin to Wikipedia articles. Especially for information needs where the user has very little prior knowledge, the web search paradigm of ten blue hyperlinks is not sufficient. Instead, the goal is to recycle Web materials with the help of knowledge graphs to produce a comprehensive overview. While natural language generation methods like GPT-3 dominate the news, the deeper questions are how to identify relevant content, which concepts are important to discuss, and how to identify and organize relevant subtopics. I will present the latest findings on the way towards developing retrieve-extract-and-generate information systems.
Laura Dietz is an Assistant Professor at the University of New Hampshire, where she leads the lab for text retrieval, extraction, machine learning and analytics (TREMA). She organizes a tutorial/workshop series on Utilizing Knowledge Graphs in Text-centric Retrieval (KG4IR) and coordinates the TREC Complex Answer Retrieval Track. She received an NSF CAREER Award for utilizing fine-grained knowledge annotations in text understanding and retrieval. Previously, she was a research scientist at the Data and Web Science Group at Mannheim University and at the Center for Intelligent Information Retrieval (CIIR) at UMass Amherst. She obtained her doctoral degree from the Max Planck Institute for Informatics with a thesis on topic models for networked data.