The first academic journal was published in 1665. Today, millions of scientific papers come out every year. This explosion of knowledge represents an opportunity to accelerate innovation with automated systems that scour the literature for solutions and inspirations. However, it also creates information overload and isolated "research bubbles" that limit discovery and sharing, slowing down scientific progress and cross-fertilization. In this talk, I will present our work toward addressing these large-scale challenges for the future of science.
In the first part of the talk, I will give an overview of our core approach, which consists of identifying key “building blocks” of scientific thought and formalizing and structuring them into computational representations that can power creative innovation systems. These include systems that surface inspirations, recommend novel authors, and enable search for challenges, hypotheses and causal relations, as well as tools for exploration and visualization of collaboration networks.
The second part of the talk will be a deep dive into our new work, SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts (AKBC 2021), which is motivated by some of the applications above. We present a new task of cross-document coreference with a referential hierarchy over mention clusters, along with a challenging new dataset and models. Finally, if time permits, I will discuss our recent paper, Scientific Language Models for Biomedical Knowledge Base Completion: An Empirical Study (AKBC 2021), in which we integrate language models and graph embeddings to boost biomedical link prediction, with applications in drug discovery.
Tom Hope is a postdoctoral researcher at The Allen Institute for AI (AI2) and the University of Washington, working with Daniel Weld on accelerating scientific discovery and closely collaborating with Eric Horvitz, CSO at Microsoft. Tom completed his PhD with Dafna Shahaf at the Hebrew University of Jerusalem in January 2020. His work has received three best paper awards, appeared in top venues (PNAS, KDD, EMNLP, NAACL, ACL, WSDM, AKBC, IEEE), and received media attention from Nature and Science for his systems for COVID-19 researchers, and from VentureBeat for a novel knowledge discovery system he created as research team lead at Intel. Tom was selected for the 2021 Global Young Scientists Summit and the 2019 Heidelberg Laureate Forum, was a member of the KDD 2020 Best Paper Selection Committee, and wrote a textbook for O’Reilly on the TensorFlow software library.