Title: Text Simplification: Methods and Evaluation
Abstract: There are rich opportunities to reduce the language complexity of professional content (either human-written or computer-generated) and make it accessible to a broad audience. As a sub-task of natural language generation (NLG), text simplification has considerable potential to improve the fairness and transparency of text information systems.
Recent approaches to text simplification usually complete the task in an end-to-end fashion, employing neural machine translation models in a monolingual setting regardless of the type of simplification required. These models are limited, on the one hand, by the absence of large-scale parallel (complex → simple) monolingual training data and, on the other, by the lack of interpretability of their black-box procedures. Furthermore, despite the rapid development of algorithms, there is an urgent need to close the large gap in evaluating NLG systems in general (including text simplification systems). Indeed, with no clear model of text quality and no agreed objective criterion for comparing the “goodness of texts”, the evaluation of NLG systems is inherently difficult. In this talk I will present my recent work addressing these problems: i) sample-efficient approaches to NLG that improve the fairness and transparency of text information systems by adapting their content to the literacy level of the target audience, and ii) a systematic analysis of evaluation metrics for NLG models informed by theory and empirical evidence.
In particular, I will show that text simplification can be decomposed into a compact pipeline of tasks to ensure the transparency and explainability of the process; that low-resource text simplification can be framed from a task- and domain-adaptation perspective and decomposed into multiple adaptation steps via meta-learning and transfer learning; and that evaluators for NLG can themselves be evaluated at scale and compared against human judgements.
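To make the pipeline idea concrete, here is a minimal sketch of a modular simplification pipeline. The stage names and the toy lexicon are assumptions for illustration, not the talk's actual decomposition; the point is that each stage is an inspectable function whose intermediate output can be audited, unlike an end-to-end black box:

```python
# Hypothetical two-stage simplification pipeline:
# sentence splitting followed by lexical simplification.

def split_sentences(text: str) -> list[str]:
    # Placeholder splitter: break on periods (a real system would
    # use a proper sentence segmenter and syntactic splitting).
    return [s.strip() + "." for s in text.split(".") if s.strip()]

# Toy complex -> simple lexicon; a real system would use a learned
# substitution model with context-aware ranking.
LEXICON = {"utilize": "use", "commence": "start"}

def substitute_lexical(sentence: str) -> str:
    # Replace complex words with simpler synonyms from the lexicon.
    return " ".join(LEXICON.get(w, w) for w in sentence.split())

def simplify(text: str) -> list[str]:
    # The pipeline: each stage's output can be inspected separately.
    return [substitute_lexical(s) for s in split_sentences(text)]

print(simplify("We utilize tools. We commence work."))
```

Because the stages are separated, a failure (e.g. a bad substitution) can be traced to a specific component rather than hidden inside one opaque model.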