Machine Learning Seminar: Lifan Yuan, "From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones."
Apr 24, 2026, 2:00-3:15 pm

- Sponsor: Research Area of Artificial Intelligence
- Speaker: Lifan Yuan
- Contact: Weixin Chen, weixinc2@illinois.edu
- Originating Calendar: Siebel School Speakers Calendar
- Abstract: Does reinforcement learning teach LLMs genuinely new skills, or merely activate existing ones? This talk presents evidence that LLMs can acquire new skills during RL by composing existing ones. Using a controlled synthetic framework, results show that when an LLM has learned atomic functions f and g, RL enables it to learn unseen compositions h(x) = f(g(x)). This compositional ability generalizes to more complex compositions and transfers across tasks. Analysis reveals that RL fundamentally changes reasoning behaviors, whereas standard next-token prediction training yields none of these effects. The findings suggest a paradigm: build base models with a broad set of basic skills, then use RL to incentivize advanced compositional capabilities for complex problems.
- Bio: Lifan Yuan (https://lifan-yuan.github.io) is a second-year PhD student at UIUC advised by Prof. Hao Peng. His research pursues scalable AI systems that can automate AI research through self-evolution and accelerate science. To this end, he works on scaling trial-and-error learning for LLMs (e.g., RL) and learning from feedback, with highlights including Implicit PRM / PRIME (listed as one of the most influential ICML 2025 papers by PaperDigest) and UltraFeedback (one of the most cited ICML 2024 papers; it popularized DPO and powers 2K+ HuggingFace models). He previously worked on the Gemini team at Google DeepMind as a student researcher.
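To make the setup in the abstract concrete, here is a minimal sketch of what a compositional synthetic task could look like: the model sees atomic skills f and g in training-style data, while the composition h(x) = f(g(x)) is held out and rewarded only through RL on the final answer. All names and atom definitions below (ATOMS, make_example, skill_value, the "f.g" notation) are hypothetical illustrations, not the framework used in the talk.

```python
# Illustrative sketch of a compositional synthetic task (hypothetical;
# the actual framework from the talk may differ substantially).

ATOMS = {
    "f": lambda x: 2 * x + 1,  # one atomic skill the base model knows
    "g": lambda x: x * x,      # another atomic skill
}

def skill_value(skill: str, x: int) -> int:
    """Evaluate an atomic skill, or a composition written as 'outer.inner'."""
    if skill in ATOMS:
        return ATOMS[skill](x)
    outer, inner = skill.split(".")  # "f.g" denotes f(g(x))
    return ATOMS[outer](ATOMS[inner](x))

def make_example(skill: str, x: int) -> tuple[str, str]:
    """Format one (prompt, answer) pair for a named skill."""
    return f"{skill}({x}) = ?", str(skill_value(skill, x))

# Atomic skills appear in ordinary next-token-prediction training data...
train = [make_example(s, x) for s in ("f", "g") for x in range(5)]

# ...while the unseen composition h(x) = f(g(x)) is only ever
# incentivized via an RL reward on the final answer.
held_out = [make_example("f.g", x) for x in range(5)]

print(train[:2])     # e.g. [('f(0) = ?', '1'), ('f(1) = ?', '3')]
print(held_out[:2])  # e.g. [('f.g(0) = ?', '1'), ('f.g(1) = ?', '3')]
```

The point of such a design is control: because f, g, and h are fully specified, any success on the held-out composition can be attributed to composing known atoms rather than to memorization.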