Yukyung Lee

Postdoctoral Associate

Professional Summary

I am a postdoctoral associate at Boston University, working with Prof. Najoung Kim and Prof. Sebastian Schuster. I received my Ph.D. from Korea University, advised by Prof. Pilsung Kang. During my Ph.D., I was a research intern at NAVER and contributed to CLOVA for Writing. I completed my B.S. at HUFS, advised by Prof. Chungmok Lee.

My research focuses on LLM evaluation, aiming to discover, measure, understand, and improve the capabilities of language models. My long-term research vision is to establish a science of evaluation for language models. I like to think about what makes evaluation reliable and what it truly tells us about these models. I am also interested in LLM agents that autonomously solve complex problems in research and engineering, and how to reliably evaluate them.

Download CV (Updated May 2026)

Education

PhD Industrial Management & Engineering

Korea University

BSc Industrial Management & Engineering, BA International Finance (Double Major)

HUFS

Interests

Language Model Evaluation, LLM Agent, Math/Code Reasoning, Writing with AI, Anomaly Detection

💡 Check out our math reasoning paper "Cliff Tokens"

Jun 24, 2026

‘Cliff Tokens: Identifying Single-Token Failure Triggers in LLM Mathematical Reasoning’ now available on arXiv! Work with Jaeyong Ko and Pilsung Kang. Congrats to Jaeyong on his very first paper 🎉


💡 Check out our implicit CoT paper "CIRF"

May 30, 2026

‘CIRF: Tokenizing Chain-of-Thoughts into Reusable Functional Units for Efficient Latent Reasoning in Large Language Models’ now available on arXiv! Collaborated with HUFS team 🦉


🎉 "Can Structural Cues Save LLMs?" has been accepted to KDD 2026!

May 15, 2026

Can Structural Cues Save LLMs? Evaluating Language Models in Massive Document Streams. Work with Yebin Lim, Woojun Jung, Wonjun Choi and Susik Yoon 🐯


🎉 RExBench has been accepted to ACL 2026!

Apr 6, 2026

RExBench: Can coding agents autonomously implement AI research extensions?



Selected Publications

For an up-to-date list of publications, check out my Google Scholar.

* denotes equal contribution; † denotes equal contribution in a senior role.


Jaeyong Ko, Pilsung Kang, Yukyung Lee (2026). Cliff Tokens: Identifying Single-Token Failure Triggers in LLM Mathematical Reasoning. preprint.

Code PDF

Yukyung Lee, Yumeng Shen, Jinhyeong Park, Hyein Yang, Jun-Hyung Park (2026). CIRF: Tokenizing Chain-of-Thoughts into Reusable Functional Units for Efficient Latent Reasoning in Large Language Models. preprint.

PDF

Yukyung Lee, Yebin Lim*, Woojun Jung*, Wonjun Choi, Susik Yoon (2026). Can Structural Cues Save LLMs? Evaluating Language Models in Massive Document Streams. KDD 2026.

Code PDF

Nicholas Edwards*, Yukyung Lee*, Yujun (Audrey) Mao, Yulu Qin, Sebastian Schuster†, Najoung Kim† (2026). RExBench: Can coding agents autonomously implement AI research extensions? ACL 2026.

Code PDF

Yukyung Lee, Joonghoon Kim, Jaehee Kim, Hyowon Cho, Jaewook Kang, Pilsung Kang†, Najoung Kim† (2025). CheckEval: A reliable LLM-as-a-Judge framework for evaluating text generation using checklists. EMNLP 2025.

Code PDF
