News

💡 Check out our implicit CoT paper "CIRF"

'CIRF: Tokenizing Chain-of-Thoughts into Reusable Functional Units for Efficient Latent Reasoning in Large Language Models' now available on …

🎉 Our paper "Can Structural Cues Save LLMs? Evaluating Language Models in Massive Document Streams" accepted to KDD 2026!

Can Structural Cues Save LLMs? Evaluating Language Models in Massive Document Streams. A collaboration with the Korea University team 🐯

🎉 Our paper "RExBench" accepted to ACL 2026!

RExBench: Can coding agents autonomously implement AI research extensions?

🎉 Co-organizing the "AI Modeling for Disappearing Knowledge" workshop at IJCAI 2026.

Please consider submitting your work :)

👩🏻‍🏫 Invited talks at SNU (Jan 2nd), HYU (Jan 8th), and KU (Jan 9th)

Topic - Towards a Science of Evaluation for Language Models

👩🏻‍🏫 Gave an invited talk at Stanford/UW (RExBench 🦖) with Nicholas

Topic - Can coding agents autonomously implement AI research extensions?

🎉 Our paper "CheckEval" accepted to EMNLP 2025!

CheckEval: a reliable LLM-as-a-Judge framework for evaluating text generation using checklists. Thanks to my collaborators ❤️

👩🏻‍🏫 Gave an invited talk at Korea University (RExBench 🦖)

Topic - Can coding agents autonomously implement AI research extensions?

💡 Check out our RExBench paper

Can coding agents autonomously implement AI research extensions? Our [RExBench](https://arxiv.org/abs/2506.22598) is now available on arXiv!

🎉 Paper accepted to NAACL 2025 (Industry Track)

Our [WritingPath](https://arxiv.org/abs/2404.13919) paper has been accepted to **NAACL 2025 (Industry Track)**!