Xiaosen Zheng

Researcher

TikTok, Singapore

Email: zhengxiaosen.zxs@tiktok.com; xszheng.2020@phdcs.smu.edu.sg
GitHub: xszheng2020

[Google Scholar] [Semantic Scholar] [OpenReview] [LinkedIn] [Twitter] [WeChat]

We are looking for strong, self-motivated interns to work on fundamental Code AI research problems.

Expected outputs include publications at top-tier academic venues and patents. Please feel free to send an email if you are interested.

Biography

I am currently a Researcher at the TikTok AI Innovation Center in Singapore.

My research focuses on Code AI, Data-Centric AI, and AI Safety.

Previously, I had the privilege of conducting research with Tianyu Pang, Chao Du, Qian Liu, and Min Lin at Sea AI Lab.

I received my Ph.D. in Computer Science from Singapore Management University in 2025, where I was advised by Professor Jing Jiang. I earned my B.E. in Software Engineering from Central South University in 2019.

Publications

(* indicates equal contribution)

Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral)
Xiaosen Zheng*, Tianyu Pang*, Chao Du, Qian Liu, Jing Jiang, Min Lin
International Conference on Learning Representations (ICLR), Singapore, Singapore, 2025
[code] [arxiv]
RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight)
Qian Liu*, Xiaosen Zheng*, Niklas Muennighoff, Guangtao Zeng, Longxu Dou, Tianyu Pang, Jing Jiang, Min Lin
International Conference on Learning Representations (ICLR), Singapore, Singapore, 2025
[code] [arxiv]

Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses
Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Jing Jiang, Min Lin
Annual Conference on Neural Information Processing Systems (NeurIPS), Vancouver, Canada, 2024
[code] [arxiv]
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Xiangming Gu*, Xiaosen Zheng*, Tianyu Pang*, Chao Du, Qian Liu, Ye Wang, Jing Jiang, Min Lin
International Conference on Machine Learning (ICML), Vienna, Austria, 2024
[code] [arxiv]
Intriguing Properties of Data Attribution on Diffusion Models
Xiaosen Zheng, Tianyu Pang, Chao Du, Jing Jiang, Min Lin
International Conference on Learning Representations (ICLR), Vienna, Austria, 2024
[code] [arxiv]
An Empirical Study of Memorization in NLP
Xiaosen Zheng, Jing Jiang
Annual Meeting of the Association for Computational Linguistics (ACL), Dublin, Ireland, 2022
[code] [arxiv]

Honors & Awards

Presidential Doctoral Fellowship, SMU, AY2022 and AY2024

Services

I have served as a reviewer for the following conferences:
NeurIPS 2025
COLM 2025
ICLR 2025
NeurIPS D&B 2024
ARR December 2023, April 2024
ACL 2023

I have served as a reviewer for the following journals:
TPAMI
Artificial Intelligence