Yiping Wang 王宜平

Yiping Wang
Ph.D. student
Paul G. Allen School of Computer Science & Engineering,
University of Washington
Email: ypwang61@cs.washington.edu

Google Scholar / Twitter / Github / LinkedIn

About me

I'm a Ph.D. student (2023.9 - Present) in the Paul G. Allen School of Computer Science & Engineering at the University of Washington. I feel very fortunate to work with Prof. Simon Shaolei Du. Prior to UW, I studied Computer Science and Mathematics at Zhejiang University and received an honors degree from Chu Kochen Honors College.

My research interests span natural language processing, multimodal learning, and machine learning. I'm particularly interested in RL for LLM reasoning (One-Shot RLVR) and data selection algorithms (CLIPLoss). I have also worked on the theoretical understanding of LLMs (Scan&Snap, JoMA) and video generation evaluation (StoryEval), and I'm currently exploring AI4Math. More broadly, I'm excited about developing safe AI systems with super-human reasoning capabilities that can drive independent scientific progress.

I'm grateful to all my collaborators and mentors along the way. I'm privileged to have been working closely with Dr. Yuandong Tian since Spring 2023. I've been interning at Microsoft since June 2024, where I'm fortunate to be advised by Yelong Shen and Shuohang Wang. During my undergraduate studies, I was fortunate to work with Prof. Huaxiu Yao and Prof. Linjun Zhang.

News

04/2025: Released our work One-Shot RLVR (Arxiv, Code, W&B, Twitter), which was ranked #1 Paper of the Day on HuggingFace Daily Papers!
04/2025: One paper was accepted to ICML 2025.
02/2025: One paper (StoryEval) was accepted to CVPR 2025.
12/2024: Released a new video generation benchmark, StoryEval (Arxiv, Code, Twitter, Website).
09/2024: Attended MoDL 2024 in New York sponsored by the Simons Foundation, and presented our CLIPLoss poster.
09/2024: Our CLIPLoss paper was accepted to NeurIPS 2024 as a spotlight!
01/2024: One paper (JoMA) was accepted to ICLR 2024.
09/2023: One paper (Scan&Snap) was accepted to NeurIPS 2023.

Research Directions and Selected Papers

(* denotes equal contribution or alphabetical ordering, † denotes corresponding author)

Reinforcement Learning for LLM Reasoning

We analyze the empirical observations of Reinforcement Learning with Verifiable Rewards (RLVR) on Large Language Models (LLMs).

Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Yiping Wang†, Qing Yang, Zhiyuan Zeng, Liliang Ren, Lucas Liu, Baolin Peng, Hao Cheng, Xuehai He, Kuan Wang, Jianfeng Gao, Weizhu Chen, Shuohang Wang†, Simon Shaolei Du†, Yelong Shen†
Preprint
[Arxiv] [Code] [W&B] [Twitter]

tl;dr: We only need ONE example for RLVR on LLMs to achieve significant improvement on math tasks!

Data Selection for Multimodal Learning

We studied how to efficiently select data for multimodal pretraining tasks, drawing inspiration from both empirical observations and theoretical insights.

CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning
Yiping Wang*, Yifang Chen*, Wendan Yan, Alex Fang, Wenjing Zhou, Kevin Jamieson, Simon Shaolei Du
NeurIPS 2024 (Spotlight)
[Arxiv] [Code] [Poster] [Twitter] [Previous Versions]

tl;dr: We design simple but efficient data selection methods for CLIP pretraining, and achieve a new SOTA on the DataComp benchmark.

Theory of Transformer Dynamics

We analyze the training dynamics of transformers mathematically.

Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Yuandong Tian, Yiping Wang, Beidi Chen, Simon Shaolei Du
NeurIPS 2023 (Oral presentation @ ICML2023-HiDL)
[Arxiv] [Poster] [Twitter]

tl;dr: We analyze a 1-layer transformer trained with next-token prediction loss, and rigorously characterize its training process.

JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention
Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon Shaolei Du
ICLR 2024
[Arxiv] [Twitter]

tl;dr: We analyze the training dynamics of multilayer transformers, characterizing the role of self-attention and MLP nonlinearity.

Video Generation Evaluation

We explore common issues in current top video generative models.

Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation
Yiping Wang, Xuehai He, Kuan Wang, Luyao Ma, Jianwei Yang, Shuohang Wang, Simon Shaolei Du, Yelong Shen
CVPR 2025
[Arxiv] [Code] [Poster] [Twitter] [Website]

tl;dr: Current top video generative models cannot present multi-event stories like "How to Put an Elephant in a Refrigerator".