Publications

For the most up-to-date list, check out my Google Scholar page.

(* denotes equal contribution or alphabetical ordering, † denotes corresponding author)

  • Reinforcement Learning for Reasoning in Large Language Models with One Training Example
    Yiping Wang†, Qing Yang, Zhiyuan Zeng, Liliang Ren, Lucas Liu, Baolin Peng, Hao Cheng, Xuehai He, Kuan Wang, Jianfeng Gao, Weizhu Chen, Shuohang Wang†, Simon Shaolei Du†, Yelong Shen†
    #1 Paper of the day on HuggingFace Daily Papers
    NeurIPS 2025
    [Arxiv] [Code] [W&B] [Models] [X] [Slides]

  • FloE: On-the-Fly MoE Inference on Memory-constrained GPU
    Yuxin Zhou*, Zheng Li*, Jun Zhang, Jue Wang, Yiping Wang, Zhongle Xie, Ke Chen, Lidan Shou
    ICML 2025
    [Arxiv]

  • Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation
    Yiping Wang, Xuehai He, Kuan Wang, Luyao Ma, Jianwei Yang, Shuohang Wang, Simon Shaolei Du, Yelong Shen
    CVPR 2025
    [Arxiv] [Code] [Poster] [X] [Website]

  • Infer Human's Intentions Before Following Natural Language Instructions
    Yanming Wan, Yue Wu, Yiping Wang, Jiayuan Mao, Natasha Jaques
    AAAI 2025
    [Arxiv] [Code] [X] [Website]

  • CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning
    Yiping Wang*, Yifang Chen*, Wendan Yan, Alex Fang, Wenjing Zhou, Kevin Jamieson, Simon Shaolei Du
    NeurIPS 2024 (Spotlight)
    [Arxiv] [Code] [Poster] [X] [Previous Versions]

  • JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention
    Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon Shaolei Du
    ICLR 2024
    [Arxiv] [X]

  • Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
    Yuandong Tian, Yiping Wang, Beidi Chen, Simon Shaolei Du
    NeurIPS 2023 (Oral presentation @ ICML2023-HiDL)
    [Arxiv] [Poster] [X]

  • Improved Active Multi-Task Representation Learning via Lasso
    Yiping Wang, Yifang Chen, Kevin Jamieson, Simon S. Du
    ICML 2023
    [Arxiv]

  • C-Mixup: Improving Generalization in Regression
    Huaxiu Yao*, Yiping Wang*, Linjun Zhang, James Zou, Chelsea Finn
    NeurIPS 2022
    [Arxiv] [Code]

Preprints

  • Spurious Rewards: Rethinking Training Signals in RLVR
    Rulin Shao*, Shuyue Stella Li*, Rui Xin*, Scott Geng*, Yiping Wang, Sewoong Oh, Simon Shaolei Du, Nathan Lambert, Sewon Min, Ranjay Krishna, Yulia Tsvetkov, Hannaneh Hajishirzi, Pang Wei Koh, Luke Zettlemoyer
    Preprint 2025
    [Arxiv] [Blog1] [Blog2] [Code] [W&B] [Models] [X]

  • SHARP: Accelerating Language Model Inference by SHaring Adjacent layers with Recovery Parameters
    Yiping Wang, Hanxian Huang, Yifang Chen, Jishen Zhao, Simon Shaolei Du, Yuandong Tian
    Preprint 2025
    [Arxiv]

  • Mojito: Motion Trajectory and Intensity Control for Video Generation
    Xuehai He, Shuohang Wang, Jianwei Yang, Xiaoxia Wu, Yiping Wang, Kuan Wang, Zheng Zhan, Olatunji Ruwase, Yelong Shen, Xin Eric Wang
    Preprint 2024
    [Arxiv]