Publications
For the most up-to-date list, see my Google Scholar page.
(* denotes equal contribution or alphabetic ordering, † denotes corresponding author)
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Yiping Wang†, Qing Yang, Zhiyuan Zeng, Liliang Ren, Lucas Liu, Baolin Peng, Hao Cheng, Xuehai He, Kuan Wang, Jianfeng Gao, Weizhu Chen, Shuohang Wang†, Simon Shaolei Du†, Yelong Shen†
#1 Paper of the day on HuggingFace Daily Papers
NeurIPS 2025
[Arxiv]
[Code]
[W&B]
[Models]
[X]
[Slides]
FloE: On-the-Fly MoE Inference on Memory-constrained GPU
Yuxin Zhou*, Zheng Li*, Jun Zhang, Jue Wang, Yiping Wang, Zhongle Xie, Ke Chen, Lidan Shou
ICML 2025
[Arxiv]
Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation
Yiping Wang, Xuehai He, Kuan Wang, Luyao Ma, Jianwei Yang, Shuohang Wang, Simon Shaolei Du, Yelong Shen
CVPR 2025
[Arxiv]
[Code]
[Poster]
[X]
[Website]
Infer Human's Intentions Before Following Natural Language Instructions
Yanming Wan, Yue Wu, Yiping Wang, Jiayuan Mao, Natasha Jaques
AAAI 2025
[Arxiv]
[Code]
[X]
[Website]
CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning
Yiping Wang*, Yifang Chen*, Wendan Yan, Alex Fang, Wenjing Zhou, Kevin Jamieson, Simon Shaolei Du
NeurIPS 2024 (Spotlight)
[Arxiv]
[Code]
[Poster]
[X]
[Previous Versions]
JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention
Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon Shaolei Du
ICLR 2024
[Arxiv]
[X]
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Yuandong Tian, Yiping Wang, Beidi Chen, Simon Shaolei Du
NeurIPS 2023
(Oral presentation @ ICML2023-HiDL)
[Arxiv]
[Poster]
[X]
Improved Active Multi-Task Representation Learning via Lasso
Yiping Wang, Yifang Chen, Kevin Jamieson, Simon S. Du
ICML 2023
[Arxiv]
C-Mixup: Improving Generalization in Regression
Huaxiu Yao*, Yiping Wang*, Linjun Zhang, James Zou, Chelsea Finn
NeurIPS 2022
[Arxiv]
[Code]
Preprints
Spurious Rewards: Rethinking Training Signals in RLVR
Rulin Shao*, Shuyue Stella Li*, Rui Xin*, Scott Geng*, Yiping Wang, Sewoong Oh, Simon Shaolei Du, Nathan Lambert, Sewon Min, Ranjay Krishna, Yulia Tsvetkov, Hannaneh Hajishirzi, Pang Wei Koh, Luke Zettlemoyer
Preprint 2025
[Arxiv]
[Blog1]
[Blog2]
[Code]
[W&B]
[Models]
[X]
SHARP: Accelerating Language Model Inference by SHaring Adjacent layers with Recovery Parameters
Yiping Wang, Hanxian Huang, Yifang Chen, Jishen Zhao, Simon Shaolei Du, Yuandong Tian
Preprint 2025
[Arxiv]
Mojito: Motion Trajectory and Intensity Control for Video Generation
Xuehai He, Shuohang Wang, Jianwei Yang, Xiaoxia Wu, Yiping Wang, Kuan Wang, Zheng Zhan, Olatunji Ruwase, Yelong Shen, Xin Eric Wang
Preprint 2024
[Arxiv]