Publications
For the comprehensive list, check out my Google Scholar page.
(* denotes equal contribution or alphabetical ordering, † denotes corresponding author)
Preprints
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Yiping Wang†, Qing Yang, Zhiyuan Zeng, Liliang Ren, Lucas Liu, Baolin Peng, Hao Cheng, Xuehai He, Kuan Wang, Jianfeng Gao, Weizhu Chen, Shuohang Wang†, Simon Shaolei Du†, Yelong Shen†
Preprint
[arXiv]
[Code]
[W&B]
[Twitter]