Publications

Publications by category in reverse chronological order. Generated by jekyll-scholar.

2025

  1. UAI
    Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation
    Runze Zhao*, Yue Yu*, Adams Yiyue Zhu, Chen Yang, and Dongruo Zhou
    2025
  2. arXiv
    Instance-dependent continuous-time reinforcement learning via maximum likelihood estimation
    Runze Zhao*, Yue Yu*, Ruhan Wang, Chunfeng Huang, and Dongruo Zhou
    arXiv preprint arXiv:2508.02103, 2025
  3. arXiv
    On the Limits of Test-Time Compute: Sequential Reward Filtering for Better Inference
    Yue Yu, Qiwei Di, Quanquan Gu, and Dongruo Zhou
    arXiv preprint arXiv:2512.04558, 2025