Test-time Scaling Techniques in Theoretical Physics – A Comparison of Methods on the TPBench Dataset

Published in NeurIPS 2025 Machine Learning and the Physical Sciences (ML4PS) Workshop, 2025

Authors: Zhiqi Gao, Tianyi Li, Yurii Kvasiuk, Sai Chaitanya Tadepalli, Maja Rudolph, Daniel J.H. Chung, Frederic Sala, Moritz Münchmeyer

Recommended citation: https://arxiv.org/abs/2506.20729