Large-scale Generation Dataset

This dataset contains answers generated by 11 LLMs for four reasoning-intensive benchmarks: AIME2024, AIME2025, GPQA-Diamond, and MATH500. Each problem has at least 80 answers.
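As a rough illustration of how the "at least 80 answers per problem" property could be checked after loading the data, here is a minimal sketch. It assumes the answers have already been read into `(problem_id, answer)` pairs. That record shape, the function names, and the `minimum` parameter are hypothetical and are not taken from the dataset's actual file layout, which is not described here.

```python
from collections import defaultdict


def group_answers(records):
    """Group (problem_id, answer) pairs into {problem_id: [answers]}.

    The (problem_id, answer) record shape is an assumption for this sketch,
    not the dataset's documented format.
    """
    grouped = defaultdict(list)
    for problem_id, answer in records:
        grouped[problem_id].append(answer)
    return dict(grouped)


def problems_below_minimum(grouped, minimum=80):
    """Return the sorted ids of problems with fewer than `minimum` answers."""
    return sorted(
        problem_id
        for problem_id, answers in grouped.items()
        if len(answers) < minimum
    )


if __name__ == "__main__":
    # Toy records standing in for real data: p1 has 80 answers, p2 has 10.
    toy = [("p1", "42")] * 80 + [("p2", "A")] * 10
    print(problems_below_minimum(group_answers(toy)))
```

An empty list from `problems_below_minimum` would indicate that every loaded problem meets the stated minimum.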

Downloads

License

This dataset includes text generated by LLMs. All LLMs used here are licensed under the Apache License 2.0 (Apache-2.0), except for EXAONE-Deep-32B (EXAONE AI Model License Agreement), NVIDIA-Nemotron-Nano-9B-v2 (NVIDIA Open Model License), and Phi-4-reasoning (MIT License). We make the tokens generated by the Apache-2.0, MIT, and NVIDIA-licensed models available under terms equivalent to CC0 (public domain dedication), to the extent permitted by law. Outputs of the NVIDIA model are considered user-owned under the NVIDIA Open Model License. The outputs of EXAONE-Deep-32B remain the property of the Licensor (LG AI Research) and may be used only for non-commercial research purposes, as stated in its license. We believe that releasing these generated tokens does not raise copyright concerns; however, if any issues are identified, we will withdraw the dataset. No warranty is provided; use at your own risk.

Citation

@misc{komiyama2025bestofinftyasymptoticperformance,
  title = {Best-of-$\infty$ -- Asymptotic Performance of Test-Time Compute},
  author = {Junpei Komiyama and Daisuke Oba and Masafumi Oyamada},
  year = {2025},
  eprint = {2509.21091},
  archivePrefix = {arXiv},
  primaryClass = {stat.ML},
  url = {https://arxiv.org/abs/2509.21091}
}
