APA Citation (7th ed.)

Aggarwal, P., Ghazvininejad, M., Kim, S., Kulikov, I., Lanchantin, J., Li, X., . . . Zhao, W. (2026). Reasoning over mathematical objects: On-policy reward modeling and test time aggregation.

Chicago Style Citation (17th ed.)

Aggarwal, Pranjal, et al. Reasoning over Mathematical Objects: On-policy Reward Modeling and Test Time Aggregation. 2026.

MLA Citation (9th ed.)

Aggarwal, Pranjal, et al. Reasoning over Mathematical Objects: On-policy Reward Modeling and Test Time Aggregation. 2026.

Caution: These citations are not 100% accurate.