C3PO: Optimized Large Language Model Cascades with Probabilistic Cost Constraints for Reasoning
Published in NeurIPS 2025
NeurIPS 2025 work on cost-aware LLM cascades that enforce probabilistic cost constraints while preserving accuracy on reasoning tasks.
