Explores conditional layer execution policies that trade depth for latency on resource-constrained deployments.
Reports 20% lower inference cost on reasoning benchmarks with negligible accuracy loss.