Part 3: Production
Chapter 18: Cost Management
Agentic systems can be expensive. Every reasoning step, tool call, and generated token costs money. Left unmonitored, those costs can spiral.
Understanding the Cost Drivers
- Input tokens: Context sent to the model
- Output tokens: What the model generates
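- Tool calls: External API fees, plus the tokens spent issuing the call and reading its result back into context
- Iterations: Each extra pass through the agent loop incurs the costs above again, so multi-step reasoning multiplies them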
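To make the four drivers concrete, here is a minimal sketch of a per-conversation cost estimate. The prices, the `StepUsage` record, and the sample numbers are all illustrative assumptions, not any provider's actual rates; substitute the figures from your own billing data.

```python
from dataclasses import dataclass


@dataclass
class StepUsage:
    """Usage recorded for one agent iteration (hypothetical record shape)."""
    input_tokens: int
    output_tokens: int
    tool_cost_usd: float = 0.0  # external API fees for this step, if any


# Assumed prices in USD per million tokens; replace with your provider's rates.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def step_cost(step: StepUsage) -> float:
    """Cost of one iteration: input tokens + output tokens + tool fees."""
    token_cost = (
        step.input_tokens * INPUT_PRICE_PER_MTOK
        + step.output_tokens * OUTPUT_PRICE_PER_MTOK
    ) / 1_000_000
    return token_cost + step.tool_cost_usd


def conversation_cost(steps: list[StepUsage]) -> float:
    """Total cost: every iteration adds its own input, output, and tool costs."""
    return sum(step_cost(s) for s in steps)


# Three iterations; input grows as earlier results accumulate in context.
steps = [
    StepUsage(input_tokens=2_000, output_tokens=300),
    StepUsage(input_tokens=2_500, output_tokens=300, tool_cost_usd=0.002),
    StepUsage(input_tokens=3_000, output_tokens=400),
]
print(f"${conversation_cost(steps):.4f}")  # $0.0395
```

Note how the input token count rises with each step in the sample data: in a typical agent loop the context carries earlier steps forward, so later iterations tend to cost more than earlier ones.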
Cost Optimisation Strategies
- Right-size your model: Not every task needs the most powerful model
- Optimise context: Context is billed as input tokens on every call, so large contexts mean large costs. Send only what the current step needs
- Limit iterations: Set reasonable bounds on agent loops
- Cache where possible: If the same queries recur, consider caching responses
- Monitor and alert: Track cost per conversation. Set budgets and investigate anomalies
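Three of the strategies above (limiting iterations, enforcing a budget, and caching) can be sketched in a few lines. This is one possible shape under stated assumptions: `step_fn` is a hypothetical callable that performs one agent step and returns `(done, cost_usd)`, `call_model` is a stand-in for a real billable call, and the default bounds are placeholders rather than recommendations.

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Callable


@dataclass
class RunResult:
    finished: bool      # True if the agent reported completion
    iterations: int     # steps actually executed
    spent_usd: float    # total cost accumulated
    over_budget: bool   # True if spending exceeded the budget


def run_agent(
    step_fn: Callable[[], tuple[bool, float]],
    max_iterations: int = 5,    # assumed loop bound
    budget_usd: float = 0.05,   # assumed per-conversation budget
) -> RunResult:
    """Run step_fn until it is done, the loop bound is hit, or the budget is exceeded."""
    spent = 0.0
    for i in range(1, max_iterations + 1):
        done, cost = step_fn()
        spent += cost
        over = spent > budget_usd
        if done or over:
            return RunResult(done, i, spent, over)
    return RunResult(False, max_iterations, spent, False)


model_calls = 0  # counts how often the stand-in model is actually invoked


def call_model(query: str) -> str:
    """Stand-in for a real, billable model call."""
    global model_calls
    model_calls += 1
    return f"answer to: {query}"


@lru_cache(maxsize=1024)
def cached_answer(normalised_query: str) -> str:
    return call_model(normalised_query)


def answer(query: str) -> str:
    # Normalise case and whitespace so trivially different phrasings share an entry.
    return cached_answer(" ".join(query.lower().split()))


answer("What is our refund policy?")
answer("what is  our refund policy?")
print(model_calls)  # 1: the second query was served from the cache
```

An in-process `lru_cache` only helps within a single worker and never expires entries, so a production system would more likely use a shared cache with a time-to-live. A run that returns with `over_budget` set to true is a natural place to emit the alert described in the last bullet.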
The Cost-Quality Trade-off
Cheaper isn't always better. A cheaper model that gives poor answers damages trust.
Key Principle
The goal is the lowest cost that still delivers acceptable quality. Establish your quality threshold first, then optimise cost to meet it, not the other way around.
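The principle can be expressed as a two-step selection: filter by quality first, then minimise cost among what remains. The sketch below assumes you already have a quality score per model from your own evaluation set and a measured cost per task; the `Candidate` type, the model labels, and every number are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    name: str                 # hypothetical model label
    quality: float            # score on your own evaluation set, 0.0 to 1.0
    cost_per_task_usd: float  # measured average cost per task


def cheapest_acceptable(
    candidates: list[Candidate], quality_threshold: float
) -> Optional[Candidate]:
    """Apply the quality threshold first, then pick the cheapest survivor."""
    acceptable = [c for c in candidates if c.quality >= quality_threshold]
    # default=None signals that no candidate meets the threshold.
    return min(acceptable, key=lambda c: c.cost_per_task_usd, default=None)


candidates = [
    Candidate("small", quality=0.78, cost_per_task_usd=0.002),
    Candidate("medium", quality=0.91, cost_per_task_usd=0.010),
    Candidate("large", quality=0.96, cost_per_task_usd=0.045),
]

choice = cheapest_acceptable(candidates, quality_threshold=0.90)
print(choice.name)  # medium
```

Doing it the other way around, fixing a cost ceiling and taking the best quality under it, would select "small" for any ceiling below $0.010 per task, even though it falls short of the 0.90 threshold.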
