Part 3: Production

Chapter 19: Operating Agentic Systems

Production agents need ongoing care and feeding. They're not "set and forget."

Continuous Improvement

Your agent should get better over time.

Change Management

Changes to agent DNA can have unexpected effects.
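
One way to catch unexpected effects is to run a fixed set of regression cases against both the current agent and the changed one before promoting the change. The sketch below is a minimal illustration under stated assumptions: an agent is modeled as a plain function from input text to output text, and the names pass_rate, safe_to_promote, and max_drop are hypothetical, not part of any particular framework.

```python
from typing import Callable, Sequence, Tuple

# Assumption: a regression case is an (input, expected output) pair.
Case = Tuple[str, str]
Agent = Callable[[str], str]


def pass_rate(agent: Agent, cases: Sequence[Case]) -> float:
    """Fraction of cases where the agent's output matches the expected output."""
    if not cases:
        raise ValueError("at least one regression case is required")
    passed = sum(1 for prompt, expected in cases if agent(prompt) == expected)
    return passed / len(cases)


def safe_to_promote(
    baseline: Agent,
    candidate: Agent,
    cases: Sequence[Case],
    max_drop: float = 0.0,
) -> bool:
    """Allow the change only if the candidate's pass rate has not fallen
    more than max_drop below the baseline's pass rate."""
    return pass_rate(candidate, cases) >= pass_rate(baseline, cases) - max_drop
```

Exact string matching is a simplification. Real agent outputs usually need a looser comparison, but the gate itself stays the same: measure before and after, and block the change if quality drops.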

Incident Response

When things go wrong, have a plan:

  1. Detect: Know something is wrong (via monitoring)
  2. Triage: Assess severity and impact
  3. Mitigate: Stop the bleeding (disable agent if necessary)
  4. Investigate: Understand root cause
  5. Fix: Address the underlying issue
  6. Learn: Update processes to prevent recurrence
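
The six steps above can be sketched as a small record that moves through the stages in order, with a kill switch for the mitigation step. This is an illustrative sketch, not a prescribed implementation: Stage, Incident, AgentSwitch, and mitigate are hypothetical names, and the rule that only "high" severity disables the agent is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class Stage(Enum):
    DETECT = 1
    TRIAGE = 2
    MITIGATE = 3
    INVESTIGATE = 4
    FIX = 5
    LEARN = 6


@dataclass
class Incident:
    summary: str
    severity: str = "unknown"  # set during triage
    stage: Stage = Stage.DETECT
    log: List[Tuple[str, str]] = field(default_factory=list)

    def advance(self, note: str) -> Stage:
        """Record a note for the current stage, then move to the next one."""
        self.log.append((self.stage.name, note))
        if self.stage is not Stage.LEARN:
            self.stage = Stage(self.stage.value + 1)
        return self.stage


class AgentSwitch:
    """Hypothetical kill switch that the agent checks before acting."""

    def __init__(self) -> None:
        self.enabled = True
        self.reason: Optional[str] = None

    def disable(self, reason: str) -> None:
        self.enabled = False
        self.reason = reason


def mitigate(incident: Incident, switch: AgentSwitch) -> None:
    """Stop the bleeding: disable the agent when severity is high (assumed rule)."""
    if incident.stage is not Stage.MITIGATE:
        raise ValueError("incident must be triaged before mitigation")
    if incident.severity == "high":
        switch.disable(incident.summary)
        incident.advance("agent disabled")
    else:
        incident.advance("agent left running")
```

Keeping a per-stage log means the Learn step has a written trail to work from instead of relying on memory.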

Governance and Accountability

Agentic systems make decisions. Someone needs to be accountable.

Operational Reality

Plan for ongoing operations from day one. The work doesn't end at deployment — that's when the real work begins.
