I Got Certified in Agentic AI — After a Year of Shipping It // yokai

DeepLearning.AI Agentic AI certificate awarded to Jacob Leyva

The Short Version

I completed DeepLearning.AI's Agentic AI program and earned the certificate above. Going in, I had a hypothesis: I'd already been building agentic agents and orchestrators in production for paying work, so most of the curriculum would be naming things I'd already shipped. That turned out to be largely true — with one genuinely useful exception I'll get to.

This post is the honest version. Not "I learned agentic AI." More like: here's a field that finally has shared vocabulary for patterns I'd been building by feel, and here's the one area where the program made me sharper.

What the Program Covered

Stripped of branding, the labs walked through the core agentic design patterns:

Tool-calling / ReAct loops — give a model a set of tools, let it choose which to call automatically, run the tool, feed the result back, and repeat until it returns a final answer. The canonical agent loop.
Task decomposition & planning — break a goal into discrete steps where each step is something an LLM, a function call, or a short piece of code can actually do. Including the more autonomous version: let the model write the plan (even as executable code) instead of hard-coding the sequence.
Multi-agent workflows — instead of one model prompted over and over, a team of specialized agents (planner, researcher, writer, editor) that each own a slice of the task and hand off to each other.
Reflection — have a model critique its own first draft and revise it, rather than shipping the first thing it produces.
Code execution — let the model write and run code to solve a task, instead of building a separate tool for every possible operation.
Evals, error analysis, and cost/latency — how to actually measure whether any of the above is working, where to focus improvement effort, and when to optimize for speed and cost.

The capstone tied it together: a research agent that searches arXiv, the web, and Wikipedia through tool-calling, drafts a report, reflects on it, and produces a polished result.

What I'd Already Built

Here's why the hypothesis held. Before this program I'd already shipped, for real client and personal work:

Multi-agent orchestrators — systems that coordinate several specialized agents through a structured task loop, with persistent run logs for every job. The "team of agents" pattern, in production.
Tool-using agents with real execution — agents that don't just describe an action but actually run shell commands and read and write files against a real workspace. The course's tool loop is a sandboxed version of this; mine execute for real.
Self-hosted LLM backends — a private, team-shared model server I tuned for throughput, so the agents run against models I control rather than a metered API.
A compliance assistant for a regulated lender — a domain-specific agentic system with cited retrieval, role-locked access, and a swappable model backend, taken from idea to live in weeks.

Several of those are written up elsewhere on this site. The point isn't that the course taught me these — it's that the field's curriculum and my shipped work converged on the same patterns. That convergence is the most reassuring signal you can get as a builder: the things you reached for under deadline pressure turn out to be the documented best practices.

The single biggest lesson of the program — separate the orchestration layer from the model's reasoning — is the exact architectural decision I'd already committed to. The system runs the loop; the model decides the next move inside it. Building that distinction first, then hearing it taught as the key insight, was the most satisfying part of the whole program.

What the Cert Actually Added

If I claimed I learned nothing, that'd be ego talking. The genuinely additive part was evaluation discipline.

I've always been strong on building agentic systems. The program's contribution was the measurement rigor around them:

Error analysis — reading real outputs, tagging where the system falls short, and using that to decide which component to improve next instead of guessing.
Eval-driven iteration — building lightweight evals from observed failures rather than imagining failure modes up front.
Quantifying reflection — reflection usually helps, but it costs an extra step. The discipline is to actually measure how much it helps before committing to it, rather than assuming.
Cost and latency tradeoffs — when to chase output quality first (almost always) and when optimizing per-request cost and speed becomes the real problem (once you have the users to make it one).

That's the part I'm carrying forward. Building agentic systems is a skill I already had. Proving they work, finding where they break, and knowing where to spend the next hour of effort — that's the muscle this program built.

Why It Matters

For anyone hiring or partnering: this is a credential that confirms what the rest of this projects page already shows. I've been building agentic and multi-agent systems in production, and now there's an independent stamp on the methodology behind them.

For other builders: don't wait for permission to build the real thing. The patterns aren't gatekept — tool loops, decomposition, reflection, and orchestration are learnable by doing. But once you've built them, go get disciplined about evals. That's the difference between a demo that impresses once and a system you can actually trust in front of users.

I built first. The cert formalized it. Both were worth doing.