Insights · Production AI

Architecture Insights

Deep technical pieces from real production experience. No content marketing. No sponsored posts.

RAG · Production

Why 95% of RAG Demos Fail in Production

The seven failure modes that kill RAG systems between the notebook demo and production — with the exact fixes.

12 Jun 2026 · 9 min read


Protocols · MCP · A2A

MCP vs A2A: Which Protocol Should Your Enterprise Bet On?

MCP and A2A solve different problems — one connects agents to tools, the other connects agents to agents. A practical decision guide for enterprise architects, from someone running both in production.

10 Jun 2026 · 7 min read


Agentic AI · Architecture

AI Agents vs Agentic AI vs Workflows: What Your Enterprise Actually Needs

Most "agentic AI" projects fail because they use an agent where a workflow belongs. A field guide to choosing between workflows, single agents, and multi-agent systems, with the autonomy ladder I use with clients.

03 Jun 2026 · 6 min read


Security · AI Agents

Prompt Injection: Your AI Agent's Biggest Production Risk

The moment your agent reads untrusted content and holds real tools, every document becomes a potential attacker. The layered defense stack we deploy for financial-services agents — because there is no single fix.

27 May 2026 · 7 min read


RAG · Fine-Tuning · Architecture

RAG vs Fine-Tuning vs Long Context: The 2026 Decision Framework

Knowledge, behaviour, or workspace? Most teams pick the wrong technique because they never separated the three. The decision framework I use on enterprise engagements, with the cost curves that decide it.

20 May 2026 · 7 min read


Agentic AI · Orchestration

Multi-Agent Orchestration Patterns That Survive Production

In 2026, orchestration — not model choice — is where enterprise agentic value is won or lost. The supervisor pattern, parallel fan-out, and the failure-recovery machinery that separates systems from demos.

13 May 2026 · 7 min read


LLMOps · Cost

The Real Cost of LLMs in Production: Token Economics for Enterprises

The demo cost ₹40 per query and nobody noticed. At 50,000 queries a day, the CFO noticed. The cost-optimization stack (routing, caching, context discipline) that cuts AI bills by 60–90% without losing accuracy.

06 May 2026 · 7 min read


Evals · LLMOps

Evaluating AI Agents: Beyond Vibes and Demos

"We tried ten prompts and it looked good" has launched a thousand failing agent systems. How to build the evaluation harness — golden sets, trajectory checks, LLM judges, and gates — that lets you ship agents you can defend.

29 Apr 2026 · 7 min read


Agentic AI · Memory

Agent Memory: The Hardest Unsolved Problem in Agentic AI

Persistent memory is what separates an agent from a goldfish with tools — and it's where most agentic architectures quietly fall apart. Working, episodic, and semantic memory, and the governance problem nobody prices in.

22 Apr 2026 · 7 min read


Azure AI Foundry · Review

Azure AI Foundry: The Honest Enterprise Review

After deploying a bank compliance system on Azure AI Foundry: what genuinely works, what's marketing, and the gotchas the documentation doesn't mention. An architect's field review.

15 Apr 2026 · 8 min read


Fine-Tuning · Case Study

LoRA Fine-Tuning for Finance: How We Gained 31 Points Over the Base Model

How we fine-tuned a model on 12K RBI compliance examples and beat the base model by 31 percentage points — and why the win came from data preparation, not the training run.

08 Apr 2026 · 8 min read

