Spec27 Blog

Insights on AI agent validation, LLM evaluation, regression testing, and safer deployment of AI systems in production.

We write about how teams can move beyond manual prompt testing and vibes-based reviews toward repeatable, evidence-driven validation for AI agents and applications.


How to Automate AI Agent Validation

In a hurry? There's a preloaded example you can run in CI today: jump to how you can try this yourself.

Michael Wagstaff

Jun 23 · 5 min read

Alessio Lomuscio, CTO of Safe Intelligence, presenting Spec27

Spec27 at Google DeepMind Startup Sessions

On 11 June, we joined the Google DeepMind Startup Sessions at the Ministry of Sound in London – a morning of talks, demos, and live pitches bringing together AI startups and the Google DeepMind community. The lineup included a talk by Dr Raia Hadsell, a recap of Google’s annual developer conference, and a pitch showcase featuring around ten selected startups. We were glad to be one of them, presenting Spec27, our product for validating how AI agents behave before deployment a

Jovanca Garnadi

Jun 16 · 2 min read

Michal Cichra talks on BDD ADR PRD WTF: Capturing Decisions for Humans and AI Alike

BDD, ADR, PRD, WTF: Capturing Decisions for Humans and AI Alike

At AI Engineer Europe 2026, Spec27 Principal Engineer Michal Cichra explored why engineering decisions need to be captured clearly for both human teams and AI systems. Watch the talk and learn how BDD, ADRs, and PRDs connect to specification-driven validation.

Jovanca Garnadi

Jun 5 · 1 min read

How to Evaluate Third-Party AI Agents Without Code Access

Third-party AI agents can help teams deploy faster, but they also create a validation challenge: the buyer owns the operational risk without full visibility into the internals. This guide explains how to evaluate vendor-built AI agents from the outside using specifications, realistic test cases, pass criteria, re-validation, and evidence.

Jovanca Garnadi

Jun 3 · 8 min read

The Five Types of Enterprise AI Agents

How many agents do we have? And when do they start hiring their own agents? AI agents are taking the enterprise by storm (at least on executive roadmaps!), and many leading AI tech companies are rolling out frameworks and platforms for them. It can be hard to keep track of what works and what doesn't. There's huge value in many of the deployments, but each also presents its own challenges. As the number of agents increases the interactions between t

Steven Willmott

May 13 · 8 min read