TL;DR

A recent whitepaper emphasizes that in AI-assisted software engineering, the actual AI model is only a small part of the system. The focus should be on the harness and context engineering, which drive performance and cost-efficiency.

A new Google whitepaper, The New SDLC With Vibe Coding, states that the AI model accounts for only about 10% of a system’s behavior. The paper emphasizes that the real skill lies in configuration, verification, and context engineering, which collectively determine system performance and cost. This insight challenges common assumptions about AI development and has significant implications for how teams allocate resources and design workflows.

The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, argues that most AI agent failures stem from configuration errors (missing tools, vague rules, or noisy context) rather than from the underlying model. Experiments cited in the paper show that changing the harness (prompts, tools, and middleware) can improve agent performance, often more than upgrading the model itself. For example, a coding agent moved from outside the Top 30 to the Top 5 on Terminal Bench 2.0 by modifying only the harness, not the model.

The authors argue that costs are driven by token consumption, which makes disciplined, structured approaches more economical in the long run. Vibe coding looks free at first but incurs high operating costs from token waste, maintenance, and security vulnerabilities, eventually costing 3–10× more per feature. Investing upfront in schema design, testing, and context management instead reduces marginal costs over time.
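The crossover between the two cost profiles can be illustrated with a little arithmetic. The sketch below is a toy model, not the paper's: the class name, the function name, and every dollar figure are hypothetical, and the 5× gap in per-feature cost was chosen only because it falls inside the 3–10× range the paper cites.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CostProfile:
    """Hypothetical cost model: a one-time setup cost plus a flat cost per feature."""
    name: str
    upfront: float      # CapEx: schema design, tests, context setup
    per_feature: float  # OpEx: tokens, rework, remediation per shipped feature

    def total(self, features: int) -> float:
        return self.upfront + self.per_feature * features


def crossover(low_capex: CostProfile, high_capex: CostProfile,
              max_features: int = 1000) -> Optional[int]:
    """Return the first feature count at which high_capex is strictly cheaper."""
    for n in range(1, max_features + 1):
        if high_capex.total(n) < low_capex.total(n):
            return n
    return None  # never catches up within max_features


# Illustrative numbers only; none of these come from the whitepaper.
vibe = CostProfile("vibe coding", upfront=0.0, per_feature=500.0)
agentic = CostProfile("agentic engineering", upfront=4000.0, per_feature=100.0)

break_even = crossover(vibe, agentic)
print(break_even)  # 11: the two are tied at 10 features, agentic wins from 11 on
```

With these assumed figures the structured approach costs more for the first ten features and less for every feature after that, which is the shape of the argument even though real numbers will differ by team.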

At a glance
Report
When: published March 2026
The development: The new SDLC framework shifts focus from the AI model to configuration, verification, and context management as the primary drivers of successful AI integration in software development.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified
Vibe Coding: casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted: detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering: formal specs · automated tests + evals + CI gates · production scale · low risk
Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
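The distinction between tests and evals can be sketched in a few lines. Everything here is an assumption made for illustration: the `slugify` helper, the keyword-based scorer, the stub agent, and the 0.8 gate are hypothetical and not taken from the whitepaper. A real eval would call an actual model and use a richer rubric.

```python
from typing import Callable, List, Tuple


def slugify(title: str) -> str:
    """Deterministic helper: the exact output is known, so a plain test is enough."""
    return "-".join(title.lower().split())


def test_slugify() -> None:
    # Test: one input, one exact expected output.
    assert slugify("The New SDLC") == "the-new-sdlc"


# Eval case: (prompt, keywords a good answer should mention).
EvalCase = Tuple[str, List[str]]


def keyword_score(answer: str, required: List[str]) -> float:
    """Fraction of required keywords that appear in the answer."""
    hits = sum(1 for kw in required if kw.lower() in answer.lower())
    return hits / len(required)


def run_eval(agent: Callable[[str], str], cases: List[EvalCase],
             min_score: float) -> float:
    """Return the share of cases whose score reaches min_score."""
    passed = sum(
        1 for prompt, required in cases
        if keyword_score(agent(prompt), required) >= min_score
    )
    return passed / len(cases)


def stub_agent(prompt: str) -> str:
    """Stand-in for a model call; always gives the same free-form answer."""
    return "Use parameterized queries and validate input."


CASES: List[EvalCase] = [
    ("How do I avoid SQL injection?", ["parameterized", "validate"]),
    ("How should secrets be stored?", ["vault", "environment"]),
]

test_slugify()
pass_rate = run_eval(stub_agent, CASES, min_score=1.0)
ci_gate_ok = pass_rate >= 0.8  # hypothetical CI threshold
print(pass_rate, ci_gate_ok)   # 0.5 False
```

The test either passes or fails on an exact value, while the eval produces a rate that a CI gate compares against a threshold. Here the stub agent answers one of two prompts adequately, so the gate blocks the change.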
The idea worth building your strategy around
Agent = Model + Harness
Model: ~10%
Harness (prompts · tools · context · hooks · sandboxes · observability): ~90%, and it is your surface area, not the provider’s
Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness — same model.
“Most agent failures, examined honestly, are configuration failures” — a missing tool, a vague rule, a noisy context.
The economics: it’s a token-cost problem (CapEx vs OpEx)
Vibe Coding: Low CapEx · High OpEx. Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering: High CapEx · Low OpEx. Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success and intelligent model routing (cheap models for the easy work).
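Model routing, the second lever above, might look something like the following. This is a minimal sketch under stated assumptions: the tier names, the prices, the set of "easy" task kinds, and the token limit are all hypothetical, and the paper does not prescribe a specific routing rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # hypothetical price, not a real vendor rate


CHEAP = ModelTier("small-model", 0.10)
STRONG = ModelTier("large-model", 2.00)

# Hypothetical task kinds considered safe for the cheap tier.
EASY_TASKS = {"rename", "format", "docstring"}


def route(task_kind: str, context_tokens: int, token_limit: int = 8000) -> ModelTier:
    """Send easy, small-context work to the cheap tier; everything else to the strong one."""
    if task_kind in EASY_TASKS and context_tokens <= token_limit:
        return CHEAP
    return STRONG


def estimated_cost(tier: ModelTier, tokens: int) -> float:
    return tier.cost_per_1k_tokens * tokens / 1000


for kind, tokens in [("format", 1200), ("refactor", 1200), ("format", 20000)]:
    tier = route(kind, tokens)
    print(kind, tokens, tier.name, round(estimated_cost(tier, tokens), 2))
```

A static rule like this is the simplest possible router. The design choice it illustrates is that the routing decision lives in the harness, so it can be changed without touching either model.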
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR): verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.
thorstenmeyerai.com

Implications for AI Development Strategies

This whitepaper fundamentally shifts the understanding of AI system design, emphasizing that the real value lies in how AI is integrated and controlled, not in the model itself. Organizations that focus on harness and context engineering can achieve better performance at lower costs, gaining a durable competitive advantage. It also suggests that AI teams should prioritize configuration, tooling, and verification over chasing the latest models.

Background on the Evolving AI Development Paradigm

As AI adoption accelerates, many teams have historically focused on acquiring or upgrading models, assuming performance improvements come primarily from better models. However, recent experiments and industry reports indicate that system behavior is dominated by how models are integrated and guided. The whitepaper builds on earlier insights from Andrej Karpathy and others, reinforcing that the shift towards ‘agentic engineering’—structured, verified, and well-scaffolded AI workflows—is now central to successful AI deployment.

“The model is only 10% of what determines behavior; the harness and context are 90%.”

— Addy Osmani

Unclear Aspects of Implementation and Industry Adoption

It is not yet clear how widely organizations will adopt the recommended practices of harness and context engineering. The specific methods for scaling these approaches across diverse projects and teams are still being developed. Additionally, the long-term impact on AI model development cycles remains to be seen, as some industry players may continue to prioritize model upgrades.

Next Steps for AI Teams and Developers

Organizations should evaluate their current AI workflows, focusing on improving harness and context management. Future research and industry discussions are likely to explore standardized frameworks for structured AI integration, along with tools that facilitate better configuration and verification. Monitoring how these practices influence cost and performance over time will be critical.

Key Questions

Why is the model only 10% of system behavior?

The whitepaper explains that most of the system’s behavior depends on how the model is integrated, configured, and guided through prompts, tools, and verification processes, which constitute the remaining 90%.

What is harness engineering in AI development?

Harness engineering involves designing and managing the prompts, tools, rules, and observability layers that control how the AI model operates within a system, significantly impacting performance and reliability.
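To make "Agent = Model + Harness" concrete, the sketch below runs the same stub model inside two harnesses. All names (`Harness`, `stub_model`, `run_tests`) are hypothetical, and the model is a fixed function rather than a real LLM call; the point is only that the outcome changes when the harness changes and the model does not.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical model interface: takes a prompt, returns the name of a tool to call.
Model = Callable[[str], str]


@dataclass
class Harness:
    system_rules: str                                   # rules / prompt layer
    tools: Dict[str, Callable[[], str]] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)        # observability layer

    def run(self, model: Model, task: str) -> str:
        prompt = f"{self.system_rules}\nTask: {task}"
        requested = model(prompt)
        self.log.append(f"model requested tool: {requested}")
        tool = self.tools.get(requested)
        if tool is None:
            # A configuration failure, not a model failure.
            self.log.append("configuration failure: tool not registered")
            return f"FAILED: missing tool {requested}"
        return tool()


def stub_model(prompt: str) -> str:
    """Stand-in for an LLM: always asks to run the test suite."""
    return "run_tests"


bare = Harness("Be careful.")
full = Harness(
    "Always run the test suite before answering.",
    tools={"run_tests": lambda: "42 passed"},
)

print(bare.run(stub_model, "fix the login bug"))  # FAILED: missing tool run_tests
print(full.run(stub_model, "fix the login bug"))  # 42 passed
```

The first run fails because a tool is missing, which is the kind of configuration failure the whitepaper describes, and the log records why.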

How does cost factor into this new approach?

Cost is driven by token consumption. Disciplined harness and context management cut unnecessary token usage, lowering operating costs relative to vibe coding, which looks cheap at first but costs more in the long run.

Will this change how AI models are developed?

While model development continues, the whitepaper suggests that the focus should shift towards better system integration, configuration, and verification to maximize value and control costs.

Is this approach applicable to all AI systems?

The principles are broadly applicable, especially in professional and enterprise AI deployments where performance, cost, and reliability are critical.

Source: ThorstenMeyerAI.com
