TL;DR

A recent Google whitepaper argues that the AI model accounts for only about 10% of an AI coding agent's behavior. The rest comes from the harness (structured context, tools, and configuration) and from verifying outputs, which has significant implications for development costs and strategy.

A new Google whitepaper, The New SDLC With Vibe Coding, argues that the AI model accounts for only about 10% of an agent's behavior. The focus should instead be on the harness (the prompts, tools, and configurations surrounding the model) and on verification. This shift has major implications for how organizations deploy and manage AI in development workflows.

The paper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, states that the dominant part of AI-driven development is not the model itself but the configuration, tools, and policies that guide its behavior. Experiments cited in the paper show that adjusting the harness (prompts, rules, and middleware) can dramatically improve performance with the same underlying model. In one cited example, a coding agent moved from outside the Top 30 to the Top 5 on Terminal Bench 2.0 through harness changes alone.

Furthermore, the paper distinguishes between vibe coding—quick, minimal prompts—and agentic engineering, which involves structured verification, testing, and oversight. The authors argue that verification, judgment, and context engineering are the new craft, shifting the focus from model innovation to configuration and process management.

At a glance
When: published May 2026
The development: The Google whitepaper highlights that the core of effective AI-assisted development lies in harnessing and verifying the AI, not just the model itself.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified

Vibe Coding: casual prompts, a “does it seem to work?” check, disposable code, high risk.
Structured AI-Assisted: detailed prompts plus constraints, manual testing, features in real codebases.
Agentic Engineering: formal specs; automated tests, evals, and CI gates; production scale; low risk.

Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
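
The "tests plus evals" rule can be illustrated with a small gate function. This is a minimal sketch, not the paper's implementation: the names `GateResult` and `verification_gate` and the 0.8 threshold are assumptions made for illustration only. It accepts a change only when every deterministic test passes and the mean eval score clears the threshold.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class GateResult:
    tests_passed: bool
    eval_score: float
    accepted: bool


def verification_gate(
    tests: Sequence[Callable[[], bool]],
    eval_scores: Sequence[float],
    min_eval_score: float = 0.8,  # hypothetical threshold, tune per project
) -> GateResult:
    """Accept only if all deterministic tests pass AND evals clear the bar."""
    # Deterministic checks: every test callable must return True.
    tests_passed = all(test() for test in tests)
    # Non-deterministic checks: average the graded eval scores (0.0 to 1.0).
    # No eval scores at all counts as a failing score, so evals cannot be skipped.
    eval_score = sum(eval_scores) / len(eval_scores) if eval_scores else 0.0
    accepted = tests_passed and eval_score >= min_eval_score
    return GateResult(tests_passed, eval_score, accepted)


if __name__ == "__main__":
    result = verification_gate([lambda: 1 + 1 == 2], [0.9, 0.9])
    print(result.accepted)
```

Treating an empty eval list as a failure is a deliberate choice in this sketch: passing tests alone would otherwise slip through the gate.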
The idea worth building your strategy around

Agent = Model + Harness

Model: ~10%
Harness: ~90%. Prompts, tools, context, hooks, sandboxes, observability. This is your surface area, not the provider’s.

Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness, with the same model.
“Most agent failures, examined honestly, are configuration failures”: a missing tool, a vague rule, a noisy context.
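
One way to picture Agent = Model + Harness is as two separate objects, where only the harness changes between runs. The sketch below is an assumption-laden illustration: `Harness`, `Agent`, `build_prompt`, and the stand-in `echo_model` are hypothetical names, not an API from the paper or from any framework.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Any text-in, text-out callable can play the role of the model.
Model = Callable[[str], str]


@dataclass
class Harness:
    """Everything around the model: prompt, rules, and available tools."""
    system_prompt: str
    rules: List[str] = field(default_factory=list)
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def build_prompt(self, task: str) -> str:
        # Assemble the context the model will actually see.
        parts = [self.system_prompt]
        parts += [f"Rule: {rule}" for rule in self.rules]
        if self.tools:
            parts.append("Tools: " + ", ".join(sorted(self.tools)))
        parts.append(f"Task: {task}")
        return "\n".join(parts)


@dataclass
class Agent:
    model: Model
    harness: Harness

    def run(self, task: str) -> str:
        return self.model(self.harness.build_prompt(task))


def echo_model(prompt: str) -> str:
    """Stand-in model: reports how many lines of context it received."""
    return f"received {len(prompt.splitlines())} lines"


if __name__ == "__main__":
    bare = Agent(echo_model, Harness("You are a coding agent."))
    tuned = Agent(
        echo_model,  # same model, different harness
        Harness(
            "You are a coding agent.",
            rules=["Run the tests before finishing."],
            tools={"grep": lambda query: ""},
        ),
    )
    print(bare.run("fix bug"))
    print(tuned.run("fix bug"))
```

The point of the split is that a missing tool or a vague rule shows up as a diff in the `Harness` object, while the `model` field stays untouched.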
The economics: it’s a token-cost problem (CapEx vs OpEx)

Vibe Coding: low CapEx, high OpEx. Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering: high CapEx, low OpEx. Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success, plus intelligent model routing that sends the easy work to cheap models.
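
Model routing, named above as a cost lever, can be as simple as a heuristic that sends easy tasks to a cheaper model. The following is a rough sketch under stated assumptions: the tier names, prices, keyword list, and file-count threshold are all invented for illustration and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # hypothetical price, in arbitrary units


# Both tiers are placeholders, not real model names or prices.
CHEAP = ModelTier("small-model", 0.1)
STRONG = ModelTier("large-model", 2.0)

# Assumed signals that a task needs the stronger model.
HARD_KEYWORDS = ("refactor", "migrate", "security", "architecture")
MAX_FILES_FOR_CHEAP = 3


def route(task: str, touches_files: int) -> ModelTier:
    """Pick a model tier from simple, inspectable signals."""
    mentions_hard_work = any(word in task.lower() for word in HARD_KEYWORDS)
    is_hard = touches_files > MAX_FILES_FOR_CHEAP or mentions_hard_work
    return STRONG if is_hard else CHEAP


if __name__ == "__main__":
    print(route("Rename a variable", touches_files=1).name)
    print(route("Refactor the auth module", touches_files=2).name)
```

A real router would likely use richer signals, such as past first-pass success rates per task type, but keeping the rule explicit makes routing decisions easy to audit.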
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR): verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.

Implications for AI Development Strategies

This insight shifts the strategic focus for organizations adopting AI. Instead of investing heavily in the latest models, companies should prioritize building robust harnesses—the configurations, tools, and verification processes that shape AI behavior. This approach can lead to significant cost savings and more reliable outcomes, as the paper notes that ad-hoc prompting can cost 3–10 times more per feature than disciplined, structured development.
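
The 3–10× figure can be turned into a rough break-even estimate. This sketch assumes a simple linear cost model (total cost = upfront cost + per-feature cost × number of features); the function names and all dollar amounts are hypothetical, and the 5× multiplier is just one point inside the paper's stated range.

```python
import math


def total_cost(capex: float, opex_per_feature: float, features: int) -> float:
    """Linear cost model: upfront investment plus a fixed cost per feature."""
    return capex + opex_per_feature * features


def break_even_features(
    capex_vibe: float,
    opex_vibe: float,
    capex_agentic: float,
    opex_agentic: float,
) -> int:
    """Smallest feature count at which the agentic approach costs no more."""
    if opex_vibe <= opex_agentic:
        raise ValueError("agentic OpEx must be lower for a break-even to exist")
    extra_capex = capex_agentic - capex_vibe
    if extra_capex <= 0:
        return 0  # agentic is cheaper from the first feature
    return math.ceil(extra_capex / (opex_vibe - opex_agentic))


if __name__ == "__main__":
    # Hypothetical numbers: 100 per feature when disciplined, 5x that ad hoc,
    # and 10,000 spent upfront on specs, evals, and context.
    n = break_even_features(
        capex_vibe=0, opex_vibe=500, capex_agentic=10_000, opex_agentic=100
    )
    print(n)  # 25
    print(total_cost(0, 500, n), total_cost(10_000, 100, n))
```

With these made-up inputs the upfront investment pays for itself at 25 features; a higher multiplier or a smaller upfront cost moves that point earlier.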

By understanding that the model is only 10% of the equation, organizations can better allocate resources, reduce vulnerabilities, and improve the quality of AI-generated code and decisions. This paradigm encourages a shift from model chasing to process engineering, which has long-term benefits for scaling AI in enterprise settings.

Evolution of AI Coding and Development Practices

The whitepaper builds on recent trends: by the paper's figures, 85% of developers now use AI coding agents (51% daily), and roughly 41% of all new code is AI-generated. Prior to 2026, the focus was on model improvements; now, the emphasis has shifted to how models are integrated and controlled within workflows. The concept of vibe coding, popularized in early 2025, is contrasted with the emerging discipline of agentic engineering, which emphasizes structure, testing, and verification.

This development reflects a broader industry understanding that AI’s value depends heavily on how it is used and controlled, not just the underlying technology. The experiments cited in the paper demonstrate that small changes in configuration can produce outsized improvements in AI performance, reinforcing the importance of process over raw model power.

“The biggest shift in software engineering isn’t a new language or framework; it’s moving from writing code to expressing intent and trusting machines to realize that intent.”

— Addy Osmani

Unexplored Aspects of Implementation and Cost

While the paper presents compelling evidence that harness and verification are critical, it does not specify exact best practices for different types of projects or organizational sizes. The long-term impact on cost savings and security remains to be validated across diverse real-world scenarios. Additionally, the optimal balance between upfront configuration costs and ongoing operational expenses is still under discussion.

Next Steps for Organizations Adopting AI-Driven Development

Organizations should begin assessing their current AI workflows to identify configuration and verification bottlenecks. Developing structured harnesses, investing in testing frameworks, and training teams in context engineering will be key. Further research and case studies are expected to clarify best practices and quantify long-term benefits, especially around cost and security.

Key Questions

Why is the model only 10% of the AI development process?

The whitepaper argues that the model’s behavior is heavily influenced by how it is configured, guided, and verified. The surrounding harness (prompts, tools, rules) accounts for roughly 90% of agent behavior, making it the primary driver of performance and reliability.

How can organizations improve their AI outputs without changing models?

By focusing on harness design—adjusting prompts, adding tools, setting rules—and implementing rigorous verification and testing processes, organizations can significantly enhance AI performance using the same underlying models.

What are the risks of neglecting harness and verification?

Neglecting these aspects can lead to higher costs, security vulnerabilities, and unreliable outputs, as most failures are configuration-related rather than model-related, according to the whitepaper.

Does this mean AI models are becoming less important?

Not less important, but the whitepaper highlights that their impact is limited without proper harnessing, configuration, and verification. The focus shifts from model innovation to process engineering.

What should development teams prioritize next?

Teams should prioritize building and refining their harnesses—prompts, tools, rules—and establishing robust verification processes to maximize AI effectiveness and control costs.

Source: ThorstenMeyerAI.com
