Prompt & Armature Method: A Discipline for AI-Assisted Feature Development
Introduction
The most interesting use (to me) of an AI coding agent like Claude Code has always been on a brownfield project - a large, complex existing codebase where any new contributor is facing a steep learning curve. Such an AI assistant can feel like a great power tool: when used with clear direction, it delivers reliable implementation, but without that structure, it tends to amplify inconsistencies already present in the codebase.
In the past few months a number of structured methodologies have emerged and begun to mature - most notably Agentic TDD and Advanced Context Engineering. Each has its strengths and trade-offs, and I've used both (and others) with varying degrees of success. Along the way, some of my early assumptions about LLM-assisted engineering have been challenged, and occasionally overturned. The experience has been humbling.
Through this process, I discovered a pattern that consistently works for brownfield projects, one that achieves a surprisingly high degree of reliability, predictability, and accuracy.
Spec to Armature
It always starts with an engineering spec - but I don’t feed that into an LLM and expect miracles. Instead, I design the armature first: every class and its single public method, the inputs and outputs, and the contracts between components. I don’t write code at this stage, not even pseudocode. Inside each class, I leave a comment describing its purpose and the I/O for its public method. If the class touches the database or calls an external API, I specify the data structures and request/response format with absolute clarity.
I apply the Single Responsibility Principle, wherein each class does exactly one thing. If your style differs, you can apply that granularity at the method level, but the goal remains the same: clear boundaries, explicit contracts, and predictable entry points.
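As a sketch, an armature entry might look like the following. This is plain illustrative PHP, not taken from the article's spec: the class name, namespace, and the exception mentioned in the docblock are all hypothetical.

```php
<?php

namespace App\Domain\Subscriptions\Services;

/**
 * Purpose: synchronise a single user's subscriptions to the external CRM.
 *
 * Input:  int $userId - the local user identifier.
 * Output: void - throws a CrmSyncException (hypothetical) on any
 *         non-recoverable failure.
 */
final class SyncUserSubscriptionsService
{
    public function run(int $userId): void
    {
        // Intentionally empty at the armature stage: the LLM builds the
        // body strictly from the contract in the docblock above.
    }
}
```

Note that the class body stays empty on purpose; the docblock is the deliverable at this stage.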
Facades and Orchestrators
This is the most critical layer of any feature. The composition tier is made up of facades and orchestrators, and their specifications must be written with absolute precision. Rigour is the objective here: any engineer or AI agent should be able to fully understand the control interface of a system simply by reading them.
Document every dependency and interaction within their public methods — which subordinate classes are invoked, in what order and under which preconditions. Capture the pre-flight checks, strategy selections, side-effect boundaries, and failure-handling logic.
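A sketch of that level of documentation on an orchestrator's public method might read as follows. Every class name and step here is illustrative, not part of the original spec.

```php
<?php

namespace App\Domain\Subscriptions\Orchestrators;

/**
 * Orchestrates the full subscription sync flow.
 */
final class SubscriptionSyncOrchestrator
{
    /**
     * Control flow (executed in order):
     *  1. Pre-flight: FeatureFlagChecker::run() - abort silently if the
     *     sync flag is off.
     *  2. GetLatestSubscriptionRepo::perform($userId) - returns the newest
     *     subscription row or null.
     *  3. Strategy selection: a null result routes to MarkInactiveService;
     *     otherwise PushToCrmService.
     *  4. Side-effect boundary: nothing before step 2 writes to the
     *     database or the network.
     *  5. Failure handling: any CRM error is logged and re-thrown; local
     *     DB errors abort the whole run.
     */
    public function run(int $userId): void
    {
        // Empty at the armature stage.
    }
}
```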
Tests
Every class from the armature deserves tests. I follow the same pattern: create an empty test class, then add multiple test methods - each with a single, clear purpose. This stage is harder than designing the armature because you must anticipate the scenarios each class needs to handle. At the very minimum, cover the happy path to guard against regressions, and add a minimal set of edge cases or error conditions you can already foresee. For each test method, include a brief comment explaining what you’re testing and how.
As expected, facade and orchestrator tests are more complex. They must cover every pathway, but are designed the same way: one test class per production class, with either multiple test methods or methods driven by data providers, depending on your style. If you use the latter, I recommend designing the data providers manually rather than delegating them to the LLM. In either case, confirming that all paths are covered is ultimately your responsibility.
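As a minimal sketch, assuming PHPUnit and the hypothetical service class from earlier, the test stubs might look like this. The comments carry the intent; the bodies are deliberately left incomplete for the agent.

```php
<?php

namespace Tests\Domain\Subscriptions;

use PHPUnit\Framework\TestCase;

final class SyncUserSubscriptionsServiceTest extends TestCase
{
    public function test_syncs_active_subscription(): void
    {
        // Happy path: a user with an active plan is pushed to the CRM
        // and the local sync timestamp is updated.
        $this->markTestIncomplete('Stub: to be implemented by the agent.');
    }

    public function test_marks_user_inactive_when_no_active_plan(): void
    {
        // Edge case: no active plan means the CRM record is flagged
        // inactive and no subscription payload is sent.
        $this->markTestIncomplete('Stub: to be implemented by the agent.');
    }

    public function test_surfaces_crm_failure(): void
    {
        // Error condition: a CRM-side failure must be logged and
        // re-thrown, leaving local state untouched.
        $this->markTestIncomplete('Stub: to be implemented by the agent.');
    }
}
```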
Database
Any new tables introduced within a feature are your responsibility. You are to build the migrations - though you can safely leave the model scaffolding to the LLM, since that’s largely mechanical code generation.
Your feature will also likely need to access existing tables. That should have been accounted for while designing the armature. The cleanest approach, in my experience, is to use dedicated repository classes for database access, though you can permit direct model access within service classes if that’s your style.
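A repository of that kind might be specified like the sketch below, assuming Laravel's Eloquent ORM and a hypothetical Subscription model. Unlike the armature stubs, here the docblock also pins down the exact query semantics.

```php
<?php

namespace App\Domain\Subscriptions\Repositories;

use App\Models\Subscription;

/**
 * Purpose: fetch the newest subscription for a user.
 * Input:  int $userId
 * Output: ?Subscription - null when the user has no subscriptions.
 */
final class GetLatestSubscriptionRepo
{
    public function perform(int $userId): ?Subscription
    {
        // All database access for this concern lives here, keeping the
        // service layer free of query logic.
        return Subscription::query()
            ->where('user_id', $userId)
            ->orderByDesc('created_at')
            ->first();
    }
}
```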
Prompt
To flesh out all of these mechanics, I employ a single structured prompt. This is not an exercise in creative writing and eloquence is irrelevant. The goal is precision and a high signal-to-noise ratio.
Here is what a well-designed feature prompt must include:
Instruction
A clear, self-contained instruction set describing how the agent should consume and execute the prompt.
Goal
The overall goal of the feature - a concise description that answers: "What exactly are we building?" This is not the place for every detail, only the essence.
Components
All components involved. Include the full path or namespace for every file in your armature. No further description is needed since each file already contains its own internal instruction. Listing them here minimises the agent’s need for exploratory ls, find or grep actions, further reducing noise.
Tests
List the location of all tests, just like components, and include instructions for running them.
If you want the LLM to test as it builds, instruct it to run individual tests instead of the full suite for faster iteration and token economy.
Database
List any new or modified migrations required for the feature.
For each migration, the agent should generate the corresponding model and define relationships where appropriate.
If the feature interacts with existing tables, specify the expected schema shape or relevant fields to ensure accurate query construction.
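A migration of the kind listed in this section might look like the following, assuming Laravel's schema builder; the table and column names are illustrative.

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    public function up(): void
    {
        Schema::table('users', function (Blueprint $table) {
            // Nullable: existing users have no CRM account id yet.
            $table->string('salesforce_account_id')->nullable()->index();
        });
    }

    public function down(): void
    {
        Schema::table('users', function (Blueprint $table) {
            $table->dropColumn('salesforce_account_id');
        });
    }
};
```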
Example Prompt
# Sync Subscriptions to Salesforce Service
## Instruction
Build the complete feature as defined below. Follow these rules:
- Implement the listed classes exactly according to their docstrings.
- Implement tests according to the existing stubs.
- Do not create additional files unless strictly required.
- After implementing each class, run its test individually: `php artisan test <path-to-test>`
- For any migrations listed in Database section, create the corresponding model and relationships where appropriate.
## Goal
Build a service and supporting classes that synchronize our user subscriptions to Salesforce, marking inactive ones appropriately when no active plan exists.
---
## Components
app/Domain/Subscriptions/Services/SyncSubscriptionsToSalesforceService.php
app/Domain/Subscriptions/Repositories/GetLatestSubscriptionRepo.php
## Tests
tests/Domain/Subscriptions/SyncSubscriptionsToSalesforceServiceTest.php
tests/Domain/Subscriptions/GetLatestSubscriptionRepoTest.php
## Database
database/migrations/0200000_add_salesforce_account_id_to_users_table.php
## Engineering practices
- Follow PSR-12 coding standards.
- All services are to have a single public method - run()
- All repositories are to have a single public method - perform()
- Avoid closures and `map()` in service logic.
Checkpoint
Once your prompt is complete, save it as a .md file and make a git commit.
Then start a fresh session of your coding agent. If the session isn’t fresh, run /clear first and re-instruct it with the prompt.
Work validation
Now that the feature is complete, responsibility shifts back to you. Scrutinising the agent’s work is the engineer’s job. My validation process is:
- Run the test suite.
- Inspect the tests for coverage and intent.
- Review the code.
- Run `/review` and follow the automated review process.
- Hand off the feature for peer code review and E2E/staging validation.
If issues arise, I address them manually or instruct Claude to fix them in a fresh session. For rare cases where the output becomes unrecoverable, I would revert to the previous checkpoint - though this has not happened in practice.
Conclusion
This methodology may seem laborious, but it consistently solves three major problems for me:
- The LLM respects the mental model of my feature — it builds exactly what I designed, following my engineering practices.
- Because I design the feature architecture, no code is introduced that falls into my blind spot - I retain full control of the codebase.
- Life is too short to write HTTP wrappers and object key mappings. LLMs handle these with machine-like precision.
Ultimately, this creates a distinct separation of roles: I engineer the software; the LLM acts as the programmer. That division keeps the process disciplined, creative, and human-centered.