Treating AI like an engineering team
Like everyone, I've been drinking from the firehose of new AI-powered workflows, most of which promise to make all of my wildest dreams come true (or at least help me actually finish some of my many, many side projects...). I've been talking with a few people recently about the workflows that have become the most useful to me, and how I get the best results from the LLM and AI tools I've started using day to day, so I thought I'd write up some quick thoughts on what I've found most effective.
The short version: Treat your AI tools like engineers. Human engineers. Talk with them, plan with them, pair program with them. This produces great results, and can help you avoid the runaway effect of an AI tool going off the rails and getting stuck, or creating something you don't want.
Application Development
Having an LLM create an application from the ground up for the first time is one of the coolest tech experiences I've had in quite a while (up there with the first time I used a modern VR headset, or when I got my first Android phone). Giving an LLM (either a chat interface or an agent) a description of an app, and watching it spin for a bit and spit out code that looks like something a competent engineer would write, is neat. And then you go to run it and...it doesn't quite work. No worries, the LLM can fix it, right? Yeah, sometimes. Other times, it may do something silly, like delete half the app trying to fix a missing import error. It can get frustrating pretty quickly.
I've found, after many trials and frustrations, that there is an inverse relationship between how much agency you give the AI and the quality of what it produces. Much like human engineers: give them a broad, poorly defined task, and you can't guarantee that what you end up with is what you originally wanted. This isn't anything new or novel, and lots of other people have written about it, but I think it bears repeating: like engineers and engineering teams, LLMs work their best when given very clear, very small, very directed tasks. You can still build an application from nothing, even something quite large, but you need to approach it with a reasoned set of requirements.
For a brand new app, with no code yet written, my process is as follows:
- I use a chat LLM (I have a Max subscription for Claude, and its latest models are my go-to) to describe my idea in as much detail as I can. Really dig in. I then prompt the LLM to ask me any clarifying questions it would need to come up with a proper design document for the idea. This usually spurs several rounds of question and answer, which I find excellent at helping me suss out behaviors and flows that I haven't thought about, or haven't fully fleshed out.
- Once I'm happy with the general shape of the idea, I ask the LLM to create a design document based on our conversation (I'll also specify things like the language I want the project to use, and anything else I want included).
- I'll iterate with the LLM on the design doc until I'm happy with it. At this point, I instruct the LLM to keep all code (except basic pseudocode listings) out of the document. This produces the baseline document on which the next steps are built.
- Next, I ask the LLM to create a phased approach for implementing the design doc. I ask it to be as explicit as possible, while still mostly omitting code listings.
- This produces a nice list of phases that clearly outline the steps to take to implement each piece of the application (Claude will even add timeframes to these most of the time; unnecessary, but nice).
- For each phase, I ask the LLM to drill in and create an implementation document, further decomposing the phase into steps. Depending on their size, we may drill in further, breaking a given step down into detailed sub-steps. This produces a very explicit set of instructions to follow, one that I have vetted based on my own experience (a condensed example of this whole prompt sequence follows the list).
- I take each phase breakdown, along with the high-level design doc, and add them to a documentation directory in my new project (usually this is the only thing in the project at this point; a sketch of that directory also follows the list).
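To make that concrete, here's a condensed sketch of the kind of prompt sequence I'm describing. The wording is purely illustrative (it's the structure that matters, not the exact phrasing), and the bracketed parts are placeholders for your own project:

```
1. "Here's an app I want to build: [detailed description]. Before we design
   anything, ask me any clarifying questions you'd need in order to write a
   proper design document."
2. [several rounds of question and answer]
3. "Based on our conversation, write a design document for the app. The
   implementation will be in [language]. Keep all code out of the document,
   except for basic pseudocode listings."
4. "Now create a phased development plan for this design. Be as explicit as
   possible, but mostly omit code listings."
5. "For phase 1, write an implementation document that decomposes the phase
   into concrete steps."
```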
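And a sketch of what the documentation directory tends to look like once those artifacts are saved off (the file names here are just an example, not a prescribed layout):

```
my-new-project/
└── documentation/
    ├── design.md           # the baseline design doc
    ├── phases.md           # the phased development plan
    ├── phase-1-impl.md     # per-phase implementation breakdowns
    └── phase-2-impl.md
```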
The artifacts we've produced so far are useful, and are often enough for an experienced engineer to go and start writing code. But...we want the AI to do that for us. So we continue, ditching the chat interface and moving to an agentic tool (Claude Code, in my case). We could keep using the chat interface and copy/paste the code it produces into our project, but I'm much more fond of pointing an agent directly at my project; it's much faster at editing files directly, and can more easily build context around your source code.
- I fire up my coding agent on my newly created project, with my fleshed-out documentation directory in place. I then instruct the agent to examine the docs in the project and create a directive file (CLAUDE.md, in my case; a sketch follows this list). We'll keep this file up to date as we progress through the project, ensuring the nuances of our project stay clear to the agent across sessions.
- Now for the fun part: I instruct my agent to carefully read over the first phase of development, and present to me a plan for implementing the first step of the phase plan. I also, as above, instruct it to ask me any clarifying questions it needs to get started. After a bit of back and forth, we have a step-by-step implementation plan, which I have vetted, for the first step of the first phase of the project. I tell my agent to begin implementation.
And then we iterate, in exactly this fashion. I keep a close eye on what the agent is doing (auto-accept edits off), and will stop it mid-cycle if I see something I disagree with (usually by asking if it thinks that what it's doing is best, and presenting an alternative). I end each step by asking if it thinks we can simplify anything, to make our code both cleaner and easier to work with in future steps and phases. And on and on...
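The interventions themselves are just more prompts. The two I reach for most look something like this (again, wording illustrative):

```
"Stop for a moment. Is [what you're doing] really the best approach here?
I was thinking we could [alternative] instead; what are the trade-offs?"

"Before we move on: is there anything in what we just built that we can
simplify, to make the code cleaner and easier to work with in the next
steps and phases?"
```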
I've found this produces consistently good results, but it has a few caveats:
- It's time-intensive; sometimes I feel like I'd be better served just writing the code myself, instead of babysitting an agent
- It requires close attention; but then, so does writing the code yourself.
Throwing large problems at an LLM will inevitably result in...something, but it likely won't fit your vision exactly, and in many cases won't work the way you want at all. Breaking the problem down into the smallest reasonable chunks of work, though, yields much better results. In my experience, this approach works well for both large- and small-scale projects.
To summarize: Treat your LLMs and agents like any other engineer (or team of engineers):
- Run planning exercises much like you do with a team, iterating on designs, and making sure to ask and answer pointed questions to remove ambiguity
- Create artifacts that can be referenced later
- Treat code creation like a pair programming exercise, working with your LLM or agent as you would any other engineer