2026-02-23

Working with AI: February 2026 edition

A few months ago I wrote about how I work with AI (in Spanish). Since then, my workflow has changed quite a bit, so here's an updated version.

The tools

My setup is simple now: Claude Code with a Claude Team subscription, running locally. That's the AI model side of things.

On top of that, I built ralph, a tool that wraps Claude Code into an opinionated development workflow. It takes a feature description, breaks it into right-sized user stories, and implements them one by one — handling the full cycle: planning, coding, testing, and committing.

Ralph comes in two flavors:

  • ralph CLI — an interactive tool for hands-on work
  • autoralph — a daemon that runs autonomously, integrated with Linear and GitHub

I'll explain how I use each one below.

Two kinds of work

Not all tasks are the same. I split them into two categories:

1. Clear, scoped work — The requirements are well defined. You know what needs to happen. Examples: "add a webhook endpoint for Stripe events", "migrate this table to the new schema", "add validation to the signup form".

2. Exploratory work — The path isn't obvious. Maybe the bug is hard to reproduce, or you're evaluating different approaches, or the requirements need shaping as you go. Examples: "figure out why payments are failing intermittently", "find the best way to implement real-time notifications", "refactor the auth module".

For the first kind, I let the AI work autonomously. For the second, I stay close and drive.

Autonomous work: autoralph

For clear, scoped tasks, I use autoralph. It's a daemon that runs locally, watching for work and executing it end to end. I interact with it through two interfaces depending on the phase: Linear for pre-code work and GitHub for code work.

The full lifecycle looks like this:

  ┌─────────────────────┐
  │  Assign issue        │ Linear
  │  Refine via comments │
  │  Approve             │
  └────────┬────────────┘
           v
  ┌─────────────────────┐
  │  Create workspace    │
  │  Break into stories  │ autoralph
  │  Implement (TDD)     │
  └────────┬────────────┘
           v
  ┌─────────────────────┐
  │  Open PR             │
  │  Review + feedback   │ GitHub
  │  Merge               │
  └────────┬────────────┘
           v
  ┌─────────────────────┐
  │  Clean up + update   │ autoralph
  └─────────────────────┘

Refinement in Linear

When I create a Linear issue and assign it to autoralph, it picks it up and starts a refinement conversation. The bot posts clarifying questions as comments on the issue — things like "should this endpoint be authenticated?" or "do you want to handle the edge case where X?".

I respond via comments. We go back and forth until the requirements are solid. When I'm happy with the scope, I approve it by commenting "I approve this".
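The approval step can be pictured as a simple check over the issue's comment thread. This is a minimal sketch under my own assumptions, not autoralph's actual code: the Comment type, the is_approved helper, the bot name, and the matching rule (case-insensitive substring, written by someone other than the bot) are all hypothetical.

```python
from dataclasses import dataclass

# The phrase comes from the workflow above; how it is matched is an assumption.
APPROVAL_PHRASE = "i approve this"


@dataclass
class Comment:
    author: str  # hypothetical field: who wrote the comment
    body: str    # hypothetical field: raw comment text


def is_approved(comments: list[Comment], bot_name: str = "autoralph") -> bool:
    """Return True if any comment not written by the bot contains the approval phrase."""
    return any(
        c.author != bot_name and APPROVAL_PHRASE in c.body.lower()
        for c in comments
    )


thread = [
    Comment("autoralph", "Should this endpoint be authenticated?"),
    Comment("me", "Yes, require an API key."),
    Comment("me", "I approve this."),
]
print(is_approved(thread))      # True
print(is_approved(thread[:2]))  # False: still in refinement
```

Ignoring the bot's own comments matters in a sketch like this, since the bot may quote the phrase back while explaining how to approve.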

Implementation

Once approved, autoralph creates an isolated workspace (git worktree + branch), generates a specification from our conversation, breaks it into stories, and starts building. Each story goes through the full TDD cycle: write a failing test, make it pass, refactor. Quality gates (tests, linter, type checker) must pass before anything gets committed.
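To make the quality-gate idea concrete, here is a rough sketch of a "run every gate, commit only if all pass" check. It is an illustration under assumptions, not ralph's implementation: run_gates, may_commit, and the gate names are hypothetical, and the demo commands are stand-ins that simply exit with 0 or 1 so the example runs anywhere.

```python
import subprocess
import sys
from typing import Sequence


def run_gates(gates: dict[str, Sequence[str]]) -> list[str]:
    """Run each gate command and return the names of the gates that failed."""
    failed = []
    for name, cmd in gates.items():
        result = subprocess.run(cmd, capture_output=True)
        if result.returncode != 0:
            failed.append(name)
    return failed


def may_commit(gates: dict[str, Sequence[str]]) -> bool:
    """A story may be committed only when no gate failed."""
    return not run_gates(gates)


# Stand-in commands: a real project would list its own test runner,
# linter, and type checker here.
demo_gates = {
    "tests": [sys.executable, "-c", "raise SystemExit(0)"],
    "linter": [sys.executable, "-c", "raise SystemExit(1)"],
}
print(run_gates(demo_gates))   # ['linter']
print(may_commit(demo_gates))  # False
```

Running every gate instead of stopping at the first failure gives the agent the full list of problems to fix in one pass.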

When all stories are done, it runs a QA phase to verify everything works together, then opens a GitHub pull request.

Code review in GitHub

From here, I interact exclusively through the PR. I review the code, leave comments, request changes — the same flow as with any human contributor. Autoralph picks up review comments and CI failures, pushes fixes, and iterates until the PR is ready to merge.

When things go wrong

Since autoralph uses the ralph CLI under the hood and runs locally, I can always intervene. If something goes off track, I switch to the workspace (ralph switch <name>) and take over directly. This is one of the things I like most about the setup: it's autonomous until it isn't, and the transition is seamless.

Hands-on work: ralph CLI

For exploratory tasks — debugging, prototyping, investigating — I use the ralph CLI directly. The workflow is more interactive:

  ┌─────────────────────┐
  │  ralph new <name>    │ describe + plan
  └────────┬────────────┘
           v
  ┌─────────────────────┐
  │  ralph run           │ implement stories
  └────────┬────────────┘
           v
  ┌─────────────────────┐
  │  ralph chat          │ course-correct
  └────────┬────────────┘
           v
  ┌─────────────────────┐
  │  ralph done          │ merge + clean up
  └─────────────────────┘

Step by step:
  1. ralph new <name> — Creates an isolated workspace and starts a conversation. I describe what I want to do, Claude asks clarifying questions, and together we shape a plan.
  2. ralph run — Ralph implements the plan story by story: write a failing test, make it pass, run quality checks, commit, next story. I can watch progress in real time via the terminal UI.
  3. ralph chat — If I need to course-correct mid-flight, I can drop into an interactive session with full project context.
  4. ralph done — Squash-merges everything into main and cleans up the workspace.
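The last step, squash-merge plus cleanup, maps onto a handful of standard git commands. The sketch below shows one plausible sequence; I'm assuming that main is checked out in the directory where the commands run, and the branch name, workspace path, and helper names (done_commands, run_all) are hypothetical. It is not a description of what ralph done actually executes.

```python
import subprocess


def done_commands(branch: str, worktree_path: str, message: str) -> list[list[str]]:
    """Build the git commands for a squash-merge followed by workspace cleanup."""
    return [
        # Stage all of the branch's changes on the current branch without committing.
        ["git", "merge", "--squash", branch],
        # Record them as a single commit.
        ["git", "commit", "-m", message],
        # Remove the workspace directory first: a branch that is checked out
        # in a worktree cannot be deleted.
        ["git", "worktree", "remove", worktree_path],
        # -D is needed because git does not consider a squashed branch merged.
        ["git", "branch", "-D", branch],
    ]


def run_all(commands: list[list[str]], cwd: str) -> None:
    """Run each command in order inside cwd, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=cwd, check=True)


# Print the plan instead of executing it, so the sketch is safe to run anywhere.
for cmd in done_commands("ralph/fix-payments", "../workspaces/fix-payments", "Fix payment failures"):
    print(cmd)
```

Note that git worktree remove refuses to delete a workspace with uncommitted changes unless forced, which is a reasonable safety net for this step.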

The key difference from autoralph is that I'm in the loop. I can stop, adjust, try something different, or scrap a direction entirely. For work that requires judgment calls or rapid iteration, this is much more effective than fully autonomous mode.

Parallel workspaces

One thing I really like about the ralph CLI is how it lets me work on multiple things at the same time. Each ralph new creates an isolated workspace using git worktrees under the hood — so every task gets its own branch and working directory, completely independent from the others.

This means I can have autoralph working on a feature autonomously in one workspace, while I'm hands-on debugging something in another, and have a third one paused mid-implementation waiting for my input. They never step on each other's toes. It's the closest thing I've found to having multiple developers working on the same repo without merge headaches.
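The isolation described above comes from git itself: git worktree add -b creates a new branch and checks it out in a separate directory. Here is a small sketch of how a workspace could be created per task. The directory layout, the ralph/ branch prefix, and the worktree_add_command helper are my assumptions for illustration, not necessarily what ralph does.

```python
from pathlib import Path


def worktree_add_command(repo_root: Path, name: str) -> list[str]:
    """Build the git command that creates a branch and checks it out in its own directory."""
    # Assumed layout: workspaces live next to the repository, one folder per task.
    workspace_dir = repo_root.parent / f"{repo_root.name}-workspaces" / name
    # Assumed naming scheme for branches.
    branch = f"ralph/{name}"
    return ["git", "-C", str(repo_root), "worktree", "add", "-b", branch, str(workspace_dir)]


# Two tasks, two branches, two directories: nothing is shared except the object store.
for task in ("stripe-webhook", "debug-payments"):
    print(worktree_add_command(Path("/code/shop"), task))
```

Because every worktree has its own working directory and index, an agent editing files in one workspace cannot disturb a half-finished change in another.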

I also use ralph chat on its own quite often — sometimes I just want to think through a problem with an AI that has full context of the codebase, without necessarily committing to a plan.

What changed since last time

Three things are different from the setup I described in the previous post:

  • Fewer tools, more focus. I went from juggling Claude, Gemini, and ChatGPT to just Claude. The quality gap narrowed, but more importantly, having one model that I've tuned my rules and workflow around works better than switching between three.
  • From editor-integrated agents to a dedicated tool. I used to run agents inside Zed. Now ralph manages the entire lifecycle outside the editor, from planning to PR. The editor is just for reading code and manual edits.
  • From manual orchestration to autonomous loops. The biggest shift. Instead of manually passing context between tools and managing the back-and-forth, autoralph handles the full cycle. I spend my time on refinement and review, not on babysitting the agent.

This is still evolving. What works today might not be the best approach tomorrow, but this setup has been working well for me.