Building ArgueMint: A CLI Where AI Agents Debate Each Other

Every non-trivial decision comes with the same internal tug-of-war. The optimist in my head says "go for it." The pessimist says "here's why it'll fail." I wanted both voices externalized so I could see the arguments laid out instead of looping in my head.

So I built ArgueMint, a CLI where two AI personas debate any topic you throw at them. Took about a day, pair-programming with Claude Code.

╭─ ☀️  Sunny ──────────────────────────────────────────────────────────────────╮
│ This could be the best decision of your life. The job market for             │
│ experienced engineers is strong – your safety net is real...                 │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ 🌧️  Murphy ─────────────────────────────────────────────────────────────────╮
│ Let's talk about the 90% of startups that fail. Your runway isn't just       │
│ money – it's your mental health, relationships...                            │
╰──────────────────────────────────────────────────────────────────────────────╯

Useful for decision-making. Also entertaining to watch.

A Debate That Actually Changed My Mind

I was leaning toward rewriting a side project from scratch in a new framework. Ran it through ArgueMint with the builder_critic pair.

Mason (the builder) made the case: cleaner architecture, modern tooling, better DX. But Raven (the critic) asked the question I'd been dodging: "You've rewritten this twice already. Each time you got 60% through and lost momentum. What's different this time?"

Nothing was. I started improving the existing codebase instead. The debate surfaced what I already knew but wasn't admitting.

The Design Process

My first instinct was to just have two system prompts and alternate them. That works for a quick hack but falls apart when you want to pause mid-debate, inject your own comments, or skip to a summary.

So I spent time on the design before coding. The key decision was treating this as a state machine rather than a simple loop:

SETUP → RUNNING ⇄ PAUSED → SUMMARIZING → DONE

This handles interrupts cleanly. Pause emits an event. Resume picks up where you left off. Skip-to-summary transitions the state. The logic stays simple because each state has explicit transitions.
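A minimal sketch of that state machine, not ArgueMint's actual code. The class and method names here are hypothetical, and letting RUNNING jump straight to SUMMARIZING (skip-to-summary without pausing first) is my assumption about the diagram:

```python
from enum import Enum, auto


class DebateState(Enum):
    SETUP = auto()
    RUNNING = auto()
    PAUSED = auto()
    SUMMARIZING = auto()
    DONE = auto()


# Allowed transitions, read off the diagram above.
# Assumption: skip-to-summary is legal from both RUNNING and PAUSED.
TRANSITIONS = {
    DebateState.SETUP: {DebateState.RUNNING},
    DebateState.RUNNING: {DebateState.PAUSED, DebateState.SUMMARIZING},
    DebateState.PAUSED: {DebateState.RUNNING, DebateState.SUMMARIZING},
    DebateState.SUMMARIZING: {DebateState.DONE},
    DebateState.DONE: set(),
}


class DebateStateMachine:
    """Hypothetical wrapper that rejects any transition not listed above."""

    def __init__(self) -> None:
        self.state = DebateState.SETUP

    def transition(self, target: DebateState) -> None:
        if target not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition: {self.state.name} -> {target.name}"
            )
        self.state = target

    def start(self) -> None:
        self.transition(DebateState.RUNNING)

    def pause(self) -> None:
        self.transition(DebateState.PAUSED)

    def resume(self) -> None:
        self.transition(DebateState.RUNNING)

    def skip_to_summary(self) -> None:
        self.transition(DebateState.SUMMARIZING)

    def finish(self) -> None:
        self.transition(DebateState.DONE)
```

Because every legal move lives in one table, an interrupt handler only has to call pause() or skip_to_summary(); anything illegal (resuming a finished debate, say) raises instead of silently corrupting the run.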

I ended up with 6 persona pairs. Initially I was going to ship with just optimist/pessimist, but naming the characters made me want to build more:

Pair	Characters
Optimist / Pessimist	Sunny vs Murphy
Builder / Critic	Mason vs Raven
Steelman / Strawman	Ada vs Richie
Philosopher / Pragmatist	Sophia vs Max
Devil's Advocate / Believer	Loki vs Faith
Expert / Novice	Prof vs Scout

"Sunny says..." reads better than "The Optimist says..." It makes the debates feel like conversations between people.
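One way to hold those pairs in code, as a sketch rather than the real implementation. The dataclasses and the speaker_for_turn helper are hypothetical. The keys optimist_pessimist, builder_critic, and philosopher_pragmatist appear elsewhere in this post; the other three keys are assumed to follow the same naming pattern:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    name: str  # the character name shown in the panel title
    role: str  # the position the character argues from


@dataclass(frozen=True)
class PersonaPair:
    first: Persona
    second: Persona


PAIRS = {
    "optimist_pessimist": PersonaPair(
        Persona("Sunny", "Optimist"), Persona("Murphy", "Pessimist")
    ),
    "builder_critic": PersonaPair(
        Persona("Mason", "Builder"), Persona("Raven", "Critic")
    ),
    "philosopher_pragmatist": PersonaPair(
        Persona("Sophia", "Philosopher"), Persona("Max", "Pragmatist")
    ),
    # The three keys below are assumed, not taken from the tool.
    "steelman_strawman": PersonaPair(
        Persona("Ada", "Steelman"), Persona("Richie", "Strawman")
    ),
    "devils_advocate_believer": PersonaPair(
        Persona("Loki", "Devil's Advocate"), Persona("Faith", "Believer")
    ),
    "expert_novice": PersonaPair(
        Persona("Prof", "Expert"), Persona("Scout", "Novice")
    ),
}


def speaker_for_turn(pair: PersonaPair, turn: int) -> Persona:
    """Alternate speakers: even turns go to the first persona, odd to the second."""
    return pair.first if turn % 2 == 0 else pair.second
```

Keeping the name separate from the role is what lets the output say "Sunny" while the prompt logic still knows it is talking to the optimist.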

Architecture: Keep the Engine Pure

The engine knows nothing about LLMs or terminals. It manages state and emits events. That's it.

┌─────────────┐     ┌───────────────┐     ┌─────────────┐
│   CLI       │────▶│    Engine     │────▶│ LLM Provider│
│  (Typer)    │     │(State Machine)│     │  (Ollama,   │
└─────────────┘     └───────────────┘     │   OpenAI)   │
                            │             └─────────────┘
                            ▼
                     ┌─────────────┐
                     │   Output    │
                     │   (Rich)    │
                     └─────────────┘

The engine emits on_agent_start, on_agent_chunk, on_agent_done. The CLI subscribes and renders with Rich. This separation meant I could write 11 unit tests for debate logic without touching LLM calls or terminal output.
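A rough sketch of that separation, not the real engine. Only the three event names come from the tool; the DebateEngine class, its subscribe and run methods, and the fake_provider stand-in are all hypothetical:

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Assumption: a provider is any callable that takes a prompt and yields text chunks.
Provider = Callable[[str], Iterable[str]]


class DebateEngine:
    """Alternates speakers and emits events; knows nothing about LLMs or terminals."""

    def __init__(self, provider: Provider, speakers: List[str]) -> None:
        self._provider = provider
        self._speakers = speakers
        self._handlers: Dict[str, List[Callable[..., None]]] = {}

    def subscribe(self, event: str, handler: Callable[..., None]) -> None:
        self._handlers.setdefault(event, []).append(handler)

    def _emit(self, event: str, *args: object) -> None:
        for handler in self._handlers.get(event, []):
            handler(*args)

    def run(self, topic: str, turns: int) -> None:
        for turn in range(turns):
            speaker = self._speakers[turn % len(self._speakers)]
            self._emit("on_agent_start", speaker)
            chunks: List[str] = []
            for chunk in self._provider(f"{speaker}, argue about: {topic}"):
                chunks.append(chunk)
                self._emit("on_agent_chunk", speaker, chunk)
            self._emit("on_agent_done", speaker, "".join(chunks))


def fake_provider(prompt: str) -> Iterator[str]:
    """Test double standing in for Ollama or OpenAI: streams a fixed reply."""
    yield from ("Hello", ", ", "world")
```

A unit test subscribes plain functions that record what they receive and passes fake_provider in place of a real model, so the debate logic runs with no network call and no terminal. The CLI would subscribe Rich renderers to the same three events.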

I considered a simpler approach, just a while loop with streaming prints. Would've been faster to build. But I knew I'd want pause/resume, and retrofitting state management into a loop is painful. The state machine was maybe 30 minutes more upfront and saved me from rewriting later.

The Streaming Bug

First version had every turn rendering twice. Once during streaming (with a blinking cursor), once after (final version). Looked broken.

Took me 20 minutes to figure out why. By default, Rich's Live display doesn't clear when you call stop(). It leaves the last rendered frame on screen (only transient=True erases it). So my code was doing this:

1. Stream with Live (panel with cursor)
2. Stop Live (panel stays on screen)
3. Render final turn (second panel appears)

The fix:

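def stop_streaming(self) -> None:
    """Finish the streamed turn in place instead of drawing a second panel."""
    if self._live and self._current_persona:
        # Swap the cursor panel for the final one; Live leaves it on screen after stop().
        self._live.update(self._build_final_panel())
        self._live.stop()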

Update the Live context to the final version before stopping. Then don't re-render. The streaming panel becomes the final panel in place.

What Didn't Work

The summary prompts are mediocre. They list bullet points from each side but miss the interesting tensions. In the contract debate, the summary didn't catch Raven's "three months becomes six" point, which was the thing that actually mattered. I've tweaked the prompts twice and I'm still not satisfied. They probably need few-shot examples of good summaries.

Ollama is slow on my machine. Debates with llama3.2:3b take 30-40 seconds per turn. Fine for testing, frustrating for actual use. I run it with Groq now (free tier, fast inference). Should've prioritized that provider earlier.

Six persona pairs was overkill for launch. I've used optimist_pessimist and builder_critic repeatedly, and philosopher_pragmatist once. The others sit unused. Should've shipped two pairs and added more based on what people actually want.

What Worked

The character names were worth the effort. Murphy (named after Murphy's Law) feels like a person being cautious, not an AI being negative. The personas stay in character across turns because the system prompts are specific about tone, not just position.

I ran "Should I learn Rust or Go?" with philosopher_pragmatist. Sophia asked what kind of programmer I want to become. Max countered with what I'd actually ship in the next 6 months. That framing, identity vs pragmatism, was more useful than any "Rust vs Go" blog post.

The tool is at github.com/ravisankar-r/arguemint. Works with Ollama out of the box.

arguemint debate "Should I quit my job to start a startup?" --pair optimist_pessimist

Let the voices in your head argue it out.