Fluffy Parrot

Tuning Claude until you can feel what changed.

The question I was chasing

Tuning a language model usually feels like editing a settings file. I wanted it to feel like working a piece of hardware.

I spend a lot of time adjusting prompts against Claude, and the feedback loop was always the weak part. Every playground I had used treated the dials (how creative the answer is, how wide the search for words gets, how long it can run) as plain form fields. Type a number, rerun, hope, lose track of what changed. There was no instrument, only trial, error and vibes.

The constraints

Solo build, native-feeling on Mac, and one hard rule: trustworthy enough that a developer would paste in a real API key. That meant no backend, no tracking, and nothing sending data home. The key lives in the macOS Keychain and the app talks straight to Anthropic. Open source, so anyone can check that rather than take my word for it.
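To make "talks straight to Anthropic" concrete, here is a minimal sketch of how a direct call to the Messages API could be assembled from the dial values. It is an illustration under assumptions, not the app's actual code: the `Dials` shape and the `buildMessagesRequest` helper are hypothetical names, and the model id is left to the caller. The endpoint, the `x-api-key` and `anthropic-version` headers, and the `temperature`, `top_p` and `max_tokens` fields are the parts that come from the public API.

```typescript
// Hypothetical shape for the dials; the app's real field names may differ.
interface Dials {
  temperature: number; // the API accepts 0 to 1
  topP?: number;       // optional nucleus-sampling cutoff, 0 to 1
  maxTokens: number;   // upper bound on generated tokens
}

interface MessagesRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Keep a value inside [min, max].
function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Build (but do not send) a request for the Anthropic Messages API.
// The key is passed in by the caller, e.g. after reading it from the Keychain.
function buildMessagesRequest(
  apiKey: string,
  model: string,
  prompt: string,
  dials: Dials,
): MessagesRequest {
  const payload: Record<string, unknown> = {
    model,
    max_tokens: Math.max(1, Math.floor(dials.maxTokens)),
    temperature: clamp(dials.temperature, 0, 1),
    messages: [{ role: "user", content: prompt }],
  };
  if (dials.topP !== undefined) {
    payload.top_p = clamp(dials.topP, 0, 1);
  }
  return {
    url: "https://api.anthropic.com/v1/messages",
    method: "POST",
    headers: {
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify(payload),
  };
}
```

The returned object can be handed to `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Keeping the builder pure means the only place the key ever travels is that one call to Anthropic's endpoint.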

The decisions that mattered

Hardware-style knobs, not sliders. The first version used sliders. They were precise and lifeless. Rebuilding them as knobs changed the character of the whole tool, and tuning became something you could feel in your hands.

Every run gets its own tab, with its cost and round-trip time attached. Put two takes side by side and you read the real difference instead of trusting your memory. That is what turned a playground into an instrument.
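A rough sketch of what a per-run record could look like, assuming cost is derived from the token counts the API reports (`input_tokens` and `output_tokens` in the response's `usage` object) and a per-million-token price table. The names `RunRecord`, `makeRunRecord` and `compareRuns` are hypothetical, and the prices in the usage note are placeholders rather than real rates.

```typescript
// Token counts as reported in the API response's `usage` object.
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Hypothetical price table in USD per million tokens; real rates vary by model.
interface Pricing {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

// What one tab would carry alongside the response text.
interface RunRecord {
  label: string;
  roundTripMs: number;
  costUsd: number;
}

// Cost = tokens times per-token price, summed over input and output.
function estimateCostUsd(usage: Usage, pricing: Pricing): number {
  const micros =
    usage.inputTokens * pricing.inputPerMillionUsd +
    usage.outputTokens * pricing.outputPerMillionUsd;
  return micros / 1_000_000;
}

// Build a record from timestamps taken just before and just after the request.
function makeRunRecord(
  label: string,
  startedAtMs: number,
  finishedAtMs: number,
  usage: Usage,
  pricing: Pricing,
): RunRecord {
  return {
    label,
    roundTripMs: finishedAtMs - startedAtMs,
    costUsd: estimateCostUsd(usage, pricing),
  };
}

// Side-by-side difference: positive values mean run `b` was slower or pricier.
function compareRuns(a: RunRecord, b: RunRecord) {
  return {
    roundTripDeltaMs: b.roundTripMs - a.roundTripMs,
    costDeltaUsd: b.costUsd - a.costUsd,
  };
}
```

With placeholder prices of 3 and 15 USD per million tokens, a run using 1,000 input and 500 output tokens comes out at (1,000 × 3 + 500 × 15) / 1,000,000 = 0.0105 USD. Storing the numbers on the record, rather than recomputing them from memory, is what lets two tabs be compared directly.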

Keychain-only key storage was non-negotiable. The moment a credentials tool asks a developer to trust its server, it has lost them.

Built with: Electron, React, TypeScript, the Claude API, Claude Code

Where it landed

It is live at fluffyparrot.com, free and open source. It is where I go when I need to feel how a prompt behaves as I turn the dials.

Two takes side by side answer most of what I ask. A wall of runs would answer more. For now the instrument stops at a pair.

Part of the Rolling Waves work archive.