How Code Canary works, what data it collects, and why privacy comes first.
Code Canary is a decentralized quality barometer for AI coding agents. It collects lightweight ratings from developers after their coding sessions and aggregates them into a public leaderboard. Think of it as a collective signal for which models and agents are actually helping developers get work done.
Code Canary has three components:
The installer supports Claude Code, Codex, and Gemini CLI. For Claude Code, it installs a
Stop hook in ~/.claude/settings.json. For Codex, it sets
notify in ~/.codex/config.toml. For Gemini CLI, it
configures a post-session hook in ~/.gemini/settings.json. In each case, the hook opens
your browser to a rating form pre-filled with session metadata (model name, turn count,
context usage). You rate the session, optionally tag the task type and leave a comment,
and submit. That's it.
Code Canary was designed from the ground up to never touch your code. Here's how each component enforces this:
The hook scripts (~/.codecanary/codecanary-hook.sh for Claude,
~/.codecanary/codecanary-codex-hook.sh for Codex, and
~/.codecanary/codecanary-gemini-hook.sh for Gemini CLI) only extract metadata
— never code content. Here's the Claude parser snippet:
for line in open('$TRANSCRIPT'):
try:
d = json.loads(line)
if d.get('type') == 'assistant':
msg = d.get('message', {})
if not model:
model = msg.get('model', '')
if not version:
version = d.get('version', '')
usage = msg.get('usage', {})
inp = usage.get('input_tokens', 0) \
+ usage.get('cache_creation_input_tokens', 0) \
+ usage.get('cache_read_input_tokens', 0)
if inp > total_input:
total_input = inp
turns += 1
elif d.get('type') == 'user' \
and d.get('userType') == 'external':
turns += 1
except: pass
For Claude, the extracted values are: model, version,
token usage (to estimate context percentage), and turns. For Codex,
metadata is extracted from the notify payload and session event metadata under
~/.codex/sessions. For Gemini CLI, metadata is extracted from
the session output. No message content, file paths, code, or prompts
are ever read or transmitted.
The hook constructs a URL and opens it in your browser. The URL contains only:
${CODECANARY_BASE}/rate?feedback_id=${FEEDBACK_ID}
&session_id=${SESSION_ID}
&agent=(claude-code, codex-cli, or gemini-cli)
&model=${MODEL}
&context_pct=${CONTEXT_PCT}
&turns=${TURNS}
&client_version=${CLIENT_VERSION}
&token=${CODECANARY_TOKEN}
No code, no prompt text, no file names. The feedback_id is a locally-generated
UUID. The token is your auth credential for linking ratings to your account.
When you submit the form, the server stores a Rating record with these fields:
agent # e.g. "claude-code"
model # e.g. "claude-sonnet-4-5-20250514"
rating # 1–5 integer
task_type # optional (bugfix, feature, refactor, etc.)
comment # optional free text you wrote
context_pct # estimated context window usage
turns # number of conversation turns
feedback_id # UUID to prevent duplicate submissions
user_id # your GitHub-linked account (for counting reviewers)
That's the complete schema. There is no field for code, diffs, prompts, file paths, or repository information.
To avoid prompt fatigue, Code Canary only asks for a rating on a random subset of sessions. During installation you pick a rate (10%, 5%, or 2%). The hook rolls a random number each time it fires:
ROLL=$((RANDOM % 100))
if [ "$PROMPT_PCT" -lt 100 ] && [ "$ROLL" -ge "$PROMPT_PCT" ]; then
exit 0 # skip this session
fi
You can change the rate anytime by editing ~/.codecanary/credentials.json
and setting prompt_pct to any value between 0 (never) and 100 (always).
During installation, Code Canary opens a GitHub OAuth flow in your browser. After you
authorize, a JWT token is saved to ~/.codecanary/credentials.json (permissions
600, readable only by you). This token is sent with each rating submission
so the server can associate ratings with your account for reviewer counting.
Authentication is optional — you can skip it during install by pressing Ctrl+C. Unauthenticated ratings are still recorded but won't count toward unlocking the dashboard.
The dashboard shows average ratings per agent/model combination over the last 30 days. Models are ranked by their average score. The "Quality Over Time" chart shows daily trends for the top models, with a volume overlay showing total daily rating counts.
Code Canary gates the dashboard behind a minimum of 2 submitted ratings to ensure the community is contributing, not just consuming. It also encourages developers to actually try the hook before viewing results.
Install:
curl -sL https://codecanary.net/install | bash
Uninstall:
curl -sL https://codecanary.net/uninstall | bash
The installer downloads hook scripts, optionally authenticates via GitHub, lets you pick a prompt rate, and then configures Claude, Codex, Gemini CLI, or any combination depending on your choice. The uninstaller removes all Code Canary hook entries.
Code Canary ships hooks for Claude Code, Codex, and Gemini CLI. The rating system is agent-agnostic though — any agent that supports session hooks could be integrated. Contributions welcome.