Frequently Asked Questions

How Code Canary works, what data it collects, and why privacy comes first.

What is Code Canary?

Code Canary is a decentralized quality barometer for AI coding agents. It collects lightweight ratings from developers after their coding sessions and aggregates them into a public leaderboard. Think of it as a collective signal for which models and agents are actually helping developers get work done.

How does it work?

Code Canary has three components:

A shell hook that fires when your coding agent session ends
A browser-based rating form where you submit a 1–5 score
A public dashboard that aggregates scores across all reviewers

The installer supports Claude Code, Codex, and Gemini CLI. For Claude Code, it installs a Stop hook in ~/.claude/settings.json. For Codex, it sets notify in ~/.codex/config.toml. For Gemini CLI, it configures a post-session hook in ~/.gemini/settings.json. In each case, the hook opens your browser to a rating form pre-filled with session metadata (model name, turn count, context usage). You rate the session, optionally tag the task type and leave a comment, and submit. That's it.
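As a rough illustration of the Claude Code step, the sketch below registers a command under the Stop event of a settings dictionary. It assumes the hooks layout described in Claude Code's hooks documentation (an event name mapping to groups of command hooks); the function name add_stop_hook and the sample settings are hypothetical, and the real installer may edit the file differently.

```python
import json
from copy import deepcopy


def add_stop_hook(settings: dict, command: str) -> dict:
    """Return a copy of `settings` with `command` registered as a Stop hook."""
    updated = deepcopy(settings)
    stop_groups = updated.setdefault("hooks", {}).setdefault("Stop", [])
    # Do nothing if some existing group already runs this command.
    for group in stop_groups:
        for hook in group.get("hooks", []):
            if hook.get("command") == command:
                return updated
    stop_groups.append({"hooks": [{"type": "command", "command": command}]})
    return updated


existing = {"model": "sonnet"}  # hypothetical pre-existing settings
hook_cmd = "~/.codecanary/codecanary-hook.sh"
once = add_stop_hook(existing, hook_cmd)
twice = add_stop_hook(once, hook_cmd)  # second call changes nothing
print(json.dumps(twice, indent=2))
```

Making the merge idempotent means re-running an installer would not stack duplicate hooks, and working on a copy leaves unrelated settings untouched.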

Privacy-First Design

Code Canary was designed from the ground up so that your code never leaves your machine. Here's how each component enforces this:

The Hook: No Code Ever Leaves Your Machine

The hook scripts (~/.codecanary/codecanary-hook.sh for Claude, ~/.codecanary/codecanary-codex-hook.sh for Codex, and ~/.codecanary/codecanary-gemini-hook.sh for Gemini CLI) only extract metadata — never code content. Here's the Claude parser snippet:

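import json

# Initial values are assumed; the snippet relies on them being empty/zero
model = version = ''
total_input = 0  # largest prompt size (in tokens) seen on any assistant turn
turns = 0

# '$TRANSCRIPT' is a shell variable, expanded before Python runs
for line in open('$TRANSCRIPT'):
    try:
        d = json.loads(line)
        if d.get('type') == 'assistant':
            msg = d.get('message', {})
            # Keep the first model name and client version seen
            if not model:
                model = msg.get('model', '')
            if not version:
                version = d.get('version', '')
            # Fresh, cache-written, and cache-read tokens all count as input
            usage = msg.get('usage', {})
            inp = usage.get('input_tokens', 0) \
                + usage.get('cache_creation_input_tokens', 0) \
                + usage.get('cache_read_input_tokens', 0)
            if inp > total_input:
                total_input = inp
            turns += 1
        elif d.get('type') == 'user' \
             and d.get('userType') == 'external':
            turns += 1
    except Exception:
        pass  # skip malformed or unexpected lines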

For Claude, the extracted values are: model, version, token usage (to estimate context percentage), and turns. For Codex, metadata is extracted from the notify payload and session event metadata under ~/.codex/sessions. For Gemini CLI, metadata is extracted from the session output. The parser scans the transcript locally, but no message content, file paths, code, or prompts are ever extracted or transmitted.
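To make the metadata-only claim easier to check, here is a self-contained sketch of the same extraction as a function that returns only the four values. The function name summarize_transcript, the 200,000-token context window used for the percentage, and the sample lines are assumptions for illustration, not the hook's actual code.

```python
import json
from typing import Iterable


def summarize_transcript(lines: Iterable[str], context_window: int = 200_000) -> dict:
    """Reduce transcript lines to model, version, turns, and context percentage."""
    model = version = ""
    peak_input = 0  # largest input-token total seen on any assistant turn
    turns = 0
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON
        if not isinstance(event, dict):
            continue
        kind = event.get("type")
        if kind == "assistant":
            msg = event.get("message")
            if not isinstance(msg, dict):
                msg = {}
            model = model or msg.get("model", "")
            version = version or event.get("version", "")
            usage = msg.get("usage") or {}
            used = (usage.get("input_tokens", 0)
                    + usage.get("cache_creation_input_tokens", 0)
                    + usage.get("cache_read_input_tokens", 0))
            peak_input = max(peak_input, used)
            turns += 1
        elif kind == "user" and event.get("userType") == "external":
            turns += 1
    # context_window is an assumed size; the real hook may use another value.
    context_pct = min(100, round(100 * peak_input / context_window))
    return {"model": model, "version": version,
            "turns": turns, "context_pct": context_pct}


sample = [
    '{"type": "user", "userType": "external", "message": {"content": "secret prompt"}}',
    '{"type": "assistant", "version": "1.0.0", "message": {"model": "example-model",'
    ' "usage": {"input_tokens": 1000, "cache_read_input_tokens": 49000}}}',
    "not json",
]
summary = summarize_transcript(sample)
print(summary)
```

The returned dictionary has no field that could carry message text, so the prompt in the sample never appears in the output.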

The Rating Form: Client-Side URL Only

The hook constructs a URL and opens it in your browser. The URL contains only:

${CODECANARY_BASE}/rate?feedback_id=${FEEDBACK_ID}
  &session_id=${SESSION_ID}
  &agent=(claude-code, codex-cli, or gemini-cli)
  &model=${MODEL}
  &context_pct=${CONTEXT_PCT}
  &turns=${TURNS}
  &client_version=${CLIENT_VERSION}
  &token=${CODECANARY_TOKEN}

No code, no prompt text, no file names. The feedback_id is a locally-generated UUID. The token is your auth credential for linking ratings to your account.
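The URL assembly can be sketched in Python as follows. The parameter names come from the template above; the function name build_rating_url, the placeholder base URL, and the choice to omit the token when it is empty are assumptions.

```python
import uuid
from urllib.parse import parse_qs, urlencode, urlsplit


def build_rating_url(base: str, session_id: str, agent: str, model: str,
                     context_pct: int, turns: int, client_version: str,
                     token: str = "") -> str:
    """Build the rating-form URL from session metadata only."""
    params = {
        "feedback_id": str(uuid.uuid4()),  # generated locally, per prompt
        "session_id": session_id,
        "agent": agent,
        "model": model,
        "context_pct": context_pct,
        "turns": turns,
        "client_version": client_version,
    }
    if token:  # assumption: unauthenticated installs send no token
        params["token"] = token
    return f"{base.rstrip('/')}/rate?{urlencode(params)}"


url = build_rating_url(
    base="https://example.invalid",  # placeholder for CODECANARY_BASE
    session_id="abc123", agent="claude-code", model="example model",
    context_pct=25, turns=12, client_version="1.0.0", token="t0k3n",
)
query = parse_qs(urlsplit(url).query)
print(sorted(query))
```

Using urlencode percent-encodes every value, so a model name containing spaces or symbols cannot break the query string.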

What Gets Stored

When you submit the form, the server stores a Rating record with these fields:

agent        # e.g. "claude-code"
model        # e.g. "claude-sonnet-4-5-20250929"
rating       # 1–5 integer
task_type    # optional (bugfix, feature, refactor, etc.)
comment      # optional free text you wrote
context_pct  # estimated context window usage
turns        # number of conversation turns
feedback_id  # UUID to prevent duplicate submissions
user_id      # your GitHub-linked account (for counting reviewers)

That's the complete schema. There is no field for code, diffs, prompts, file paths, or repository information.
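One way to picture that record is as a small typed structure. The field names below are the ones listed above; the types, the optional user_id (since authentication can be skipped), and the validation are assumptions, not the server's actual model.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Rating:
    agent: str                       # e.g. "claude-code"
    model: str
    rating: int                      # 1 to 5
    context_pct: int                 # estimated context window usage
    turns: int                       # number of conversation turns
    feedback_id: str                 # UUID used to reject duplicates
    task_type: Optional[str] = None  # optional tag such as "bugfix"
    comment: Optional[str] = None    # optional free text
    user_id: Optional[str] = None    # assumed empty when unauthenticated

    def __post_init__(self) -> None:
        if not isinstance(self.rating, int) or not 1 <= self.rating <= 5:
            raise ValueError("rating must be an integer from 1 to 5")


example = Rating(agent="claude-code", model="example-model", rating=4,
                 context_pct=25, turns=12,
                 feedback_id="00000000-0000-4000-8000-000000000000")
print([f.name for f in fields(Rating)])
```

Listing the fields this way makes the privacy property auditable: nothing in the structure can hold code, diffs, prompts, or paths.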

How does the probabilistic trigger work?

To avoid prompt fatigue, Code Canary only asks for a rating on a random subset of sessions. During installation you pick a rate (10%, 5%, or 2%). The hook rolls a random number each time it fires:

ROLL=$((RANDOM % 100))
if [ "$PROMPT_PCT" -lt 100 ] && [ "$ROLL" -ge "$PROMPT_PCT" ]; then
  exit 0  # skip this session
fi

You can change the rate anytime by editing ~/.codecanary/credentials.json and setting prompt_pct to any value between 0 (never) and 100 (always).
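The same decision rule, written as a Python function so the edge cases are easy to check. This mirrors the shell condition above; the names should_prompt and simulate are hypothetical.

```python
import random


def should_prompt(prompt_pct: int, roll: int) -> bool:
    """Mirror the shell test: skip when prompt_pct < 100 and roll >= prompt_pct."""
    return prompt_pct >= 100 or roll < prompt_pct


def simulate(prompt_pct: int, sessions: int, seed: int = 0) -> int:
    """Count how many of `sessions` simulated sessions would show the form."""
    rng = random.Random(seed)
    return sum(should_prompt(prompt_pct, rng.randrange(100))
               for _ in range(sessions))


# Over all 100 possible rolls, exactly prompt_pct of them trigger a prompt.
for pct in (0, 2, 5, 10, 100):
    hits = sum(should_prompt(pct, roll) for roll in range(100))
    print(pct, hits)
```

One detail of the shell version: bash's $RANDOM ranges from 0 to 32767, which is not a multiple of 100, so rolls 0 through 67 are very slightly more likely than 68 through 99. At these rates the bias is negligible.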

How does authentication work?

During installation, Code Canary opens a GitHub OAuth flow in your browser. After you authorize, a JWT is saved to ~/.codecanary/credentials.json (file mode 600, readable and writable only by you). This token is sent with each rating submission so the server can associate ratings with your account for reviewer counting.

Authentication is optional — you can skip it during install by pressing Ctrl+C. Unauthenticated ratings are still recorded but won't count toward unlocking the dashboard.
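For readers curious how a credentials file can be kept owner-only, here is a minimal sketch for POSIX systems. The prompt_pct key comes from this FAQ; the token key name, the function save_credentials, and the temporary path are assumptions, and the real installer may do this differently.

```python
import json
import os
import stat
import tempfile
from pathlib import Path


def save_credentials(path: Path, token: str, prompt_pct: int) -> None:
    """Write credentials so that only the owning user can read or write them."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with mode 600 from the start, before any data is written.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        json.dump({"token": token, "prompt_pct": prompt_pct}, fh)
    os.chmod(path, 0o600)  # also tighten a file that already existed


with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / ".codecanary" / "credentials.json"
    save_credentials(demo, token="example-jwt", prompt_pct=10)
    print(oct(stat.S_IMODE(demo.stat().st_mode)))
```

Opening the file with the restrictive mode up front avoids a short window in which the token would be readable under the default permissions.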

How is the leaderboard calculated?

The dashboard shows average ratings per agent/model combination over the last 30 days. Models are ranked by their average score. The "Quality Over Time" chart shows daily trends for the top models, with a volume overlay showing total daily rating counts.
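The ranking described above amounts to a grouped average over a time window. The sketch below assumes each rating carries a created_at timestamp (a hypothetical field, not part of the schema listed earlier) and that ties are left in arbitrary order; the dashboard's real query may differ.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def leaderboard(ratings, now, window_days=30):
    """Average rating per (agent, model) over the window, best first."""
    cutoff = now - timedelta(days=window_days)
    buckets = defaultdict(list)
    for r in ratings:
        if r["created_at"] >= cutoff:  # drop ratings older than the window
            buckets[(r["agent"], r["model"])].append(r["rating"])
    rows = [(agent, model, sum(scores) / len(scores), len(scores))
            for (agent, model), scores in buckets.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
sample = [
    {"agent": "a", "model": "m1", "rating": 5, "created_at": now - timedelta(days=1)},
    {"agent": "a", "model": "m1", "rating": 4, "created_at": now - timedelta(days=2)},
    {"agent": "b", "model": "m2", "rating": 3, "created_at": now - timedelta(days=3)},
    {"agent": "b", "model": "m2", "rating": 1, "created_at": now - timedelta(days=45)},
]
board = leaderboard(sample, now)
for agent, model, avg, count in board:
    print(agent, model, avg, count)
```

Returning the count alongside the average makes it possible to show how many ratings back each score, which matters when a model has only a handful of reviews.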

Why do I need 2 ratings to see the dashboard?

Code Canary gates the dashboard behind a minimum of 2 submitted ratings to ensure the community is contributing, not just consuming. It also encourages developers to actually try the hook before viewing results.

How do I install / uninstall?

Install:

curl -sL https://codecanary.net/install | bash

Uninstall:

curl -sL https://codecanary.net/uninstall | bash

The installer downloads hook scripts, optionally authenticates via GitHub, lets you pick a prompt rate, and then configures Claude, Codex, Gemini CLI, or any combination depending on your choice. The uninstaller removes all Code Canary hook entries.

What agents can I use Code Canary with?

Code Canary ships hooks for Claude Code, Codex, and Gemini CLI. The rating system itself is agent-agnostic, so any agent that supports session hooks could be integrated. Contributions are welcome.