Claude Code in CI — automated code review in GitHub Actions
Add Claude Code automated reviews to every PR with anthropics/claude-code-action@v1. Ten-minute setup, $0.01–$0.15 per review, catches real bugs before merge.
By Ethan
anthropics/claude-code-action@v1 drops a Claude Code reviewer into every PR. It comments directly on the pull request, costs between $0.01 and $0.15 per review with Sonnet 4.6, and takes about 10 minutes to wire up. If you’ve seen tutorials that use direct_prompt: in the workflow YAML — those are for the old beta. The v1 field is prompt:.
Who this is for
Teams on GitHub with Actions enabled who want a code review catch layer without paying for a dedicated tool. You need an Anthropic API key and about 10 minutes. This is not a replacement for human review — it’s a first pass that catches the off-by-one errors, unchecked error paths, and null-handling bugs humans miss when they’re tired or rushed.
If you haven’t used Claude Code before, the Claude Code 2026 review gives a useful baseline before wiring it into CI.
Why automate Claude Code in CI
Three things it catches reliably:
Logic bugs before human review. Claude reads the full diff and flags things like off-by-one errors, unchecked error paths, and inconsistent null handling — the kind humans gloss over when reviewing 400-line PRs at the end of the day. It doesn’t have context fatigue.
Style and standards drift. Give it a system prompt that describes your team’s conventions and it will flag deviations. Not a linter — those handle formatting. This handles patterns: “we always validate before mutating state,” “this service never makes synchronous HTTP calls in request handlers.” It catches the higher-level drift a linter won’t see.
Security anti-patterns. Hardcoded secrets, SQL string concatenation, missing input validation, dangerous flag usage. Not a SAST scanner, but it catches the obvious ones and flags them before they land in main.
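Cost math: at Sonnet 4.6 rates, a 300-line PR costs about $0.05 per review. Fifty reviews a month is $2.50. A 3–5 person team at 10–15 PRs/week runs around $15–25/month; that figure only adds up if each PR is reviewed several times (the synchronize trigger re-runs the review on every push), since a single $0.05 review per PR would come to roughly $2–$3 a month. Compare that to a human reviewer’s hourly rate or any dedicated code quality SaaS.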
Prerequisites
- An Anthropic API key. Get one at platform.claude.com under API Keys.
- A GitHub repository with Actions enabled. The free tier works.
- That’s it. Claude Code doesn’t need to be installed locally for this workflow — the action installs it inside the runner.
Step 1 — Add the API key as a GitHub secret
In your repository: Settings → Secrets and variables → Actions → New repository secret.
Name: ANTHROPIC_API_KEY
Value: your key from platform.claude.com
Keep the name exactly as shown. The workflow references it as ${{ secrets.ANTHROPIC_API_KEY }} — a typo here produces a silent empty string, which means Claude gets an authentication error but the workflow may report success.
Failure mode: if you skip this and reference the secret in the workflow, the action exits with API key not found. The error message is clear; the fix is adding the secret. Check the Actions tab under the failed run for the full log.
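To take the guesswork out of a missing or misnamed secret, a guard step can check for it before the action runs. The following is a minimal sketch, not something the action provides: the step name is made up for illustration, and it assumes a Linux runner with a POSIX shell. It would sit in the steps list of the Step 2 workflow, ahead of the anthropics/claude-code-action@v1 step.

```yaml
# Hypothetical guard step (illustrative; not part of claude-code-action)
- name: Check that the Anthropic API key is present
  env:
    # An unset or misnamed secret expands to an empty string
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
  run: |
    if [ -z "$ANTHROPIC_API_KEY" ]; then
      echo "::error::ANTHROPIC_API_KEY is empty. Check the secret name in repository settings."
      exit 1
    fi
```

With this in place the job fails loudly at the guard, regardless of how the action itself reacts to an empty key.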
Step 2 — Write the workflow file
Create .github/workflows/claude-review.yml:
name: Claude Code Review
on:
  pull_request:
    types: [opened, synchronize]
    paths-ignore:
      # '**.md' matches Markdown at any depth; '*.md' would match only the repo root
      - '**.md'
      - 'docs/**'
jobs:
  review:
    runs-on: ubuntu-latest
    # Skip draft PRs: no cost until the PR is real
    if: github.event.pull_request.draft == false
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 1
      - uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          prompt: "Review this PR for bugs, security issues, and code quality. Be specific and concise. Flag anything that would block a merge."
          claude_args: "--model claude-sonnet-4-6 --max-turns 5 --max-budget-usd 0.50"
Three things worth noting:
prompt: not direct_prompt: — the old beta action used direct_prompt:. Every tutorial written before August 2025 has this wrong. The v1 action silently ignores unknown inputs, so if you paste from a pre-GA guide the action will fall back to @claude mention-trigger mode and do nothing on pull_request events. The current field is prompt:.
--max-turns 5 — without this, a runaway agentic session can run 20+ turns on a large PR. Five turns is enough for a read-only review.
--max-budget-usd 0.50 — a hard dollar cap per invocation. The action exits cleanly at the limit rather than running up unexpected costs on very large diffs.
This workflow also supports an interactive mode: if someone comments @claude on a PR or issue, Claude responds directly in the thread. To enable that, add these triggers alongside the pull_request event:
on:
  pull_request:
    types: [opened, synchronize]
  issue_comment:
    types: [created]
  pull_request_review_comment:
    types: [created]
For the interactive triggers, add issues: write to the permissions block.
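One way to keep the automated review and the @claude mode from stepping on each other is to give each its own job. This is a sketch under stated assumptions, not the action's documented layout: the job names are illustrative, the split into two jobs is a suggestion, and it relies on the behavior described above, where a step without prompt: stays in mention-trigger mode.

```yaml
jobs:
  review:
    # Automated review: pull_request events only, drafts skipped
    if: github.event_name == 'pull_request' && github.event.pull_request.draft == false
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 1
      - uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          prompt: "Review this PR for bugs, security issues, and code quality."

  interactive:
    # Comment events only, and only when the comment mentions @claude
    if: >-
      github.event_name != 'pull_request' &&
      contains(github.event.comment.body, '@claude')
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
      issues: write
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 1
      - uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          # No prompt: here, so the action waits for an @claude mention
```

Filtering on the comment body in the job condition also means comments that never mention @claude do not start the action at all.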
Step 3 — Trigger it and read the output
Push a commit to a non-draft PR. The workflow fires on pull_request + synchronize. After 30–90 seconds, Claude posts a review comment directly on the PR.
What to expect from the first run:
The comment format varies by what Claude found. For a diff with clear problems it produces a bulleted list of findings with file references and line numbers. For a clean diff it may post a brief summary or nothing at all. Silence is fine: no comment means nothing was flagged.
The Actions tab shows the full run output. If you add --output-format json to claude_args, the step output includes a total_cost_usd field you can monitor:
claude_args: "--model claude-sonnet-4-6 --max-turns 5 --output-format json"
Look at the raw logs to see per-run cost. Over a few weeks, this tells you whether your prompt is too broad (high token usage, generic comments) or well-scoped.
If the workflow runs but no comment appears: check that the permissions block includes pull-requests: write. The action needs write access to post. This is the most common silent failure on fresh setups — the workflow exits with code 0 but has done nothing.
Step 4 — Tune the prompt
The default prompt: in the example is intentionally generic. Two levers that change results most:
--append-system-prompt appends instructions to Claude’s system prompt without replacing the action’s built-in context:
claude_args: >-
  --model claude-sonnet-4-6
  --max-turns 5
  --append-system-prompt "This is a TypeScript/Node.js codebase.
  Flag: unchecked promises, missing error boundaries, type assertions that bypass the type system.
  Skip style comments; we have a linter for those."
Scoping the system prompt to your stack cuts noise. A generic “review for bugs” prompt will comment on Python idioms in a TypeScript codebase, or flag import ordering that your formatter handles automatically. The more specific the constraint, the better the signal-to-noise ratio of the output.
--model controls the cost-quality tradeoff directly:
# Security-sensitive codebase, $0.50/PR is acceptable:
claude_args: "--model claude-opus-4-8 --max-turns 8 --max-budget-usd 1.00"
# Fast-moving startup, $0.03/PR ceiling:
claude_args: "--model claude-haiku-4-5-20251001 --max-turns 3 --max-budget-usd 0.10"
# Standard: balanced cost and coverage
claude_args: "--model claude-sonnet-4-6 --max-turns 5 --max-budget-usd 0.50"
Sonnet 4.6 is about 40% cheaper than Opus and catches most issues. Haiku misses subtler logic bugs but works for straightforward style and security checks. Use Opus when the codebase is complex enough that you’d want a thorough human reviewer and you’re willing to pay for it.
Cost and rate-limit considerations
| Control | How to set it | Effect |
|---|---|---|
| Model tier | --model claude-sonnet-4-6 | Sonnet ≈ 40% cheaper than Opus |
| Turn cap | --max-turns 5 | Hard stop at 5 agentic steps |
| Dollar cap | --max-budget-usd 0.50 | Exits cleanly at limit |
| Event filter | types: [opened] (not synchronize) | Once per PR, not per push |
| Path filter | paths-ignore: ['**.md', 'docs/**'] | Skip doc-only changes ('*.md' alone matches only root-level files) |
| Draft filter | if: github.event.pull_request.draft == false | No cost on drafts |
| Concurrency | concurrency: cancel-in-progress: true | Cancels in-progress and queued runs on new push |
| Timeout | timeout-minutes: 10 | Kills runaway jobs |
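The controls in the table live at different levels of the workflow file, which is easy to get wrong when pasting them in one at a time. The sketch below shows one possible arrangement, reusing the values from this article; treat it as an illustration of placement, not a tuned configuration.

```yaml
name: Claude Code Review
on:
  pull_request:
    types: [opened, synchronize]   # or [opened] for one review per PR
    paths-ignore:
      - '**.md'                    # Markdown at any depth
      - 'docs/**'

# Workflow level: a new push cancels the run still in flight for the same PR
concurrency:
  group: claude-review-${{ github.event.pull_request.number }}
  cancel-in-progress: true

jobs:
  review:
    runs-on: ubuntu-latest
    timeout-minutes: 10            # job level: kills runaway jobs
    if: github.event.pull_request.draft == false
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 1
      - uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          prompt: "Review this PR for bugs, security issues, and code quality."
          # CLI level: model tier, turn cap, and dollar cap
          claude_args: "--model claude-sonnet-4-6 --max-turns 5 --max-budget-usd 0.50"
```

The event, path, and draft filters stop a run from starting; the concurrency, timeout, turn, and budget limits bound a run that has already started.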
Security
Prompt injection was a real vulnerability, patched in v1.0.94 (2026-04-13). Before the patch, a crafted issue body could inject instructions causing Claude to read ACTIONS_ID_TOKEN_REQUEST_TOKEN from the process environment and exchange it for a write-access GitHub token. Anthropic patched this with human actor verification, environment variable scrubbing, and a custom gh wrapper that blocks exfiltration patterns. If you are pinned to anything before v1.0.94, update now. Referencing uses: anthropics/claude-code-action@v1 pulls the latest v1 release automatically; pin to a specific SHA only if your security policy requires it.
Don’t use --dangerously-skip-permissions in CI. This flag disables all safety guardrails and allows Claude to execute arbitrary shell commands without confirmation. Tutorials recommend it to eliminate permission prompts, but it’s the wrong tradeoff in CI where Claude is reading untrusted diff content. Use --allowedTools instead:
# Grant only what a read-only review actually needs
claude_args: '--allowedTools "Read" "Bash(git log *)" --max-turns 5'
For read-only code review, Read is sufficient. Add Bash(git log *) if you want Claude to inspect commit history. Nothing else should be granted.
Fork PRs: There is currently no safe configuration for running this on fork PRs. The pull_request event withholds secrets from fork-triggered workflows, so Claude can’t authenticate. The pull_request_target event does expose secrets, but it runs in the base repository’s context with those secrets in scope, so checking out or processing the fork’s code there is a known attack vector. The safe default: skip fork PRs and let human reviewers cover them.
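Skipping fork PRs can be made explicit in the job condition, so fork-triggered runs are skipped up front instead of failing on authentication. A minimal sketch, assuming the job is named review as in Step 2; it compares the PR's head repository with the repository the workflow runs in, both of which are standard GitHub Actions context values.

```yaml
jobs:
  review:
    runs-on: ubuntu-latest
    # Run only when the PR branch lives in this repository (not a fork)
    # and the PR is not a draft
    if: >-
      github.event.pull_request.head.repo.full_name == github.repository &&
      github.event.pull_request.draft == false
```

Fork PRs then show the job as skipped in the checks list, which signals to human reviewers that no automated pass happened.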
Troubleshooting
1. No PR comment appears, workflow exits cleanly.
Missing pull-requests: write permission. This is the most common new-setup failure. Add the permission and re-run.
2. “API key not found” error.
The secret isn’t set, is named differently, or isn’t passed to the action. Verify the secret name matches exactly, then check that the step passes it as anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}.
3. Prompt with special characters breaks the action.
Unquoted prompt: values fail when the string contains colons or square brackets. Use a YAML block scalar:
prompt: |
  Review this PR for bugs and security issues.
  Flag: unchecked promises, SQL string concatenation, missing auth checks.
4. Workflow fires on every push and racks up cost.
types: [opened, synchronize] fires on every commit push to the PR. Change to types: [opened] for once-per-PR reviews, or add concurrency cancellation so only the latest push triggers a paid run:
concurrency:
  group: claude-review-${{ github.event.pull_request.number }}
  cancel-in-progress: true
5. Review comment is empty or returns null.
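This is an intermittent failure: exit code 0 with an empty result. When you run the CLI directly, validate the output explicitly:
# Capture the result field from the JSON output
RESULT=$(claude -p "..." --output-format json | jq -r '.result')
# Fail the step if the result is empty or the literal string "null"
if [ -z "$RESULT" ] || [ "$RESULT" = "null" ]; then
  echo "Claude returned empty output; check the Actions log"
  exit 1
fi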
Or use the JSON output’s result field as the guard: jq -e '.result != null' exits with code 1 on null, which will fail the workflow step.
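Inside a workflow, the same guard could look like the steps below. This is a sketch under several assumptions: it runs the CLI directly instead of going through the action, installs it from the @anthropic-ai/claude-code npm package, relies on jq being available on the runner, and writes to result.json, a file name chosen only for this example. The elided prompt is left as it appears above.

```yaml
# Sketch only: runs the CLI directly instead of through the action
- name: Install Claude Code CLI
  run: npm install -g @anthropic-ai/claude-code
- name: Review, then fail on empty output
  env:
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
  run: |
    # result.json is an arbitrary file name chosen for this example
    claude -p "..." --output-format json > result.json
    # jq -e exits non-zero when the expression is false or null,
    # which fails the step
    jq -e '.result != null and .result != ""' result.json > /dev/null
    # total_cost_usd is the cost field described in Step 3
    echo "Review cost (USD): $(jq -r '.total_cost_usd' result.json)"
```

Writing the JSON to a file first lets one run serve both the empty-output check and the per-run cost log.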
Next steps
To use Claude Code for automated fixes rather than read-only review, see the --allowedTools Edit,Write patterns in the official headless mode docs. Be aware this grants write access to the repository from within a CI runner — scope the allowlist carefully.
For non-GitHub CI systems: the same claude --bare -p approach works on GitLab CI/CD, Bitbucket Pipelines, and any CI with shell access. You’re running the CLI directly rather than through the Action wrapper. Set ANTHROPIC_API_KEY as an environment variable, use --bare to skip local context loading, and add --allowedTools Read to keep it read-only. GitLab CI/CD support is documented at code.claude.com/docs/en/gitlab-ci-cd.
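As a rough illustration of that approach on GitLab, a merge-request job might look like the following. Everything here is an assumption to adapt: the job name, the node:22 image, the npm install step, and the prompt wording are illustrative, and the CLI flags are the ones named in the paragraph above. ANTHROPIC_API_KEY is expected to be defined as a masked CI/CD variable.

```yaml
# .gitlab-ci.yml (sketch; job name and image are illustrative choices)
claude-review:
  image: node:22
  rules:
    # Run only in merge request pipelines
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  script:
    - npm install -g @anthropic-ai/claude-code
    # Read-only review; flags as described in the text above
    - claude --bare -p "Review the changes on this branch for bugs and security issues." --allowedTools Read --max-turns 5
```

In this form the review lands in the job log, not on the merge request; posting it as a comment would need an extra step that this sketch leaves out.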
For monorepo setups, see how to set up Claude Code for a monorepo — CI context scoping behaves differently there. If you want to go deeper on Claude Code automation beyond CI, Claude Code Hooks is the natural next read.