There are plenty of great sides to AI. But there's one thing that's really driving me crazy: the amount of noise produced with it. From social media to emails, and even in AI code reviews. Some open-source maintainers are blocking external contributions because too many issues and pull requests are being opened with AI. The same can happen in your merge requests if you just throw some AI code review process at them. So, how do you avoid that? How do you make AI code review actually useful?
Why code reviews at all?
Before diving too deep into AI code review, let's first ask ourselves: "Why do we do code reviews in the first place?" Open-source maintainers do it to ensure new contributions are not malicious, fit the project, solve the problem, are well tested, and so on. Some teams are doing it because SOC2 requires it. Others, because "everyone else is doing it".
In my opinion, there are three major benefits of code review:
- Context sharing - Someone else knows about the change. If the author is out of the office, there's still a person who's explicitly aware of how and where it was made.
- Looking at the same problem and solution from a different perspective - Someone else might more easily find a blind spot or inconsistency as they are looking from a slightly different perspective.
- Alignment on how to do certain things - Maybe the problem that you solved could be done in a different way, or you could reuse something that's already in the code base.
Nothing here specifies how or when the code review should be conducted. It's up to you and your team whether you do it as you go when pairing, when an MR is ready, or once per week on the main branch. The important parts are:
- it increases collaboration,
- it increases confidence in how to do things in the team,
- it helps you share the knowledge within the team,
- it helps you improve the system.
If those criteria are not met, code review is likely doing more harm than good.
Keep MRs Small for AI Code Review
As mentioned at the beginning of this series, we need to work with small chunks of work if we want to move fast. It's no different when it comes to code review. When the review covers too large a part of the code base (e.g., a huge MR), it's impossible to benefit much from it. That's because the context window is limited - in our human brains and inside the AI tool of your choice. Yes, 1M tokens is a large context window, but it's still limited.
Tip #1: If you want the AI code review to be useful, make sure that you're reviewing small parts of the code. In MR terms, that means an MR small enough that you're willing to throw it away completely if you see that you or the AI have derailed.
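As a rough pre-flight check, you can estimate whether a diff fits a sane review budget before spending tokens on it. The sketch below uses the common four-characters-per-token approximation rather than a real tokenizer, and the 50k-token budget is an arbitrary example, not a recommendation:

```python
def estimate_review_fit(diff_text: str, *, budget_tokens: int = 50_000) -> tuple[int, bool]:
    """Roughly estimate the token count of a diff (~4 chars/token)
    and whether it fits a self-imposed review budget."""
    approx_tokens = len(diff_text) // 4
    return approx_tokens, approx_tokens <= budget_tokens


# A 10k-character diff comfortably fits a 50k-token budget.
tokens, fits = estimate_review_fit("x" * 10_000)
```

If the check fails, that's a hint to split the MR rather than to raise the budget.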
Write a Custom AI Code Review Prompt
A generic code review prompt will, more often than not, produce nitpicking comments. That's pure noise. It's much more powerful if you and your team sit down together and define what to check for in the code review. Find mutual agreement and write it down as a markdown file that can be used as a prompt. Use the context that you have - it's you who's working with the system, so make your life easier, not harder. If you're short on ideas, check the comments on previous MRs and try to formulate rules from them. The better you formulate your expectations, the better the output will be.
Tip #2: Write down a command for code review together with your team. Make sure everyone contributes. Add the file to the repository.
For example:
```markdown
Run `git diff -U5 main -- src/ tests/ pyproject.toml` to get the diff.

Then check for:

1. Imports belong at the top — no exceptions. Don't place imports inside functions, methods, classes, or try/except blocks.
2. Prefer keyword-only arguments in function and method signatures.
3. Flatten nested conditionals — favor guard clauses and early returns over deep nesting, especially for "nothing to do" scenarios.
```
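Rules like these land better when the prompt file pairs each one with a short example. Here's a hypothetical snippet that satisfies all three checks above:

```python
import json  # rule 1: imports at the top, never inside functions


def load_config(path: str, *, strict: bool = False) -> dict:  # rule 2: keyword-only args
    """Read a JSON config file, returning {} for empty input."""
    with open(path) as f:
        raw = f.read()
    if not raw.strip():  # rule 3: guard clause for the "nothing to do" case
        return {}
    data = json.loads(raw)
    if strict and not isinstance(data, dict):
        raise ValueError("config root must be an object")
    return data
```

Concrete examples give the model (and new teammates) something to pattern-match against instead of interpreting the rules from scratch.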
Add AI Code Review to Your CI/CD Pipeline
Once you have completed the first two steps, you need to run the code review. Add it to your CI/CD pipeline as an optional job that needs to be manually triggered. This way, you make sure you don't waste tokens on every push, and you can proceed with good old human-only review in case the AI API of your choice is down.
GitLab CI/CD + Claude Code example:
```yaml
ai-review:
  image: node:22-alpine
  stage: test
  interruptible: true
  timeout: 10 minutes
  resource_group: ai-review-$CI_MERGE_REQUEST_IID
  variables:
    GIT_STRATEGY: clone
    GIT_DEPTH: 100
    CLAUDE_MODEL: claude-sonnet-4-6
    ANTHROPIC_API_KEY: ${AI_CODE_REVIEW_ANTHROPIC_API_KEY}
  before_script:
    - apk add --no-cache git bash curl jq
    - npm install -g @anthropic-ai/claude-code
    - git fetch origin main:refs/remotes/origin/main --depth="$GIT_DEPTH"
    # Redact sensitive env vars before Claude runs
    - |
      for var in $(env | grep -iE '^[^=]*(TOKEN|PASSWORD|SECRET|KEY)[^=]*=' | cut -d= -f1); do
        case "$var" in *[!a-zA-Z0-9_]*) continue;; esac
        [ "$var" != "ANTHROPIC_API_KEY" ] && export "$var=[REDACTED]"
      done
  script:
    - |
      claude -p "$(cat .claude/commands/code-review.md)" \
        --output-format json \
        --max-turns 30 \
        --model "$CLAUDE_MODEL" \
        --allowedTools "Read,Grep,Glob,Bash(git diff *),Bash(git log *),Bash(git show *),Bash(git merge-base *),Agent" \
        | jq -r '.result // "Review produced no output"' > ai-review-output.txt
  artifacts:
    paths:
      - ai-review-output.txt
    expire_in: 1 hour
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      when: manual
      allow_failure: true

ai-review-post:
  image: alpine:latest
  stage: test
  interruptible: true
  variables:
    GIT_STRATEGY: none
  needs:
    - job: ai-review
      artifacts: true
  before_script:
    - apk add --no-cache curl
  script:
    - |
      curl --fail --request POST \
        --header "PRIVATE-TOKEN: ${GITLAB_AI_CODE_REVIEW_TOKEN}" \
        "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}/notes" \
        --data-urlencode "body@ai-review-output.txt"
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      when: on_success
      allow_failure: true
```
A couple of things to notice here:

- Sensitive environment variables are overridden before the Claude Code step runs, to avoid "Claude Code brought my production down" type of situations
- Both jobs are optional
- There's a second job adding a comment to the merge request with the code review output
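The redaction loop in `before_script` is plain POSIX shell. The same idea expressed as a Python sketch (illustrative only - the CI job itself doesn't need Python) makes the intent easier to unit-test:

```python
import re

# Any variable whose name looks credential-like gets its value replaced.
SENSITIVE = re.compile(r"TOKEN|PASSWORD|SECRET|KEY", re.IGNORECASE)


def redact_env(
    env: dict[str, str],
    *,
    keep: frozenset[str] = frozenset({"ANTHROPIC_API_KEY"}),
) -> dict[str, str]:
    """Redact sensitive-looking variables, preserving an explicit allowlist."""
    return {
        name: "[REDACTED]" if SENSITIVE.search(name) and name not in keep else value
        for name, value in env.items()
    }
```

The allowlist matters: the one key the review job actually needs (`ANTHROPIC_API_KEY`) must survive, while everything else credential-shaped is neutralized before the agent can read it.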
GitHub Actions + Claude Code example:
```yaml
name: AI Code Review

on:
  pull_request:
    # 'labeled' ensures a run when the ai-review label is first applied,
    # not only on the next push
    types: [opened, synchronize, labeled]

jobs:
  ai-review:
    if: contains(github.event.pull_request.labels.*.name, 'ai-review')
    runs-on: ubuntu-latest
    timeout-minutes: 10
    permissions:
      contents: read
      pull-requests: write
    steps:
      - name: Checkout PR branch
        uses: actions/checkout@v4
        with:
          fetch-depth: 100

      - name: Fetch main branch
        run: git fetch origin main --depth=100

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '22'

      - name: Install Claude Code
        run: npm install -g @anthropic-ai/claude-code

      - name: Run AI Review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.AI_CODE_REVIEW_ANTHROPIC_API_KEY }}
          CLAUDE_MODEL: claude-sonnet-4-6
        run: |
          claude -p "$(cat .claude/commands/code-review.md)" \
            --output-format json \
            --max-turns 30 \
            --model "$CLAUDE_MODEL" \
            --allowedTools "Read,Grep,Glob,Bash(git diff *),Bash(git log *),Bash(git show *),Bash(git merge-base *),Agent" \
            | jq -r '.result // "Review produced no output"' > ai-review-output.txt

      - name: Post review comment
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          gh pr comment ${{ github.event.pull_request.number }} \
            --body-file ai-review-output.txt
```
A couple of things to notice here:
- The workflow only runs when the `ai-review` label is on the PR — this keeps it optional and avoids wasting tokens on every push
- GitHub Actions secrets are not exposed to the environment by default, so there's no need for the env var redaction step
- The `gh` CLI is pre-installed on GitHub runners, so posting the comment is a single command
- `permissions` is scoped to the minimum needed: read the code, write PR comments
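Both pipelines extract the review text with `jq -r '.result // "Review produced no output"'`. If you'd rather post-process in Python instead - for example, to truncate oversized comment bodies before posting - a rough equivalent (the 60,000-character cap is an arbitrary example, not a platform limit):

```python
import json


def extract_review(raw_json: str, *, max_chars: int = 60_000) -> str:
    """Pull the review text out of Claude Code's JSON output,
    falling back to a placeholder and truncating long bodies."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError:
        data = None
    result = data.get("result") if isinstance(data, dict) else None
    text = result or "Review produced no output"
    return text[:max_chars]
```

The fallback matters in CI: a malformed or empty response should post a visible "no output" note rather than fail the posting job with an empty comment body.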
Tip #3: Add it to the CI/CD pipeline and start using it. Update the code review prompt as you go and learn more.
Benefits vs. costs
When talking about anything AI, costs are becoming an increasingly important aspect. It's no different when it comes to code review. At the time of writing this article, Anthropic's costs are estimated at $15–$25 per review for their code review feature. That seems rather excessive - especially compared to the pricing of the Claude Code Max subscription. Obviously, the cost depends on token usage, which correlates with MR size. At Ren Systems, we've been running it on every backend MR for weeks, and the costs land somewhere between $0.15 and $1.50 per review. We're using a combination of the latest Sonnet and Haiku models, and we're very satisfied with the results. The AI review usually finishes in 1–2 minutes, and it's very useful. We see almost the same comments as if someone from the team had added them. It's certainly one of the things helping us reduce the lead time from idea to a reliable production release.
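As a back-of-the-envelope sanity check (the per-token rates below are illustrative Sonnet-class assumptions - check current pricing before relying on them):

```python
def review_cost(
    input_tokens: int,
    output_tokens: int,
    *,
    usd_per_m_input: float = 3.0,   # assumed rate, USD per million input tokens
    usd_per_m_output: float = 15.0,  # assumed rate, USD per million output tokens
) -> float:
    """Estimate a single review's API cost in USD under assumed rates."""
    return (
        input_tokens / 1e6 * usd_per_m_input
        + output_tokens / 1e6 * usd_per_m_output
    )


# e.g. 60k input tokens (diff + prompt + file reads) and 3k tokens of output:
cost = review_cost(60_000, 3_000)  # 0.18 + 0.045 = $0.225
```

A small MR reviewed this way lands well under a dollar, which matches the range we see in practice; costs climb with MR size, which is one more argument for keeping MRs small.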
Conclusion
If you want AI to produce something useful, it needs the right context. Without context, you quickly end up with noise. It's the same with AI code review. To make it work, collaborate with the team, write down what you want to get out of it, and use that in the CI/CD job. After that, learn and adapt. If you do that, AI code review will help you a lot. If you just throw some AI code review tool at your codebase and workflow, you'll gain way fewer benefits. If you have any questions, feel free to reach out to me on Twitter or LinkedIn.