Python Testing Tip #16 / April 14, 2026

3 Tips for AI Code Review That Doesn't Suck

AI has plenty of upsides. But there's one thing that's really driving me crazy: the sheer amount of noise AI produces. From social media to emails, and even in AI code reviews. On some projects, open-source maintainers are blocking external contributions because too many AI-generated issues and pull requests are flooding in. The same can happen to your merge requests if you just throw an AI code review process at them. So, how do you avoid that? How do you make AI code review actually useful?

Why do code reviews at all?

Before diving too deep into AI code review, let's first ask ourselves: "Why do we do code reviews in the first place?" Open-source maintainers do it to ensure new contributions are not malicious, fit the project, solve the problem, are well tested, and so on. Some teams do it because SOC2 requires it. Others, because "everyone else is doing it".

In my opinion, there are three major benefits of code review:

  • Context sharing - Someone else is also aware of the change. If the author is out of the office, there's someone else who's explicitly aware of how and where a change was made.
  • Looking at the same problem and solution from a different perspective - Someone else might more easily find a blind spot or inconsistency as they are looking from a slightly different perspective.
  • Alignment on how to do certain things - Maybe the problem that you solved could be done in a different way, or you could reuse something that's already in the code base.

Nothing here specifies how or when the code review should be conducted. It's up to you and your team whether you do it as you go while pairing, when an MR is ready, or once a week on the main branch. The important parts are:

  • it increases collaboration,
  • it increases confidence in how to do things in the team,
  • it helps you share the knowledge within the team,
  • it helps you improve the system.

If those criteria are not met, the code review is likely doing more harm than good.

Keep MRs Small for AI Code Review

As mentioned at the beginning of this series, we need to work in small chunks if we want to move fast. Code review is no different. When reviewing too large a part of the code base (e.g., a large MR), it's impossible to benefit much from it. That's because the context window is limited, both in our human brains and inside the AI tool of your choice. Yes, 1M tokens is a large context window, but it's still limited.

Tip #1: If you want AI code review to be useful, make sure you're reviewing small parts of the code. In MR terms, that means an MR small enough that you'd be willing to throw it away entirely if you see that you or the AI have completely derailed.
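As a rough sketch of how a team might enforce this, here's a small Python helper that parses `git diff --numstat` output and flags oversized MRs. The 400-line threshold and the function names are hypothetical, not from any existing tool; tune them to what your team can actually review in one sitting.

```python
import subprocess

# Hypothetical limit; tune it to what your team can review comfortably.
MAX_CHANGED_LINES = 400


def count_changed_lines(numstat: str) -> int:
    """Sum added + removed lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.splitlines():
        if not line.strip():
            continue
        added, removed, _path = line.split("\t", 2)
        if added == "-" or removed == "-":
            continue  # binary files report "-" instead of line counts
        total += int(added) + int(removed)
    return total


def mr_too_large(base: str = "main") -> bool:
    """Check whether the current branch's diff against `base` exceeds the limit."""
    numstat = subprocess.run(
        ["git", "diff", "--numstat", base],
        capture_output=True, text=True, check=True,
    ).stdout
    return count_changed_lines(numstat) > MAX_CHANGED_LINES
```

You could run this as a pre-push hook or an early CI step, so an oversized MR gets a nudge to split before any review, human or AI, even starts.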

Write a Custom AI Code Review Prompt

A generic code review prompt will, more often than not, produce nitpicking comments. That's pure noise. It's much more powerful if you and your team sit down together and define what the code review should check for. Reach mutual agreement and write it down as a markdown file that can be used as a prompt. Use the context you have: it's you who works with this system, so make your life easier, not harder. If you're short on ideas, look at the comments on previous MRs and try to formulate rules from them. The better you formulate your expectations, the better the AI's output will be.

Tip #2: Write down a command for code review together with your team. Make sure everyone contributes. Add the file to the repository.

For example:

Run `git diff -U5 main -- src/ tests/ pyproject.toml` to get the diff.

Then check for:
1. Imports belong at the top — no exceptions. Don't place imports inside functions, methods, classes, or try/except blocks.
2. Prefer keyword-only arguments in function and method signatures.
3. Flatten nested conditionals — favor guard clauses and early returns over deep nesting, especially for "nothing to do" scenarios.
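To make rules 2 and 3 concrete, here's a small illustrative Python example (the `Order` and `ship` names are invented purely for illustration). The `*` in the signature makes the flags keyword-only, and the guard clauses keep the happy path flat:

```python
from dataclasses import dataclass


@dataclass
class Order:
    items: list[str]
    paid: bool


# Rule 2: everything after "*" is keyword-only, so callers can't
# accidentally swap the two boolean flags at the call site.
# Rule 3: guard clauses handle the "nothing to do" cases up front
# instead of nesting the happy path several levels deep.
def ship(order: Order, *, express: bool = False, gift_wrap: bool = False) -> str:
    if not order.items:
        return "nothing to ship"
    if not order.paid:
        return "awaiting payment"
    mode = "express" if express else "standard"
    wrap = " gift-wrapped" if gift_wrap else ""
    return f"shipping {len(order.items)} item(s) via {mode}{wrap}"
```

A call like `ship(order, True)` now fails loudly instead of silently meaning the wrong flag, which is exactly the kind of thing you want your review prompt to catch.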

Add AI Code Review to Your CI/CD Pipeline

Once you have completed the first two steps, you need to run the code review. Add it to your CI/CD pipeline as an optional job that needs to be manually triggered. This way, you make sure you don't waste tokens on every push, and you can proceed with good old human-only review in case the AI API of your choice is down.

GitLab CI/CD + Claude Code example:

ai-review:
    image: node:22-alpine
    stage: test
    interruptible: true
    timeout: 10 minutes
    resource_group: ai-review-$CI_MERGE_REQUEST_IID
    variables:
      GIT_STRATEGY: clone
      GIT_DEPTH: 100
      CLAUDE_MODEL: claude-sonnet-4-6
      ANTHROPIC_API_KEY: ${AI_CODE_REVIEW_ANTHROPIC_API_KEY}
    before_script:
      - apk add --no-cache git bash curl jq
      - npm install -g @anthropic-ai/claude-code
      - git fetch origin main:refs/remotes/origin/main --depth="$GIT_DEPTH"
      # Redact sensitive env vars before Claude runs
      - |
        for var in $(env | grep -iE '^[^=]*(TOKEN|PASSWORD|SECRET|KEY)[^=]*=' | cut -d= -f1); do
          case "$var" in *[!a-zA-Z0-9_]*) continue;; esac
          [ "$var" != "ANTHROPIC_API_KEY" ] && export "$var=[REDACTED]"
        done
    script:
      - |
        claude -p "$(cat .claude/commands/code-review.md)" \
          --output-format json \
          --max-turns 30 \
          --model "$CLAUDE_MODEL" \
          --allowedTools "Read,Grep,Glob,Bash(git diff *),Bash(git log *),Bash(git show *),Bash(git merge-base *),Agent" \
          | jq -r '.result // "Review produced no output"' > ai-review-output.txt
    artifacts:
      paths:
        - ai-review-output.txt
      expire_in: 1 hour
    rules:
      - if: $CI_PIPELINE_SOURCE == "merge_request_event"
        when: manual
        allow_failure: true

ai-review-post:
    image: alpine:latest
    stage: test
    interruptible: true
    variables:
      GIT_STRATEGY: none
    needs:
      - job: ai-review
        artifacts: true
    before_script:
      - apk add --no-cache curl
    script:
      - |
        curl --fail --request POST \
          --header "PRIVATE-TOKEN: ${GITLAB_AI_CODE_REVIEW_TOKEN}" \
          "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}/notes" \
          --data-urlencode "body@ai-review-output.txt"
    rules:
      - if: $CI_PIPELINE_SOURCE == "merge_request_event"
        when: on_success
        allow_failure: true

A couple of things to notice here:

  • Sensitive environment variables are redacted before the Claude Code part runs, to avoid "Claude Code brought my production down" type of situations.
  • The jobs are optional and manually triggered.
  • A second job posts the code review output as a comment on the merge request.

GitHub Actions + Claude Code example:

name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  ai-review:
    if: contains(github.event.pull_request.labels.*.name, 'ai-review')
    runs-on: ubuntu-latest
    timeout-minutes: 10
    permissions:
      contents: read
      pull-requests: write
    steps:
      - name: Checkout PR branch
        uses: actions/checkout@v4
        with:
          fetch-depth: 100

      - name: Fetch main branch
        run: git fetch origin main --depth=100

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '22'

      - name: Install Claude Code
        run: npm install -g @anthropic-ai/claude-code

      - name: Run AI Review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.AI_CODE_REVIEW_ANTHROPIC_API_KEY }}
          CLAUDE_MODEL: claude-sonnet-4-6
        run: |
          claude -p "$(cat .claude/commands/code-review.md)" \
            --output-format json \
            --max-turns 30 \
            --model "$CLAUDE_MODEL" \
            --allowedTools "Read,Grep,Glob,Bash(git diff *),Bash(git log *),Bash(git show *),Bash(git merge-base *),Agent" \
            | jq -r '.result // "Review produced no output"' > ai-review-output.txt

      - name: Post review comment
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          gh pr comment ${{ github.event.pull_request.number }} \
            --body-file ai-review-output.txt

A couple of things to notice here:

  • The workflow only runs when the ai-review label is added to the PR, which keeps it optional and avoids wasting tokens on every push.
  • GitHub Actions secrets are not exposed to the environment by default, so there's no need for the env var redaction step.
  • The gh CLI is pre-installed on GitHub runners, so posting the comment is a single command.
  • permissions is scoped to the minimum needed: read the code, write PR comments.

Tip #3: Add it to the CI/CD pipeline and start using it. Update the code review prompt as you go and learn more.

Benefits vs. costs

When talking about anything AI, cost is an increasingly important aspect, and code review is no exception. At the time of writing, Anthropic's own code review feature is estimated to cost $15–$25 per review. That seems rather excessive, especially compared to the pricing of the Claude Code Max subscription. Obviously, the cost depends on token usage, which correlates with MR size. At Ren Systems, we've been running it on every backend MR for weeks, and the cost is somewhere between $0.15 and $1.50 per review. We use a combination of the latest Sonnet and Haiku models, and we're very satisfied with the results. A review usually takes 1–2 minutes to finish, and it's genuinely useful: we see almost the same comments as if someone from the team had added them. It's certainly one of the things helping us reduce the lead time from idea to a reliable production release.
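If you want a back-of-the-envelope estimate for your own setup, the arithmetic is simple: tokens divided by a million, times the per-million-token price for that model. The prices in this sketch are placeholder assumptions, not Anthropic's current pricing, so check your provider's price sheet before relying on the numbers:

```python
# Placeholder per-million-token prices (input, output) in USD.
# These are assumptions for illustration - verify against current pricing.
PRICE_PER_MTOK = {
    "sonnet": {"input": 3.00, "output": 15.00},
    "haiku": {"input": 0.80, "output": 4.00},
}


def review_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one review from its token usage."""
    p = PRICE_PER_MTOK[model]
    return (input_tokens / 1_000_000) * p["input"] \
        + (output_tokens / 1_000_000) * p["output"]


# A smallish MR: the diff plus some repo reads as input, a short review as output.
print(f"${review_cost('sonnet', 60_000, 4_000):.2f}")
```

Tracking the actual token counts from the CLI's JSON output against this kind of estimate is a quick way to notice when MRs (and therefore reviews) are quietly getting too big.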

Conclusion

If you want AI to produce something useful, it needs the right context. Without it, you quickly end up with noise, and AI code review is no exception. To make it work, collaborate with your team, write down what you want to get out of the review, and use that in a CI/CD job. Then learn and adapt. Do that, and AI code review will help you a lot. Just throw some AI code review tool at your codebase and workflow, and you'll get far less out of it. If you have any questions, feel free to reach out to me on Twitter or LinkedIn.

The complete testing system, not just tips.

Stop piecing together advice from blog posts. This course gives you a structured approach to Python testing that scales with your codebase and keeps your AI agents in check.

Get the Course $20