Rust Code Review: Best Practices + AI Tools (2026)
Rust code review is a high-stakes process. The language’s compile-time guarantees eliminate entire categories of bugs, but its complexity (ownership, lifetimes, concurrency, and macros) puts a heavy burden on the reviewer. A superficial review lets subtle performance pitfalls or incorrect unsafe blocks slip through, while a thorough review can take hours, slowing the whole team down.
The problem is that manual reviews don’t scale with team growth or the complexity of the codebase. As more engineers contribute, keeping the code idiomatic, performant, and safe becomes a real challenge. Engineers spend too much time on repetitive, low-level feedback, leaving less mental energy for architectural decisions that actually matter. The only way to solve this is with a combination of solid human judgment and targeted AI assistance, not just telling everyone to “be more careful.”
Common challenges in Rust code review
Even for experienced engineers, some aspects of Rust demand close attention during review.
- Ownership and borrowing complexity: The borrow checker catches most errors, but logical flaws still slip through. Reviews often find unnecessary clone() calls, overly complex lifetime annotations, or patterns that fight the borrow checker instead of working with it. None of this causes compilation errors, but it leads to performance loss or code that’s hard to maintain.
- Asynchronous programming: async/.await simplifies the syntax, but the mechanics behind Future and Pin, along with task execution details, are complex. Reviewing async code requires checking for potential deadlocks, correct use of synchronization primitives (like Mutex from tokio vs std), and ensuring futures aren’t dropped before completion.
- Macro opacity: Procedural and declarative macros are powerful, but they hide the code that actually runs. Reviewers need to understand the code the macro expands into, not just the invocation. That requires familiarity with the macro’s implementation or a tool such as cargo expand to inspect the expansion.
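As a minimal sketch of the async pitfall above (the function names are invented for illustration, and a tokio runtime is assumed for main), compare a std::sync::Mutex guard held across an .await point with a version that releases the lock first:

```rust
use std::sync::Mutex;

async fn flush_metrics() {} // stand-in for real async I/O

// Risky: the guard lives across the .await, so the lock stays held
// while the task is suspended, and the resulting future is not Send
// on a multi-threaded runtime.
async fn record_bad(counter: &Mutex<u64>) {
    let mut guard = counter.lock().unwrap();
    *guard += 1;
    flush_metrics().await; // lock is still held here
}

// Better: drop the guard before awaiting, or switch to
// tokio::sync::Mutex, whose lock() is designed for async code.
async fn record_ok(counter: &Mutex<u64>) {
    {
        let mut guard = counter.lock().unwrap();
        *guard += 1;
    } // guard dropped here, before the await point
    flush_metrics().await;
}

#[tokio::main]
async fn main() {
    let counter = Mutex::new(0);
    record_ok(&counter).await;
    println!("count = {}", counter.lock().unwrap());
}
```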
Best practices for Rust code review
A good review process makes the code and the team better. It’s about collective ownership, not just finding fault. Here are a few principles that keep the process constructive.
Keep comments actionable
Instead of saying “This is confusing,” try asking a question that guides the author to a better solution: “What if we named this variable user_id_from_cache to clarify its origin?” or “Could we extract this logic into a private function to simplify the main flow?”
Look for idiomatic Rust
The language has strong conventions. A review should check if the code uses iterators instead of manual for loops, prefers Result and Option for error handling over panics, and uses traits to define behavior. Following these conventions helps other Rust developers understand and maintain the code.
For example, a review should catch code that takes ownership when it only needs to read the value:
```rust
fn normalize_username(username: String) -> String {
    username.trim().to_lowercase()
}

fn main() {
    let username = String::from(" Gabriel ");
    let normalized = normalize_username(username);
    // username was moved and can no longer be used here
}
```
This works, but the function does not need to own username. It only reads it and returns a new normalized string. A more idiomatic version accepts a string slice:
```rust
fn normalize_username(username: &str) -> String {
    username.trim().to_lowercase()
}

fn main() {
    let username = String::from(" Gabriel ");
    let normalized = normalize_username(&username);
    println!("Original: {}", username);
    println!("Normalized: {}", normalized);
}
```
The second version gives the caller more flexibility. It works with string literals, &str slices, and borrowed Strings (through deref coercion), and it avoids moving ownership when the function only needs read access. This is the kind of small design choice that makes Rust APIs easier to use as the codebase grows.
Balance performance with readability
Rust gives you the tools to optimize heavily, but premature optimization is still a problem. A well-placed .clone() can be perfectly acceptable if it simplifies the ownership model and isn’t on a hot path. The review should question performance-driven complexity. Is this convoluted lifetime annotation really necessary? Have we profiled this to confirm it’s a bottleneck?
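As a hedged illustration (the Config type and worker function are hypothetical), a one-time clone on a cold path can be the simpler design:

```rust
#[derive(Clone)]
struct Config {
    endpoint: String,
}

// Cloning a small struct once per worker keeps ownership simple;
// threading a lifetime through every consumer would add complexity
// with no measurable benefit outside a hot path.
fn spawn_worker(config: &Config) -> std::thread::JoinHandle<()> {
    let config = config.clone(); // cold path: runs once, not per request
    std::thread::spawn(move || {
        println!("worker targeting {}", config.endpoint);
    })
}

fn main() {
    let config = Config { endpoint: "https://api.example.com".into() };
    spawn_worker(&config).join().unwrap();
}
```

The review question is not “is there a clone?” but “does this clone sit on a measured hot path?”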
Trust the author’s intent
The person who wrote the code has the most context. Assume they made their choices for a reason. Your job as a reviewer is to understand that reason and challenge it constructively if you see a better alternative. Start with questions, not directives.
Move long debates offline
If a comment thread on a PR has more than a few back-and-forths, you need a higher-bandwidth conversation. Jump on a quick call or start a design document. A PR is not the place to solve deep architectural disagreements.
Checklist for Rust code review
To keep things consistent, teams need a shared set of priorities. This checklist isn’t about ticking boxes, but about directing attention to what actually builds solid, maintainable systems.
1. Scope and impact of the change
- Does the PR clearly state the problem, the approach, and the impact on the system?
- Is the change small enough for a safe review?
- Were there changes to public API, contract, error format, or observable behavior?
- Were changes to Cargo.toml, Cargo.lock, features, or dependencies justified?
- If the project is a library, does the change respect semver and compatibility?
2. Modeling and Rust idiomaticity
- Do the types model the domain well, using enum, struct, and newtype instead of loose flags? (See the sketch after this list.)
- Is there excessive use of String, Vec, or owned types when &str, slices, or iterators would suffice?
- Does the code avoid unnecessary clone() calls?
- Are lifetimes simple and readable, without accidental complexity?
- Does the use of traits, generics, and bounds improve the design instead of making maintenance harder?
- Does the code follow idiomatic Rust conventions, instead of almost literally porting patterns from other languages?
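A minimal sketch of the modeling question above, with invented domain names:

```rust
// Loose flags: call sites can silently swap the booleans
fn ship_loose(_order_id: u64, _express: bool, _gift: bool) {}

// Newtype + enums: the compiler now enforces the domain
struct OrderId(u64);

enum ShippingSpeed {
    Standard,
    Express,
}

enum GiftWrap {
    None,
    Wrapped { note: String },
}

fn ship(_order: OrderId, _speed: ShippingSpeed, _wrap: GiftWrap) {}

fn main() {
    ship(OrderId(42), ShippingSpeed::Express, GiftWrap::None);
}
```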
3. Ownership, borrowing, and mutability
- Is ownership clear at each API boundary?
- Does the code prefer borrowing over moving/copying when that reduces cost and complexity?
- Were Rc, Arc, RefCell, Mutex, and interior mutability used only out of real necessity?
- Is there a risk of overly long borrows or excessive coupling by reference?
- Is mutability restricted to the smallest possible scope?
4. Error handling and failures
- Are Result and Option used correctly and consistently?
- Do unwrap() and expect() appear only where they are truly acceptable, such as in tests, prototypes, or very well-established invariants?
- Do errors have enough context for diagnosis?
- Does the code distinguish between recoverable failures and fatal failures?
- Were panics avoided in production paths and libraries?
- Does the exposed error type make sense for the layer, such as thiserror in libraries and anyhow at the application edge? (See the sketch below.)
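A minimal sketch of that layering, assuming thiserror and anyhow as dependencies (types and names are illustrative):

```rust
use anyhow::Context;
use thiserror::Error;

// Library layer: a typed, matchable error
#[derive(Debug, Error)]
pub enum StorageError {
    #[error("record {0} not found")]
    NotFound(u64),
    #[error("io failure")]
    Io(#[from] std::io::Error),
}

pub fn load(id: u64) -> Result<Vec<u8>, StorageError> {
    Err(StorageError::NotFound(id)) // illustrative failure
}

// Application edge: anyhow adds context without defining a new type
fn main() -> anyhow::Result<()> {
    let bytes = load(42).context("loading user profile")?;
    println!("{} bytes", bytes.len());
    Ok(())
}
```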
5. Concurrency and async
- Does the code avoid blocking inside async contexts? (See the spawn_blocking sketch after this list.)
- Are locks dropped before .await points rather than held across them?
- Do Send and Sync make sense for shared types?
- Is there a risk of deadlock, starvation, or unnecessary contention?
- Do tokio::spawn, tasks, and channels have proper lifecycle, cancellation, and error handling?
- Was using Arc<Mutex<T>> really the best choice?
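A sketch of the first question above, assuming a tokio runtime (the digest helper is a placeholder):

```rust
use std::path::PathBuf;

// CPU-heavy or blocking work stalls an executor thread if run
// directly inside async code; spawn_blocking moves it to a thread
// pool that exists for exactly this purpose.
async fn hash_file(path: PathBuf) -> std::io::Result<Vec<u8>> {
    tokio::task::spawn_blocking(move || {
        let bytes = std::fs::read(path)?; // blocking I/O belongs here
        Ok(compute_digest(&bytes))
    })
    .await
    .expect("blocking task panicked")
}

// Placeholder standing in for a real hashing routine
fn compute_digest(bytes: &[u8]) -> Vec<u8> {
    bytes.iter().rev().copied().collect()
}

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let digest = hash_file(PathBuf::from("Cargo.toml")).await?;
    println!("{} digest bytes", digest.len());
    Ok(())
}
```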
6. Memory safety and unsafe
- Does every unsafe block have a clear justification and documented invariants? (See the sketch after this list.)
- Is the scope of unsafe minimized?
- Is there proper validation for FFI, pointers, buffers, alignment, and ownership crossing boundaries?
- Were repr(C) types used when necessary?
- Did the review check whether it would be possible to remove or better encapsulate unsafe?
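A minimal sketch of a well-scoped unsafe block with its invariant stated (the function itself is illustrative, not a recommendation to bypass bounds checks):

```rust
/// Returns the byte at `index`; the public API stays safe because
/// the invariant is established before the unsafe block.
fn byte_at(slice: &[u8], index: usize) -> Option<u8> {
    if index < slice.len() {
        // SAFETY: `index < slice.len()` was verified on the line
        // above, so `get_unchecked` cannot read out of bounds.
        Some(unsafe { *slice.get_unchecked(index) })
    } else {
        None
    }
}

fn main() {
    assert_eq!(byte_at(b"rust", 1), Some(b'u'));
    assert_eq!(byte_at(b"rust", 9), None);
}
```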
7. Performance and resource usage
- Are there avoidable allocations, unnecessary copies, or excessive conversions?
- Do data structures make sense for the volume and access pattern?
- Were loops, parsing, serialization, and hot paths handled carefully?
- Does the code create backpressure and limits for queues, buffers, and memory consumption? (See the channel sketch after this list.)
- Were premature optimizations avoided, but obvious bottlenecks addressed?
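A sketch of the backpressure question, assuming tokio’s bounded mpsc channel:

```rust
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    // Capacity 100: when the consumer lags, send().await suspends the
    // producer instead of letting an unbounded queue eat memory.
    let (tx, mut rx) = mpsc::channel::<u64>(100);

    tokio::spawn(async move {
        for i in 0..1_000u64 {
            tx.send(i).await.expect("receiver dropped");
        }
    });

    let mut sum = 0;
    while let Some(v) = rx.recv().await {
        sum += v; // stand-in for real processing
    }
    println!("processed sum = {sum}");
}
```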
8. Architecture and boundaries
- Does the change respect boundaries between domain, application, infra, and adapters?
- Do modules, crates, and features still have cohesive responsibilities?
- Are infrastructure details kept out of the domain layer?
- Is visibility (pub, pub(crate)) minimal and intentional?
- Did coupling between crates or modules increase unnecessarily?
9. Tests
- Are there tests for happy paths, errors, edge cases, and regressions?
- Did changes in public behavior come with integration or contract tests?
- Did concurrent and async paths, including potential race conditions, get adequate coverage?
- When it makes sense, are there property-based tests, snapshot tests, or fuzzing?
- Does the PR maintain or improve the reliability of the suite?
10. Observability and operations
- Do logs, traces, and metrics help operate the change in production?
- Do logs avoid exposing secrets, tokens, or sensitive data?
- Are error messages and events useful for troubleshooting?
- Does the change require controlled rollout, feature flag, migration, or playbook?
11. Maintainability
- Do names, modules, and functions clearly communicate intent?
- Were functions that are too long or have multiple responsibilities broken down?
- Do comments explain invariants and trade-offs, not the obvious?
- Was Rustdoc updated when the public API changed?
- Will the code be easy to evolve six months from now?
12. Tooling and pipeline quality
- Does it pass cargo fmt, cargo clippy, and cargo test?
- Were ignored lints justified?
- Do relevant changes call for extra checks like miri, sanitizers, benchmarks, or dependency audits?
- Do the minimum Rust version and workspace features remain consistent?
The role of AI in Rust code review
Humans are good at understanding intent and business logic. AI agents are good at identifying patterns, checking rules, and analyzing complex state. The most effective review process combines both.
The main role of AI is to help the team deliver with more speed and quality. It can point out security violations, identify non-idiomatic code, and suggest performance improvements. This frees up reviewers to focus on higher-level concerns.
- Automating initial checks: AI tools can detect incorrect borrowing patterns, missing error handling, and clone() calls inside loops with good accuracy. This reduces repetitive, low-value comments from human reviewers.
- Deeper analysis: More advanced tools can trace data flow across multiple files to identify potential concurrency issues or explain the implications of a complex unsafe block. They can also suggest refactorings to simplify logic or generate targeted tests for edge cases not covered in a pull request.
AI should complement the reviewer, not replace them. AI points out what is happening, like “this code allocates inside a loop.” The person decides why it matters, evaluating whether this is a critical performance path that needs optimization now or if the focus should first be on correctness.
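As a concrete example of that division of labor (the User type is hypothetical), a bot can flag the clone mechanically, while the human decides whether the borrow-returning version is worth the lifetime in the signature:

```rust
struct User {
    name: String,
}

// What a bot flags: an allocation per iteration
fn collect_names(users: &[User]) -> Vec<String> {
    users.iter().map(|u| u.name.clone()).collect()
}

// What a human weighs: borrowing avoids the copies, but ties the
// result's lifetime to `users`; only a hot path justifies the change
fn borrow_names(users: &[User]) -> Vec<&str> {
    users.iter().map(|u| u.name.as_str()).collect()
}

fn main() {
    let users = vec![User { name: "Ana".into() }];
    println!("{:?} {:?}", collect_names(&users), borrow_names(&users));
}
```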
Comparison of AI code review tools for Rust in 2026
Choosing the right tool depends on what your team needs in terms of Rust analysis, context understanding, and customization capability. A simple diff-based bot is very different from a tool that understands the entire repository.
| Tool | Rust depth | Cross-file context | Custom rules | Integrations | Price |
|---|---|---|---|---|---|
| Kodus | Excellent | Repository-level | Natural language + plugins | GitHub, GitLab, Bitbucket, Azure DevOps | $10 |
| CodeRabbit | Good | Diff-based | Limited | GitHub, GitLab, Bitbucket, Azure DevOps | $60 |
| Bito | Good | Diff-based (IDE-focused) | Limited | GitHub, GitLab, Bitbucket | $25 |
| CodeAnt AI | Moderate | Diff-based | Limited | GitHub, GitLab, Bitbucket, Azure DevOps | $30 |
| SonarQube | Good (security-focused) | Project-level (static analysis) | Yes (custom rules) | GitHub, GitLab, Bitbucket, Azure DevOps | $32 |
Kodus

Kodus was designed for complex languages like Rust. It doesn’t just analyze the diff, but builds a semantic understanding of the entire repository. This makes it possible to track lifetimes, data flow, and potential race conditions across files. It also allows defining custom rules in natural language. You can write something like: “Flag any new public function in the billing module that does not call authorize_transaction.” This type of project-specific rule, tied to context, is hard to replicate in other tools.
The open source core, model-agnostic approach, and BYOK (bring-your-own-key) model give full control over data and costs. This is useful for companies or sensitive projects that cannot send code to third-party services.
CodeRabbit

CodeRabbit provides line-by-line feedback, conversation summaries, and suggestions directly in the pull request. It works well for catching general issues and helping with the initial reading of the PR. Its understanding of Rust is good for common idiomatic patterns, but it usually lacks the depth needed to analyze complex unsafe blocks or broader architectural patterns across the repository.
Bito

Bito focuses on productivity inside the IDE. It helps generate code, write tests, and explain existing code. It has review features, but its strength is real-time assistance, not automated validation in CI/CD. It is useful for the person writing the code, but less as a mechanism to enforce team standards during review.
CodeAnt AI

CodeAnt AI also focuses on PRs, generating summaries and suggesting improvements. It supports multiple languages, but in Rust the analysis tends to be more shallow. It works for identifying common anti-patterns, but may generate more noise or miss subtle lifetime and concurrency issues.
SonarQube

SonarQube is a well-established static analysis tool. Its main strength is security (SAST) and tracking quality metrics over time. It has good support for Rust and can be self-hosted, which is essential for many large companies. The analysis is based on traditional static rules, which tend to be less flexible and slower to evolve than LLM-based approaches.
How to integrate AI into your Rust review process
Adopting these tools without a plan will just create more noise.
- Start with a pilot: Use the tool in a small, non-critical project, with a team open to testing. Adjust the configuration and understand the signal vs noise level.
- Customize the rules: Default configurations tend to generate a lot of noise. Disable what doesn’t make sense and use custom rules to reflect your team’s standards.
- Train the team: Developers need to know how to use the AI. Is the feedback mandatory or just a suggestion? Who is responsible for responding to comments? Define a clear process.
The best flow puts AI first. The bot runs on every PR commit and gives immediate feedback. The developer resolves automated feedback before requesting human review. This way, when a more experienced engineer steps in, they don’t waste time on basic details and can focus on architecture and logic.
How to choose the right tool
The choice depends on what your team needs.
- For open source projects: A tool with a free or open source plan is essential. Kodus’s open source core is a good option, as well as CodeRabbit’s free plan.
- For startups and small teams: Ease of setup and simple per-user pricing usually matter more. Kodus, CodeRabbit, or CodeAnt AI can deliver value quickly without much configuration.
- For enterprise or self-hosted: Security, privacy, and deep customization are required. The ability to run self-hosted and use BYOK is essential. Kodus fits this scenario well, allowing processing inside the company’s infrastructure. SonarQube is the more traditional choice, with solid static analysis and a strong focus on compliance.
The goal of all this is to protect the team’s attention. Human review is expensive and slow. When AI handles repetitive work, your best engineers can focus on what really matters: architecture, business logic, and long-term code health.
FAQ
What is the best AI code review tool for Rust?
The “best” tool depends on what you need. For teams that require deep repository-level analysis, enforcement of domain-specific rules, and control over data (self-hosted/BYOK), Kodus is the best option due to its semantic understanding of Rust and natural language rules.
How do you review unsafe Rust code safely?
Reviewing unsafe requires extreme care. The first question should always be: “Is this unsafe block necessary?” If it is, the review needs to focus on the justification comment. It should clearly state which invariants are being manually upheld that the compiler cannot verify. The block should be as small as possible, isolating unsafe operations. The reviewer needs to act like an adversarial compiler, trying to break those guarantees. AI tools help by pointing out common incorrect usages, but experienced human review is still essential.
Are there free AI code review tools for Rust?
Yes. Many tools offer free plans for open source projects or small teams. CodeRabbit offers a free plan for public repositories. Kodus’s core is open source, allowing free usage via self-hosting, usually connected to an LLM API with your own key. SonarQube also has a free and open source Community version.
Are there self-hosted options available?
Yes. For companies with strict security and privacy requirements, self-hosting is important. Kodus was designed to run self-hosted, giving full control over the code. SonarQube also offers a Data Center version for enterprise use.
What should a Rust code review check first?
A Rust code review should start with intent and impact, not syntax. First, check if the PR clearly explains the problem, the approach, and the system impact. Then review ownership, API boundaries, error handling, async behavior, unsafe usage, tests, and changes to dependencies or public APIs. Checks like cargo fmt, cargo clippy, and cargo test should run before human review.
What are the most common mistakes in Rust code reviews?
The most common mistakes are reviewing Rust like any other language, over-relying on the compiler, and only checking whether the code works. The compiler prevents many memory bugs, but it does not catch poorly designed APIs, unnecessary clone(), blocking inside async code, errors without context, overuse of Arc<Mutex<T>>, or subtle performance issues.
How does AI improve the quality of Rust code review?
AI improves review by catching repetitive issues before a person even looks at the PR. It can flag unnecessary allocations, strange borrowing patterns, weak error handling, missing tests, and risks in async or concurrency. The biggest gain is not replacing human review, but reducing noise so more experienced engineers can focus on architecture, correctness, and business logic.
How to reduce false positives in AI code review for Rust?
The best way is to configure the tool based on your repository, not rely only on default rules. Disable low-value suggestions, add project-specific rules, ignore generated files, and teach the tool your team’s patterns. In Rust, this is even more important because some uncommon patterns may be intentional for performance, FFI, embedded systems, or low-level control.
What is the best LLM model for code review?
There isn’t a single best model. Benchmarks like Code Review Bench show that different models excel at different things. Some are better at finding bugs, others at giving useful suggestions or handling context. In practice, the best model is the one that works best with your code and your team’s context. See comparisons here: https://www.codereviewbench.com/