Reduce false positives in semgrep scan

Customize your rules

A Semgrep Community rule that generates a high rate of false positives is said to be noisy. If you find a noisy rule, you can:

Fork and customize that rule to improve its performance
Remove the rule from the scan

Set up local rules

To have more granular control over the rules in a ruleset, you must add the ruleset to your machine, then configure Semgrep to use those local rules.

Navigate to the repository that hosts the rule. Usually, the Semgrep community rules repository hosts the rule, but Semgrep also imports rules from other repositories. From the rule’s entry in the Registry, check the author of the rule to confirm the source.

Fork or clone the repository to create a local copy of all the rules.

To clone, click Code, then copy the cloning command and run it in your CLI. This creates a semgrep-rules repository.

To fork, click Fork and follow the steps provided by GitHub. You must also clone the forked repository to your machine.

In your CLI, navigate to your semgrep-rules repository.

Find the rules you want to use and copy them into a folder within your target codebase. Give the folder a descriptive name, such as semgrep-rules.

To use the local rules, run the following command:

semgrep scan --config='SEMGREP_RULES_FOLDER/'
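The copy step above can also be scripted. The following is a minimal sketch, not part of Semgrep itself: the function name copy_rules and the example paths are hypothetical, and it assumes you have already cloned the rules repository locally.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def copy_rules(rules_repo: Path, target_folder: Path, rule_paths: Iterable[str]) -> List[Path]:
    """Copy selected rule files from a local rules checkout into the target codebase.

    rule_paths are paths relative to the root of the cloned rules repository.
    Files are copied flat into target_folder, so rules that share a file name
    would overwrite each other (an accepted limitation of this sketch).
    """
    target_folder.mkdir(parents=True, exist_ok=True)
    copied = []
    for relative in rule_paths:
        source = rules_repo / relative
        if not source.is_file():
            raise FileNotFoundError(f"rule not found: {source}")
        destination = target_folder / source.name
        shutil.copy2(source, destination)
        copied.append(destination)
    return copied


if __name__ == "__main__":
    # Hypothetical paths: adjust them to your clone and your project.
    repo = Path("semgrep-rules")
    if repo.is_dir():
        copy_rules(repo, Path("my-project") / "semgrep-rules", ["python/example/rule.yaml"])
```

After the copy, point the --config option of semgrep scan at the destination folder.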

Customize a rule from a Semgrep Community ruleset

Edit the noisy rule to improve its performance.
Test your rule improvements by entering:

semgrep scan --config='SEMGREP_RULES_FOLDER/NAME_OF_IMPROVED_RULE.yaml'
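To make "noisy" concrete, the sketch below shows target code that a deliberately broad rule would over-report. It assumes a hypothetical rule that matches every subprocess.run(...) call; that rule and both function names are illustrative and not taken from the Registry.

```python
import subprocess
import sys


def run_fixed_command() -> str:
    # Likely false positive for the broad rule: the argument list is a
    # constant and no shell is involved, so a caller cannot influence it.
    result = subprocess.run(
        [sys.executable, "-c", "print('ok')"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def run_user_command(user_supplied: str) -> str:
    # Likely true positive: a caller-controlled string is handed to a shell.
    result = subprocess.run(
        user_supplied,
        shell=True,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

One way to tighten such a rule is to require shell=True in the pattern or to exclude constant arguments with a pattern-not clause. Run the edited rule against both functions to confirm that only the second one is still reported.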

Remove the rule from the scan

Delete the rule from the folder containing your Semgrep rules.

Use advanced analyses and Pro rules

Optimizing rules can be time-consuming. Often, a rule is not noisy in itself but lacks the additional analysis needed to detect true positives while ignoring false positives. Semgrep Code provides cross-function (interprocedural) and cross-file (interfile) analyses. These analyses both reduce false positives and detect true positives that Semgrep Community Edition (CE) can’t find. For some languages and frameworks, such as Java or the Python Django framework, Semgrep also provides advanced analyses that take into account details such as the language’s characteristics and framework-specific dataflows. These analyses are available by default once you’ve signed in to Semgrep.

NOTE: Semgrep Code is free for up to 10 users.

Sign in to Semgrep

You need a GitHub or GitLab account to sign in to Semgrep.

Enter the following command:

semgrep login

Follow the steps to create an account and proceed.

Optional: Enter semgrep ci to run a scan. By default, these scans use Semgrep Pro rules, cross-function analysis, and language-specific improvements.

TIP: You can’t use the --config option with semgrep ci once you are logged in. To use your custom rules, add them to your Policies page.

Analyses and improvements available by default

The following features are enabled by default and help reduce false positives.

Pro rules

Semgrep Pro rules are high-confidence, professionally maintained rules provided exclusively by Semgrep. The goal of Pro rules is to provide a set of well-supported rules with improved coverage across languages and vulnerability types. Semgrep Pro rules are written using Semgrep’s latest features and, in general, target users who are looking to produce accurate, actionable findings. To see the languages with Pro rules, go to Supported languages.

Cross-function analysis

Cross-function analysis means that interactions between functions are taken into account. This improves taint analysis, which tracks unsanitized variables flowing from a source to a sink through arbitrarily many functions. To see cross-function analysis in action, run the interactive example.
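As an illustration of the kind of flow described here, the following sketch passes an untrusted value through a helper function before it reaches a command string. The function names and the choice of source and sink are assumptions made for this example. Whether a given rule reports the flow depends on how that rule defines its sources, sinks, and sanitizers.

```python
import shlex


def read_name(form: dict) -> str:
    # Source (assumed): a value that comes straight from user input.
    return form["name"]


def build_greeting_command(name: str) -> str:
    # The tainted value passes through this helper unchanged.
    return "echo Hello " + name


def build_greeting_command_safe(name: str) -> str:
    # Sanitized variant: shlex.quote escapes shell metacharacters.
    return "echo Hello " + shlex.quote(name)


def handle_request(form: dict) -> str:
    name = read_name(form)                  # taint enters in one function
    command = build_greeting_command(name)  # and travels through another
    # Sink (assumed): a real handler would pass `command` to a shell here.
    return command
```

An analysis that looks at handle_request alone sees only two function calls. Connecting the source in read_name to the command built in build_greeting_command requires following the value across function boundaries.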

Language-specific improvements

Languages such as Java and frameworks such as Django, FastAPI, and Flask have specific improvements that take into account language features and implicit dataflows. To learn more:

Supported languages > Python

Semantic detection in Java

Enable cross-file analysis

Cross-file analysis (also known as interfile analysis) takes into account how information flows between files. In particular, cross-file analysis includes cross-file taint analysis, which tracks unsanitized variables flowing from a source to a sink through arbitrarily many files. Other analyses performed across files include constant propagation and type inference. Cross-file analysis is usually used in contrast to intrafile, or per-file analysis, where each file is analyzed as a standalone block of code. To run a scan with cross-file analysis, use the following command:

semgrep ci --pro
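The sketch below illustrates the kind of flow that crosses a file boundary. To stay runnable as a single block, both parts are shown together, with comments marking the two hypothetical files they would live in. The file names, function names, and the choice of source and sink are assumptions made for this example.

```python
import posixpath

# --- would live in request_utils.py (hypothetical file) ---


def read_report_name(params: dict) -> str:
    # Source (assumed): an untrusted request parameter.
    return params["report"]


# --- would live in views.py (hypothetical file) ---
# In a two-file layout this module would start with:
#   from request_utils import read_report_name


def report_path(params: dict, base_dir: str = "/srv/reports") -> str:
    name = read_report_name(params)
    # Sink (assumed): a path built from untrusted input and later opened.
    return posixpath.normpath(posixpath.join(base_dir, name))
```

With the input {"report": "../../etc/passwd"}, report_path resolves to a location outside base_dir. A per-file analysis of views.py sees only a call to an imported function, so linking the untrusted parameter to the path requires looking at both files.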

RUN SCA AND SAST SCANS WITH ONE COMMAND: The semgrep ci command can also run SCA scans with the Semgrep Supply Chain product, which makes use of the same analyses mentioned in this document to determine reachability and reduce false positives. Dataflow and interfile analyses in particular ensure that Semgrep Supply Chain provides a high true positive rate while reducing false positives. Read the Doyensec Software Composition Analysis Benchmark to learn more.
