This document covers the Semgrepignore v2 target filtering system that replaces the legacy implementation, referred to as “v1”.
The target filtering process
A semgrep scan command takes one or more scan roots as
arguments. The default scan root is the current folder (.).
Scan roots are folders, individual files, or named pipes that should be
expanded into a list of regular files to be analyzed. Symbolic links are
allowed as scan roots.
Expanding a folder consists of listing its contents recursively with
the following exceptions:
- Symbolic links other than the original scan roots are ignored.
- In Git projects, Git submodules are ignored.
- Paths excluded via Semgrepignore patterns are ignored. Semgrepignore patterns come from several sources, which are detailed below.
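The expansion rules above can be sketched in Python as follows. This is a minimal illustration, not Semgrep's implementation: the function name expand_scan_root and the is_ignored callback are hypothetical, and Git submodule detection is left out.

```python
import os
from typing import Callable, Iterator


def expand_scan_root(
    scan_root: str,
    is_ignored: Callable[[str], bool] = lambda path: False,
) -> Iterator[str]:
    """Yield the files found under scan_root (hypothetical helper)."""
    # A scan root that is not a folder (file, named pipe) is kept as-is.
    # os.path.isdir follows symbolic links, so a symlinked scan root works.
    if not os.path.isdir(scan_root):
        yield scan_root
        return
    for folder, subfolders, files in os.walk(scan_root, followlinks=False):
        # Prune symlinked and ignored folders in place so os.walk skips them.
        subfolders[:] = sorted(
            name
            for name in subfolders
            if not os.path.islink(os.path.join(folder, name))
            and not is_ignored(os.path.join(folder, name))
        )
        for name in sorted(files):
            path = os.path.join(folder, name)
            # Symbolic links found during the walk are ignored.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if not is_ignored(path):
                yield path
```

The is_ignored callback stands in for the Semgrepignore pattern matching described later in this document.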
Semgrepignore v2 uses the project root to locate .semgrepignore
files as well as .gitignore files in Git projects. In v1, where there is no
notion of a project root, the .semgrepignore file is unique and
looked up in the current folder.
Semgrep determines the project root for each scan root by first
obtaining the real path (physical path) to the scan root. Then,
Semgrep searches up the file hierarchy for a .git folder or
similar used by one of the popular file version control systems
(Git, Mercurial, etc.) indicating a project root.
If no project root is found this way, it
defaults to the scan root itself if it is a folder or to its containing
folder if it is a regular file.
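The lookup described above can be sketched as follows. This is an illustration under stated assumptions: find_project_root is a hypothetical name, and the VCS_MARKERS tuple lists only .git and .hg because the text names Git and Mercurial; the markers Semgrep actually recognizes may differ.

```python
import os

# Assumption: marker folders chosen for illustration only.
VCS_MARKERS = (".git", ".hg")


def find_project_root(scan_root: str) -> str:
    """Return the project root for a scan root (hypothetical helper)."""
    # Resolve symbolic links first to get the physical path.
    real = os.path.realpath(scan_root)
    # Start from the scan root if it is a folder, else from its parent.
    start = real if os.path.isdir(real) else os.path.dirname(real)
    folder = start
    while True:
        if any(os.path.exists(os.path.join(folder, m)) for m in VCS_MARKERS):
            return folder
        parent = os.path.dirname(folder)
        if parent == folder:
            # Reached the filesystem root without finding a marker.
            return start
        folder = parent
```

Resolving the real path before searching means that a symlinked scan root is assigned to the project that physically contains it, not to the project containing the link.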
Sources of Semgrepignore patterns
A Semgrepignore pattern is a glob pattern that Semgrep matches against file paths to determine whether these paths should be allowed or disallowed as target files. Semgrep looks up Semgrepignore patterns in the following places:
- command-line --exclude and --include filters;
- the .semgrepignore file in the current folder (v1 only);
- all the .semgrepignore files in the project (v2 only);
- all the .gitignore files in the project (v2 only);
- default Semgrepignore patterns.
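These sources form two levels, and a path excluded at either level is ignored:
- command-line --exclude and --include filters;
- default Semgrepignore patterns, .gitignore files, .semgrepignore files.
Within a level, a later pattern can reselect a path that an earlier pattern deselected. For example, given the following .semgrepignore file:
    *.c
    !hello.c
and no --exclude or --include filters,
hello.c will be first deselected by *.c and then
reselected by the negated pattern !hello.c.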
However, if we move the *.c exclusion pattern to the command line by
invoking semgrep --exclude *.c,
the file hello.c is deselected and ignored even if
the .semgrepignore file contains !hello.c.
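The hello.c example can be reproduced with a small model. This sketch assumes that each group of patterns is evaluated on its own with the last matching pattern winning, and that a path excluded by either group is ignored; the function names are hypothetical, matching is done on the file name only, and --include filters are not modeled.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def excluded_by_level(name: str, patterns: Iterable[str]) -> bool:
    """Apply one group of patterns in order; the last match wins."""
    excluded = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        if fnmatchcase(name, glob):
            excluded = not negated
    return excluded


def is_ignored(
    name: str,
    cli_excludes: Iterable[str],
    file_patterns: Iterable[str],
) -> bool:
    """A path is ignored if either group excludes it."""
    return excluded_by_level(name, cli_excludes) or excluded_by_level(
        name, file_patterns
    )


# Both patterns in .semgrepignore: hello.c is reselected.
print(is_ignored("hello.c", [], ["*.c", "!hello.c"]))  # False
# *.c moved to --exclude: the negation can no longer reselect hello.c.
print(is_ignored("hello.c", ["*.c"], ["!hello.c"]))  # True
```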
In a Git project under Semgrepignore v2, .gitignore and
.semgrepignore files are consulted in the same order as in the
Gitignore specification. In a folder containing both a .gitignore
and a .semgrepignore file, the .gitignore file is read before the
.semgrepignore file.
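As an illustration of this ordering, the sketch below lists the candidate ignore files for a target path. It assumes the usual reading of the Gitignore specification, in which files closer to the target take precedence and are therefore applied after files higher up. The function name ignore_files_in_order is hypothetical, and the function does not check whether the files exist.

```python
import os
from typing import List


def ignore_files_in_order(project_root: str, target: str) -> List[str]:
    """List candidate ignore files for target, project root first."""
    rel_folder = os.path.dirname(os.path.relpath(target, project_root))
    parts = rel_folder.split(os.sep) if rel_folder else []
    # Build the chain of folders from the project root down to the target.
    folders = [project_root]
    for part in parts:
        folders.append(os.path.join(folders[-1], part))
    ordered = []
    for folder in folders:
        # Within a folder, .gitignore is read before .semgrepignore.
        for name in (".gitignore", ".semgrepignore"):
            ordered.append(os.path.join(folder, name))
    return ordered
```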
Default Semgrepignore patterns apply in projects that lack a main
.semgrepignore file. In v1, the main .semgrepignore file is
expected in the current folder. In v2, it is expected at the project
root. These default patterns are:
Semgrepignore pattern syntax
In Semgrepignore v2, the pattern syntax conforms to the Gitignore pattern syntax. Patterns are glob patterns that support * and ** with their usual
meanings. For example, the pattern **/tmp/*.js matches the paths tmp/foo.js and
src/tmp/bar.js.
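To make the meaning of * and ** concrete, here is a small translator from a glob to a regular expression. It is a simplified sketch that handles only *, ** and literal characters, and it ignores the other Gitignore features (?, character ranges, negation, trailing slashes, anchoring). The names glob_to_regex and matches are hypothetical.

```python
import re


def glob_to_regex(pattern: str):
    """Translate a glob made of *, ** and literal text into a regex."""
    out = []
    i = 0
    while i < len(pattern):
        at_segment_start = i == 0 or pattern[i - 1] == "/"
        if pattern.startswith("**/", i) and at_segment_start:
            out.append("(?:.*/)?")  # zero or more whole folders
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")  # anything, slashes included
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")  # anything within one path segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def matches(pattern: str, path: str) -> bool:
    return glob_to_regex(pattern).fullmatch(path) is not None


print(matches("**/tmp/*.js", "tmp/foo.js"))      # True
print(matches("**/tmp/*.js", "src/tmp/bar.js"))  # True
print(matches("**/tmp/*.js", "tmp/sub/foo.js"))  # False
```

The last call shows the difference between the two wildcards: a single * does not cross a folder boundary, so tmp/sub/foo.js is not matched.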
Note that the Gitignore specification contains subtleties associated
with determining whether a pattern is anchored (relative to the folder
containing the pattern) or floating (relative to the folder containing
the pattern or any of its subfolders). For
example, /a and a/b are anchored patterns, but a/ is not. Please
consult the Gitignore documentation for details.
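The rule behind these examples is stated in the Gitignore documentation: a pattern is anchored when it contains a slash at the beginning or in the middle, and a trailing slash does not count. A minimal check, with the hypothetical name is_anchored, could look like this:

```python
def is_anchored(pattern: str) -> bool:
    """Tell whether a Gitignore-style pattern is anchored."""
    # A leading "!" only negates the pattern; it does not affect anchoring.
    body = pattern[1:] if pattern.startswith("!") else pattern
    # A trailing slash restricts the match to folders; it does not anchor.
    if body.endswith("/"):
        body = body[:-1]
    return "/" in body


for pattern in ("/a", "a/b", "a/", "*.js"):
    print(pattern, is_anchored(pattern))
```

This covers only the anchoring decision; it does not match paths.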
As a deviation from the Gitignore syntax, Semgrepignore syntax supports
:include directives. An :include directive followed by an unquoted file path,
relative to the folder containing the source .semgrepignore file
(the current folder in v1), inserts the patterns from that file.
A common use case is to insert the line :include .gitignore at the
beginning of a .semgrepignore file so as to avoid duplicating the
Gitignore patterns. Included files may not contain :include
directives.
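The following sketch shows one way such a directive could be expanded. It is not Semgrep's parser: load_patterns is a hypothetical name, skipping blank lines and # comments follows the Gitignore convention, and raising an error on a nested :include is an assumption about how the restriction might be enforced.

```python
import os
from typing import List

INCLUDE = ":include "


def load_patterns(ignore_file: str, allow_include: bool = True) -> List[str]:
    """Read patterns from a file, expanding :include directives once."""
    base = os.path.dirname(os.path.abspath(ignore_file))
    patterns = []
    with open(ignore_file, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.rstrip("\n")
            if line.startswith(INCLUDE):
                if not allow_include:
                    # Assumption: nested directives are rejected outright.
                    raise ValueError(f"nested :include in {ignore_file}")
                # The path is relative to the folder of the including file.
                included = os.path.join(base, line[len(INCLUDE):].strip())
                patterns.extend(load_patterns(included, allow_include=False))
            elif line.strip() and not line.startswith("#"):
                patterns.append(line)
    return patterns
```

With a .semgrepignore file that starts with :include .gitignore, the returned list contains the Gitignore patterns first, followed by the patterns written in the .semgrepignore file itself.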
Legacy Semgrepignore v1
In Semgrepignore v1, the following exceptions to the v2 specification apply:
- unsupported: pattern negation with !
- unsupported: character ranges such as [a-z]
- only one .semgrepignore file is supported and it must be in the current folder