With default CI configurations, Semgrep scans a monorepo as a single project. However, monorepos often contain a large amount of code, usually divided into different components or modules. As such, it can be helpful to scan a monorepo in parts, for the following reasons:
- To improve scan performance in CI and reduce CI run times
- To logically split the monorepo to simplify managing findings
NOTE: Historical secrets scanning relies on examining the entire repo commit history and cannot be split up by path the way other scan types, such as Code, Supply Chain, or Secrets, can. As such, it is recommended to turn off historical secrets scanning when splitting up a monorepo by path.
How to configure Semgrep in CI to split up a monorepo
When scanning a repo with Semgrep in CI, the base command is semgrep ci. To understand this default setup for your source code manager (SCM) and CI provider, see Getting started with Semgrep in continuous integration (CI).
Semgrep provides two features for splitting up a repo: the --subdir flag, and the --include and --exclude flags. Consider a monorepo named monorepo with four main modules, one of which lives at src/moduleA/.
To scan a single module, pass the --subdir flag with the relevant path so that Semgrep only scans files in that module’s code path.
When the path is src/moduleA/, the scan sends its results to a project called monorepo/src/moduleA. If you want to change the project name, set the SEMGREP_REPO_DISPLAY_NAME environment variable, available since Semgrep version 1.61.1.
For example:
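As a minimal sketch, the snippet below assembles such a command for the module at src/moduleA/. The project name monorepo/moduleA is a hypothetical value chosen for illustration, not a required naming scheme. The command is printed rather than executed, so that it can be reviewed and then pasted into a CI step.

```shell
# Module path taken from the example in the text.
module_path="src/moduleA/"

# Hypothetical project name; any name works as long as it is unique per module.
display_name="monorepo/moduleA"

# --subdir limits the scan to the module's code path.
# SEMGREP_REPO_DISPLAY_NAME overrides the default project name (monorepo/src/moduleA).
scan_cmd="SEMGREP_REPO_DISPLAY_NAME=$display_name semgrep ci --subdir $module_path"

# Print the assembled command for review.
echo "$scan_cmd"
```

Setting the variable on the same line as the command scopes it to that one scan, which avoids leaking the name into a later scan of a different module.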
Give each module its own project name, either the default one or one set through SEMGREP_REPO_DISPLAY_NAME. This is necessary to ensure findings have a consistent status, and it helps developers and security engineers understand which findings pertain to the modules they are responsible for.
To scan the entire monorepo, trigger one scan for each module.
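The one-scan-per-module idea can be sketched as a small loop. Only moduleA is named in this article; moduleB, moduleC, and moduleD are placeholder names assumed here to stand in for the other three modules. The function prints the commands instead of running them. In CI, each command would typically run as its own job.

```shell
# Placeholder module list: only moduleA comes from the text, the rest are assumed.
modules="moduleA moduleB moduleC moduleD"

# Prints one scan command per module. With --subdir alone, each scan reports
# to its default project, for example monorepo/src/moduleA.
print_module_scans() {
  for module in $modules; do
    echo "semgrep ci --subdir src/$module/"
  done
}

print_module_scans
```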
INFO: Only change SEMGREP_REPO_DISPLAY_NAME. Ensure that SEMGREP_REPO_NAME is still properly set (either automatically, if you use a supported SCM and CI provider, or explicitly), as with any Semgrep scan, to retain hyperlink and PR/MR comment functionality.
The --subdir flag takes a single folder as input. If you want to scan multiple folders as part of one scan, use --include and --exclude (see the CLI reference) to tell Semgrep which paths to include. This performs file targeting across the whole monorepo, but only analyzes the included files.
Unlike --subdir, --include and --exclude don’t automatically direct results to a corresponding project, so you always have to set SEMGREP_REPO_DISPLAY_NAME.
Here’s an example using --include.
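A minimal sketch of such a scan follows, assuming two modules are grouped into one project. The module name moduleB, the project name monorepo/moduleA-and-moduleB, and the exact glob patterns are illustrative assumptions; check the CLI reference for the pattern syntax your Semgrep version accepts. As before, the command is printed for review rather than executed.

```shell
# Hypothetical project name for the combined scan of two modules.
display_name="monorepo/moduleA-and-moduleB"

# The glob patterns are single-quoted inside the generated command so that the
# shell passes them to Semgrep unexpanded. The patterns themselves are assumed.
scan_cmd="SEMGREP_REPO_DISPLAY_NAME=$display_name semgrep ci --include='src/moduleA/**' --include='src/moduleB/**'"

# Print the assembled command for review.
echo "$scan_cmd"
```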
WARNING: If --include and --exclude are used in a semgrep ci scan without setting SEMGREP_REPO_DISPLAY_NAME, that scan might close findings that aren’t detected because that part of the repo was not scanned.
Examples using GitHub Actions
The following examples each provide a GitHub Actions workflow file. Each one is one of the four workflow files you would need to set up all the necessary scans. Each workflow file corresponds to a module of the monorepo that you would like to scan and treat as a separate project in Semgrep AppSec Platform. Place all the files in the monorepo’s .github/workflows/ folder.
You can name each workflow file whatever you like, but it may be helpful to name it after the module it corresponds to. In this example, something like semgrep_moduleA.yml would be ideal.