This article explains how to build semgrep-cli so that you can make and test changes to the Python wrapper.
The semgrep-cli name refers to the project that exposes the actual semgrep command. The README explains the relationship between semgrep-cli and semgrep-core.
Prerequisites
- Python >= 3.10 installed on your local machine.
- pipenv for managing your virtual environment.
  - Install it by following the pipenv documentation.
  - Ensure that pipenv is on your $PATH before proceeding.
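The two prerequisites above can be checked with a short script. The following is a minimal sketch and is not part of the Semgrep repository; the function name check_prerequisites and its parameters are hypothetical and exist only for this illustration.

```python
import shutil
import sys

# Minimum Python version stated in the prerequisites above.
MIN_PYTHON = (3, 10)


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems; an empty list means both prerequisites are met.

    check_prerequisites is a hypothetical helper, not part of Semgrep.
    """
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        found = ".".join(str(part) for part in version_info[:2])
        problems.append(f"Python >= 3.10 is required, found {found}")
    if which("pipenv") is None:
        problems.append("pipenv was not found on $PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print(issue)
    if not issues:
        print("All prerequisites are met.")
```

The version and the lookup function are parameters so that the check can be exercised without changing the interpreter or the real $PATH.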
Set up the environment
Most Python development is done inside the cli directory:
pipenv creates the virtual environment, installs dependencies such as pytest, and installs semgrep in editable mode in the virtual environment. From the cli directory, run the following command:
Your shell prompt is prefixed with (cli) when the virtual environment is active.
Next, install the Python dependencies:
INFO
SEMGREP_SKIP_BIN tells the installer that you’ll use your own semgrep-core; see below.
Running which semgrep should return a path within your virtual environment. On macOS, this is likely contained within $HOME/.local/share/virtualenvs/.
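The which semgrep check can also be expressed in Python. This is a minimal sketch under one assumption: that the active virtual environment exports the VIRTUAL_ENV variable. The helper name semgrep_in_virtualenv is hypothetical and not part of Semgrep.

```python
import os
import shutil
from pathlib import Path


def semgrep_in_virtualenv(semgrep_path, venv_root):
    """Return True if semgrep_path lies inside venv_root.

    Hypothetical helper for illustration; either argument may be None.
    """
    if not semgrep_path or not venv_root:
        return False
    return Path(semgrep_path).is_relative_to(Path(venv_root))


if __name__ == "__main__":
    # Equivalent of `which semgrep`: the first semgrep executable on $PATH.
    found = shutil.which("semgrep")
    # Assumption: the activated virtual environment sets VIRTUAL_ENV.
    venv = os.environ.get("VIRTUAL_ENV")
    print(f"semgrep: {found}")
    print(f"inside virtual environment: {semgrep_in_virtualenv(found, venv)}")
```

If the second line prints False, the semgrep on your $PATH is likely a different installation, such as a system-wide one.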
Get the semgrep-core binary
Almost all usages of semgrep-cli require the semgrep-core binary.
To get the binary, follow the instructions in Building semgrep-core. It takes approximately 20 minutes.
Use a precompiled binary
You can use a precompiled binary, but note two downsides:
- You cannot modify semgrep-core, for example, to fix a parse error.
- Semgrep scans fail if the interface between semgrep-cli and semgrep-core has changed since the binary was compiled. This has happened roughly every two months historically, but can happen at any time without notice.
If you installed Semgrep with Homebrew (brew install semgrep), a semgrep-core binary was bundled within that installation. However, it is not made available on your $PATH by default.
You can add the bundled binary to your $PATH with this series of commands, provided you have jq installed:
Run semgrep-cli
Ensure that you are in the cli/ directory, and then issue the following command:
Install semgrep
You can always run semgrep from cli/, which will use your latest changes in that directory, but you may also want to install the semgrep binary. To do this, run
If the installation fails, contact the semgrep team on Slack.
Now you can run semgrep --help from anywhere.
If you have installed semgrep-core from source, there are convenient targets in the root Makefile that let you update all binaries. After you pull, run:
cli/
Add Python packages to semgrep
Semgrep uses mypy to do static type-checking of its Python code. Therefore, when adding a new Python package, you also need to add typing stubs for that package. This can be done in three steps. For example, suppose you are adding the package pyyaml to Semgrep.
1. Install the corresponding package with typing stubs. For this pyyaml example, the corresponding package is types-pyyaml. In the following command, --dev specifies that this package is needed for development but not in production. This command updates cli/Pipfile with the typing stubs package, and adds both the typing stubs and the package itself to your Pipfile.lock. This allows you to import the package in your code (for example, import yaml as pyyaml).
2. Add the typing stubs package to .pre-commit-config.yaml so that the pre-commit mypy hook can find the package.
3. Add the original package to cli/setup.py in the install_requires list variable. You can find the version number either in the Pipfile.lock file or by looking up the most recent major version of the package online. This change makes your package a dependency of published Semgrep. Without this change, if you create a pull request, the CI job called build docker image fails with a ModuleNotFoundError, indicating it cannot find your package.
Troubleshooting
For a reference build that’s known to work, consult the root Dockerfile to build Semgrep inside a container. You can check that it builds with
Testing
semgrep-cli uses pytest for testing.
To run tests, run the following command:
semgrep-cli also includes pytest-benchmark to allow for basic benchmarking functionality. Run the following command: