By default, Semgrep reports a tainted source-sink permutation only once and reports the data flow that traverses the shortest path. Any longer paths with the same source-sink combination are not shown.
Analysis of two tainted data flows
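Consider two tainted data flows, Call stack 1 and Call stack 2, that both carry taint from sourceA to sinkB. Call stack 1 is the shorter path but crosses file boundaries; Call stack 2 is longer and stays within a single file.
Interfile analysis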
If both tainted data flows are identified in the same scan, and the scan has interfile analysis enabled (--pro, or Pro Engine enabled in the Cloud Platform), only Call stack 1 is reported as a finding. It has a shorter path, and has the same sourceA -> sinkB taint.
This speeds up triage by ensuring you are only reviewing unique findings. It’s especially useful for languages with polymorphic classes that can add noise for a singleton taint.
Intrafile analysis
If only intrafile / interprocedural analysis is performed (--pro-intrafile), Semgrep reports a finding only for Call stack 2. Call stack 1 is not identified, because it crosses file boundaries.
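The reporting behavior described above can be modeled with a small sketch. This is a toy illustration, not Semgrep's implementation: the TaintFlow class, the reported_flows function, the file names, and the function names in each path are all hypothetical, chosen only so that Call stack 1 is shorter and crosses files while Call stack 2 is longer and stays in one file.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TaintFlow:
    """Hypothetical record of one tainted path from a source to a sink."""
    source: str
    sink: str
    path: tuple[tuple[str, str], ...]  # ordered (file, function) steps

    def crosses_files(self) -> bool:
        # A flow is interfile if its steps touch more than one file.
        return len({file for file, _ in self.path}) > 1


def reported_flows(flows: list[TaintFlow], interfile: bool) -> list[TaintFlow]:
    """Keep one flow per (source, sink) pair, preferring the shortest path.

    With interfile=False, flows that cross file boundaries are never seen,
    mirroring the intrafile-only case.
    """
    visible = [f for f in flows if interfile or not f.crosses_files()]
    shortest: dict[tuple[str, str], TaintFlow] = {}
    for flow in visible:
        key = (flow.source, flow.sink)
        if key not in shortest or len(flow.path) < len(shortest[key].path):
            shortest[key] = flow
    return list(shortest.values())


# Assumed shapes for the two call stacks (the real paths are not specified).
call_stack_1 = TaintFlow(
    "sourceA", "sinkB",
    (("a.py", "read_input"), ("b.py", "helper"), ("a.py", "run_sink")),
)
call_stack_2 = TaintFlow(
    "sourceA", "sinkB",
    (("a.py", "read_input"), ("a.py", "step_one"),
     ("a.py", "step_two"), ("a.py", "run_sink")),
)

both = [call_stack_1, call_stack_2]
print(reported_flows(both, interfile=True) == [call_stack_1])
print(reported_flows(both, interfile=False) == [call_stack_2])
```

In this model, enabling interfile analysis makes both flows visible, so the shorter Call stack 1 wins; restricting to intrafile analysis hides Call stack 1 entirely, leaving Call stack 2 as the only candidate.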
Best practices for testing tainted data flows
To understand in greater detail how Semgrep detects tainted data flows, you can use your own test cases to review different paths.
Dry runs
To avoid sending test data to Semgrep AppSec Platform and potentially confounding existing findings, use semgrep scan or semgrep ci --dry-run.
When testing locally, adding --dataflow-traces allows you to see the taint traces as you would in the Semgrep AppSec Platform UI.
Sample taint dataflow reporting
The following is an example that shows dataflow traces traversing multiple files, demonstrating interfile taint tracking:
Changing Pro analysis options
You can change whether you are using the --pro or --pro-intrafile option depending on the exact flow you’re testing, as described in the preceding section, Analysis of two tainted data flows.
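As a rough sketch of how the local-testing options fit together, the helper below assembles a semgrep scan command line that includes --dataflow-traces and switches between --pro and --pro-intrafile. The build_scan_command function, the rule path, and the target directory are hypothetical placeholders; only the Semgrep flags themselves come from the text above, plus the standard --config option. The sketch only builds and prints the command and does not run Semgrep.

```python
import shlex


def build_scan_command(rule_path: str, target: str, interfile: bool = True) -> list[str]:
    """Assemble a local `semgrep scan` invocation for testing a taint rule.

    interfile=True selects --pro; interfile=False selects --pro-intrafile.
    """
    engine_flag = "--pro" if interfile else "--pro-intrafile"
    return [
        "semgrep", "scan",
        "--config", rule_path,   # hypothetical path to the rule under test
        engine_flag,
        "--dataflow-traces",     # show taint traces in the CLI output
        target,                  # hypothetical directory of test cases
    ]


# Print both variants so they can be copied into a shell.
for interfile in (True, False):
    print(shlex.join(build_scan_command("rules/taint.yaml", "tests/", interfile)))
```

Running the same rule with each variant against the same test cases makes it easier to see which flows depend on crossing file boundaries.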