When Semgrep rules target specific languages, they generally do not match comments in the targeted code files. Comments are not part of the semantic and syntactic structure of the document, so in most cases they are ignored. However, it’s sometimes useful to match comments. For example, comments can control the behavior of other linters, such as type checkers. You might also have certain formatting standards for comments, such as requiring that a TODO comment contains a ticket capturing the required work.
To match comments with Semgrep, use the generic language target to invoke generic pattern matching. (Alternatively, you can use pattern-regex, which performs file-level matching rather than semantic or syntactic matching; that approach is beyond the scope of this article.)
Example rule
Suppose that your organization requires all TODO comments to have an associated Jira ticket. This rule matches TODO lines and reports any that do not contain atlassian.net, that is, any TODO without a Jira Cloud ticket link.
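A minimal sketch of such a rule is shown below. The rule id, message, and severity are illustrative assumptions, not necessarily the exact rule from the original article. The two patterns follow the description above, and the generic_ellipsis_max_span value of 0 assumes the goal is to keep each ellipsis on a single line.

```yaml
rules:
  - id: todo-without-jira-ticket   # hypothetical rule id
    languages:
      - generic                    # generic pattern matching, so comments are visible
    severity: WARNING              # illustrative choice
    message: TODO comments must link to a Jira Cloud ticket.
    options:
      # Assumed value: 0 keeps each ellipsis from spanning a newline.
      generic_ellipsis_max_span: 0
    patterns:
      # Match the text from a TODO marker onward...
      - pattern: TODO ...
      # ...unless the same span also mentions an atlassian.net link.
      - pattern-not: TODO ... atlassian.net ...
```

Under these assumptions, a line such as "TODO: refactor the parser" should be reported, while "TODO: https://example.atlassian.net/browse/ABC-123" should not (the hostname and ticket key here are placeholders).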
Try this pattern in the Semgrep Playground.
NOTE: The example rule uses the generic_ellipsis_max_span option, which limits the ellipsis to matching on the same line and prevents it from over-matching in this generic context.
Limiting the match to certain file types
If particular types of comments are only relevant for certain files, you can use the paths: key to limit the rule to files of that type. For example, mypy type ignores are only relevant in Python files.
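As a sketch of the mypy case, the rule below flags type-ignore comments only in files matching *.py. The rule id, message, and severity are hypothetical, and the pattern assumes the comment is written in the common "# type: ignore" form.

```yaml
rules:
  - id: mypy-type-ignore-comment   # hypothetical rule id
    languages:
      - generic
    severity: WARNING              # illustrative choice
    message: Avoid blanket mypy type ignores; fix the type error instead.
    paths:
      include:
        - "*.py"                   # only scan Python files with this rule
    # Quoted so that YAML does not treat the leading '#' as a comment.
    pattern: "# type: ignore"
```

Without the paths: restriction, the generic rule would also run against non-Python files, where the same text has no special meaning.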
Ignoring some comments in generic mode
You can ignore comments of particular types in generic mode by using the generic_comment_style option. For example, you can ignore C-style comments while still matching any other comment style.
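A hedged sketch of that configuration follows. It assumes the value c selects traditional C-style block comments (the option is documented as also accepting cpp and shell); the rule id, message, and severity are illustrative.

```yaml
rules:
  - id: todo-outside-c-comments    # hypothetical rule id
    languages:
      - generic
    severity: WARNING              # illustrative choice
    message: Found a TODO outside of a C-style comment.
    options:
      # Treat /* ... */ blocks as comments and skip them when matching.
      generic_comment_style: c
      # Assumed value: keep the ellipsis on a single line.
      generic_ellipsis_max_span: 0
    pattern: TODO ...
```

With this setting, a TODO inside a /* ... */ block should be skipped, while a TODO in a shell-style # comment should still be reported, because that style is not declared as a comment syntax for the rule.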