Imagine opening your server’s terminal and letting a stranger type whatever they want. That is essentially what happens when untrusted input reaches an operating system command in your code. Allowing attackers to execute OS commands can result in exposure of sensitive data, disruption of services, and arbitrary code execution: in short, total compromise of the system.
To avoid running OS commands with untrusted input, use libraries that provide the same functionality wherever possible. If OS commands are unavoidable, use system libraries that separate the command from its arguments and flags, and always validate and escape the input.
In this article, we will first explain what OS command execution is and why developers use it. Next, we will cover common attacks that exploit this functionality. After that, we will show how these issues can be detected in code. Finally, we will discuss practical steps you can take to reduce risk.
What is Command Injection?
Applications sometimes call out to the underlying operating system to perform tasks that are difficult to implement otherwise. Examples include listing files, converting media formats, invoking system utilities, or starting background processes. The convenience of delegating work to existing system tools is what makes this technique appealing. The risk arises because system shells interpret more than just text. They recognize special characters such as &, ;, or | that can change the meaning of a command. If untrusted input from users, APIs, or external systems reaches the shell without proper handling, it can alter the command in ways the developer did not intend. This creates an opening for OS command injection.
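
To make the role of these special characters concrete, here is a minimal sketch that uses Python's standard `shlex` module to tokenize a command line roughly the way a POSIX shell would. The helper name `shell_tokens` and the sample commands are assumptions made for illustration. Nothing is executed, and `shlex` only approximates real shell parsing.

```python
import shlex


def shell_tokens(command_line: str) -> list[str]:
    """Split a command line into shell-like tokens.

    With punctuation_chars=True, operators such as ;, & and | are
    returned as separate tokens instead of being glued to a word.
    """
    lexer = shlex.shlex(command_line, posix=True, punctuation_chars=True)
    return list(lexer)


# What the developer had in mind: one command with one argument.
print(shell_tokens("ls /tmp"))

# The same command with attacker-influenced text: the ";" token
# ends the first command and "whoami" starts a second one.
print(shell_tokens("ls /tmp; whoami"))
```

The second call shows why the shell is the problem: the semicolon is not part of the directory name but an instruction to the shell to start another command.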
Common Command Injection Attacks
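One of the most basic forms of OS command injection involves chaining commands. Suppose an application accepts a user parameter and uses it directly in a system call. An attacker could supply input that ends the original command and appends a new one. For example, consider a URL that runs a script with user input. An attacker can place URL-encoded characters such as %20 (an encoded space) in the parameter value. Once the value is decoded and inserted into the command string, the command that is actually executed is the original command followed by one of the attacker's choosing.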
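
The chaining attack can be sketched as follows. The script name `./generate_report.sh`, the `name` parameter, and both URLs are hypothetical and chosen only for illustration. The code builds the command string but never runs it.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical script that the application runs with user input.
SCRIPT = "./generate_report.sh"


def build_command(url: str) -> str:
    """Vulnerable pattern: paste a query parameter into a shell command."""
    # parse_qs decodes percent-escapes such as %20 (space) and %3B (;).
    query = parse_qs(urlsplit(url).query)
    name = query.get("name", [""])[0]
    return f"{SCRIPT} {name}"


benign = "https://example.com/report?name=sales"
malicious = "https://example.com/report?name=sales%3B%20cat%20/etc/passwd"

# One command, as intended.
print(build_command(benign))

# Two commands: the decoded ";" terminates the first one.
print(build_command(malicious))
```

If the second string were handed to a shell, the shell would run the report script and then `cat /etc/passwd`, even though the developer only ever wrote one command.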
Detecting Command Injection Vulnerabilities in Your Code
To illustrate, consider a simplified Python Flask endpoint in which the directory parameter from the request is used in a system command. If someone requests a URL whose directory value ends the intended command and appends whoami, then whoami is executed as well. The vulnerability arises from the data flow: input from a web request moves directly into an OS command without filtering or validation.
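
A minimal sketch of this data flow is shown below. It uses plain functions in place of a Flask route handler so that it runs with the standard library alone. The function names are assumptions, and the `directory` argument stands in for the request parameter. The unsafe function is shown for contrast and is never called here.

```python
import os
import subprocess


def list_directory_unsafe(directory: str) -> str:
    """Vulnerable: the untrusted value becomes part of the shell command text."""
    # Source: `directory` (e.g. a web request parameter).
    # Sink: a command string interpreted by the shell.
    result = subprocess.run(
        "ls " + directory, shell=True, capture_output=True, text=True
    )
    return result.stdout


def list_directory_safer(directory: str) -> list[str]:
    """Safer: a library call treats the value as a path and nothing else."""
    # No shell is involved, so ";" or "|" are just characters in a path name.
    # A real handler would also restrict which directories may be listed.
    return sorted(os.listdir(directory))


# Usage (not executed here):
#   list_directory_unsafe("/tmp; whoami")  -> the shell runs ls, then whoami
#   list_directory_safer("/tmp; whoami")   -> raises an OSError, no such path
```

The safer variant follows the advice to replace an OS command with a library that offers the same functionality. It removes the shell from the data flow instead of trying to clean up the input on its way to the shell.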
Developers can look for red flags such as functions that invoke the shell (system, exec, popen, or subprocess with shell=True) combined with input that originates from outside the application. Tools like Semgrep can trace this flow automatically, identifying when untrusted sources, such as web request parameters, reach sensitive functions that execute commands. This makes it possible to scan your codebase for such patterns and fix them before release.
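
As a rough illustration of the red-flag search, the sketch below uses Python's standard `ast` module to list calls to `os.system`, `os.popen`, or any `subprocess` function called with `shell=True`. It is a toy under stated assumptions: the helper names and the sample source are invented for this example, it is not a Semgrep rule, and it matches syntax only. It does not trace whether the argument actually comes from an untrusted source, which is what taint-tracking tools add.

```python
import ast

# Functions that always hand their argument to a shell.
SHELL_FUNCS = {"os.system", "os.popen"}


def dotted_name(node: ast.AST) -> str:
    """Return 'os.system' for the expression os.system, or '' if unknown."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return f"{dotted_name(node.value)}.{node.attr}"
    return ""


def find_shell_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, function name) for each shell-invoking call."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        name = dotted_name(node.func)
        # True only for a literal shell=True keyword argument.
        shell_true = any(
            kw.arg == "shell"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if name in SHELL_FUNCS or (name.startswith("subprocess.") and shell_true):
            findings.append((node.lineno, name))
    return sorted(findings)


# Hypothetical code under review; it is parsed, never executed.
SAMPLE = '''import os, subprocess
os.system("ls " + user_dir)
subprocess.run(["ls", user_dir])
subprocess.run("ls " + user_dir, shell=True)
'''

for lineno, name in find_shell_calls(SAMPLE):
    print(f"line {lineno}: {name} invokes a shell")
```

The argument-list call on the third line of the sample is not reported, because no shell interprets its arguments.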