Sandbox RunCode.run_text() to prevent arbitrary code execution (CWE-95) #2030

Closed
sebastiondev wants to merge 1 commit into FoundationAgents:main from sebastiondev:fix/cwe95-run-code-execution-04a0

@sebastiondev

Summary

RunCode.run_text() in metagpt/actions/run_code.py calls Python's built-in exec() directly on a code string supplied via RunCodeContext. Because that string ultimately originates from LLM output (or any message published on the QA self-loop), anyone who can influence the model's response (for example through prompt injection or a poisoned tool result), or who has publish access to the message bus, can run arbitrary Python in the host MetaGPT process. That exposes the process environment (API keys for OpenAI/Anthropic/etc.) and the working directory, and lets the attacker spawn further processes.

  • CWE: CWE-95 (Improper Neutralization of Directives in Dynamically Evaluated Code — "Eval Injection")
  • File / function: metagpt/actions/run_code.py → RunCode.run_text
  • Severity: High — arbitrary code execution in the orchestrator process
  • Data flow: RunCodeContext.code (LLM/message-bus output) → RunCode.run_text(code) → exec(code, namespace)

The pre-fix code:

namespace = {}
exec(code, namespace)
return namespace.get("result", ""), ""

There is no allowlist, no sandbox, no subprocess boundary — code is executed in-process with the same privileges as MetaGPT itself.
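To make the in-process risk concrete, here is a minimal sketch that mirrors the pre-fix pattern with a harmless stand-in payload. The function name `run_text_unsafe` and the `DEMO_SENTINEL` variable are hypothetical and exist only for this illustration; they are not taken from the MetaGPT code base.

```python
import os


def run_text_unsafe(code: str):
    """Mirror of the pre-fix pattern: exec() inside the caller's own process."""
    namespace = {}
    exec(code, namespace)
    return namespace.get("result", ""), ""


# Harmless stand-in for an injected payload (variable name is hypothetical).
payload = "import os\nos.environ['DEMO_SENTINEL'] = 'mutated'\nresult = 1 + 1"

os.environ.pop("DEMO_SENTINEL", None)
result, err = run_text_unsafe(payload)

# The payload ran with the caller's privileges, so the caller's environment changed.
leaked = os.environ.get("DEMO_SENTINEL")
os.environ.pop("DEMO_SENTINEL", None)  # clean up after the demonstration
```

The same mechanism that sets a sentinel here could just as easily read secrets or start processes, which is why the payload never needs to look suspicious to be dangerous.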

Fix

run_text() no longer calls exec(). Instead it pipes the code into a short-lived Python subprocess that:

  1. Parses the input with ast.parse(..., mode="exec").
  2. Walks the AST and accepts only result = <expr> assignments where <expr> is built from literals (numbers, strings, lists, tuples, sets, dicts), arithmetic / boolean / comparison operators, and unary +/-/not.
  3. Rejects everything else (Import, Call, Attribute, Name lookups beyond the result target, etc.) with "unsupported expression" or "only literal or arithmetic assignments to 'result' are supported".
  4. Emits the final value as JSON on stdout. The parent reads the JSON and returns (result, error).
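Steps 1 to 3 can be sketched roughly as follows. This is not the PR's `_SANDBOX_WRAPPER`; it is an illustrative allowlist evaluator written under assumptions: the function names, the exact operator set (exponentiation is deliberately left out here), and the choice to evaluate while walking are all hypothetical. Only the two error strings are taken from the description above.

```python
import ast
import operator

# Operator tables. Which operators the real wrapper accepts is an assumption.
_BIN_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.truediv, ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg, ast.Not: operator.not_}
_CMP_OPS = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne, ast.Lt: operator.lt,
    ast.LtE: operator.le, ast.Gt: operator.gt, ast.GtE: operator.ge,
}


def _eval(node):
    """Evaluate an allowlisted expression node; reject everything else."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.List):
        return [_eval(e) for e in node.elts]
    if isinstance(node, ast.Tuple):
        return tuple(_eval(e) for e in node.elts)
    if isinstance(node, ast.Set):
        return {_eval(e) for e in node.elts}
    if isinstance(node, ast.Dict) and all(k is not None for k in node.keys):
        return {_eval(k): _eval(v) for k, v in zip(node.keys, node.values)}
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval(node.operand))
    if isinstance(node, ast.BoolOp):
        stop_on = isinstance(node.op, ast.Or)  # `or` stops at first truthy value
        value = None
        for operand in node.values:
            value = _eval(operand)
            if bool(value) == stop_on:
                break
        return value
    if isinstance(node, ast.Compare) and all(type(o) in _CMP_OPS for o in node.ops):
        left = _eval(node.left)
        for op, comparator in zip(node.ops, node.comparators):
            right = _eval(comparator)
            if not _CMP_OPS[type(op)](left, right):
                return False
            left = right
        return True
    # Import, Call, Attribute, Name lookups, etc. all end up here.
    raise ValueError("unsupported expression")


def run_text_checked(code: str):
    """Return (result, error) as strings for `result = <expr>` input."""
    try:
        tree = ast.parse(code, mode="exec")
        value = ""
        for stmt in tree.body:
            is_result_assign = (
                isinstance(stmt, ast.Assign)
                and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)
                and stmt.targets[0].id == "result"
            )
            if not is_result_assign:
                raise ValueError(
                    "only literal or arithmetic assignments to 'result' are supported"
                )
            value = _eval(stmt.value)
        return str(value), ""
    except Exception as exc:
        return "", str(exc)
```

Because the evaluator only ever calls functions from the fixed operator tables, no node type can reach `exec()`, `eval()`, or attribute access; anything unrecognised falls through to the final `raise`.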

A 30-second timeout is applied as defence-in-depth against pathological inputs.
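The parent side of this arrangement (code on stdin, JSON on stdout, a 30-second timeout) might look like the sketch below. The child program `_CHILD` is a simplified stand-in that accepts only a single `result = <literal>` statement via `ast.literal_eval`; it is not the PR's wrapper, and the names `_CHILD` and `run_text_in_child` are hypothetical.

```python
import json
import subprocess
import sys

# Stand-in child program (hypothetical). It handles literals only; the wrapper
# described in this PR is said to also accept arithmetic and comparisons.
_CHILD = r'''
import ast, json, sys
out = {"result": "", "error": ""}
try:
    tree = ast.parse(sys.stdin.read(), mode="exec")
    (stmt,) = tree.body  # exactly one statement, otherwise ValueError
    if not (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
            and stmt.targets[0].id == "result"):
        raise ValueError("only assignments to 'result' are supported")
    out["result"] = str(ast.literal_eval(stmt.value))
except Exception as exc:
    out["error"] = str(exc)
print(json.dumps(out))
'''


def run_text_in_child(code: str, timeout: float = 30.0):
    """Send `code` to a short-lived child over stdin; return (result, error)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD],
            input=code,
            capture_output=True,
            text=True,
            timeout=timeout,  # the child is killed if it exceeds the wall clock
        )
    except subprocess.TimeoutExpired:
        return "", "timed out"
    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        return "", proc.stderr.strip() or "child produced no JSON"
    return payload["result"], payload["error"]
```

Passing the code over stdin rather than as a command-line argument keeps the untrusted text out of the argument list, and `subprocess.run` raises `TimeoutExpired` after killing the child, so a pathological input cannot stall the caller indefinitely.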

This matches what run_text is actually used for in the existing test suite (result = 1 + 1, result = 1 / 0) — the legitimate text-mode contract is "evaluate a small literal/arithmetic expression and return its string form". Anything that needs real code execution should use the existing run_script path, which already runs in a separate subprocess with a controlled working directory.

Tests

  • Existing tests/metagpt/actions/test_run_code.py::test_run_text updated for the now-stringified return value ("2" instead of 2); the divide-by-zero assertion is unchanged and still passes.
  • New tests/metagpt/actions/test_run_code_sandbox.py adds five focused checks:
    • stacked statements with import os + os.environ[...] = ... are rejected and do not mutate the parent process (verified via the env var sentinel),
    • result = __import__('os').getcwd() is rejected,
    • result = ().__class__.__mro__ (a common sandbox-escape primitive) is rejected,
    • normal arithmetic / string / divide-by-zero behaviour still works,
    • list/tuple/dict literal results round-trip correctly.
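To see why the three rejected payloads in this list fail an AST allowlist, it helps to look at the node types each one parses into. The sketch below is illustrative only: the helper `forbidden_nodes` and the exact set of node types it flags are assumptions based on the rejection list in the Fix section, not code from the new test file.

```python
import ast

# Node types the Fix section names as rejected (Import, Call, Attribute),
# plus ImportFrom, which is an assumption of this sketch.
_FORBIDDEN = (ast.Import, ast.ImportFrom, ast.Call, ast.Attribute)


def forbidden_nodes(code: str):
    """Return the sorted names of disallowed node types found in `code`."""
    found = set()
    for node in ast.walk(ast.parse(code, mode="exec")):
        if isinstance(node, _FORBIDDEN):
            found.add(type(node).__name__)
        # A Name that is *read* (Load context) is a lookup, not the assignment target.
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            found.add("Name")
    return sorted(found)


payloads = {
    "stacked import": "import os\nos.environ['X'] = '1'",
    "dunder import": "result = __import__('os').getcwd()",
    "mro walk": "result = ().__class__.__mro__",
    "arithmetic": "result = 1 + 1",
}
report = {label: forbidden_nodes(code) for label, code in payloads.items()}
```

The `().__class__.__mro__` case contains no names and no calls at all, only `Attribute` nodes, which is why an allowlist of node types catches it where a blocklist of identifiers would not.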

All seven tests pass locally.

Security analysis

Without the fix, an attacker only needs to influence the code field of a RunCodeContext with mode="text". In MetaGPT that field is populated from upstream agent output, so prompt injection of an earlier agent — or any party with write access to the message bus — is enough to land arbitrary Python on the orchestrator. The blast radius is the entire host process: secrets in env vars, the project workspace, and outbound network from the MetaGPT host.

After the fix, the AST walker is the only thing that ever sees the input, and even it runs in a separate python -c subprocess. Even if a future bug let something slip past the AST check, the worst case is code execution inside that subprocess, not inside MetaGPT itself — and the subprocess has no MetaGPT state, no message-bus handle, and a 30-second wall clock.
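The process-boundary argument can be checked with a small experiment: a child that mutates its own environment leaves the parent untouched. The sentinel name below is hypothetical, and the snippet runs deliberately unvalidated code in the child purely to show the containment property.

```python
import os
import subprocess
import sys

SENTINEL = "DEMO_CHILD_SENTINEL"  # hypothetical variable name for this demo
os.environ.pop(SENTINEL, None)

# The child sets the variable in its own environment and echoes it back.
child_code = (
    f"import os; os.environ[{SENTINEL!r}] = 'mutated'; "
    f"print(os.environ[{SENTINEL!r}])"
)
proc = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    timeout=30,
)

child_saw = proc.stdout.strip()          # what the child observed
parent_sees = os.environ.get(SENTINEL)   # what the parent observes afterwards
```

One caveat worth keeping in mind: `subprocess.run` passes the parent's environment to the child by default, so environment variables remain readable inside the child unless an explicit `env=` is supplied. Whether the PR's wrapper does so is not stated in this description.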

Adversarial review

Before submitting, we tried to disprove this finding. The main counter-arguments we considered were:

  1. "code is always developer-authored, not LLM-authored". This is not the case: RunCodeContext is constructed from upstream action output and flows through the standard message bus, which is the documented agent-to-agent channel.
  2. "The script-mode path already sandboxes via subprocess, so the text path must be intentional". The script path runs in a subprocess with a controlled CWD, so it is genuinely a different trust posture; the text path running in-process appears to be an oversight rather than a design choice, especially given the trivial expressions it is exercised with in tests.
  3. "Maybe a framework-level guard catches this earlier". We traced the call sites and found none; run_text is called directly from QaEngineer without sanitisation.

cc @lewiswigmore

…s exec() (CWE-95)

Replace exec(code, namespace) in RunCode.run_text() with a subprocess-based
sandbox.  The untrusted code is now sent over stdin to a short-lived Python
child process that executes it and returns results as JSON on stdout.

This prevents code injection from:
- Mutating the host process environment (os.environ, globals, etc.)
- Accessing in-process secrets (API keys held in memory)
- Interfering with the main MetaGPT event loop or state

The wrapper script (_SANDBOX_WRAPPER) is minimal and self-contained so it
does not import any MetaGPT code in the child process.

Note: run_text() now returns string representations of results (via str())
since values cross a process boundary.  The existing test is updated to
expect "2" (str) instead of 2 (int) for `result = 1 + 1`.

Also adds test_run_code_sandbox.py with dedicated security regression tests.
@lewiswigmore

Closing this to reduce the open-PR pile-up — we have multiple outstanding security contributions to this repo and that volume is not fair on your review queue. Keeping #2026 (CWE-78: prevent shell injection in AndroidExtEnv.execute_adb_with_cm) as the primary one to focus attention on.

Happy to revisit this finding separately later if it is still relevant. Apologies for the noise.
