Sandbox RunCode.run_text() to prevent arbitrary code execution (CWE-95)#2030
Closed
sebastiondev wants to merge 1 commit into
Closed
Sandbox RunCode.run_text() to prevent arbitrary code execution (CWE-95)#2030sebastiondev wants to merge 1 commit into
sebastiondev wants to merge 1 commit into
Conversation
…s exec() (CWE-95) Replace exec(code, namespace) in RunCode.run_text() with a subprocess-based sandbox. The untrusted code is now sent over stdin to a short-lived Python child process that executes it and returns results as JSON on stdout. This prevents code injection from: - Mutating the host process environment (os.environ, globals, etc.) - Accessing in-process secrets (API keys held in memory) - Interfering with the main MetaGPT event loop or state The wrapper script (_SANDBOX_WRAPPER) is minimal and self-contained so it does not import any MetaGPT code in the child process. Note: run_text() now returns string representations of results (via str()) since values cross a process boundary. The existing test is updated to expect "2" (str) instead of 2 (int) for `result = 1 + 1`. Also adds test_run_code_sandbox.py with dedicated security regression tests.
|
Closing this to reduce the open-PR pile-up — we have multiple outstanding security contributions to this repo and that volume is not fair on your review queue. Keeping #2026 (CWE-78: prevent shell injection in AndroidExtEnv.execute_adb_with_cm) as the primary one to focus attention on. Happy to revisit this finding separately later if it is still relevant. Apologies for the noise. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
RunCode.run_text()inmetagpt/actions/run_code.pycalls Python's built-inexec()directly on acodestring supplied viaRunCodeContext. Because that string ultimately originates from LLM output (or any message published on the QA self-loop), an attacker who can influence the model's response — for example through prompt injection, a poisoned tool result, or any actor with publish access to the message bus — can run arbitrary Python in the host MetaGPT process. That gives them the environment (API keys for OpenAI/Anthropic/etc.), the working directory, and the ability to spawn further processes.metagpt/actions/run_code.py→RunCode.run_textRunCodeContext.code(LLM/message-bus output) →RunCode.run_text(code)→exec(code, namespace)The pre-fix code:
There is no allowlist, no sandbox, no subprocess boundary —
codeis executed in-process with the same privileges as MetaGPT itself.Fix
run_text()no longer callsexec(). Instead it pipes the code into a short-lived Python subprocess that:ast.parse(..., mode="exec").result = <expr>assignments where<expr>is built from literals (numbers, strings, lists, tuples, sets, dicts), arithmetic / boolean / comparison operators, and unary +/-/not.Import,Call,Attribute,Namelookups beyond theresulttarget, etc.) with"unsupported expression"or"only literal or arithmetic assignments to 'result' are supported".(result, error).A 30-second timeout is applied as defence-in-depth against pathological inputs.
This matches what
run_textis actually used for in the existing test suite (result = 1 + 1,result = 1 / 0) — the legitimate text-mode contract is "evaluate a small literal/arithmetic expression and return its string form". Anything that needs real code execution should use the existingrun_scriptpath, which already runs in a separate subprocess with a controlled working directory.Tests
tests/metagpt/actions/test_run_code.py::test_run_textupdated for the now-stringified return value ("2"instead of2); the divide-by-zero assertion is unchanged and still passes.tests/metagpt/actions/test_run_code_sandbox.pyadds five focused checks:import os+os.environ[...] = ...are rejected and do not mutate the parent process (verified via the env var sentinel),result = __import__('os').getcwd()is rejected,result = ().__class__.__mro__(a common sandbox-escape primitive) is rejected,All seven tests pass locally.
Security analysis
Without the fix, an attacker only needs to influence the
codefield of aRunCodeContextwithmode="text". In MetaGPT that field is populated from upstream agent output, so prompt injection of an earlier agent — or any party with write access to the message bus — is enough to land arbitrary Python on the orchestrator. The blast radius is the entire host process: secrets in env vars, the project workspace, and outbound network from the MetaGPT host.After the fix, the AST walker is the only thing that ever sees the input, and even it runs in a separate
python -csubprocess. Even if a future bug let something slip past the AST check, the worst case is code execution inside that subprocess, not inside MetaGPT itself — and the subprocess has no MetaGPT state, no message-bus handle, and a 30-second wall clock.Adversarial review
Before submitting, we tried to disprove this finding. The main counter-arguments we considered were: (1) "
codeis always developer-authored, not LLM-authored" — this is not the case;RunCodeContextis constructed from upstream action output and flows through the standard message bus, which is the documented agent-to-agent channel. (2) "The script-mode path already sandboxes via subprocess, so the text path must be intentional" — the script path runs in a subprocess with a controlled CWD, so it is genuinely a different trust posture; the text path running in-process appears to be an oversight rather than a design choice, especially given the trivial expressions it is exercised with in tests. (3) "Maybe a framework-level guard catches this earlier" — we traced the call sites and found none;run_textis called directly fromQaEngineerwithout sanitisation.cc @lewiswigmore