LLM tool integration

An LLM tool should treat the sandbox as an untrusted execution target. The model proposes code or commands; mvm runs them in a microVM under a policy. For the production tool boundary, request schema, response schema, and retention rules, see Agent tool contract.

LLM
-> tool call: run code
-> app validates request
-> mvm sandbox exec
-> app redacts output
-> LLM receives result
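
The "app validates request" step in the flow above could look roughly like the following. This is a minimal sketch, not the request schema from the Agent tool contract: the field names (`code`, `timeout_seconds`), the `RunCodeRequest` type, and both limits are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical limits; the real values belong in the tool contract.
MAX_CODE_BYTES = 64 * 1024
MAX_TIMEOUT_SECONDS = 30


@dataclass(frozen=True)
class RunCodeRequest:
    """Validated form of a run-code tool call (illustrative shape only)."""
    code: str
    timeout_seconds: int = 10


def validate_request(raw: dict) -> RunCodeRequest:
    """Reject malformed tool calls before anything reaches the sandbox."""
    unknown = set(raw) - {"code", "timeout_seconds"}
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    code = raw.get("code")
    if not isinstance(code, str) or not code.strip():
        raise ValueError("code must be a non-empty string")
    if len(code.encode("utf-8")) > MAX_CODE_BYTES:
        raise ValueError("code exceeds the size limit")

    timeout = raw.get("timeout_seconds", 10)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(timeout, bool) or not isinstance(timeout, int):
        raise ValueError("timeout_seconds must be an integer")
    if not 1 <= timeout <= MAX_TIMEOUT_SECONDS:
        raise ValueError("timeout_seconds is out of range")

    return RunCodeRequest(code=code, timeout_seconds=timeout)
```

Rejecting unknown fields keeps the model from passing options the application never meant to expose, such as a network or image override.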

Status: Planned lifecycle API.

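# Sandbox and NetworkPolicy belong to the planned lifecycle API;
# redact() is an application-side helper.
def run_code_tool(code: str) -> dict:
    sandbox = Sandbox.create(
        image="nix:./flake#python-tool",
        network=NetworkPolicy.deny_by_default(),
    )
    try:
        # Write the model-proposed code into the sandbox, then run it with a timeout.
        sandbox.files.write("/work/main.py", code.encode())
        result = sandbox.exec(["python", "/work/main.py"], timeout_seconds=10)
        return {
            "exit_code": result.exit_code,
            "stdout": redact(result.stdout),
            "stderr": redact(result.stderr),
        }
    finally:
        # Always stop the sandbox, even if the write or exec raises.
        sandbox.stop()
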
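
The example above calls `redact()` without defining it. One possible shape for that helper is sketched below. It is an assumption, not part of mvm: the pattern list, the `[REDACTED]` marker, and the output cap are illustrative, and the function accepts either `bytes` or `str` because the type of `result.stdout` is not specified here.

```python
import re

# Illustrative patterns only; a real deployment would maintain its own list.
_SECRET_PATTERNS = [
    # Shape of an AWS access key ID.
    re.compile(r"AKIA[0-9A-Z]{16}"),
    # Bearer tokens as they appear in HTTP Authorization headers.
    re.compile(r"bearer\s+[a-z0-9._~+/-]+=*", re.IGNORECASE),
    # PEM-encoded private keys, including the body between the markers.
    re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
]

# Hypothetical cap so a runaway program cannot flood the model's context.
MAX_OUTPUT_CHARS = 8000


def redact(output) -> str:
    """Mask secret-looking substrings and cap the length of tool output."""
    if isinstance(output, bytes):
        text = output.decode("utf-8", errors="replace")
    else:
        text = output
    for pattern in _SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    if len(text) > MAX_OUTPUT_CHARS:
        text = text[:MAX_OUTPUT_CHARS] + "\n[truncated]"
    return text
```

Pattern matching of this kind is a backstop. It does not replace keeping credentials out of the sandbox in the first place.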
  • Validate the tool request before it reaches mvm.
  • Set a timeout.
  • Keep network disabled unless the tool needs a named endpoint.
  • Use secret references only when the tool has a policy reason to access a credential.
  • Redact output before returning it to the model.
  • Store the audit/run identifier with the LLM trace.
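
The last item, storing the audit/run identifier with the LLM trace, might be sketched as follows. How mvm exposes a run identifier is not described here, so `run_id` is passed in as a plain string. The field names and the `llm_tool_trace` logger name are hypothetical.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("llm_tool_trace")


def record_tool_run(trace_id: str, run_id: str, exit_code: int) -> dict:
    """Link one sandbox run to the LLM trace that triggered it."""
    entry = {
        "trace_id": trace_id,        # identifier of the LLM trace or span
        "sandbox_run_id": run_id,    # audit/run identifier from the sandbox
        "tool": "run_code",
        "exit_code": exit_code,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    # One JSON object per line keeps the record easy to join with sandbox audit logs.
    logger.info(json.dumps(entry, sort_keys=True))
    return entry
```

With both identifiers in one record, an investigation can start from either the model's trace or the sandbox audit log and reach the other.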