Agent tool contract

Agent integrations should expose mvm as a narrow tool, not as ambient shell access. The application owns validation, policy selection, redaction, and cleanup. The microVM owns guest execution.

Use this contract when an LLM, coding agent, or workflow engine needs to run generated code, inspect files, call a command, or preserve recoverable state.

| Tool action | App responsibility | Sandbox responsibility |
| --- | --- | --- |
| Create sandbox | Choose image, policy, TTL, resource limits, and owner. | Boot the selected microVM artifact. |
| Write input files | Validate paths, size, content type, and retention. | Receive files under guest paths. |
| Run command | Validate argv, timeout, working dir, env refs, and output budget. | Execute inside the guest policy boundary. |
| Read output files | Restrict paths and size; scan or redact content before model use. | Return requested bytes from guest storage. |
| Return logs/results | Redact secrets and user data; attach audit/run IDs. | Emit command status, logs, and receipts where available. |
| Preserve state | Decide pause, cold, snapshot, or volume retention. | Save backend-specific state. |
| Cleanup | Stop, destroy, lock volumes, delete snapshots, or keep with explicit TTL. | Release compute and retained state according to command. |

Keep the model-facing request small and typed. Do not let the model pass raw host paths, arbitrary environment variables, or host network bindings.

{
  "language": "python",
  "files": [
    {
      "path": "/work/main.py",
      "content": "print('hello')"
    }
  ],
  "command": ["python", "/work/main.py"],
  "timeout_seconds": 20,
  "network": {
    "mode": "none"
  },
  "state": {
    "retention": "destroy"
  }
}

Validation rules:

  • reject absolute host paths and path traversal;
  • bound file count, file size, stdout, stderr, and runtime;
  • allow only known image or flake targets;
  • require deny-by-default network unless the caller has a policy grant;
  • require secret references, not literal secret values;
  • require an explicit retention choice: destroy, stop, cold, or snapshot.
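
The rules above could be enforced by a validator along these lines. This is a minimal sketch, not a shipped helper: the limits, the `/work` guest root, the `ALLOWED_IMAGES` mapping, the default TTL and output budget, and the `ValidationError` type are all assumptions made for illustration. Secret-reference checks are left out for brevity.

```python
from pathlib import PurePosixPath

# Hypothetical limits and allow-lists; none of these values come from mvm.
MAX_FILES = 16
MAX_FILE_BYTES = 256 * 1024
MAX_TIMEOUT_SECONDS = 60
MAX_OUTPUT_BYTES = 64 * 1024
DEFAULT_TTL_SECONDS = 300
GUEST_ROOT = PurePosixPath("/work")
ALLOWED_IMAGES = {"python": "./agent-tool"}
RETENTION_CHOICES = {"destroy", "stop", "cold", "snapshot"}


class ValidationError(ValueError):
    """Raised when a model-supplied request violates the contract."""


def check_guest_path(raw: str) -> str:
    """Accept only traversal-free paths that live under the guest work dir."""
    path = PurePosixPath(raw)
    if ".." in path.parts:
        raise ValidationError(f"path traversal in {raw!r}")
    if GUEST_ROOT not in path.parents:
        raise ValidationError(f"path must be under {GUEST_ROOT}: {raw!r}")
    return str(path)


def validate_request(request: dict) -> dict:
    """Return a checked copy of the request or raise ValidationError."""
    language = request.get("language")
    if not isinstance(language, str) or language not in ALLOWED_IMAGES:
        raise ValidationError(f"unknown language {language!r}")

    files = request.get("files", [])
    if len(files) > MAX_FILES:
        raise ValidationError("too many files")
    checked_files = []
    for item in files:
        if len(item["content"].encode("utf-8")) > MAX_FILE_BYTES:
            raise ValidationError(f"file too large: {item['path']!r}")
        checked_files.append(
            {"path": check_guest_path(item["path"]), "content": item["content"]}
        )

    command = request.get("command")
    if not isinstance(command, list) or not command:
        raise ValidationError("command must be a non-empty argv list")
    if not all(isinstance(arg, str) for arg in command):
        raise ValidationError("command arguments must be strings")

    timeout = request.get("timeout_seconds")
    if type(timeout) is not int or not 0 < timeout <= MAX_TIMEOUT_SECONDS:
        raise ValidationError("timeout_seconds out of range")

    # Deny by default: anything other than "none" needs a separate grant.
    if request.get("network", {}).get("mode", "none") != "none":
        raise ValidationError("network access requires a policy grant")

    retention = request.get("state", {}).get("retention")
    if retention not in RETENTION_CHOICES:
        raise ValidationError("retention must be chosen explicitly")

    return {
        "image": ALLOWED_IMAGES[language],
        "files": checked_files,
        "command": command,
        "timeout_seconds": timeout,
        "ttl_seconds": DEFAULT_TTL_SECONDS,
        "max_output_bytes": MAX_OUTPUT_BYTES,
        "retention": retention,
    }
```

The validator returns a new dictionary instead of mutating the request, so the application-chosen values (image, TTL, output budget) never originate from model output.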

Today the broadest shipped lifecycle surface is the CLI. A tool runner can use the CLI while preserving the same contract the SDK target should expose.

mvmctl build ./agent-tool
mvmctl up ./agent-tool --name agent-tool-call
mvmctl fs write agent-tool-call /work/main.py < /tmp/request-main.py
mvmctl exec agent-tool-call --timeout 20 -- python /work/main.py
mvmctl logs agent-tool-call
mvmctl down agent-tool-call

For one-off command execution, prefer a single bounded run when the workflow does not need staged files or persistent state:

mvmctl run --timeout 20 -- python -c 'print("bounded tool call")'

Keep the sandbox name, command exit status, receipt path, and audit/run IDs with the agent trace when those values are available.
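
A tool runner might wrap the staged CLI sequence roughly as follows. This is a sketch under assumptions: it uses only the `mvmctl` subcommands and flags shown above, it assumes `mvmctl exec` exits with the guest command's status, and `run_tool_call` and its `runner` parameter are hypothetical names introduced here so the flow can be exercised without a real sandbox.

```python
import subprocess


def run_tool_call(name, source_dir, files, command, timeout_seconds,
                  runner=subprocess.run):
    """Stage files, run one command, and always tear the sandbox down."""
    runner(["mvmctl", "build", source_dir], check=True)
    runner(["mvmctl", "up", source_dir, "--name", name], check=True)
    try:
        for guest_path, content in files.items():
            # The file body goes in on stdin, mirroring the shell redirect.
            runner(
                ["mvmctl", "fs", "write", name, guest_path],
                input=content.encode("utf-8"),
                check=True,
            )
        result = runner(
            ["mvmctl", "exec", name, "--timeout", str(timeout_seconds),
             "--", *command],
            capture_output=True,
        )
        # Raw output: the caller still has to redact and bound it.
        return {
            "sandbox_id": name,
            "exit_code": result.returncode,
            "stdout": result.stdout.decode("utf-8", errors="replace"),
            "stderr": result.stderr.decode("utf-8", errors="replace"),
        }
    finally:
        # Cleanup runs even when staging or execution raised.
        runner(["mvmctl", "down", name], check=False)
```

Passing argv lists rather than a shell string keeps model-supplied arguments from being interpreted by a host shell. The `finally` block corresponds to the `destroy`-style path; other retention choices would replace it.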

The SDK target should make the same contract easier to write without hiding security decisions:

from mvm import NetworkPolicy, Sandbox


def run_agent_tool(request: dict) -> dict:
    # validate_request() and redact() are application-owned helpers,
    # not part of the SDK.
    checked = validate_request(request)
    with Sandbox.create(
        image=checked["image"],
        network=NetworkPolicy.deny_by_default(),
        ttl_seconds=checked["ttl_seconds"],
    ) as sandbox:
        # Stage only validated files under guest paths.
        for item in checked["files"]:
            sandbox.files.write(item["path"], item["content"])
        result = sandbox.commands.run(
            checked["command"],
            timeout_seconds=checked["timeout_seconds"],
            max_output_bytes=checked["max_output_bytes"],
        )
        # Redact before anything reaches the model context.
        return {
            "exit_code": result.exit_code,
            "stdout": redact(result.stdout),
            "stderr": redact(result.stderr),
            "audit_id": result.audit_id,
        }

Check the Operations cookbook and the Lifecycle matrix before treating a helper as shipped in a language SDK.

Start closed:

{
  "network": {
    "mode": "none"
  }
}

If the agent needs outbound access, issue a separate reviewed grant:

{
  "network": {
    "mode": "bridge",
    "allow": [
      {
        "host": "api.openai.com",
        "port": 443
      }
    ]
  }
}

Do not let the model invent network destinations. The application should map approved tool capabilities to concrete policy, then pass only that policy to the sandbox.
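
One way to keep destinations out of model control is to let the model name a capability and have the application look up the concrete policy. The sketch below assumes a hypothetical `REVIEWED_GRANTS` table and a `resolve_network_policy` helper; neither is part of mvm, and the capability name `openai_api` is invented for illustration.

```python
import copy
from typing import FrozenSet, Optional

DENY_ALL = {"mode": "none"}

# Hypothetical table of reviewed grants, maintained by the application.
REVIEWED_GRANTS = {
    "openai_api": {
        "mode": "bridge",
        "allow": [{"host": "api.openai.com", "port": 443}],
    },
}


def resolve_network_policy(capability: Optional[str],
                           caller_grants: FrozenSet[str]) -> dict:
    """Map an approved capability name to a concrete network policy."""
    if capability is None:
        # Start closed when the tool call asks for nothing.
        return copy.deepcopy(DENY_ALL)
    if capability not in REVIEWED_GRANTS:
        raise PermissionError(f"no reviewed grant named {capability!r}")
    if capability not in caller_grants:
        raise PermissionError(f"caller lacks grant {capability!r}")
    # Return a copy so callers cannot widen the reviewed policy in place.
    return copy.deepcopy(REVIEWED_GRANTS[capability])
```

The model can only select from names the application already knows, so a hallucinated hostname fails the lookup instead of reaching the sandbox.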

Secrets should enter through references controlled by the application or operator, not through model output.

{
  "secrets": {
    "OPENAI_API_KEY": {
      "ref": "openai-api-key"
    }
  }
}

Security rules:

  • never pass secrets in command args;
  • do not echo resolved secrets back to the model;
  • redact stdout, stderr, logs, error messages, and receipts before returning them to a model context;
  • grant secrets per operation, not per long-lived agent identity.
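
The rules above can be sketched in application code as follows. The helper names (`resolve_secrets`, `assert_no_secrets_in_argv`, `make_redactor`) and the in-memory `store` are assumptions for illustration; a real deployment would resolve references through its own secret manager, and exact-match redaction will not catch encoded or transformed copies of a secret.

```python
def resolve_secrets(refs: dict, store: dict) -> dict:
    """Turn {"ENV_NAME": {"ref": "name"}} into {"ENV_NAME": value}."""
    resolved = {}
    for env_name, spec in refs.items():
        # Anything other than a bare reference is treated as a literal.
        if not isinstance(spec, dict) or set(spec) != {"ref"}:
            raise ValueError(f"{env_name}: only secret references are accepted")
        resolved[env_name] = store[spec["ref"]]
    return resolved


def assert_no_secrets_in_argv(command, secret_values):
    """Refuse to run a command whose arguments embed a resolved secret."""
    for arg in command:
        if any(value and value in arg for value in secret_values):
            raise ValueError("secret value found in command arguments")


def make_redactor(secret_values, placeholder="[REDACTED]"):
    """Build a redact(text) function for the secrets of one operation."""
    # Longest first, so a secret that contains another is replaced whole.
    ordered = sorted((v for v in secret_values if v), key=len, reverse=True)

    def redact(text: str) -> str:
        for value in ordered:
            text = text.replace(value, placeholder)
        return text

    return redact
```

Building the redactor per operation matches the per-operation grant rule: the set of values to scrub is exactly the set that was resolved for that call.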

Choose one retention behavior per tool call:

| Retention | Use when | Cleanup requirement |
| --- | --- | --- |
| destroy | The call is disposable. | Stop compute and remove retained local state. |
| stop | Build artifacts can remain, but compute should stop. | Review logs, receipts, volumes, and snapshots separately. |
| cold | The next call needs memory or filesystem continuity. | Treat cold state as sensitive and set a TTL. |
| snapshot | A reviewer or retry path needs a named recovery point. | Store retention metadata and delete when no longer needed. |

Cold state and snapshots can contain prompts, tool outputs, files, process memory, browser sessions, and credentials. Treat them as sensitive artifacts.
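
An application could make the retention choice explicit with a small value type that refuses sensitive state without an expiry. This is a sketch: `RetentionPlan` and its fields are hypothetical, and requiring a TTL on snapshots as well as cold state is an assumption layered on the table above.

```python
from dataclasses import dataclass
from typing import Optional

RETENTION_CHOICES = {"destroy", "stop", "cold", "snapshot"}
SENSITIVE_RETENTIONS = {"cold", "snapshot"}


@dataclass(frozen=True)
class RetentionPlan:
    retention: str
    ttl_seconds: Optional[int] = None
    snapshot_name: Optional[str] = None

    def __post_init__(self):
        if self.retention not in RETENTION_CHOICES:
            raise ValueError(f"unknown retention {self.retention!r}")
        # Retained memory or disk state must never live forever by default.
        if self.retention in SENSITIVE_RETENTIONS and not self.ttl_seconds:
            raise ValueError("cold and snapshot state must carry a TTL")
        # A snapshot is a named recovery point, so the name is required.
        if self.retention == "snapshot" and not self.snapshot_name:
            raise ValueError("snapshots need a name")
```

For example, `RetentionPlan("cold", ttl_seconds=600)` is accepted, while `RetentionPlan("cold")` raises before any sandbox is created.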

Return bounded, structured results to the model:

{
  "sandbox_id": "agent-tool-call",
  "exit_code": 0,
  "stdout": "redacted output",
  "stderr": "",
  "timed_out": false,
  "audit_id": "run_...",
  "retention": "destroy"
}
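
A result in this shape might be assembled as sketched below. The helper names and the extra `truncated` field are assumptions, not part of the contract above. Redaction runs before truncation so that a secret straddling the cut cannot leave an unredacted fragment.

```python
from typing import Callable, Tuple


def bound_output(data: bytes, max_bytes: int) -> Tuple[str, bool]:
    """Cut output to a byte budget and report whether anything was dropped."""
    truncated = len(data) > max_bytes
    # errors="ignore" drops undecodable bytes, including a multi-byte
    # character that was split at the cut.
    text = data[:max_bytes].decode("utf-8", errors="ignore")
    return text, truncated


def build_result(sandbox_id: str, exit_code: int, stdout: bytes, stderr: bytes,
                 timed_out: bool, audit_id: str, retention: str,
                 redact: Callable[[str], str], max_bytes: int) -> dict:
    """Redact first, then bound, then return the structured result."""
    fields = {}
    truncated = False
    for key, raw in (("stdout", stdout), ("stderr", stderr)):
        clean = redact(raw.decode("utf-8", errors="replace"))
        text, cut = bound_output(clean.encode("utf-8"), max_bytes)
        fields[key] = text
        truncated = truncated or cut
    return {
        "sandbox_id": sandbox_id,
        "exit_code": exit_code,
        "stdout": fields["stdout"],
        "stderr": fields["stderr"],
        "timed_out": timed_out,
        "audit_id": audit_id,
        "retention": retention,
        "truncated": truncated,  # hypothetical extra field
    }
```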

If the run fails, distinguish policy denial, validation failure, timeout, guest command failure, transport failure, and cleanup failure. They require different retries and different user-facing messages.
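
The six failure classes can be made explicit so that retry logic and messages branch on a type rather than on string matching. In this sketch the enum mirrors the list above, while the `HANDLING` table (which classes are retried automatically, and the wording of each message) is an assumed policy, not something mvm prescribes.

```python
from enum import Enum


class FailureKind(Enum):
    POLICY_DENIAL = "policy_denial"
    VALIDATION_FAILURE = "validation_failure"
    TIMEOUT = "timeout"
    GUEST_COMMAND_FAILURE = "guest_command_failure"
    TRANSPORT_FAILURE = "transport_failure"
    CLEANUP_FAILURE = "cleanup_failure"


# Assumed policy: (retry automatically?, user-facing message).
HANDLING = {
    FailureKind.POLICY_DENIAL: (False, "This action needs an approved grant."),
    FailureKind.VALIDATION_FAILURE: (False, "The request was rejected; fix it and resubmit."),
    FailureKind.TIMEOUT: (False, "The command exceeded its time budget."),
    FailureKind.GUEST_COMMAND_FAILURE: (False, "The command exited with an error."),
    FailureKind.TRANSPORT_FAILURE: (True, "The sandbox was unreachable; retrying."),
    FailureKind.CLEANUP_FAILURE: (True, "Cleanup did not finish; an operator was notified."),
}


def failure_response(kind: FailureKind) -> dict:
    """Build the structured failure record returned alongside the trace."""
    retry, message = HANDLING[kind]
    return {"failure": kind.value, "retry": retry, "message": message}
```

For a cleanup failure, the retry in this sketch applies to the cleanup step only; the guest command itself should not be re-run.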