Skip to content

Errors & metrics

SDKs should make runtime failures easy to handle without hiding security semantics. A command failure, a policy denial, and a transport failure are different events and should not collapse into one generic exception.

ErrorMeaningCaller action
CommandFailedThe guest command exited non-zero.Inspect exit code and bounded stdout/stderr.
TimeoutThe operation exceeded its deadline.Stop, retry, or snapshot intentionally.
PolicyDeniedAdmission, network, secret, filesystem, or profile policy refused the operation.Fix policy or remove the operation.
SandboxUnavailableThe named sandbox is not running or cannot be reached.Recreate, resume, or inspect lifecycle state.
BackendUnsupportedThe selected backend cannot provide the requested feature.Choose another backend or avoid that feature.
TransportErrorCLI, vsock, socket, or SDK transport failed before guest result was available.Retry only if the operation is idempotent.

Current Python and TypeScript runtime SDKs expose lower-level errors for mode selection and live transport failures. The table above is the product parity target for higher-level lifecycle clients.

Runtime SDK command execution should converge on a result shape like:

exit_code
stdout
stderr
duration_ms
run_id
audit_id
timed_out

Security requirements:

  • stdout and stderr are bounded;
  • raw secret values are redacted before surfacing in errors or logs;
  • command args and env values are not written into audit detail strings;
  • callers can correlate a result with an audit record without exposing payload bytes.

Metrics should be split by scope:

ScopeExamples
Runtimeactive sandboxes, build queue depth, backend selection, failed launches.
SandboxCPU, memory, disk, network counters, lifecycle state, last readiness transition.
Commandduration, exit status, timeout, output byte counts, backpressure events.
Buildbuilder VM status, cache hit/miss, artifact hash, elapsed build phases.

Metrics are operational metadata. They should not include argv values, env values, file contents, stdout/stderr payloads, or secret names unless a policy explicitly permits the label.

Debug logs are useful but risky. Recommended rules:

  • log IDs, hashes, counts, state names, and backend names;
  • do not log raw credentials, stdin, stdout, stderr, or guest file contents;
  • gate verbose logs behind an explicit env var or CLI flag;
  • include the audit/run identifier when available.

Use these while SDK error and metric helpers mature:

Terminal window
mvmctl run --receipt /tmp/receipt.json -- python task.py
mvmctl boot-report devbox --json
mvmctl logs devbox
mvmctl audit tail -n 20
mvmctl audit verify --tenant local
mvmctl doctor --json

The SDK reference should move rows from target behavior to shipped behavior only when shared tests prove the shape across the relevant languages.