Building MicroVM Images
mvm is a library, not a project to fork. You keep your code, your flake.nix, and your mvm.toml in your own repository, and mvmctl builds your microVM image by running nix build against your flake. You should never need to edit anything inside the mvm repository.
Under the hood, mvm wraps microvm.nix (MIT) — that’s the NixOS module that abstracts Firecracker, Cloud Hypervisor, QEMU, crosvm, kvmtool, and stratovirt. The choice is recorded in ADR-013.
The two files in your project
Section titled “The two files in your project”Every mvm project has a mvm.toml and a flake.nix:
flake = "."profile = "default"vcpus = 1memory_mib = 256{ description = "my microVM app";
inputs = { nixpkgs.url = "github:NixOS/nixpkgs/nixos-25.11"; mvm.url = "github:tinylabscom/mvm"; };
outputs = { self, nixpkgs, mvm, ... }: { packages.x86_64-linux.default = mvm.lib.x86_64-linux.mkGuest { name = "my-app"; services.web = { command = [ "/usr/local/bin/web" ]; }; }; };}That’s the whole user-side surface. mvmctl build reads mvm.toml, follows flake = "." to your flake, and runs nix build against it.
Building
Section titled “Building”From your project directory:
mvmctl build # reads mvm.toml; builds the named flake targetmvmctl run # builds (if needed) + bootsmvmctl build is a host command. You run it from macOS or Linux, and mvm sends the Linux-only Nix work into the builder VM. You do not need to enter mvmctl dev shell first. The shell is for debugging the build environment manually, not a prerequisite for normal builds.
mvmctl selects the runtime backend automatically when you boot the finished image. Use --hypervisor on runtime commands when you want to force a specific runtime backend:
mvmctl up --flake . --hypervisor apple-containermvmctl up --flake . --hypervisor firecrackerIf you want to drive nix build directly without mvmctl in the loop:
nix build .#defaultThat direct Nix command is only for users who intentionally manage their own Nix environment. It bypasses mvm’s builder VM orchestration and is not required for the normal workflow. See Builder VM for the detailed build boundary.
What mkGuest accepts
Section titled “What mkGuest accepts”mvm.lib.<system>.mkGuest { … } takes a single attribute set:
| Field | Type | Purpose |
|---|---|---|
name | string | Human-readable identifier; baked into the rootfs at /etc/mvm/name. |
entrypoint | attrs | The boot-time workload. Exactly one of three forms (see below). |
services | attrs (optional) | Auxiliary supervised services. Same shape as entrypoint.services. |
packages | [pkg] (optional) | Extra Nix packages added to the rootfs closure. |
hypervisor | string (optional) | Override the default (firecracker). |
vcpus, memory_mib | int (optional) | Resource defaults; mvm.toml overrides at run time. |
dev | bool (optional) | Explicit accessible-vs-sealed override. Inferred from entrypoint by default. |
uids | attrs (optional) | `{ agent = 990; entrypoint = 0 |
extraFiles | attrs (optional) | { "/abs/path" = { content; mode?; }; } baked into the rootfs at build time. |
Entrypoint forms
Section titled “Entrypoint forms”entrypoint declares exactly one of:
# Form 1 — interactive PTY shell (accessible image, dev-friendly)entrypoint.shell = "/bin/bash";
# Form 2 — single sealed program (production default)entrypoint.command = [ "/usr/local/bin/serve" "--port" "8080" ];
# Form 3 — supervised multi-serviceentrypoint.services = { web = { command = [ "/bin/web" ]; }; worker = { command = [ "/bin/worker" ]; restart = "always"; };};Attached vs detached — lifecycle of the running VM
Section titled “Attached vs detached — lifecycle of the running VM”Independent of the sealed/accessible distinction, mvm exposes two runtime lifecycle modes modeled after libkrun’s SpawnMode:
| Mode | What it means | When to use |
|---|---|---|
attached | VM lifecycle bound to the calling process — Ctrl-C / process exit sends SIGTERM to the VM. | mvmctl run interactive, mvmctl dev shell sessions, test harnesses that want deterministic teardown. |
detached | VM survives caller exit — only mvmctl down (or VmBackend::stop) terminates it. | mvmctl up (background), production agents, CI fixtures that boot once and run multiple phases. |
The default is detached. Override:
mvmctl run --attached # attached mode; CLI Ctrl-C kills VMmvmctl run # detached mode (default); VM keeps runningmvmctl detach my-app # convert a running attached VM to detachedmvmctl wait my-app # block until VM exits (only meaningful for attached)The lifecycle mode is orthogonal to the sealed/accessible distinction:
| Combination | Use case |
|---|---|
| accessible + attached | Dev-mode debug session: entrypoint.shell, Ctrl-C ends the session. |
| accessible + detached | Long-running dev container: shell available, survives reconnect. |
| sealed + attached | Test harness running an entrypoint to completion, exit captured. |
| sealed + detached | Production: entrypoint.command, runs forever until mvmctl down. |
The trait surface lives at mvm_core::vm_backend::{StartMode, VmBackend::start_with_mode, VmBackend::wait, VmBackend::detach}. The libkrun backend records StartMode intent at ~/.mvm/vms/<name>/mode.json; mvmctl status surfaces it.
Sealed vs accessible — the same flake works for both
Section titled “Sealed vs accessible — the same flake works for both”The mvm builder transparently determines whether the resulting image is sealed (production — no console attach) or accessible (dev — mvmctl console <vm> opens an interactive PTY over vsock). The decision is encoded in passthru.mvm.{accessible, sealed, entrypointKind} on the resulting derivation, and mvmctl reads that metadata to gate the console subcommand.
The default inference:
| Entrypoint form | Default mode |
|---|---|
entrypoint.shell = … | accessible (dev = true) |
entrypoint.command = … | sealed (dev = false) |
entrypoint.services = … | sealed (dev = false) |
Override either way with the explicit dev field:
# A shell entrypoint that's still sealed (no console attach allowed)mkGuest { entrypoint.shell = "/bin/bash"; dev = false; ... }
# A command entrypoint that's accessible for debuggingmkGuest { entrypoint.command = [ "..." ]; dev = true; ... }The same flake source is consumed in both dev and production builds — there’s no separate “dev flake” the user has to maintain. The difference is purely in the resulting image’s metadata + the host-side console gate.
The mkGuest library produces a busybox-as-PID-1 rootfs (no NixOS, no systemd) and emits an ext4 image directly. The boot path is: kernel → /init script → mounts /proc /sys /dev → execs your entrypoint. No service manager between the kernel and your code. mvm’s security overlay (per-service uids, seccomp tier, dm-verity, read-only /etc) layers on top in Phase 6 without changing this base.
Boot-time targets
Section titled “Boot-time targets”Floor: ≤ 300 ms cold p50 on every backend. A backend that can’t hit it is a backend we drop.
| Backend | Cold p50 | Snapshot-cloned p50 | Notes |
|---|---|---|---|
| Firecracker (Linux/KVM) | ≤ 300 ms | ≤ 30 ms | Default for typical workloads. |
| Cloud Hypervisor (Linux/KVM) | ≤ 300 ms | ≤ 50 ms | Tier-1 peer of FC. Adds VFIO/GPU, virtio-gpu, virtio-fs, larger guests. Opt-in via --hypervisor cloud-hypervisor. |
| libkrun / libkrun (Linux/KVM) | ≤ 300 ms | ≤ 30 ms | Cross-platform default; libkrun-backed. |
| libkrun / libkrun (macOS HVF) | ≤ 300 ms | ≤ 60 ms | macOS path; HVF adds ~100ms over KVM. |
| Apple Virtualization framework | ≤ 300 ms | ≤ 200 ms | Legacy ladder; superseded by libkrun per ADR-013. |
The numbers are surfaced on every mkGuest derivation as passthru.mvm.expectedBootMs so you can nix eval .#default.passthru.mvm.expectedBootMs to confirm. Phase 9 enforces with xtask perf --backend <name> --p50-ms 300 --runs 100. See ADR-013 §“Boot-time budget” for rationale.
The floor is achievable because the rootfs uses busybox-as-PID-1 with a custom /init (no NixOS, no systemd, no OpenRC). See ADR-013 for why this matters and the implementation breadcrumb.
What’s inside the mvm repository (and why you don’t touch it)
Section titled “What’s inside the mvm repository (and why you don’t touch it)”The repository’s nix/ directory contains:
nix/flake.nix— exposeslib.<system>.mkGuestfor your flake to consume.nix/profiles/minimal.nix— an internal test fixture used by mvm’s own smoke tests (tests/smoke_libkrun.rs,tests/nix_flake_structure.rs). Not a starter template.
The internal fixture lives under the internal- namespace in flake outputs (nixosConfigurations.internal-minimal-…, packages.<system>.internal-minimal-runner) so the boundary is mechanical: anything internal-* is for mvm developers, not for users.
Validating a change to your flake
Section titled “Validating a change to your flake”cd my-appnix flake check --no-buildmvmctl validate does the same with extra mvm.toml checks layered on.
Cross-platform notes
Section titled “Cross-platform notes”mvm runs Nix builds inside the project builder VM and copies the finished kernel/rootfs artifacts back to the host cache. You don’t need host-side Nix, and you don’t need to enter a dev shell before building.
- Linux: the builder VM provides the Linux build boundary and cache policy. Firecracker is the default runtime backend when
/dev/kvmis available. - macOS: the host
mvmctl buildcommand orchestrates a Linux builder VM. The resulting runtime image can then boot with Apple Virtualization (--hypervisor apple-container) or another available macOS runtime backend. - Windows: Tauri-only (the
mvm-studiodesktop app packages a WSL2-backed builder + runtime). See ADR-031.
Rootless workloads
Section titled “Rootless workloads”PID 1 must be uid 0 (kernel mandate). Everything else can — and by default in production does — run non-root. mkGuest’s uids knob controls the privilege drop:
| Process | Default uid | Role |
|---|---|---|
/init (PID 1) | 0 | Mounts pseudofs, forks the agent in the background, drops privs, exec’s the entrypoint |
mvm-guest-agent | 990 | Vsock RPC handler (never needs root); supervised by /init |
| Entrypoint (workload) | 0 in dev, 1000 in prod | Your service or shell |
Agent binary status: as of Phase 1 W6.1.1 the agent at
/usr/local/bin/mvm-guest-agentis a stub — a sh script that logs startup and sleeps. The supervision pattern is real (init forks it under uid 990 before setpriv-exec’ing the entrypoint); the vsock RPC surface lands when W6.1.2 swaps in the cross-compiled Rust binary. Every derivation surfacespassthru.mvm.agentBinary = "stub" | "real"so production deployments can refuse to boot a stub image.
The dev/prod default split is intentional:
- Dev keeps entrypoint as root because debug shells expect root:
apt install,mount,tcpdump. Forcing rootless dev would break those flows on first try. - Prod drops to uid 1000 by default per ADR-002 W2.1 — “no guest binary can elevate to uid 0.” A workload that isn’t root can’t be re-elevated.
/init uses setpriv --reuid=N --regid=N --clear-groups --no-new-privs -- to drop. --no-new-privs blocks setuid re-elevation, so even if the workload finds a SUID binary, it can’t reach uid 0.
Override
Section titled “Override”# Rootless dev shell — forces non-root even in dev mode.mkGuest { entrypoint.shell = "/bin/bash"; uids = { entrypoint = 1000; };}
# Rootful prod workload — explicit override, rarely the right call.mkGuest { entrypoint.command = [ "/usr/local/bin/serve" ]; uids = { entrypoint = 0; };}
# Non-default agent uid (e.g. to avoid collisions with host-side ranges).mkGuest { entrypoint.command = [ "/bin/x" ]; uids = { agent = 5000; };}The resolved values surface as passthru.mvm.uids = { agent; entrypoint; } and passthru.mvm.rootlessEntrypoint :: bool so mvmctl status can cross-check against /proc/<pid>/status at runtime.
Why no OCI
Section titled “Why no OCI”mvm is microVMs, not containers. Even though the underlying libkrun library exposes OCI image pulls (RootfsSource::Oci), mvm uses only the host-local disk-image path. The bridge between your Nix-built .ext4 rootfs and the runtime is a sibling .raw hard-link with fstype("ext4") — no registry, no auth, no pull cache, fully offline-by-default once your rootfs is built. ADR-013 §“Non-goal: OCI / container images” carries the full rationale.