Efficient NixOS Remote Deploys with Selective Closure Copying

You run nixos-rebuild switch --target-host, wait for the build to finish, and then watch your machine upload what feels like the entire internet to a server that could have fetched most of those paths from cache.nixos.org on its own.

This is one of those Nix deployment problems that looks inevitable until you learn the flag.

If you are deploying to a remote NixOS host, the first thing to try is:

nixos-rebuild switch \
  --flake ".#myhost" \
  --target-host deploy@example.com \
  --use-remote-sudo \
  --use-substitutes

That single --use-substitutes changes the copy behavior in a way that can make deploys dramatically faster. Instead of blindly pushing the full closure from your laptop or CI runner, the remote host gets a chance to fetch missing store paths from its own configured substituters first. In practice, that usually means the remote pulls public dependencies from cache.nixos.org, and you only upload the paths that are actually private or unavailable in cache.

That is the whole trick.

The default behavior is wasteful

Without substitute-on-destination behavior, a remote deploy often looks like this:

  1. You build locally
  2. nixos-rebuild copies the resulting closure to the target host
  3. The target activates the new system

The problem is step two. Your local machine ends up pushing standard nixpkgs dependencies that the remote could have downloaded perfectly well on its own from a fast public cache.

So if your system closure includes systemd, nginx, openssl, glibc, and a hundred other ordinary paths, you are pointlessly using your own outbound bandwidth as a binary cache.

That hurts most on:

  • home uplinks with weak upload bandwidth
  • remote VPS deploys over higher-latency links
  • CI runners pushing to many hosts
  • large closures where only a tiny fraction is actually custom

What --use-substitutes actually does

The high-level flag is --use-substitutes on nixos-rebuild.

The lower-level behavior underneath it is nix copy --substitute-on-destination. The Nix reference manual describes that flag as telling the destination SSH store to try substitutes on the destination side. That is the mechanism you want.

So conceptually, you are switching from this:

my machine -> push everything -> remote host

to this:

my machine -> push only uncached paths -> remote host
remote host -> fetch public paths -> cache.nixos.org / other substituters

That is almost always the better network topology.

The one-flag version

For many setups, the simple version is enough:

nix run nixpkgs#nixos-rebuild -- switch \
  --flake ".#myhost" \
  --target-host deploy@example.com \
  --use-remote-sudo \
  --use-substitutes

That keeps your deploy wrapper short and gets you most of the benefit immediately.

If your current deploy script looks like this:

nix run nixpkgs#nixos-rebuild -- switch \
  --flake ".#myhost" \
  --target-host deploy@example.com \
  --use-remote-sudo

the diff is almost embarrassingly small:

 nix run nixpkgs#nixos-rebuild -- switch \
   --flake ".#myhost" \
   --target-host deploy@example.com \
-  --use-remote-sudo
+  --use-remote-sudo \
+  --use-substitutes

And yet that tiny diff can be the difference between “ship the whole closure over SSH” and “let the remote fetch the boring stuff itself.”

The trusted-user gotcha

This is the part many writeups skip.

If you deploy as a non-root user on the remote machine, that user needs to be trusted by the remote Nix daemon. Otherwise the substitute request can be ignored and you quietly fall back to the slow path.

On NixOS, that usually means:

{
  nix.settings.trusted-users = [ "root" "deploy" ];
}

If you deploy as root@host, root is already trusted. If you deploy as deploy@host and rely on --use-remote-sudo for activation, then deploy needs to be listed in trusted-users.

This matches the Nix documentation more broadly: using substituters through the daemon is a privileged capability, and the calling user needs to be trusted for it to work as intended.

If you forget this step, the failure mode is annoying because it often looks like nothing is wrong. The deploy still succeeds. It is just slow, and the target host behaves as though --use-substitutes was never there.

How to tell whether it is working

If your deploys are still dragging, check the basics:

# On the remote host (older Nix releases spell this `nix show-config`)
nix config show | grep trusted-users

And pay attention to the copy phase. The fast path looks like the target is fetching public dependencies for itself. The slow path looks like one long stream of data being shoved from your local store to the remote store.

The easiest sanity check is often practical: if adding --use-substitutes did not reduce upload volume or deploy time, the trusted-user piece is the first thing to verify.
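
If you would rather script that check than eyeball the grep output, here is a minimal sketch in POSIX shell. It assumes the config text contains a line of the form trusted-users = name1 name2 ...; user_is_trusted is a hypothetical helper name, not something Nix ships, and it only reports @group entries instead of resolving them.

```sh
# user_is_trusted USER < config-text
# Reads Nix config output on stdin and succeeds if USER, or the wildcard *,
# is listed in trusted-users. The body runs in a subshell so that the
# variables and `set -f` do not leak into the calling shell.
user_is_trusted() (
  user=$1
  # First line that sets trusted-users; empty if the setting is absent.
  line=$(grep -E '^trusted-users[[:space:]]*=' | head -n 1)
  set -f   # keep a literal * entry from being expanded as a glob
  for entry in ${line#*=}; do
    case $entry in
      "$user" | '*') exit 0 ;;
      # Resolving groups needs the host's group database, so only flag them.
      @*) echo "note: group entry $entry not resolved" >&2 ;;
    esac
  done
  exit 1
)
```

Fed with the remote's config, for example ssh deploy@example.com nix config show | user_is_trusted deploy, a non-zero exit status suggests the trusted-user piece is the thing to fix first.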

The explicit three-step version

Sometimes you want more control than nixos-rebuild gives you in one command. Maybe you want to pre-stage a system closure, inspect copy behavior, or separate build time from transfer time in your logs.

That is where the lower-level pattern is useful:

TOPLEVEL=$(nix build ".#nixosConfigurations.myhost.config.system.build.toplevel" \
  --print-out-paths \
  --no-link)

nix copy \
  --to "ssh-ng://deploy@example.com" \
  --substitute-on-destination \
  "$TOPLEVEL"

nixos-rebuild switch \
  --flake ".#myhost" \
  --target-host deploy@example.com \
  --use-remote-sudo

This gives you three distinct phases:

  1. Build the target system locally
  2. Copy the closure while allowing the destination to substitute what it can
  3. Switch the remote system

That is handy when you are debugging performance, staging to multiple machines, or just want a clearer operational story than “one giant command did a lot of things.”
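
To get build time and transfer time as separate log lines, one option is a small timing wrapper around each phase. This is only a sketch: run_phase is a hypothetical helper, the log format is made up for illustration, and it relies on date +%s, which GNU, BSD, and busybox date all support.

```sh
# run_phase NAME COMMAND [ARGS...]
# Runs COMMAND, then logs the phase name, elapsed whole seconds, and exit
# status to stderr. stdout is left alone so $(...) can still capture it.
run_phase() {
  phase_name=$1
  shift
  phase_start=$(date +%s)
  phase_status=0
  "$@" || phase_status=$?
  phase_end=$(date +%s)
  echo "phase=$phase_name seconds=$((phase_end - phase_start)) status=$phase_status" >&2
  return "$phase_status"
}

# Usage with the three steps above (not executed here):
#   TOPLEVEL=$(run_phase build nix build \
#     ".#nixosConfigurations.myhost.config.system.build.toplevel" \
#     --print-out-paths --no-link)
#   run_phase copy nix copy --to "ssh-ng://deploy@example.com" \
#     --substitute-on-destination "$TOPLEVEL"
#   run_phase switch nixos-rebuild switch --flake ".#myhost" \
#     --target-host deploy@example.com --use-remote-sudo
```

Because the log line goes to stderr, capturing the build's output path with $(...) keeps working, and a slow copy phase shows up on its own line instead of being buried in one long command.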

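nix copy itself has no --dry-run flag, but you can preview the transfer by asking the remote which closure paths it is missing:

nix-store --query --requisites "$TOPLEVEL" \
  | ssh deploy@example.com 'xargs nix-store --check-validity --print-invalid'

Every path this prints is missing on the remote. The ones a substituter can provide will be fetched by the remote itself; whatever is left is what you actually upload.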
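
One rough way to tell whether a particular store path is public is to ask a binary cache for its narinfo file, which is named after the hash part of the store path. The sketch below assumes a top-level store path such as /nix/store/<hash>-<name>; narinfo_url and in_cache are hypothetical helper names, and in_cache needs curl and network access.

```sh
# narinfo_url STORE_PATH [CACHE_URL]
# Prints the URL where a binary cache would serve the path's narinfo.
# Runs in a subshell so the variables stay local.
narinfo_url() (
  path=$1
  cache=${2:-https://cache.nixos.org}
  base=${path##*/}    # drop the /nix/store/ prefix
  hash=${base%%-*}    # the hash is everything before the first dash
  echo "${cache%/}/${hash}.narinfo"
)

# in_cache STORE_PATH [CACHE_URL]
# Succeeds if the cache answers the narinfo request with a 2xx status.
in_cache() {
  curl --silent --fail --output /dev/null "$(narinfo_url "$@")"
}
```

Looping over the closure and calling in_cache on each path gives an approximate split between what the remote can substitute and what you will end up uploading.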

When this helps most

This pattern is especially effective when your closure contains a small amount of private work layered on top of a lot of standard public dependencies.

Examples:

  • a mostly normal NixOS system plus a private application package
  • a public service stack plus one custom overlay
  • a host whose secrets are handled separately but whose system packages are mostly upstream

In those cases, letting the destination substitute public paths turns deploys from “upload gigabytes” into “ship the weird bits.”

When it does not help much

There are real limits.

If most of your closure is custom and uncached, then the remote still has to get that data from somewhere, and that somewhere is probably you. --use-substitutes does not magically make private builds public.

It also will not help if the remote host cannot reach its substituters, or if your cache configuration is incomplete.
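
A quick way to rule out the reachability case is to request each substituter's nix-cache-info file from the remote host itself. The sketch below only builds the URLs to probe; cache_info_urls is a hypothetical helper, and it assumes HTTP(S) substituters listed on a substituters = ... line of the config output.

```sh
# cache_info_urls < config-text
# Prints one nix-cache-info URL per HTTP(S) substituter found on the
# substituters line. Runs in a subshell so `set -f` stays local.
cache_info_urls() (
  line=$(grep -E '^substituters[[:space:]]*=' | head -n 1)
  set -f   # no globbing while splitting the value on whitespace
  for entry in ${line#*=}; do
    entry=${entry%%\?*}   # drop query parameters such as ?priority=10
    case $entry in
      http://* | https://*) echo "${entry%/}/nix-cache-info" ;;
    esac
  done
)
```

On the remote, something like nix config show | cache_info_urls | xargs -n 1 curl --silent --show-error --fail --output /dev/null would then report any substituter the host cannot reach. A private cache that requires credentials may fail this probe even when Nix itself can authenticate.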

So the pattern is best understood as:

  • use destination-side substitutes for public paths
  • use your own upload bandwidth for the private leftovers

That is still a huge win, just not infinite.

Pair it with a private cache

If you want the best version of this workflow, pair it with your own binary cache as well.

For example:

{
  nix.settings = {
    trusted-users = [ "root" "deploy" ];
    substituters = [
      "https://cache.nixos.org"
      "https://your-attic.example.com/main"
    ];
    trusted-public-keys = [
      "cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY="
      "main:your-attic-public-key-here"
    ];
  };
}

Now the remote can fetch both public dependencies and your private builds without waiting for a direct upload from the deployer. That is the point where remote deploys start feeling unfairly fast.

If you want to set that up yourself, I already wrote about self-hosting a Nix binary cache with Attic and Garage.

The practical rule

If you deploy to a remote NixOS host, add --use-substitutes first and verify that the remote deploy user is trusted.

That is the highest-leverage fix.

Once that is in place, move to the explicit nix build plus nix copy --substitute-on-destination pattern when you need more visibility or control. And if you deploy often, add a private cache so the “custom bits” stop being uploads too.

You do not need to keep re-uploading the world to machines that already know how to fetch it.