Deploying Elixir/Phoenix on NixOS — What Actually Works
So this started, as these things usually do, with wanting to run staging and production on the same box. One Phoenix app, two instances, different ports, different databases, different secrets. NixOS should make this declarative and clean. And it does — eventually. But there is a surprising amount of ceremony involved in getting an Elixir release to build under Nix, and the multi-instance NixOS module pattern has enough moving parts that you will get at least three of them wrong on the first try.
This post covers the full path: building the Elixir release in Nix, structuring the NixOS module for multiple instances, wiring up PostgreSQL and secrets without shell script hacks, and the various traps along the way.
Building the release in Nix
Elixir releases under Nix use mixRelease from the BEAM package set. Pin your Erlang version explicitly — don’t let it float with nixpkgs:
beamPackages = pkgs.beam.packages.erlang_27;
Fetch mix dependencies offline with fetchMixDeps. This is Nix’s equivalent of mix deps.get, but reproducible and sandboxed:
mixFodDeps = beamPackages.fetchMixDeps {
pname = "my-app-mix-deps";
version = "0.1.0";
src = ./.;
sha256 = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
};
The sha256 is a fixed-output derivation hash. Get the real value by running the build once with a fake hash and letting Nix tell you the correct one.
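In practice the round trip looks something like this (the exact error wording varies by Nix version, and the placeholder shown is arbitrary):

```
$ nix build .#default
error: hash mismatch in fixed-output derivation:
         specified: sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
         got:       sha256-<real hash printed here; copy it into mixFodDeps>
```

The hash pins the dependency set: whenever mix.lock changes, the build fails with a fresh mismatch, which is your cue to update it. nixpkgs also ships lib.fakeSha256 if you prefer a named placeholder over a row of A's.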
The Nix sandbox has no network access, which means esbuild and tailwind cannot self-install the way they normally do via mix. You have to hand them the binaries from nixpkgs:
packages.default = beamPackages.mixRelease {
pname = "my-app";
version = "0.1.0";
src = ./.;
inherit mixFodDeps;
MIX_ESBUILD_PATH = "${pkgs.esbuild}/bin/esbuild";
MIX_TAILWIND_PATH = "${pkgs.tailwindcss}/bin/tailwindcss";
postBuild = ''
mix assets.deploy --no-deps-check
'';
fixupPhase = ''
echo "my_app_cookie" > $out/releases/COOKIE
'';
};
The --no-deps-check on mix assets.deploy is important — deps are already compiled at that point and the check would fail in the sandbox. The cookie in fixupPhase is for Erlang distribution; set it to something deterministic so nodes can cluster.
One more thing in mix.exs — disable compile-time environment validation in your release config:
releases: [my_app: [validate_compile_env: false]]
The build environment and runtime environment are different machines with different env vars. Without this, the release will refuse to start because it detects a mismatch.
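The runtime half of that split lives in config/runtime.exs, which is evaluated at boot on the target machine. A minimal sketch, assuming an endpoint module named MyAppWeb.Endpoint and the PHX_SERVER/PHX_HOST/PORT/SECRET_KEY_BASE variables used later in this post:

```elixir
# config/runtime.exs - read at boot on the target machine, not at build time
import Config

if System.get_env("PHX_SERVER") do
  config :my_app, MyAppWeb.Endpoint, server: true
end

config :my_app, MyAppWeb.Endpoint,
  url: [host: System.fetch_env!("PHX_HOST"), scheme: "https", port: 443],
  http: [
    ip: {127, 0, 0, 1},
    port: String.to_integer(System.get_env("PORT", "4000"))
  ],
  secret_key_base: System.fetch_env!("SECRET_KEY_BASE")
```

Because this file runs on the target machine, nothing in it conflicts with validate_compile_env: false; the values are read fresh on every boot.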
Heroicons: skip mix, wire it manually
If your Phoenix app uses heroicons, you have likely already discovered that it tries to do a git clone during compilation. That does not work in a Nix sandbox. The fix is to fetch heroicons separately and symlink it into deps/:
heroiconsSrc = pkgs.fetchFromGitHub {
owner = "tailwindlabs";
repo = "heroicons";
rev = "v2.2.0";
hash = "sha256-...";
};
In the package’s preBuild:
preBuild = ''
mkdir -p deps
ln -sf ${heroiconsSrc} deps/heroicons
'';
Mirror this in your devShell so local development sees the same thing:
shellHook = ''
mkdir -p deps
ln -sfn ${heroiconsSrc} deps/heroicons
'';
Both environments resolve heroicons identically. No hex install, no git clone, no network.
The multi-instance module
The core idea is an instances attrset where each key is an instance name and each value is a submodule with its own ports, database, secrets, and system user. One NixOS module produces N systemd services, N PostgreSQL databases, and N nginx vhosts from a single declaration.
Start with the root options:
options.services.myApp = {
instances = mkOption {
type = types.attrsOf (types.submodule instanceModule);
default = {};
};
package = mkOption { type = types.package; };
};
The instance submodule defines everything an instance needs to run:
instanceModule = { name, config, ... }: {
options = {
enable = mkOption { type = types.bool; default = true; };
package = mkOption { type = types.package; default = cfg.package; };
user = mkOption {
type = types.str;
default = "my_app_${sanitizeName name}";
};
group = mkOption { type = types.str; default = config.user; };
listenAddress = mkOption { type = types.str; default = "127.0.0.1"; };
port = mkOption { type = types.port; default = 4000; };
metricsPort = mkOption { type = types.nullOr types.port; default = null; };
host = mkOption { type = types.str; };
scheme = mkOption {
type = types.enum [ "http" "https" ];
default = "https";
};
secretKeyBaseFile = mkOption { type = types.path; };
database = {
host = mkOption { type = types.str; default = "/run/postgresql"; };
name = mkOption { type = types.str; default = config.user; };
passwordFile = mkOption {
type = types.nullOr types.path;
default = null;
};
};
autoMigrate = mkOption { type = types.bool; default = false; };
nginxHelper = {
enable = mkOption { type = types.bool; default = false; };
domain = mkOption { type = types.str; default = config.host; };
enableACME = mkOption { type = types.bool; default = false; };
};
};
};
Note the sanitizeName helper — NixOS system users cannot have dashes:
sanitizeName = name: builtins.replaceStrings [ "-" ] [ "_" ] name;
An instance named "production" gets user my_app_production. An instance named "us-east" gets user my_app_us_east. Without this, NixOS will reject the user creation and the error message will not point you anywhere useful.
Filtering instances
Not every piece of config applies to every instance. An instance using an external database should not trigger local PostgreSQL provisioning. An instance without nginx should not generate a vhost. Filter the instances into categories early:
enabledInstances = filterAttrs (_: icfg: icfg.enable) cfg.instances;
localPgInstances = filterAttrs (_: icfg:
icfg.database.passwordFile == null
) enabledInstances;
nginxInstances = filterAttrs (_: icfg:
icfg.nginxHelper.enable
) enabledInstances;
Then gate each config section on the relevant subset. This prevents one instance’s settings from dragging in services that another instance does not need.
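In module terms, the gating might look like this sketch, where the attribute bodies are stand-ins for the sections built in the rest of this post:

```nix
config = lib.mkIf (enabledInstances != {}) (lib.mkMerge [
  {
    # sections that apply to every enabled instance
    systemd.services = { /* one service per instance */ };
  }
  (lib.mkIf (localPgInstances != {}) {
    # local PostgreSQL provisioning, only if some instance needs it
    services.postgresql.enable = true;
  })
  (lib.mkIf (nginxInstances != {}) {
    # vhost generation, only for instances that opted in
    services.nginx.enable = true;
  })
]);
```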
One user per instance, no exceptions
Each instance gets its own system user and group. This is not optional. If staging and production share a user, a misconfigured secret path in staging can expose production credentials. systemd’s LoadCredential binds files to the service’s user context — sharing users breaks that isolation.
users.users = mapAttrs' (_: icfg:
nameValuePair icfg.user {
isSystemUser = true;
group = icfg.group;
home = "/var/lib/my-app";
}
) enabledInstances;
users.groups = mapAttrs' (_: icfg:
nameValuePair icfg.group {}
) enabledInstances;
Assertions catch mistakes at eval time
When you have multiple instances, the most common misconfiguration is duplicate values — two instances claiming the same port, database name, or domain. These should fail at nixos-rebuild time, not at 3am when the second service fails to bind:
assertions = let
allPorts = allValues (icfg: icfg.port);
allDomains = mapAttrsToList (_: icfg:
icfg.nginxHelper.domain
) nginxInstances;
allDbNames = allValues (icfg: icfg.database.name);
in [
{
assertion = length allPorts == length (unique allPorts);
message = "myApp: all port values must be unique across instances.";
}
{
assertion = length allDomains == length (unique allDomains);
message = "myApp: all nginx domains must be unique across instances.";
}
{
assertion = length allDbNames == length (unique allDbNames);
message = "myApp: all database names must be unique across instances.";
}
];
This turns a runtime mystery into a build-time error with a clear message. Add assertions for every value that must be unique — ports, gRPC ports, metrics ports, domains, database names.
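One detail the assertions rely on: allValues is not a nixpkgs function. A plausible definition (an assumption on my part, since the helper is not shown above) collects one field from every enabled instance:

```nix
# Map a field accessor over all enabled instances, yielding a flat list.
allValues = f: lib.mapAttrsToList (_: icfg: f icfg) enabledInstances;
```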
Generating systemd services
One systemd service per instance, generated with mapAttrs':
systemd.services = mapAttrs' (name: icfg:
let
stateDir = "my-app/${name}";
runtimeDir = "my-app-${name}";
pkg = icfg.package;
in nameValuePair "my-app-${name}" {
description = "My App (${name})";
wantedBy = [ "multi-user.target" ];
after = [ "network.target" "postgresql.service" ];
wants = [ "postgresql.service" ];
environment = {
PHX_SERVER = "true";
PORT = toString icfg.port;
LISTEN_ADDRESS = icfg.listenAddress;
PHX_HOST = icfg.host;
RELEASE_NODE = "my_app_${sanitizeName name}";
RELEASE_TMP = "/tmp/my-app-${name}";
DATABASE_HOST = icfg.database.host;
DATABASE_NAME = icfg.database.name;
DATABASE_USER = icfg.user;
LANG = "en_US.UTF-8";
};
serviceConfig = {
Type = "exec";
User = icfg.user;
Group = icfg.group;
Restart = "on-failure";
RestartSec = 5;
WorkingDirectory = "/var/lib/${stateDir}";
StateDirectory = stateDir;
RuntimeDirectory = runtimeDir;
NoNewPrivileges = true;
ProtectSystem = "strict";
ProtectHome = true;
PrivateTmp = true;
PrivateDevices = true;
ProtectKernelTunables = true;
ProtectKernelModules = true;
ProtectControlGroups = true;
RestrictSUIDSGID = true;
RemoveIPC = true;
LoadCredential = lib.filter (x: x != null) [
"secret_key_base:${icfg.secretKeyBaseFile}"
(if icfg.database.passwordFile != null
then "db_password:${icfg.database.passwordFile}"
else null)
];
};
script = ''
export SECRET_KEY_BASE="$(< "$CREDENTIALS_DIRECTORY/secret_key_base")"
${lib.optionalString (icfg.database.passwordFile != null) ''
export DATABASE_PASSWORD="$(< "$CREDENTIALS_DIRECTORY/db_password")"
''}
${lib.optionalString icfg.autoMigrate ''
${pkg}/bin/my_app eval 'MyApp.Release.migrate()'
''}
exec ${pkg}/bin/my_app start
'';
}
) enabledInstances;
A few things worth calling out:
RELEASE_NODE must be unique per instance. Erlang uses this to identify nodes for clustering and remote shells. Two instances with the same node name on the same host will collide, and the second one will silently fail to start distributed Erlang — or worse, connect to the first one’s runtime.
StateDirectory and RuntimeDirectory are namespaced per instance. systemd creates these automatically with the correct ownership. No mkdir -p in shell scripts, no chown hacks.
Security hardening is applied uniformly. ProtectSystem = "strict" makes the filesystem read-only except for the state and runtime directories. PrivateTmp gives each service its own /tmp. This is free isolation — there is no reason not to use it.
Frontload everything in the start script
Elixir releases read config at boot via runtime.exs. You cannot inject environment variables after the BEAM starts — there is no hot-reload of system env in a running release.
Do not use systemd Environment= for secrets. Those values end up in the unit file and are visible in /proc. Use LoadCredential instead — systemd places the file contents in $CREDENTIALS_DIRECTORY, and the start script reads them into env vars before exec.
The pattern is: non-secret config goes in environment, secrets go through LoadCredential, and the script block ties them together. Migrations run in the same script block because they need the same env vars. Do not try to run migrations in a separate ExecStartPre — it will not have your database credentials.
All env, all secrets, all pre-start tasks — one script block. One context. Nothing leaking between phases.
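The MyApp.Release.migrate() call in the script assumes a release task module. It is not shown in this post, but the standard version from the Phoenix deployment guides looks roughly like this:

```elixir
defmodule MyApp.Release do
  @moduledoc "Release tasks that must run without Mix available."
  @app :my_app

  def migrate do
    load_app()

    for repo <- repos() do
      {:ok, _, _} =
        Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, :up, all: true))
    end
  end

  defp repos, do: Application.fetch_env!(@app, :ecto_repos)
  defp load_app, do: Application.load(@app)
end
```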
PostgreSQL without shell script hacks
The temptation is to use ExecStartPre to run createuser, createdb, and psql -c "GRANT ...". This works until it does not. You end up fighting race conditions with the postgresql service, handling “already exists” errors with || true, and accumulating shell that breaks on the next NixOS upgrade.
The right approach: the system user the service runs under has the same name as the database. NixOS wires this correctly through ensureDatabases and ensureUsers:
services.postgresql = mkIf (localPgInstances != {}) {
ensureDatabases = mapAttrsToList (_: icfg:
icfg.database.name
) localPgInstances;
ensureUsers = let
dbUsers = unique (mapAttrsToList (_: icfg:
icfg.user
) localPgInstances);
in map (u: {
name = u;
ensureDBOwnership = true;
}) dbUsers;
};
With ensureDBOwnership = true, the PostgreSQL role automatically owns its database. The app connects over a Unix socket at /run/postgresql — peer authentication, no password needed for local connections, no pg_hba.conf hacks.
In runtime.exs:
db_host = System.get_env("DATABASE_HOST", "/run/postgresql")
host_opts =
if String.starts_with?(db_host, "/"),
do: [socket_dir: db_host],
else: [hostname: db_host]
This handles both the NixOS case (Unix socket) and external databases (TCP hostname) from the same code path. The gating on localPgInstances ensures that instances using a remote database URL do not trigger local PostgreSQL provisioning.
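To complete the picture, host_opts merges into the repo configuration alongside the other env vars from the systemd unit. A sketch; MyApp.Repo and the POOL_SIZE default are assumptions:

```elixir
# Password is only present for external databases (peer auth needs none).
db_password =
  case System.get_env("DATABASE_PASSWORD") do
    nil -> []
    pw -> [password: pw]
  end

config :my_app, MyApp.Repo,
  [
    database: System.fetch_env!("DATABASE_NAME"),
    username: System.fetch_env!("DATABASE_USER"),
    pool_size: String.to_integer(System.get_env("POOL_SIZE", "10"))
  ] ++ host_opts ++ db_password
```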
Nginx with content-type routing
When an instance enables nginxHelper, the module auto-generates an nginx vhost. If your app serves both HTTP and gRPC on different ports, nginx can additionally route on the Content-Type header; the vhost itself starts as a plain reverse proxy:
services.nginx = mkIf (nginxInstances != {}) {
enable = true;
virtualHosts = mapAttrs' (_: icfg:
nameValuePair icfg.nginxHelper.domain ({
forceSSL = icfg.nginxHelper.enableACME;
enableACME = icfg.nginxHelper.enableACME;
locations."~ ^/" = {
proxyPass = "http://${icfg.listenAddress}:${toString icfg.port}";
proxyWebsockets = true;
};
})
) nginxInstances;
};
proxyWebsockets = true is essential for Phoenix LiveView. Without it, the WebSocket upgrade fails and LiveView falls back to long-polling — or just breaks, depending on your configuration.
ACME is optional per instance. Staging might use a test CA or no TLS at all. Production uses Let’s Encrypt. The module does not assume.
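For the gRPC case mentioned above, nginx allows grpc_pass inside an if-in-location block, so a single vhost can split traffic on the Content-Type header. A sketch, not a drop-in: the 9090 port is a placeholder for a hypothetical gRPC listener, and gRPC additionally requires HTTP/2 to be enabled on the vhost:

```nix
locations."/" = {
  proxyPass = "http://${icfg.listenAddress}:${toString icfg.port}";
  proxyWebsockets = true;
  extraConfig = ''
    # gRPC clients send a Content-Type starting with application/grpc
    if ($content_type ~ "^application/grpc") {
      grpc_pass grpc://127.0.0.1:9090;  # placeholder gRPC backend port
    }
  '';
};
```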
Putting it together
Here is what a two-instance deployment looks like from the consumer’s side:
{
services.myApp = {
package = my-app.packages.${pkgs.system}.default;
instances = {
production = {
host = "app.example.com";
port = 4000;
secretKeyBaseFile = "/run/secrets/prod-secret-key"; # string, not a path literal — path literals get copied into the world-readable store
database.name = "my_app_prod";
autoMigrate = true;
nginxHelper.enable = true;
nginxHelper.enableACME = true;
};
staging = {
host = "app-staging.example.com";
port = 4001;
secretKeyBaseFile = "/run/secrets/staging-secret-key";
database.name = "my_app_staging";
autoMigrate = true;
nginxHelper.enable = true;
nginxHelper.enableACME = true;
};
};
};
}
From this declaration, nixos-rebuild produces:
- Two systemd services: my-app-production and my-app-staging
- Two system users: my_app_production and my_app_staging
- Two state directories under /var/lib/my-app/
- Two PostgreSQL databases with matching owners
- Two nginx vhosts with separate ACME certificates
- Assertions that prevent port, domain, or database collisions
Add a third instance and you get a third copy of everything. Remove one and NixOS cleans up the service and vhost. The database and user persist — NixOS does not drop databases on removal, which is the correct default.
What I would do differently
If I were starting over, I would use writeShellApplication instead of bare script blocks for the pre-start logic. writeShellApplication runs shellcheck and adds set -euo pipefail automatically — it catches the kind of bugs that only surface in production when a credential file is missing and the script silently continues with an empty variable.
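As a sketch of that refactor (the names are mine, not a drop-in):

```nix
startScript = pkgs.writeShellApplication {
  name = "my-app-start-${name}";
  # writeShellApplication prepends `set -euo pipefail` and runs shellcheck,
  # so a missing credential file aborts the start instead of exporting "".
  text = ''
    SECRET_KEY_BASE="$(< "$CREDENTIALS_DIRECTORY/secret_key_base")"
    export SECRET_KEY_BASE
    exec ${pkg}/bin/my_app start
  '';
};
# then, in serviceConfig:
#   ExecStart = lib.getExe startScript;
```

Declaring and exporting on separate lines is deliberate: it keeps shellcheck happy (SC2155) and preserves the command substitution's exit status under set -e.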
I would also add health checks from day one. systemd supports Type = "notify" with a watchdog, and there are Elixir libraries that integrate with it. Without health checks, a service that starts but stops accepting connections will sit there in “active (running)” state until someone notices manually.
But those are refinements. The patterns above — typed instance submodules, mapAttrs' for service generation, LoadCredential for secrets, ensureDatabases for PostgreSQL, assertions for uniqueness — these are the load-bearing walls. Get them right and everything else is furniture.