Monitoring UPS Systems with NUT, Prometheus, and Grafana
Power failures are one of those problems that feel theoretical right up until a UPS starts beeping and something important goes dark. At that point you do not want to SSH into a host and manually run upsc just to figure out whether the battery is healthy, how much load the unit is carrying, or whether the mains voltage has been wobbling all afternoon.
I wanted the same thing I want from the rest of my infrastructure: one place to look, one scrape path, and graphs that tell me whether the UPS is quietly doing its job or about to ruin my evening.
The setup here monitors two Eaton units:
| Host | OS | UPS Model | Rated Power | Role |
|---|---|---|---|---|
| fatty | FreeBSD 14.7 | Eaton Ellipse PRO 1600 | 1600 VA | Main server running bhyve VMs |
| chronos | Arch Linux ARM | Eaton 3S 550 | 550 VA | Raspberry Pi GPS/NTP time server |
Both UPS units are USB-attached to their local host. NUT talks to the hardware, nut_exporter translates that into Prometheus metrics, Prometheus scrapes it, and Grafana turns it into something you can glance at in five seconds.
How the stack fits together
The data path is simple:
UPS --USB--> NUT driver + upsd + upsmon --TCP:3493--> nut_exporter --HTTP:9199/ups_metrics--> Prometheus --> Grafana
NUT is really three pieces:
- usbhid-ups talks to the UPS over USB HID.
- upsd serves UPS state to clients on TCP port 3493.
- upsmon watches battery state and handles shutdown behavior.
nut_exporter connects to upsd, asks for a defined set of variables, and exposes them as Prometheus metrics. Prometheus does not need to know anything about USB, Eaton, or NUT internals. It just scrapes HTTP.
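To make the upsd hop less of a black box, here is a minimal Python sketch of the same kind of query a client sends over TCP 3493. It assumes the UPS is named ups, that upsd listens on 127.0.0.1, and that reads need no authentication (the usual default). The function names are my own, and the parser deliberately ignores escaped quotes inside values.

```python
import socket


def parse_list_var(lines, ups_name):
    """Parse the lines of a `LIST VAR <ups>` response into a dict of strings."""
    prefix = f"VAR {ups_name} "
    variables = {}
    for line in lines:
        if not line.startswith(prefix):
            continue  # skip the BEGIN/END markers and anything unexpected
        name, _, quoted = line[len(prefix):].partition(" ")
        quoted = quoted.strip()
        # Values arrive wrapped in double quotes; escaped quotes are not handled here.
        if len(quoted) >= 2 and quoted[0] == '"' and quoted[-1] == '"':
            quoted = quoted[1:-1]
        variables[name] = quoted
    return variables


def fetch_vars(ups_name="ups", host="127.0.0.1", port=3493, timeout=5.0):
    """Ask upsd for every variable of one UPS over its plain-text protocol."""
    end_marker = f"END LIST VAR {ups_name}"
    lines = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(f"LIST VAR {ups_name}\n".encode("ascii"))
        reader = sock.makefile("r", encoding="utf-8", newline="\n")
        for raw in reader:
            line = raw.rstrip("\n")
            lines.append(line)
            # Stop at the end of the list or on an error reply such as ERR UNKNOWN-UPS.
            if line == end_marker or line.startswith("ERR "):
                break
    return parse_list_var(lines, ups_name)
```

Calling fetch_vars() on a UPS host should return roughly what upsc ups@localhost prints, as a dictionary. An empty result usually means upsd answered with an error, most often because the UPS name does not match the section name in ups.conf.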
NUT configuration
The configuration is almost the same on both hosts. The big FreeBSD wrinkle is pathing: on Linux the config usually lives under /etc/nut/, while on FreeBSD it typically lives under /usr/local/etc/nut/. The service names differ too: Linux usually splits the units out, while FreeBSD gives you nut and nut_upsmon.
The core device definition is minimal:
# /etc/nut/ups.conf on Linux
# /usr/local/etc/nut/ups.conf on FreeBSD
[ups]
driver = usbhid-ups
port = auto
desc = "Eaton UPS"
port = auto is usually enough for a single directly attached unit. The section name matters because that becomes the UPS identifier later, for example upsc ups@localhost.
Restrict upsd to localhost if the exporter runs on the same machine:
# upsd.conf
LISTEN 127.0.0.1 3493
And the basic upsmon user and monitor configuration looks like this:
# upsd.users
[monitor]
password = secret
upsmon master
# upsmon.conf
MONITOR ups@localhost 1 monitor secret master
SHUTDOWNCMD "/sbin/shutdown -h +0"
For a directly attached UPS on a single host, standalone mode is the right fit:
# nut.conf
MODE=standalone
That tells NUT to run the driver, upsd, and upsmon together.
Verifying NUT before you add Prometheus
Do not start with Grafana. Start with upsc.
These are the checks that matter most:
# Linux
systemctl status nut-server
systemctl status nut-monitor
systemctl status nut-driver
# FreeBSD
service nut status
service nut_upsmon status
# List UPS names known to upsd
upsc -l
# Dump all variables
upsc ups@localhost
# Spot check the useful ones
upsc ups@localhost battery.charge
upsc ups@localhost ups.status
upsc ups@localhost input.voltage
# Check direct driver state
upsdrvctl status
If this layer is broken, Prometheus is not the problem.
The common failure modes are the usual ones:
- USB permissions are wrong, so the driver cannot open the device.
- The wrong driver is configured in ups.conf.
- The UPS has gone stale and upsc reports Data stale.
- upsd is not listening where you think it is.
- Config paths or service names were copied from Linux docs onto FreeBSD without adjustment.
For Eaton gear, usbhid-ups is usually the right driver. If the box has multiple USB UPS devices, you may need to add serial, vendorid, or productid instead of relying on autodetect.
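If you want the upsc check scripted rather than eyeballed, something like the following works as a pre-flight test. It is a sketch that assumes upsc is on the PATH and prints one "name: value" pair per line. The list of required variables and the function names are my own choices, not anything NUT prescribes.

```python
import subprocess


def parse_upsc(text):
    """Turn upsc output ("name: value" per line) into a dict of strings."""
    variables = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep:
            variables[name.strip()] = value.strip()
    return variables


def check_ups(variables, required=("ups.status", "battery.charge")):
    """Return a list of human-readable problems; an empty list means healthy."""
    problems = [f"missing variable: {name}" for name in required if name not in variables]
    flags = variables.get("ups.status", "").split()
    if flags and "OL" not in flags:
        problems.append(f"not on mains power (ups.status = {' '.join(flags)})")
    return problems


def read_upsc(target="ups@localhost"):
    """Run upsc for one UPS and return its variables as a dict."""
    # check=True raises CalledProcessError when upsc exits non-zero,
    # which is what I would expect for an unknown UPS or stale data.
    result = subprocess.run(["upsc", target], capture_output=True, text=True, check=True)
    return parse_upsc(result.stdout)
```

Running check_ups(read_upsc()) from cron or a deploy script gives a quick yes/no answer before the exporter is even installed.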
Exporting NUT metrics
The exporter here is DRuggeri’s nut_exporter. It is a small Go binary that queries upsd and exposes UPS metrics over HTTP.
The key behavior to remember is that it serves UPS metrics on /ups_metrics, not the Prometheus default /metrics. Miss that detail and you will spend too long staring at empty scrape targets.
Installing the exporter
Example release install commands:
# Linux arm64 (Raspberry Pi)
curl -LO https://github.com/DRuggeri/nut_exporter/releases/download/2.5.2/nut_exporter-2.5.2.linux-arm64.tar.gz
tar xzf nut_exporter-2.5.2.linux-arm64.tar.gz
sudo cp nut_exporter-2.5.2.linux-arm64/nut_exporter /usr/local/bin/
# FreeBSD amd64
curl -LO https://github.com/DRuggeri/nut_exporter/releases/download/2.5.2/nut_exporter-2.5.2.freebsd-amd64.tar.gz
tar xzf nut_exporter-2.5.2.freebsd-amd64.tar.gz
sudo cp nut_exporter-2.5.2.freebsd-amd64/nut_exporter /usr/local/bin/
Running it
This is the flag set I care about:
/usr/local/bin/nut_exporter \
--web.listen-address=:9199 \
--metrics.namespace=network_ups_tools \
--nut.vars_enable=battery.charge,battery.voltage,battery.runtime,input.voltage,input.voltage.nominal,output.voltage,output.frequency,ups.load,ups.status,ups.power,ups.power.nominal,ups.realpower,ups.beeper.status
Those flags do three useful things:
- --web.listen-address=:9199 chooses the HTTP port.
- --metrics.namespace=network_ups_tools keeps the metric names conventional.
- --nut.vars_enable=... whitelists the exact NUT variables worth scraping.
Without the variable whitelist, you end up exporting a lot of noise. UPS metrics are already niche enough; there is no need to make the series set bigger than necessary.
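Because the same long flag ends up in several places, it can help to generate it from one list. The sketch below does that in Python and also shows the naming pattern seen throughout this post, where dots in a NUT variable become underscores after the namespace. The helper names are hypothetical, and the name mapping is inferred from the metrics shown here rather than taken from the exporter's source.

```python
NUT_VARS = [
    "battery.charge", "battery.voltage", "battery.runtime",
    "input.voltage", "input.voltage.nominal",
    "output.voltage", "output.frequency",
    "ups.load", "ups.status", "ups.power", "ups.power.nominal",
    "ups.realpower", "ups.beeper.status",
]


def exporter_args(variables, listen=":9199", namespace="network_ups_tools"):
    """Build the nut_exporter argument list used in this post."""
    return [
        f"--web.listen-address={listen}",
        f"--metrics.namespace={namespace}",
        "--nut.vars_enable=" + ",".join(variables),
    ]


def metric_name(variable, namespace="network_ups_tools"):
    """Map a NUT variable name to the metric name pattern seen in this post."""
    return f"{namespace}_{variable.replace('.', '_')}"
```

Joining exporter_args(NUT_VARS) with spaces gives a string that can be pasted into the systemd unit or rc.conf, so the copies cannot drift apart.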
systemd service on Linux
[Unit]
Description=NUT Exporter
After=nut-server.service
[Service]
User=nobody
ExecStart=/usr/local/bin/nut_exporter \
--web.listen-address=:9199 \
--metrics.namespace=network_ups_tools \
--nut.vars_enable=battery.charge,battery.voltage,battery.runtime,input.voltage,input.voltage.nominal,output.voltage,output.frequency,ups.load,ups.status,ups.power,ups.power.nominal,ups.realpower,ups.beeper.status
Restart=always
[Install]
WantedBy=multi-user.target
FreeBSD startup
On FreeBSD I keep it simple in rc.conf:
nut_exporter_enable="YES"
nut_exporter_flags="--web.listen-address=:9199 --metrics.namespace=network_ups_tools --nut.vars_enable=battery.charge,battery.voltage,battery.runtime,input.voltage,input.voltage.nominal,output.voltage,output.frequency,ups.load,ups.status,ups.power,ups.power.nominal,ups.realpower,ups.beeper.status"
If you prefer a dedicated rc.d script, that works too. The important part is that the exporter comes up after NUT and points at the same UPS name that upsc uses.
Debugging the exporter
These checks tell you quickly whether the exporter layer is alive:
ps aux | grep nut_exporter
curl -s http://localhost:9199/ups_metrics | head -30
curl -s http://localhost:9199/ups_metrics | grep network_ups_tools_ups_status
curl -s http://localhost:9199/ups_metrics | grep network_ups_tools_battery_charge
curl -s http://localhost:9199/ups_metrics | grep network_ups_tools_device_info
And from the monitoring host:
curl -s http://fatty:9199/ups_metrics | grep network_ups_tools_ups_load
curl -s http://chronos:9199/ups_metrics | grep network_ups_tools_ups_load
If the endpoint is empty, either upsd is unreachable or the UPS name is wrong. If the metrics exist but values are stale or zero, NUT itself is usually the real problem.
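The grep checks above can also be expressed as a small script that reports which expected metrics are absent. This is a rough sketch: the parser handles only the simple sample lines the exporter emits (no timestamps, no exotic escaping), and the function names and default URL are assumptions based on the setup in this post.

```python
import re
import urllib.request

# metric_name{label="value",...} value
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines plus HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if not match:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def missing_metrics(samples, expected):
    """Return the expected metric names that do not appear in the scrape."""
    present = {name for name, _, _ in samples}
    return [name for name in expected if name not in present]


def scrape(url="http://localhost:9199/ups_metrics", timeout=5.0):
    """Fetch and parse one scrape from the exporter."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

A metric that is missing on one host but present on the other is often just a model difference rather than a fault, so treat the result as a hint, not an alarm.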
The metrics worth caring about
The exporter produces a useful mix of identity, status, load, battery, and electrical metrics.
Device identity comes through as labels:
network_ups_tools_device_info{mfr="EATON",model="Ellipse PRO 1600",serial="0",type="ups"} 1
network_ups_tools_device_info{mfr="EATON",model="Eaton 3S 550",serial="Blank",type="ups"} 1
The most important metric family is the status flags:
| Metric | Meaning |
|---|---|
| network_ups_tools_ups_status{flag="OL"} | Online, mains power present |
| network_ups_tools_ups_status{flag="OB"} | On battery |
| network_ups_tools_ups_status{flag="LB"} | Low battery |
| network_ups_tools_ups_status{flag="CHRG"} | Charging |
| network_ups_tools_ups_status{flag="DISCHRG"} | Discharging |
| network_ups_tools_ups_status{flag="BOOST"} | Boosting low input voltage |
| network_ups_tools_ups_status{flag="TRIM"} | Trimming high input voltage |
| network_ups_tools_ups_status{flag="RB"} | Replace battery |
Under normal conditions, OL=1 and everything else is 0. During an outage you should see OB=1, OL=0, and usually DISCHRG=1.
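As an illustration of how those flags combine, here is a small Python sketch that collapses a set of flag values into one summary line. The severity wording and the ordering of the checks are my own, not something NUT or the exporter defines.

```python
def describe_status(flags):
    """Summarize UPS state from a mapping of flag name to 0 or 1."""
    active = {name for name, value in flags.items() if value == 1}
    if "OB" in active and "LB" in active:
        return "critical: on battery and battery low"
    if "OB" in active:
        return "warning: on battery"
    if "RB" in active:
        return "warning: replace battery"
    if "OL" in active:
        return "ok: on mains power"
    return "unknown: neither OL nor OB is set"
```

For example, describe_status({"OL": 0, "OB": 1, "DISCHRG": 1}) reports the on-battery warning, which is the outage case described above.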
The rest are the practical numbers:
| Metric | Unit | Why it matters |
|---|---|---|
| network_ups_tools_ups_realpower | watts | Actual power draw |
| network_ups_tools_ups_power | VA | Apparent power |
| network_ups_tools_ups_power_nominal | VA | Rated UPS capacity |
| network_ups_tools_ups_load | percent | Capacity percentage in use |
| network_ups_tools_battery_charge | percent | Battery state of charge |
| network_ups_tools_battery_runtime | seconds | Estimated remaining runtime |
| network_ups_tools_battery_voltage | volts | Battery voltage |
| network_ups_tools_input_voltage | volts | Incoming mains voltage |
| network_ups_tools_output_voltage | volts | UPS output voltage |
| network_ups_tools_output_frequency | hertz | Output frequency |
A handy derived value for dashboards is capacity utilization:
(network_ups_tools_ups_realpower * 100) / network_ups_tools_ups_power_nominal
That is often more honest than the vendor’s own load percentage, especially when you want a quick visual threshold against the unit’s rated capacity.
One cross-model wrinkle: not every UPS exposes the same variables. My Eaton Ellipse PRO 1600 reports things like input.voltage, output.frequency, and ups.power. The smaller Eaton 3S 550 does not expose all of those. That is normal. The exporter simply omits missing variables instead of inventing zeros.
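The utilization formula and the missing-variable caveat fit into a few lines of Python. This is only a worked version of the PromQL expression above, with a hypothetical function name and made-up wattage in the usage note. It compares watts against a VA rating, so it ignores power factor and will read somewhat lower than a calculation against the unit's rated watts.

```python
def capacity_utilization(realpower_watts, nominal_va):
    """Return real power as a percentage of the nominal VA rating.

    Returns None when either input is missing, mirroring how the exporter
    omits variables a UPS does not report instead of inventing zeros.
    """
    if realpower_watts is None or nominal_va is None or nominal_va <= 0:
        return None
    return realpower_watts * 100 / nominal_va
```

With an illustrative 400 W draw on the 1600 VA unit, capacity_utilization(400, 1600) gives 25.0. If the smaller UPS does not report ups.realpower, the function returns None, and a dashboard panel built on the same formula would simply show no data for it.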
Scraping it with Prometheus
This is the scrape configuration on the monitoring host, written as a NixOS module:
{ network, ... }:
{
services.prometheus.scrapeConfigs = [
{
job_name = "nut";
honor_labels = true;
metrics_path = "/ups_metrics";
static_configs = [
{
targets = [ "${network.hosts.fatty.ip}:9199" ];
labels = { instance = "fatty"; };
}
{
targets = [ "${network.hosts.chronos-wired.ip}:9199" ];
labels = { instance = "chronos"; };
}
];
}
];
}
Again, the important line is metrics_path = "/ups_metrics". Everything else is ordinary Prometheus.
Building the Grafana dashboard
The dashboard I wanted was straightforward:
- status at the top so I can see online vs on-battery immediately
- current voltage, load, runtime, and beeper state as stat panels
- smoothed time series for power, battery charge, load, runtime, and voltages
- a variable for selecting one or more UPS instances
- a smoothing interval variable so noisy readings do not turn every graph into static
The useful template variables are:
- $instance, populated from label_values(network_ups_tools_ups_load, instance)
- $smoothing_interval, with values like 1m, 5m, 15m, 1h, 1d, and 1w
For the time series panels I use avg_over_time(...[$smoothing_interval]). UPS readings are often twitchy enough that raw charts are less helpful than a short moving average.
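To show what that smoothing does to a series, here is a rough Python equivalent of averaging over a trailing window. It is a sketch of the idea rather than of Prometheus internals: the function name is hypothetical, and the exact handling of the window boundary differs between Prometheus versions.

```python
def avg_over_window(samples, at, window):
    """Average the values whose timestamps fall in (at - window, at].

    samples is a list of (timestamp_seconds, value) pairs; at and window
    are in seconds. Returns None when no sample falls inside the window.
    """
    in_window = [value for ts, value in samples if at - window < ts <= at]
    if not in_window:
        return None
    return sum(in_window) / len(in_window)
```

A longer window pulls in more samples, so a brief spike in load or a single odd voltage reading moves the plotted line less. The cost is that real changes also show up later and flatter, which is why the dashboard defaults to 15m rather than 1d.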
Here is the full dashboard JSON:
{
"annotations": { "list": [] },
"description": "UPS monitoring via NUT exporter (DRuggeri)",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"links": [],
"panels": [
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"fieldConfig": {
"defaults": {
"mappings": [
{
"options": {
"0": { "text": "Offline", "color": "red" },
"1": { "text": "Online", "color": "green" }
},
"type": "value"
}
],
"thresholds": {
"mode": "absolute",
"steps": [
{ "color": "red", "value": null },
{ "color": "green", "value": 1 }
]
}
},
"overrides": []
},
"gridPos": { "h": 4, "w": 6, "x": 0, "y": 0 },
"id": 1,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false },
"textMode": "auto"
},
"targets": [
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"expr": "network_ups_tools_ups_status{instance=~\"$instance\", flag=\"OL\"}",
"instant": true,
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"title": "Status",
"type": "stat"
},
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"fieldConfig": {
"defaults": {
"thresholds": {
"mode": "absolute",
"steps": [{ "color": "green", "value": null }]
},
"unit": "volt"
},
"overrides": []
},
"gridPos": { "h": 4, "w": 4, "x": 6, "y": 0 },
"id": 2,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false },
"textMode": "auto"
},
"targets": [
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"expr": "network_ups_tools_input_voltage{instance=~\"$instance\"}",
"instant": true,
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"title": "Input Voltage",
"type": "stat"
},
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"fieldConfig": {
"defaults": {
"decimals": 0,
"max": 100,
"min": 0,
"thresholds": {
"mode": "absolute",
"steps": [
{ "color": "green", "value": null },
{ "color": "#EAB839", "value": 50 },
{ "color": "red", "value": 75 }
]
},
"unit": "percent"
},
"overrides": []
},
"gridPos": { "h": 4, "w": 4, "x": 10, "y": 0 },
"id": 3,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false },
"textMode": "auto"
},
"targets": [
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"expr": "network_ups_tools_ups_load{instance=~\"$instance\"}",
"instant": true,
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"title": "Load",
"type": "stat"
},
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"fieldConfig": {
"defaults": {
"thresholds": {
"mode": "absolute",
"steps": [
{ "color": "red", "value": null },
{ "color": "orange", "value": 300 },
{ "color": "green", "value": 600 }
]
},
"unit": "s"
},
"overrides": []
},
"gridPos": { "h": 4, "w": 4, "x": 14, "y": 0 },
"id": 4,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false },
"textMode": "auto"
},
"targets": [
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"expr": "network_ups_tools_battery_runtime{instance=~\"$instance\"}",
"instant": true,
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"title": "Runtime",
"type": "stat"
},
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"fieldConfig": {
"defaults": {
"mappings": [
{
"options": {
"0": { "text": "Disabled", "color": "red" },
"1": { "text": "Enabled", "color": "green" }
},
"type": "value"
}
],
"thresholds": {
"mode": "absolute",
"steps": [
{ "color": "red", "value": null },
{ "color": "green", "value": 1 }
]
}
},
"overrides": []
},
"gridPos": { "h": 4, "w": 6, "x": 18, "y": 0 },
"id": 5,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false },
"textMode": "auto"
},
"targets": [
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"expr": "network_ups_tools_ups_beeper_status{instance=~\"$instance\"}",
"instant": true,
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"title": "Beeper Status",
"type": "stat"
},
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"fieldConfig": {
"defaults": {
"color": { "mode": "palette-classic" },
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisPlacement": "auto",
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "opacity",
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"showPoints": "never",
"spanNulls": true,
"stacking": { "group": "A", "mode": "none" },
"thresholdsStyle": { "mode": "off" }
},
"decimals": 1,
"unit": "watt"
},
"overrides": []
},
"gridPos": { "h": 10, "w": 24, "x": 0, "y": 4 },
"id": 6,
"options": {
"legend": {
"calcs": ["lastNotNull", "mean"],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": { "mode": "multi", "sort": "none" }
},
"targets": [
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"expr": "avg_over_time(network_ups_tools_ups_realpower{instance=~\"$instance\"}[$smoothing_interval])",
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"title": "Power Consumption [$smoothing_interval]",
"type": "timeseries"
},
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"fieldConfig": {
"defaults": {
"color": { "mode": "palette-classic" },
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisPlacement": "auto",
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "opacity",
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"showPoints": "never",
"spanNulls": true,
"stacking": { "group": "A", "mode": "none" },
"thresholdsStyle": { "mode": "off" }
},
"decimals": 0,
"max": 100,
"min": 0,
"unit": "percent"
},
"overrides": []
},
"gridPos": { "h": 6, "w": 12, "x": 0, "y": 14 },
"id": 7,
"options": {
"legend": {
"calcs": ["lastNotNull"],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": { "mode": "multi", "sort": "none" }
},
"targets": [
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"expr": "avg_over_time(network_ups_tools_battery_charge{instance=~\"$instance\"}[$smoothing_interval])",
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"title": "Battery Charge [$smoothing_interval]",
"type": "timeseries"
},
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"fieldConfig": {
"defaults": {
"color": { "mode": "thresholds" },
"max": 100,
"min": 0,
"thresholds": {
"mode": "absolute",
"steps": [
{ "color": "green", "value": null },
{ "color": "red", "value": 80 }
]
},
"unit": "percent"
},
"overrides": []
},
"gridPos": { "h": 6, "w": 12, "x": 12, "y": 14 },
"id": 8,
"options": {
"minVizHeight": 75,
"minVizWidth": 75,
"orientation": "auto",
"reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false },
"showThresholdLabels": false,
"showThresholdMarkers": true,
"sizing": "auto"
},
"targets": [
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"expr": "(network_ups_tools_ups_realpower{instance=~\"$instance\"} * 100) / network_ups_tools_ups_power_nominal{instance=~\"$instance\"}",
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"title": "Capacity Utilization",
"type": "gauge"
},
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"fieldConfig": {
"defaults": {
"color": { "mode": "palette-classic" },
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisPlacement": "auto",
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "opacity",
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"showPoints": "never",
"spanNulls": true,
"stacking": { "group": "A", "mode": "none" },
"thresholdsStyle": { "mode": "off" }
},
"max": 100,
"min": 0,
"unit": "percent"
},
"overrides": []
},
"gridPos": { "h": 6, "w": 12, "x": 0, "y": 20 },
"id": 9,
"options": {
"legend": {
"calcs": ["lastNotNull"],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": { "mode": "multi", "sort": "none" }
},
"targets": [
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"expr": "avg_over_time(network_ups_tools_ups_load{instance=~\"$instance\"}[$smoothing_interval])",
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"title": "Load [$smoothing_interval]",
"type": "timeseries"
},
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"fieldConfig": {
"defaults": {
"color": { "mode": "palette-classic" },
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisPlacement": "auto",
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "opacity",
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"showPoints": "never",
"spanNulls": true,
"stacking": { "group": "A", "mode": "none" },
"thresholdsStyle": { "mode": "off" }
},
"unit": "s"
},
"overrides": []
},
"gridPos": { "h": 6, "w": 12, "x": 12, "y": 20 },
"id": 10,
"options": {
"legend": {
"calcs": ["lastNotNull", "min"],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": { "mode": "multi", "sort": "none" }
},
"targets": [
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"expr": "avg_over_time(network_ups_tools_battery_runtime{instance=~\"$instance\"}[$smoothing_interval])",
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"title": "Runtime [$smoothing_interval]",
"type": "timeseries"
},
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"fieldConfig": {
"defaults": {
"color": { "mode": "palette-classic" },
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisPlacement": "auto",
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "opacity",
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"showPoints": "never",
"spanNulls": true,
"stacking": { "group": "A", "mode": "none" },
"thresholdsStyle": { "mode": "off" }
},
"decimals": 1,
"unit": "volt"
},
"overrides": []
},
"gridPos": { "h": 6, "w": 12, "x": 0, "y": 26 },
"id": 11,
"options": {
"legend": {
"calcs": ["lastNotNull"],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": { "mode": "multi", "sort": "none" }
},
"targets": [
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"expr": "avg_over_time(network_ups_tools_input_voltage{instance=~\"$instance\"}[$smoothing_interval])",
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"title": "Input Voltage [$smoothing_interval]",
"type": "timeseries"
},
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"fieldConfig": {
"defaults": {
"color": { "mode": "palette-classic" },
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisPlacement": "auto",
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "opacity",
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"showPoints": "never",
"spanNulls": true,
"stacking": { "group": "A", "mode": "none" },
"thresholdsStyle": { "mode": "off" }
},
"decimals": 1,
"unit": "volt"
},
"overrides": []
},
"gridPos": { "h": 6, "w": 12, "x": 12, "y": 26 },
"id": 12,
"options": {
"legend": {
"calcs": ["lastNotNull"],
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": { "mode": "multi", "sort": "none" }
},
"targets": [
{
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"expr": "avg_over_time(network_ups_tools_output_voltage{instance=~\"$instance\"}[$smoothing_interval])",
"legendFormat": "{{instance}}",
"refId": "A"
}
],
"title": "Output Voltage [$smoothing_interval]",
"type": "timeseries"
}
],
"preload": false,
"refresh": "10s",
"schemaVersion": 42,
"tags": [],
"templating": {
"list": [
{
"current": {},
"includeAll": false,
"multi": false,
"name": "datasource",
"query": "prometheus",
"refresh": 1,
"type": "datasource"
},
{
"current": { "text": ["All"], "value": ["$__all"] },
"datasource": { "type": "prometheus", "uid": "${datasource}" },
"definition": "label_values(network_ups_tools_ups_load, instance)",
"includeAll": true,
"label": "Instance",
"multi": true,
"name": "instance",
"query": { "query": "label_values(network_ups_tools_ups_load, instance)", "refId": "StandardVariableQuery" },
"refresh": 1,
"sort": 2,
"type": "query"
},
{
"current": { "text": "15m", "value": "15m" },
"includeAll": false,
"label": "Smoothing",
"name": "smoothing_interval",
"options": [
{ "selected": false, "text": "1m", "value": "1m" },
{ "selected": false, "text": "5m", "value": "5m" },
{ "selected": true, "text": "15m", "value": "15m" },
{ "selected": false, "text": "1h", "value": "1h" },
{ "selected": false, "text": "1d", "value": "1d" },
{ "selected": false, "text": "1w", "value": "1w" }
],
"query": "1m,5m,15m,1h,1d,1w",
"type": "custom"
}
]
},
"time": { "from": "now-7d", "to": "now" },
"timepicker": {},
"timezone": "",
"title": "NUT UPS",
"uid": "nut-ups"
}
Quick health checks
These are the commands I actually want handy when something looks wrong:
# On the UPS host
upsc ups@localhost
upsc ups@localhost ups.status
upsc ups@localhost battery.charge
upsc ups@localhost ups.load
curl -s http://localhost:9199/ups_metrics | grep network_ups_tools_ups_status
# From the monitoring host
curl -s http://fatty:9199/ups_metrics | grep ups_load
curl -s http://chronos:9199/ups_metrics | grep ups_load
# Check for on-battery state
curl -s http://fatty:9199/ups_metrics | grep 'flag="OB"'
If OB flips to 1, you are on battery. If LB joins it, stop admiring the dashboard and start caring about shutdown sequencing.
Wrapping up
This stack is not complicated, but it is full of little edges that are easy to forget six months later: FreeBSD path differences, NUT service naming, the exporter’s /ups_metrics path, and the fact that different UPS models expose different variables.
Once those are handled, though, UPS monitoring becomes just another Prometheus job. You get trend data for load, battery, and voltage, quick confirmation that a host is still online, and a much better answer than “the UPS is making noises again.”