Monitoring

The Monitoring page provides network device monitoring, uptime tracking, tag-based rollup alerting, and discovery tools. It includes ping, TCP port, and HTTP endpoint checks, plus ARP-based and port-range network scanning.

Licensing

Network monitoring (uptime monitors, tag rollups, and network discovery on the Scan tab) is part of the Network Monitoring module. Without that module in your license, this page is locked and network discovery is disabled. Trial licenses and development builds unlock it. See License.

Overview

The page is split into three tabs:

Monitors — create and manage uptime monitors
Scan — ARP sweep with optional TCP port-range scan
Tags — group monitors into rollup alert sets with shared health rules and dependency suppression

Monitors Tab

Stats Dashboard

A strip of tiles at the top of the tab summarizes fleet health:

Total — total monitors configured
Up — currently responding
Down — currently failing
Unknown — never checked or indeterminate
Skipped — not checked this cycle (dependency down, disabled, etc.)
Tags Down — number of Monitor Tags currently rolled up to down (with degraded count when applicable). Hidden when no tags are configured.
Avg Response — rolling average response time across all monitors (ms)

Stats auto-refresh on every state change, and a periodic poll (~30 s) keeps counts aligned with the server even if a subscription event is missed.

Filtering

Search box — filters by name, IP, state, or tags
Tag chips — click a tag to filter; multiple tags combine as OR. Clear removes all tag filters. Each chip carries a small dot showing the tag's current rollup state — green/red/amber/grey/dashed for up/down/degraded/unknown/inert.
Result count shows filtered / total.
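
The filter behavior above can be sketched in a few lines. This is a minimal illustration, not GEM's implementation: the Monitor fields, the case-insensitive substring match, and the choice to AND the search box with the tag filter are all assumptions. Only the OR combination of selected tags comes from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Monitor:
    # Hypothetical record shape; the real monitor model may differ.
    name: str
    ip: str
    state: str
    tags: set[str] = field(default_factory=set)


def filter_monitors(monitors, search="", selected_tags=()):
    """Return monitors matching the search text and at least one selected tag."""
    needle = search.strip().lower()
    wanted = set(selected_tags)
    result = []
    for m in monitors:
        # Search covers name, IP, state, and tags (assumed substring match).
        haystack = " ".join([m.name, m.ip, m.state, *sorted(m.tags)]).lower()
        matches_search = needle in haystack
        # Multiple tag chips combine as OR; no chips means no tag filtering.
        matches_tags = not wanted or bool(wanted & m.tags)
        if matches_search and matches_tags:
            result.append(m)
    return result


def result_count(filtered, total):
    """Format the 'filtered / total' counter."""
    return f"{len(filtered)} / {len(total)}"
```

Selecting the tags network and rack in this sketch returns every monitor carrying either tag, which is the OR behavior of the tag chips.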

Bulk Edit

Click Bulk Edit in the filter row to enter multi-select mode. Each card grows a checkbox; an action bar above the list offers:

Select all (filtered) / Clear — operate on the current visible filter set
Add Tag — add the chosen tags to every selected monitor
Remove Tag — strip the chosen tags from every selected monitor
Replace Tags — replace each selected monitor's tag set with exactly the chosen tags (empty picker strips all tags)

Tag pickers in the bulk modal select tags by id from the live monitor_tag table. Click Done in the filter row to exit bulk mode.
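
The three bulk tag actions map onto plain set operations. The sketch below assumes tags are referenced by integer id, as the pickers do; the function name and mode strings are hypothetical.

```python
def apply_bulk_tags(current_ids, chosen_ids, mode):
    """Return the new tag-id set for one selected monitor."""
    current, chosen = set(current_ids), set(chosen_ids)
    if mode == "add":
        return current | chosen      # Add Tag: union
    if mode == "remove":
        return current - chosen      # Remove Tag: difference
    if mode == "replace":
        return chosen                # Replace Tags: exactly the chosen set
    raise ValueError(f"unknown bulk mode: {mode!r}")
```

Note that replace with an empty selection returns an empty set, which matches the documented behavior of stripping all tags.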

Monitor Cards

Each card displays:

Name and optional linked device (D#123)
Paused badge when disabled
Type / Target / Interval summary line
Depends on and Power zone references (if configured)
Tag chips — colored by the live rollup state of each tag the monitor belongs to (up = green, down = red, degraded = amber, unknown = grey). A suppressed by <tag> chip appears when an upstream tag in the dependency chain is currently down.
Last check timestamp, response time, and uptime (for monitors currently up)
Error text if the last check failed
Meta icons — email recipient count, SMS recipient count, notification-profile override flag, and the last alert time (historical — from the pre-alarm-center sender; current sends appear in the alarm event's delivery log)
State badge — up, down, error, checking, or unknown
Alarm chip — a red alarm chip appears when the monitor (or a tag, on its card) has an open event in the alarm center; click it to jump to the Alarms page

When a monitor changes state, its card briefly flashes green (up) or red (down) so the transition is visible at a glance.

Card Actions

Icon buttons on each card:

Pause / Play — toggle monitor enabled state
Run — force a check immediately
History — open the response-time history modal (see below)
Edit — open the edit modal
Delete — remove the monitor

History Modal

The History icon opens a full-size chart of the monitor's response times, populated from the persisted monitor_history table so the data survives server restarts.

The chart shows:

Blue line — response time per check (lower = faster)
Red dots — failed checks
Orange bells at the top with a dashed vertical line — checks that fired a notification under the pre-alarm-center sender (historical markers). Current notifications are logged on the alarm event instead.
X-axis — time labels across the window (includes date when the window spans more than one day)
Max label — top-left corner shows the peak response time in the window

Below the chart, four tiles summarize the period: total checks, success rate, average response time, and max response time. A running failure count is shown beneath the tiles.

History Retention

Monitor check results are kept for the window configured on the Data Retention page (monitor_history_retention_days, default 30 days) and pruned by the daily maintenance job. The in-memory cache keeps the most recent 100 checks for fast subscription replay; older data is served from the database.
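
One way to picture the two storage tiers is a bounded in-memory buffer in front of a date-pruned table. The sketch below is illustrative only: the class and function names are hypothetical, and the actual cache and maintenance job may be structured differently. The 100-check cache size and 30-day default come from the text above.

```python
from collections import deque
from datetime import datetime, timedelta, timezone

CACHE_SIZE = 100              # most recent checks kept in memory
DEFAULT_RETENTION_DAYS = 30   # monitor_history_retention_days default


class HistoryCache:
    """Bounded buffer: appending past the limit drops the oldest check."""

    def __init__(self, size=CACHE_SIZE):
        self._recent = deque(maxlen=size)

    def record(self, check):
        self._recent.append(check)

    def replay(self):
        # Oldest first, as a new subscriber would receive them.
        return list(self._recent)


def prune_cutoff(now, retention_days=DEFAULT_RETENTION_DAYS):
    """Rows older than this timestamp are candidates for the daily prune."""
    return now - timedelta(days=retention_days)
```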

Creating a Monitor

Click Add Monitor in the filter row, or use the Monitor button on a Scan tab row to prefill from discovery.

Monitor Types

Ping (ICMP) — fastest; tests reachability only
TCP — connects to a specific port; good for service-level checks
HTTP / HTTPS — issues a request and validates the response status

Basic Fields

Name — display label
Type — ping / tcp / http
Target — IP, hostname, or URL (HTTP monitors take a full URL)

Check Configuration

Check Interval — 30 s, 1 min, 5 min, 15 min, or 1 hour
Timeout — seconds before a single check is treated as failed (1–30)
Retries — consecutive failed checks required before the monitor is flagged down (1–10)
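
The Retries setting means a single failed check does not flip the monitor to down. A minimal sketch of that counting logic follows; the class name is hypothetical, and it assumes any successful check resets the failure counter.

```python
class DownDetector:
    """Flags a monitor down only after N consecutive failed checks."""

    def __init__(self, retries):
        if not 1 <= retries <= 10:
            raise ValueError("retries must be between 1 and 10")
        self.retries = retries
        self.failures = 0
        self.state = "unknown"   # never checked yet

    def observe(self, ok):
        """Feed one check result and return the resulting state."""
        if ok:
            self.failures = 0
            self.state = "up"
        else:
            self.failures += 1
            if self.failures >= self.retries:
                self.state = "down"
        return self.state
```

With retries set to 3, two failures followed by a success leave the monitor up, and only three failures in a row mark it down.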

Type-Specific Fields

TCP — Port
HTTP — Method (GET/POST/HEAD) and Expected Status Code

Associations

Associate with Device — link to a GEM device (shows as D#id on the card)
Depends on Monitor — skip checks while a parent is down, and suppress alert storms for downstream outages
Power Zone (auto-reboot on down) — power-cycles the zone when the monitor stays down past its retry threshold

Notifications

State changes raise and clear a stateful event under the Network Monitor Down alarm definition; the alarm center sends the notifications (batched, acknowledge-aware, delivery-logged) to the recipients configured here. Configuring notify steps on that definition replaces these recipients entirely.

Email Recipients — pick from GEM users or type addresses directly; empty = system default address
SMS Recipients — pick from GEM users or type numbers directly
Bypass notification profile (always send) — ignores day/hour windows on the recipient's notification profile. Use for critical alerts that should page at any hour.

The alarm event opens once per outage and clears on recovery, so each transition produces exactly one notification and no per-monitor throttle is needed.
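
The once-per-outage behavior can be modeled as a two-state latch. This sketch is a simplification under stated assumptions: the class name and the "raise" / "clear" return values are hypothetical, and the real alarm center also handles batching and acknowledgement.

```python
class MonitorAlarm:
    """Tracks whether a stateful alarm event is currently open."""

    def __init__(self):
        self.open = False

    def on_state(self, state):
        """Return 'raise', 'clear', or None for one observed monitor state."""
        if state == "down" and not self.open:
            self.open = True
            return "raise"       # first down of the outage opens the event
        if state == "up" and self.open:
            self.open = False
            return "clear"       # recovery closes it
        return None              # repeated downs or ups do nothing
```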

Send Test Notification

The Send Test Notification button at the bottom of the modal fires a real email/SMS to every configured recipient using the settings currently entered in the modal, whether or not the monitor has been saved. Results are shown inline per recipient so you can verify alerting is wired up before relying on it.

Real-Time Updates

The page subscribes to live monitor events, so the list updates without reloading:

check_result — refreshes the card state and last-response line
state_change — triggers the flash highlight and stats refresh
created / updated / deleted — keeps the list in sync across sessions
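
A client consuming these events could keep its list in sync with a small reducer like the one below. The event payload shape (a type plus a monitor dict with an id) is an assumption made for illustration; the real subscription format is not documented here.

```python
def apply_event(monitors, event):
    """Return a new id -> monitor mapping with one live event applied."""
    kind = event["type"]
    payload = event["monitor"]
    monitor_id = payload["id"]
    updated = dict(monitors)
    if kind == "deleted":
        updated.pop(monitor_id, None)
    elif kind in ("created", "updated", "check_result", "state_change"):
        # Merge so fields absent from the event keep their previous values.
        updated[monitor_id] = {**updated.get(monitor_id, {}), **payload}
    else:
        raise ValueError(f"unknown event type: {kind!r}")
    return updated


def needs_stats_refresh(event):
    """Only state changes trigger the flash highlight and stats refresh."""
    return event["type"] == "state_change"
```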

Scan Tab

The Scan tab provides network discovery, combining an ARP sweep with an optional TCP port-range scan. Use this tab to enumerate hosts on the LAN, identify services on remote subnets, and onboard devices or monitors in one pass.

Settings

Network Interface — auto-fills the IP range from the interface's CIDR
IP Range — CIDR (192.168.1.0/24) or range form
Port Range — comma and dash syntax, e.g. 23,80,443,5900-5902
Quick Scan (skip port scan) — skip the TCP port scan entirely and only discover hosts via ARP. Returns results in seconds; disables the Port Range field.

Click Start to begin a scan, or Stop to cancel a running port scan.
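
To make the Port Range and CIDR syntax concrete, here is a small parser sketch. The function names are hypothetical and GEM's own parser may accept or reject inputs differently. The CIDR helper uses Python's standard ipaddress module.

```python
import ipaddress


def parse_port_range(spec):
    """Expand comma and dash syntax such as '23,80,443,5900-5902'."""
    ports = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low_text, high_text = part.split("-", 1)
            low, high = int(low_text), int(high_text)
        else:
            low = high = int(part)
        if not 1 <= low <= high <= 65535:
            raise ValueError(f"invalid port range: {part!r}")
        ports.update(range(low, high + 1))
    return sorted(ports)


def hosts_in_cidr(cidr):
    """List usable host addresses in a CIDR block as strings."""
    return [str(ip) for ip in ipaddress.ip_network(cidr, strict=False).hosts()]
```

For the example range in the settings list, parse_port_range returns six ports, and a /24 block expands to 254 host addresses.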

How the Scan Works

An ARP pre-pass enumerates every responder in the IP range. Each host row is populated with IP, MAC, vendor, and an initial driver suggestion.
If Quick Scan is unchecked, the TCP port scan runs afterward. Open ports stream in live and merge into the matching host's row, and the driver suggestion re-ranks as new ports arrive.
Hosts that respond to the port scan but not to ARP (cross-subnet, ARP blocked) appear as bare rows with no MAC or vendor.
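
The merge described in these steps can be sketched as follows. The row shape and function name are hypothetical. The point is only that ARP responders seed the table, port results attach to the matching row by IP, and port-only hosts get a bare row with no MAC or vendor.

```python
def merge_scan(arp_hosts, open_ports):
    """Combine ARP responders with (ip, port) results into one row per host."""
    rows = {
        host["ip"]: {
            "ip": host["ip"],
            "mac": host.get("mac"),
            "vendor": host.get("vendor"),
            "ports": [],
        }
        for host in arp_hosts
    }
    for ip, port in open_ports:
        # Hosts seen only by the port scan get a bare row.
        row = rows.setdefault(ip, {"ip": ip, "mac": None, "vendor": None, "ports": []})
        if port not in row["ports"]:
            row["ports"].append(port)
            row["ports"].sort()
    return rows
```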

Results

Each row shows:

Select — checkbox for bulk monitor creation; the header checkbox toggles all rows
Vendor — from the MAC OUI lookup; falls back to Unknown
IP — clickable when the host is already a GEM device; opens that device's editor in a modal in place (no navigation away from the scan)
MAC
Open Ports — port pills, added live as the scan progresses
Suggested Driver — top match with a confidence score and the primary reason. Low-confidence suggestions (<40%) render in muted grey. For hosts already in GEM the column shows the device's configured driver display name instead of a guess.
Actions — Import opens the device creator prefilled with IP / MAC / port; Monitor opens the monitor creator (port 80/443 prefill as HTTP, a single other port as TCP, no port prefill for multi-port hosts).

Hosts already linked to a GEM device are highlighted in green and show an Already in GEM pill with the device name/label and ID; their Import button is disabled.

Reused on the Devices page

The same scanner is embedded as the Scan Network modal on the Devices page (with the bulk-monitor controls hidden) so you can onboard hardware without leaving the device admin.

Bulk Monitor Creation

Select multiple hosts via row checkboxes to create monitors for all of them in one click. A bar appears above the results table when any rows are selected:

Selected count — how many hosts will be monitored
Tags — comma-separated tags applied to every created monitor (defaults to discovered)
Create N Monitors — runs the same prefill logic as the per-row Monitor button for each selected host (HTTP for port 80/443, TCP for a single other port, otherwise ping) and saves them sequentially. A toast reports successes and failures.
Clear — deselects all rows without creating anything

Selections survive incremental port-scan updates but are wiped when a new scan starts.
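
The prefill rule shared by the per-row Monitor button and bulk creation can be approximated like this. Several details are assumptions: that an open 80 or 443 wins even when other ports are open, that 443 is preferred over 80, and the URL format of the HTTP target. The function name and dict keys are hypothetical.

```python
def prefill_monitor(ip, open_ports, tags=("discovered",)):
    """Suggest a monitor definition for one discovered host."""
    ports = sorted(set(open_ports))
    base = {"name": ip, "tags": list(tags)}
    if 443 in ports:
        return {**base, "type": "http", "target": f"https://{ip}"}
    if 80 in ports:
        return {**base, "type": "http", "target": f"http://{ip}"}
    if len(ports) == 1:
        # A single non-web port becomes a TCP check on that port.
        return {**base, "type": "tcp", "target": ip, "port": ports[0]}
    # No ports, or several non-web ports: fall back to ping with no port.
    return {**base, "type": "ping", "target": ip}
```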

Tags Tab

Monitor Tags group correlated monitors so a single rollup alert covers a shared outage instead of N member alerts. Each tag is a first-class record with its own health rule, dependency edges, and notification fan-out.

Why Tags

Without tags, every member monitor that goes down sends its own email/SMS — a flapping internet uplink can produce dozens of duplicate pages for the same root cause. Tags collapse that into one rollup alert per outage, and dependency edges suppress noise when a deeper outage is already firing.

Build a whole tag set in one ask

The AI Assistant's bulk_create_monitor_tags skill provisions many rollup tags in a single request — members, health rules, and upstream dependency edges together — e.g. "set up the standard rack rollup: network (all switches + APs), cameras anchored on the NVR, access, and av — with cameras/access/av depending on network." It validates the whole batch and rolls back if any tag fails. Create the underlying monitors first (bulk_create_monitors); tags group monitors that already exist.

Tag Cards

Each tag card shows:

Label (or Name when no label is set) and a state badge: up / down / degraded / unknown / inert
Rule — anchor monitor name, threshold percentage, or "all members down"
Members — count plus how many are currently down
Depends on — chip list of upstream tags
Recipient summary — N email · M sms, plus down/recovery macro chips
A red warning when the tag has no recipients and no macros (transitions are inert)
History, Edit, and Delete action buttons

The inert — needs setup badge means the tag is configured to never roll up to down (an anchor rule with no anchor, or a threshold rule with 0%). Inert tags are safe — they were created either by the legacy migration or by an unfinished edit, and they fire nothing until configured.

Creating or Editing a Tag

Click Add Tag (or the edit icon on a card). The modal exposes:

Identity

Name — lowercase identifier (filter-friendly)
Label — optional human-readable label shown on dashboards and emails
Description — optional free text

Health Rule

Anchor monitor — the tag mirrors a single anchor monitor's state. Tag goes down when the anchor goes down. Leaving the anchor empty makes the tag inert.
All members down — tag is down only when every member is down; degraded when some members are down; up when all members are up.
Threshold % of members down — tag goes down when the configured percentage of members are down. The tag is degraded while below the threshold with at least one member down.
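
The three health rules can be expressed as one rollup function. This is a sketch under assumptions, not the product's evaluator: the rule identifiers are hypothetical, "the configured percentage" is read as greater than or equal to the threshold, a tag with no members reports unknown, and members in any state other than down count as not down.

```python
def rollup_state(rule, member_states, anchor_state=None, threshold_pct=0):
    """Compute a tag's rollup state from its rule and member states."""
    if rule == "anchor":
        # No anchor configured means the tag can never roll up to down.
        return "inert" if anchor_state is None else anchor_state

    total = len(member_states)
    down = sum(1 for state in member_states if state == "down")

    if rule == "all_down":
        if total == 0:
            return "unknown"
        if down == total:
            return "down"
        return "degraded" if down else "up"

    if rule == "threshold":
        if threshold_pct <= 0:
            return "inert"       # a 0% threshold is treated as unconfigured
        if total == 0:
            return "unknown"
        # Integer comparison avoids floating-point rounding at the boundary.
        if down * 100 >= threshold_pct * total:
            return "down"
        return "degraded" if down else "up"

    raise ValueError(f"unknown rule: {rule!r}")
```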

Members — pick monitors via the searchable picker. Tags can have any number of members.

Depends On (Upstream Tags) — pick other tags this one depends on. If any transitive upstream tag is currently down, this tag's down/up macros are suppressed. Cycles are rejected at save time.
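
Rejecting cycles at save time amounts to checking whether the tag being saved is reachable from any of its proposed upstream tags. A minimal sketch, assuming dependency edges are held as a mapping from tag name to its set of upstream tag names (a hypothetical in-memory shape):

```python
def would_create_cycle(depends_on, tag, new_upstreams):
    """Return True if saving `tag` with `new_upstreams` would form a cycle."""
    stack = list(new_upstreams)
    seen = set()
    while stack:
        current = stack.pop()
        if current == tag:
            return True          # walked upstream and arrived back at the tag
        if current in seen:
            continue
        seen.add(current)
        stack.extend(depends_on.get(current, ()))
    return False
```

With cameras already depending on network, trying to make network depend on cameras is rejected, as is a tag depending on itself.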

Trigger Macros on State Change

On Down — run macro — fires once per rollup transition into down. Suppressed when an upstream tag is also down.
On Recovery — run macro — fires once when the tag transitions down → up. Receives outage_duration_ms in context.

The macro context for both fields includes monitor_tag_id, monitor_tag_name, monitor_tag_label, previous_state, new_state, trigger_monitor_id / trigger_monitor_name / trigger_monitor_ip, and rollup_reason. See Macros — Context Simulator for previewing these values while authoring steps.

Rollup Notifications

Tag rollup transitions raise and clear a Monitor Tag Down alarm event; the alarm center sends the notifications to the recipients configured here (empty = system default). Down/up macros run independently of notifications.

Email Recipients / SMS Recipients — pick from GEM users or type in addresses/numbers
Alert Throttle — minimum time between rollup macro runs for this tag (no throttle, 5 min, 15 min, 1 h, 6 h, or 24 h default; per-direction). Notifications are per alarm-event transition and don't need throttling.
Bypass notification profile (always send) — same semantics as on per-monitor notifications
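
The per-direction Alert Throttle can be pictured as one timestamp per direction. The sketch below uses hypothetical names and plain seconds for time, and it assumes a throttled macro run is simply dropped rather than deferred.

```python
class MacroThrottle:
    """Enforces a minimum gap between macro runs, tracked per direction."""

    def __init__(self, min_interval_s):
        self.min_interval_s = min_interval_s   # 0 means no throttle
        self._last_run = {}

    def allow(self, direction, now_s):
        """Return True and record the run if this direction may fire now."""
        last = self._last_run.get(direction)
        if last is not None and now_s - last < self.min_interval_s:
            return False
        self._last_run[direction] = now_s
        return True
```

Because the directions are tracked separately, a recent down macro does not block the recovery macro.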

Suppression Semantics

Dependency suppression — a monitor whose upstream dependency is down is skipped, so it never transitions and never raises an alarm; only the root cause alarms.
Storm batching — when a shared outage downs many monitors at once, each raises its own alarm event, but the alarm center batches their notifications into one digest email over a short window (plus the tag rollup alarm, when a tag covers them).
Upstream tag suppression (macros) — when any transitive upstream tag is currently down, a dependent tag's down/up macros are muted. Only the deepest still-up→down transition runs macros.
Boot-time resync gate (macros) — a tag's recovery macro only fires if its down transition was seen in this session, so a tag that boots down doesn't emit a phantom recovery.
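
The two macro gates can be sketched together. The data shapes and function names are hypothetical: depends_on maps a tag to its upstream tags, states maps a tag to its current rollup state, and the session flag stands in for whatever the server records when it observes a down transition.

```python
def suppressing_upstream(tag, depends_on, states):
    """Return the name of a transitive upstream tag that is down, or None."""
    stack = list(depends_on.get(tag, ()))
    seen = set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if states.get(current) == "down":
            return current
        stack.extend(depends_on.get(current, ()))
    return None


def should_run_recovery_macro(tag, depends_on, states, saw_down_this_session):
    """Recovery runs only for an outage seen this session with no upstream down."""
    if not saw_down_this_session:
        return False             # boot-time resync gate
    return suppressing_upstream(tag, depends_on, states) is None
```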

Tag History

Click the history icon on a tag card to open the rollup transition timeline:

A horizontal SVG timeline of state segments (up / down / degraded / unknown), color-coded
A table of every transition with timestamp, previous→new state, rollup reason, triggering monitor, and dispatch summary (macro id, email/sms counts)
Outage duration is computed from the last_down_time column, which is populated when the tag enters down or degraded, so the hot path never has to walk the history table

Tag history rows are pruned alongside monitor_history; one retention setting covers both tables (see Data Retention).

Legacy Tag Migration

Legacy comma-separated monitor.tags strings are migrated into monitor_tag rows on first boot after upgrade. Migrated tags default to the anchor rule with no anchor configured — meaning they're inert and won't fire any rollup alerts until an admin opens them and configures the rule. The migration can never increase alert volume.

The migration is idempotent and gated by a system attribute (monitor_tag_migration_done), so deleting a migrated tag and rebooting won't resurrect it from a stale legacy string.
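
The gating described here can be illustrated with an in-memory model. Everything except the monitor_tag_migration_done attribute name and the anchor-with-no-anchor default is an assumption, including the dict shapes, the lowercasing of names, and the function name.

```python
def migrate_legacy_tags(monitors, tags, system_attrs):
    """Create inert tag records from legacy comma-separated strings, once."""
    if system_attrs.get("monitor_tag_migration_done"):
        return 0                 # already migrated: never run again
    created = 0
    for monitor in monitors:
        for raw in (monitor.get("tags") or "").split(","):
            name = raw.strip().lower()
            if not name:
                continue
            if name not in tags:
                # Anchor rule with no anchor: inert until an admin configures it.
                tags[name] = {"name": name, "rule": "anchor",
                              "anchor_monitor_id": None, "members": set()}
                created += 1
            tags[name]["members"].add(monitor["id"])
    system_attrs["monitor_tag_migration_done"] = True
    return created
```

Running the function a second time, even after a migrated tag has been deleted, creates nothing because the flag short-circuits the whole pass.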

Auto-Reboot on Failure

When a monitor has a Power Zone configured and the monitor stays down past its retry threshold, GEM power-cycles the zone to recover the device. Useful for:

Cameras and NVRs that hang
Routers and switches that need periodic reboots
Equipment without remote management

Don't auto-reboot critical life-safety equipment

Don't assign a power zone to monitors for HVAC controllers, access control panels, or other equipment where an unattended reboot could cause harm.

Monitor Dependencies

Depends-on relationships build a hierarchy:

Monitor 1: Internet Gateway (8.8.8.8)
Monitor 2: Local Router  (depends on Monitor 1)
Monitor 3: Camera        (depends on Monitor 2)

When a parent is down, dependent monitors are counted as skipped rather than alerting separately — eliminating alert storms during upstream outages.

When a monitor with dependents goes down, the alert email lists every downstream monitor affected, and the SMS includes a compact count (impacts 4 dependent monitors) so the recipient sees the blast radius at a glance. The same payload is recorded on the triggering history row and is visible as an orange marker on the History Modal chart.
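
The hierarchy above can be walked in both directions: upward to decide whether a check should be skipped, and downward to list the blast radius. This sketch assumes skipping is transitive through the chain and that depends_on maps each monitor id to its single parent id. The function names are hypothetical.

```python
def blocked_by(monitor_id, depends_on, states):
    """Return the nearest ancestor that is down, or None if checks may run."""
    seen = set()
    parent = depends_on.get(monitor_id)
    while parent is not None and parent not in seen:
        if states.get(parent) == "down":
            return parent
        seen.add(parent)         # guards against an accidental loop
        parent = depends_on.get(parent)
    return None


def downstream_of(monitor_id, depends_on):
    """List every monitor that depends on this one, directly or indirectly."""
    children = {}
    for child, parent in depends_on.items():
        children.setdefault(parent, []).append(child)
    found, queue = [], [monitor_id]
    while queue:
        current = queue.pop(0)
        for child in sorted(children.get(current, [])):
            if child not in found:
                found.append(child)
                queue.append(child)
    return found


def sms_summary(monitor_id, depends_on):
    """Compact dependent count in the style of the SMS text."""
    return f"impacts {len(downstream_of(monitor_id, depends_on))} dependent monitors"
```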

Dependencies vs Tags

Per-monitor depends_on_monitor_id is a one-to-one parent/child link that pauses checks while the parent is down. Tag dependencies are tag-to-tag edges that suppress alerts (member emails and dependent tag rollups) without affecting the underlying check schedule. Use both together: dependency edges to model "if A is down, don't bother probing B"; tag edges to collapse a noisy outage into a single page.

Related Documentation

Devices — device configuration
Device Health — historical uptime data
Data Retention — log retention settings
Notification Profiles — recipient day/hour windows and channels
