Arq documentation
Onboarding and evaluation docs for Arq. Read these top-to-bottom on a first evaluation, or jump to the section you need.
1. What Arq is and when to use it
Arq turns PostgreSQL operational signals into implementable triage tickets — senior-DBA-level diagnosis, packaged for the teams that ship the fix. It is not a dashboard, not a metrics alerting system, and not a continuous monitor. It is a snapshot-driven analysis pipeline whose output is a finding you can hand to whoever owns the database.
Reach for Arq when…
- You run PostgreSQL in production but don't have senior DBA capacity dedicated to it, or the DBAs you have are stretched too thin to triage every signal.
- Your existing dashboards tell you something is wrong but don't tell you what to do.
- You need findings that include evidence, recommended action, and risk framing: the kind of write-up a senior DBA would produce if you asked them to triage a specific signal.
- Your environment is sensitive about data egress: no live query content leaving the network, no SaaS pipeline, license activation works offline.
Don't reach for Arq when…
- You need real-time alerting on liveness or latency. Arq operates on snapshots; it's not a substitute for your metrics-and-alerts stack.
- You want autonomous remediation. Arq produces tickets; applying the fix is operator-controlled by design.
- Your team genuinely has senior DBA capacity covering every instance and finds those people under-utilised. Arq raises throughput; if you don't need throughput, the value won't be there.
For the full product narrative including who Arq is for and what the components do, see the Arq product page.
2. The Arq pipeline: Signals → Analyzer → Workbench
Arq is a platform with named components. Each component is independently understandable; together they form a single flow: collect evidence → diagnose → review → create ticket → validate fix.
Step 1 — Arq Signals collects evidence
Arq Signals is the open-source collector component (BSD-3-Clause). It runs against any PostgreSQL instance using a read-only role and gathers a structured set of telemetry — execution statistics, catalog views, configuration state, wait events, connection metrics — into a portable snapshot. No schema changes. No agents on the database host. No data leaves your network unless you choose to share the snapshot.
You can run Signals standalone — for in-house analysis, audit snapshotting, or feeding into your own tooling — without ever installing the rest of Arq. The snapshot format is documented; the source is on GitHub.
Step 2 — Arq Analyzer diagnoses
Arq Analyzer is the commercial diagnosis engine. It consumes a Signals snapshot, applies deterministic detection rules — vacuum lag, index gaps, query regressions, WAL retention, configuration drift, and more — and produces a list of findings. Each finding carries a problem statement, a recommended action, the cited evidence, and a calibrated severity and confidence.
Arq Insight is the LLM enrichment layer inside the Analyzer. Insight explains each rule hit and ranks recommendations — but never invents new findings, never escapes the cited evidence, and never promotes severity above what the rule emitted. The LLM is bounded by design.
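The three bounds on Insight described above can be pictured as a validation step that runs on every piece of LLM output before it is accepted. The following is a minimal sketch only, not Arq's implementation: the `RuleHit` and `Enrichment` shapes, their field names, and the `violations` helper are hypothetical, and the severity ordering (info lowest, critical highest) is an assumption.

```python
from dataclasses import dataclass

# Assumed ordering, lowest to highest. The labels come from the finding header;
# the ranking itself is an assumption made for this sketch.
SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"]


@dataclass(frozen=True)
class RuleHit:
    """Hypothetical shape of one deterministic rule result."""
    finding_id: str
    severity: str
    evidence_keys: frozenset  # snapshot facts the rule cited (the allowlist)


@dataclass(frozen=True)
class Enrichment:
    """Hypothetical shape of the LLM-written enrichment for one finding."""
    finding_id: str
    severity: str
    cited_evidence: frozenset
    explanation: str


def violations(hits, enrichment):
    """Return the bounds an enrichment breaks; an empty list means it is accepted."""
    by_id = {hit.finding_id: hit for hit in hits}
    hit = by_id.get(enrichment.finding_id)
    if hit is None:
        # The enrichment refers to a finding no rule emitted.
        return ["invented finding"]

    problems = []
    # Every cited evidence key must be one the rule itself cited.
    if not enrichment.cited_evidence <= hit.evidence_keys:
        problems.append("evidence outside allowlist")
    # Severity may stay the same or go down, never up.
    if SEVERITY_ORDER.index(enrichment.severity) > SEVERITY_ORDER.index(hit.severity):
        problems.append("severity promoted")
    return problems
```

In this sketch an enrichment is either accepted whole or rejected with named reasons, which keeps the deterministic rule output as the single source of truth for what a finding is and how severe it is.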
Step 3 — Arq Workbench presents
Arq Workbench is the operator-facing web UI. Self-hosted, single-tenant — one Workbench per organization. Review findings, manage triage status, drill into the evidence for each one, and configure ticket-system integrations.
Step 4 — Push the ticket
When a finding is ready to act on, Workbench pushes it to your existing ticket system — GitHub Issues and Jira Cloud are the launch-validated integrations. Pushes are one-way and idempotent: the same finding pushed twice doesn't create two tickets, and once a ticket exists Workbench tracks the reference but does not write back into your tracker.
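One common way to get the "pushed twice, one ticket" behaviour is a ledger keyed on the finding's identity. The sketch below illustrates that general technique under stated assumptions; it is not Workbench's code. `InMemoryTracker`, `PushLedger`, and the `TICKET-n` reference format are all hypothetical stand-ins.

```python
class InMemoryTracker:
    """Hypothetical stand-in for a ticket system (a real one would be an API call)."""

    def __init__(self):
        self.tickets = []

    def create(self, title):
        self.tickets.append(title)
        return f"TICKET-{len(self.tickets)}"


class PushLedger:
    """Records which finding already has a ticket so repeat pushes are no-ops."""

    def __init__(self, tracker):
        self._tracker = tracker
        self._refs = {}  # finding_id -> ticket reference

    def push(self, finding_id, title):
        # Idempotency: if this finding was pushed before, return the stored
        # reference instead of creating a second ticket.
        if finding_id in self._refs:
            return self._refs[finding_id]
        ref = self._tracker.create(title)
        # One-way: only the reference is stored; nothing is read back from
        # or written into the tracker after creation.
        self._refs[finding_id] = ref
        return ref


tracker = InMemoryTracker()
ledger = PushLedger(tracker)
first = ledger.push("finding-1", "Vacuum lag on a large table")
second = ledger.push("finding-1", "Vacuum lag on a large table")
```

Here `first` and `second` are the same reference and the tracker holds exactly one ticket.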
3. Safe deployment and the data-boundary model
Arq is built for environments where data egress matters. Everything below is verifiable by running Arq in your own infrastructure — none of it requires trust in marketing copy.
What stays in your network
- Signals snapshots — never sent to Elevarq.
- Analyzer output (findings.json, report.json).
- Workbench's entire datastore — operator accounts, integration credentials, audit log, finding state.
- License activation files (offline-activated).
What leaves only under your control
- Tickets you push to your tracker (GitHub Issues, Jira, etc.). The outbound destination is your operator-configured tracker, never Elevarq.
- Support bundles, if and when you choose to share one for troubleshooting — generated and reviewed locally first.
What never leaves
- PostgreSQL query text and result data — Signals collects telemetry from system views, not query content.
- Telemetry to Elevarq — no phone-home, no anonymous usage stats, no crash reports without explicit operator consent.
- License activation does not require internet at runtime.
For the full procurement / security-review surface, see the Security page — it covers authentication, audit logging, secret handling, compliance posture, and the procurement FAQ.
4. Understanding a finding
Each finding Arq emits today carries four content fields, plus a calibrated severity and confidence:
- Diagnosis · Problem
- What the signal indicates, stated plainly. References the specific table, query, or configuration value the rule fired on.
- Diagnosis · Why it matters
- The operational consequence. What gets worse if this isn't addressed, and on what timeline.
- Evidence
- Cited snapshot fields — the pg_stat_* rows, configuration values, and source-section references the rule used. Insight cannot escape the allowlist; every evidence line is traceable back to a snapshot fact.
- Recommended action
- The fix. Typically includes the relevant SQL or configuration change, with caveats (e.g. CREATE INDEX CONCURRENTLY's transaction restriction).
The header of each finding carries:
- Severity
- info / low / medium / high / critical. Calibrated by the detection rule; not editorialised by the LLM.
- Confidence
- A value between 0 and 1. Reflects how strongly the rule + evidence support the diagnosis. Lower numbers indicate findings worth a second look before acting.
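For readers who plan to consume findings in their own tooling, the four content fields and two header fields above can be modelled roughly as follows. This is a sketch under assumptions: the `Finding` class and its attribute names are hypothetical and are not the analyzer's documented wire format, and the example values are invented for illustration.

```python
from dataclasses import dataclass

# Severity labels as listed in the finding header.
SEVERITIES = ("info", "low", "medium", "high", "critical")


@dataclass
class Finding:
    """Hypothetical in-memory model of one finding (not the wire schema)."""
    problem: str             # Diagnosis · Problem
    why_it_matters: str      # Diagnosis · Why it matters
    evidence: list           # cited snapshot fields
    recommended_action: str  # the fix, with caveats
    severity: str            # set by the detection rule
    confidence: float        # between 0 and 1

    def __post_init__(self):
        # Reject values outside the ranges the documentation describes.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence out of range: {self.confidence}")


# Illustrative values only; not real analyzer output.
example = Finding(
    problem="Table has not been vacuumed recently.",
    why_it_matters="Dead tuples accumulate and queries slow down.",
    evidence=["pg_stat_user_tables row for the table"],
    recommended_action="Review autovacuum settings for the table.",
    severity="high",
    confidence=0.85,
)
```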
The richer triage shape that downstream operators expect in their tracker — implementation steps, rollback notes, when to escalate — is part of the review workflow operators add in Workbench before push, not part of the analyzer's wire output today. That work is evolving; see Arq#547 for the roadmap.
For a full rendered example, see the sample finding on the Arq product page.
5. Evaluation prerequisites
The minimum to run an evaluation against your own database:
- A PostgreSQL instance (14–18) you can register against, with a role that has pg_monitor privileges. No superuser required.
- A host that can run Docker (Compose or Kubernetes) for Workbench. The container image is pulled from a public registry at install time; after that, no outbound calls are required.
- Outbound network reachability to your ticket system of choice (GitHub or Jira), if you want to test the push path. Optional — the evaluation works without it.
- An Arq license file. Acquired through the standard packages described on the pricing page (or, until packages launch, via the evaluation contact path); activated offline.
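The role prerequisite above can be prepared with two standard PostgreSQL statements, and verified with the built-in `pg_has_role` function. The sketch below only builds the SQL text; it does not connect to a database. The helper names and the role name `arq_signals` are hypothetical, and how the role authenticates (password, certificate, and so on) is left to your own policy.

```python
def monitor_role_sql(role_name):
    """Build statements that create a login role and grant it pg_monitor."""
    # Keep the sketch safe: accept only simple identifier-like names.
    if not role_name.isidentifier():
        raise ValueError(f"unsupported role name: {role_name!r}")
    quoted = '"' + role_name + '"'
    return [
        f"CREATE ROLE {quoted} LOGIN;",
        # pg_monitor is a predefined PostgreSQL role (available since version 10).
        f"GRANT pg_monitor TO {quoted};",
    ]


def membership_check_sql():
    """Query returning true if the given role is a member of pg_monitor.

    %s is a driver-side placeholder for the role name, in the style used by
    common Python PostgreSQL drivers; adapt it to your driver.
    """
    return "SELECT pg_has_role(%s, 'pg_monitor', 'member');"


statements = monitor_role_sql("arq_signals")  # hypothetical role name
```

Run the statements as a role that is allowed to create roles, then use the membership check while connected as any role to confirm the grant before pointing Signals at the instance.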
6. A one-hour evaluation flow
The shape of an evaluation, end to end:
- Try Arq Signals on its own. Run the open-source collector against your own database for ~10 minutes. The snapshot it produces is exactly what Arq's Analyzer consumes. You confirm the safety model hands-on: read-only access, no schema changes, no data exfiltration.
- Provision Workbench. One container image, one persistent volume, one license activation. The setup wizard walks you through the rest.
- Point Arq at your snapshot. The analyzer runs against your real Signals data and emits findings. You see what the tickets look like with your data shapes, not a canned demo.
- Test a ticket push (optional). Wire up GitHub Issues or Jira credentials in Workbench and push one finding to your tracker. This confirms the integration model works end-to-end before any commercial commitment.
7. Reading and using a generated ticket
Once Workbench pushes a finding to your tracker, the ticket that arrives carries the same four content fields described in section 4 — Problem, Why it matters, Evidence, Recommended action — plus the calibrated severity and confidence in the header, plus a link back to the finding in Workbench for the full evidence trail.
Reading a ticket in practice:
- Start with the Recommended action. That's the change you'd be making. If it's a one-liner SQL change, the rest of the ticket is the support for that change.
- Read the Evidence. The cited rows / configuration values let you verify the diagnosis is grounded in your actual snapshot, not a generic best-practice claim.
- Check the Confidence. A confidence below ~0.7 is a flag that the rule + evidence didn't strongly converge; get a second opinion on the finding before acting.
- Add operator context in your tracker. Implementation steps, validation, rollback, and escalation criteria — operators currently attach these during the review workflow in Workbench (or directly in the pushed ticket) before assigning the work out.
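The confidence check in the reading order above is easy to apply in bulk if you export findings into your own tooling. A minimal sketch, assuming findings are available as dictionaries with hypothetical `id` and `confidence` keys; the 0.7 cut-off is the approximate guideline from this page, not a hard product threshold.

```python
REVIEW_THRESHOLD = 0.7  # approximate guideline from the text above


def split_by_confidence(findings, threshold=REVIEW_THRESHOLD):
    """Split findings into (ready_to_act, needs_second_look)."""
    ready, second_look = [], []
    for finding in findings:
        if finding["confidence"] < threshold:
            second_look.append(finding)
        else:
            ready.append(finding)
    return ready, second_look


# Illustrative data only.
sample = [
    {"id": "f1", "confidence": 0.92},
    {"id": "f2", "confidence": 0.55},
    {"id": "f3", "confidence": 0.70},
]
ready, second_look = split_by_confidence(sample)
```

With this sample, `f1` and `f3` land in `ready` and `f2` lands in `second_look`; a finding exactly at the threshold is treated as ready to act.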
8. Troubleshooting common setup blockers
Three patterns we see often:
- Signals collection fails with permission errors
- Almost always: the role you're connecting with doesn't have pg_monitor (or pg_monitor-equivalent) privileges. Signals doesn't degrade silently — the error log names the specific view it couldn't read. Granting pg_monitor to the role usually resolves it in one step.
- Workbench setup wizard stuck on license activation
- Either the license file format is wrong (look for trailing whitespace if the file was copy-pasted), or the activation artefact doesn't match the deployment identity. The anti-reuse ledger explicitly rejects activations from a different deployment — reach out for re-issuance if the deployment was rebuilt.
- Ticket push fails with auth errors
- The credentials configured in Workbench's Integrations page expired or were rotated upstream (GitHub PATs, Jira API tokens). Workbench surfaces the auth-failure event in its audit log with the closed classification code DISPATCH_AUTH_FAILED.
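For the license-activation blocker above, the trailing-whitespace check can be done mechanically rather than by eye. This is a generic sketch that assumes the license file is plain text; the file's actual format is not described here, and the function name is hypothetical.

```python
def trailing_whitespace_lines(text):
    """Return 1-based numbers of lines ending in a space, tab, or carriage return."""
    flagged = []
    for number, line in enumerate(text.split("\n"), start=1):
        # A stray "\r" usually means the file picked up Windows line endings
        # somewhere between download and paste.
        if line != line.rstrip(" \t\r"):
            flagged.append(number)
    return flagged


pasted = "line one \nline two\nline three\t"
flagged = trailing_whitespace_lines(pasted)
```

Here `flagged` is `[1, 3]`. If the check reports any lines, re-copy the file from its original source rather than hand-editing it, since the sketch cannot tell which whitespace the format tolerates.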