Backup and restore

A self-hosted Piprio deployment ships with a backup and restore toolkit aimed at one job: bringing a deployment back from nothing. This page explains what a backup actually captures, how a restore runs, how an operator rehearses recovery, and the limits a buyer should plan around before signing off. The limits matter as much as the capabilities, so they are stated plainly rather than buried.

What a backup captures

One command produces a complete backup bundle:

bash deploy/scripts/backup.sh

The result is a timestamped directory under the deployment's backup folder, named for the moment it was taken in UTC:

deploy/backups/piprio_20260421T020000Z/
├── MANIFEST.txt         # timestamp, version, component sizes
├── database.sql.gz      # gzipped Postgres dump (schema + data)
├── minio/               # object-storage mirror (on-prem only)
└── config.tar.gz        # env file, secrets, certificates

A bundle captures three things plus a description of itself.

The database dump is a single gzipped Postgres export of the full database, schema and data together. Because every customer's data lives in its own tenant schema, one dump carries every tenant in the deployment. The dump records the schema exactly as it stands at backup time, so a restore reproduces that state without replaying the migration history. Migrations are tracked in a dedicated table inside the database, which is part of what the dump preserves, so the restored deployment knows which schema version it is on.

The object-storage mirror is a copy of the document bucket, where uploaded files and generated exports live. This part runs only on the on-premise deployment, which bundles its own object storage. A deployment that points at an external S3 provider gets no mirror, because backing up that bucket is the provider's job. The backup script detects this and skips the step.

The config archive is a compressed copy of the env file, the secrets directory, and the certificates directory. That archive holds the signing key, the credential encryption key, the database password, the object-storage keys, and any connector credentials kept in the env file. A restore is useless without these, because encrypted connector credentials in the restored database cannot be decrypted without the original encryption key. This is also why the bundle is sensitive, covered under Encryption of backups below.

The bundle's summary file (MANIFEST.txt) is a plain-text record of timestamp, host, deployment version, and the size of each component. It records what a bundle should contain. It is a description, not a guarantee. There is no checksum or cryptographic signature in this summary, so a corrupted or truncated component is not detected by inspecting it.
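
Because the summary carries no checksum, an operator who wants tamper and truncation detection has to add it outside the toolkit. The sketch below shows one way to do that: write a checksum list into the bundle right after a backup, then verify it before a restore. The function names and the SHA256SUMS file name are hypothetical, not part of Piprio, and the sketch assumes GNU coreutils and findutils (sha256sum, sort -z, xargs -0 -r) on the host.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the Piprio toolkit.
# Assumes GNU sha256sum, sort -z, and xargs -0 -r.

# Write SHA256SUMS inside the bundle, covering every regular file except
# the checksum list itself.
bundle_checksum_write() {
  local bundle="$1" sums
  # Collect the hashes first, so the output file does not exist yet
  # while find is walking the directory.
  sums="$(cd "$bundle" && find . -type f ! -name SHA256SUMS -print0 \
            | sort -z | xargs -0 -r sha256sum)" || return 1
  printf '%s\n' "$sums" > "$bundle/SHA256SUMS"
}

# Return 0 only if every listed file is present and matches its hash.
# Returns 2 when the bundle has no checksum list at all.
bundle_checksum_verify() {
  local bundle="$1"
  [ -f "$bundle/SHA256SUMS" ] || return 2
  (cd "$bundle" && sha256sum --quiet -c SHA256SUMS)
}
```

Called as bundle_checksum_write deploy/backups/piprio_20260421T020000Z after a backup and bundle_checksum_verify on the same path before a restore, this catches a changed or missing component. It does not prove the dump restores cleanly, and a checksum list stored next to the data protects against corruption, not against someone who can rewrite both.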

Snapshot strategy

Every backup is a full backup. There are no incremental or differential bundles. Each run dumps the entire database and re-mirrors the entire bucket, so each bundle stands alone and a restore never needs to chain several together. The tradeoff is storage and run time that scale with the size of the deployment, not with how much changed since the last run.

Bundles older than 30 days are pruned automatically at the end of each run, so a daily schedule keeps roughly a month of history without manual cleanup. To write bundles somewhere other than the default backup folder, pass a root directory as the first argument to backup.sh.

A daily snapshot is a cron entry away:

# Daily at 02:00, logging to a file
0 2 * * * bash /path/to/deploy/scripts/backup.sh >> /var/log/piprio-backup.log 2>&1

There is no point-in-time recovery and no continuous transaction-log archiving. Recovery lands on the most recent completed snapshot, not on an arbitrary second. For a buyer, that sets the recovery point objective: the worst-case data loss is the time since the last successful backup. A team that cannot tolerate a day of loss should snapshot more often than daily. Continuous archiving is out of scope for the first release. A deployment that needs it would layer a Postgres base backup plus transaction-log shipping on top of the bundle, outside this toolkit.
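
Since the recovery point is only as recent as the last snapshot, it can help to alert when the newest bundle is older than the schedule allows. The sketch below reads the UTC timestamp from the bundle directory name and compares it to a maximum age. The function names are hypothetical, the script is not part of the toolkit, and it assumes GNU date (for date -d).

```shell
#!/usr/bin/env bash
# Hypothetical freshness check, not part of the Piprio toolkit.
# Assumes GNU date.

# Print the Unix epoch encoded in a bundle name such as
# piprio_20260421T020000Z. Fails on names that do not match.
bundle_epoch() {
  local name ts iso
  name="$(basename "$1")"
  ts="${name#piprio_}"
  # Reject anything that is not exactly YYYYMMDDTHHMMSSZ.
  printf '%s\n' "$ts" | grep -Eq '^[0-9]{8}T[0-9]{6}Z$' || return 1
  iso="$(printf '%s\n' "$ts" \
    | sed -E 's/^(.{4})(..)(..)T(..)(..)(..)Z$/\1-\2-\3 \4:\5:\6/')"
  date -u -d "$iso" +%s
}

# Succeed if the newest bundle under root ($1) is at most $2 hours old.
# An optional $3 overrides the current time (epoch seconds) for testing.
newest_bundle_is_fresh() {
  local root="$1" max_hours="$2" now="${3:-$(date -u +%s)}"
  local newest=0 dir epoch
  for dir in "$root"/piprio_*/; do
    epoch="$(bundle_epoch "$dir")" || continue
    if [ "$epoch" -gt "$newest" ]; then newest="$epoch"; fi
  done
  [ "$newest" -gt 0 ] && [ $((now - newest)) -le $((max_hours * 3600)) ]
}
```

For a daily schedule, a monitoring job could run newest_bundle_is_fresh deploy/backups 26 and raise an alert on a non-zero exit. The check trusts the directory name alone, so it shows that a run started, not that it completed.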

Restore procedure

A restore reads a bundle and rebuilds the deployment from it. The operation is destructive on purpose: it replaces all current data with the contents of the bundle.

To see what is available, run the restore script with no argument:

bash deploy/scripts/restore.sh

To restore a specific bundle, pass its directory:

bash deploy/scripts/restore.sh deploy/backups/piprio_20260421T020000Z/

The script runs a fixed sequence.

It first validates the bundle by confirming that the summary file and the database dump are present, and refuses to continue if either is missing. This is an existence check, not an integrity check: it confirms the pieces are there, not that they are intact.

It then prints the bundle summary and asks for confirmation. The operator must type the deployment domain rather than a generic yes, a deliberate speed bump because the next step destroys data.

After confirmation it stops the application services while leaving the database, cache, and object storage running, drops and recreates the database, and loads the dump back in. If the bundle carries an object-storage mirror and the deployment runs its own object storage, the bucket is mirrored back.

The config archive is handled with care. Secrets and certificates are restored only after a second prompt. The env file is left untouched unless an explicit override flag is passed, and a diff is shown first so the operator can see what would change. The override flag goes at the end of the restore command:

bash deploy/scripts/restore.sh deploy/backups/piprio_20260421T020000Z/ --force-config

Once everything is in place the script restarts all services and runs the health check, so the operator sees pass or fail per service before declaring the restore done.
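
The restore script's own validation only checks that files exist, so an operator may want a stricter pre-flight before running a destructive restore. The sketch below repeats the existence check and adds a gzip stream test on the dump and a listing test on the config archive. The function name is hypothetical and the script is not part of restore.sh. It relies only on standard gzip -t and tar -tzf behavior.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight, not part of restore.sh.

# Return 0 only if the required files exist and the compressed
# components can be read end to end.
bundle_preflight() {
  local bundle="${1%/}"
  [ -f "$bundle/MANIFEST.txt" ] \
    || { echo "missing MANIFEST.txt" >&2; return 1; }
  [ -f "$bundle/database.sql.gz" ] \
    || { echo "missing database.sql.gz" >&2; return 1; }
  # gzip -t reads the whole stream and checks its CRC, which catches
  # truncation and bit-level corruption.
  gzip -t "$bundle/database.sql.gz" \
    || { echo "database.sql.gz failed the gzip test" >&2; return 1; }
  # The config archive is checked only when the bundle has one.
  if [ -f "$bundle/config.tar.gz" ]; then
    tar -tzf "$bundle/config.tar.gz" > /dev/null \
      || { echo "config.tar.gz is unreadable" >&2; return 1; }
  fi
}
```

Running bundle_preflight deploy/backups/piprio_20260421T020000Z/ before restore.sh turns a truncated dump into an early failure instead of a half-loaded database. A passing gzip test shows the compressed stream is intact, not that the SQL inside loads without error. Only a real restore shows that.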

DR drills

The toolkit supports rehearsal, not a formal recovery program. A drill is the same restore an operator would run in a real outage, pointed at a non-production deployment so it does no harm.

The rehearsal is straightforward. Stand up a separate deployment, copy a recent bundle to it, run the restore against it, then run the health check:

bash deploy/scripts/health-check.sh

The health check confirms each service is running, that the database and cache respond, and that the application answers its readiness probes, and it reports the database size and free disk. A drill that ends with a clean health check has proven two things at once: that the bundle is restorable, and that the operator knows the steps. Because the bundle summary carries no integrity check, a periodic drill is the only reliable way to catch a bundle that looks complete on disk but does not restore. A buyer evaluating recovery readiness should treat the drill cadence, not the existence of the script, as the real control.
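
If drill cadence is the control, each drill needs a record. The sketch below runs the health check after a drill restore and appends a dated pass or fail line to a log. The function name, the HEALTH_CHECK override, and the log format are all hypothetical. It also assumes that health-check.sh exits non-zero when any check fails, which this page does not state, so confirm that before relying on the result.

```shell
#!/usr/bin/env bash
# Hypothetical drill recorder, not part of the Piprio toolkit.
# Run it on the non-production deployment after the restore finishes.
# Assumption: the health check exits non-zero when any check fails.

# Usage: record_drill <bundle-dir> <log-file>
record_drill() {
  local bundle="$1" log="$2"
  local health="${HEALTH_CHECK:-deploy/scripts/health-check.sh}"
  local result=FAIL
  if bash "$health"; then result=PASS; fi
  # One tab-separated line per drill: UTC time, bundle name, outcome.
  printf '%s\t%s\t%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(basename "$bundle")" "$result" >> "$log"
  [ "$result" = PASS ]
}
```

A reviewer can then read the log to see how often drills run and which bundles were proven restorable.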

Encryption of backups

Backups are not encrypted at rest by the toolkit. The backup script writes the bundle to disk in the clear, including the config archive that holds the signing key, the credential encryption key, the database password, and the object-storage keys. Encrypting a bundle is the operator's responsibility, and it is not optional for anything leaving the host.

Three rules follow from that, and a procurement reviewer should confirm all three are in the runbook.

A bundle holds the same secrets as the deployment's secrets directory, so it earns the same controls: tight file permissions, restricted hosts, and audited access.

A bundle leaving the host must be encrypted first. Symmetric encryption before upload is enough:

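# The bundle is a directory, so pack it into a single file first
tar -czf piprio_20260421T020000Z.tar.gz -C deploy/backups piprio_20260421T020000Z
gpg --symmetric --cipher-algo AES256 piprio_20260421T020000Z.tar.gz
# or
age -p -o bundle.age piprio_20260421T020000Z.tar.gz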

A bundle never belongs in version control or attached to a ticket, where it would outlive its access controls.

Offsite replication is the customer's responsibility and runs on the customer's own infrastructure. The toolkit writes bundles locally and prunes old ones. Copying them to durable storage in another location, with whatever encryption and retention policy the customer requires, is a backup job pointed at the bundle directory, not a feature of Piprio. A deployment with no offsite copy is one disk failure away from losing both the data and every backup of it.