aca.github.io

WARNING: I’m not a database expert, and I’ve never run a database seriously in production.

Requirements

Postgres is the de-facto standard for relational databases, but I’ve found that running it as a cluster is surprisingly hard to get right. There are too many “postgres replication” or “postgres cluster” options, most either too complex to configure or unclear on license and long-term maintenance. I started with k8s + CloudNativePG, but I’ve been burned enough to not trust the “kubernetes operator” thing.

I just wanted something to back my personal tools. The setup I wanted was:

may run on free-tier VMs (~1c1g), mixed with different arch(amd64/arm64)
HA with real-time replication
few extra components for replication and clustering
work over a tailnet (tailscale)
simple maintenance, simple architecture

In the end, I went with AWS’s pgactive, and it’s worked out pretty well:

runs on NixOS with containers
active/active async replication in a mesh topology
nothing extra to run: pgactive is just a postgres extension, no etcd, k8s …
each app runs next to postgres and connects over localhost, no proxy
pgactive is built on logical replication, which has limitations.

Setup

First, postgres itself. All my servers run NixOS, so that’s where it runs.

NixOS has a services.postgresql module and I started there, but tying postgres to the system’s nixpkgs caused problems. The binary cache misbehaves on updates, pinning a version is painful once you depend on extensions, and you can’t easily run more than one instance. Containers were simpler to maintain.

I built the image with a nix for reproducibility and installed extensions (pgvector, pg_rational, timescaledb, pgactive) and settings. With that, each node only needs this system config:

virtualisation.oci-containers = {
  containers.pg = {
    image = "ghcr.io/aca/containers/postgres-18:latest-arm64";
    extraOptions = [ "--network=host" ];
    volumes = [ "/var/lib/pg:/var/lib/postgresql/data" ];
  };
};

pgactive

With a postgres container running on two nodes (oci-xnzm-a1, oci-impx-a1), start with the seed node (oci-xnzm-a1):

ssh root@oci-xnzm-a1 'podman exec -i pg psql -v ON_ERROR_STOP=1 -U postgres -d postgres' <<'SQL'
CREATE EXTENSION IF NOT EXISTS pgactive;

CREATE SERVER "oci-xnzm-a1" FOREIGN DATA WRAPPER pgactive_fdw
  OPTIONS (host 'oci-xnzm-a1', port '5432', dbname 'postgres');
CREATE USER MAPPING FOR postgres SERVER "oci-xnzm-a1" OPTIONS (user 'postgres');

CREATE SERVER "oci-impx-a1" FOREIGN DATA WRAPPER pgactive_fdw
  OPTIONS (host 'oci-impx-a1', port '5432', dbname 'postgres');
CREATE USER MAPPING FOR postgres SERVER "oci-impx-a1" OPTIONS (user 'postgres');

SELECT pgactive.pgactive_create_group(
  node_name := 'pg-oci-xnzm-a1',
  node_dsn  := 'user_mapping=postgres pgactive_foreign_server=oci-xnzm-a1'
);
SELECT pgactive.pgactive_wait_for_node_ready();
SELECT node_name, node_status FROM pgactive.pgactive_nodes ORDER BY node_name;
SQL

Then, on oci-impx-a1, join the seed (oci-xnzm-a1) with join_using_dsn. This pulls the whole database over, including any data that already exists at join time.

ssh root@oci-impx-a1 'podman exec -i pg psql -v ON_ERROR_STOP=1 -U postgres -d postgres' <<'SQL'
CREATE EXTENSION IF NOT EXISTS pgactive;

CREATE SERVER "oci-xnzm-a1" FOREIGN DATA WRAPPER pgactive_fdw
  OPTIONS (host 'oci-xnzm-a1', port '5432', dbname 'postgres');
CREATE USER MAPPING FOR postgres SERVER "oci-xnzm-a1" OPTIONS (user 'postgres');

CREATE SERVER "oci-impx-a1" FOREIGN DATA WRAPPER pgactive_fdw
  OPTIONS (host 'oci-impx-a1', port '5432', dbname 'postgres');
CREATE USER MAPPING FOR postgres SERVER "oci-impx-a1" OPTIONS (user 'postgres');

SELECT pgactive.pgactive_join_group(
  node_name      := 'pg-oci-impx-a1',
  node_dsn       := 'user_mapping=postgres pgactive_foreign_server=oci-impx-a1',
  join_using_dsn := 'user_mapping=postgres pgactive_foreign_server=oci-xnzm-a1'
);
SELECT pgactive.pgactive_wait_for_node_ready();
SELECT node_name, node_status FROM pgactive.pgactive_nodes ORDER BY node_name;
SQL

verify replication

Write to one node, read it back from the other, both directions:

$ psql "postgres://postgres@oci-impx-a1:5432/postgres" -c "CREATE TABLE xxx (id SERIAL PRIMARY KEY, name TEXT); INSERT INTO xxx (name) VALUES ('alice'); SELECT * FROM xxx;"
CREATE TABLE
INSERT 0 1
 id | name
----+-------
  1 | alice
(1 row)

$ psql "postgres://postgres@oci-xnzm-a1:5432/postgres" -c 'select * from xxx'
 id | name
----+-------
  1 | alice
(1 row)

$ psql "postgres://postgres@oci-xnzm-a1:5432/postgres" -c "INSERT INTO xxx (name) VALUES ('john');"
INSERT 0 1

$ psql "postgres://postgres@oci-impx-a1:5432/postgres" -c 'select * from xxx'
 id | name
----+-------
  1 | alice
  2 | john

And that’s it! I haven’t tested every failure scenario, but it’s held up fine so far. Worth a look if your needs are similar.