From: | Klaus Darilion <klaus(dot)mailinglists(at)pernau(dot)at> |
---|---|
To: | Michael Lewis <mlewis(at)entrata(dot)com> |
Cc: | PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Logical replication lag in seconds |
Date: | 2020-03-04 17:33:50 |
Message-ID: | 264f1acc-f79b-961f-72b4-dda3ea6c9834@pernau.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
For the records - with a simple script I hacked a solution which is
purely based on the server.
1. Create a table to track the timestamp of an lsn:
CREATE TABLE lsn2date(
lsn pg_lsn PRIMARY KEY,
seen timestamp NOT NULL DEFAULT NOW()
);
CREATE ROLE replication_lag_user WITH LOGIN PASSWORD 'xxx';
GRANT ALL ON TABLE lsn2date TO replication_lag_user;
2. Create a script which populates the table:
# cat /etc/systemd/system/calculate_logical_replication_lag.service
[Unit]
Description=Start and auto restart service
[Install]
WantedBy=multi-user.target
[Service]
ExecStart=/usr/bin/php /path/to/calculate_logical_replication_lag.php
Restart=always
RestartSec=10
# cat calculate_logical_replication_lag.php
<?php
$dbuser="replication_lag_user";
$dbpass="xxx";
if (!$dbconn = pg_pconnect('host=127.0.0.1 dbname=mydb user='.$dbuser.'
password='.$dbpass)) {
print "Sorry, database connection failed";
exit;
}
$accuracy = 10; // in seconds
while (1) {
$dbq = pg_query("INSERT INTO lsn2date (lsn) VALUES
(pg_current_wal_lsn())");
if ($dbq === FALSE) {
mylog(LOG_ERROR, "SQL query error: ".pg_last_error()."\n");
exit(1);
}
$dbq = pg_query("DELETE FROM lsn2date WHERE lsn < (".
"SELECT lsn FROM lsn2date WHERE lsn < (".
"SELECT confirmed_flush_lsn FROM
pg_replication_slots ORDER BY confirmed_flush_lsn ASC LIMIT 1".
") ORDER BY lsn DESC LIMIT 1".
")"
);
if ($dbq === FALSE) {
mylog(LOG_ERROR, "SQL query error: ".pg_last_error()."\n");
exit(1);
}
sleep($accuracy);
}
3. Get the lag, using a function which compares the lsn of the
replication_slots with the lsn/timestamp in the lsn2date table:
CREATE OR REPLACE FUNCTION get_replication_lag() RETURNS TABLE
(subscriber name, lag bigint) AS
$BODY$
DECLARE
subscriber name;
BEGIN
FOR subscriber IN
SELECT slot_name FROM pg_replication_slots
LOOP
RETURN QUERY SELECT slot_name, EXTRACT(EPOCH FROM
NOW()-seen)::bigint lag from lsn2date,pg_replication_slots WHERE
slot_name=subscriber AND lsn < confirmed_flush_lsn ORDER BY lsn DESC
LIMIT 1;
END LOOP;
RETURN;
END
$BODY$
LANGUAGE plpgsql;
# SELECT * FROM get_replication_lag() ;
subscriber | lag
------------+-----
reg_sjc1 | 0
reg_ffm1 | 0
reg_tst2 | 0
reg_mia1 | 0
reg_jbg1 | 0
reg_ams1 | 0
reg_syy1 | 0
reg_wie1 | 0
reg_hkg1 | 0
reg_gnf1 | 0
reg_tor1 | 0
reg_sea1 | 0
reg_chi1 | 0
reg_dfw1 | 0
reg_sgp1 | 0
reg_lhr1 | 0
regards
Klaus
From | Date | Subject | |
---|---|---|---|
Next Message | postggen2020 s | 2020-03-04 18:38:55 | Suggestion to reduce COPY command output to csv file |
Previous Message | Adrian Klaver | 2020-03-04 16:19:41 | Re: pg_dump and public schema |