streaming slaves can't keep up?

From: Ben Chobot <bench(at)silentmedia(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: streaming slaves can't keep up?
Date: 2020-03-31 17:52:27
Message-ID: ba006cb5-407e-cb3f-c9f3-48978d2d8d19@silentmedia.com
Lists: pgsql-general

We have a few busy 9.5 dbs, each streaming to a few slaves. The
master and slaves are identical hardware and are getting no small amount
of load - about 45k transactions/s on the master and ~36k transactions/s
on the slave actively serving clients. During these busy times, queries
are all fairly responsive (mostly well under 1s) on both master and
slave, and according to pg_stat_replication, replication is mostly good
- the flush_location for all slaves seems quite up to date. But the
replay_location on those busy slaves falls behind by quite a lot (over
an hour behind), and this is a problem. On the slaves which aren't
taking client load, their replay_location remains close to the
flush_location.
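
To put a number on the gap described above, the two columns can be
compared as byte positions. This is only a minimal sketch: it assumes the
values arrive as the usual "hi/lo" hexadecimal LSN strings that
pg_stat_replication reports, and the function names and sample values
are made up for illustration. The server can do the same subtraction
itself with pg_xlog_location_diff().

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN string such as '2A/10000000' to an absolute byte position.

    The part before the slash is the high 32 bits and the part after it
    is the low 32 bits, both written in hexadecimal.
    """
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def replay_lag_bytes(flush_location: str, replay_location: str) -> int:
    """Bytes of WAL flushed to disk on the slave but not yet replayed."""
    return lsn_to_bytes(flush_location) - lsn_to_bytes(replay_location)


if __name__ == "__main__":
    # Hypothetical values, not taken from the systems described here.
    lag = replay_lag_bytes("2A/10000000", "29/F0000000")
    print(f"replay is {lag / 1024 ** 2:.0f} MiB behind flush")
```

A byte count like this says how much WAL is waiting to be replayed, not
how old it is, so it complements rather than replaces a time-based
measurement of the lag.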

Does it make sense that this is happening because all those queries,
which are quick but quite numerous, are slowing down the replay? If so,
my hope is that we can simply throw more slaves at the problem, reducing
the number of queries per slave and therefore allowing replay to get
blocked less often. But if that theory is nonsense, I'm going to need a
different solution.
