From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: WAL prefetch (another approach)
Date: 2021-02-04 00:40:26
Message-ID: c5d52837-6256-0556-ac8c-d6d3d558820a@enterprisedb.com
Lists: pgsql-hackers
Hi,
I did a bunch of tests on v15, mostly to assess how much the prefetching
could help. The most interesting test I did was this:
1) primary instance on a box with 16/32 cores, 64GB RAM, NVMe SSD
2) replica on small box with 4 cores, 8GB RAM, SSD RAID
3) pause replication on the replica (pg_wal_replay_pause)
4) initialize pgbench with scale 2000 (fits into RAM on the primary, while
on the replica it's about 4x RAM)
5) run 1h pgbench: pgbench -N -c 16 -j 4 -T 3600 test
6) resume replication (pg_wal_replay_resume)
7) measure how long it takes to catch up, monitor lag
This is a nicely reproducible test case, and it eliminates the influence
of network speed and so on.
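For completeness, the procedure is roughly this (just a sketch; the "test"
database and the connection details are whatever you use):

  # on the replica: pause WAL replay
  psql -c "SELECT pg_wal_replay_pause();"

  # on the primary: initialize pgbench at scale 2000 (~30GB of data)
  pgbench -i -s 2000 test

  # on the primary: the 1h pgbench run
  pgbench -N -c 16 -j 4 -T 3600 test

  # on the replica: resume replay and measure how long the catch-up takes
  psql -c "SELECT pg_wal_replay_resume();"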
Attached is a chart showing the lag with and without prefetching. In both
cases we start with ~140GB of redo lag, and the chart shows how quickly
the replica applies it. The "waves" are checkpoints: right after a
checkpoint the redo gets much faster thanks to FPIs, and then slows down
as it gets to parts without them (having to do synchronous random reads).
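(The lag here is simply the amount of received-but-not-yet-applied WAL on
the replica, so something like this is enough to watch it; just a sketch:)

  SELECT pg_size_pretty(pg_wal_lsn_diff(pg_last_wal_receive_lsn(),
                                         pg_last_wal_replay_lsn())) AS lag;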
With master, it'd take ~16000 seconds to catch up. I don't have the
exact number, because I got tired of waiting, but the estimate is likely
accurate (judging by other tests and how regular the progress is).
With WAL prefetching enabled (I bumped up the buffer to 2MB and the
prefetch limit to 500, but that was mostly an arbitrary choice), it
finishes in ~3200 seconds. This includes replication of the pgbench
initialization, which took ~200 seconds and where prefetching is mostly
useless. That's a damn pretty improvement, I guess!
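In postgresql.conf terms that's roughly the following (assuming the buffer
and the limit correspond to wal_decode_buffer_size and
maintenance_io_concurrency in this patch version; adjust if the patch
names them differently):

  recovery_prefetch = on                # enable WAL prefetching
  wal_decode_buffer_size = 2MB          # "the buffer" (GUC name assumed)
  maintenance_io_concurrency = 500      # "the prefetch limit" (GUC name assumed)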
In a way, this means the tiny replica would be able to keep up with a
much larger machine, where everything is in memory.
One comment about the patch - the postgresql.conf.sample change says:
#recovery_prefetch = on # whether to prefetch pages logged with FPW
#recovery_prefetch_fpw = off # whether to prefetch pages logged with FPW
but clearly that comment only applies to recovery_prefetch_fpw; the first
GUC enables prefetching in general.
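Presumably the intent was something along these lines (my guess at the
comment for the first GUC):

  #recovery_prefetch = on         # whether to prefetch pages referenced in the WAL
  #recovery_prefetch_fpw = off    # whether to prefetch pages logged with FPW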
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment: image/png (19.0 KB)