
Re: Use fadvise in wal replay

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc:	Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Kirill Reshke <reshke(at)double(dot)cloud>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Use fadvise in wal replay
Date:	2022-06-21 13:12:17
Message-ID:	CAA4eK1LabUpGjJmJDA9ojgq0iymgLUHc7TwQFxZyXfLzCmgKwQ@mail.gmail.com
Lists:	pgsql-hackers

On Tue, Jun 21, 2022 at 5:41 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Tue, Jun 21, 2022 at 4:55 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Tue, Jun 21, 2022 at 3:18 PM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
> > >
> > > > On 21 Jun 2022, at 12:35, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > I wonder if the newly introduced "recovery_prefetch" [1] for PG-15 can
> > > > help your case?
> > >
> > > AFAICS recovery_prefetch tries to prefetch the main fork, but does not try to prefetch WAL itself before reading it. Kirill is trying to solve the problem of reading WAL segments that are out of the OS page cache.
> > >
> >
> > Okay, but normally the WAL written by walreceiver is read by the
> > startup process soon after it's written as indicated in code comments
> > (get_sync_bit()). So, what is causing the delay here which makes the
> > startup process perform physical reads?
>
> That's not always true. If there's a huge apply lag and/or
> restartpoint is infrequent/frequent or there are many reads on the
> standby - in all of these cases the OS cache can evict the WAL,
> causing the startup process to hit the disk for WAL reading.
>

It is possible that, for one or more of these reasons, the startup process
has to physically read the WAL. Still, I think it is better to find out what
is actually going on for the OP. AFAICS, there is no mention of any other kind
of reads on the problematic standby. The analysis shared in the initial email
attributes the replication lag to disk reads, but there doesn't seem to be a
clear theory as to why the OP is seeing those disk reads.
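
For readers following the thread, the kind of hint the subject line refers to
can be sketched roughly as below. This is only an illustration and not the
proposed patch: prefetch_file_range is a hypothetical helper name, it opens
the file itself for simplicity, and it assumes a platform that provides
posix_fadvise() (e.g. Linux). The call merely asks the kernel to start reading
the range into the page cache; it does not guarantee the data stays cached.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): hint the kernel that the byte
 * range starting at `offset` and spanning `len` bytes of the file at `path`
 * will be read soon. A `len` of 0 means "through end of file".
 *
 * Returns 0 on success, otherwise an errno-style error number.
 */
int
prefetch_file_range(const char *path, off_t offset, off_t len)
{
    int fd;
    int rc;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    /*
     * posix_fadvise() returns the error number directly instead of
     * setting errno. POSIX_FADV_WILLNEED requests asynchronous readahead.
     */
    rc = posix_fadvise(fd, offset, len, POSIX_FADV_WILLNEED);

    close(fd);
    return rc;
}
```

A caller would pass the path of the next WAL segment it expects to read, for
example prefetch_file_range(segment_path, 0, 0), where segment_path is a
placeholder. Whether such a hint actually helps depends on why the segment
fell out of the page cache in the first place, which is the open question
above.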

--
With Regards,
Amit Kapila.

In response to

Re: Use fadvise in wal replay at 2022-06-21 12:11:12 from Bharath Rupireddy
