Re: Throttling WAL inserts when the standby falls behind more than the configured replica_lag_in_bytes

From: SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Date: 2021-12-29 19:04:12
Message-ID: CAHg+QDe=4pE3=7db5SUzsPFFpz97O74Mk0riJgk=aGJ44WdLtw@mail.gmail.com
Lists: pgsql-hackers

Stephen, thank you!

On Wed, Dec 29, 2021 at 5:46 AM Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> Greetings,
>
> * SATYANARAYANA NARLAPURAM (satyanarlapuram(at)gmail(dot)com) wrote:
> > On Sat, Dec 25, 2021 at 9:25 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > On Sun, Dec 26, 2021 at 10:36 AM SATYANARAYANA NARLAPURAM <
> > > satyanarlapuram(at)gmail(dot)com> wrote:
> > >>> Actually all the WAL insertions are done under a critical section
> > >>> (except few exceptions), that means if you see all the references of
> > >>> XLogInsert(), it is always called under the critical section and
> > >>> that is my main worry about hooking at XLogInsert level.
> > >>>
> > >>
> > >> Got it, understood the concern. But can we document the limitations of
> > >> the hook and let the hook take care of it? I don't expect an error to
> > >> be thrown here since we are not planning to allocate memory or make file
> > >> system calls but instead look at the shared memory state and add
> > >> delays when required.
> > >>
> > >>
> > > Yet another problem is that if we are in XLogInsert() that means we are
> > > holding the buffer locks on all the pages we have modified, so if we
> > > add a hook at that level which can make it wait then we would also block
> > > any of the read operations needed to read from those buffers. I haven't
> > > thought what could be a better way to do this but this is certainly not
> > > good.
> > >
> >
> > Yes, this is a problem. The other approach is adding a hook at
> > XLogWrite/XLogFlush? All the other backends will be waiting behind the
> > WALWriteLock. The process that is performing the write enters into a busy
> > loop with small delays until the criteria are met. Inability to process
> > the interrupts inside the critical section is a challenge in both
> > approaches. Any other thoughts?
>
> Why not have this work the exact same way sync replicas do, except that
> it's based off of some byte/time lag for some set of async replicas?
> That is, in RecordTransactionCommit(), perhaps right after the
> SyncRepWaitForLSN() call, or maybe even add this to that function? Sure
> seems like there's a lot of similarity.
>

I was thinking of achieving log governance (throttling WAL generation in
MB/sec) as well as providing RPO guarantees. With the commit-time wait you
describe, it is hard to throttle the WAL generated by a single long-running
transaction (for example COPY or SELECT INTO), because the wait only happens
at commit. However, it does meet my RPO needs. Are you in favor of adding a
hook or of making the actual change in core? IMHO, a hook allows more
creative options. I can go ahead and make a patch accordingly.

> Thanks,
>
> Stephen
>
