From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, kuroda(dot)hayato(at)fujitsu(dot)com, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Subject: | Re: Assertion failure in SnapBuildInitialSnapshot() |
Date: | 2023-01-31 01:19:13 |
Message-ID: | CAD21AoDzpQa+ptq03S2G-0o0=2LPpTp2qg+YFcYRxdbYgBc6rw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Jan 30, 2023 at 8:24 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Dec 8, 2022 at 8:17 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Mon, Nov 21, 2022 at 4:31 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > One idea to fix this issue is that in
> > ReplicationSlotsComputeRequiredXmin(), we compute the minimum xmin
> > while holding both ProcArrayLock and ReplicationSlotControlLock, and
> > release only ReplicationSlotsControlLock before updating the
> > replication_slot_xmin. I'm concerned it will increase the contention
> > on ProcArrayLock but I've attached the patch for discussion.
> >
>
> But what kind of workload are you worried about? This will be called
> while processing XLOG_RUNNING_XACTS to update
> procArray->replication_slot_xmin/procArray->replication_slot_catalog_xmin
> only when required. So, if we want we can test some concurrent
> workloads along with walsenders doing the decoding to check if it
> impacts performance.
>
I was slightly concerned about holding ProcArrayLock while iterating
over replication slots especially when there are many replication
slots in the system. But you're right; we need it only when processing
XLOG_RUNINNG_XACTS and it's not frequent. So it doesn't introduce
visible overhead or negligible overhead.
> What other way we can fix this? Do you think we can try to avoid
> retreating xmin values in ProcArraySetReplicationSlotXmin() to avoid
> this problem? Personally, I think taking the lock as proposed by your
> patch is a better idea.
Agreed.
> BTW, this problem seems to be only logical
> replication specific, so if we are too worried then we can change this
> locking only for logical replication.
Yes, but I agree that there won't be a big overhead by this fix.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Amin | 2023-01-31 02:01:08 | Scan buffercache for a table |
Previous Message | Anton A. Melnikov | 2023-01-31 01:09:58 | Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt" |