From: | Amul Sul <sulamul(at)gmail(dot)com> |
---|---|
To: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: CheckpointLock needed in CreateCheckPoint()? |
Date: | 2021-01-07 07:51:25 |
Message-ID: | CAAJ_b95tQ4F5g=+mrCQqmdvAKG-FEBSqXOsihqL=HZNxsicPCw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Jan 7, 2021 at 12:45 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Thu, Jan 7, 2021 at 11:32 AM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> > Snip from CreateCheckPoint():
> > --
> > /*
> > * Acquire CheckpointLock to ensure only one checkpoint happens at a time.
> > * (This is just pro forma, since in the present system structure there is
> > * only one process that is allowed to issue checkpoints at any given
> > * time.)
> > */
> > LWLockAcquire(CheckpointLock, LW_EXCLUSIVE);
> > --
> >
> > As per this comment, it seems to be that we don't really need this LW lock. We
> > could have something else instead if we are afraid of having multiple
> > checkpoints at any given time which isn't possible, btw.
>
> Looks like CheckpointLock() can also be called in standalone backends
> (RequestCheckpoint->CreateCheckPoint) along with the checkpointer
> process. Having said that, I'm not quite sure whether it can get
> called at the same time from a backend(may be with checkpoint;
> command) and the checkpointer process.
>
> /*
> * If in a standalone backend, just do it ourselves.
> */
> if (!IsPostmasterEnvironment)
> {
> /*
> * There's no point in doing slow checkpoints in a standalone backend,
> * because there's no other backends the checkpoint could disrupt.
> */
> CreateCheckPoint(flags | CHECKPOINT_IMMEDIATE);
>
> See the below comment for IsPostmasterEnvironment:
>
> /*
> * IsPostmasterEnvironment is true in a postmaster process and any postmaster
> * child process; it is false in a standalone process (bootstrap or
> * standalone backend).
>
I am not sure I understood your point completely. Do you mean there could be
multiple bootstrap or standalone backends that could exist at a time and can
perform checkpoint?
> > This CheckpointLock is acquired almost at the start of CreateCheckPoint() and
> > released at the end. The in-between code can take a while to be complete. All
> > that time interrupt will be on hold which doesn't seem to be a great idea but
> > that wasn't problematic until now. I am having trouble due to this interrupt
> > hold for a long since I was trying to add code changes where the checkpointer
> > process is suppose to respond to the system read-write state change request as
> > soon as possible[1]. That cannot be done if the interrupt is disabled
> > since read-write state change uses the global barrier mechanism to convey system
> > state changes to other running processes.
> >
> > So my question is, can we completely get rid of the CheckpointLock need in
> > CreateCheckPoint()?
> >
> > Thoughts/Comments/Suggestions?
>
> I don't think we can completely get rid of CheckpointLock in
> CreateCheckPoint given the fact that it can get called from multiple
> processes.
>
How?
> How about serving that ASRO barrier request alone for checkpointer
> process in ProcessInterrupts even though the CheckpointLock is held by
> the checkpointer process? And also while the checkpointing is
> happening, may be we should call CHECK_FOR_INTERRUPTS to see if the
> ASRO barrier has arrived. This may sound bit hacky, but just a
> thought. I'm thinking that in ProcessInterrupts we can get to know the
> checkpointer process type via MyAuxProcType, and whether the lock is
> held or not using CheckpointLock, and then we can handle the ASRO
> global barrier request.
>
I am afraid that this will easily get rejected by the community. Note that this
is a very crucial code path of the database server. There are other options
too, but don't want to propose them until clarity on the CheckpointLock
necessity.
Regards,
Amul
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Smith | 2021-01-07 07:52:21 | Re: Single transaction in the tablesync worker? |
Previous Message | Heikki Linnakangas | 2021-01-07 07:51:00 | Re: Incorrect allocation handling for cryptohash functions with OpenSSL |