Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Date: 2021-09-05 08:52:51
Message-ID: CAFiTN-t1tYG3pgTmw63J5LqpcQfcf+D7XPmMz6nXpaYJLPV5Jw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Sep 4, 2021 at 3:24 AM Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2021-09-03 14:25:10 +0530, Dilip Kumar wrote:
> > Yeah, we can surely lock the relation as described by Robert, but IMHO,
> > while creating the database we are already holding the exclusive lock on
> > the database and there is no one else allowed to be connected to the
> > database, so do we actually need to bother about the lock for the
> > correctness?
>
> The problem is that checkpointer, bgwriter, buffer reclaim don't care about
> the database of the buffer they're working on... The exclusive lock on the
> database doesn't change anything about that.

But these directly operate on the buffers and In my patch, whether we are
reading the pg_class for identifying the relfilenode or we are copying the
relation block by block we are always holding the lock on the buffer.

> Perhaps you can justify it's safe
> because there can't be any dirty buffers or such though.
>
>
> > I think we already have such a code in multiple places where we bypass
> the
> > shared buffers for copying the relation
> > e.g. index_copy_data(), heapam_relation_copy_data().
>
> That's not at all comparable. We hold an exclusive lock on the relation at
> that point, and we don't have a separate implementation of reading tuples
> from
> the table or something like that.
>

Okay, but my example was against the point Robert raised that he feels that
bypassing the shared buffer anywhere is hackish. But yeah, I agree his
point might be that even if we are using it in existing code we can not
justify it.

For moving forward I think the main open concerns we have as of now are

1. Special purpose code of scanning pg_class, so that we can solve it by
scanning the source database directory, I think Robert doesn't like this
approach because we are directly scanning to directory and bypassing the
shared buffers? But this is not any worse than what we have now right? I
mean now also we are scanning the directory directly, so only change will
be instead of copying files directly we will read file and copy block by
block.

2. Another problem is, while copying the relation we are accessing the
relation buffers but we are not holding the relation lock, but we are
already holding the buffer so I am not sure do we really have a problem
here w.r.t checkpointer, bgwriter? But if we have the problem then also we
can create the lock tag and acquire the relation lock.

3. While copying the relation whether to use the bufmgr or directly use the
smgr?

If we use the bufmgr then maybe we can avoid flushing some of the buffers
to the disk and save some I/O but in general we copy from the template
database so there might not be a lot of dirty buffers and we might not save
anything, OTOH, if we directly use the smgr for copying the relation data
we can reuse some existing code RelationCopyStorage() and the patch will be
simpler. Other than just code simplicity or IO there is also a concern by
Robert that he doesn't like to bypass the bufmgr, and that will be
applicable to the point #1 as well as #3.

Thoughts?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shinoda, Noriyoshi (PN Japan FSIP) 2021-09-05 11:42:47 RE: New predefined roles- 'pg_read/write_all_data'
Previous Message Mario Emmenlauer 2021-09-05 08:29:24 Re: dup(0) fails on Ubuntu 20.04 and macOS 10.15 with 13.0