From: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
---|---|
To: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
Cc: | Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: use CREATE DATABASE STRATEGY = FILE_COPY in pg_upgrade |
Date: | 2024-06-07 09:10:25 |
Message-ID: | CAEze2WicbrOx6JWy0hK9yySm3gHFqbYE4qcDB+vaYzDCkpnL1Q@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, 7 Jun 2024 at 10:28, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Fri, Jun 7, 2024 at 11:57 AM Matthias van de Meent
> <boekewurm+postgres(at)gmail(dot)com> wrote:
>>
>> On Fri, 7 Jun 2024 at 07:18, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>>>
>>> On Wed, Jun 5, 2024 at 10:59 PM Matthias van de Meent
>>> <boekewurm+postgres(at)gmail(dot)com> wrote:
>>>
>>> I agree with you that we introduced the WAL_LOG strategy to avoid
>>> these force checkpoints. However, in binary upgrade cases where no
>>> operations are happening in the system, the FILE_COPY strategy should
>>> be faster.
>>
>> While you would be correct if there were no operations happening in
>> the system, during binary upgrade we're still actively modifying
>> catalogs; and this is done with potentially many concurrent jobs. I
>> think it's not unlikely that this would impact performance.
>
> Maybe, but generally, long checkpoints are problematic because they
> involve a lot of I/O, which hampers overall system performance.
> However, in the case of a binary upgrade, the concurrent operations
> are only performing a schema restore, not a real data restore.
> Therefore, it shouldn't have a significant impact, and the checkpoints
> should also not do a lot of I/O during binary upgrade, right?
My primary concern isn't the IO, but the O(shared_buffers) work that we
have to do during each checkpoint. As I mentioned upthread, it is
quite possible that the new cluster is already set up with a good
fraction of the old system's shared_buffers configured. Every
checkpoint has to scan all those buffers, which IMV can get (much)
more expensive than the IO overhead caused by the WAL_LOG strategy. It
may be a baseless fear as I haven't done the performance benchmarks
for this, but I wouldn't be surprised if shared_buffers=8GB would
measurably impact the upgrade performance in the current patch (vs the
default 128MB).
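As a back-of-envelope illustration of the scan size (arithmetic only, not a benchmark), here is a minimal sketch assuming the default 8kB block size. The helper name `buffer_count` is made up for this example and is not a PostgreSQL function.

```python
# Rough arithmetic only: how many buffers one checkpoint has to scan.
# Assumption: default PostgreSQL block size (BLCKSZ) of 8192 bytes.
BLCKSZ = 8192

MB = 1024 * 1024
GB = 1024 * MB


def buffer_count(shared_buffers_bytes: int, block_size: int = BLCKSZ) -> int:
    """Hypothetical helper: number of shared buffers for a given setting."""
    return shared_buffers_bytes // block_size


default_buffers = buffer_count(128 * MB)  # shared_buffers = 128MB
large_buffers = buffer_count(8 * GB)      # shared_buffers = 8GB

# How many times more buffers each checkpoint scans at 8GB vs 128MB.
scan_ratio = large_buffers // default_buffers

print(default_buffers, large_buffers, scan_ratio)
```

This says nothing about the per-buffer cost, only that every checkpoint at 8GB has 64 times as many buffers to look at as with the default.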
I'll note that the documentation for upgrading with pg_upgrade has the
step for updating postgresql.conf / postgresql.auto.conf only after
pg_upgrade has run already, but that may not be how it's actually
used: after all, we don't have full control in this process, the user
is the one who provides the new cluster with initdb.
>> If such a change were implemented (i.e. no checkpoints for FILE_COPY
>> in binary upgrade, with a single manual checkpoint after restoring
>> template1 in create_new_objects) I think most of my concerns with this
>> patch would be alleviated.
>
> Yeah, I think that's a valid point. The second checkpoint is to ensure
> that the XLOG_DBASE_CREATE_FILE_COPY never gets replayed. However, for
> binary upgrades, we don't need that guarantee because a checkpoint
> will be performed during shutdown at the end of the upgrade anyway.
Indeed.
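To put a rough count on the change discussed above, here is a small sketch of how many explicit checkpoints each approach would issue. It assumes FILE_COPY requests one checkpoint before and one after the copy for every created database, and it ignores timed checkpoints and the final shutdown checkpoint. The function and parameter names are hypothetical, not taken from the patch.

```python
def checkpoints_per_upgrade(n_databases: int, skip_in_binary_upgrade: bool) -> int:
    """Hypothetical model of explicit checkpoints during pg_upgrade's restore.

    Assumption: CREATE DATABASE ... STRATEGY = FILE_COPY requests two
    checkpoints per database (before and after the copy). With the proposed
    change, binary upgrade skips both and relies on a single manual
    checkpoint after template1 is restored.
    """
    if n_databases < 0:
        raise ValueError("n_databases must be non-negative")
    if skip_in_binary_upgrade:
        return 1
    return 2 * n_databases


# Example: a cluster with 1000 databases.
current_patch = checkpoints_per_upgrade(1000, skip_in_binary_upgrade=False)
proposed = checkpoints_per_upgrade(1000, skip_in_binary_upgrade=True)
print(current_patch, proposed)
```

Each of those checkpoints scans all of shared_buffers, so the difference grows with both the database count and the shared_buffers setting.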
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)