From: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
---|---|
To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, osumi(dot)takamichi(at)fujitsu(dot)com |
Cc: | ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, tsunakawa(dot)takay(at)fujitsu(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Disable WAL logging to speed up data loading |
Date: | 2020-10-02 01:56:21 |
Message-ID: | 93135c87-eba4-f614-e124-e9ec254029fa@oss.nttdata.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2020/10/02 10:06, Kyotaro Horiguchi wrote:
> At Thu, 1 Oct 2020 08:14:42 +0000, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com> wrote in
>> Hi, Horiguchi-San and Fujii-San.
>>
>>
>> Thank you so much both of you.
>>>> the table needs to be rewriitten. One idea for that is to improve that
>>>> command so that it skips the table rewrite if wal_level=minimal.
>>>> Of course, also you can change wal_level after marking the table as
>>>> unlogged.
>>>
>>> tablecmd.c:
>> The idea is really interesting.
>> I didn't come up with getting rid of the whole copy of
>> the ALTER TABLE UNLOGGED/LOGGED commands
>> only when wal_level='minimal'.
>>
>>>> * There are two reasons for requiring a rewrite when changing
>>>> * persistence: on one hand, we need to ensure that the buffers
>>>> * belonging to each of the two relations are marked with or without
>>>> * BM_PERMANENT properly. On the other hand, since rewriting creates
>>>> * and assigns a new relfilenode, we automatically create or drop an
>>>> * init fork for the relation as appropriate.
>> Thanks for sharing concrete comments in the source code.
>>
>>> According to this comment, perhaps we can do that at least for
>>> wal_level=minimal.
>> When I compare the 2 ideas,
>> one of the benefits of this ALTER TABLE 's improvement
>> is that we can't avoid the downtime
>> while that of wal_level='none' provides an easy and faster
>> major version up via output file of pg_dumpall.
>
> The speedup has already been achieved with higher durability by
> wal_level=minimal in that case.
I was thinking the same, i.e., wal_level=minimal + wal_skip_threshold would
speed up that initial data loading.
> Or maybe you should consider using
> pg_upgrade instead. Even inducing the time to take a backup copy of
> the whole cluster, running pg_upgrade would be far faster than
> pg_dumpall then loading.
>
>> Both ideas have good points.
>> However, actually to modify ALTER TABLE's copy
>> looks far more difficult than wal_level='none' and
>> beyond my current ability.
>> So, I'd like to go forward with the direction of wal_level='none'.
>> Did you have strong objections for this direction ?
No, I have no strong objection against your trial. But I was thinking
that it's not so easy to design and implement wal_level=none.
For example, there are some functions and commands depending on
the existence of WAL, like pg_switch_wal(), PREPARE TRANSACTION
and COMMIT PREPARED. Probably you need to define how they should
work in wal_level=none, e.g., emit an error.
> For fuel(?) of the discussion, I tried a very-quick PoC for in-place
> ALTER TABLE SET LOGGED/UNLOGGED and resulted as attached. After some
> trials of several ways, I drifted to the following way after poking
> several ways.
Nice!
> 1. Flip BM_PERMANENT of active buffers
> 2. adding/removing init fork
> 3. sync files,
> 4. Flip pg_class.relpersistence.
>
> It always skips table copy in the SET UNLOGGED case,
Even in wal_level != minimal?
What happens in the standby side when SET UNLOGGED is executed without
the table rewrite in the primary? The table data should be truncated
in the standby?
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
From | Date | Subject | |
---|---|---|---|
Next Message | Ian Barwick | 2020-10-02 02:14:19 | Re: Improving connection scalability: GetSnapshotData() |
Previous Message | Fujii Masao | 2020-10-02 01:21:02 | Re: New statistics for tuning WAL buffer size |