Re: In-placre persistance change of a relation

From:	Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To:	Jakub(dot)Wartak(at)tomtom(dot)com
Cc:	tsunakawa(dot)takay(at)fujitsu(dot)com, osumi(dot)takamichi(at)fujitsu(dot)com, sfrost(at)snowman(dot)net, masao(dot)fujii(at)oss(dot)nttdata(dot)com, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: In-placre persistance change of a relation
Date:	2021-12-22 06:13:27
Message-ID:	20211222.151327.439673660364783186.horikyota.ntt@gmail.com
Lists:	pgsql-hackers

Hello, Jakub.

At Tue, 21 Dec 2021 13:07:28 +0000, Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com> wrote in
> So what's suspicious is that 122880 -> 0 file size truncation. I've investigated WAL and it seems to contain TRUNCATE records
> after logged FPI images, so when the crash recovery would kick in it probably clears this table (while it shouldn't).

Darn. It was silly of me: I wrongly issued truncate records for the
target relation of the function (rel) instead of the relation on which
we are operating at that point (r).

> However if I perform CHECKPOINT just before crash the WAL stream contains just RUNNING_XACTS and CHECKPOINT_ONLINE
> redo records, this probably prevents truncating. I'm newbie here so please take this theory with grain of salt, it can be
> something completely different.

That is because the WAL records are inconsistent with the on-disk state.
If a crash happens after the SET LOGGED but before the next checkpoint,
recovery replays the broken WAL records. Once a checkpoint has run, the
on-disk state is persisted and the broken WAL records are no longer replayed.
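
The effect of the checkpoint can be sketched with a toy model. This is only an illustration, not PostgreSQL code: ToyWalRec, toy_recover and the LSN values are hypothetical, and the only assumption taken from the thread is that recovery replays nothing older than the last checkpoint's redo point.

```c
#include <assert.h>
#include <stddef.h>

/* Toy WAL record (hypothetical, not PostgreSQL's record format). */
typedef struct ToyWalRec
{
    unsigned long lsn;          /* position of the record in the toy WAL */
    int           is_truncate;  /* 1 = truncate the file to zero length */
} ToyWalRec;

/*
 * Replay every record at or after redo_lsn against a file that currently
 * has on_disk_size bytes, and return the size after recovery.
 */
unsigned long
toy_recover(unsigned long on_disk_size, const ToyWalRec *wal, size_t n,
            unsigned long redo_lsn)
{
    unsigned long size = on_disk_size;

    assert(wal != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (wal[i].lsn < redo_lsn)
            continue;           /* older than the redo point: not replayed */
        if (wal[i].is_truncate)
            size = 0;           /* the wrongly targeted truncate empties the file */
    }
    return size;
}
```

With a redo point before the truncate record the 122880-byte file ends up empty; with a redo point after it (a checkpoint ran in between) the file keeps its size.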

The following fix works.

--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -5478,7 +5478,7 @@ RelationChangePersistence(AlteredTableInfo *tab, char persistence,
     xl_smgr_truncate xlrec;

     xlrec.blkno = 0;
-    xlrec.rnode = rel->rd_node;
+    xlrec.rnode = r->rd_node;
     xlrec.flags = SMGR_TRUNCATE_ALL;
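
As a minimal standalone sketch of this class of bug (not the patch itself): the types and the function below are hypothetical stand-ins, and it is assumed here that the code loops over several relations belonging to one target relation. The buggy variant fills every record from the function's target (rel), the fixed variant from the loop's current relation (r).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the PostgreSQL structures. */
typedef struct ToyRel
{
    unsigned relfilenode;       /* identifies the relation's storage */
} ToyRel;

typedef struct ToyTruncateRec
{
    unsigned blkno;             /* truncate to this many blocks */
    unsigned rnode;             /* which relation the record applies to */
} ToyTruncateRec;

/*
 * Write one truncate record per entry of rels[] into out[].
 * buggy != 0 reproduces the mistake: the record names the function's
 * target relation (rel) rather than the relation being processed (r).
 */
size_t
toy_emit_truncates(const ToyRel *rel, const ToyRel *rels, size_t nrels,
                   int buggy, ToyTruncateRec *out)
{
    assert(rel != NULL && out != NULL);

    for (size_t i = 0; i < nrels; i++)
    {
        const ToyRel *r = &rels[i];

        out[i].blkno = 0;
        out[i].rnode = buggy ? rel->relfilenode : r->relfilenode;
    }
    return nrels;
}
```

In the buggy variant every record points at the target relation, so replaying those records would truncate that relation instead of the ones that were actually meant.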

I made another change in this version. Previously, btree was the only
index AM processed in the in-place manner. In this version we do that
for all AMs except GiST. Maybe if gistGetFakeLSN behaved the same
way for permanent and unlogged indexes, we could skip the index rebuild in
exchange for some extra WAL records emitted while the index is unlogged.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment	Content-Type	Size
v11-0001-In-place-table-persistence-change.patch	text/x-patch	75.3 KB
v11-0002-New-command-ALTER-TABLE-ALL-IN-TABLESPACE-SET-LO.patch	text/x-patch	11.2 KB

In response to

RE: In-placre persistance change of a relation at 2021-12-21 13:07:28 from Jakub Wartak

Responses

RE: In-placre persistance change of a relation at 2021-12-22 08:42:14 from Jakub Wartak