Re: In-placre persistance change of a relation

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: hlinnaka(at)iki(dot)fi
Cc: michael(at)paquier(dot)xyz, nathandbossart(at)gmail(dot)com, postgres(at)jeltef(dot)nl, smithpb2250(at)gmail(dot)com, vignesh21(at)gmail(dot)com, jakub(dot)wartak(at)enterprisedb(dot)com, stark(dot)cfm(at)gmail(dot)com, barwick(at)gmail(dot)com, jchampion(at)timescale(dot)com, pryzby(at)telsasoft(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, rjuju123(at)gmail(dot)com, jakub(dot)wartak(at)tomtom(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, bharath(dot)rupireddyforpostgres(at)gmail(dot)com
Subject: Re: In-placre persistance change of a relation
Date: 2024-09-05 06:34:03
Message-ID: 20240905.153403.524835779720783047.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello.

Thank you for the response.

At Sun, 1 Sep 2024 22:15:00 +0300, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote in
> On 31/08/2024 19:09, Kyotaro Horiguchi wrote:
> > - UNDO log(0002): This handles file deletion during transaction aborts,
> > which was previously managed, in part, by the commit XLOG record at
> > the end of a transaction.
> > - Prevent orphan files after a crash (0005): This is another use-case
> > of the UNDO log system.
>
> Nice, I'm very excited if we can fix that long-standing issue! I'll
> try to review this properly later, but at a quick 5 minute glance, one
> thing caught my eye:
>
> This requires fsync()ing the per-xid undo log file every time a
> relation is created. I fear that can be a pretty big performance hit
> for workloads that repeatedly create and drop small tables. Especially

I initially thought that one additional file manipulation during file
creation wouldn't be an issue. However, the created storage file isn't
being synced, so your concern seems valid.

> if they're otherwise running with synchronous_commit=off. Instead of
> flushing the undo log file after every write, I'd suggest WAL-logging
> the undo log like regular relations and SLRUs. So before writing the
> entry to the undo log, WAL-log it. And with a little more effort, you
> could postpone creating the files altogether until a checkpoint
> happens, similar to how twophase state files are checkpointed
> nowadays.

I thought that an UNDO log file not flushed before the last checkpoint
might not survive a system crash. However, including UNDO files in the
checkpointing process resolves that concern. Thansk you for the
suggestion.

> I wonder if the twophase state files and undo log files should be
> merged into one file. They're similar in many ways: there's one file
> per transaction, named using the XID. I haven't thought this fully
> through, just a thought..

Precisely, UNDO log files are created per subtransaction, unlike
twophase files. It might be possible if we allow the twophase files
(as they are currently named) to be overwritten or modified at every
subcommit. If ULOG contents are WAL-logged, these two things will
become even more similar. However, I'm not planning to include that in
the next version for now.

> > +static void
> > +undolog_set_filename(char *buf, TransactionId xid)
> > +{
> > + snprintf(buf, MAXPGPATH, "%s/%08x", SIMPLE_UNDOLOG_DIR, xid);
> > +}
>
> I'd suggest using FullTransactionId. Doesn't matter much, but seems
> like a good future-proofing.

Agreed. Will fix it in the next vesion.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Mats Kindahl 2024-09-05 06:45:18 Re: Use streaming read API in ANALYZE
Previous Message Guillaume Lelarge 2024-09-05 06:19:13 Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes