Re: [Feature Request] INSERT FROZEN to Optimize Large Cold Data Imports and Migrations

From: Sébastien <bokanist(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [Feature Request] INSERT FROZEN to Optimize Large Cold Data Imports and Migrations
Date: 2025-02-13 16:37:46
Message-ID: CANtq+vSy1VY9hDo+epGHX4kz_Ot0QFb293mV03smSLysZ0t8mQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for your answer TOM.
I'd like to have this option only for initial loading of huge amounts of
data, where atomicity consistency is needed at all. Maybe an option on
startup command just for initial import mode, would be nice.

I had huge problems on server 3 weeks after a 6 TB migration from other DB.
I think it's sad to rewrite all data twice.

Le jeu. 13 févr. 2025 à 16:08, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> a écrit :

> =?UTF-8?Q?S=C3=A9bastien?= <bokanist(at)gmail(dot)com> writes:
> > Implementation details:
>
> > - A new INSERT FROZEN option could be introduced, similar to COPY
> FREEZE,
> > allowing direct insertion of tuples in a frozen state.
> > - This would likely require changes in heap storage logic to ensure
> > tuples are written with a frozen XID at insert time.
> > - Consideration should be given to transaction semantics and WAL
> logging
> > to ensure consistency and crash recovery integrity.
>
> That last is exactly why this won't happen. A frozen tuple would be
> considered committed and visible the instant it appears in the table,
> thus completely breaking both atomicity and integrity of the
> transaction.
>
> There has been work going on recently to reduce the impact of freezing
> massive amounts of data by spreading the work more effectively [1].
> I don't say that that particular commit has completely solved the
> problem, but I think that continued effort in that direction is more
> likely to yield usable results than what you're suggesting.
>
> BTW, this might or might not be usable in your particular workflow,
> but: there have long been some optimizations for data load into a
> table created in the same transaction. The idea there is that if the
> transaction rolls back, the table will never have been visible to any
> other transaction at all, so that maintaining atomicity/integrity of
> its contents is moot.
>
> regards, tom lane
>
> [1]
> https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=052026c9b
>

--
Sébastien Caunes
+33 6 7 229 229 7

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Borisov 2025-02-13 16:39:32 Re: Get rid of WALBufMappingLock
Previous Message Tomas Vondra 2025-02-13 16:28:53 Re: BitmapHeapScan streaming read user and prelim refactoring