Re: Fast COPY FROM based on batch insert

From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, tanghy(dot)fnst(at)fujitsu(dot)com, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, houzj(dot)fnst(at)fujitsu(dot)com
Subject: Re: Fast COPY FROM based on batch insert
Date: 2022-07-27 05:42:28
Message-ID: ed89807e-0c15-02ac-b0aa-c01cdc5a2b2e@postgrespro.ru
Lists: pgsql-hackers

On 7/22/22 13:14, Etsuro Fujita wrote:
> On Fri, Jul 22, 2022 at 3:39 PM Andrey Lepikhov
> <a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
>> While analyzing multi-level heterogeneous partitioned configurations, I
>> realized that a single write into a partition with a trigger will flush
>> the buffers of all other partitions of the parent table, even if the
>> parent doesn't have any triggers.
>> This relates to the following code:
>>     else if (insertMethod == CIM_MULTI_CONDITIONAL &&
>>              !CopyMultiInsertInfoIsEmpty(&multiInsertInfo))
>>     {
>>         /*
>>          * Flush pending inserts if this partition can't use
>>          * batching, so rows are visible to triggers etc.
>>          */
>>         CopyMultiInsertInfoFlush(&multiInsertInfo, resultRelInfo);
>>     }
>>
>> Why is such a cascade flush really necessary, especially for BEFORE and
>> INSTEAD OF triggers?
>
> BEFORE triggers on the chosen partition might query the parent table,
> not just the partition, so I think we need to do this so that such
> triggers can see all the rows that have been inserted into the parent
> table until then.
If you'll excuse me, I'd like to add one more argument.
This wasn't clear to me, so I ran an experiment: a SELECT in an INSERT
trigger function sees only the data that existed in the parent table
before the COPY started.
So we have no way to access rows newly inserted into a neighboring
partition, and we don't need to flush the tuple buffers immediately.
Where am I wrong?
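
For readers following the thread, the cascade flush under discussion can
be modelled roughly as below. This is only a toy sketch, not PostgreSQL
code: ToyCopyState, toy_flush_all, toy_copy_row and NPARTS are
hypothetical names, and a partition with can_batch = false stands in for
one that has a row trigger and so cannot use batching.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPARTS 3            /* hypothetical number of partitions */

/* Toy stand-in for the per-partition tuple buffers; not PostgreSQL code. */
typedef struct ToyCopyState
{
    bool can_batch[NPARTS]; /* false models a partition with a row trigger */
    int  buffered[NPARTS];  /* rows waiting in the partition's buffer */
    int  stored[NPARTS];    /* rows already written to the partition */
} ToyCopyState;

/* Write out every pending buffer and return how many rows were flushed. */
static int
toy_flush_all(ToyCopyState *cs)
{
    int flushed = 0;

    for (size_t i = 0; i < NPARTS; i++)
    {
        cs->stored[i] += cs->buffered[i];
        flushed += cs->buffered[i];
        cs->buffered[i] = 0;
    }
    return flushed;
}

/* Route one row to partition 'part'; return rows flushed as a side effect. */
static int
toy_copy_row(ToyCopyState *cs, size_t part)
{
    int flushed;

    assert(part < NPARTS);

    if (cs->can_batch[part])
    {
        cs->buffered[part]++;   /* batching path: only buffer the row */
        return 0;
    }

    /* Non-batching path: every partition's pending rows are flushed first */
    flushed = toy_flush_all(cs);

    /* ... and then this row is stored directly, bypassing the buffers. */
    cs->stored[part]++;
    return flushed;
}
```

In this model a single row routed to the non-batching partition empties
the buffers of every other partition, which is the effect described in
the quoted text. Whether that flush is actually required depends on what
the trigger can see, which is the open question here.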

--
Regards
Andrey Lepikhov
Postgres Professional
