Re: Multi Inserts in CREATE TABLE AS - revived patch

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Luc Vlaming <luc(at)swarm64(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Paul Guo <guopa(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Multi Inserts in CREATE TABLE AS - revived patch
Date: 2020-11-26 01:54:01
Message-ID: CALj2ACX0b_WxmPHUELBY1ntW603h_GQ=yE0BX05QQ+VHHPPRDQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 25, 2020 at 2:11 PM Luc Vlaming <luc(at)swarm64(dot)com> wrote:
>
> Thanks for reviving the patch! I did unfortunately have to shift my
> priorities somewhat and did not find much time to work on open source
> things the last week(s).
>

Thanks for the comments.

>
> I'm wondering about the use of the GetTupleSize function. As far as I
> understand the idea is to limit the amount of buffered data, presumably
> to not write too much data at once for intorel_flush_multi_insert.
> If I understood correctly how it all works, the table slot can however
> be of different type than the source slot, which makes that the call to
> CopySlot() potentially stores a different amount of data than computed
> by GetTupleSize(). Not sure if this is a big problem as an estimation
> might be good enough?
>

Yeah. The tuple size may change after ExecCopySlot(). For instance, create
table t2 as select a1 from t1; where t1 has two integer columns a1, b1. I'm
creating t2 with single column a1 from t1 which makes the source slot
virtual.

Source slot is virtual and the size calculated with GetTupleSize() is 8
bytes:
(gdb) p *slot
$18 = {type = T_TupleTableSlot, tts_flags = 16, tts_nvalid = 1,
tts_ops = 0x562c592652c0 <TTSOpsVirtual>,
tts_tupleDescriptor = 0x562c5a0409f0, tts_values = 0x562c5a040b50,
tts_isnull = 0x562c5a040b58, tts_mcxt = 0x562c5a040320, tts_tid = {
ip_blkid = {bi_hi = 65535, bi_lo = 65535}, ip_posid = 0}, tts_tableOid
= 0}
(gdb) call GetTupleSize(slot, 65535)
$24 = 8

After ExecCopySlot(batchslot, slot), destination slot changes to
TTSOpsBufferHeapTuple and the GetTupleSize() gives 28 bytes now.
(gdb) p *batchslot
$19 = {type = T_TupleTableSlot, tts_flags = 20, tts_nvalid = 0,
tts_ops = 0x562c592653e0 <TTSOpsBufferHeapTuple>,
tts_tupleDescriptor = 0x7f063fbeecd0, tts_values = 0x562c5a05daa8,
tts_isnull = 0x562c5a05dab0, tts_mcxt = 0x562c5a040320, tts_tid = {
ip_blkid = {bi_hi = 65535, bi_lo = 65535}, ip_posid = 0}, tts_tableOid
= 0}
(gdb) call GetTupleSize(batchslot, 65535)
$25 = 28

I think your suggestion to call GetTupleSize() on the destination slot
after ExecCopySlot() is right. I changed it in the v4 patch.

>
> Some other solutions/implementations would be:
> - compute the size after doing CopySlot. Maybe the relation never wants
> a virtual tuple and then you can also simplify GetTupleSize?
>

I think we need to have TTSOpsVirtual code in GetTupleSize() because
table_slot_create() which gets called before ExecCopySlot() may create
virtual slots for cases such as views and partitioned tables. Though we can
not insert into views or partitioned tables using CTAS, I want
GetTupleSize() to be a generic function. Right now, I can not find other
use cases where GetTupleSize() can be used.

>
> - after CopySlot ask for the memory consumed in the slot using
> MemoryContextMemAllocated.
>

MemoryContextMemAllocated of the slot's tts_mcxt will always have extra
bytes and those extra bytes are way more compared to the actual tuple
bytes. And most of the time, ExecCopySlot() will just point the src slot
tts_mcxt to dest slot tts_mcxt. For instance, for a single row with a
single integer column of 8 bytes, the mem_allocated is 49232 bytes. This is
the reason we can not rely on mem_allocated.

(gdb) p slot->tts_mcxt -----> source slot
$22 = (MemoryContext) 0x562c5a040320
(gdb) p *slot->tts_mcxt
$20 = {type = T_AllocSetContext, isReset = false, allowInCritSection =
false,
*mem_allocated = 49232*, methods = 0x562c5926d560 <AllocSetMethods>,
parent = 0x562c59f97820, firstchild = 0x562c5a042330, prevchild = 0x0,
nextchild = 0x0, name = 0x562c590d3554 "ExecutorState", ident = 0x0,
reset_cbs = 0x0}

(gdb) p batchslot->tts_mcxt -----> destination slot after
ExecCopySlot().
$23 = (MemoryContext) 0x562c5a040320
(gdb) p *batchslot->tts_mcxt
$21 = {type = T_AllocSetContext, isReset = false, allowInCritSection =
false,
*mem_allocated = 49232*, methods = 0x562c5926d560 <AllocSetMethods>,
parent = 0x562c59f97820, firstchild = 0x562c5a042330, prevchild = 0x0,
nextchild = 0x0, name = 0x562c590d3554 "ExecutorState", ident = 0x0,
reset_cbs = 0x0}

>
> Some small things to maybe change are:
> ===========
> + if (myState->mi_slots[myState->mi_slots_num] == NULL)
> + {
> + batchslot = table_slot_create(myState->rel, NULL);
> + myState->mi_slots[myState->mi_slots_num] =
batchslot;
> + }
> + else
> + batchslot =
myState->mi_slots[myState->mi_slots_num];
>
> Alternative:
> + if (myState->mi_slots[myState->mi_slots_num] == NULL)
> + myState->mi_slots[myState->mi_slots_num] =
> table_slot_create(myState->rel, NULL);
> + batchslot = myState->mi_slots[myState->mi_slots_num];
>

Changed.

> ==============
>
> + sz = att_align_nominal(sz, att->attalign);
> This could be moved out of the if statement?
>
> ==============

I don't think we can change it. If we were to move it, then sz =
att_addlength_datum(sz, att->attlen, val); which takes aligned sz may have
problems like below:
Say att_align_nominal sets sz to 4 bytes, then att_addlength_datum takes
this 4 bytes adds attlen to it. If we move att_align_nominal(sz,
att->attalign) out, then att_addlength_datum(sz, att->attlen, val) will not
consider the aligned bytes. We might have to add up the aligned bytes
separately for the else case. And also note that this code is derived from
ts_virtual_materialize(), where we have the att_align_nominal inside both
if and else blocks. I may be wrong here.

Attaching v4 patch. Consider it for further review.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v4-0001-Multi-Inserts-in-Create-Table-As.patch application/octet-stream 12.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hou, Zhijie 2020-11-26 02:17:21 RE: Parallel Inserts in CREATE TABLE AS
Previous Message tsunakawa.takay@fujitsu.com 2020-11-26 01:48:08 RE: POC: postgres_fdw insert batching