Re: Multi Inserts in CREATE TABLE AS - revived patch

From: Luc Vlaming <luc(at)swarm64(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Paul Guo <guopa(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Multi Inserts in CREATE TABLE AS - revived patch
Date: 2020-11-26 12:04:31
Message-ID: 0b726341-9ffe-2c1e-6953-aa030ce62c42@swarm64.com
Lists: pgsql-hackers

On 26-11-2020 12:36, Bharath Rupireddy wrote:
> Few things:
>
> IIUC Andres mentioned similar kinds of APIs earlier in [1].
>
> [1] - https://www.postgresql.org/message-id/20200924024128.kyk3r5g7dnu3fxxx%40alap3.anarazel.de
>
> I would like to add some more info to one of the APIs:
>
> typedef struct MultiInsertStateData
> {
>     MemoryContext        micontext; /* A temporary memory context for multi insert. */
>     BulkInsertStateData *bistate;   /* Bulk insert state. */
>     TupleTableSlot     **mislots;   /* Array of buffered slots. */
>     uint32               nslots;    /* Total number of buffered slots. */
>     int64                nbytes;    /* Flush buffers if the total tuple size >= nbytes. */
>     int32                nused;     /* Number of current buffered slots for a multi insert batch. */
>     int64                nsize;     /* Total tuple size for a multi insert batch. */
> } MultiInsertStateData;
>
> /* Creates a temporary memory context, allocates the MultiInsertStateData
>  * and BulkInsertStateData, and initializes the other members. */
>     void        (*begin_multi_insert) (Relation rel, MultiInsertStateData **mistate,
>                                        uint32 nslots, uint64 nbytes);
>
> /* Buffers the input slot into the mistate slots, computes the size of the
>  * tuple, and adds it to the total buffered tuple size. If this size crosses
>  * mistate->nbytes, flushes the buffered tuples into the table. For heapam,
>  * the existing heap_multi_insert can be used. Once the buffer is flushed,
>  * the micontext can be reset and the buffered slots can be cleared. *If
>  * nbytes, i.e. the total tuple size of the batch, is not given, the tuple
>  * size is not calculated; tuples are buffered until all the nslots are
>  * filled and then flushed.* */
>     void        (*do_multi_insert) (Relation rel, MultiInsertStateData *mistate,
>                                     struct TupleTableSlot *slot, CommandId cid, int options);
>
> /* Flushes the buffered tuples, if any. For heapam, the existing
>  * heap_multi_insert can be used. Deletes the temporary memory context and
>  * deallocates mistate. */
>     void        (*end_multi_insert) (Relation rel, MultiInsertStateData *mistate,
>                                      CommandId cid, int options);
>
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: http://www.enterprisedb.com

This all looks good to me, except for the nbytes part.
Could you explain what use case it supports? IMHO the tableam itself
can best decide when it's time to flush, based on its implementation,
e.g. by considering how many pages to flush at a time. This also means
that most of the fields of MultiInsertStateData can be private, as each
tableam would return a derivative of that struct (as is done with the
DestReceivers).
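
To illustrate the "derivative of that struct" idea, here is a minimal
sketch in plain C, not PostgreSQL code. Every Demo* name, the DemoStats
counter, and the page-based threshold are hypothetical and exist only
for the illustration. The caller sees just the public base struct,
while the AM keeps its buffering fields private and decides on its own
when to flush, so no nbytes argument is passed in:

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for TupleTableSlot; only the tuple size matters here. */
typedef struct DemoSlot { size_t size; } DemoSlot;

/* Demo-only counters so a caller can observe what the AM did. */
typedef struct DemoStats { int flushes; int tuples_written; } DemoStats;

/* Public base struct: the only part visible to callers. */
typedef struct DemoMultiInsertState DemoMultiInsertState;
struct DemoMultiInsertState
{
    void (*insert) (DemoMultiInsertState *self, const DemoSlot *slot);
    void (*end) (DemoMultiInsertState *self);
};

/* Hypothetical AM-internal policy: flush once 4 pages' worth is buffered. */
#define DEMO_PAGE_SIZE       ((size_t) 8192)
#define DEMO_PAGES_PER_FLUSH ((size_t) 4)

/* Private to the AM: base struct first, then AM-specific fields. */
typedef struct DemoHeapMultiInsertState
{
    DemoMultiInsertState pub;   /* must be first so the casts below are valid */
    size_t      buffered_bytes;
    int         nbuffered;
    DemoStats  *stats;
} DemoHeapMultiInsertState;

static void
demo_heap_flush(DemoHeapMultiInsertState *st)
{
    if (st->nbuffered == 0)
        return;
    /* A real AM would write the buffered tuples here. */
    st->stats->flushes++;
    st->stats->tuples_written += st->nbuffered;
    st->nbuffered = 0;
    st->buffered_bytes = 0;
}

static void
demo_heap_insert(DemoMultiInsertState *self, const DemoSlot *slot)
{
    DemoHeapMultiInsertState *st = (DemoHeapMultiInsertState *) self;

    st->nbuffered++;
    st->buffered_bytes += slot->size;
    /* The flush decision is made by the AM, not by the caller. */
    if (st->buffered_bytes >= DEMO_PAGE_SIZE * DEMO_PAGES_PER_FLUSH)
        demo_heap_flush(st);
}

static void
demo_heap_end(DemoMultiInsertState *self)
{
    DemoHeapMultiInsertState *st = (DemoHeapMultiInsertState *) self;

    demo_heap_flush(st);        /* write out whatever is left */
    free(st);
}

/* Returns the public part of a freshly allocated private state. */
DemoMultiInsertState *
demo_heap_begin(DemoStats *stats)
{
    DemoHeapMultiInsertState *st = calloc(1, sizeof(*st));

    assert(st != NULL);
    st->pub.insert = demo_heap_insert;
    st->pub.end = demo_heap_end;
    st->stats = stats;
    return &st->pub;
}
```

A caller would only ever call demo_heap_begin(), mi->insert(mi, slot)
and mi->end(mi); a different AM could return a completely different
private struct behind the same base.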

One thing I'm wondering about is which memory context the slots end up
being allocated in. I'd assume we would want to keep the slots around
between flushes. If they are in the temporary context, which is reset
after each flush, might that prove problematic?
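
To make the concern concrete, here is a toy sketch in plain C, not
PostgreSQL code. DemoArena is a hypothetical stand-in for a memory
context, and all Demo* and demo_* names are made up for the example. It
shows one possible arrangement, not necessarily what the patch does: the
slot array is allocated from a long-lived arena, and only the per-batch
tuple data comes from the arena that is reset on every flush.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a MemoryContext: a bump allocator that can be reset. */
typedef struct DemoArena { char *base; size_t cap; size_t used; } DemoArena;

static void
demo_arena_init(DemoArena *a, size_t cap)
{
    a->base = malloc(cap);
    assert(a->base != NULL);
    a->cap = cap;
    a->used = 0;
}

static void *
demo_arena_alloc(DemoArena *a, size_t n)
{
    void *p;

    n = (n + 7) & ~(size_t) 7;      /* keep allocations 8-byte aligned */
    assert(a->used + n <= a->cap);  /* the toy arena never grows */
    p = a->base + a->used;
    a->used += n;
    return p;
}

/* Invalidates everything allocated from the arena so far. */
static void
demo_arena_reset(DemoArena *a)
{
    a->used = 0;
}

typedef struct DemoSlot { char *data; size_t len; } DemoSlot;

typedef struct DemoMultiInsert
{
    DemoArena   longlived;  /* slot array lives here and survives flushes */
    DemoArena   batch;      /* per-batch tuple data, reset on every flush */
    DemoSlot   *slots;
    int         nslots;
    int         nused;
    int         nflushes;
} DemoMultiInsert;

void
demo_mi_begin(DemoMultiInsert *mi, int nslots)
{
    demo_arena_init(&mi->longlived, (size_t) nslots * sizeof(DemoSlot));
    demo_arena_init(&mi->batch, 4096);
    mi->slots = demo_arena_alloc(&mi->longlived,
                                 (size_t) nslots * sizeof(DemoSlot));
    mi->nslots = nslots;
    mi->nused = 0;
    mi->nflushes = 0;
}

void
demo_mi_flush(DemoMultiInsert *mi)
{
    if (mi->nused == 0)
        return;
    /* A real AM would write the buffered tuples here. */
    mi->nflushes++;
    mi->nused = 0;
    demo_arena_reset(&mi->batch);   /* drops tuple data, not the slots */
}

void
demo_mi_insert(DemoMultiInsert *mi, const char *tuple, size_t len)
{
    DemoSlot *slot = &mi->slots[mi->nused++];

    slot->data = demo_arena_alloc(&mi->batch, len);
    memcpy(slot->data, tuple, len);
    slot->len = len;
    if (mi->nused == mi->nslots)
        demo_mi_flush(mi);
}

void
demo_mi_end(DemoMultiInsert *mi)
{
    demo_mi_flush(mi);
    free(mi->batch.base);
    free(mi->longlived.base);
}
```

Had the slot array itself been taken from the batch arena, the first
reset would have left mi->slots pointing at reused memory, which is the
situation the question above is about.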

Regards,
Luc
