Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM
Date: 2024-08-27 13:44:13
Message-ID: CAEze2WgiW89oQkG5TEQ-qmvUdSUG2OQYaxrdn=6HJKehEO1puw@mail.gmail.com
Lists: pgsql-hackers

On Tue, 27 Aug 2024 at 07:42, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>
> On Mon, 2024-08-26 at 23:59 +0200, Matthias van de Meent wrote:
> > Specifically, I'm having trouble seeing how this could be used to
> > implement ```INSERT INTO ... SELECT ... RETURNING ctid``` as I see no
> > returning output path for the newly inserted tuples' data, which is
> > usually required for our execution nodes' output path. Is support for
> > RETURNING clauses planned for this API? In a previous iteration, the
> > flush operation was capable of returning a TTS, but that seems to
> > have been dropped, and I can't quite figure out why.
>
> I'm not sure where that was lost, but I suspect when we changed
> flushing to use a callback. I didn't get to v23-0003 yet, but I think
> you're right that the current flushing mechanism isn't right for
> returning tuples. Thank you.
>
> One solution: when the buffer is flushed, we can return an iterator
> over the buffered tuples to the caller. The caller can then use the
> iterator to insert into indexes, return a tuple to the executor, etc.,
> and then release the iterator when done (freeing the buffer).

I think that would work, but it'd need to be accommodated in the
table_modify_buffer_insert path too, not just the _flush path: the
heap AM also flushes the buffer during inserts whenever its internal
buffer fills up, not only at the end of modifications.
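
A rough sketch of that control flow, to make the point concrete. Every
name below (DemoModifyBuffer, demo_buffer_insert, DemoFlushIter, and so
on) is made up for illustration and is not part of the patch or of any
PostgreSQL API; the "flush" only hands out an iterator and does not
write anything. The assumption it illustrates is that both the insert
path and the explicit flush path can return an iterator over the tuples
that were just flushed, and that the caller releases it when done.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_BUFFER_CAPACITY 4	/* hypothetical size, chosen for the demo */

/* Hypothetical stand-in for a buffered tuple; not a PostgreSQL type. */
typedef struct DemoTuple
{
	int			value;
} DemoTuple;

typedef struct DemoModifyBuffer
{
	DemoTuple	tuples[DEMO_BUFFER_CAPACITY];
	size_t		ntuples;
	bool		iter_active;	/* an unreleased iterator pins the buffer */
} DemoModifyBuffer;

/* Iterator over the tuples covered by the most recent flush. */
typedef struct DemoFlushIter
{
	DemoModifyBuffer *buf;
	size_t		count;			/* number of flushed tuples */
	size_t		next;			/* next tuple to hand out */
} DemoFlushIter;

/* Shared by both paths; a real AM would write the tuples out here. */
static DemoFlushIter
demo_do_flush(DemoModifyBuffer *buf)
{
	DemoFlushIter it = {buf, buf->ntuples, 0};

	buf->iter_active = (buf->ntuples > 0);
	return it;
}

/* Insert path: may flush implicitly, so it returns an iterator too. */
DemoFlushIter
demo_buffer_insert(DemoModifyBuffer *buf, DemoTuple tup)
{
	assert(!buf->iter_active);	/* previous iterator must be released */
	buf->tuples[buf->ntuples++] = tup;
	if (buf->ntuples == DEMO_BUFFER_CAPACITY)
		return demo_do_flush(buf);
	return (DemoFlushIter) {buf, 0, 0};	/* nothing was flushed */
}

/* Explicit flush at the end of the modifications. */
DemoFlushIter
demo_buffer_flush(DemoModifyBuffer *buf)
{
	assert(!buf->iter_active);
	return demo_do_flush(buf);
}

/* Returns the next flushed tuple, or NULL when the iterator is exhausted. */
const DemoTuple *
demo_iter_next(DemoFlushIter *it)
{
	if (it->next >= it->count)
		return NULL;
	return &it->buf->tuples[it->next++];
}

/* Releasing a non-empty iterator frees the buffer for reuse. */
void
demo_iter_release(DemoFlushIter *it)
{
	if (it->count > 0)
	{
		it->buf->ntuples = 0;
		it->buf->iter_active = false;
	}
	it->count = it->next = 0;
}

/*
 * Caller-side loop: this is where index insertion or projecting a
 * RETURNING row would happen. Returns the number of tuples consumed.
 */
size_t
demo_drain(DemoFlushIter *it)
{
	size_t		n = 0;

	while (demo_iter_next(it) != NULL)
		n++;
	demo_iter_release(it);
	return n;
}
```

With this shape the caller has to drain the iterator after every
insert call, since any one of them may have triggered a flush, and
once more after the final explicit flush.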

> That control flow is less convenient for most callers, though, so
> perhaps that should be optional?

That would be OK with me.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)
