Quick Links

Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM

From:	Jeff Davis <pgsql(at)j-davis(dot)com>
To:	Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject:	Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM
Date:	2024-08-26 21:18:21
Message-ID:	d079c042180a698179384d6a7a3109cd8310e285.camel@j-davis.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Mon, 2024-08-26 at 11:09 +0530, Bharath Rupireddy wrote:
> On Wed, Jun 5, 2024 at 12:42 PM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > Please find the v22 patches with the above changes.
>
> Please find the v23 patches after rebasing 0005 and adapting 0004 for
> 9758174e2e.

Thank you.

0001 API design:

* Remove TableModifyState.modify_end_callback.

* This patch means that we will either remove or deprecate
TableAmRoutine.multi_insert and finish_bulk_insert. Are there any
strong opinions about maintaining support for multi-insert, or should
we just remove it outright and force any new AMs to implement the new
APIs to maintain COPY performance?

* Why do we need a separate "modify_flags" and "options"? Can't we just
combine them into TABLE_MODIFY_* flags?

Alexander, you had some work in this area as well, such b1484a3f19. I
believe 0001 covers this use case in a different way: rather than
giving complete responsibility to the AM to insert into the indexes,
the caller provides a callback and the AM is responsible for calling it
at the time the tuples are flushed. Is that right?

The design has been out for a while, so unless others have suggestions,
I'm considering the major design points mostly settled and I will move
forward with something like 0001 (pending implementation issues).

Note: I believe this API will extend naturally to updates and deletes,
as well.

0001 implementation issues:

* We need default implementations for AMs that don't implement the new
APIs, so that the AM will still function even if it only defines the
single-tuple APIs. If we need to make use of the AM's multi_insert
method (I'm not sure we do), then the default methods would need to
handle that as well. (I thought a previous version had these default
implementations -- is there a reason they were removed?)

* I am confused about how the heap implementation manages state and
resets it. mistate->mem_cxt is initialized to a new memory context in
heap_modify_begin, and then re-initialized to another new memory
context in heap_modify_buffer_insert. Then the mistate->mem_cxt is also
used as a temp context for executing heap_multi_insert, and it gets
reset before calling the flush callback, which still needs the slots.

* Why materialize the slot at copyfrom.c:1308 if the slot is going to
be copied anyway (which also materializes it; see
tts_virtual_copyslot()) at heapam.c:2710?

* After correcting the memory issues, can you get updated performance
numbers for COPY?

Regards,
Jeff Davis

In response to

Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM at 2024-08-26 05:39:28 from Bharath Rupireddy

Responses

Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM at 2024-08-26 21:59:28 from Matthias van de Meent
Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM at 2024-08-27 21:43:59 from Jeff Davis

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Nathan Bossart	2024-08-26 21:27:15	Re: Fix memory counter update in reorderbuffer
Previous Message	Thomas Munro	2024-08-26 21:06:29	Re: Segfault in jit tuple deforming on arm64 due to LLVM issue