Quick Links

Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM

From:	Jingtang Zhang <mrdrivingduck(at)gmail(dot)com>
To:	Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM
Date:	2024-10-28 08:43:09
Message-ID:	CAPsk3_BWX1xTa2jiWenq=AKs2QKFBg59jJ5EG8qNY4j6Kj8Cig@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi! Glad to see update in this thread.

Little question about v24 0002 patch: would it be better to move the
implementation of TableModifyIsMultiInsertsSupported to somewhere for table
AM
level? Seems it is a common function for future use, not a specific one for
matview.

---

Regards, Jingtang

Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> 于2024年10月26日周六
21:31写道：

> Hi,
>
> Thanks for looking into this.
>
> On Thu, Aug 29, 2024 at 12:29 PM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> >
> > I believe we need the branching in the caller anyway:
> >
> > 1. If there is a BEFORE row trigger with a volatile function, the
> > visibility rules[1] mean that the function should see changes from all
> > the rows inserted so far this command, which won't work if they are
> > still in the buffer.
> >
> > 2. Similarly, for an INSTEAD OF row trigger, the visibility rules say
> > that the function should see all previous rows inserted.
> >
> > 3. If there are volatile functions in the target list or WHERE clause,
> > the same visibility semantics apply.
> >
> > 4. If there's a "RETURNING ctid" clause, we need to either come up with
> > a way to return the tuples after flushing, or we need to use the
> > single-tuple path. (Similarly in the future when we support UPDATE ...
> > RETURNING, as Matthias pointed out.)
> >
> > If we need two paths in each caller anyway, it seems cleaner to just
> > wrap the check for tuple_modify_buffer_insert in
> > table_modify_buffer_enabled().
> >
> > We could perhaps use a one path and then force a batch size of one or
> > something, which is an alternative, but we have to be careful not to
> > introduce a regression (and it still requires a solution for #4).
>
> I chose to branch in the caller e.g. if there's a volatile function
> SELECT query of REFRESH MATERIALIZED VIEW, the caller goes
> table_tuple_insert() path, else multi-insert path.
>
> I am posting the new v24 patch set organized as follows: 0001
> introducing the new table AM, 0002 optimizing CTAS, CMV and RMV, 0003
> using the new table AM for COPY ... FROM. I, for now, discarded the
> INSERT INTO ... SELECT and Logical Replication Apply patches, the idea
> is to take the basic stuff forward.
>
> I reworked structure names, members and function names, reworded
> comments, addressed review comments in the v24 patches. Please have a
> look.
>
> --
> Bharath Rupireddy
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
>

In response to

Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM at 2024-10-26 13:30:50 from Bharath Rupireddy

Responses

Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM at 2024-10-28 14:48:34 from Jingtang Zhang

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tender Wang	2024-10-28 08:45:17	Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Previous Message	Joel Jacobson	2024-10-28 08:30:15	Re: New "raw" COPY format