Re: Custom table AMs need to include heapam.h because of BulkInsertState

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Custom table AMs need to include heapam.h because of BulkInsertState
Date: 2019-06-24 11:12:49
Message-ID: CAKJS1f_ExpxTnP2FOCgFmQb_gu3a1OkRCb7721_6KczbaEp2GQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 24 Jun 2019 at 22:16, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Sat, Jun 15, 2019 at 12:25:12PM +1200, David Rowley wrote:
> > With the attached version I'm just calling table_finish_bulk_insert()
> > once per partition at the end of CopyFrom(). We've got an array with
> > all the ResultRelInfos we touched in the proute, so it's mostly a
> > matter of looping over that array and calling the function on each
> > ResultRelInfo's ri_RelationDesc. However, to make it more complex,
> > PartitionTupleRouting is private to execPartition.c so we can't do
> > this directly... After staring at my screen for a while, I decided to
> > write a function that calls a callback function on each ResultRelInfo
> > in the PartitionTupleRouting.
>
> Don't take me bad, but I find the solution of defining and using a new
> callback to call the table AM callback not really elegant, and keeping
> all table AM callbacks called at a higher level than the executor
> makes the code easier to follow. Shouldn't we try to keep any calls
> to table_finish_bulk_insert() within copy.c for each partition
> instead?

I'm not quite sure if I follow you since the call to
table_finish_bulk_insert() is within copy.c still.

The problem was that PartitionTupleRouting is private to
execPartition.c, and we need a way to determine which of the
partitions we routed tuples to. It seems inefficient to flush all of
them if only a small number had tuples inserted into them and to me,
it seems inefficient to add some additional tracking in CopyFrom(),
like a hash table to store partition Oids that we inserted into. Using
PartitionTupleRouting makes sense. It's just a question of how to
access it, which is not so easy due to it being private.

I did suggest a few other ways that we could solve this. I'm not so
clear on which one of those you're suggesting or if you're thinking of
something new.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey Borodin 2019-06-24 11:31:07 Re: GiST "choose subtree" support function to inline penalty
Previous Message Michael Paquier 2019-06-24 10:16:16 Re: Custom table AMs need to include heapam.h because of BulkInsertState