Re: Custom table AMs need to include heapam.h because of BulkInsertState

From: Andres Freund <andres(at)anarazel(dot)de>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>,Michael Paquier <michael(at)paquier(dot)xyz>,Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Custom table AMs need to include heapam.h because of BulkInsertState
Date: 2019-06-13 19:53:46
Message-ID: 92CCF1B7-5F59-4BD6-9AF8-703AAE25643B@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On June 12, 2019 6:42:11 PM PDT, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>On Mon, 10 Jun 2019 at 11:45, David Rowley
><david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>>
>> On Sat, 8 Jun 2019 at 04:51, Andres Freund <andres(at)anarazel(dot)de>
>wrote:
>> > David, any opinions on how to best fix this? It's not extremely
>obvious
>> > how to do so best in the current setup of the partition actually
>being
>> > hidden somewhere a few calls away (i.e. the table_open happening in
>> > ExecInitPartitionInfo()).
>>
>> That's been overlooked. I agree it's not a bug with heap, since
>> heapam_finish_bulk_insert() only does anything there when we're
>> skipping WAL, which we don't do in copy.c for partitioned tables.
>> However, who knows what other AMs will need, so we'd better fix that.
>>
>> My proposed patch is attached.
>>
>> I ended up moving the call to CopyMultiInsertInfoCleanup() down to
>> after we call table_finish_bulk_insert for the main table. This might
>> not be required but I lack imagination right now to what AMs might
>put
>> in the finish_bulk_insert call, so doing this seems safer.
>
>Andres, do you want to look at this before I look again?
>
>Do you see any issue with calling table_finish_bulk_insert() when the
>partition's CopyMultiInsertBuffer is evicted from the
>CopyMultiInsertInfo rather than at the end of the copy? It can mean
>that we call the function multiple times per partition. I assume the
>function is only really intended to flush bulk inserted tuple to the
>storage, so calling it more than once would just mean an inefficiency
>rather than a bug.
>
>Let me know your thoughts.

I'm out on vacation until Monday (very needed, pretty damn exhausted). So I can't really give you a in depth answer right now.

Off the cuff, I'd say it's worthwhile to try somewhat hard to avoid superfluous finish calls. They can be quite expensive (fsync), and we ought to have nearly all the state for doing it only as much as necessary. Possibly we need one bool per partition to track whether any rows where inserted, but thats peanuts in comparison to all the other state.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-06-13 20:04:19 REINDEX locking
Previous Message Andres Freund 2019-06-13 18:35:32 Re: PG 11 JIT deform failure