Re: COPY FROM WHEN condition

From:	David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Surafel Temesgen <surafel3000(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Adam Berlin <berlin(dot)ab(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: COPY FROM WHEN condition
Date:	2019-04-02 01:06:52
Message-ID:	CAKJS1f8vjLPCpb-YqqixzZcKnjku2iWdZWpkionAiARN4h9s6w@mail.gmail.com
Lists:	pgsql-hackers

On Tue, 2 Apr 2019 at 13:59, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> On 2019-04-02 13:41:57 +1300, David Rowley wrote:
> > On Tue, 2 Apr 2019 at 05:19, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > Thanks! I'm not quite clear whether you're planning to continue working on
> > > this, or whether this is a handoff? Either is fine with me, just trying
> > > to avoid unnecessary work / delay.
> >
> > I can, if you've not. I was hoping to gauge if you thought the
> > approach was worth pursuing.
>
> I think it's worth pursuing, with the caveats below. I'm going to focus
> on docs the not-very-long rest of today, but I definitely could work on
> this afterwards. But I also would welcome any help. Let me know...

I'm looking now. I'll post something when I get it into some better
shape than it is now.

> > > It still seems wrong to me to just perform a second hashtable search
> > > here, given that we've already done the partition dispatch.
> >
> > The reason I thought this was a good idea is that if we use the
> > ResultRelInfo to buffer the tuples then there's no end to how many
> > tuple slots can exist as the code in copy.c has no control over how
> > many ResultRelInfos are created.
>
> To me those aren't contradictory - we're going to have a ResultRelInfo
> for each partition either way, but there's nothing preventing copy.c
> from cleaning up subsidiary data in it. What I was thinking is that
> we'd just keep track of a list of ResultRelInfos with bulk insert slots,
> and occasionally clean them up. That way we avoid the secondary lookup,
> while also managing the amount of slots.

The problem that I see with that is you can't just add to that list
when the partition changes. You must check if the ResultRelInfo is
already in the list or not since we could change partitions and change
back again. For a list with just a few elements, checking
list_member_ptr should be pretty cheap, but I arbitrarily chose to
keep just the last 16 partitions' worth of buffers. I don't
think checking list_member_ptr in a 16-element list is likely to be
faster than a hash table lookup, do you?
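
To make the bookkeeping concrete, here is a minimal standalone sketch of
the list-based approach: before tracking a partition, scan the list to
see whether it is already there, and evict the oldest entry once 16 are
held. This is not code from the patch. TrackedList, tracked_member and
tracked_add are hypothetical names, plain pointers stand in for
ResultRelInfo, and the linear scan only mimics the pointer-equality
check that list_member_ptr performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: cap taken from the "last 16 partitions" figure above. */
#define MAX_TRACKED 16

/* Hypothetical stand-in for a list of ResultRelInfos that own buffers. */
typedef struct TrackedList
{
    const void *items[MAX_TRACKED];
    int         count;
} TrackedList;

/* Linear pointer-equality scan, in the spirit of list_member_ptr. */
static bool
tracked_member(const TrackedList *list, const void *ptr)
{
    for (int i = 0; i < list->count; i++)
    {
        if (list->items[i] == ptr)
            return true;
    }
    return false;
}

/*
 * Track ptr unless it is already tracked.  When the list is full, the
 * oldest entry is dropped and returned so the caller can flush and free
 * its buffered tuples; otherwise NULL is returned.
 */
static const void *
tracked_add(TrackedList *list, const void *ptr)
{
    const void *evicted = NULL;

    /* Switching back to a partition we already track must not duplicate it. */
    if (tracked_member(list, ptr))
        return NULL;

    if (list->count == MAX_TRACKED)
    {
        evicted = list->items[0];
        for (int i = 1; i < list->count; i++)
            list->items[i - 1] = list->items[i];
        list->count--;
    }

    assert(list->count < MAX_TRACKED);
    list->items[list->count++] = ptr;
    return evicted;
}
```

Every partition switch pays for the scan in tracked_member, which is the
cost being weighed against a hash table lookup above. Which of the two
is cheaper at this size is something only a benchmark could settle.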

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Re: COPY FROM WHEN condition at 2019-04-02 00:59:16 from Andres Freund

Responses

Re: COPY FROM WHEN condition at 2019-04-02 01:11:26 from Andres Freund
