Re: COPY FROM WHEN condition

From: Andres Freund <andres(at)anarazel(dot)de>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Surafel Temesgen <surafel3000(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Adam Berlin <berlin(dot)ab(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: COPY FROM WHEN condition
Date: 2019-04-02 00:59:16
Message-ID: 20190402005916.z3f4q6xaya7r47px@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-04-02 13:41:57 +1300, David Rowley wrote:
> On Tue, 2 Apr 2019 at 05:19, Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > On 2019-04-02 04:23:20 +1300, David Rowley wrote:
> > > On Mon, 1 Apr 2019 at 08:05, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > > I'll work on pushing all the other pending tableam patches today -
> > > > leaving COPY the last non pluggable part. You'd written in a private
> > > > email that you might try to work on this on Monday, so I think I'll give
> > > > this a shot on Tuesday if you've not gotten around till then? I'd like
> > > > to push this sooner than the exact end of the freeze...
> > >
> > > I worked on this, but I've not got the patch finished yet.
> >
> > Thanks! I'm not quite clear whether you're planning to continue working
> > on this, or whether this is a handoff? Either is fine with me, just
> > trying to avoid unnecessary work / delay.
>
> I can, if you've not. I was hoping to gauge if you thought the
> approach was worth pursuing.

I think it's worth pursuing, with the caveats below. I'm going to focus
on docs for the rest of today (which isn't very long), but I could
definitely work on this afterwards. I'd also welcome any help, though.
Let me know...

> > > +static inline void
> > > +CopyMultiInsertBuffer_RemoveBuffer(CopyMultiInsertInfo *miinfo, CopyMultiInsertBuffer *buffer)
> > > +{
> > > +    Oid     relid = buffer->relid;
> > > +    int     i = 0;
> > > +
> > > +    while (buffer->slots[i] != NULL)
> > > +        ExecClearTuple(buffer->slots[i++]);
> > > +    pfree(buffer->slots);
> > > +
> > > +    hash_search(miinfo->multiInsertBufferTab, (void *) &relid, HASH_REMOVE,
> > > +                NULL);
> > > +    miinfo->nbuffers--;
> > > +}
> >
> > Hm, this leaves dangling slots, right?
>
> Probably. I didn't quite finish figuring out how a slot should have
> all its memory released.

ExecDropSingleTupleTableSlot() ought to do the job (as long as the slot
isn't added to a tuptable/estate - which we shouldn't do anyway, since
one-by-one removal from there is currently expensive).
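
For illustration, a minimal sketch of the quoted buffer-removal routine
with that fix (untested; it just swaps ExecClearTuple() for
ExecDropSingleTupleTableSlot() so each slot's memory is actually
released):

static inline void
CopyMultiInsertBuffer_RemoveBuffer(CopyMultiInsertInfo *miinfo,
                                   CopyMultiInsertBuffer *buffer)
{
    Oid     relid = buffer->relid;
    int     i = 0;

    /* Drop each slot entirely instead of merely clearing its contents */
    while (buffer->slots[i] != NULL)
        ExecDropSingleTupleTableSlot(buffer->slots[i++]);
    pfree(buffer->slots);

    hash_search(miinfo->multiInsertBufferTab, (void *) &relid, HASH_REMOVE,
                NULL);
    miinfo->nbuffers--;
}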

> > It still seems wrong to me to just perform a second hashtable search
> > here, given that we've already done the partition dispatch.
>
> The reason I thought this was a good idea is that if we use the
> ResultRelInfo to buffer the tuples, then there's no limit on how many
> tuple slots can exist, since the code in copy.c has no control over
> how many ResultRelInfos are created.

To me those aren't contradictory - we're going to have a ResultRelInfo
for each partition either way, but there's nothing preventing copy.c
from cleaning up subsidiary data in it. What I was thinking is that
we'd just keep track of a list of ResultRelInfos with bulk insert slots,
and occasionally clean them up. That way we avoid the secondary lookup,
while also managing the number of slots.
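
Very roughly, something along these lines is what I have in mind (a
sketch only - the struct fields and helper functions here are made up
for illustration, they're not from David's patch):

typedef struct CopyMultiInsertInfo
{
    List   *bufferedRelInfos;   /* ResultRelInfos that currently own a buffer */
    int     nbuffers;
    int     maxbuffers;         /* soft cap before we flush and clean up */
} CopyMultiInsertInfo;

static void
CopyMultiInsertInfo_BufferTuple(CopyMultiInsertInfo *miinfo,
                                ResultRelInfo *rri,
                                TupleTableSlot *slot)
{
    if (rri->ri_CopyMultiInsertBuffer == NULL)
    {
        /* first tuple for this partition: attach a buffer to its ResultRelInfo */
        rri->ri_CopyMultiInsertBuffer = CopyMultiInsertBuffer_Create(rri);
        miinfo->bufferedRelInfos = lappend(miinfo->bufferedRelInfos, rri);

        /* keep the total number of buffered slots bounded */
        if (++miinfo->nbuffers > miinfo->maxbuffers)
            CopyMultiInsertInfo_FlushAndDropOldest(miinfo);
    }

    /* no second hash lookup - the buffer rides along with the ResultRelInfo */
    CopyMultiInsertBuffer_StoreTuple(rri->ri_CopyMultiInsertBuffer, slot);
}

Flushing would then just walk bufferedRelInfos, so the hashtable isn't
needed at all.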

Greetings,

Andres Freund
