Re: COPY FROM WHEN condition

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Surafel Temesgen <surafel3000(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Adam Berlin <berlin(dot)ab(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: COPY FROM WHEN condition
Date: 2019-04-02 00:41:57
Message-ID: CAKJS1f_JynmaEZWMzCnT5Zuf=7ZOJrjK5NroF+GqHUvSvGEoKQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, 2 Apr 2019 at 05:19, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> On 2019-04-02 04:23:20 +1300, David Rowley wrote:
> > On Mon, 1 Apr 2019 at 08:05, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > I'll work on pushing all the other pending tableam patches today -
> > > leaving COPY the last non pluggable part. You'd written in a private
> > > email that you might try to work on this on Monday, so I think I'll give
> > > this a shot on Tuesday if you've not gotten around till then? I'd like
> > > to push this sooner than the exact end of the freeze...
> >
> > I worked on this, but I've not got the patch finished yet.
>
> Thanks! I'm not quite clear whether you planning to continue working on
> this, or whether this is a handoff? Either is fine with me, just trying
> to avoid unnecessary work / delay.

I can continue with it, if you've not already started. I was hoping to
gauge whether you thought the approach was worth pursuing.

> > Of course, it is possible that some of the bugs account for some of
> > the improved time... but the rows did seem to be in the table
> > afterwards.
>
> The speedup likely seems to be from having much larger batches, right?
> Unless we insert into enough partitions that batching doesn't ever
> encounter two tuples, we'll now efficiently use multi_insert even if
> there's no consecutive tuples.

I imagine that's part of it, but I was surprised that the test which
inserts into the same partition was also improved fairly
significantly.

> > +static inline void
> > +CopyMultiInsertBuffer_RemoveBuffer(CopyMultiInsertInfo *miinfo, CopyMultiInsertBuffer *buffer)
> > +{
> > + Oid relid = buffer->relid;
> > + int i = 0;
> > +
> > + while (buffer->slots[i] != NULL)
> > + ExecClearTuple(buffer->slots[i++]);
> > + pfree(buffer->slots);
> > +
> > + hash_search(miinfo->multiInsertBufferTab, (void *) &relid, HASH_REMOVE,
> > + NULL);
> > + miinfo->nbuffers--;
> > +}
>
> Hm, this leaves dangling slots, right?

Probably. I didn't quite finish figuring out how a slot should have
all its memory released.

> It still seems wrong to me to just perform a second hashtable search
> here, givent that we've already done the partition dispatch.

The reason I thought this was a good idea is that if we use the
ResultRelInfo to buffer the tuples, then there's no bound on how many
tuple slots can exist, since the code in copy.c has no control over
how many ResultRelInfos are created.

> > if (proute)
> > insertMethod = CIM_MULTI_CONDITIONAL;
> > else
> > insertMethod = CIM_MULTI;
>
> Hm, why do we still have CIM_MULTI and CIM_MULTI_CONDITIONAL?

We don't know which partitions will be foreign or have triggers that
don't allow us to multi-insert, so multi-inserts are still
conditional on that.

> Not for now / this commit, but I kinda feel like we should determine
> whether it's safe to multi-insert at a more centralized location.

Yeah, I was thinking that might be nice, but teaching those
multi-insert functions about that would mean they'd need to handle
non-multi inserts too. That might just highlight that the functions I
added are not named very well.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
