Re: COPY FROM WHEN condition

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Surafel Temesgen <surafel3000(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Adam Berlin <berlin(dot)ab(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: COPY FROM WHEN condition
Date: 2019-03-27 05:49:23
Message-ID: 20190327054923.t3epfuewxfqdt22e@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-01-29 07:22:16 -0800, Andres Freund wrote:
> On January 29, 2019 6:25:59 AM PST, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> >On 1/29/19 8:18 AM, David Rowley wrote:
> >> ...
> >> Here are my performance tests with and without your change to the
> >> memory contexts (I missed where you posted your results).
> >>
> >> $ cat bench.pl
> >> for (my $i=0; $i < 8912891; $i++) {
> >> print "1\n1\n2\n2\n";
> >> }
> >> 36a1281f86c: (with your change)
> >>
> >> postgres=# copy listp from program $$perl ~/bench.pl$$ delimiter '|';
> >> COPY 35651564
> >> Time: 26825.142 ms (00:26.825)
> >> Time: 27202.117 ms (00:27.202)
> >> Time: 26266.705 ms (00:26.267)
> >>
> >> 4be058fe9ec: (before your change)
> >>
> >> postgres=# copy listp from program $$perl ~/bench.pl$$ delimiter '|';
> >> COPY 35651564
> >> Time: 25645.460 ms (00:25.645)
> >> Time: 25698.193 ms (00:25.698)
> >> Time: 25737.251 ms (00:25.737)
> >>
> >
> >How do I reproduce this? I don't see this test in the thread you linked,
> >so I'm not sure how many partitions you were using, what's the structure
> >of the table etc.
>
> I think I might have a patch addressing the problem incidentally. For
> pluggable storage I slotified copy.c, which also removes the first
> heap_form_tuple. Quite possible that nothing more is needed. I've
> removed the batch context altogether in yesterday's rebase, there was
> no need anymore.

Here's a version that applies onto HEAD. It still needs a bit more
cleanup: I'm not sure I like the addition of ri_batchInsertSlots to store
slots for each partition, there's some duplicated code around slot array
and slot creation, and the multi insert callback still needs docs.

When measuring inserts into unlogged partitions (to remove the
overhead of WAL logging, which makes it harder to see the raw
performance difference), I see gains of a few percent for the slotified
version over the current code. The same holds for insertions that have
fewer partition switches.
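
To illustrate why the number of partition switches matters for batched
inserts, here is a toy Python model. It is not PostgreSQL code and does not
describe how copy.c works. It assumes the input pattern from the quoted
bench.pl (keys 1, 1, 2, 2 repeated) and compares two hypothetical buffering
strategies: a single buffer that is flushed whenever the target partition
changes, and one buffer per partition. All function names and the batch size
are made up for the example; the model only counts flushes and says nothing
about timings.

```python
from collections import defaultdict


def bench_keys(iterations):
    """Yield the partition keys that bench.pl prints: 1, 1, 2, 2 per loop."""
    for _ in range(iterations):
        yield from (1, 1, 2, 2)


def count_partition_switches(keys):
    """Count how often consecutive rows target different partitions."""
    switches = 0
    prev = None
    for key in keys:
        if prev is not None and key != prev:
            switches += 1
        prev = key
    return switches


def flushes_single_buffer(keys, batch_size):
    """Hypothetical strategy: one shared buffer, flushed when it is full
    or when the next row belongs to a different partition."""
    flushes = 0
    buffered = 0
    current = None
    for key in keys:
        if current is not None and key != current and buffered:
            flushes += 1
            buffered = 0
        current = key
        buffered += 1
        if buffered == batch_size:
            flushes += 1
            buffered = 0
    if buffered:
        flushes += 1  # final partial batch
    return flushes


def flushes_per_partition_buffers(keys, batch_size):
    """Hypothetical strategy: one buffer per partition, each flushed
    only when it is full (plus one final flush per non-empty buffer)."""
    flushes = 0
    buffers = defaultdict(int)
    for key in keys:
        buffers[key] += 1
        if buffers[key] == batch_size:
            flushes += 1
            buffers[key] = 0
    flushes += sum(1 for n in buffers.values() if n)
    return flushes


if __name__ == "__main__":
    iterations, batch_size = 1000, 1000
    print("switches:", count_partition_switches(bench_keys(iterations)))
    print("single buffer flushes:",
          flushes_single_buffer(bench_keys(iterations), batch_size))
    print("per-partition flushes:",
          flushes_per_partition_buffers(bench_keys(iterations), batch_size))
```

With 1000 iterations and a batch size of 1000, the model reports 2000
flushes for the single shared buffer and 4 for per-partition buffers, which
is the kind of gap that makes input with frequent partition switches a
worst case for a switch-triggered flush.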

Sorry for posting this here - David, it'd be cool if you could take a
look anyway. You've played a lot with this code.

Btw, for v13, I hope that either I or somebody else cleans up CopyFrom()
a bit - it's gotten really hard to understand. I think we need to split
it into a few pieces.

Greetings,

Andres Freund

Attachment: v23-0001-tableam-multi_insert-and-slotify-COPY.patch (text/x-diff, 25.9 KB)