From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
Cc: | Surafel Temesgen <surafel3000(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Adam Berlin <berlin(dot)ab(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: COPY FROM WHEN condition |
Date: | 2019-03-27 05:49:23 |
Message-ID: | 20190327054923.t3epfuewxfqdt22e@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2019-01-29 07:22:16 -0800, Andres Freund wrote:
> On January 29, 2019 6:25:59 AM PST, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> >On 1/29/19 8:18 AM, David Rowley wrote:
> >> ...
> >> Here are my performance tests with and without your change to the
> >> memory contexts (I missed where you posted your results).
> >>
> >> $ cat bench.pl
> >> for (my $i=0; $i < 8912891; $i++) {
> >> print "1\n1\n2\n2\n";
> >> }
> >> 36a1281f86c: (with your change)
> >>
> >> postgres=# copy listp from program $$perl ~/bench.pl$$ delimiter '|';
> >> COPY 35651564
> >> Time: 26825.142 ms (00:26.825)
> >> Time: 27202.117 ms (00:27.202)
> >> Time: 26266.705 ms (00:26.267)
> >>
> >> 4be058fe9ec: (before your change)
> >>
> >> postgres=# copy listp from program $$perl ~/bench.pl$$ delimiter '|';
> >> COPY 35651564
> >> Time: 25645.460 ms (00:25.645)
> >> Time: 25698.193 ms (00:25.698)
> >> Time: 25737.251 ms (00:25.737)
> >>
> >
> >How do I reproduce this? I don't see this test in the thread you linked,
> >so I'm not sure how many partitions you were using, what's the structure
> >of the table etc.
>
> I think I might have a patch addressing the problem incidentally. For
> pluggable storage I slotified copy.c, which also removes the first
> heap_form_tuple. Quite possible that nothing more is needed. I've
> removed the batch context altogether in yesterday's rebase, there was
> no need anymore.
Here's a version that applies onto HEAD. It still needs a bit more
cleanup: I'm not sure I like the addition of ri_batchInsertSlots to store
slots for each partition, there's some duplicated code around slot array
handling and slot creation, and the multi insert callback still needs docs.
When measuring inserts into unlogged partitions (to remove the
overhead of WAL logging, which makes it harder to see the raw
performance difference), I see gains of a few percent for the slotified
version over the current code. The same holds for insertions that have
fewer partition switches.
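To make "partition switches" concrete, here is a minimal sketch that models the row stream printed by the quoted bench.pl and counts how often consecutive rows land in a different partition. It assumes that keys 1 and 2 route to two different partitions of listp; the thread does not show the table definition, so that mapping is a guess. The function names bench_rows and count_partition_switches are hypothetical and not part of PostgreSQL or the patch.

```python
def bench_rows(iterations):
    """Yield the keys bench.pl prints: 1, 1, 2, 2 for every loop iteration."""
    for _ in range(iterations):
        yield from (1, 1, 2, 2)


def count_partition_switches(keys):
    """Count how often a row targets a different partition than the previous row.

    Assumption: each distinct key maps to its own partition.
    """
    switches = 0
    previous = None
    for key in keys:
        if previous is not None and key != previous:
            switches += 1
        previous = key
    return switches


if __name__ == "__main__":
    rows = list(bench_rows(1000))
    # Interleaved input, as produced by bench.pl: a switch every two rows.
    print(len(rows), count_partition_switches(rows))          # 4000 1999
    # The same rows sorted by key: a single switch for the whole load.
    print(len(rows), count_partition_switches(sorted(rows)))  # 4000 1
```

Under that assumption the benchmark input switches partitions after every second row, while pre-sorted input would switch only once, which is the kind of load the "fewer partition switches" case refers to.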
Sorry for posting this here - David, it'd be cool if you could take a
look anyway. You've played a lot with this code.
Btw, for v13, I hope that either I or somebody else cleans up CopyFrom()
a bit - it's gotten really hard to understand. I think we need to split
it into a few pieces.
Greetings,
Andres Freund
Attachment | Content-Type | Size |
---|---|---|
v23-0001-tableam-multi_insert-and-slotify-COPY.patch | text/x-diff | 25.9 KB |