From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
---|---|
To: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Make executor's Range Table an array instead of a List |
Date: | 2018-08-23 14:26:50 |
Message-ID: | CA+HiwqEwxv6QOrwjNLgQB8aGrQNeTwQUaaLeVMBqmVOPVnMkPw@mail.gmail.com |
Lists: | pgsql-hackers |
(sorry David for the same email twice, I didn't hit Reply All at first)
On Thu, Aug 23, 2018 at 7:58 PM, David Rowley
<david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> Continuing along on my adventures of making performance improvements
> for partitioned tables, I discovered that getrelid is quite slow in
> InitPlan() during UPDATEs and DELETEs when there are many
> resultRelations to process. A List is a pretty poor choice for this
> data structure since we already know how big it is and we pretty much
> only ever perform lookups by the List's Nth element.
>
> Converting this to an array obviously improves that O(n) List lookup
> into an O(1) operation.
>
> Benchmark against today's master with 300 partitions using prepared statements:
>
> CREATE TABLE partbench (id BIGINT NOT NULL, i1 INT NOT NULL, i2 INT
> NOT NULL, i3 INT NOT NULL, i4 INT NOT NULL, i5 INT NOT NULL) PARTITION
> BY RANGE (id);
> \o /dev/null
> select 'CREATE TABLE partbench' || x::text || ' PARTITION OF partbench
> FOR VALUES FROM (' || (x*100000)::text || ') TO (' ||
> ((x+1)*100000)::text || ');' from generate_Series(0,299) x;
> \gexec
> \o
>
> ALTER SYSTEM SET plan_cache_mode = 'force_generic_plan'; -- minimise planning overhead.
>
> update.sql:
>
> \set p_id 29999999
> update partbench set i1 = i1+1 where id = :p_id;
>
> pgbench -n -M prepared -T 60 -f update.sql postgres
>
> Unpatched:
>
> tps = 558.708604 (excluding connections establishing)
> tps = 556.128341 (excluding connections establishing)
> tps = 569.096393 (excluding connections establishing)
>
> Patched:
>
> tps = 683.271356 (excluding connections establishing)
> tps = 678.693578 (excluding connections establishing)
> tps = 694.446564 (excluding connections establishing)
>
> (about 22% improvement)
>
> Of course, it would be much nicer if PlannedStmt->rtable was already
> an array and we didn't have to convert it to one during executor
> startup. We'd need to build an array that's a Node type for that to
> work. I did suggest something ([1]) last year in this regard, but I
> felt like it might be a bit of an uphill struggle to gain enough
> community support to make that work. So I feel like this patch is
> quite worthwhile in the meantime.
>
> The (fairly small) patch is attached.
One of the patches I sent last week does the same thing, along with a
couple of other changes to how the executor handles relations. From a
cursory look at the patch, your take on it looks better than mine. I
will test it tomorrow. Here is a link to my email:
https://www.postgresql.org/message-id/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp
The 4th of my patches implements the array-ification of the executor's range table.
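For readers following along, here is a rough, self-contained sketch of the idea under discussion: fetching the Nth entry from a linked list costs O(n), while a one-off conversion to an array at startup makes each later fetch O(1). This is not code from either patch. The names DemoEntry, DemoCell, demo_list_build, demo_list_nth, demo_list_to_array and demo_array_nth are hypothetical and only stand in for the real range table structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a range table entry (not RangeTblEntry). */
typedef struct DemoEntry
{
    unsigned int relid;
} DemoEntry;

/* Hypothetical singly linked list cell holding one entry. */
typedef struct DemoCell
{
    DemoEntry  *entry;
    struct DemoCell *next;
} DemoCell;

/* Build a list of n entries with relids 1000, 1001, ... (demo data only). */
static DemoCell *
demo_list_build(int n)
{
    DemoCell   *head = NULL;

    for (int i = n - 1; i >= 0; i--)
    {
        DemoCell   *cell = malloc(sizeof(DemoCell));
        DemoEntry  *entry = malloc(sizeof(DemoEntry));

        if (cell == NULL || entry == NULL)
            abort();
        entry->relid = 1000u + (unsigned int) i;
        cell->entry = entry;
        cell->next = head;
        head = cell;
    }
    return head;
}

/* O(n): walk the cells until the 1-based index rti is reached. */
static DemoEntry *
demo_list_nth(DemoCell *head, int rti)
{
    DemoCell   *cell = head;

    if (rti < 1)
        return NULL;
    for (int i = 1; cell != NULL && i < rti; i++)
        cell = cell->next;
    return cell ? cell->entry : NULL;
}

/* One-off O(n) conversion: copy the entry pointers into an array. */
static DemoEntry **
demo_list_to_array(DemoCell *head, int nentries)
{
    DemoEntry **array = calloc((size_t) nentries, sizeof(DemoEntry *));
    int         i = 0;

    if (array == NULL)
        return NULL;
    for (DemoCell *cell = head; cell != NULL && i < nentries; cell = cell->next)
        array[i++] = cell->entry;
    return array;
}

/* O(1): indexes are 1-based here, so subtract one before indexing. */
static DemoEntry *
demo_array_nth(DemoEntry **array, int nentries, int rti)
{
    assert(rti >= 1 && rti <= nentries);
    return array[rti - 1];
}
```

With 300 entries, as in the benchmark above, each list lookup near the end walks roughly 300 cells, whereas the array version pays the walk once in demo_list_to_array and indexes directly afterwards.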
Thanks,
Amit