From: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: POC: converting Lists into arrays |
Date: | 2019-07-21 13:32:00 |
Message-ID: | CAKJS1f8ZcsLVgkF4wOfRyMYTcPgLFiUAOedFC+U2vK_aFZk-BA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, 17 Jul 2019 at 11:06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> (Actually, I doubt that any of these changes will really move the
> performance needle in the real world. It's more a case of wanting
> the code to present good examples not bad ones.)
In spirit with the above, I'd quite like to fix a small bad example
that I ended up with in nodeAppend.c and nodeMergeappend.c for
run-time partition pruning.
The code in question performs a loop over a list and checks
bms_is_member() on each element and only performs an action if the
member is present in the Bitmapset.
It would seem much more efficient just to perform a bms_next_member()
type loop then just fetch the list item with list_nth(), at least this
is certainly the case when only a small number of the list items are
indexed by the Bitmapset. With these two loops in particular, when a
large number of list items are in the set the cost of the work goes up
greatly, so it does not seem unreasonable to optimise the case for
when just a few match.
A quick test shows that it's hardly groundbreaking performance-wise,
but test 1 does seem measurable above the noise.
-- Setup
plan_cache_mode = force_generic_plan
max_locks_per_transaction = 256
create table ht (a int primary key, b int, c int) partition by hash (a);
select 'create table ht' || x::text || ' partition of ht for values
with (modulus 8192, remainder ' || (x)::text || ');' from
generate_series(0,8191) x;
\gexec
-- Test 1: Just one member in the Bitmapset.
test1.sql:
\set p 1
select * from ht where a = :p
Master:
$ pgbench -n -f test1.sql -T 60 -M prepared postgres
tps = 297.267191 (excluding connections establishing)
tps = 298.276797 (excluding connections establishing)
tps = 296.264459 (excluding connections establishing)
tps = 298.968037 (excluding connections establishing)
tps = 298.575684 (excluding connections establishing)
Patched:
$ pgbench -n -f test1.sql -T 60 -M prepared postgres
tps = 300.924254 (excluding connections establishing)
tps = 299.360196 (excluding connections establishing)
tps = 300.197024 (excluding connections establishing)
tps = 299.741215 (excluding connections establishing)
tps = 299.748088 (excluding connections establishing)
0.71% faster
-- Test 2: when all list items are found in the Bitmapset.
test2.sql:
select * from ht;
Master:
$ pgbench -n -f test2.sql -T 60 -M prepared postgres
tps = 12.526578 (excluding connections establishing)
tps = 12.528046 (excluding connections establishing)
tps = 12.491347 (excluding connections establishing)
tps = 12.538292 (excluding connections establishing)
tps = 12.528959 (excluding connections establishing)
Patched:
$ pgbench -n -f test2.sql -T 60 -M prepared postgres
tps = 12.503670 (excluding connections establishing)
tps = 12.516133 (excluding connections establishing)
tps = 12.404925 (excluding connections establishing)
tps = 12.514567 (excluding connections establishing)
tps = 12.541484 (excluding connections establishing)
0.21% slower
With that removed the slowness of test 1 is almost entirely in
AcquireExecutorLocks() and ExecCheckRTPerms(). We'd be up close to
about 30k tps instead of 300 tps if there was some solution to those
problems. I think it makes sense to remove the inefficient loops and
leave the just final two bottlenecks, in the meantime.
Patch attached.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment | Content-Type | Size |
---|---|---|
list_fixups_list_nth.patch | application/octet-stream | 2.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2019-07-21 13:47:31 | Re: Compiler warnings with MinGW |
Previous Message | Andrew Dunstan | 2019-07-21 13:25:55 | Re: Compile from source using latest Microsoft Windows SDK |