From: | Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru> |
---|---|
To: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Removing unneeded self joins |
Date: | 2018-06-25 15:26:28 |
Message-ID: | d44ef452-10b8-7680-2984-84897283fa97@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
David,
I tried to implement your suggestions, here are the patches.
The first patch mostly moves some code around without changing
functionality. It modifies innerrel_is_unique to not only cache the fact
that the relation is unique, but also
cache the index that guarantees uniqueness.
The second patch adds the unique self join removal code. It goes along
the lines of your plan. I didn't even have to examine join clauses
because the constraints imposed by innerrel_is_unique are strong enough
to guarantee that when it finds the same unique index for both
relations, the join can be removed. Namely, it requires mergejoinable
equality clauses for all columns of a unique index.
As a simple benchmark, I measured the duration of query_planner and
remove_useless_self_joins with clock_gettime() on the regression tests.
The following table shows average times in microseconds, median over 11
runs. First row is with this patch, and the second row doesn't call
remove_useless_self_joins and just calls clock_gettime to measure its
overhead.
query_planner remove_useless_self_joins
with removal 39.61 0.61
no removal 39.45 0.38
So, on this workload, unique self join removal adds about 0.2 mcs, or
0.6% of total time, to query_planner. I also tried a query that joins 26
relations, remove_useless_self_joins takes about 40 mcs. Still, this
time grows quadratically with number of relations we have to process, so
in the final patch I limit it to join_collapse_limit, which brings the
time down to 15 mcs. This is negligible compared to the total
query_planner time, which for 8 relations is about 3 ms, that is, 3
orders of magnitude higher.
These benchmarks mostly measure the path where we don't remove any
joins. I didn't time the join removal itself, because it wins quite some
time by allowing to plan one relation and one join less.
--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment | Content-Type | Size |
---|---|---|
0002-Remove-unique-self-joins-v2.patch | text/x-patch | 18.7 KB |
0001-Preparatory-refactoring-v2.patch | text/x-patch | 26.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Jeremy Finzel | 2018-06-25 15:37:18 | Some pgq table rewrite incompatibility with logical decoding? |
Previous Message | Andrew Dunstan | 2018-06-25 14:41:41 | Re: Using JSONB directly from application |