From: | Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Calculation of param_source_rels in add_paths_to_joinrel |
Date: | 2016-11-11 07:00:40 |
Message-ID: | CAFjFpRdpjbQzDAA++S1uDHbPOX7oxSS+YncDQZO64nKWRU-aVA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Nov 5, 2016 at 2:16 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> writes:
>> There's code in add_paths_to_joinrel() which computes the set of
>> target relations that should overlap parameterization of any proposed
>> join path.
>> ...
>> The calculations that follow are based on joinrel->relids (baserels
>> covered by the join) and SpecialJoinInfo list in PlannerInfo. It is
>> not based on specific combination of relations being joined or the
>> paths being generated. We should probably do this computation once and
>> store the result in the joinrel and use it multiple times. That way we
>> can avoid computing the same set again and again for every pair of
>> joining relations and their order. Any reasons why we don't do this?
>
> I'm not terribly excited about this. The issue is strictly local to
> add_paths_to_joinrel, but putting that set in a global data structure
> makes it nonlocal, and makes it that much harder to tweak the algorithm
> if we think of a better way. (In particular, I think it's not all that
> obvious that the set must be independent of which two subset relations
> we are currently joining.)
Right now it appears that for every subset of relations, we have
different param_source_rels, which is clearly not. It takes a bit of
time to understand that. Adding it to a global data structure will at
least make the current implementation clear i.e param_source_rels does
not change with subset of relations being joined.
>
> If you can show a measurable performance improvement from this change,
> then maybe those downsides are acceptable. But I do not think we should
> commit it without a demonstrated performance benefit from the added
> complexity and loss of flexibility.
I couldn't find a measurable time difference with or without my patch,
so multiple computations of param_source_rels aren't taking noticeable
time. I used following queries to measure the planning time through
explain analyze.
create view pc_view as select c1.oid c1o, c2.oid c2o, c3.oid c3o from
pg_class c1, pg_class c2 left join pg_class c3 on (c2.oid = c3.oid)
where c1.oid = c2.oid and c1.oid = c3.oid and c1.relname = c3.relname;
select v1, v2, v3 from pc_view v1, pc_view v2 left join pc_view v3 on
(v2.c3o = v3.c1o), pc_view v4 where v1.c3o = v2.c2o and v1.c2o =
v4.c3o limit 0;
>
>> Also, the way this code has been written, the declaration of variable
>> sjinfo masks the earlier declaration with the same name. I am not sure
>> if that's intentional, but may be we should use another variable name
>> for the inner sjinfo. I have not included that change in the patch.
>
> Hmm, yeah, that's probably not terribly good coding practice.
Attached a patch to fix this.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Attachment | Content-Type | Size |
---|---|---|
pg_param_source_rels_v2.patch | text/x-patch | 2.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tsunakawa, Takayuki | 2016-11-11 07:03:49 | Re: Patch: Implement failover on libpq connect level. |
Previous Message | Masahiko Sawada | 2016-11-11 06:38:15 | Re: Transactions involving multiple postgres foreign servers |