From: | RK Korlapati <rkkorlapati(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Costing bug in hash join logic for semi joins |
Date: | 2018-07-11 17:09:04 |
Message-ID: | CAHupG3gMjnOFpnL=zPaMhOibJ5J4qJ5yH6GjY0dJV5uQAdJcRA@mail.gmail.com |
Lists: | pgsql-hackers |
There is a costing bug in the hash join logic that seems to have been
introduced by the patch for the inner_unique enhancements (commit
9c7f5229ad68d7e0e4dd149e3f80257893e404d4). Specifically, "hashjointuples",
which tracks the number of tuples that match the hash clauses, is computed
incorrectly in the inner_unique case. As a result, many semi joins are
incorrectly turned into inner joins with a unique inner side. The function
"final_cost_hashjoin" has special handling for costing semi/anti joins, to
account for stopping early after the first match. The commit above extended
that handling to the inner_unique case as well. However, the
"hashjointuples" computation was not updated for this new case, so an
inner_unique join that is not a semi join falls into the anti-join branch
and is costed as if only the unmatched outer rows passed the join. The fix
is to handle the inner_unique case when computing "hashjointuples". Here is
an outline of the code that shows the bug.
    void
    final_cost_hashjoin(PlannerInfo *root, HashPath *path,
                        JoinCostWorkspace *workspace,
                        JoinPathExtraData *extra)
    {
        .....
        /* CPU costs */
        if (path->jpath.jointype == JOIN_SEMI ||
            path->jpath.jointype == JOIN_ANTI ||
            extra->inner_unique)
        {
            ......
            /* Get # of tuples that will pass the basic join */
            if (path->jpath.jointype == JOIN_SEMI)
                hashjointuples = outer_matched_rows;
            else
                hashjointuples = outer_path_rows - outer_matched_rows;
        }
        else
        {
            .....
        }
    }
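
To make the effect concrete, here is a minimal standalone sketch of the two computations. It is not the planner code and not a proposed patch: the enum and both function names are hypothetical stand-ins, and the "sketch" variant is only one way to read "handle the inner_unique case", namely by treating JOIN_ANTI as the single case that counts unmatched rows.

```c
#include <assert.h>

/* Hypothetical stand-in for the planner's join types (not PostgreSQL's JoinType). */
typedef enum
{
    SKETCH_JOIN_INNER,
    SKETCH_JOIN_SEMI,
    SKETCH_JOIN_ANTI
} SketchJoinType;

/*
 * Computation as outlined above: every join type other than SEMI takes the
 * anti-join branch, including an inner join that entered the block only
 * because inner_unique was true.
 */
double
hashjointuples_as_posted(SketchJoinType jointype,
                         double outer_path_rows,
                         double outer_matched_rows)
{
    assert(outer_matched_rows <= outer_path_rows);

    if (jointype == SKETCH_JOIN_SEMI)
        return outer_matched_rows;
    else
        return outer_path_rows - outer_matched_rows;
}

/*
 * One possible correction (an assumption, not the committed fix): only ANTI
 * counts the unmatched outer rows; SEMI and inner_unique joins count the
 * matched ones.
 */
double
hashjointuples_sketch(SketchJoinType jointype,
                      double outer_path_rows,
                      double outer_matched_rows)
{
    assert(outer_matched_rows <= outer_path_rows);

    if (jointype == SKETCH_JOIN_ANTI)
        return outer_path_rows - outer_matched_rows;
    else
        return outer_matched_rows;
}
```

With 1000 outer rows of which 900 find a match, an inner_unique inner join gets 100 from the first function and 900 from the second, while semi and anti joins get the same answer from both.
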
Thanks, RK (Salesforce)