Re: CTE with JOIN of two tables is much faster than a regular query

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: kpi6288(at)gmail(dot)com, pgsql-general(at)postgresql(dot)org
Subject: Re: CTE with JOIN of two tables is much faster than a regular query
Date: 2018-08-18 15:29:06
Message-ID: 18425.1534606146@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> * kpi6288(at)gmail(dot)com (kpi6288(at)gmail(dot)com) wrote:
>> The CTE mentioned below completes the query in 4.5 seconds while the regular
>> query takes 66 seconds.

> Unfortunately, we don't currently pay attention to things like average
> string length when considering the cost of performing an 'ilike', so we
> figure that doing the filtering first and then the join will be faster,
> but that obviously falls over in some cases, like this one. Using the
> CTE forces PG to (today, at least) do the join first, but that isn't
> really good to rely on.

Well, it's simpler than that: filter quals are always evaluated at
the lowest possible plan level. One of the Berkeley PhD theses that
we ripped out ages ago tried to be smarter about that, but the
cost/benefit/complexity ratio just wasn't very good, mainly because
it's so darn hard to estimate the selectivity of quals on subsets
of relations.

It's not very apparent why the results are so bad in this case,
either. One of the plans has the ILIKE being applied to circa 32600
rows, and the other one runs it on circa 126000 rows. That should
produce less than a 4x penalty, not 14x. Do the rows removed by
the join have significantly-longer-on-average sztext fields?
(If so, the odds that the planner would ever recognize such a
correlation seem pretty small.)

In any case, given that the ILIKE selects so few rows (and the planner
knows it!), finding a way to index that is clearly the right answer.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message kpi6288 2018-08-18 15:43:23 AW: CTE with JOIN of two tables is much faster than a regular query
Previous Message Oleksii Kliukin 2018-08-18 15:12:43 Re: regex match and special characters