From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: nested loop semijoin estimates |
Date: | 2015-05-30 21:46:58 |
Message-ID: | 556A2FD2.5020302@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 05/30/15 23:16, Tom Lane wrote:
> I wrote:
>> So what this seems to mean is that for SEMI/ANTI join cases, we have to
>> postpone all of the inner scan cost determination to final_cost_nestloop,
>> so that we can do this differently depending on whether
>> has_indexed_join_quals() is true. That's a little bit annoying because it
>> will mean we take the shortcut exit less often; but since SEMI/ANTI joins
>> aren't that common, it's probably not going to be a big planning time hit.
>
> Attached is a draft patch for that. It fixes the problem for me:
>
> Nested Loop Semi Join (cost=0.99..9.09 rows=1 width=74) (actual time=0.591..1.554 rows=2 loops=1)
> -> Index Scan using term_facttablename_columnname_idx on term t (cost=0.55..8.57 rows=1 width=74) (actual time=0.022..0.025 rows=2 loops=1)
> Index Cond: (((facttablename)::text = 'facttable_stat_fta4'::text) AND ((columnname)::text = 'berechnungsart'::text))
> -> Index Only Scan using facttable_stat_fta4_berechnungsart_idx on facttable_stat_fta4 f (cost=0.43..143244.98 rows=5015134 width=2) (actual time=0.759..0.759 rows=1 loops=2)
> Index Cond: (berechnungsart = (t.term)::text)
> Heap Fetches: 0
> Planning time: 0.545 ms
> Execution time: 1.615 ms
Seems to be working OK, but I still do get a Bitmap Heap Scan there (but
more about that later).
Do you plan to push that into 9.5, or 9.6? I assume it's a behavior
change so that no back-patching, right?
>
>> Not sure yet about your other point about the indexscan getting
>> rejected too soon. That doesn't seem to be happening for me, at
>> least not in HEAD.
>
> I do see something of the sort if I turn off enable_indexonlyscan.
> Not sure about a good fix for that aspect. It may not be terribly
> critical, since AFAICS index-only scans generally ought to apply
> in these cases.
Hmmm, a VACUUM FREEZE fixed that for me. The reason is that right after
loading the testcase, I do get this:
Index Only Scan using facttable_stat_fta4_berechnungsart_idx
on facttable_stat_fta4 f (cost=0.43..280220.51 rows=5000016 width=2)
and after VACUUM FREEZE I do get this:
Index Only Scan using facttable_stat_fta4_berechnungsart_idx
on facttable_stat_fta4 f (cost=0.43..142344.43 rows=5000000 width=2)
and the Bitmap Heap Scan case looks like this:
Bitmap Heap Scan on facttable_stat_fta4 f
(cost=93594.56..200342.76 rows=5000016 width=2)
so it's cheaper (total cost) than the index only scan before freezing,
and more expensive than index only scan after freezing.
I still think this is wrong (or rather "suboptimal") - there are
probably cases where even the "freezed" index only scan is more
expensive than a bitmap heap scan, and in that case the it won't be
used, although it'd be much faster.
Another example is a query with a plain index scan, e.g. consider this
slight modification of the query:
SELECT facttablename, columnname, term FROM term t WHERE
facttablename='facttable_stat_fta4' AND columnname='berechnungsart' AND
EXISTS (SELECT 1 FROM facttable_stat_fta4 f WHERE
f.berechnungsart=t.term AND einheit IS NOT NULL);
This will result in bitmap index scan no matter the visibility.
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2015-05-30 22:12:17 | Re: Free indexed_tlist memory explicitly within set_plan_refs() |
Previous Message | Andres Freund | 2015-05-30 21:23:36 | Re: [CORE] postpone next week's release |