From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | oleg(at)sai(dot)msu(dot)su, teodor(at)sigaev(dot)ru, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | intarray planning/exeuction problem with multiple operators |
Date: | 2015-07-13 06:12:41 |
Message-ID: | CAMkU=1yx952u=tc8o1Ed1LZgLceopJ3MR5n0OAP84tPWn_jNtg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I've found an interesting performance problem in the intarray extension
module. It doesn't seem to be version dependent, I've verified it in 9.4.4
and 9.6devel.
If I do a query that involves both an = op and a @@ op ANDed together, it
gives a high cost estimate, which is justified by the slow runtime. If I
omit the @@ it gives a low estimate, also justified. If I trick it into
thinking it cannot use the index to satisfy the @@ op, then it gives a low
estimate and low runtime, applying the @@ in the filter step and only the
fast = in the bitmap index scan.
Now it could use the index for the fast = operation and apply the @@ in the
recheck, rather than the filter. But I guess it doesn't think of that,
despite knowing that applying the @@ in index operation is slow.
So it seems to be missing a trick someplace, but I don't if it reasonable
to expect it to find that trick, or how easy/hard it would be to implement.
The proposed selectivity estimate improvement patch for @@ does not fix
this (and evaluating that patch was how I found this issue.)
Set up:
create table foobar as
select (
select
array_agg(floor(sqrt(random()*10000000+g.y/1000000+f.x/10000000))::int)
from generate_series(1,10) g(y)
) foo
from generate_series(1,10000000) f(x);
create index on foobar using gin(foo gin__int_ops);
analyze;
You can knock an order of magnitude off from the table size and should
still be able to see the problem.
example:
explain(analyze, buffers) select * from foobar where foo =
'{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}' and foo @@
'!1'::query_int;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Seq Scan on foobar (cost=0.00..263637.00 rows=1 width=61) (actual
time=9717.957..9717.957 rows=0 loops=1)
Filter: ((foo @@ '!1'::query_int) AND (foo =
'{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[]))
Rows Removed by Filter: 10000000
Buffers: shared hit=64 read=113573
I/O Timings: read=361.402
Planning time: 0.101 ms
Execution time: 9717.991 ms
(7 rows)
explain(analyze, buffers) select * from foobar where foo =
'{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}' ;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on foobar (cost=112.01..116.02 rows=1 width=61) (actual
time=0.027..0.027 rows=0 loops=1)
Recheck Cond: (foo =
'{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
Buffers: shared hit=21
-> Bitmap Index Scan on foobar_foo_idx (cost=0.00..112.01 rows=1
width=0) (actual time=0.023..0.023 rows=0 loops=1)
Index Cond: (foo =
'{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
Buffers: shared hit=21
Planning time: 0.097 ms
Execution time: 0.061 ms
If I trick it into thinking the @@ operator cannot be used in th eindex,
then it gets really fast again:
explain(analyze, buffers) select * from foobar where foo =
'{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}' and foo||'{}'
@@ '!1'::query_int;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on foobar (cost=112.01..116.03 rows=1 width=61) (actual
time=0.082..0.082 rows=0 loops=1)
Recheck Cond: (foo =
'{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
Filter: ((foo || '{}'::integer[]) @@ '!1'::query_int)
Buffers: shared hit=21
-> Bitmap Index Scan on foobar_foo_idx (cost=0.00..112.01 rows=1
width=0) (actual time=0.080..0.080 rows=0 loops=1)
Index Cond: (foo =
'{22046,26347,6816,21401,18802,31318,30628,8027,22217,20984}'::integer[])
Buffers: shared hit=21
Planning time: 0.139 ms
Execution time: 0.129 ms
Cheers,
Jeff
From | Date | Subject | |
---|---|---|---|
Next Message | Haribabu Kommi | 2015-07-13 07:16:37 | Re: optimizing vacuum truncation scans |
Previous Message | Michael Paquier | 2015-07-13 04:25:44 | Re: A little RLS oversight? |