From: Nick Cleaton <nick(at)cleaton(dot)net>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: A limit clause can cause a poor index choice
Date: 2020-05-19 19:00:04
Message-ID: CAFgz3ktdjz32scbduyY4kJDt1Bz5O2RXXev07MAE5_O=o0PV7g@mail.gmail.com
Lists: pgsql-general
The attached script builds a 10G test table that demonstrates a
problem we have in production with PostgreSQL 12.3-1.pgdg18.04+1
on Ubuntu Linux. Indexes:

test_orders_o_date_idx btree(o_date)
test_orders_customer_id_o_date_idx btree(customer_id, o_date)
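
For reference, the corresponding index definitions would look like this
(names and key columns are taken from the list above; the sketch assumes
plain btree indexes with no INCLUDE columns or partial predicates):

```sql
-- assumed DDL matching the index names/columns listed above
CREATE INDEX test_orders_o_date_idx
    ON test_orders (o_date);
CREATE INDEX test_orders_customer_id_o_date_idx
    ON test_orders (customer_id, o_date);
```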
We query for the most recent orders for sets of customers, and
sometimes none of those customers have any orders, so the result
set is empty:
explain analyze select * from test_orders where
customer_id=any(ARRAY[9993,9997,9912,9954,9100,9101,9102,9234,9500,9512])
order by o_date desc;

                                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=24848.96..24870.67 rows=8686 width=1839) (actual time=1.101..1.102 rows=0 loops=1)
   Sort Key: o_date DESC
   Sort Method: quicksort  Memory: 25kB
   ->  Index Scan using test_orders_customer_id_o_date_idx on test_orders  (cost=0.43..17361.20 rows=8686 width=1839) (actual time=1.047..1.047 rows=0 loops=1)
         Index Cond: (customer_id = ANY ('{9993,9997,9912,9954,9100,9101,9102,9234,9500,9512}'::integer[]))
 Planning Time: 3.821 ms
 Execution Time: 1.174 ms
(7 rows)
So far so good. But if we add a limit clause to the query then the
plan goes very wrong:
explain analyze select * from test_orders where
customer_id=any(ARRAY[9993,9997,9912,9954,9100,9101,9102,9234,9500,9512])
order by o_date desc limit 10;

                                                                    QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..1660.98 rows=10 width=1839) (actual time=4990.424..4990.424 rows=0 loops=1)
   ->  Index Scan Backward using test_orders_o_date_idx on test_orders  (cost=0.43..1442355.43 rows=8686 width=1839) (actual time=4990.423..4990.423 rows=0 loops=1)
         Filter: (customer_id = ANY ('{9993,9997,9912,9954,9100,9101,9102,9234,9500,9512}'::integer[]))
         Rows Removed by Filter: 5000000
 Planning Time: 0.063 ms
 Execution Time: 4990.435 ms
Is there something we can adjust to get it to prefer
test_orders_customer_id_o_date_idx even when there's a limit clause?
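
One workaround we could try (a sketch only, not benchmarked against this
table) is to hide the LIMIT from the planner's ORDER BY costing by
materializing the customer filter first. The MATERIALIZED keyword exists
in PostgreSQL 12 and acts as an optimization fence, so the planner must
use the customer_id index for the inner query before sorting:

```sql
-- Materialize the customer filter so the planner cannot choose a
-- backward scan of test_orders_o_date_idx to satisfy the LIMIT.
WITH matching AS MATERIALIZED (
    SELECT *
    FROM test_orders
    WHERE customer_id = ANY (ARRAY[9993,9997,9912,9954,9100,9101,9102,9234,9500,9512])
)
SELECT *
FROM matching
ORDER BY o_date DESC
LIMIT 10;
```

The trade-off is that all matching rows are materialized before the sort
and limit are applied, which is cheap when the customer set matches few
rows but could hurt for customers with very large order histories.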
Attachment: testdata.py (text/x-python, 1.4 KB)