From: | Alan Hodgson <ahodgson(at)lists(dot)simkin(dot)ca> |
---|---|
To: | Postgres General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT |
Date: | 2021-12-06 17:57:53 |
Message-ID: | ac6b2c9214684d7645554b0673d4a7148c0a52e9.camel@lists.simkin.ca |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Mon, 2021-12-06 at 18:20 +0100, Francisco Olarte wrote:
>
> Can you post an explain analyze? To me it seems like the planner
> thinks shipment_import_id is randomly distributed and the table is
> well correlated with it's PK, so scanning it for the first id
> should
> be fast.
#explain analyze SELECT "shipment_import_records".* FROM
shipment_import_records" WHERE
shipment_import_records"."shipment_import_id" = 5090609 ORDER BY
"shipment_import_records"."id" ASC LIMIT 1;
------------------------------------------------------------------
---
------------------------------------------------------------------
---
-------------------------------------
Limit (cost=0.44..873.08 rows=1 width=243) (actual
time=31689.725..31689.726 rows=1 loops=1)
-> Index Scan using shipment_import_records_pkey on
shipment_import_records (cost=0.44..5122405.71 rows=5870
width=243)
(actual time=31689.723..31689.724 rows=1 loops=1)
Filter: (shipment_import_id = 5090609)
Rows Removed by Filter: 28710802
Planning Time: 0.994 ms
Execution Time: 31689.744 ms
(6 rows)
The biggest one (but yes "earlier"):
# explain analyze SELECT "shipment_import_records".* FROM
"shipment_import_records" WHERE
"shipment_import_records"."shipment_import_id" = 1247888 ORDER BY
"shipment_import_records"."id" ASC LIMIT 1;
QUERY PLAN
---------------------------------------------------------------------
---------------------------------------------------------------------
----------------------------
--------
Limit (cost=0.44..426.59 rows=1 width=243) (actual
time=8007.069..8007.070 rows=1 loops=1)
-> Index Scan using shipment_import_records_pkey on
shipment_import_records (cost=0.44..5126628.40 rows=12030 width=243)
(actual time=8007.068..8007.068 rows=1 l
oops=1)
Filter: (shipment_import_id = 1247888)
Rows Removed by Filter: 10929193
Planning Time: 0.584 ms
Execution Time: 8007.086 ms
(6 rows)
And the smallest/latest, which actually uses the "right" index:
# explain analyze SELECT "shipment_import_records".* FROM
"shipment_import_records" WHERE
"shipment_import_records"."shipment_import_id" = 5116174 ORDER BY
"shipment_import_records"."id" ASC LIMIT 1;
QUERY PLAN
---------------------------------------------------------------------
---------------------------------------------------------------------
----------------------------
------------------------
Limit (cost=145.44..145.44 rows=1 width=243) (actual
time=0.018..0.018 rows=1 loops=1)
-> Sort (cost=145.44..145.64 rows=79 width=243) (actual
time=0.017..0.018 rows=1 loops=1)
Sort Key: id
Sort Method: quicksort Memory: 26kB
-> Index Scan using
index_shipment_import_records_on_shipment_import_id on
shipment_import_records (cost=0.44..145.05 rows=79 width=243)
(actual time=0.013
..0.014 rows=1 loops=1)
Index Cond: (shipment_import_id = 5116174)
Planning Time: 0.104 ms
Execution Time: 0.032 ms
(8 rows)
>
> But from the names of the field you may have correlation between
> shipment_import_id and id hidden somewhere ( like they are two
> serial
> growing together, you query for the latest shipment ids and it
> scans
> all the table ). An explain analyze should show that ( or three,
> one
> for that shipment import id, one for 1, one for a really big one )
This is definitely the case. And we are generally looking for newer
data for most operations.
Thanks for looking at it.
From | Date | Subject | |
---|---|---|---|
Next Message | Alan Hodgson | 2021-12-06 18:02:44 | Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT |
Previous Message | Alan Hodgson | 2021-12-06 17:41:46 | Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT |