Quick Links

Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT

From:	Alan Hodgson <ahodgson(at)lists(dot)simkin(dot)ca>
To:	Postgres General <pgsql-general(at)postgresql(dot)org>
Subject:	Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT
Date:	2021-12-06 17:03:02
Message-ID:	bc47e9d0c490942eebb01dc1129097012005e1d2.camel@lists.simkin.ca
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

I keep running into problems like these:

Devs are using an ORM. It really likes to produce queries like:

SELECT "shipment_import_records".* FROM "shipment_import_records"
WHERE shipment_import_records"."shipment_import_id" = 5090609 ORDER
BY "shipment_import_records"."id" ASC LIMIT 1;

I don't know why they do this. Usually it's more like 50 for
pagination which make more sense. But for whatever reason this keeps
coming up.

The table has nearly 29 million records. 5069 of them match
shipment_import_id = 5090609. There is an index
on shipment_import_id, which the planner happily uses without the
LIMIT specifically. Yet with it the query planner will always do
something like:

# explain SELECT "shipment_import_records".* FROM
"shipment_import_records" WHERE
"shipment_import_records"."shipment_import_id" = 5090609 ORDER BY
"shipment_import_records"."id" ASC LIMIT 1;
                                                        QUERY PLAN

---------------------------------------------------------------------
--------------------------------------------------------
Limit (cost=0.44..873.35 rows=1 width=243)
  -> Index Scan using shipment_import_records_pkey on
shipment_import_records (cost=0.44..5122227.70 rows=5868 width=243)
        Filter: (shipment_import_id = 5090609)

.. which takes minutes.

I know I can work around this. Generally I would just drop the index
on shipment_import_id and create one on shipment_import_id,id. Or if
I can get the devs to wrap their query in an inner select with a fake
offset to fool the query planner that works too. But both seem hacky.

Just wondering if there's a knob I can turn to make these more likely
to work without constantly implementing workarounds?

Thanks for any help.

Responses

Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT at 2021-12-06 17:18:11 from Michael Lewis
Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT at 2021-12-06 17:19:09 from Rob Sargent
Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT at 2021-12-06 17:20:44 from Francisco Olarte

Browse pgsql-general by date

	From	Date	Subject
Next Message	Michael Lewis	2021-12-06 17:18:11	Re: Query planner issue with preferring primary key over a better index when using ORDER BY and LIMIT
Previous Message	Avi Weinberg	2021-12-06 14:50:35	Are Foreign Key Disabled During Logical Replication Initial Sync?