Re: BUG #18834: Query planer is choosing the sub-optimal plan when limit is present

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: michael(at)joincandidhealth(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18834: Query planer is choosing the sub-optimal plan when limit is present
Date: 2025-03-07 23:36:39
Message-ID: 1144176.1741390599@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> We observed **poor query performance** when using an **`ORDER BY`** clause
> combined with **`LIMIT`**, which led us to investigate further.

This is a well-known syndrome. The fundamental problem is the poor
selectivity estimate:

> -> Bitmap Index Scan on ix_fn_ledger_provenance
> (cost=0.00..1228.15 rows=8020 width=0) (actual time=2.923..2.924 rows=1
> loops=1)
> Index Cond: (provenance @> '{"locator":
> "52ca2f03-8184-448f-9b9e-7a0ed99e6922"}'::jsonb)

If the thing thinks there are 8020 matches for the @> condition when
there's only one, it's really pure luck if you get an optimal plan.
All its cost estimates will be off by a factor of more than 1000,
and these decisions are by no means linear. They're particularly
not linear in the presence of LIMIT, but query plans can go far
astray even without that.

So what you need to do about this is get a better selectivity
estimate. You might be able to get somewhere with custom statistics
(see CREATE STATISTICS), or just by cranking up the statistics target
for the "provenance" column to the maximum. But I'm not sure how
much that will help. Fundamentally, putting stuff into unstructured
JSON storage and expecting to get efficient searches of it is an
antipattern. A more reliable answer is to change your query.
You could, for example, make an expression index on
(provenance ->> 'locator') and then write

WHERE (provenance ->> 'locator') = '52ca2f03-8184-448f-9b9e-7a0ed99e6922'

That will provide both fast searches and reasonably trustworthy stats
(once ANALYZE has seen the index).

regards, tom lane

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Masahiko Sawada 2025-03-07 23:47:14 Re: BUG #18828: Crash when pg_get_logical_snapshot_meta() passed empty string
Previous Message PG Bug reporting form 2025-03-07 21:57:20 BUG #18836: Cannot import foreign schema under different schema name when foreign table uses row types