From: | Michael Lewis <mlewis(at)entrata(dot)com> |
---|---|
To: | Mohamed Wael Khobalatte <mkhobalatte(at)grubhub(dot)com> |
Cc: | Nick Cleaton <nick(at)cleaton(dot)net>, PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: A limit clause can cause a poor index choice |
Date: | 2020-05-19 21:15:57 |
Message-ID: | CAHOFxGreoSAfz0dwoCGPCFJzbLv1h_aP=ojhDSudOhGEAZG8pw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
What does pg_stats say about column customer_id? Specifically, how many
ndistinct, and what is the sum of the most common values? If you have 1000
distinct customer_id values, and the (default 100) most common values only
cover 2% of the total rows, then the optimizer will assume that any given
customer_id will yield approx reltuples * .98 / ( 5000 - 100 ) rows. So if
your table has 1 million rows, your estimate might be that there should be
200 rows in the table per customer_id in your array.
Looking at your query plan, the optimizer expects rows=8686 for those
customer_id and it knows you only want 10 of the most recent ones. It made
the right call based on the information it has.
Increase default_statistics_target, at least on that column, and see if you
get a much much better plan. I don't know where I got this query from
online, but here ya go. I'd be curious how frac_MCV in this changes
when default_statistics_target is more like 250 or 500 and the table is
analyzed again to reflect that change.
SELECT
( SELECT SUM (x) FROM UNNEST (most_common_freqs) x ) frac_MCV,
tablename,
attname,
inherited,
null_frac,
n_distinct,
array_length(most_common_vals,1) n_mcv,
array_length(histogram_bounds,1) n_hist,
correlation,
*
FROM pg_stats
WHERE
schemaname = 'public'
AND tablename='test_orders'
AND attname='customer_id'
ORDER BY 1;
From | Date | Subject | |
---|---|---|---|
Next Message | Alfonso | 2020-05-19 21:56:23 | Q: Comparing PostgreSQL and Oracle |
Previous Message | Mohamed Wael Khobalatte | 2020-05-19 20:56:43 | Re: A limit clause can cause a poor index choice |