From: | "Alexander Staubo" <alex(at)purefiction(dot)net> |
---|---|
To: | "Richard Huxton" <dev(at)archonet(dot)com> |
Cc: | "Postgresql General" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Selecting max(pk) is slow on empty set |
Date: | 2008-01-22 13:16:33 |
Message-ID: | 88daf38c0801220516o7e4d828djda659033824a8aca@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 1/22/08, Richard Huxton <dev(at)archonet(dot)com> wrote:
> Hmm, but with an estimated cost of 3646 (vs.633 for the max(*) which
> uses the wrong index). That explains why it's walking backwards through
> the pkey index, it thinks that it's 8 times cheaper.
[...]
> Have a look at most_common_vals,most_common_freqs in pg_stats for
> tbl=user_messages, att=user perhaps.
# select histogram_bounds from pg_stats where
tablename='user_messages' and attname='user_id';
histogram_bounds
-------------------------------------------------------
{1,489,1097,1824,2555,3452,4488,5679,6879,8637,13448}
# select null_frac, n_distinct, most_common_vals, most_common_freqs
from pg_stats where tablename='user_messages' and attname='user_id';
null_frac | n_distinct | most_common_vals
| most_common_freqs
-----------+------------+--------------------------------------------------+-------------------------------------------------------------------------------------------------------
0 | 2652 |
{5826,484,1206,780,823,4085,4157,5852,1962,6453} |
{0.00933333,0.00766667,0.00666667,0.00633333,0.006,0.00566667,0.00566667,0.00533333,0.005,0.00466667}
> Then see if an ALTER TABLE SET
> STATISTICS 100 makes a difference.
So it does:
# explain analyze select max(id) from user_messages where user_id = 13604;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1320.52..1320.53 rows=1 width=4) (actual
time=13.640..13.640 rows=1 loops=1)
-> Index Scan using user_messages_user on user_messages
(cost=0.00..1319.62 rows=358 width=4) (actual time=13.631..13.631
rows=0 loops=1)
Index Cond: (user_id = 13604)
Total runtime: 13.712 ms
Thank you! That solves my performance problem, at least.
But it's worrying that PostgreSQL should be so off in planning the
query. Does this behaviour qualify as a bug, or is this -- that is,
the need to tweak statistics parameters -- just your garden-variety
application-specific optimization?
Alexander.
From | Date | Subject | |
---|---|---|---|
Next Message | Rhys Stewart | 2008-01-22 13:22:42 | Re: (un)grouping question |
Previous Message | Josh Harrison | 2008-01-22 13:14:18 | Re: Online Oracle to Postgresql data migration |