From: | "Nick Fankhauser" <nickf(at)ontko(dot)com> |
---|---|
To: | "Manfred Koizar" <mkoi-pg(at)aon(dot)at>, "Pgsql-Performance(at)Postgresql(dot) Org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Seeking help with a query that takes too long |
Date: | 2003-11-12 15:55:48 |
Message-ID: | NEBBLAAHGLEEPCGOBHDGKEJFJGAA.nickf@ontko.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
>(actual time=37.62..677.44 rows=3501 loops=1)
^^^^^^^^^
> Nick, can you find out why this row count estimation is so far off?
It's actually correct:
prod1=# select count(actor_id) from actor where actor_full_name_uppercase
like 'SANDERS%';
count
-------
3501
(1 row)
Of course, I merely chose "SANDERS" arbitrarily as a name that falls
somewhere near the middle of the frequency range for names. SMITH or JONES
would represent a worst-case, and something like KOIZAR would probably be
unique.
Here are the stats:
prod1=# SELECT * FROM pg_stats
prod1-# WHERE tablename='actor' AND attname='actor_full_name_uppercase';
-[ RECORD
1 ]-----+-------------------------------------------------------------------
----------------------------------------------------------------------------
----------------------------------------------------------------------------
-------------------------------------------------------------
schemaname | public
tablename | actor
attname | actor_full_name_uppercase
null_frac | 0.000333333
avg_width | 21
n_distinct | 24215
most_common_vals | {"STATE OF INDIANA","INDIANA DEPARTMENT OF
REVENUE","BARTH CONS SCHOOL CORP","HOWARD COUNTY CLERK","ADVANCED RECOVERY
SERVICES","STATE OF INDIANA-DEPT OF REVENUE","ALLIED COLLECTION SERVICE
INC","CREDIT BUREAU OF LAPORTE","MIDWEST COLLECTION SVC INC","NCO FINANCIAL
SYSTEMS INC"}
most_common_freqs |
{0.0153333,0.0143333,0.00433333,0.00433333,0.004,0.00366667,0.00333333,0.003
33333,0.00266667,0.00266667}
histogram_bounds | {"(POE) ESTELLE, DENISE","BRIEN, LIISI","COTTRELL,
CAROL","FAMILY RENTALS","HAYNES, TAMIKA","KESSLER, VICTORIA","MEFFORD,
VERNON L","PHILLIPS, GERALD L","SHELTON, ANTOINETTE","TRICARICO, MELISSA
SUE","ZUEHLKE, THOMAS L"}
correlation | -0.00147395
I think this means that the average is 357 per actor. As you can see, the
range of assignments varies from people with a single parking ticket to
"State of Indiana", which is party to many thousands of cases.
> BTW, there seem to be missing cases:
> > -> Nested Loop (cost=0.00..2214.66 rows=2 width=115)
> > (actual time=59.05..119929.71 rows=5879 loops=1)
> ^^^^
> > -> Nested Loop (cost=0.00..2205.26 rows=3 width=76)
> > (actual time=51.46..66089.04 rows=5882 loops=1)
This is expected- We actually aggregate data from many county court
databases, with varying levels of data "cleanliness".
Regards,
-Nick
From | Date | Subject | |
---|---|---|---|
Next Message | Nick Fankhauser | 2003-11-12 16:05:10 | Re: Seeking help with a query that takes too long |
Previous Message | scott.marlowe | 2003-11-12 15:53:56 | Re: Value of Quad vs. Dual Processor machine |