From: | Rajesh Kumar Mallah <mallah(at)trade-india(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: planner question.. |
Date: | 2003-04-17 15:10:06 |
Message-ID: | 200304172040.06654.mallah@trade-india.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
On Thursday 17 Apr 2003 7:56 pm, you wrote:
> Rajesh Kumar Mallah <mallah(at)trade-india(dot)com> writes:
> > For a distribution of data like below why does the planner
> > choses to do an index scan by default for source = 'REGIS' when > 50%
> > of the rows are having source='REGIS'.
>
> Are there a huge number of dead rows in the table? ("vacuum verbose"
> would give some info)
>
Yes as you predicted after a vacuum verbose the planner did
switch to seq scan and was only marginally slower than index
scan. (do u think we shud live with it?)
so whats the moral ?
ANALYZE itself is not sufficient in updating pg_statistic,
we must do VACUUM ANALYZE always ,
if so why would a seperate ANALYZE command exist ?
here is the info (in the same order as commands were executed)
tradein_clients=# explain analyze SELECT count(*) from general.profile_master where source='REGIS';
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Aggregate (cost=18225.48..18225.48 rows=1 width=0) (actual time=6295.33..6295.34 rows=1 loops=1) |
| -> Index Scan using profile_master_index_source on profile_master (cost=0.00..18213.31 rows=4871 width=0) (actual time=14.54..6132.61 rows=102090 loops=1) |
| Index Cond: (source = 'REGIS'::character varying) |
| Total runtime: 6295.41 msec |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
(4 rows)
tradein_clients=# VACUUM VERBOSE general.profile_master;
INFO: --Relation general.profile_master--
INFO: Index profile_master_profile_id_pkey: Pages 426; Tuples 178031: Deleted 11802.
CPU 0.02s/0.12u sec elapsed 0.24 sec.
INFO: Index profile_master_index_userid: Pages 431; Tuples 178031: Deleted 8940.
CPU 0.00s/0.13u sec elapsed 0.14 sec.
INFO: Index profile_master_index_creation_date: Pages 420; Tuples 178031: Deleted 8940.
CPU 0.02s/0.12u sec elapsed 0.20 sec.
INFO: Index profile_master_index_company_id: Pages 419; Tuples 178031: Deleted 8940.
CPU 0.01s/0.10u sec elapsed 0.15 sec.
INFO: Index profile_master_index_eyp_list_id: Pages 419; Tuples 178031: Deleted 8940.
CPU 0.01s/0.11u sec elapsed 0.45 sec.
INFO: Index profile_master_index_iid_list_id: Pages 420; Tuples 178031: Deleted 8940.
CPU 0.01s/0.10u sec elapsed 0.46 sec.
INFO: Index profile_master_index_ip_list_id: Pages 421; Tuples 178031: Deleted 8940.
CPU 0.00s/0.11u sec elapsed 0.12 sec.
INFO: Index profile_master_index_catalog_company_id: Pages 419; Tuples 178031: Deleted 8940.
CPU 0.00s/0.12u sec elapsed 0.46 sec.
INFO: Index profile_master_index_source: Pages 619; Tuples 178031: Deleted 8940.
CPU 0.02s/0.09u sec elapsed 0.29 sec.
INFO: Removed 11802 tuples in 495 pages.
CPU 0.01s/0.03u sec elapsed 0.14 sec.
INFO: Pages 23923: Changed 36, Empty 0; Tup 178031: Vac 11802, Keep 0, UnUsed 325029.
Total CPU 0.96s/1.19u sec elapsed 11.97 sec.
INFO: --Relation pg_toast.pg_toast_122045388--
INFO: Pages 31: Changed 0, Empty 0; Tup 189: Vac 0, Keep 0, UnUsed 0.
Total CPU 0.00s/0.00u sec elapsed 0.03 sec.
VACUUM
tradein_clients=# explain analyze SELECT count(*) from general.profile_master where source='REGIS';
+-----------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
+-----------------------------------------------------------------------------------------------------------------------------+
| Aggregate (cost=26398.37..26398.37 rows=1 width=0) (actual time=8355.76..8355.76 rows=1 loops=1) |
| -> Seq Scan on profile_master (cost=0.00..26148.39 rows=99994 width=0) (actual time=16.23..8237.53 rows=102090 loops=1) |
| Filter: (source = 'REGIS'::character varying) |
| Total runtime: 8355.88 msec |
+-----------------------------------------------------------------------------------------------------------------------------+
(4 rows)
> The given result seems suspect; an indexscan couldn't possibly read >50%
> of the rows in less than a quarter of the time for a seqscan. Unless
> (a) the table contains vast amounts of empty space that the seqscan has to
> slog through, or
Sorry i dont understand the vacuum verbose output throughly , does it looks like
though ?
(b) your second measurement is bogus due to caching
> performed by the first measurement.
I am not sure this time but i have the habit of running
EXPLAIN ANALYZE thrice and post the middle one ;-)
>
> Also, might the table be in order by the "source" column? A
> sufficiently high correlation might have persuaded the planner to try an
> indexscan even if point (a) isn't true.
Yes the data is loaded from one source completely before loading
from another source. So they were in order but i did a lot of updates
and deletes.
regds
mallah.
>
> regards, tom lane
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
--
Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L) ,9811255597 (M)
Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.
From | Date | Subject | |
---|---|---|---|
Next Message | Rajesh Kumar Mallah | 2003-04-17 15:12:16 | Re: planner question.. |
Previous Message | Rajesh Kumar Mallah | 2003-04-17 14:56:36 | Re: analyse question.. |