From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Ed L(dot)" <pgsql(at)bluepolka(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: Why won't it index scan?
Date: 2006-05-23 22:55:55
Message-ID: 20060523225555.GL64371@pervasive.com
Lists: pgsql-general
On Tue, May 23, 2006 at 06:18:07PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
> > On Mon, May 22, 2006 at 06:43:22PM -0400, Tom Lane wrote:
> >> It would be nice to have some *evidence*, not unsupported handwaving.
>
> > If someone has an idea on how to actually get that evidence, I'm all
> > ears.
>
> Well, for example, actually measuring the planner overhead from larger
> pg_statistic entries would be interesting. Plus how much more time
> ANALYZE takes to generate the entries. (I'm afraid that ANALYZE is
> probably worse-than-linear CPU-wise, but it may be dominated by disk
> access.)
How should I go about measuring planner time? Subtract the runtime
reported by EXPLAIN ANALYZE from the \timing total?
Well, I did find one reason not to go ape with this: the number of pages
analyzed scales with the number of buckets, so doubling the statistics
target will roughly double the ANALYZE time for any table over 6000
pages (though the effect isn't linear; see below). There is a small
increase in time for a small table as well, but I doubt it's enough for
anyone to care:
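For reference, the sample sizes ANALYZE reports in the session below line up with the 300-rows-per-target sampling rule PostgreSQL uses (std_typanalyze sets the sample to 300 * statistics_target rows). A quick sketch, using only the targets and figures quoted here:

```python
# PostgreSQL samples roughly 300 * statistics_target rows per ANALYZE
# (the constant comes from std_typanalyze in the backend).
def sample_rows(statistics_target: int) -> int:
    return 300 * statistics_target

# Figures quoted in the session below:
assert sample_rows(100) == 30000  # "30000 rows in sample" at target 100
assert sample_rows(10) == 3000    # "3000 rows in sample" at target 10

# The 17-page branches table fits entirely in the sample at either
# target, which is why its ANALYZE time barely moves.
```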
(single AMD64 CPU, 1GB memory, FreeBSD 6.0, software RAID1)
bench=# set default_statistics_target= 100;
SET
Time: 0.320 ms
bench=# analyze verbose accounts;analyze verbose branches;
INFO: analyzing "public.accounts"
INFO: "accounts": scanned 30000 of 5172414 pages, containing 1740000 live rows and 0 dead rows; 30000 rows in sample, 300000012 estimated total rows
ANALYZE
Time: 198892.080 ms
INFO: analyzing "public.branches"
INFO: "branches": scanned 17 of 17 pages, containing 3000 live rows and 0 dead rows; 3000 rows in sample, 3000 estimated total rows
ANALYZE
Time: 25.133 ms
bench=# set default_statistics_target= 10;
SET
Time: 0.212 ms
bench=# analyze verbose accounts;analyze verbose branches;
INFO: analyzing "public.accounts"
INFO: "accounts": scanned 3000 of 5172414 pages, containing 174000 live rows and 0 dead rows; 3000 rows in sample, 300000012 estimated total rows
ANALYZE
Time: 27227.885 ms
INFO: analyzing "public.branches"
INFO: "branches": scanned 17 of 17 pages, containing 3000 live rows and 0 dead rows; 3000 rows in sample, 3000 estimated total rows
ANALYZE
Time: 1.973 ms
bench=# analyze branches;
ANALYZE
Time: 2.016 ms
bench=# analyze branches;
ANALYZE
Time: 2.009 ms
bench=# set default_statistics_target= 100;
SET
Time: 0.210 ms
bench=# analyze branches;
ANALYZE
Time: 2.231 ms
bench=# analyze branches;
ANALYZE
Time: 2.346 ms
bench=# analyze branches;
ANALYZE
Time: 9.220 ms
bench=# analyze branches;
ANALYZE
Time: 2.057 ms
bench=#
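Pulling the ratios out of the accounts timings above (a back-of-the-envelope check using only the numbers quoted in this session):

```python
# Timings and sample sizes quoted above for the accounts table.
time_t100_ms = 198892.080  # ANALYZE at default_statistics_target = 100
time_t10_ms = 27227.885    # ANALYZE at default_statistics_target = 10
pages_t100 = 30000         # pages scanned at target 100
pages_t10 = 3000           # pages scanned at target 10

page_ratio = pages_t100 / pages_t10      # 10x as many pages read
time_ratio = time_t100_ms / time_t10_ms  # but only ~7.3x the time

print(f"{page_ratio:.0f}x pages -> {time_ratio:.1f}x time")
```

So reading 10x as many pages cost only about 7.3x the wall-clock time here: sub-linear, which is what I meant by the effect not being linear.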
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461