From: | Tomasz Ostrowski <tometzky+pg(at)ato(dot)waw(dot)pl> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> |
Cc: | Josh berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [BUGS] Routine analyze of single column prevents standard autoanalyze from running at all |
Date: | 2016-06-08 06:19:58 |
Message-ID: | 3eebdfc2-95dc-9ede-7467-47a3c26b8892@ato.waw.pl |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs pgsql-hackers |
W dniu 2016-06-08 o 05:04, Tom Lane pisze:
> Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> writes:
>> Is there any significant advantage to not analyzing all columns? Only
>> case I can think of is if you have a fair number of columns that have
>> been toasted; otherwise I'd think IO would completely swamp any other
>> considerations.
>
> Yeah, my guess is that the OP's example where analyzing just one column
> was significantly cheaper boiled down to some of the other columns being
> mostly toasted data. Otherwise it's hard to see how there's much more
> expense in analyzing them all.
Actually no - this volatile column has smaller "statistics" than most of
the table, so analyzing it is much faster when it's data is not in RAM.
Here is a small exaggerated example showing a difference:
$ psql
tometzky=> create table test (id serial, data text);
tometzky=> insert into test(data)
select 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. '
||generate_series(0,10000000)::text;
tometzky=> alter table test alter column id set statistics 10;
tometzky=> alter table test alter column data set statistics 1000;
tometzky=> \q
# Drop OS page cache and restart postgres
# so the table data won't be in RAM anymore:
$ sudo sh -c 'echo 1 >/proc/sys/vm/drop_caches'
$ sudo systemctl restart postgresql;
# Test single column analyze:
$ psql
tometzky=> \timing
Timing is on.
tometzky=> analyze verbose test(id);
INFO: analyzing "public.test"
INFO: "test": scanned 3000 of 123457 pages, containing 243000 live rows
and 0 dead rows; 3000 rows in sample, 2712238 estimated total rows
ANALYZE
Time: 422,521 ms
tometzky=> \q
# Drop OS page cache and restart postgres again
$ sudo sh -c 'echo 1 >/proc/sys/vm/drop_caches'
$ sudo systemctl restart postgresql;
$ psql
tometzky=> \timing
Timing is on.
tometzky=> analyze verbose test;
INFO: analyzing "public.test"
INFO: "test": scanned 123457 of 123457 pages, containing 10000001 live
rows and 0 dead rows; 300000 rows in sample, 10000001 estimated total rows
ANALYZE
Time: 9447,519 ms
--
Tomasz "Tometzky" Ostrowski
From | Date | Subject | |
---|---|---|---|
Next Message | Emiel Hermsen | 2016-06-08 08:39:14 | Re: Case in Order By Ignored without warning or error |
Previous Message | Peter Geoghegan | 2016-06-08 04:23:58 | Re: BUG #14150: Attempted to delete invisible tuple |
From | Date | Subject | |
---|---|---|---|
Next Message | Kyotaro HORIGUCHI | 2016-06-08 06:26:15 | Re: slower connect from hostnossl clients |
Previous Message | Masahiko Sawada | 2016-06-08 06:15:33 | Re: Reviewing freeze map code |