From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ANALYZE sampling is too good
Date: 2014-03-09 01:55:18
Message-ID: 20140309015518.GB32380@momjian.us
Lists: pgsql-hackers
I assume we never came up with a TODO from this thread:
---------------------------------------------------------------------------
On Tue, Dec 3, 2013 at 11:30:44PM +0000, Greg Stark wrote:
> At multiple conferences I've heard about people trying all sorts of
> gymnastics to avoid ANALYZE, which they expect to take too long and
> consume too much I/O. This is an especially big complaint after
> upgrades, when the new database performs poorly until the new
> statistics are in; they did pg_upgrade to avoid extended downtime and
> then complain about ANALYZE taking hours.
>
> I always gave the party line that ANALYZE only takes a small,
> constant-sized sample, so even very large tables should be very quick.
> But after hearing the same story again at Heroku I looked into it a
> bit further, and I was kind of shocked by the numbers.
>
> ANALYZE takes a sample of 300 * statistics_target rows. That sounds
> pretty reasonable, but with default_statistics_target set to 100 that's
> 30,000 rows. If I'm reading the code right, it takes this sample by
> sampling 30,000 blocks and then (if the table is large enough) taking
> an average of one row per block. Each block is 8192 bytes, so that
> means it's reading 240MB of each table. That's a lot more than I
> realized.
>
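For reference, here is a rough sketch of the arithmetic behind that
240MB figure (Python, assuming the default 8kB block size and
default_statistics_target = 100; the one-row-per-block behavior is as
described above, not re-verified here):

    # Sample size and worst-case I/O for the default statistics target.
    stats_target = 100            # default_statistics_target
    block_size = 8192             # default PostgreSQL block size, in bytes
    sample_rows = 300 * stats_target          # 30,000 rows
    blocks_read = sample_rows                 # roughly one sampled row per block
    bytes_read = blocks_read * block_size     # 245,760,000 bytes
    print(bytes_read / (1024 * 1024))         # ~234 MiB, i.e. the ~240MB above
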
> It means that if your table is anywhere up to 240MB, you're effectively
> doing a full table scan and then throwing out nearly all the data you
> read.
>
> Worse, my experience from the posix_fadvise benchmarking is that on
> spinning media, reading one out of every 16 blocks takes about the same
> time as reading them all. Presumably this is because the seek time
> between tracks dominates, and reading one out of every 16 blocks still
> touches every track. So in fact, if your table is anywhere up to about
> 3-4GB, ANALYZE is still effectively going to do a full table scan, at
> least as far as I/O time goes.
>
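The same sketch extended with that 1-in-16 observation (the stride is
taken from the posix_fadvise result quoted above; purely illustrative):

    # If reading 1 block in 16 costs about as much as reading them all,
    # a 30,000-block sample is I/O-equivalent to scanning ~16x as much data.
    block_size = 8192
    sample_blocks = 30000
    stride = 16
    effective_scan_bytes = sample_blocks * stride * block_size
    print(effective_scan_bytes / (1024 ** 3))   # ~3.7 GiB, i.e. the 3-4GB above
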
> The current algorithm seems like it was designed with a 100GB+ table in
> mind, but the consequences for the more common 100MB-100GB tables
> weren't really considered. Consider what this means for partitioned
> tables: if someone partitions a terabyte table into 10 partitions,
> ANALYZE will suddenly want to use 10x as much I/O, which seems like a
> perverse consequence.
>
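And the partitioning consequence in the same terms (assuming each
partition is still large enough to get the full 30,000-block sample):

    # One large table vs. the same data split into 10 partitions.
    block_size = 8192
    sample_blocks = 30000
    partitions = 10
    single = sample_blocks * block_size                    # ~234 MiB sampled once
    partitioned = partitions * sample_blocks * block_size  # ~2.3 GiB total
    print(single / 2**20, partitioned / 2**20)
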
> I'm not sure I have a prescription, but my general feeling is that
> we're spending an awful lot of resources going after a statistically
> valid sample when we could spend far fewer resources and get something
> 90% as good. Or, if we're really going to read that much data, we
> might as well use more of the rows we find.
>
> --
> greg
>
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +