From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Greg Stark <stark(at)mit(dot)edu> |
Cc: | Simon Riggs <simon(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: ANALYZE sampling is too good |
Date: | 2013-12-11 16:03:39 |
Message-ID: | 16676.1386777819@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Greg Stark <stark(at)mit(dot)edu> writes:
> So I've done some empirical tests for a table generated by:
> create table sizeskew as (select i,j,repeat('i',i) from
> generate_series(1,1000) as i, generate_series(1,1000) as j);
> I find that using the whole block doesn't cause any problem with the
> avg_width field for the "repeat" column.That does reinforce my belief
> that we might not need any particularly black magic here.
> It does however cause a systemic error in the histogram bounds. It
> seems the median is systematically overestimated by more and more the
> larger the number of rows per block are used:
Hm. You can only take N rows from a block if there actually are at least
N rows in the block. So the sampling rule I suppose you are using is
"select up to N rows from each sampled block" --- and that is going to
favor the contents of blocks containing narrower-than-average rows.
Now in this case, it looks like that ought to favor rows with *smaller* i
values, but you say the median goes up not down. So I'm not sure what's
going on. I thought at first that TOAST compression might be part of the
explanation, but TOAST shouldn't kick in on rows with raw representation
narrower than 2KB.
Did you do a run with no upper limit on the number of rows per block?
Because I'm not sure that tests with a limit in place are a good guide
to what happens without it.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2013-12-11 16:07:05 | Re: [PATCH] Add transforms feature |
Previous Message | Andres Freund | 2013-12-11 16:01:13 | Re: Why the buildfarm is all pink |