From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: TOAST usage setting
Date: 2007-05-29 14:04:12
Message-ID: 200705291404.l4TE4CX15194@momjian.us
Lists: pgsql-hackers
Gregory Stark wrote:
> "Bruce Momjian" <bruce(at)momjian(dot)us> writes:
>
> > I tested TOAST using a method similar to the above method against CVS
> > HEAD, with default shared_buffers = 32MB and no assert()s. I created
> > backends with power-of-2 settings for TOAST_TUPLES_PER_PAGE (4 (default),
> > 8, 16, 32, 64), which give TOAST/non-TOAST breakpoints of 2k (default),
> > 1k, 512, 256, and 128, roughly.
> >
> > The results are here:
> >
> > http://momjian.us/expire/TOAST/
> >
> > Strangely, 128 bytes seems to be the break-even point for TOAST and
> > non-TOAST, even for sequential scans of the entire heap touching all
> > long row values. I am somewhat confused why TOAST has faster access
> > than inline heap data.
>
> Did your test also imply setting the MAX_TOAST_CHUNK_SIZE (or however that's
> spelled)? And what size long values were you actually storing? How did you
> generate them?
Please look at the script sqltest.sh at that URL. I did not modify
TOAST_MAX_CHUNK_SIZE, but it changes based on TOAST_TUPLES_PER_PAGE,
which I did change.
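For reference, the breakpoints just fall out of the block size divided by
TOAST_TUPLES_PER_PAGE, minus page and tuple header overhead, which is why the
numbers above are only rough.  A quick sketch, assuming the default 8k BLCKSZ
and ignoring the header overhead:

    -- approximate TOAST threshold for each tested TOAST_TUPLES_PER_PAGE value;
    -- the real thresholds are somewhat lower because of header overhead
    SELECT n AS toast_tuples_per_page,
           8192 / n AS approx_threshold_bytes
    FROM (VALUES (4), (8), (16), (32), (64)) AS t(n);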
> I wonder if what's happening is that you have large chunks which when stored
> inline are leaving lots of dead space in the table. I.e., if you're generating
> values with a size near 2k at the default chunk size, you would expect to find
> an average of 1k of dead space per page, or a 12.5% drain on performance. As you
> lower the chunk size you decrease that margin.
Well, that could be it, but effectively that is what would happen in the
real world too.
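One way to check the dead-space theory directly would be to look at the free
space per page; a sketch, assuming contrib/pgstattuple is installed and the
test table is named foo:

    -- if inline storage really wastes ~1k per 8k page, free_percent should
    -- come out near 12.5% for the non-TOAST case
    SELECT free_space, free_percent
    FROM pgstattuple('foo');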
> However I agree that it's hard to believe that the costs of random access
> wouldn't swamp that 12.5% overhead pretty quickly.
>
> One query I used when measuring the impact of the variable varlena stuff was
> this which gives the distribution of tuples/page over a table:
>
> SELECT count(*),n
> FROM (SELECT count(*) AS n
> FROM foo
> GROUP BY (point_in(tidout(ctid)))[0]
> ) as x
> GROUP BY n;
>
> Which might help you peek at what's going on. You could also combine
> pg_column_size(foo.*) to measure the size of the tuple. I think that will
> measure the size of the tuple as is before the columns are detoasted.
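Something like this would combine the two measurements; a sketch, reusing the
ctid trick from your query above and again assuming a test table named foo:

    -- tuples per page together with average (pre-detoast) tuple size
    SELECT (point_in(tidout(ctid)))[0] AS page,
           count(*) AS tuples_on_page,
           round(avg(pg_column_size(foo.*))) AS avg_tuple_bytes
    FROM foo
    GROUP BY 1
    ORDER BY 1;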
Please use my test script and see what you find.
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +