Re: TOASTing smaller things

From: Chris Browne <cbbrowne(at)acm(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TOASTing smaller things
Date: 2007-03-21 20:29:31
Message-ID: 60ircujq6s.fsf@dba2.int.libertyrms.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

tgl(at)sss(dot)pgh(dot)pa(dot)us (Tom Lane) writes:
> Chris Browne <cbbrowne(at)acm(dot)org> writes:
>> #define TOAST_DENOMINATOR 17
>> /* Use this as the divisor; current default behaviour falls from TOAST_DENOMINATOR = 4 */
>
>> #define TOAST_TUPLE_THRESHOLD^I\
>> ^IMAXALIGN_DOWN((BLCKSZ - \
>> ^I^I^I^I MAXALIGN(sizeof(PageHeaderData) + 3 * sizeof(ItemIdData))) \
>> ^I^I^I^I / TOAST_DENOMINATOR)
>
> Given that you are quoting code that was demonstrably broken since
> the original coding of TOAST up till a month or two back, "it passes
> regression" is not adequate proof of "it's right". In fact I think
> it's not right; you have not got the roundoff condition straight.

OK, then maybe some refinement was needed. That seemed too easy.

Mind you, the problem seems to me to be that TOAST_TUPLE_THRESHOLD is
not solely a threshold to compare things to (as done in
heapam.c/toasting.c), but gets reused to calculate
TOAST_MAX_CHUNK_SIZE. If the threshold was solely used as that,
alignment wouldn't matter.

FYI, I took a sample table and loaded it into the resulting 8.3
backend based on the us of the totally naive TOAST_DENOMINATOR; there
may be something off in the sizing of the chunks, but that does not
appear to have injured fidelity of the tuples I stored.

Vacuum output:

--- Production system (no TOASTing)
INFO: "xml_log_table": found 0 removable, 1731329 nonremovable row versions in 175870 pages
DETAIL: 0 dead row versions cannot be removed yet.

--- In the 8.3 instance that did toast things...
INFO: "xml_log_table": found 0 removable, 1730737 nonremovable row versions in 41120 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
610 pages contain useful free space.
0 pages are entirely empty.
CPU 1.08s/0.36u sec elapsed 14.94 sec.
INFO: vacuuming "pg_toast.pg_toast_49194"
INFO: index "pg_toast_49194_index" now contains 2303864 row versions in 6319 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.09s/0.03u sec elapsed 1.71 sec.
INFO: "pg_toast_49194": found 0 removable, 2303864 nonremovable row versions in 98191 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
514 pages contain useful free space.
0 pages are entirely empty.

Problem with alignment of TOAST_MAX_CHUNK_SIZE or not, I seem to be
getting the right results, and this nicely partitions the table into 2
chunks, one, with the non-XML data, that occupies 41K pages, and the
TOAST section storing those less-frequently-accessed columns. (There
is a size difference; the production instance has more empty space
since it sees active inserts + deletes.)

In all ways except for "strict hygenic correctness of code," this
accomplished what I was hoping.

If someone could make a round-off-safe calculation of
TOAST_TUPLE_THRESHOLD and TOAST_MAX_CHUNK_SIZE that exposed the
denominator so it could be safely modified, that would be one step
ahead... I generally try not to make changes to the core, so I'll try
to avoid that...

>> 4. A different mechanism would be to add a fifth storage column
>> strategy (the present four are PLAIN, EXTENDED, EXTERNAL, MAIN), let's
>> say, TOAST.
>
> Anything along this line would require invoking the toaster on every
> single tuple, since we'd always have to crawl through all the columns
> to see if toasting was supposed to happen. No thanks.

Ah, I see. I infer from that that the code starts by checking to see
if the tuple size is > TOAST_TUPLE_THRESHOLD, and only starts
rummaging through TOAST infrastructure if the tuple is big enough.

In that case, "TOAST by default" becomes rather a nonstarter, I agree.
In the application context I'm thinking of, one table out of ~80 is a
"TOAST candidate;" making access to the other 79 slower would not be
of benefit.

(Aside: I'll link to Simon Rigg's related note, as well as to the item
on the TODO list...)
<http://archives.postgresql.org/pgsql-hackers/2007-02/msg00213.php>
<http://www.postgresql.org/docs/faqs.TODO.html>
--
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://cbbrowne.com/info/finances.html
Where do you *not* want to go today? "Confutatis maledictis, flammis
acribus addictis" (<http://www.hex.net/~cbbrowne/msprobs.html>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2007-03-21 20:40:05 Re: relation 71478240 deleted while still in use on 8.1
Previous Message Bruce Momjian 2007-03-21 20:12:47 Re: TOASTing smaller things