Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Jameison Martin <jameisonb(at)yahoo(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-24 21:21:06
Message-ID: 22240.1372108866@sss.pgh.pa.us
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> On 06/24/2013 01:50 PM, Tom Lane wrote:
>> The point of what I was suggesting isn't to conserve storage, but to
>> reduce downtime during a schema change. Remember that a rewriting ALTER
>> TABLE locks everyone out of that table for a long time.

> Right, but I'm worried about the "surprise!" factor. That is, if we
> make the first default "free" by using a magic value, then a SET DEFAULT
> on that column is going to have some very surprising results as suddenly
> the whole table needs to get written out for the old default.

No, that's why we'd store the magic default separately. That will be
permanent and unaffected by later SET DEFAULT operations. (This
requires that every subsequently created tuple store the column
explicitly so that the magic default doesn't affect it; which is exactly
why there's a conflict with the currently-proposed patch.)
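
To make the mechanism above concrete, here is a toy Python model of it. This
is only a sketch under stated assumptions, not PostgreSQL code: the names
Table, Column, missing_default, current_default and truncate_trailing_nulls
are invented for illustration, and a row is modeled as a plain list that may
be physically shorter than the table's column list.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Column:
    name: str
    # Hypothetical "magic" default: fixed when the column is added and used
    # only for rows that physically predate the column.
    missing_default: Any = None
    # Ordinary default applied at INSERT time; SET DEFAULT changes only this.
    current_default: Any = None


class Table:
    def __init__(self, names: List[str]) -> None:
        self.columns: List[Column] = [Column(n) for n in names]
        self.rows: List[List[Any]] = []  # stored rows may be "short"

    def add_column(self, name: str, default: Any = None) -> None:
        # No rewrite: existing rows are left untouched and stay short.
        self.columns.append(Column(name, default, default))

    def set_default(self, name: str, default: Any) -> None:
        # Leaves missing_default alone, so old rows keep their old value.
        for col in self.columns:
            if col.name == name:
                col.current_default = default

    def insert(self, **values: Any) -> None:
        # Every new row stores every column explicitly, even a NULL (None).
        self.rows.append(
            [values.get(c.name, c.current_default) for c in self.columns]
        )

    def read(self, index: int) -> Dict[str, Any]:
        stored = self.rows[index]
        return {
            c.name: (stored[i] if i < len(stored) else c.missing_default)
            for i, c in enumerate(self.columns)
        }


def truncate_trailing_nulls(row: List[Any]) -> List[Any]:
    # Toy stand-in for the proposed patch: drop trailing None values.
    end = len(row)
    while end > 0 and row[end - 1] is None:
        end -= 1
    return row[:end]


t = Table(["id"])
t.insert(id=1)                        # row written before the column exists
t.add_column("status", default="new")
t.set_default("status", "open")
t.insert(id=2)                        # picks up the current default
t.insert(id=3, status=None)           # explicit NULL, stored explicitly
```

In this model, row 0 still reads back status "new" after the SET DEFAULT,
because it falls back to the separately stored value, while row 1 reads
"open". If row 2 were passed through truncate_trailing_nulls, its explicit
NULL would be indistinguishable from a pre-existing short row and would read
back as "new", which is the conflict described above.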

> ... Also for the reason Tom pointed out, the
> optimization would only work with NOT NULL columns ... leading to
> another potential unexpected surprise when the column is marked NULLable.

Huh? We already have the case of null default handled.

> Well, actually, hundreds of columns is reasonably common for a certain
> user set (ERP, CRM, etc.). If we could handle that use case very
> efficiently, then it would win us some users, since other RDBMSes don't.
> However, there are multiple issues with having hundreds of columns, of
> which storage optimization is only one ... and probably the smallest one
> at that.

Agreed; there are a lot of things we'd have to address if we really
wanted to claim this is a domain we work well in. (I suspect Salesforce
will be chipping away at some of those issues, but as I said,
heap_form_tuple is not in their critical path.)

regards, tom lane
