From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: VARIANT / ANYTYPE datatype |
Date: | 2011-05-04 23:24:00 |
Message-ID: | 4DC1E010.7090501@dunslane.net |
Lists: | pgsql-hackers |
On 05/04/2011 07:05 PM, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
>> Excerpts from Tom Lane's message of Wed May 04 14:36:44 -0300 2011:
>>> Just out of curiosity, what actual functionality gain would ensue over
>>> just using text? It seems like doing anything useful with the audit
>>> table contents would still require casting the column to text, or the
>>> moral equivalent of that.
>> Storage efficiency. These people have really huge databases; small
>> changes in how tight things are packed makes a large difference for
>> them. (For example, we developed a type to store SHA-2 digests in a
>> more compact way than bytea mainly because of this reason. Also, at
>> some time they also wanted to apply compression to hstore keys and
>> values.)
> Hmm. The prototypical case for this would probably be a 4-byte int,
> which if you add an OID to it so you can resolve the type is going to
> take 8 bytes, plus you are going to need a length word because there is
> really no alternative to the "VARIANT" type being varlena overall, which
> makes it 9 bytes if you're lucky on alignment and up to 16 if you're
> not. That is not shorter than the average length of the text
> representation of an int. The numbers don't seem a lot better for
> 8-byte quantities like int8, float8, or timestamp. It might be
> marginally worthwhile for timestamp, but surely this is a huge amount of
> effort to substitute for thinking of a more compact text representation
> for timestamps.
>
> Pardon me for being unconvinced.
>
>
I'm far from convinced that storing deltas per column rather than per
record is a win anyway. I don't have hard numbers to hand, but my vague
recollection is that my tests showed it to be a design that used more space.
cheers
andrew
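
Editorial note: the size arithmetic in Tom's quoted paragraph can be sketched roughly as follows. This is only an illustration, not anything from the thread. It assumes a 1-byte short varlena header for both the hypothetical VARIANT value and the text value, a 4-byte type OID, and no alignment padding (the "lucky on alignment" case). The names `variant_size` and `text_size` are made up for this sketch.

```python
# Rough size comparison: hypothetical VARIANT value vs. plain text.
# Assumptions (not from the thread's code, only from its arithmetic):
#   - both values fit a 1-byte short varlena header
#   - the VARIANT carries a 4-byte type OID before the raw datum
#   - no alignment padding is added (best case)

SHORT_VARLENA_HEADER = 1  # bytes, assumed best case
OID_BYTES = 4             # bytes needed to identify the stored type


def variant_size(datum_bytes: int) -> int:
    """Best-case on-disk size of a VARIANT wrapping a fixed-width datum."""
    return SHORT_VARLENA_HEADER + OID_BYTES + datum_bytes


def text_size(value) -> int:
    """Best-case on-disk size of the value's text representation."""
    return SHORT_VARLENA_HEADER + len(str(value))


if __name__ == "__main__":
    # 4-byte int: the variant form costs 9 bytes in the best case
    print("int4 as variant:", variant_size(4))
    print("12345 as text:", text_size(12345))
    # 8-byte timestamp: the variant form comes out somewhat smaller
    print("timestamp as variant:", variant_size(8))
    print("timestamp as text:", text_size("2011-05-04 23:24:00"))
```

Under these assumptions a 4-byte int costs 9 bytes as a variant while a five-digit value costs 6 bytes as text, and only wide fixed-size types such as timestamp come out ahead. Alignment padding, which the sketch ignores, would push the variant figures higher.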