From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: reducing NUMERIC size for 9.1 |
Date: | 2010-07-29 18:35:12 |
Message-ID: | AANLkTin1tGEVWwJ3CZD8iN5N6JWXVzPC_aiuKa8grbaz@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Jul 29, 2010 at 1:20 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Yeah, you would need an additional layer of struct to represent the
> numeric with a length word in front of it. I think this is not
> necessarily bad because it would perhaps open the door to working
> directly with short-varlena-header values, which is never going to
> be possible with this:
>
>> typedef struct NumericData
>> {
>> int32 varlen;
>> int16 n_header;
>> union { ...
>
> OTOH alignment considerations may make that idea hopeless anyway.
My understanding of our alignment rules for on-disk storage is still a
bit fuzzy, but as I understand it we don't align varlenas. So
presumably if we get a pointer directly into a disk block, the first
byte might happen to be not aligned, which would make the rest of the
structure aligned; and from playing around with the system, it looks
like if we get a value from anywhere else it's typically using the
4-byte varlena header. So it seems like it might be possible to write
code that aligns the data only if needed and otherwise skips a
palloc-and-copy cycle. I'm not totally sure that would be a win, but
it could be. Actually, I had a thought that it might be even more of
a win if you added a flag to the NumericVar representation indicating
whether the digit array was palloc'd or from the original tuple. Then
you might be able to avoid TWO palloc-and-copy cycles, although at the
price of a fairly significant code restructuring.
Which is a long-winded way of saying - it's probably not hopeless.
>> Why n_data as char[1] instead of NumericDigit, you ask?
>
> Yes, we'd have to export NumericDigit if we wanted to declare these
> structs "properly" in numeric.h. I wonder if that decision should
> be revisited. I'd lean to making the whole struct local to numeric.c
> though. Is there anyplace else that really ought to see it?
Probably not. btree_gist is using it, but that's it, at least as far
as our tree is concerned. Attached please find a patch to make the
numeric representation private and add a convenience function
numeric_is_nan() for the benefit of btree_gist. If this looks sane,
I'll go ahead and commit it, which will simplify review of the main
patch once I rebase it over these changes.
>>> I hadn't actually looked. I think though that it's a mistake to break
>>> compatibility on both dscale and weight when you only need to break one.
>>> Also, weight is *certainly* uninteresting for NaNs since it's not even
>>> meaningful unless there are digits. dscale could conceivably be worth
>>> something.
>
>> I don't think I'm breaking compatibility on anything. Can you clarify
>> what part of the code you're referring to here? I'm sort of lost.
>
> On-disk is what I'm thinking about. Right now, a NaN's first word is
> all dscale except the sign bits. You're proposing to change that
> but I think it's unnecessary to do so.
*Where* am I proposing this? The new versions of NUMERIC_WEIGHT() and
NUMERIC_DSCALE() determine where to look for the bits in question
using NUMERIC_IS_SHORT(), which just tests NUMERIC_FLAGBITS(n) ==
NUMERIC_SHORT. There's nothing in there about the NaN case at all.
Even if there were, it's irrelevant because those bits are never
examined and, as far as I can tell, will always be zero barring a
cosmic ray hit. But even if they WERE examined, I don't see where I'm
changing the interpretation of them; in fact, I think I'm very
explicitly NOT doing that.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Attachment | Content-Type | Size |
---|---|---|
make_numericdata_private.patch | application/octet-stream | 4.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2010-07-29 19:29:40 | Re: Performance Enhancement/Fix for Array Utility Functions |
Previous Message | Joshua D. Drake | 2010-07-29 18:16:50 | Re: On Scalability |