Packed varlena patch update

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Packed varlena patch update
Date: 2007-03-12 17:40:46
Message-ID: 871wjuz7fl.fsf@stark.xeocode.com
Lists: pgsql-patches

I implemented Tom's suggestion of packing the external toast pointers
unaligned and copying them to a local struct to access the fields. This saves
3-6 bytes on every external toast pointer. Presumably this isn't interesting
if you're accessing the toasted values since they would be quite large
anyways, but it could be interesting if you're doing lots of queries that
don't access the toasted values. It does uglify the code a bit but it's all
contained in tuptoaster.c.
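
To illustrate the unaligned access described above, here is a minimal sketch.
It is not code from the patch: the struct name, the helper names and the field
order are assumptions made for illustration, and only the four field names
(rawsize, extsize, valueid, relid) come from this thread. The idea is that the
pointer is stored at an arbitrary byte offset and copied with memcpy into an
aligned local struct before any field is read.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an external toast pointer; layout is assumed. */
typedef struct ToastPointerSketch
{
    int32_t  rawsize;   /* original data size */
    int32_t  extsize;   /* size as stored externally */
    uint32_t valueid;   /* id of the value within the toast table */
    uint32_t relid;     /* id of the toast table */
} ToastPointerSketch;

/* Store the pointer at any byte offset, with no alignment padding before it. */
void
pack_toast_pointer(unsigned char *dst, const ToastPointerSketch *src)
{
    memcpy(dst, src, sizeof(*src));
}

/*
 * Copy the packed bytes into an aligned local struct. Reading the fields
 * through a cast of the unaligned address could fault on strict-alignment
 * machines, which is why the copy is needed.
 */
void
unpack_toast_pointer(ToastPointerSketch *dst, const unsigned char *src)
{
    memcpy(dst, src, sizeof(*dst));
}
```

A caller would declare a local ToastPointerSketch, call
unpack_toast_pointer() on the attribute's data bytes, and then read the
fields from the local copy.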

http://community.enterprisedb.com/varlena/patch-varvarlena-19.patch.gz

I think this is the last of the todo suggestions that were mentioned on the
list previously, at least that I recall.

Some implications:

1) There's no longer any macro to determine if an external attribute is
compressed. I could provide a function to do it if we need it but in all
the cases where I see it being used we already needed to extract the fields
of the toast pointer anyways, so it wouldn't make sense.

2) We check if a toasted value is replaced with a new one by memcmp'ing the
entire toast pointer. We used to just compare valueid and relid, so this
means we're now comparing extsize and rawsize as well. I can't see how they
could vary for the same valueid though so I don't think that's actually
changing anything.
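
As a rough illustration of point 2, the sketch below contrasts the two
comparisons. The struct and function names are hypothetical and the layout is
assumed; the sketch relies on the struct having four 4-byte fields and
therefore no padding bytes, so a bytewise comparison sees only real field
data.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical toast pointer layout: four 4-byte fields, so no padding. */
typedef struct ExtPointerSketch
{
    int32_t  rawsize;
    int32_t  extsize;
    uint32_t valueid;
    uint32_t relid;
} ExtPointerSketch;

/* Earlier behaviour as described: compare only the identifying fields. */
int
same_value_by_ids(const ExtPointerSketch *a, const ExtPointerSketch *b)
{
    return a->valueid == b->valueid && a->relid == b->relid;
}

/*
 * Behaviour described for the patch: compare every byte of the packed
 * pointer, which also covers rawsize and extsize. Works on unaligned input.
 */
int
same_value_by_memcmp(const unsigned char *a, const unsigned char *b)
{
    return memcmp(a, b, sizeof(ExtPointerSketch)) == 0;
}
```

The two checks disagree only if rawsize or extsize differ while valueid and
relid match, which is the case the mail argues should not occur.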

Remaining things in the back of my mind though:

. I'm not 100% confident of the GIST changes. I think I may have changed a few
too many lines of code there. It does pass all the contrib regressions
though.

. I think there may be some opportunities for optimizing heaptuple.c. In
particular the places where it uses VARSIZE_ANY where it knows some of the
cases are impossible. It may not make a difference due to branch prediction
though.

. I've left the htonl/ntohl calls in place in the BIG_ENDIAN #if-branch.
They're easy enough to remove and it leaves us the option of removing the
#if entirely and just using network byte order everywhere.

I'm a bit worried about modules building against postgres.h without
including the config.h include file. Is that possible? Is it worth worrying
about? I think I could make it fail to build rather than crash randomly.

. One of the regression tests I wrote makes a datum of half a megabyte. Can
all the build-farm machines handle that? I don't really need anything so large for
the regression but it was the most convenient way to make something which
after compressing was large enough to toast externally. It might be better
to find some less compressible data.
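
For the byte-order item above, the following is a minimal sketch of the two
ideas mentioned: always storing a 4-byte word in network byte order, and
refusing to compile when the configuration is missing. The macro
SKETCH_HAVE_CONFIG and the function names are hypothetical and are not taken
from the patch; htonl and ntohl are the standard POSIX calls from
<arpa/inet.h>.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical marker that a real build would get from its configuration
 * header. It is defined here only so that the sketch compiles on its own.
 */
#define SKETCH_HAVE_CONFIG 1

/* Fail the build instead of silently guessing when the header is missing. */
#ifndef SKETCH_HAVE_CONFIG
#error "configuration header not included; refusing to guess the byte order"
#endif

/* Store a 32-bit word in network (big-endian) order at any byte offset. */
void
store_u32_network(unsigned char *dst, uint32_t value)
{
    uint32_t n = htonl(value);

    memcpy(dst, &n, sizeof(n));
}

/* Load a 32-bit word stored in network order from any byte offset. */
uint32_t
load_u32_network(const unsigned char *src)
{
    uint32_t n;

    memcpy(&n, src, sizeof(n));
    return ntohl(n);
}
```

With this approach the stored bytes are the same on every host, so no
per-endianness #if is needed around the access, at the cost of a byte swap
on little-endian machines.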

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
