Re: Row Size and Indexing

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: tschak <jochen(dot)schlosser(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Row Size and Indexing
Date: 2006-01-15 13:43:19
Message-ID: 20060115134319.GB395@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, Jan 13, 2006 at 06:48:28AM -0800, tschak wrote:
> Hi guys,
>
> I have some questions about postgres and would be glad if you could
> help me...
> Here ist what I am planning to do:
> Build a database with 1 Billion (10^9) rows describing geometric
> properties:

<snip>

> Question Nr ONE --- Concerning row size
> Is it correct that I"ll need:
> 27 Byte (rowdescriptor) +4Byte (OID) + 4 Byte (for the distances) +
> BLOB_SIze (say 30 BYTE)
> => 65 Byte per Tupel dending on the storage space of the BLOB (in
> BLOB_Space or the tupel itself)?
> Is it possible to index the row if I disable the ROWID to save storage
> space in order to get more tupels per page?

If you don't need an OID, don't include it. IIRC, bit(4) will be 8
bytes due to being variable length. If you want to store small numbers,
maybe smallint (2 bytes) is more useful.

You can index anything and everything. PostgreSQL doesn't have a
ROWID so I'm not sure what you're referring to here.

> QUESTION NR. TWO --- Indexing
> My query will look like this:
> SELECT p from steric_descriptor WHERE
> p.distance0 <= threshold0
> AND ... ... ... ...
> AND p.distance7 <= threshold7;
> Acutally I was planning to use 8 distinct Bitmap-Indices with 15 Bits
> (2Byte) each per row (16 Byte in total) but as far as I know postgres
> does not offer a real disk-resident Bitmap-Index... Is that true? If
> yes, I would have to use composite B*trees which do not perform well
> for composite range queries. This leads to the new Bitmap-Scan of
> PostgreSQL => 8 B*Trees with a 4Bit key and blocksize/rowsize +1
> pointer (4Byte each). This means: approx. twice the size for the
> B*Tree-Version just to use the same functionality as a real
> Bitmap-Index would offer...
> Is there a way to avoid this or are my calculations somewhat wrong
> (maybe even totally)?

Well, PostgreSQL allows you to create your own index types if you don't
like b-trees. However, b-tree should be able to do what you want. I
have no idea what a "real disk-resident Bitmap-Index" would look like
so I can't comment on that.

BTW, you using <= on a bit type, which seems wierd to me. Shouldn't
they be numbers?

> Question Nr. THREE:
> If I do need OIDs for indexing and I have 10^9 Rows in my DB, do the
> Indices need Oids as well... If yes I might get problems with the 4
> Billion size limit of the Oids, is that correct? I don"t know how the
> physical structure of an Index looks like, so I might be wrong on this
> as well...

If you can't think of a reason why you need OIDs, don't use them. The
latest release of PostgreSQL phases them out for user tables anyway
since they're not actually useful 99% of the time.

Hope this helps,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Zlatko Matić 2006-01-15 16:21:46 temporary tables, pgAdminIII
Previous Message Tim Bunce 2006-01-15 09:38:31 Re: Apache::DBI and DBD::Pg