Re: Bit datatype performance?

From: Radosław Smogura <rsmogura(at)softperience(dot)eu>
To: Antonio Vieiro <antonio(at)antonioshome(dot)net>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Bit datatype performance?
Date: 2011-09-14 13:58:32
Message-ID: dff86e89b9adedb4600bd07fe5962af2@mail.softperience.eu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, 14 Sep 2011 12:00:35 +0200, Antonio Vieiro wrote:
> Hi all,
>
> One of my entities 'E' may be 'tagged' with an arbitrary set of 256
> tags 'T'.
>
> A first approach could be to add a M:N relationship between 'E' and
> 'T'.
>
> A second way to do this could be to add a BIT(256) datatype to 'E',
> setting bits to '1' if the entity is tagged with each one of the 256
> tags (i.e. using a 'bitmask' on the set of tags).
>
> Since querying entities 'E' with a certain set of tags 'T' must be
> very fast I was wondering if the second approach would be faster.
> What
> do you think?
>
> Thanks for any hints,
> Antonio

I assume each entity may have one or more different tags.

Actually performing test like
... where (t.bits & :mymask) = :mymask
should be quite fast and faster then creating additional relations, but
only if it's highly probable that your query will almost always scan
whole table.

The advantage of indexes is that the index is used 1st and tail
(slower) parts of query will always get "subset" of table. In bitset,
You will probably scan whole table.

So I think, you should do some performance test for large number of
data, and compare both ways. I think bitset will be fast for really
small data, but M:N relations may be faster for really large data sets.

You need to measure size of your database too, in M:N case with 256
tags it may be quite large.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message pasman pasmański 2011-09-14 14:21:36 Re: Bit datatype performance?
Previous Message David Johnston 2011-09-14 13:23:12 Re: Bit datatype performance?