Re: Bit datatype performance?

From: Antonio Vieiro <antonio(at)antonioshome(dot)net>
To: Radosław Smogura <rsmogura(at)softperience(dot)eu>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Bit datatype performance?
Date: 2011-09-14 15:58:58
Message-ID: 4E70CF42.1070703@antonioshome.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi again,

Thanks for the tip. In fact I was thinking of creating an index on the
bitmask, so I could use:

... where t.bits = :mymask

directly, avoiding a full table scan. I assume this is possible
(indexing bit and comparing bits), isn't it?

Thanks,
Antonio

El 14/09/11 15:58, Radosław Smogura escribió:
> On Wed, 14 Sep 2011 12:00:35 +0200, Antonio Vieiro wrote:
>> Hi all,
>>
>> One of my entities 'E' may be 'tagged' with an arbitrary set of 256
>> tags 'T'.
>>
>> A first approach could be to add a M:N relationship between 'E' and 'T'.
>>
>> A second way to do this could be to add a BIT(256) datatype to 'E',
>> setting bits to '1' if the entity is tagged with each one of the 256
>> tags (i.e. using a 'bitmask' on the set of tags).
>>
>> Since querying entities 'E' with a certain set of tags 'T' must be
>> very fast I was wondering if the second approach would be faster. What
>> do you think?
>>
>> Thanks for any hints,
>> Antonio
>
> I assume each entity may have one or more different tags.
>
> Actually performing test like
> ... where (t.bits & :mymask) = :mymask
> should be quite fast and faster then creating additional relations, but
> only if it's highly probable that your query will almost always scan
> whole table.
>
> The advantage of indexes is that the index is used 1st and tail (slower)
> parts of query will always get "subset" of table. In bitset, You will
> probably scan whole table.
>
> So I think, you should do some performance test for large number of
> data, and compare both ways. I think bitset will be fast for really
> small data, but M:N relations may be faster for really large data sets.
>
> You need to measure size of your database too, in M:N case with 256 tags
> it may be quite large.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Rich Shepard 2011-09-14 16:35:00 Identifying Reason for Column Name Returned by SELECT
Previous Message Ivan Voras 2011-09-14 14:38:23 Re: PostgreSQL benchmarked on XFS vs ZFS vs btrfs vs ext4