Re: Adding a suffix array index

From: Troels Arvin <troels(at)arvin(dot)dk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Adding a suffix array index
Date: 2004-11-19 13:23:34
Message-ID: pan.2004.11.19.12.55.48.92702@arvin.dk
Lists: pgsql-hackers

On Fri, 19 Nov 2004 14:38:20 +0200, Hannu Krosing wrote:

>> Part of my current code concerns packing DNA characters: As the alphabet
>> of DNA strings is very small (four characters), it seems like a
>> straightforward optimization to store each character in two bits.
>
> My advice would be to get it to work first, optimize later.

Valid point. However, I needed something rather basic to work on, to get
to know C and to get to know PostgreSQL in a user defined type context.
But if packing proves to be a problem when implementing the interesting
stuff, then yes, thanks: packing should be an afterthought.
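
For illustration, two-bit packing of the kind discussed above could look
roughly like the sketch below. This is not the code from my module: the
function names (dna_code, dna_pack, dna_get), the A=0, C=1, G=2, T=3
assignment and the choice to put the first base in the high bits of each
byte are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Map a DNA base to a 2-bit code; returns -1 for any other character.
 * The A=0, C=1, G=2, T=3 assignment is an arbitrary choice for this sketch. */
int dna_code(char base)
{
    switch (base) {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return -1;
    }
}

/* Pack n bases from seq into out, four bases per byte, with the first
 * base of each group in the two highest bits.
 * out must have room for (n + 3) / 4 bytes.
 * Returns 0 on success, -1 if seq contains a character other than A, C, G, T. */
int dna_pack(const char *seq, size_t n, uint8_t *out)
{
    size_t i;

    for (i = 0; i < (n + 3) / 4; i++)
        out[i] = 0;

    for (i = 0; i < n; i++) {
        int code = dna_code(seq[i]);
        if (code < 0)
            return -1;
        out[i / 4] |= (uint8_t)(code << (6 - 2 * (i % 4)));
    }
    return 0;
}

/* Return the base stored at position i of a packed sequence. */
char dna_get(const uint8_t *packed, size_t i)
{
    static const char bases[] = "ACGT";
    return bases[(packed[i / 4] >> (6 - 2 * (i % 4))) & 3];
}
```

With this layout an 11-character sequence such as TTGACCACTTG takes three
bytes instead of eleven. The sequence length has to be stored separately,
since the unused bits of the last byte are indistinguishable from 'A'.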

>> My first and most immediate goal is to support efficient answering of a
>> question like "which rows contain the sequence TTGACCACTTG in column foo?".
>
> If you store your sequences as strings, you may try to use trigrams (or
> modify them to 4-, 5-, 6- or 7-grams ;) to get some feel for how that works.
>
> The trigram module is in contrib/pg_trgm.

(/me Printing readme.) Thanks.
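
As I understand the general q-gram idea (this is a sketch of the technique,
not of how contrib/pg_trgm is implemented): if a row contains the search
sequence, it must also contain every q-gram of that sequence, so the q-grams
can serve as a cheap filter before an exact recheck. The function names
below are hypothetical, and the linear scan in has_qgram stands in for what
would be an index lookup in a real implementation.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the q characters starting at gram occur somewhere in text. */
int has_qgram(const char *text, size_t text_len, const char *gram, size_t q)
{
    size_t i;

    if (q == 0 || q > text_len)
        return 0;
    for (i = 0; i + q <= text_len; i++)
        if (memcmp(text + i, gram, q) == 0)
            return 1;
    return 0;
}

/* Filter step: return 1 if every overlapping q-gram of pattern occurs in
 * text. This is a necessary condition for text to contain pattern, not a
 * sufficient one, so a hit still needs an exact recheck. */
int qgram_filter(const char *text, const char *pattern, size_t q)
{
    size_t tlen = strlen(text);
    size_t plen = strlen(pattern);
    size_t i;

    if (q == 0 || plen < q)
        return 1;   /* pattern has no q-grams: the row cannot be ruled out */
    for (i = 0; i + q <= plen; i++)
        if (!has_qgram(text, tlen, pattern + i, q))
            return 0;
    return 1;
}

/* Filter first, then recheck with an exact substring search. */
int contains_sequence(const char *text, const char *pattern, size_t q)
{
    return qgram_filter(text, pattern, q) && strstr(text, pattern) != NULL;
}
```

The recheck matters: the text GACGA contains all trigrams of ACGACG (ACG,
CGA, GAC) without containing ACGACG itself. Longer q-grams give fewer such
false hits for a four-letter alphabet, at the cost of more distinct q-grams
to index.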

--
Greetings from Troels Arvin, Copenhagen, Denmark
