| From: | Troels Arvin <troels(at)arvin(dot)dk> |
|---|---|
| To: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Adding a suffix array index |
| Date: | 2004-11-19 13:23:34 |
| Message-ID: | pan.2004.11.19.12.55.48.92702@arvin.dk |
| Lists: | pgsql-hackers |
On Fri, 19 Nov 2004 14:38:20 +0200, Hannu Krosing wrote:
>> Part of my current code concerns packing DNA characters: As the alphabet
>> of DNA strings is very small (four characters), it seems like a
>> straightforward optimization to store each character in two bits.
>
> My advice would be to get it to work first, optimize later.
Valid point. However, I needed something rather basic to work on, to get
to know C and to get to know PostgreSQL in a user defined type context.
But if packing proves to be a problem when implementing the interesting
stuff, then thanks, and yes: packing should be an afterthought.
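
For reference, a minimal sketch of the two-bit packing idea mentioned above. It is not the code from my type implementation. The function names (`dna_code`, `dna_pack`, `dna_unpack`), the A/C/G/T to 0..3 mapping, and the bit order within a byte are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map a nucleotide to a 2-bit code (assumed mapping); -1 for anything else. */
static int dna_code(char c)
{
    switch (c) {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return -1;
    }
}

/*
 * Pack n nucleotides from seq into out, four per byte, lowest bits first.
 * out must hold (n + 3) / 4 bytes.
 * Returns 0 on success, -1 if seq contains a character outside ACGT.
 */
static int dna_pack(const char *seq, size_t n, uint8_t *out)
{
    assert(seq != NULL && out != NULL);
    memset(out, 0, (n + 3) / 4);
    for (size_t i = 0; i < n; i++) {
        int code = dna_code(seq[i]);
        if (code < 0)
            return -1;
        out[i / 4] |= (uint8_t)(code << (2 * (i % 4)));
    }
    return 0;
}

/* Unpack n nucleotides from packed into seq; seq must hold n + 1 bytes. */
static void dna_unpack(const uint8_t *packed, size_t n, char *seq)
{
    static const char alphabet[] = "ACGT";
    for (size_t i = 0; i < n; i++)
        seq[i] = alphabet[(packed[i / 4] >> (2 * (i % 4))) & 3];
    seq[n] = '\0';
}
```

The packed form cannot tell padding bits in the last byte apart from trailing A's, so the sequence length has to be stored next to the packed bytes.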
>> My first and most immediate goal is to support efficient answering of a
>> question like "which rows contain the sequence TTGACCACTTG in column foo?".
>
> If you store your sequences as strings, you may try to use trigrams (or
> modify them to 4,5,6 or 7-grams ;) to get some feel how that works.
>
> trigram module is in contrib/pg_trgm.
(/me Printing readme.) Thanks.
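
As I understand the n-gram suggestion, the idea is roughly the following. This is a conceptual sketch only, not how pg_trgm is implemented. The names `contains_gram`, `kgram_filter` and `row_matches` are hypothetical, and the linear scan in `contains_gram` stands in for what would be an index lookup in a real system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the k characters at gram occur somewhere in text (length n). */
static bool contains_gram(const char *text, size_t n, const char *gram, size_t k)
{
    for (size_t i = 0; i + k <= n; i++)
        if (memcmp(text + i, gram, k) == 0)
            return true;
    return false;
}

/*
 * Cheap filter: every k-gram of pattern must occur in text.
 * This is necessary but not sufficient for pattern to be a substring.
 */
static bool kgram_filter(const char *text, const char *pattern, size_t k)
{
    size_t n = strlen(text), m = strlen(pattern);
    assert(k > 0);
    if (m < k)
        return true;   /* no k-grams to check, so the row cannot be ruled out */
    for (size_t i = 0; i + k <= m; i++)
        if (!contains_gram(text, n, pattern + i, k))
            return false;
    return true;
}

/* Filter first, then recheck survivors with an exact substring search. */
static bool row_matches(const char *text, const char *pattern, size_t k)
{
    return kgram_filter(text, pattern, k) && strstr(text, pattern) != NULL;
}
```

The recheck matters because the filter can give false positives: with k = 2, the sequence ACCGGT contains all of AC, CG and GT, yet it does not contain ACGT. With only four letters there are just 64 distinct trigrams, which is presumably why longer grams were suggested for DNA.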
--
Greetings from Troels Arvin, Copenhagen, Denmark