Re: Adding a suffix array index

From:	Troels Arvin <troels(at)arvin(dot)dk>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Adding a suffix array index
Date:	2004-11-19 13:23:34
Message-ID:	pan.2004.11.19.12.55.48.92702@arvin.dk
Lists:	pgsql-hackers

On Fri, 19 Nov 2004 14:38:20 +0200, Hannu Krosing wrote:

>> Part of my current code concerns packing DNA characters: As the alphabet
>> of DNA strings is very small (four characters), it seems like a
>> straightforward optimization to store each character in two bits.
>
> My advice would be to get it to work first, optimize later.

Valid point. However, I needed something rather basic to work on, to get
to know C and to get to know PostgreSQL in a user-defined type context.
But if packing proves to be a problem when implementing the interesting
stuff, then yes, thanks: packing should be an afterthought.
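
For readers unfamiliar with the packing idea discussed above, here is a minimal sketch in C of storing each DNA character in two bits. It is not the code from this thread: the function names (base_to_code, dna_pack, dna_base_at), the A/C/G/T to 0..3 mapping, and the choice to reject anything outside uppercase ACGT are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical mapping: A=0, C=1, G=2, T=3; -1 for any other character. */
static int base_to_code(char c)
{
    switch (c) {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return -1;
    }
}

/*
 * Pack n bases into a newly allocated buffer, four bases per byte,
 * lowest-order bit pair first. Returns NULL on an invalid character
 * or allocation failure. The caller frees the result.
 */
static uint8_t *dna_pack(const char *seq, size_t n)
{
    size_t nbytes = (n + 3) / 4;
    uint8_t *out;

    assert(seq != NULL);
    out = calloc(nbytes ? nbytes : 1, 1);
    if (out == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        int code = base_to_code(seq[i]);
        if (code < 0) {
            free(out);
            return NULL;
        }
        out[i / 4] |= (uint8_t)(code << (2 * (i % 4)));
    }
    return out;
}

/* Read back the base at position i of a packed sequence. */
static char dna_base_at(const uint8_t *packed, size_t i)
{
    static const char bases[] = "ACGT";

    assert(packed != NULL);
    return bases[(packed[i / 4] >> (2 * (i % 4))) & 3];
}
```

With this layout the 11-base sequence TTGACCACTTG occupies 3 bytes instead of 11. A stored value would also have to record the sequence length separately, since the unused bit pairs in the last byte are zero and therefore indistinguishable from 'A'.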

>> My first and most immediate goal is to support efficient answering of a
>> question like "which rows contain the sequence TTGACCACTTG in column foo?".
>
> If you store your sequences as strings, you may try to use trigrams (or
> modify them to 4,5,6 or 7-grams ;) to get some feel how that works.
>
> trigram module is in contrib/pg_trgm.

(/me Printing readme.) Thanks.
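
To give a feel for the k-gram suggestion quoted above, here is a small sketch in C of the general idea: a row can only contain a pattern if it contains every overlapping k-gram of that pattern. This is a plain illustration and not how contrib/pg_trgm is implemented; the function names (has_kgram, kgram_candidate) are hypothetical, and a real index would look the k-grams up rather than scan the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the k-character gram occurs anywhere in text[0..n). */
static int has_kgram(const char *text, size_t n, const char *gram, size_t k)
{
    if (k == 0 || k > n)
        return 0;
    for (size_t i = 0; i + k <= n; i++)
        if (memcmp(text + i, gram, k) == 0)
            return 1;
    return 0;
}

/*
 * Return 1 if text contains every overlapping k-gram of pattern.
 * This is a necessary condition for pattern being a substring of
 * text, but not a sufficient one, so a 1 still needs rechecking.
 */
static int kgram_candidate(const char *text, const char *pattern, size_t k)
{
    size_t n, m;

    assert(text != NULL && pattern != NULL && k > 0);
    n = strlen(text);
    m = strlen(pattern);

    if (m < k)
        return 1;   /* pattern has no k-grams, so nothing can be excluded */

    for (size_t j = 0; j + k <= m; j++)
        if (!has_kgram(text, n, pattern + j, k))
            return 0;
    return 1;
}
```

The filter can report false positives: with k = 3, the text CGTACG contains both 3-grams of ACGT (ACG and CGT) without containing ACGT itself. Larger k makes such coincidences rarer on a four-letter alphabet, which is presumably why 4- to 7-grams were suggested for DNA.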

--
Greetings from Troels Arvin, Copenhagen, Denmark

In response to

Re: Adding a suffix array index at 2004-11-19 12:38:20 from Hannu Krosing
