From: | "Matt Clark" <matt(at)ymogen(dot)net> |
---|---|
To: | "Scott Cain" <cain(at)cshl(dot)org>, "Richard Huxton" <dev(at)archonet(dot)com> |
Cc: | "PgSQL Performance ML" <pgsql-performance(at)postgresql(dot)org>, <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: [SQL] EXTERNAL storage and substring on long strings |
Date: | 2003-08-04 16:56:00 |
Message-ID: | OAEAKHEHCMLBLIDGAFELEEDDDGAA.matt@ymogen.net |
Lists: | pgsql-performance pgsql-sql |
> > 2. If you want to search for a sequence you'll need to deal with the case
> > where it starts in one chunk and ends in another.
>
> I forgot about searching--I suspect that application is why I faced
> opposition for shredding in my schema development group. Maybe I should
> push that off to the file system and use grep (or BLAST). Otherwise, I
> could write a function that would search the chunks first, then after
> failing to find the substring in those, I could start sewing the chunks
> together to look for the query string. That could get ugly (and
> slow--but if the user knows that and expects it to be slow, I'm ok with
> that).
If you know the maximum length of the sequences being searched for, and it is much less than the chunk size, then you could simply have the chunks overlap by that much, thus guaranteeing that every substring will be found in its entirety in at least one chunk.
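
To make the overlap idea concrete, here is a minimal sketch in Python rather than SQL. The function names (`overlapping_chunks`, `find_in_chunks`) and the parameters `chunk_size` and `max_query_len` are made up for illustration and are not part of any schema discussed in this thread. It assumes the whole sequence is available as one string when the chunks are built, and that no query is longer than `max_query_len`.

```python
def overlapping_chunks(sequence, chunk_size, max_query_len):
    """Yield (offset, chunk) pairs where consecutive chunks share
    max_query_len characters (hypothetical helper for illustration)."""
    if not 0 < max_query_len < chunk_size:
        raise ValueError("max_query_len must be positive and smaller than chunk_size")
    # Each chunk starts this far after the previous one, so the last
    # max_query_len characters of a chunk are repeated at the start of the next.
    step = chunk_size - max_query_len
    for start in range(0, len(sequence), step):
        yield start, sequence[start:start + chunk_size]
        if start + chunk_size >= len(sequence):
            break  # this chunk already reaches the end of the sequence


def find_in_chunks(chunks, query):
    """Return the sorted, de-duplicated absolute positions of query,
    searching each chunk independently (no stitching of chunks)."""
    positions = set()
    for offset, chunk in chunks:
        idx = chunk.find(query)
        while idx != -1:
            # A match inside an overlap region is seen in two chunks;
            # the set collapses it to one absolute position.
            positions.add(offset + idx)
            idx = chunk.find(query, idx + 1)
    return sorted(positions)


if __name__ == "__main__":
    seq = "ACGTACGTACGT"
    chunks = list(overlapping_chunks(seq, chunk_size=5, max_query_len=2))
    print(find_in_chunks(chunks, "TA"))
```

The cost is some duplicated storage (roughly `max_query_len` characters per chunk) and the need to de-duplicate hits that fall inside an overlap, which the set above handles.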