From: | Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> |
---|---|
To: | Jose Ildefonso Camargo Tolosa <ildefonso(dot)camargo(at)gmail(dot)com> |
Cc: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: Question about POSIX Regular Expressions performance on large dataset. |
Date: | 2010-08-18 02:33:08 |
Message-ID: | AANLkTinLvxuYR+aN4mCDr3dfCt0AU8yOg5Jwd-Szi5ez@mail.gmail.com |
Lists: | pgsql-sql |
You can do something similar on the same machine if you can come up
with a common way to partition your data. Split your 1B rows into
chunks of 10M or so, put each chunk in its own table, and hit only the
right table. You can use partitioning / table inheritance if you want
to, or just know the table name ahead of time.
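A minimal sketch of the "know the table name ahead of time" approach, in Python with an in-memory SQLite database standing in for PostgreSQL so that it runs anywhere. The per-month naming scheme, the `table_for`, `insert`, and `search` helpers, and the column names are all assumptions made for illustration; Python's `re` syntax also differs from PostgreSQL's POSIX regexps.

```python
import re
import sqlite3
from datetime import datetime


def table_for(ts: datetime) -> str:
    """Hypothetical naming scheme: one table per calendar month."""
    return f"log_{ts.year:04d}_{ts.month:02d}"


def regexp(pattern, value):
    # SQLite evaluates `X REGEXP Y` as regexp(Y, X): pattern first, value second.
    # Python's re engine stands in for PostgreSQL's POSIX `~` operator here.
    return int(value is not None and re.search(pattern, value) is not None)


conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)


def insert(ts: datetime, line: str) -> None:
    """Route the row to the table that owns its month, creating it if needed."""
    table = table_for(ts)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" (ts TEXT, line TEXT)')
    conn.execute(f'INSERT INTO "{table}" VALUES (?, ?)', (ts.isoformat(), line))


def search(month: datetime, pattern: str) -> list:
    """Scan only the one table for the requested month."""
    table = table_for(month)
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    if not exists:
        return []
    rows = conn.execute(
        f'SELECT line FROM "{table}" WHERE line REGEXP ? ORDER BY ts',
        (pattern,),
    )
    return [row[0] for row in rows]
```

The regexp match is still a full scan, but only of the one small table the query is routed to. This only helps when every query carries the partition key (here, the month).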
We did something similar with mnoGoSearch. We break it up into a few
hundred different schemas and hit the one for a particular site, which
keeps the individual mnoGoSearch tables small and fast.
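When a query cannot be routed to a single chunk, the application has to query every chunk and combine the results, which is the "unify" step from the quoted message below. A rough sketch in Python, where each chunk is just an in-memory list of `(timestamp, line)` tuples standing in for a table or server; `search_shard` and `search_all` are hypothetical names, and Python's `re` stands in for POSIX regexps:

```python
import heapq
import re
from concurrent.futures import ThreadPoolExecutor


def search_shard(rows, pattern):
    """Scan one chunk and return its matches sorted by timestamp."""
    rx = re.compile(pattern)
    return sorted((ts, line) for ts, line in rows if rx.search(line))


def search_all(shards, pattern):
    """Query every chunk concurrently, then merge the sorted partial results."""
    with ThreadPoolExecutor(max_workers=len(shards) or 1) as pool:
        partials = list(pool.map(lambda rows: search_shard(rows, pattern), shards))
    # heapq.merge combines already-sorted inputs without re-sorting everything.
    return list(heapq.merge(*partials))
```

With real servers, each worker would run the query over its own database connection; the merge step stays the same as long as every partial result comes back sorted on the same key.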
On Tue, Aug 17, 2010 at 8:30 PM, Jose Ildefonso Camargo Tolosa
<ildefonso(dot)camargo(at)gmail(dot)com> wrote:
> Hi, again,
>
> I just had this wacky idea, and wanted to share it:
>
> what do you think of having the dataset divided among several servers,
> and sending the query to all of them, and then just have the
> application "unify" the results from all the servers?
>
> Would that work for this kind of *one table* search? (there are no
> joins, and will never be). I think it should, but: what do you think?
>
> Ildefonso.
>
> On Tue, Aug 17, 2010 at 9:51 PM, Jose Ildefonso Camargo Tolosa
> <ildefonso(dot)camargo(at)gmail(dot)com> wrote:
>> Hi!
>>
>> I'm analyzing the possibility of using PostgreSQL to store a huge
>> amount of data (around 1000M records or so), and even though the
>> records are short (each has just a timestamp and a string of less
>> than 128 characters), the strings will be matched against POSIX
>> Regular Expressions (different regexps, and maybe complex ones).
>>
>> Because I don't have a system large enough to test this here, I have
>> to ask you (I may borrow a medium-size server, but it would take a
>> week or more, so I decided to ask here first). How is the performance
>> of Regexp matching in PostgreSQL? Can it use indexes? My guess is:
>> no, because I don't see a general way to index for regexp matching :(
>> So, table scans over this huge dataset...
>>
>> What do you think of this?
>>
>> Sincerely,
>>
>> Ildefonso Camargo
>>
--
To understand recursion, one must first understand recursion.