From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Jose Ildefonso Camargo Tolosa <ildefonso(dot)camargo(at)gmail(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Question about POSIX Regular Expressions performance on large dataset.
Date: 2010-08-18 02:28:09
Message-ID: AANLkTimaiPKsy8xgyseSxD1qn-en7ciDP=U6kniNeFwH@mail.gmail.com
Lists: pgsql-sql
On Tue, Aug 17, 2010 at 8:21 PM, Jose Ildefonso Camargo Tolosa
<ildefonso(dot)camargo(at)gmail(dot)com> wrote:
> Hi!
>
> I'm analyzing the possibility of using PostgreSQL to store a huge
> amount of data (around 1000M records or so). Even though the records
> are short (each is just a timestamp and a string of less than 128
> characters), the strings will be matched against POSIX regular
> expressions (different, and possibly complex, regexps).
>
> Because I don't have a system large enough to test this here, I have
> to ask you (I may borrow a medium-size server, but that would take a
> week or more, so I decided to ask here first). How is the performance
> of regexp matching in PostgreSQL? Can it use indexes? My guess is
> no, because I don't see a general way of indexing for regexp matching,
> so this huge dataset would mean table scans.
>
> What do you think of this?
Yes, it can index such things, but only in a fixed way. That is, you
can create functional (expression) indexes over pre-built regexes. But
for queries where the pattern changes each time, you're correct: no
index will be used.
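As a minimal sketch of what that looks like (the table and pattern here are hypothetical, just to illustrate the expression-index idea):

```sql
-- Hypothetical table: ~1000M rows of (timestamp, short string).
CREATE TABLE events (ts timestamptz, msg varchar(128));

-- An expression index can pre-compute one fixed regex match:
CREATE INDEX events_msg_err_idx ON events ((msg ~ '^ERROR:'));

-- A query using that exact same expression can use the index:
SELECT count(*) FROM events WHERE msg ~ '^ERROR:';

-- An ad-hoc regex supplied at query time falls back to a seq scan:
SELECT count(*) FROM events WHERE msg ~ 'some other pattern';
```

The index only helps for the one pattern baked into it, which is why arbitrary regexes still mean table scans.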
Could full text searching be used instead?
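If word-level matching is enough, a GIN index over a tsvector is indexable even for queries not known in advance. A rough sketch (table and query terms are illustrative only):

```sql
-- Hypothetical log table; names are illustrative only.
CREATE TABLE log_lines (ts timestamptz, msg varchar(128));

-- A GIN index over a tsvector makes word-level searches indexable:
CREATE INDEX log_lines_fts_idx
    ON log_lines USING gin (to_tsvector('english', msg));

-- A query phrased as a tsquery can use the index, unlike an ad-hoc regex:
SELECT ts, msg
FROM log_lines
WHERE to_tsvector('english', msg) @@ to_tsquery('english', 'error & timeout');
```

Whether this fits depends on whether the regexes can be rephrased as word/prefix searches rather than arbitrary patterns.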
Next Message: Jose Ildefonso Camargo Tolosa, 2010-08-18 02:30:01, Re: Question about POSIX Regular Expressions performance on large dataset.
Previous Message: Jose Ildefonso Camargo Tolosa, 2010-08-18 02:21:25, Question about POSIX Regular Expressions performance on large dataset.