
Re: large document multiple regex

From:	Jim Nasby <decibel(at)decibel(dot)org>
To:	Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc:	"postgres general" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: large document multiple regex
Date:	2007-02-02 02:54:24
Message-ID:	BF7F6E71-0397-4AB7-B138-9A88B98A8B99@decibel.org
Lists:	pgsql-general

On Jan 26, 2007, at 9:06 AM, Merlin Moncure wrote:
> I am receiving a large (300k+) document from an external agent and
> need to reduce a few interesting bits of data out of the document on
> an insert trigger into separate fields.
>
> Regex seems one way to handle this, but is there any way to avoid
> rescanning the document for each regex? One solution I am kicking
> around is some C hackery but then I lose the expressive power of
> regex. Ideally, I need to be able to scan some text and return a
> comma delimited string of values extracted from it. Does anybody know
> if this is possible or have any other suggestions?

Have you thought about something like
~ '(first_string|second_string|third_string)'?
Obviously your example would be more complex, but I believe that with
careful crafting, you can get regex to do a lot without resorting to
multiple passes.
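
A minimal sketch of the single-pass idea, written in Python rather than PostgreSQL purely for illustration. The field names (order_id, customer, total), their patterns, and the sample document are all hypothetical and do not come from the original question. Each alternative in the combined pattern carries a named group, so one scan of the text can tell which branch matched and collect the values into a comma-delimited string.

```python
import re

# Hypothetical fields: one alternation branch per value of interest.
# Each branch has exactly one named group and no nested groups, so
# match.lastgroup identifies the branch that matched.
PATTERN = re.compile(
    r"order_id=(?P<order_id>\d+)"
    r"|customer=(?P<customer>\w+)"
    r"|total=(?P<total>\d+\.\d+)"
)


def extract_fields(document):
    """Scan the document once, keeping the first value seen for each field."""
    found = {}
    for match in PATTERN.finditer(document):
        name = match.lastgroup
        found.setdefault(name, match.group(name))
    return found


def to_csv(found, order=("order_id", "customer", "total")):
    """Join the extracted values in a fixed order; missing fields become empty."""
    return ",".join(found.get(name, "") for name in order)


sample = "header ... customer=acme ... order_id=42 ... total=19.99 ... footer"
fields = extract_fields(sample)
csv_line = to_csv(fields)
```

The fields may appear in any order in the document; the fixed order passed to to_csv decides the order in the output string. Whether this beats several separate scans depends on the regex engine and the patterns involved, so it is worth measuring on a real document.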
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

In response to

large document multiple regex at 2007-01-26 14:06:50 from Merlin Moncure

Responses

Re: large document multiple regex at 2007-02-02 17:00:27 from Merlin Moncure
