Re: regexp idea

From: David Johnston <polobo(at)yahoo(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: regexp idea
Date: 2013-08-27 19:15:39
Message-ID: 1377630939277-5768731.post@n5.nabble.com
Lists: pgsql-general

rummandba wrote
> Hi,
>
> I have a string like:
> Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A
> final, June 5, 2013
>
> I need to extract date part from the string.
>
> I used the following:
> regexp_matches(title,'[.* ]+\ (Jul|August|Sep)[, a-zA-Z0-9]+' )
>
> But it gives me result August as it stops at "Augustine".
>
> In my case, date can be in different formats, some record may use "," or
> some may not.
>
> Any idea to achieve this?
>
> Thanks.

Not sure how you expect to match "June" with that particular expression but
to solve the mis-matching of "Augustine" you can use the word-boundary
escapes "\m" (word-start) and "\M" (word-end).

Unless you need fuzzy matching on the month name, you should simply list all
twelve months, plus any abbreviations you want to recognize.

^.*\m(June|July|August|September)\M[, a-zA-Z0-9]+
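As a quick check of the word-boundary escapes, here is a minimal sketch run
against the sample string from the question. It assumes the default
standard_conforming_strings = on (so the backslashes need no doubling); the
aliases "t", "s", "unanchored" and "anchored" are only illustrative.

```sql
-- substring(string from pattern) returns the first parenthesized group
-- of the first match, or NULL when nothing matches.
SELECT substring(s from '(Jul|August|Sep)')       AS unanchored,
       substring(s from '\m(Jul|August|Sep)\M')   AS anchored
FROM (VALUES ('Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A final, June 5, 2013')) AS t(s);
-- unanchored picks up "August" from inside "Augustine";
-- anchored finds no whole-word match (June is not in this list), so it is NULL.
```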

I'd consider helping more with forming an actual expression but a single
input sample with zero context on how such a string is created gives little
to work with.

Though after the month there likely cannot be a letter so a better
definition would be:

\m(August)[, ]+(\d+)[, ]+(\d+)
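Spelled out for all twelve months, that idea might look roughly like the
sketch below. It assumes the date is always written as full month name, day,
year, which may not hold for your other records; the aliases "t", "s" and
"parts" are made up for the example.

```sql
-- Captures month name, day and year as three separate array elements.
-- \m and \M keep "August" from matching inside "Augustine".
SELECT regexp_matches(
         s,
         '\m(January|February|March|April|May|June|July|August|September|October|November|December)\M[, ]+(\d+)[, ]+(\d+)'
       ) AS parts
FROM (VALUES ('Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A final, June 5, 2013')) AS t(s);
-- parts: {June,5,2013}
```

Note that regexp_matches returns no row at all when the pattern does not
match, so titles without a recognizable date simply drop out of the result.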

HTH

David J.

