Re: PostgreSQL POSIX regular expression problem

From: "FBurgess(at)Radiantblue(dot)com" <FBurgess(at)Radiantblue(dot)com>
To: Greg Sabino Mullane <greg(at)endpoint(dot)com>
Cc: "pgus-general(at)postgresql(dot)org" <pgus-general(at)postgresql(dot)org>
Subject: Re: PostgreSQL POSIX regular expression problem
Date: 2016-04-07 22:15:20
Message-ID: 3BBE635F64E28D4C899377A61DAA9FE04530599B@NBSVR-MAIL01.radiantblue.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgus-general

Thanks Greg and others,

Here is my workaround to pull out just the corporate phone number in this case. Some of the data has phone numbers in this valid format: (703) 450-5300. This solution doesn't work for these, but all other variations are addressed.

select substring(regexp_replace(regexp_replace('OfficeName: abc office supply PhoneNumber: [commercial] (703)-999-5695, PhoneNumber: [Corporate] (703)-450-5300','[\)]','','g'),'[\(]','','g') from 'PhoneNumber:\s*\[Corporate]\s*([\d\-]+)');

Fred
________________________________________
From: Burgess, Freddie
Sent: Thursday, April 07, 2016 12:16 PM
To: Greg Sabino Mullane
Cc: pgus-general(at)postgresql(dot)org
Subject: RE: [pgus-general] PostgreSQL POSIX regular expression problem

HI Greg, your solution will work for all my variations except for parentheses around the area_code.

pfdb=# select substring('OfficeName: abc office supply PhoneNumber:[commercial] (865)-999-5695, emailaddress: abc(at)office(dot)com' from '([\d\-]+)');
substring
-----------
865
(1 row)

thanks!!
________________________________________
From: Greg Sabino Mullane [greg(at)endpoint(dot)com]
Sent: Thursday, April 07, 2016 11:42 AM
To: Burgess, Freddie
Cc: pgus-general(at)postgresql(dot)org
Subject: Re: [pgus-general] PostgreSQL POSIX regular expression problem

On Wed, Apr 06, 2016 at 06:31:42PM +0000, FBurgess(at)Radiantblue(dot)com wrote:
> * Trying to build a regular expression syntax that doesn't drop the last digit (5) at the very end of the phone number.
>
> testdb=# select substring('OfficeName: abc office supply PhoneNumber: [commercial] 865-999-5695 ' from 'PhoneNumber: \[commercial](.*?)(,|(\d\s+))');

You have two parens in there - the reason you are not getting the last digit is that the first parens (.*?) scoops
up everything - including a leading space - up to the second parens (,|(\d\s+)), which ensures that the final digit
plus whitespace combination is always excluded from the first match.

That's a very specific regex. Can you simply slurp a phone number directly, like so?:

substring('OfficeName: abc office supply PhoneNumber: [commercial] 865-999-5695 ' from '([\d\-]+)')

--
Greg Sabino Mullane greg(at)endpoint(dot)com
End Point Corporation
PGP Key: 0x14964AC8

In response to

Responses

Browse pgus-general by date

  From Date Subject
Next Message Greg Sabino Mullane 2016-04-07 23:44:06 Re: PostgreSQL POSIX regular expression problem
Previous Message FBurgess@Radiantblue.com 2016-04-07 16:16:41 Re: PostgreSQL POSIX regular expression problem