From: | Steve Crawford <scrawford(at)pinpointresearch(dot)com> |
---|---|
To: | Andreas <maps(dot)on(at)gmx(dot)net>, pgsql-sql(at)postgresql(dot)org |
Subject: | Re: How to clean up phone-numbers with regex? |
Date: | 2014-05-19 15:43:49 |
Message-ID: | 537A26B5.3030309@pinpointresearch.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
On 05/19/2014 01:54 AM, Andreas wrote:
> Hi
>
> I need to clean up phone-numbers. Somehow I got a Excel list that has
> weird graphical characters trailing some of the entries.
> My DB is UTF8 so it would store this mess but I don't like to import
> it in the first place.
>
> OK, I know how to read the stuff into a temporary table to clean it up
> before the actual import.
> How can I do an update on the column that deletes every char that is
> not in a given set of chars like '+- 0123456/()'?
>
See: http://www.postgresql.org/docs/current/static/functions-matching.html
For the first case, the regexp_replace function is probably your best
bet. But note that, depending on the quality of your input, just
removing characters outside that range may still not yield the desired
result.
select regexp_replace('(12s3)-456-635/6(a+sdk', '[^0-9()+-/]', '', 'g');
regexp_replace
-------------------
(123)-456-635/6(+
You can remove all formatting by requiring only digits then check and/or
reformat later as desired.
steve=> select regexp_replace('(12s3)-456-6356(a+sdk', '[^0-9]', '', 'g');
regexp_replace
----------------
1234566356
>
> Second but similar question:
> How can I select records that have fields that contain characters not
> included in a given alphabet?
> E.G. find fields that contain some char not in 0-9,a-z,A-Z, +-()/?
>
See regexp_match on the above-referenced page.
Cheers,
Steve
From | Date | Subject | |
---|---|---|---|
Next Message | Steve Crawford | 2014-05-19 15:44:38 | Re: How to clean up phone-numbers with regex? |
Previous Message | Rob Sargent | 2014-05-19 13:39:58 | Re: How to clean up phone-numbers with regex? |