From: | Geoffrey Myers <lists(at)serioustechnology(dot)com> |
---|---|
To: | Vick Khera <vivek(at)khera(dot)org> |
Cc: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: finding bogus UTF-8 |
Date: | 2011-02-15 22:06:07 |
Message-ID: | 4D5AF8CF.9080001@serioustechnology.com |
Lists: | pgsql-general |
Vick Khera wrote:
> On Tue, Feb 15, 2011 at 11:09 AM, Geoffrey Myers
> <lists(at)serioustechnology(dot)com> wrote:
>> comments would be appreciated.
>>
>
> If all you're doing is filtering stdin to stdout and deleting a range
> of characters, it seems that tr would be a faster tool:
>
> cat foo.txt | tr -d '\000-\008\013-\037\177-\377' > foo-cleaned.txt
I toyed with tr for a bit, but could not get it to work. The above did
not work for me either. I'm not exactly sure what it's doing, but here are
a couple of diff lines:
1619c1619
< days integer DEFAULT 28,
---
> days integer DEFAULT 2,
So it appears 'tr' is deleting the '8' character, rather than the octal
value for 008.
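
A likely explanation: 8 is not an octal digit, so '\008' is not a single
escape. tr appears to read it as '\00' (NUL) followed by a literal '8',
which would match the diff above. Below is a minimal sketch of a corrected
filter, assuming the intended upper bound of the first range was octal 010
(decimal 8, backspace). The helper name clean_bytes is only illustrative,
and LC_ALL=C is an assumption to make tr work on raw bytes.

```shell
# Hypothetical helper: delete control bytes and bytes >= 0x7F from stdin.
# \010 is octal for decimal 8 (backspace); tab (\011) and newline (\012)
# are deliberately left alone, as in the original range.
# Note: \177-\377 also removes every byte of any multibyte UTF-8 character.
clean_bytes() {
    LC_ALL=C tr -d '\000-\010\013-\037\177-\377'
}

# The digit 8 survives; the trailing backspace byte (\010) is removed.
printf 'days integer DEFAULT 28,\010\n' | clean_bytes
```

With this set, a line such as the one from the diff should come through
with its '8' intact.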
--
Until later, Geoffrey
"I predict future happiness for America if they can prevent
the government from wasting the labors of the people under
the pretense of taking care of them."
- Thomas Jefferson