Re: LATIN9 - hex in varchar after convert

From: "Steve Tucknott (TuSol)" <steve(at)tusol(dot)co(dot)uk>
To: PostGreSQL <pgsql-novice(at)postgresql(dot)org>
Subject: Re: LATIN9 - hex in varchar after convert
Date: 2020-04-26 08:02:27
Message-ID: 2c0310b2f49cf950c266334a1c3cfa2ed747395f.camel@tusol.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

aaa
On Sun, 2020-04-26 at 04:24 +0200, Laurenz Albe wrote:
> On Sat, 2020-04-25 at 10:41 +0100, Steve Tucknott (TuSol) wrote:
> > I have a table with a varchar(5000) that contains general text. The
> > table is typicallymaintained via a GUI, but on this occasion I
> > received a spreadsheet with data andloaded it - via copy - from a
> > csv extracted from that. The data looked fine in psql,but when
> > looking at the data in the GUI, characters such as single quote
> > marks (')appeared as a series of special characters. I assumed that
> > the spreadsheet then hadsome different encoding (UTF8?) and that I
> > then needed to 'translate' the characters.
>
> Very likely, the characters were not really single quotes, but "curly
> quotes"(UNICODE 201C and 201E) characters.
> One of the following scenarios must have taken place:
> 1. The file was encoded in UTF-8, but when you copied the data in,
> the encoding you specified (or had by default) was a single-gyte
> encoding like LATIN9.
> The curly quotes are more than one byte in UTF-8, but each byte
> was interpreted as a LATIN9 character.
> The solution would be to specify ENCODING 'UTF8' with COPY.
> 2. The characters are actually fine in the database, and you loaded
> them correctly, and your database client encoding is UTF8, but your
> terminal is in LATIN9.
> The characters were displayed correctly, but your terminal
> interpreted each byte as a character.
> To determine which was the case, look what bytes are in the database:
> SELECT badcol, badcol::bytea FROM tab WHERE id = 12345;
> Yours,Laurenz Albe
Laurenz,Thank you for the reply. After correcting my 'convert' mistake
(thanks again to Tom), I decided to drop the rows loaded from the
spreadsheet and correct the source data and reload - only as
Libreoffice has a simple 'find and replace' - so I could copy the
'naff' quote character into the find field and then replace with a
standard quote. Looks fine so far!

Thank you all for your help,
Steve

In response to

Browse pgsql-novice by date

  From Date Subject
Next Message JORGE MALDONADO 2020-04-27 23:27:28 Advice on a table structure
Previous Message Laurenz Albe 2020-04-26 02:24:39 Re: LATIN9 - hex in varchar after convert