Re: Postgres, apps, special characters and UTF-8 encoding

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Ken Tanzer <ken(dot)tanzer(at)gmail(dot)com>, PG-General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Postgres, apps, special characters and UTF-8 encoding
Date: 2017-03-08 00:25:52
Message-ID: c68de6c9-3c1f-a45e-8828-8492b940e4a7@aklaver.com
Lists: pgsql-general

On 03/07/2017 03:20 PM, Ken Tanzer wrote:
> Hi. I've got a recurring problem with character encoding for a
> Postgres-based web PHP app, and am hoping someone can clue me in or at
> least point me in the right direction. I'll confess upfront my
> understanding of encoding issues is extremely limited. Here goes.
>
> The app uses a Postgres database, UTF-8 encoded. Through their
> browsers, users can add and edit records often including text. Most of
> the time this works fine. Though sometimes this will fail with Postgres
> complaining, for example, "Could query with ... , The error text was:
> ERROR: invalid byte sequence for encoding "UTF8": 0xe9 0x20 0x67"
>
> So this generally happens when people copy and paste things out of their
> Word documents and such.
>
> As I understand it, those are likely encoded in something non-UTF-8,
> like WIN-1251 or something. And that one way or another, the encoding
> needs to be translated before it can be placed into the database. I'm
> not clear how this is supposed to happen though. Automatically by the
> browser? Done in the app? Some other way? And if in the app, how is
> one supposed to know what the incoming encoding is?

I don't use PHP, but found this for detecting the encoding of a string:

http://www.php.net/manual/en/function.mb-detect-encoding.php

and this for converting a string from one encoding to another:

http://php.net/manual/en/function.mb-convert-encoding.php
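
For what it's worth, here is the same idea as a minimal sketch in Python rather than PHP: accept the input if it is already valid UTF-8, otherwise assume a fallback encoding and convert it. The function name to_utf8 and the choice of Windows-1252 (cp1252) as the fallback are my assumptions for illustration only. The incoming encoding cannot be detected reliably in general, so the fallback is a guess. The bytes 0xe9 0x20 0x67 from your error are what "é g" looks like in Windows-1252 or Latin-1.

```python
def to_utf8(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Return raw as UTF-8 bytes.

    If raw is already valid UTF-8 it is returned unchanged in content.
    Otherwise it is assumed (a guess, not a detection) to be in the
    fallback encoding and is re-encoded as UTF-8.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes undefined in the fallback encoding become U+FFFD
        # instead of raising a second error.
        text = raw.decode(fallback, errors="replace")
    return text.encode("utf-8")


# 0xe9 followed by a space is not valid UTF-8, so the fallback is used.
sample = b"caf\xe9 gourmand"
print(to_utf8(sample))
```

Doing this check in the app before the INSERT or UPDATE means Postgres only ever sees valid UTF-8. It does not guarantee the text is what the user meant if the fallback guess is wrong.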

>
> Thanks in advance for any help or pointers.
>
> Ken
>
>
> --
> AGENCY Software
> A Free Software data system
> By and for non-profits
> http://agency-software.org/
> https://agency-software.org/demo/client
> ken(dot)tanzer(at)agency-software(dot)org <mailto:ken(dot)tanzer(at)agency-software(dot)org>
> (253) 245-3801
>
> Subscribe to the mailing list
> <mailto:agency-general-request(at)lists(dot)sourceforge(dot)net?body=subscribe> to
> learn more about AGENCY or
> follow the discussion.

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
