Re: how can I fix my accent issues?

From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: "Igniris Valdivia Baez" <igniris(at)gmail(dot)com>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, Pgsql-General <pgsql-general(at)postgresql(dot)org>
Subject: Re: how can I fix my accent issues?
Date: 2023-12-13 16:11:03
Message-ID: 2b754a38-282c-4cb0-99a5-ac019893148b@manitou-mail.org
Lists: pgsql-general

Igniris Valdivia Baez wrote:

> 3. After the revision the data is loaded to the destination database and
> here is where I believe the issue is, because the data is reviewed in
> Windows and somehow Pentaho is not correctly handling the
> interaction between both operating systems.

On Windows, a system set to Spanish would plausibly use Windows-1252
(https://en.wikipedia.org/wiki/Windows-1252)
as the default code page.
On Unix, it might use UTF-8, with a locale like maybe es_CU.UTF-8.

Now if a certain component of your data pipeline assumes that
the input data is in the default encoding of the system, and
the input data appears to be always encoded with Windows-1252,
then only the version running on Windows will have it right.
The one that runs on Unix might translate the bytes that
do not meet its encoding expectations into the U+FFFD
code point (the Unicode replacement character).
At least that's a plausible explanation for the result you're seeing
in the Postgres database.
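
To illustrate the mechanism, here is a minimal Python sketch. The sample
string is made up, and Python only stands in for whichever component of
the pipeline does the decoding; it is not a claim about how Pentaho
behaves internally. It encodes text as Windows-1252, then decodes the same
bytes once as UTF-8 with substitution of invalid bytes, and once with the
correct encoding.

```python
# Hypothetical sample value; "ñ" and "ó" are each a single byte in Windows-1252.
text = "señal de información"
raw = text.encode("cp1252")  # the bytes as a Windows tool would write them

# A reader that assumes UTF-8 and substitutes invalid bytes
# turns each accented letter into U+FFFD.
garbled = raw.decode("utf-8", errors="replace")

# A reader that is told the real encoding recovers the text.
correct = raw.decode("cp1252")

print(garbled)
print(correct)
```

The accented letters are lost in the first case because the bytes 0xF1
and 0xF3 are not valid UTF-8 on their own.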

A robust solution is to not use defaults for encodings and explicitly
declare the encoding of every input throughout the data pipeline.
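
As a sketch of what "explicitly declare the encoding" can look like, the
following Python snippet converts a file from Windows-1252 to UTF-8
without relying on any system default. The function name, file names and
sample content are hypothetical, and it assumes the reviewed files really
are Windows-1252; the same idea applies to the encoding setting of
whatever tool reads the files in your pipeline.

```python
import os
import tempfile


def transcode(src_path, dst_path, src_encoding="cp1252", dst_encoding="utf-8"):
    """Copy a text file, naming both encodings instead of using the OS default."""
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        for line in src:
            dst.write(line)


# Hypothetical file, written the way a Windows tool might write it.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "reviewed_cp1252.csv")
dst_path = os.path.join(workdir, "reviewed_utf8.csv")
with open(src_path, "wb") as f:
    f.write("id,nombre\n1,Peña\n".encode("cp1252"))

transcode(src_path, dst_path)

with open(dst_path, "rb") as f:
    result = f.read()
print(result.decode("utf-8"))
```

Because both encodings are named in the call, the result is the same
whether the script runs on Windows or on Unix.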

Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite
