From: | "Daniel Verite" <daniel(at)manitou-mail(dot)org> |
---|---|
To: | "Igniris Valdivia Baez" <igniris(at)gmail(dot)com> |
Cc: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, Pgsql-General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: how can I fix my accent issues? |
Date: | 2023-12-13 16:11:03 |
Message-ID: | 2b754a38-282c-4cb0-99a5-ac019893148b@manitou-mail.org |
Lists: | pgsql-general |
Igniris Valdivia Baez wrote:
> 3. After the revision the data is loaded to the destiny database and
> here is were I believe the issue is, because the data is reviewed in
> Windows and somehow Pentaho is not understanding correctly the
> interaction between both operating systems.
On Windows, a system in Spanish would plausibly use
https://en.wikipedia.org/wiki/Windows-1252
as the default codepage.
On Unix, it might use UTF-8, with a locale such as es_CU.UTF-8.
Now if a certain component of your data pipeline assumes that
the input data is in the default encoding of the system, and
the input data appears to be always encoded with Windows-1252,
then only the version running on Windows will have it right.
The one that runs on Unix might translate the bytes that
do not meet its encoding expectations into the U+FFFD
code point.
At least that's a plausible explanation for the result you're seeing
in the Postgres database.
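To illustrate the mechanism, here is a minimal Python sketch. It is not taken from the pipeline in question: the sample word and the choice of a UTF-8 reader that substitutes invalid bytes are assumptions made only to reproduce the symptom.

```python
# Bytes as a Windows-1252 writer would produce them for the word "camión".
# The accented "ó" becomes the single byte 0xF3.
raw = "camión".encode("cp1252")

# A reader that assumes UTF-8 cannot interpret the lone byte 0xF3.
# With errors="replace" it substitutes U+FFFD instead of failing.
wrong = raw.decode("utf-8", errors="replace")

# A reader that is told the real encoding recovers the original text.
right = raw.decode("cp1252")

print(repr(wrong))
print(repr(right))
```

Once U+FFFD has been written to the database, the original character cannot be recovered from the stored value; the fix has to happen where the bytes are first decoded.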
A robust solution is to not use defaults for encodings and explicitly
declare the encoding of every input throughout the data pipeline.
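As a sketch of what declaring encodings explicitly can look like, the following Python snippet converts a file from Windows-1252 to UTF-8 without relying on any system default. The `transcode` helper, the file names, and the sample row are hypothetical; in Pentaho the equivalent would be setting the encoding option on the relevant input step rather than leaving it at the platform default.

```python
import os
import tempfile


def transcode(src_path, dst_path, src_encoding, dst_encoding="utf-8"):
    """Copy a text file, decoding and encoding with explicitly named encodings."""
    # errors="strict" raises on bytes that do not fit the declared encoding,
    # instead of silently replacing them with U+FFFD.
    with open(src_path, "r", encoding=src_encoding, errors="strict", newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, errors="strict", newline="") as dst:
        for line in src:
            dst.write(line)


# Hypothetical file names, created in a scratch directory for the demonstration.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "reviewed_on_windows.csv")
dst_path = os.path.join(workdir, "ready_to_load.csv")

# Simulate a file saved on a Spanish Windows system (Windows-1252 bytes).
with open(src_path, "wb") as f:
    f.write("1;camión\n".encode("cp1252"))

# The source encoding is stated explicitly, so the result does not depend
# on whether this runs on Windows or on Unix.
transcode(src_path, dst_path, src_encoding="cp1252")

with open(dst_path, "rb") as f:
    converted = f.read()

print(converted)
```

Using strict error handling is a deliberate choice here: a wrong encoding declaration then fails loudly at conversion time rather than producing replacement characters that only show up later in the database.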
Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite