pgsql: Repair assorted issues in locale data extraction.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Repair assorted issues in locale data extraction.
Date: 2019-04-23 22:51:48
Message-ID: E1hJ4GS-0005LH-St@gemulon.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

Repair assorted issues in locale data extraction.

cache_locale_time (extraction of LC_TIME-related info) had never been
taught the lessons we previously learned about extraction of info related
to LC_MONETARY and LC_NUMERIC. Specifically, commit 95a777c61 taught
PGLC_localeconv() that data coming out of localeconv() was in an encoding
determined by the relevant locale, but we didn't realize that there's a
similar issue with strftime(). And commit a4930e7ca hardened
PGLC_localeconv() against errors occurring partway through, but failed
to do likewise for cache_locale_time(). So, rearrange the latter
function to perform encoding conversion and not risk failure while
it's got the locales set to temporary values.

This time around I also changed PGLC_localeconv() to treat it as FATAL
if it can't restore the previous settings of the locale values. There
is no reason (except possibly OOM) for that to fail, and proceeding with
the wrong locale values seems like a seriously bad idea --- especially
on Windows where we have to also temporarily change LC_CTYPE. Also,
protect against the possibility that we can't identify the codeset
reported for LC_MONETARY or LC_NUMERIC; rather than just failing,
try to validate the data without conversion.

The user-visible symptom this fixes is that if LC_TIME is set to a locale
name that implies an encoding different from the database encoding,
non-ASCII localized day and month names would be retrieved in the wrong
encoding, leading to either unexpected encoding-conversion error reports
or wrong output from to_char(). The other possible failure modes are
unlikely enough that we've not seen reports of them, AFAIK.

The encoding conversion problems do not manifest on Windows, since
we'd already created special-case code to handle that issue there.

Per report from Juan José Santamaría Flecha. Back-patch to all
supported versions.

Juan José Santamaría Flecha and Tom Lane

Discussion: https://postgr.es/m/CAC+AXB22So5aZm2vZe+MChYXec7gWfr-n-SK-iO091R0P_1Tew@mail.gmail.com

Branch
------
REL9_6_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/1e5de96629f639bdb4aa3d67cc49faf0c8c389e5

Modified Files
--------------
src/backend/utils/adt/pg_locale.c | 315 ++++++++++++++---------
src/test/regress/expected/collate.linux.utf8.out | 12 +
src/test/regress/sql/collate.linux.utf8.sql | 2 +
3 files changed, 201 insertions(+), 128 deletions(-)

Browse pgsql-committers by date

  From Date Subject
Next Message Michael Paquier 2019-04-24 00:06:57 pgsql: Fix detection of passwords hashed with MD5
Previous Message Tom Lane 2019-04-23 21:17:40 pgsql: Remove useless comment.