Re: [GENERAL] psql weird behaviour with charset encodings

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, hgonzalez(at)gmail(dot)com
Subject: Re: [GENERAL] psql weird behaviour with charset encodings
Date: 2015-06-18 03:37:48
Message-ID: CAB7nPqTSwtVDXEsjB0QJgQRazVtRVSzPYBi0VfdyvyhnF8D+jQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Thu, Jun 18, 2015 at 9:47 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Wed, Jun 03, 2015 at 05:25:45PM +0900, Michael Paquier wrote:
>> On Tue, Jun 2, 2015 at 4:19 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
>> > On Sun, May 24, 2015 at 2:43 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
>> > > It would be good to purge the code of precisions on "s" conversion specifiers,
>> > > then Assert(!pointflag) in fmtstr() to catch new introductions. I won't plan
>> > > to do it myself, but it would be a nice little defensive change.
>> >
>> > This sounds like a good protection idea, but as it impacts existing
>> > backend code relying in sprintf's port version we should only do the
>> > assertion in HEAD in my opinion, and mention it in the release notes of the
>> > next major version at the time a patch in this area is applied. I guess
>
> Adding the assertion would be master-only. We don't necessarily release-note
> C API changes.

Cool. So we are on the same page.

>> > that we had better backpatch the places using .*s though on back-branches.
>
> I would tend to back-patch only the ones that cause interesting bugs. For
> example, we should not reach the read.c elog() calls anyway, so it's not a big
> deal if the GNU libc bug makes them a bit less helpful in back branches.
> (Thanks for the list of code sites; it was more complete than anything I had.)
> So far, only tar.c looks harmed enough to back-patch.
>
>> Attached is a patch purging a bunch of places from using %.*s, this will
>> make the code more solid when facing non-ASCII strings. I let pg_upgrade
>> and pg_basebackup code paths alone as it reduces the code lisibility by
>> moving out of this separator. We may want to fix them though if file path
>> names have non-ASCII characters, but it seems less critical.
>
> To add the assertion, we must of course fix all uses.

Sure.

> Having seen the patch I requested, I don't like the result as much as I
> expected to like it. The patched code is definitely harder to read and write:
>
>> @@ -1534,7 +1541,10 @@ parseNodeString(void)
>> return_value = _readDeclareCursorStmt();
>> else
>> {
>> - elog(ERROR, "badly formatted node string \"%.32s\"...", token);
>> + char buf[33];
>> + memcpy(buf, token, 32);
>> + buf[33] = '\0';
>> + elog(ERROR, "badly formatted node string \"%s\"...", buf);
>> return_value = NULL; /* keep compiler quiet */
>> }

We could spread what the first patch did in readfuncs.c by having some
more macros doing the duplicated work. Not that it would improve the
code readability of those macros..

> (Apropos, that terminator belongs in buf[32], not buf[33].)

Indeed.

> Perhaps we're better off setting aside the whole idea,

At least on OSX (10.8), I am seeing that no more than the number of
bytes defined by the precision is written. So it looks that we are
safe there. So yes thinking long-term this looks the better
alternative. And I am wondering about the potential breakage that this
could actually have with Postgres third-part tools using src/port's
snprintf.

> or forcing use of snprintf.c on configurations having the bug?

I am less sure about that. It doesn't seem worth it knowing that the
tendance is to evaluate the precision in terms in bytes and not
characters.
--
Michael

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Xavier 12 2015-06-18 07:17:01 Re: pg_xlog on a hot_stanby slave
Previous Message Sameer Kumar 2015-06-18 02:00:19 Re: pg_xlog on a hot_stanby slave

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2015-06-18 03:45:02 Re: 9.5 make world failing due to sgml tools missing
Previous Message Noah Misch 2015-06-18 00:47:56 Re: [GENERAL] psql weird behaviour with charset encodings