Re: 64 bit numbers vs format strings

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: 64 bit numbers vs format strings
Date: 2025-03-02 21:08:33
Message-ID: CA+hUKGLDnhaYB6aCXr3z=K6sU85S9OR7d2bUXZWpy_=yJebUyg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 3, 2025 at 6:21 AM Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
> On 05.12.24 23:18, Thomas Munro wrote:
> > Old: errmsg("hello %llu", (unsigned long long) x)
> > New: errmsg("hello %" PRIu64, x)
>
> I have committed the subset of this patch for pg_checksums.c so that the
> translators and whoever else might be affected can try this out at small
> scale. (I don't expect any particular problems.) Then we can move on
> to the rest in a few weeks, I think.

Good plan, thanks. Here's a rebase.

I also added one more patch that I expect to be more contentious as it
is a UX change. Why do we display LSNs with a slash? I believe there
are two reasons: (1) Back in 2000 didn't require the existence of a 64
bit type, so XLogRecPtr was a struct holding two uint32 values. The
author could still have used "%08X%08X" for both printf and scanf if
that was the only reason. (2) It initially had a real semantic
division into two parts log ID and log offset, which the author
apparently wanted to convey to readers. I didn't check the full
history but I think at some point our log segments (first 16MB, now
initdb-time variable) superseded the log ID concept, which I think
originally had 4GB segments? (I could also have had something to do
with the abandoned undo system's needs, IDK.) That leads to the idea
of ditching the slash and displaying them in the more obvious (to my
aesthetic, anyway, YMMV):

SELECT pg_lsn(23783416::numeric);
- pg_lsn
------------
- 0/16AE7F8
+ pg_lsn
+------------------
+ 00000000016AE7F8

And likewise wherever they appear or are parsed in tools, protocols,
command lines etc.

/me activates flame-proof force field

I realised while contemplating that that my treatment of pgoff_t might
not be quite right in the first patch. It casts off_t (eg from struct
stat) to (pgoff_t) and display as "%" PRId64, which is correct for
Windows where pgoff_t is a typedef to __int64 (actually int64_t would
be more logical, but presumably int64_t is __int64 on that system, not
sure whether that is truly a distinct type according to its native
compiler), but on non-Windows we use the system off_t whose printf
type is unknown to us, and might in theory be a different signed 64
bit type and provoke a warning from GCC/Clang printf attributes.
Perhaps we should define just pgoff_t as int64_t everywhere? There
are no warnings on any of our CI OSes so I assume that those OSes
coincidentally define off_t the same way they define int64_t. That
being the case we could just ignore it for now, but another system
using GCC/Clang printf attributes (eg illumos or the other BSDs) might
not happen to agree. Not done yet.

And one more thing like that: in a couple of places we see warnings on
macOS CI that I'd missed: when printing the result of i64abs() as
PRId64, because it happens to use labs() and it happens to define
int64_t as long long, and when printing a Datum as PRIx64, because
Datum is uintptr_t and it happens to define that as unsigned long. I
suppose we should cast to int64 in the definition of c.h's i64abs()
macro and a couple of similar things, and cast Datum to uint64 in that
one place that wants to print it out. Not done yet, so you can still
see this on macOS CI's build step.

Attachment Content-Type Size
v2-0001-Use-PRI-64-instead-of-ll-in-format-strings-contin.patch text/x-patch 87.5 KB
v2-0002-pgbench-Make-set_random_seed-64-bit-everywhere.patch text/x-patch 1.8 KB
v2-0003-pgbench-Rationalize-types-in-parseScriptWeight.patch text/x-patch 1.7 KB
v2-0004-pgbench-Modernize-integer-parsing-routine.patch text/x-patch 3.0 KB
v2-0005-Remove-historical-slashes-from-LSN-format.patch text/x-patch 250.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2025-03-02 21:48:08 Re: 64 bit numbers vs format strings
Previous Message Alexander Borisov 2025-03-02 20:33:07 Re: Optimization for lower(), upper(), casefold() functions.