Re: Inconsistent printf placeholders

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, dgrowleyml(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Inconsistent printf placeholders
Date: 2024-03-19 09:50:23
Message-ID: 3089594e-c1a3-4266-8ee1-fd37e7e16e9a@eisentraut.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 15.03.24 08:20, Kyotaro Horiguchi wrote:
> diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
> @@ -1369,8 +1369,8 @@ ReadTwoPhaseFile(TransactionId xid, bool missing_ok)
> errmsg("could not read file \"%s\": %m", path)));
> else
> ereport(ERROR,
> - (errmsg("could not read file \"%s\": read %d of %lld",
> - path, r, (long long int) stat.st_size)));
> + (errmsg("could not read file \"%s\": read %zd of %zu",
> + path, r, (Size) stat.st_size)));
> }
>
> pgstat_report_wait_end();

This might be worse, because stat.st_size is of type off_t, which could
be smaller than Size/size_t.

> diff --git a/src/backend/libpq/be-secure-gssapi.c b/src/backend/libpq/be-secure-gssapi.c
> index bc04e78abb..68645b4519 100644
> --- a/src/backend/libpq/be-secure-gssapi.c
> +++ b/src/backend/libpq/be-secure-gssapi.c
> @@ -572,9 +572,9 @@ secure_open_gssapi(Port *port)
> if (input.length > PQ_GSS_RECV_BUFFER_SIZE)
> {
> ereport(COMMERROR,
> - (errmsg("oversize GSSAPI packet sent by the client (%zu > %d)",
> + (errmsg("oversize GSSAPI packet sent by the client (%zu > %zu)",
> (size_t) input.length,
> - PQ_GSS_RECV_BUFFER_SIZE)));
> + (size_t) PQ_GSS_RECV_BUFFER_SIZE)));
> return -1;
> }
>

Might be better to add that cast to the definition of
PQ_GSS_RECV_BUFFER_SIZE instead, so that all code can benefit.

> diff --git a/src/backend/replication/repl_gram.y b/src/backend/replication/repl_gram.y
> index 7474f5bd67..baa76280b9 100644
> --- a/src/backend/replication/repl_gram.y
> +++ b/src/backend/replication/repl_gram.y
> @@ -312,11 +312,6 @@ timeline_history:
> {
> TimeLineHistoryCmd *cmd;
>
> - if ($2 <= 0)
> - ereport(ERROR,
> - (errcode(ERRCODE_SYNTAX_ERROR),
> - errmsg("invalid timeline %u", $2)));
> -
> cmd = makeNode(TimeLineHistoryCmd);
> cmd->timeline = $2;
>
> @@ -352,13 +347,7 @@ opt_slot:
>
> opt_timeline:
> K_TIMELINE UCONST
> - {
> - if ($2 <= 0)
> - ereport(ERROR,
> - (errcode(ERRCODE_SYNTAX_ERROR),
> - errmsg("invalid timeline %u", $2)));
> - $$ = $2;
> - }
> + { $$ = $2; }
> | /* EMPTY */ { $$ = 0; }
> ;
>

I don't think this is correct. It loses the check for == 0.

> diff --git a/src/backend/tsearch/to_tsany.c b/src/backend/tsearch/to_tsany.c
> index 88cba58cba..9d21178107 100644
> --- a/src/backend/tsearch/to_tsany.c
> +++ b/src/backend/tsearch/to_tsany.c
> @@ -191,7 +191,8 @@ make_tsvector(ParsedText *prs)
> if (lenstr > MAXSTRPOS)
> ereport(ERROR,
> (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
> - errmsg("string is too long for tsvector (%d bytes, max %d bytes)", lenstr, MAXSTRPOS)));
> + /* cast values to avoid extra translatable messages */
> + errmsg("string is too long for tsvector (%ld bytes, max %ld bytes)", (long) lenstr, (long) MAXSTRPOS)));
>
> totallen = CALCDATASIZE(prs->curwords, lenstr);
> in = (TSVector) palloc0(totallen);

I think it would be better instead to change the message in tsvectorin()
to *not* use long. The size of long is unportable, so I would rather
avoid using it at all.

> diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
> index 8d28dd42ce..5de490b569 100644
> --- a/src/backend/utils/adt/varlena.c
> +++ b/src/backend/utils/adt/varlena.c
> @@ -3217,8 +3217,9 @@ byteaGetByte(PG_FUNCTION_ARGS)
> if (n < 0 || n >= len)
> ereport(ERROR,
> (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
> - errmsg("index %d out of valid range, 0..%d",
> - n, len - 1)));
> + /* cast values to avoid extra translable messages */
> + errmsg("index %lld out of valid range, 0..%lld",
> + (long long)n, (long long) len - 1)));
>
> byte = ((unsigned char *) VARDATA_ANY(v))[n];

I think this is taking it too far. We shouldn't try to make all similar
messages use the same placeholders. If the underlying types are
different, we should use them. Adding more casts makes the code less
robust overall. The size_t/ssize_t cleanup is different, because there
the types were arguably wrong to begin with, and by using the right
types we move toward more consistency.

> diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
> index 6f0814d9ac..feb4d5dcf4 100644
> --- a/src/bin/pg_combinebackup/pg_combinebackup.c
> +++ b/src/bin/pg_combinebackup/pg_combinebackup.c
> @@ -1301,8 +1301,9 @@ slurp_file(int fd, char *filename, StringInfo buf, int maxlen)
> if (rb < 0)
> pg_fatal("could not read file \"%s\": %m", filename);
> else
> - pg_fatal("could not read file \"%s\": read only %zd of %lld bytes",
> - filename, rb, (long long int) st.st_size);
> + /* cast st_size to avoid extra translatable messages */
> + pg_fatal("could not read file \"%s\": read only %zd of %zu bytes",
> + filename, rb, (size_t) st.st_size);
> }
>
> /* Adjust buffer length for new data and restore trailing-\0 invariant */

Similar to above, casting off_t to size_t is dubious.

> diff --git a/src/port/user.c b/src/port/user.c
> index 7444aeb64b..9364bdb69e 100644
> --- a/src/port/user.c
> +++ b/src/port/user.c
> @@ -40,8 +40,8 @@ pg_get_user_name(uid_t user_id, char *buffer, size_t buflen)
> }
> if (pwerr != 0)
> snprintf(buffer, buflen,
> - _("could not look up local user ID %d: %s"),
> - (int) user_id,
> + _("could not look up local user ID %ld: %s"),
> + (long) user_id,
> strerror_r(pwerr, pwdbuf, sizeof(pwdbuf)));
> else
> snprintf(buffer, buflen,

Also dubious use of "long" here.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2024-03-19 09:53:04 Re: add AVX2 support to simd.h
Previous Message Bertrand Drouvot 2024-03-19 09:41:10 Re: Introduce XID age and inactive timeout based replication slot invalidation