Re: Inconsistent printf placeholders

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: dgrowleyml(at)gmail(dot)com
Cc: peter(at)eisentraut(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Inconsistent printf placeholders
Date: 2024-03-15 07:20:27
Message-ID: 20240315.162027.1235963702431811541.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Fri, 15 Mar 2024 16:01:28 +1300, David Rowley <dgrowleyml(at)gmail(dot)com> wrote in
> On Fri, 15 Mar 2024 at 15:27, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > I have considered only the two messages. Actually, buffile.c and md.c
> > are already like that. The attached aligns the messages in
> > pg_combinebackup.c and reconstruct.c with the precedents.
>
> This looks like a worthy cause to make translator work easier.
>
> I don't want to widen the goalposts or anything, but just wondering if
> you'd searched for any others that could get similar treatment?
>
> I only just had a quick look at the following.
>
> $ cat src/backend/po/fr.po | grep -E "^msgid\s" | sed -E
> 's/%[a-zA-Z]+/\%/g' | sort | uniq -d -c
> 31 msgid ""
> 2 msgid "could not accept SSL connection: %"
> 2 msgid "could not initialize LDAP: %"
> 2 msgid "could not look up local user ID %: %"
...
> I've not looked at how hard it would be with any of the above to
> determine how hard it would be to make the formats consistent. The
> 3rd last one seems similar enough that it might be worth doing
> together with this?

I checked for that kind of msgids in a bit more intensive way. The
numbers are the line numbers in ja.po of backend. I didn't check the
same for other modules.

> ###: invalid timeline %lld
> @ backup/walsummaryfuncs.c:95
> invalid timeline %u
> @ repl_gram.y:318 repl_gram.y:359

In the first case, the %lld can be more than 2^31.

In the second case, %u is uint32. However, the bigger issue is
checking if the uint32 value is negative:

repl_gram.c: 147
> if ((yyvsp[0].uintval) <= 0)
> ereport(ERROR,
> (errcode(ERRCODE_SYNTAX_ERROR),
> errmsg("invalid timeline %u", (yyvsp[0].uintval))));

which cannot happen. I think we can simply remove the useless error
check.

> ###: could not read file \"%s\": read %d of %zu
> @ ../common/controldata_utils.c:116 ../common/controldata_utils.c:119 access/transam/xlog.c:3417 access/transam/xlog.c:4278 replication/logical/origin.c:750 replication/logical/origin.c:789 replication/logical/snapbuild.c:2040 replication/slot.c:2218 replication/slot.c:2259 replication/walsender.c:660 utils/cache/relmapper.c:833
> could not read file \"%s\": read %d of %lld
> @ access/transam/twophase.c:1372
> could not read file \"%s\": read %zd of %zu
> @ backup/basebackup.c:2102
> ###: oversize GSSAPI packet sent by the client (%zu > %zu)
> @ libpq/be-secure-gssapi.c:351
> oversize GSSAPI packet sent by the client (%zu > %d)
> @ libpq/be-secure-gssapi.c:575
> ###: compressed segment file \"%s\" has incorrect uncompressed size %d, skipping
> @ pg_receivewal.c:362
> compressed segment file \"%s\" has incorrect uncompressed size %zu, skipping
> @ pg_receivewal.c:448
> ###: could not read file \"%s\": read only %zd of %lld bytes
> @ pg_combinebackup.c:1304
> could not read file \"%s\": read only %d of %u bytes
> @ reconstruct.c:514

We can "fix" them the same way. I debated whether to use ssize_t for
read() and replace all instances of size_t with Size. However, in the
end, I decided to only keep it consistent with the surrounding code.

> ###: index %d out of valid range, 0..%d
> @ utils/adt/varlena.c:3220 utils/adt/varlena.c:3287
> index %lld out of valid range, 0..%lld
> @ utils/adt/varlena.c:3251 utils/adt/varlena.c:3323
> ###: string is too long for tsvector (%d bytes, max %d bytes)
> @ tsearch/to_tsany.c:194 utils/adt/tsvector.c:277 utils/adt/tsvector_op.c:1126
> string is too long for tsvector (%ld bytes, max %ld bytes)
> @ utils/adt/tsvector.c:223

We can unify them and did in the attached, but I'm not sure that's
sensible..

> ###: could not look up local user ID %d: %s
> @ ../port/user.c:43 ../port/user.c:79
> could not look up local user ID %ld: %s
> @ libpq/auth.c:1886

Both of the above use uid_t (defined as int) as user ID and it is
explicitly cast to "int" in the first case and to long in the second,
which seems mysterious. Although I'm not sure if there's a possibility
of uid_t being widened in the future, I unified them to the latter way
for robustness.

> ###: error while cloning relation \"%s.%s\" (\"%s\" to \"%s\"): %m
> @ file.c:44
> error while cloning relation \"%s.%s\" (\"%s\" to \"%s\"): %s
> @ file.c:65
>
> The latter seems to be changed to %m by reassiging save_errno to errno, as done in other places.
>
> ###: could not get data directory using %s: %m
> @ option.c:448
> could not get data directory using %s: %s
> @ option.c:452
> ###: could not get control data using %s: %m
> @ controldata.c:129 controldata.c:199
> could not get control data using %s: %s
> @ controldata.c:175 controldata.c:507
> ###: %s: %m
> @ copy.c:401 psqlscanslash.l:805 psqlscanslash.l:817 psqlscanslash.l:835
> %s: %s
> @ command.c:2728 copy.c:388

In these cases, %m can be replaced by %s by using
wait_result_to_str(-1), but I'm not sure it is sensible. (I didn't do
that in the attached)

Finally, it doesn't seem possible to sensibly unify everything that
follows.

> ###: could not initialize LDAP: %s
> @ libpq/auth.c:2327
> could not initialize LDAP: %m
> @ libpq/auth.c:2345
> ###: Valid values are between \"%d\" and \"%d\".
> @ access/common/reloptions.c:1616
> Valid values are between \"%f\" and \"%f\".
> @ access/common/reloptions.c:1636
> ###: permission denied for large object %s
> @ catalog/aclchk.c:2745
> permission denied for large object %u
> @ libpq/be-fsstubs.c:871 storage/large_object/inv_api.c:297 storage/large_object/inv_api.c:309 storage/large_object/inv_api.c:506 storage/large_object/inv_api.c:617 storage/large_object/inv_api.c:807
> ###: commutator operator %s is already the commutator of operator %s
> @ catalog/pg_operator.c:739
> commutator operator %s is already the commutator of operator %u
> @ catalog/pg_operator.c:744
> ###: must be owner of large object %s
> @ catalog/aclchk.c:2877
> must be owner of large object %u
> @ catalog/objectaddress.c:2511 libpq/be-fsstubs.c:329
> ###: could not open file \"%s\": %m
> @ replication/slot.c:2186 replication/walsender.c:628 replication/walsender.c:3051 storage/file/copydir.c:151 storage/file/fd.c:803 storage/file/fd.c:3510 storage/file/fd.c:3740 storage/file/fd.c:3830 storage/smgr/md.c:661 utils/cache/relmapper.c:818 utils/cache/relmapper.c:935 utils/error/elog.c:2085 utils/init/miscinit.c:1526 utils/init/miscinit.c:1660 utils/init/miscinit.c:1737 utils/misc/guc.c:4712 utils/misc/guc.c:4762
> could not open file \"%s\": %s
> @ ../port/open.c:115
> ###: %d%s%s is outside the valid range for parameter \"%s\" (%d .. %d)
> @ utils/misc/guc.c:3179
> %g%s%s is outside the valid range for parameter \"%s\" (%g .. %g)
> @ utils/misc/guc.c:3215
> ###: could not accept SSL connection: %m
> @ libpq/be-secure-openssl.c:503
> could not accept SSL connection: %s
> @ libpq/be-secure-openssl.c:546
> ###: invalid value for parameter \"%s\": %d
> @ utils/adt/regexp.c:716 utils/adt/regexp.c:725 utils/adt/regexp.c:1082 utils/adt/regexp.c:1146 utils/adt/regexp.c:1155 utils/adt/regexp.c:1164 utils/adt/regexp.c:1173 utils/adt/regexp.c:1853 utils/adt/regexp.c:1862 utils/adt/regexp.c:1871 utils/misc/guc.c:6761 utils/misc/guc.c:6795
> invalid value for parameter \"%s\": %g
> @ utils/misc/guc.c:6829
> ###: could not close file \"%s\": %m
> @ bbstreamer_file.c:138 pg_recvlogical.c:650
> could not close file \"%s\": %s
> @ receivelog.c:227 receivelog.c:317 receivelog.c:688
> ###: could not fsync file \"%s\": %m
> @ pg_recvlogical.c:204
> could not fsync file \"%s\": %s
> @ receivelog.c:775 receivelog.c:1022 walmethods.c:1206
> ###: could not read from input file: %s
> @ compress_lz4.c:628 compress_lz4.c:647 compress_none.c:97 compress_none.c:140
> could not read from input file: %m
> @ compress_zstd.c:373 pg_backup_custom.c:655
> ###: could not get pg_ctl version data using %s: %m
> @ exec.c:47
> could not get pg_ctl version data using %s: %s
> @ exec.c:51
> ###: negator operator %s is already the negator of operator %s
> @ catalog/pg_operator.c:807
> negator operator %s is already the negator of operator %u
> @ catalog/pg_operator.c:812
> ###: timestamp out of range: \"%s\"
> @ access/transam/xlogrecovery.c:4937 utils/adt/timestamp.c:202 utils/adt/timestamp.c:455
> timestamp out of range: \"%g\"
> @ utils/adt/timestamp.c:762 utils/adt/timestamp.c:774

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
unify_placeholders_of_some_messages_v3.patch text/x-patch 14.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2024-03-15 07:28:53 Re: Inconsistent printf placeholders
Previous Message Alexander Kukushkin 2024-03-15 07:20:15 Re: Infinite loop in XLogPageRead() on standby