From: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: refactoring - share str2*int64 functions |
Date: | 2019-07-17 07:55:39 |
Message-ID: | alpine.DEB.2.21.1907160735480.8986@lancre |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Bonjour Michaël,
> FWIW, I was looking forward to putting my hands on this patch and try
> to get it merged so as we can get rid of those duplications. Here are
> some comments.
>
> +#ifdef FRONTEND
> + fprintf(stderr,
> + "invalid input syntax for type %s: \"%s\"\n",
> "bigint", str);
> +#else
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
> + errmsg("invalid input syntax for type %s: \"%s\"",
> + "bigint", str)));
> +#endif
> Have you looked at using the wrapper pg_log_error() here?
I have not.
I have simply merged the two implementations (pgbench & backend) as they
were.
> +extern bool pg_scanint8(const char *str, bool errorOK, int64 *result);
> +extern uint64 pg_strtouint64(const char *str, char **endptr, int base);
> Hmm. With this patch we have strtoint and pg_strtouint64, which makes
> the whole set inconsistent.
I understand that you mean bits vs bytes? Indeed it can bite!
> +
> #endif /* COMMON_STRING_H */
> Noise diff..
Indeed.
> numutils.c also has pg_strtoint16 and pg_strtoint32, so the locations
> become rather inconsistent with inconsistent APIs for the manipulation
> of int2 and int4 fields, and scanint8 is just a derivative of the same
> logic. We have two categories of routines here:
Yep, but the int2/4 functions are not used elsewhere.
> - The wrappers on top of strtol and strtoul & co, which are named
> respectively strtoint and pg_strtouint64 with the patch. The naming
> part is inconsistent, and we only handle uint64 and int32. We don't
> need to worry about int64 and uint32 because they are not used?
Indeed, it seems that they are not needed/used by client code, AFAICT.
> That's fine by me, but at least let's have a consistent naming.
Ok.
> Prefixing the functions with pg_* is a better practice in my opinion
> as we will unlikely run into conflicts this way.
Ok.
> - The str->integer conversion routines, which actually have very
> similar characteristics to the strtol families as they remove trailing
> whitespaces first, check for a sign, etc, except that they work only
> on base 10. And here we get into a state where pg_scanint8 should be
> actually called pg_strtoint64,
I just removed that:-)
ISTM that the issue is that the error handling of these functions is
pretty different.
> with an interface inconsistent with its int32/int16 relatives now only
> in the backend.
We can, but I'm not at ease with changing the error handling approach.
> Could we consider more consolidation here please? Like moving the whole
> set to src/common/?
My initial plan was simply to remove direct code duplications between
front-end and back-end, not to re-engineer the full set of string to int
conversion functions:-)
On the re-engineering front: Given the various points on the thread, ISTM
that there should probably be two behaviors for str to signed/unsigned
int{16,32,64}, and having only one kind of signature for all types would
be definitely better.
One low-level one that does the conversion or return an error.
Another higher-level one which possibly adds an error message (stderr for
front-end, log for back-end).
One choice is whether there are two functions (the higher one calling the
lower one and adding the messages) or just one with a boolean to trigger
the messages. I do not have a strong opinion. Maybe one function would be
simpler. Andres boolean-compatible enum return looks like a good option.
Overall, this leads to something like:
enum { STRTOINT_OK, STRTOINT_OVERFLOW_ERROR, STRTOINT_SYNTAX_ERROR }
pg_strto{,u}int{8?,16,32,64}
(const char * string, const TYPE * result, const bool verbose);
--
Fabien.
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2019-07-17 07:59:02 | Re: Add parallelism and glibc dependent only options to reindexdb |
Previous Message | Thomas Munro | 2019-07-17 07:16:00 | Re: UCT (Re: pgsql: Update time zone data files to tzdata release 2019a.) |