Re: proposal: minscale, rtrim, btrim functions for numeric

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: "Karl O(dot) Pinc" <kop(at)meme(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Marko Tiikkaja <marko(at)joh(dot)to>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: minscale, rtrim, btrim functions for numeric
Date: 2019-12-08 07:38:38
Message-ID: CAFj8pRDSkO30KbyPC-6bFNqcdA=xq6xM=3tJmNdMw+ugi5z9Ug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

ne 8. 12. 2019 v 2:23 odesílatel Karl O. Pinc <kop(at)meme(dot)com> napsal:

> Hello Pavel,
>
> On Mon, 11 Nov 2019 15:47:37 +0100
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>
> > Here is a patch. It's based on Dean's suggestions.
> >
> > I implemented two functions - first minscale, second trim_scale. The
> > overhead of second is minimal - so I think it can be good to have it.
> > I started design with the name "trim_scale", but the name can be any
> > other.
>
> Here are my thoughts on your patch.
>
> My one substantial criticism is that I believe that
> trim_scale('NaN'::numeric) should return NaN.
> So the test output should look like:
>
> template1=# select trim_scale('nan'::numeric) = 'nan'::numeric;
> ?column?
> ----------
> t
> (1 row)
>

fixed

>
> FWIW, I bumped around the Internet and looked at Oracle docs to see if
> there's any reason why minscale() might not be a good function name.
> I couldn't find any problems.
>
> I also couldn't think of a better name than trim_scale() and don't
> have any problems with the name.
>
> My other suggestions mostly have to do with documentation. Your code
> looks pretty good to me, looks like the existing code, you name
> variables and function names as in existing code, etc.
>
> I comment on various hunks in line below:
>
> diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
> index 28eb322f3f..6f142cd679 100644
> --- a/doc/src/sgml/func.sgml
> +++ b/doc/src/sgml/func.sgml
> @@ -918,6 +918,19 @@
> <entry><literal>6.0000000000</literal></entry>
> </row>
>
> + <row>
> + <entry>
> + <indexterm>
> + <primary>minscale</primary>
> + </indexterm>
> +
> <literal><function>minscale(<type>numeric</type>)</function></literal>
> + </entry>
> + <entry><type>integer</type></entry>
> + <entry>returns minimal scale of the argument (the number of
> decimal digits in the fractional part)</entry>
> + <entry><literal>scale(8.4100)</literal></entry>
> + <entry><literal>2</literal></entry>
> + </row>
> +
> <row>
> <entry>
> <indexterm>
>
> *****
> Your description does not say what the minimal scale is. How about:
>
> minimal scale (number of decimal digits in the fractional part) needed
> to store the supplied value without data loss
> *****
>

sounds better, updated

> @@ -1041,6 +1054,19 @@
> <entry><literal>1.4142135623731</literal></entry>
> </row>
>
> + <row>
> + <entry>
> + <indexterm>
> + <primary>trim_scale</primary>
> + </indexterm>
> +
> <literal><function>trim_scale(<type>numeric</type>)</function></literal>
> + </entry>
> + <entry><type>numeric</type></entry>
> + <entry>reduce scale of the argument (the number of decimal
> digits in the fractional part)</entry>
> + <entry><literal>scale(8.4100)</literal></entry>
> + <entry><literal>8.41</literal></entry>
> + </row>
> +
> <row>
> <entry>
> <indexterm>
>
> ****
> How about:
>
> reduce the scale (the number of decimal digits in the fractional part)
> to the minimum needed to store the supplied value without data loss
> *****
>

ok, changed

> diff --git a/src/backend/utils/adt/numeric.c
> b/src/backend/utils/adt/numeric.c index a00db3ce7a..35234aee4c 100644
> --- a/src/backend/utils/adt/numeric.c
> +++ b/src/backend/utils/adt/numeric.c
>
> ****
> I believe the hunks in this file should start at about line# 3181.
> This is right after numeric_scale(). Seems like all the scale
> related functions should be together.
>
> There's no hard standard but I don't see why lines (comment lines in
> your case) should be longer than 78 characters without good reason.
> Please reformat.
> ****
>
> @@ -5620,6 +5620,88 @@ int2int4_sum(PG_FUNCTION_ARGS)
> PG_RETURN_DATUM(Int64GetDatumFast(transdata->sum));
> }
>
> +/*
> + * Calculate minimal display scale. The var should be stripped already.
>
> ****
> I think you can get rid of the word "display" in the comment.
> ****
>

done

> + */
> +static int
> +get_min_scale(NumericVar *var)
> +{
> + int minscale = 0;
> +
> + if (var->ndigits > 0)
> + {
> + NumericDigit last_digit;
> +
> + /* maximal size of minscale, can be lower */
> + minscale = (var->ndigits - var->weight - 1) *
> DEC_DIGITS; +
> + /*
> + * When there are not digits after decimal point, the
> previous expression
>
> ****
> s/not/no/
> ****
>
> + * can be negative. In this case, the minscale must be
> zero.
> + */
>
> ****
> s/can be/is/
> ****
>
> + if (minscale > 0)
> + {
> + /* reduce minscale if trailing digits in last
> numeric digits are zero */
> + last_digit = var->digits[var->ndigits - 1];
> +
> + while (last_digit % 10 == 0)
> + {
> + minscale--;
> + last_digit /= 10;
> + }
> + }
> + else
> + minscale = 0;
> + }
> +
> + return minscale;
> +}
> +
> +/*
> + * Returns minimal scale of numeric value when value is not changed
>
> ****
> Improve comment, something like:
> minimal scale required to represent supplied value without loss
>

ok

****
>
> + */
> +Datum
> +numeric_minscale(PG_FUNCTION_ARGS)
> +{
> + Numeric num = PG_GETARG_NUMERIC(0);
> + NumericVar arg;
> + int minscale;
> +
> + if (NUMERIC_IS_NAN(num))
> + PG_RETURN_NULL();
> +
> + init_var_from_num(num, &arg);
> + strip_var(&arg);
> +
> + minscale = get_min_scale(&arg);
> + free_var(&arg);
> +
> + PG_RETURN_INT32(minscale);
> +}
> +
> +/*
> + * Reduce scale of numeric value so value is not changed
>
> ****
> Likewise, comment text could be improved
> ****
>
> + */
> +Datum
> +numeric_trim_scale(PG_FUNCTION_ARGS)
> +{
> + Numeric num = PG_GETARG_NUMERIC(0);
> + Numeric res;
> + NumericVar result;
> +
> + if (NUMERIC_IS_NAN(num))
> + PG_RETURN_NULL();
> +
> + init_var_from_num(num, &result);
> + strip_var(&result);
> +
> + result.dscale = get_min_scale(&result);
> +
> + res = make_result(&result);
> + free_var(&result);
> +
> + PG_RETURN_NUMERIC(res);
> +}
>
> /*
> ----------------------------------------------------------------------
> * diff --git a/src/include/catalog/pg_proc.dat
> b/src/include/catalog/pg_proc.dat index 58ea5b982b..e603a5d8dd 100644
>
> ****
> How about moving these new lines to right after line# 4255, the
> scale() function?
> ****
>

has sense, moved

> --- a/src/include/catalog/pg_proc.dat
> +++ b/src/include/catalog/pg_proc.dat
> @@ -4288,6 +4288,12 @@
> proname => 'width_bucket', prorettype => 'int4',
> proargtypes => 'numeric numeric numeric int4',
> prosrc => 'width_bucket_numeric' },
> +{ oid => '3434', descr => 'returns minimal scale of numeric value',
>
> ****
> How about a descr of?:
>
> minimal scale needed to store the supplied value without data loss
> ****
>

done

>
> + proname => 'minscale', prorettype => 'int4', proargtypes =>
> 'numeric',
> + prosrc => 'numeric_minscale' },
> +{ oid => '3435', descr => 'returns numeric value with minimal scale',
>
> ****
> And likewise a descr of?:
>
> numeric with minimal scale needed to represent the given value
> ****
>
> + proname => 'trim_scale', prorettype => 'numeric', proargtypes =>
> 'numeric',
> + prosrc => 'numeric_trim_scale' },
>

done

> { oid => '1747',
> proname => 'time_pl_interval', prorettype => 'time',
> diff --git a/src/test/regress/expected/numeric.out
> b/src/test/regress/expected/numeric.out index 1cb3c3bfab..778c204b13
> 100644
>
> ****
> I have suggestions:
>
> Give the 2 functions separate comments (-- Tests for minscale() and
> -- Tests for trim_scale())
>
> Put () after the function names in the comments
> because that's what scale() does.
>
> Move the lines so the tests are right after the tests of scale().
>
> Be explicit when testing for NULL or NaN. I don't know that this is
> consistent with the rest of the regression tests but I don't see how
> being explicit could be wrong. Otherwise NULL and NaN are output the
> same ("") and you're not really testing.
>
> So test with expressions like "foo(NULL) IS NULL" or
> "foo('NaN'::NUMERIC) = 'NaN::NUMERIC" and look for t (or f) results.
>

ok fixed

Thank you for review - I am sending updated rebased patch. Please, update
comments freely - my language skills (about English lang) are basic.

Regards

Pavel

> ****
>
> Regards,
>
> Karl <kop(at)meme(dot)com>
> Free Software: "You don't pay back, you pay forward."
> -- Robert A. Heinlein
>

Attachment Content-Type Size
minscale-20191208.patch text/x-patch 8.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hadi Moshayedi 2019-12-08 07:40:27 Fix a comment in CreateLockFile
Previous Message Amit Kapila 2019-12-08 05:15:36 Re: Windows buildfarm members vs. new async-notify isolation test