Re: Strange results when casting string to double

From: Carsten Klein <c(dot)klein(at)datagis(dot)com>
To: Gavan Schneider <list(dot)pg(dot)gavan(at)pendari(dot)org>, "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Subject: Re: Strange results when casting string to double
Date: 2022-02-17 08:16:02
Message-ID: 4b255a51-d9cd-8957-0d56-3b022ec0f3ad@datagis.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, Feb 17, 2022 at 00:07 Gavan Schneider wrote:

> Harking back to my long distant formative years I was taught to never ever rely on equality tests when it came to floating point values. Nothing has changed in this regard.
>
> If binary exact is part of the requirement then the answer is INTEGER or NUMERIC never FLOAT, REAL, DOUBLE or anything similar.
>
> So, assuming the trigger function is the only source of this spurious grief one could apply the lesson taught in the 60s from the dawn of computing:
> EQUALITY = absolute_value(op1 - op2) < epsilon — where op1 & op2 are DOUBLE, and epsilon is smaller than you care for
>
> Given the intrinsic (standards compliant) uncertainty when converting from absolute (e.g., string representation) to floating point there will never be value resolving why there are differences.
>
> I suggest using the comparison that is appropriate to the representation of those values or fix the design by using the proper representation.

Just some(!) background:

I know that there are other ways to compare _any_ floating point values.
However, doing that for a whole ROW in a fast manner is not trivial (e.
g. it tends to get quite slow). With the hstore extension and (so
called) binary equality I've found a very fast way which is also fully
generic. hstore uses text representation, so comparisons depend on how
values are converted to and from text.

But all that is not the point. Double precision conversion algorithms
are well defined and deterministic, so it should yield the same results
when called with the same arguments (every time and on every machine).
That is

1.56::double precision == 1.56000000000000005329070518201E0

1.56::double precision != 1.55999999999999983124610025698E0

With reduced precision (as in PG), that gives you:

1.56::double precision == 1.56

1.56::double precision != 1.55999999999999

However, one of my ProstgreSQL servers returns the latter (wrong) value.
You can test this with C library function 'strtod' or with any of the
online converters around:

https://www.binaryconvert.com/result_double.html

Click 'New Conversion' and enter 1.56 into the 'Decimal' field. Then
'Convert to binary' or hit enter.

So, the primary problem of that PostgreSQL server is, that it converts
text to double in a wrong way. Apart from any triggers, "binary
equality" and whatever else I'm doing in this project, this has dramatic
effects on the database, as it's messing up the values that I'm storing:

Imagine I do:

INSERT INTO foo (my_col) VALUES ('Hello World');

But the database has happily stored a different string:

SELECT my_col FROM foo;
my_col
-------------
Hello Worlc
(1 row)

Finding that string again may be done with "fuzzy search" or regular
expressions, but the much better approach is the database not to let
mess up the string while storing it.

Double precision values are limited in precision (in binary) and there
are numbers, that cannot be stored exactly (as it's true for many
decimal numbers, like 1/3 ~ 0.33333 as well). Nevertheless, with a given
maximum of precision, the same values should have the same (well
defined) binary value so that conversions between text and double should
not change the value at any time on any machine.

Carsten

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Thomas Kellerer 2022-02-17 08:41:16 Re: Strange results when casting string to double
Previous Message Gavan Schneider 2022-02-16 23:07:14 Re: Strange results when casting string to double