From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Josh Berkus <josh(at)agliodbs(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PATCH] Negative Transition Aggregate Functions (WIP) |
Date: | 2013-12-26 12:36:49 |
Message-ID: | CAApHDvo8VkzBtD4X2=0krTaE5MPm8wZS9P=Z+ede63tHVkUt1A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Dec 15, 2013 at 2:27 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 12/14/2013 05:00 PM, Tom Lane wrote:
> > This consideration also makes me question whether we should apply the
> > method for NUMERIC. Although in principle numeric addition/subtraction
> > is exact, such a sequence could leave us with a different dscale than
> > is returned by the existing code. I'm not sure if changing the number of
> > trailing zeroes is a big enough behavior change to draw complaints.
>
> If we're going to disqualify NUMERIC too, we might as well bounce the
> feature. Without a fast FLOAT or NUMERIC, you've lost most of the
> target audience.
>
> I think even the FLOAT case deserves some consideration. What's the
> worst-case drift? In general, folks who do aggregate operations on
> FLOATs aren't expecting an exact answer, or one which is consistent
> beyond a certain number of significant digits.
>
> And Dave is right: how many bug reports would we get about "NUMERIC is
> fast, but FLOAT is slow"?
>
>
From what I can tell adding an inverse transition function to support AVG
for numeric does not affect the number of trailing zeros in the results, so
I've attached a patch which now has inverse transition functions to support
numeric types in AVG and all of the STDDEV* aggregates.
The function numeric_avg_accum_inv is the inverse transition function for
AVG. In the attached patch I've set this to be the inverse transition
function for SUM too. I know it was mentioned that having extra trailing
zeros here is a change of results and could be unwanted, but I've left it
in for now and I've included a failing regression test to demonstrate
exactly what has changed in the hope that it may seed a discussion on what
the best solution is.
Here's a quick example of what I'm talking about:
With this query I'll include a call to MAX to force the executor to not use
inverse transitions.
SELECT SUM(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND
UNBOUNDED FOLLOWING),
MAX(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED
FOLLOWING)
FROM (VALUES(1,1.00000000000001),(2,2),(3,3)) v(i,n);
sum | max
------------------+-----
6.00000000000001 | 3
5 | 3
3 | 3
(3 rows)
Notice lack of trailing zeros for the sum results in rows 2 and 3.
Here an inverse transition will be performed... Notice the extra trailing
zeros on rows 2 and 3.
SELECT SUM(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND
UNBOUNDED FOLLOWING)
FROM (VALUES(1,1.00000000000001),(2,2),(3,3)) v(i,n);
sum
------------------
6.00000000000001
5.00000000000000
3.00000000000000
(3 rows)
My ideas so far on what to do about this:
1. Don't allow inverse transition for SUM with numeric, but
since numeric_avg_accum_inv is already written for AVG numeric support we
could include a note in the docs to tell users how to create a SUM
aggregate for numeric that supports inverse transitions.
2. Don't worry about the trailing zeros... After all SELECT AVG(n) FROM
(VALUES(2::numeric)) v(n); adds a whole bunch of trailing zeros. Only the
above 2 queries shows that this behaviour can be a bit weird.
3. Invent some way to remove trailing zeros, perhaps in numeric_out. Is
there a reason they are there? Note that this would affect numeric globally.
What I don't currently know is: Are there any rules about trailing zeros on
numeric? I see that AVG with a single numeric produces many trailing zeros
after the decimal point. Are these meant to be there or is it just a side
effect of dividing on numerics?
Please note that I'm not trying to push for any of the above points. I just
want to get the information I currently have out there as perhaps someone
can think of something better or show a good reason for or against any of
the 3 points.
Regards
David Rowley
--
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
>
Attachment | Content-Type | Size |
---|---|---|
inverse_transition_functions_v1.7.patch.gz | application/x-gzip | 18.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Marko Kreen | 2013-12-26 13:42:12 | Fix memset usage in pgcrypto |
Previous Message | David Rowley | 2013-12-26 09:32:06 | [PATCH] Regression tests in windows ignore white space |