From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com> |
Cc: | Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Possible bug in CASE evaluation |
Date: | 2013-06-25 13:01:52 |
Message-ID: | 20130625130152.GA7716@awork2.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2013-06-24 21:35:53 -0400, Noah Misch wrote:
> On Sat, Jun 22, 2013 at 04:54:50PM +0200, Andres Freund wrote:
> > On 2013-06-21 16:45:28 +0200, Andres Freund wrote:
> > > On 2013-06-21 09:51:05 -0400, Noah Misch wrote:
> > > > That being said, if we discover a simple-enough fix that performs well, we may
> > > > as well incorporate it.
> > >
> > > What about passing another parameter down eval_const_expressions_mutator
> > > (which is static, so changing the API isn't a problem) that basically
> > > tells us whether we actually should evaluate expressions or just perform
> > > the transformation?
> > > There's seems to be basically a couple of places where we call dangerous
> > > things:
> > > * simplify_function (via ->evaluate_function->evaluate_expr) which is
> > > called in a bunch of places
> > > * evaluate_expr which is directly called in T_ArrayExpr
> > > T_ArrayCoerceExpr
> > >
> > > All places I've inspected so far need to deal with simplify_function
> > > returning NULL anyway, so that seems like a viable fix.
> >
> > *Prototype* patch - that seems simple enough - attached. Opinions?
>
> Simple enough, yes. The other point still stands.
You mean performance? Primarily I still think we should first worry
about correctness first and then about performance. And CASE is the
documented (and really only, without writing procedual code) solution to
use for the cases where evaluation order actually *is* important.
But anyway, the question is to find realistic cases to measure the
performance of. Obviously you can just make arbitrarily expensive
expressions that can be computed full during constant folding. Which I
don't find very interesting, do you?
So, what I've done is to measure the performance difference when doing
full table queries of some CASE containing system views.
best of 5 everytime:
SELECT * FROM pg_stats;
master: 28.287 patched: 28.565
SELECT * FROM information_schema.applicable_roles;
master: 0.757 patched: 0.755
regression=# SELECT * FROM information_schema.attributes:
master: 8.392 patched: 8.555
SELECT * FROM information_schema.column_privileges;
master: 90.853 patched: 88.551
SELECT * FROM information_schema.columns;
master: 259.436 patched: 274.145
SELECT * FROM information_schema.constraint_column_usage ;
master: 14.736 patched 15.005
SELECT * FROM information_schema.parameters;
master: 76.173 patched: 79.850
SELECT * FROM information_schema.routines;
master: 45.102 patched: 46.517 ms
...
So, on those queries there's some difference (I've left out the ones
which are too short), but it's not big.
Now, for the other extreme, the following completely random query I just
typed out:
SELECT f FROM (SELECT (CASE g.i WHEN -1 THEN 0 WHEN 1 THEN 3.0/1 WHEN g.i THEN 2.0/3 END) f FROM generate_series(1, 1000000) g(i)) s WHERE f = 0;
master: 491.931 patched: 943.629
suffers way much worse because the division is so expensive...
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2013-06-25 13:08:20 | Re: Possible bug in CASE evaluation |
Previous Message | MauMau | 2013-06-25 12:57:52 | Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks) |