From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Speeding up ExecProject |
Date: | 2009-04-01 21:51:35 |
Message-ID: | 20179.1238622695@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I got interested in why trivial window functions looked really slow,
when using this example in the regression database:
regression=# explain analyze select *, row_number() over (order by unique2) from tenk1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
WindowAgg (cost=0.00..781.25 rows=10000 width=244) (actual time=0.036..28.923 rows=10000 loops=1)
-> Index Scan using tenk1_unique2 on tenk1 (cost=0.00..631.25 rows=10000 width=244) (actual time=0.023..5.727 rows=10000 loops=1)
Total runtime: 30.423 ms
(3 rows)
(Note: all examples in this post are the median of three tries, since
the timings are a bit noisy. That means the data is fully cached.
This is CVS HEAD on Fedora 10 x86_64.)
Profiling soon revealed that the bulk of the runtime was going into
ExecProject() and subsidiary expression-evaluation functions. In fact,
the problem can be illustrated without any window functions at all:
regression=# explain analyze select * from tenk1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) (actual time=0.008..2.250 rows=10000 loops=1)
Total runtime: 3.652 ms
(2 rows)
regression=# explain analyze select *,1 from tenk1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) (actual time=0.011..13.232 rows=10000 loops=1)
Total runtime: 14.602 ms
(2 rows)
The above does not show that ExecEvalConst is 5x slower than the rest
of the query put together. Rather, what is happening is that the moment
the query targetlist contains anything but Vars, we stop using the
optimized ExecVariableList() code and fall back on the generic code in
ExecTargetList(). There are rather a lot of columns in tenk1 (16 to be
exact) and so this is a big cost.
It occurred to me that we could refactor ExecProject and associated
logic so that we use ExecVariableList-like code for all simple Vars in a
targetlist, and only invoke the full expression evaluation machinery for
the non-Var members of a tlist (if any). The attached patch prototypes
this idea (it's code-complete, but the comments suck...). With the
patch, adding ,1 costs a lot less:
regression=# explain analyze select * from tenk1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) (actual time=0.009..2.345 rows=10000 loops=1)
Total runtime: 3.780 ms
(2 rows)
regression=# explain analyze select *,1 from tenk1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) (actual time=0.010..5.949 rows=10000 loops=1)
Total runtime: 7.396 ms
(2 rows)
and the original example looks better too:
regression=# explain analyze select *, row_number() over (order by unique2) from tenk1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
WindowAgg (cost=0.00..781.25 rows=10000 width=244) (actual time=0.035..18.727 rows=10000 loops=1)
-> Index Scan using tenk1_unique2 on tenk1 (cost=0.00..631.25 rows=10000 width=244) (actual time=0.025..5.146 rows=10000 loops=1)
Total runtime: 20.228 ms
(3 rows)
I may be overreacting to the EXPLAIN ANALYZE timings. In practice,
if we were actually returning so many columns to the client, I/O
conversions and data transmission would exceed the ExecProject effort
by a significant margin. Still, there are lots of cases where
targetlists contain a lot of simple Vars, so this looks like it's
probably worth doing. Comments/objections?
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
unknown_filename | text/plain | 17.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Guillaume Smet | 2009-04-01 22:05:04 | Re: [Snowball-discuss] Snowball release cycle ? |
Previous Message | Kevin Grittner | 2009-04-01 21:44:53 | Re: [Snowball-discuss] Snowball release cycle ? |