From: | Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> |
---|---|
To: | Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov> |
Cc: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: slow count in window query |
Date: | 2009-07-29 17:20:17 |
Message-ID: | e08cc0400907291020v7381fa35j18dc1eccd01352c7@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
2009/7/18 Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>:
> If I understand exlain analyze correctly and it tells us the fact,
> WindowAgg without ORDER BY clause gets unreasonably slow. Let me see.
>
I haven't determined the difference between with and without ORDER BY
clause in OVER(), but I took a benchmark that throws an interesting
result.
$ bin/psql regression -c 'explain analyze select count(*) over() from x'
QUERY PLAN
--------------------------------------------------------------------------------
--------------------------------
WindowAgg (cost=0.00..2741.00 rows=100000 width=0) (actual time=3725.294..4559
.828 rows=100000 loops=1)
-> Seq Scan on x (cost=0.00..1491.00 rows=100000 width=0) (actual time=0.11
2..310.349 rows=100000 loops=1)
Total runtime: 4811.115 ms
(3 rows)
The query is quite slow because profiling hook function calls
gettimeofday() each time. And here's the result that counted up
eval_windowaggregate() call and its children functions. Elapse time is
in second and it is subtracted with total gettimeofday() overhead.
eval_windowaggregates:
Count 100000
Elapse 0.588426
Address |Name |Count |Elapse(Total)
0x8204067|initialize_windowaggregate | 1| 0.000277
0x8204d4a|spool_tuples |100002| 0.620092
0x83dcd08|tuplestore_select_read_pointer|100001| 0.011080
0x83dda2f|tuplestore_gettupleslot |100001| 0.049005
0x8204fdd|row_is_in_frame |100000| 0.014978
0x8204168|advance_windowaggregate |100000| 0.025675
0x81ead8a|ExecClearTuple |100000| 0.022105
0x8204462|finalize_windowaggregate | 1| 0.000015
0x8204120|MemoryContextSwitchTo | 2| 0.000000
spool_tuples() is dominant in eval_windowaggregates(). I think it is
not needed if the query contains only simple aggregate like count(*)
OVER () but currently we copy all the rows from the source table to
tuplestore. Even if it fits in memory, the copy operation costs too
much.
I am thinking about how to avoid unnecessary copy overhead...
Regards,
---
Hitoshi Harada
From | Date | Subject | |
---|---|---|---|
Next Message | Kevin Field | 2009-07-29 17:52:56 | 8.4 win32 shared memory patch |
Previous Message | decibel | 2009-07-29 17:16:48 | Re: [RFC] new digest datatypes, or generic fixed-len hex types? |