From: "Albertson, Chris" <CAlbertson(at)primeadvantage(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: "group by" is quite expensive
Date: 2001-05-02 20:41:30
Message-ID: 71EFB3F67FADD3119C0A00508B5544419661ED@mailtest123.com
Lists: pgsql-general
What can I do to speed up queries like the following?

select count(*) from ttt group by xxx;

Using Postgres 7.1 on Solaris 8, the above query on a 1M-row table takes six times longer than the following:

select count(*) from ttt;
With Postgres, "group by" is apparently quite expensive. Using Oracle and MySQL on the same query and the same data, the first query takes only 25% longer than the second. I think both of those DBMSes use a single sequential scan of the data, while PostgreSQL (judging from EXPLAIN) uses a multi-pass process. Yes, I did run VACUUM ANALYZE.
Is there anything I as a user can do? Build some kind of index? In absolute terms, "group by" is a killer: every query on a single 1M-row table that includes "group by" takes about 3 minutes. Oracle takes about 40 seconds and MySQL about 25 seconds.
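For concreteness, here is the kind of thing I have been considering trying. I am not sure either one actually helps in 7.1 (in particular, I don't know whether the 7.1 planner can replace the Sort step with an ordered index scan), so treat this as a sketch using my own table and column names:

```sql
-- Give the sort more working memory so it does not spill to disk.
-- sort_mem is in kilobytes; the 7.1 default is 512 (i.e. 512 KB).
SET sort_mem = 8192;
select count(*) from ttt group by xxx;

-- An index on the grouped column might let the planner produce
-- pre-sorted rows instead of sorting 1M tuples (unverified in 7.1):
CREATE INDEX ttt_xxx_idx ON ttt (xxx);
VACUUM ANALYZE ttt;
select count(*) from ttt group by xxx;
```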
Here is what EXPLAIN shows.
alberch=# explain select count(*) from tyc_main group by nphoto;
NOTICE: QUERY PLAN:
Aggregate (cost=170404.22..175695.88 rows=105833 width=2)
-> Group (cost=170404.22..173050.05 rows=1058332 width=2)
-> Sort (cost=170404.22..170404.22 rows=1058332 width=2)
-> Seq Scan on tyc_main (cost=0.00..49705.32 rows=1058332 width=2)
EXPLAIN
alberch=# explain select count(*) from tyc_main;
NOTICE: QUERY PLAN:
Aggregate (cost=52351.15..52351.15 rows=1 width=0)
-> Seq Scan on tyc_main (cost=0.00..49705.32 rows=1058332 width=0)
EXPLAIN
When I run the queries, the second takes 18 seconds; the first takes a little over 5 _minutes_.