Re: Performance aggregates

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: snpe <snpe(at)infosky(dot)net>
Cc: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>, PostgreSQL-General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Performance aggregates
Date: 2001-05-15 19:37:25
Message-ID: 11350.989955445@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

snpe <snpe(at)infosky(dot)net> writes:
> select roba,sum(izn)
> from e_kalkn k,e_kalkns ks
> where k.id=ks.id
> group by roba
> order by roba
>
> is 2.5 times faster on one commercial database

That other DBMS is probably implementing the GROUP BY + aggregate
using a hash table to compute all the aggregates in parallel,
rather than sorting first as Postgres does. You still need to sort
in the end to meet the ORDER BY spec, but you are only sorting the
aggregate results not the inputs. Disadvantage: if there are a lot
of distinct values of ROBA then your hash table may overrun memory.

We have a TODO item to implement hashed aggregation. Teaching the
planner to make an intelligent choice between sorted and hashed
aggregation will probably be harder than the actual execution code.

regards, tom lane

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Trygve Falch 2001-05-15 20:26:14 Re: SELECT from a table in another database
Previous Message Tom Lane 2001-05-15 19:32:14 Re: Re: Re: Re: What's the best front end/client under MS Windows?