From: | Stuart Brooks <stuartb(at)cat(dot)co(dot)za> |
---|---|
To: | pgsql-sql(at)postgresql(dot)org |
Subject: | Slow GROUP BY query |
Date: | 2008-01-29 14:43:31 |
Message-ID: | 479F3B93.9070907@cat.co.za |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
I have a very simple table set:
Transactions:
transaction_key PRIMARY KEY
client TEXT
time TIMESTAMP
LineItems
transaction_key INT
amount INT
A query to print the contents of transactions with a sum of the line
item amounts provides a very suboptimal result. The problem seems to be
the GROUP BY clause as it doesn't use the primary index. Rewriting the
query to only group on the transaction_key and returning the max of the
other transaction fields results in a query of <1ms. (see queries below)
Can anyone shed any light here, I would have expected the queries to
take roughly the same time?
Out of interest, since we are grouping by transaction_key which is
unique, surely the other Transaction fields in the group by could be
ignored by the planner?
Thanks
Stuart
(running postgresql 8.2.5 on NetBSD 3)
>> Slow query
EXPLAIN
SELECT t.transaction_key,t.cashier,t.time,SUM(l.amount)
FROM Transactions t JOIN LineItems l USING (transaction_key)
GROUP BY t.transaction_key,t.cashier,t.time
ORDER BY t.transaction_key;
QUERY PLAN
-------------------------------------------------------------------------------------------
Sort (cost=449.16..454.16 rows=2000 width=32)
Sort Key: t.transaction_key
-> HashAggregate (cost=314.50..339.50 rows=2000 width=32)
-> Hash Join (cost=66.00..262.07 rows=5243 width=32)
Hash Cond: (l.transaction_key = t.transaction_key)
-> Seq Scan on lineitems l (cost=0.00..117.43 rows=5243 width=16)
-> Hash (cost=41.00..41.00 rows=2000 width=24)
-> Seq Scan on transactions t (cost=0.00..41.00 rows=2000 width=24)
(8 rows)
>> Fast query
EXPLAIN
SELECT t.transaction_key,MAX(t.cashier),MAX(t.time),SUM(l.amount)
FROM Transactions t JOIN LineItems l USING (transaction_key)
GROUP BY t.transaction_key
ORDER BY t.transaction_key;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=0.00..459.11 rows=2000 width=32)
-> Merge Join (cost=0.00..371.68 rows=5243 width=32)
Merge Cond: (t.transaction_key = l.transaction_key)
-> Index Scan using transactions_pkey on transactions t (cost=0.00..86.25 rows=2000 width=24)
-> Index Scan using lineitems_transaction_index on lineitems l (cost=0.00..214.90 rows=5243 width=16)
(5 rows)
From | Date | Subject | |
---|---|---|---|
Next Message | Dawid Kuroczko | 2008-01-29 15:29:50 | JOINing SET returning function. |
Previous Message | Andreas Joseph Krogh | 2008-01-29 10:24:23 | Re: Slow Query problem |