From: Richard Huxton <dev(at)archonet(dot)com>
To: Joseph Shraibman <jks(at)selectacast(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: swap storm created by 8.2.3
Date: 2007-05-25 19:04:07
Message-ID: 46573327.3070803@archonet.com
Lists: pgsql-general
Joseph Shraibman wrote:
>
>
> Richard Huxton wrote:
>> Joseph Shraibman wrote:
>>>>> I ran a query that was "SELECT field, count(*) INTO TEMP temptable"
>>>>> and it grew to be 10gig (as reported by top)
>>>>
>>>> What was the real query?
>>>
>>> First I selected 90634 rows (3 ints) into the first temp table, then
>>> I did "select intfield1, count(intfield2) FROM realtable rt WHERE
>>> rt.id = temptable.id and other conditions on rt here GROUP BY
>>> intfield1". The size of the second temp table should have been no
>>> more than 60000 rows.
>>
> <SNIP>
>>
>> If your memory settings in postgresql.conf are reasonable (and they
>> look fine), this shouldn't happen. Let's see if an EXPLAIN sheds any
>> light.
>>
> => explain SELECT ml.uid, count(ml.jid) AS cnt INTO TEMP tempml FROM ml
> WHERE ml.jid = tempjr1180108653561.id AND ml.status IN(2,5,20) GROUP BY
> ml.uid;
> NOTICE: adding missing FROM-clause entry for table "tempjr1180108653561"
I'm guessing this is just a typo in your test and you'd normally list the
temp table in the FROM clause (see the sketch below the quoted error).
> LINE 2: ...INTO TEMP tempml FROM ml WHERE ml.jid = tempjr1180...
> ^
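Something like this, with the temp table named explicitly in FROM (the
"tj" alias is just for illustration; the rest is your query as posted):

  -- same query, but with the temp table listed in the FROM clause
  SELECT ml.uid, count(ml.jid) AS cnt INTO TEMP tempml
  FROM ml, tempjr1180108653561 tj
  WHERE ml.jid = tj.id AND ml.status IN (2,5,20)
  GROUP BY ml.uid;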
> QUERY PLAN
> ------------------------------------------------------------------------------------------
>
> HashAggregate (cost=11960837.72..11967601.06 rows=541067 width=8)
> -> Hash Join (cost=9675074.94..11849780.55 rows=22211434 width=8)
Here you seem to have 22 million rows estimated for your join.
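(That works out to 22211434 / 2140, i.e. about 10400 "ml" rows per id in
the temp table.)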
> Hash Cond: (tempjr1180108653561.id = ml.jid)
> -> Seq Scan on tempjr1180108653561 (cost=0.00..31.40 rows=2140 width=4)
Is the 2140 rows here a good estimate?
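You said earlier the first temp table held 90634 rows, so 2140 looks like
a never-analysed default. If so, analysing it should give the planner real
numbers, e.g.:

  -- refresh stats on the temp table (run in the session that created it)
  ANALYZE tempjr1180108653561;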
> -> Hash (cost=6511767.18..6511767.18 rows=181979021 width=8)
> -> Seq Scan on ml (cost=0.00..6511767.18 rows=181979021 width=8)
OK, so the estimate of 22 million matches is because "ml" has 181 million
rows passing the status filter. Is that right too?
> Filter: (status = ANY ('{2,5,20}'::integer[]))
Overall it's estimating about 9 times the number of rows you were
expecting (541000 vs 60000). Not enough to account for your extreme
memory usage.
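(For scale: even allowing a generous 100 bytes or so per group, 541067
groups hashed in memory is only around 50MB.)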
Let's see if that hash-join is really the culprit. Can you run EXPLAIN
and then EXPLAIN ANALYSE on the query, but first issue:
SET enable_hashjoin=off;
If that makes little difference, try the same with enable_hashagg.
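For example (tempml2 is just a throwaway name, so your earlier temp table
isn't clobbered):

  SET enable_hashjoin = off;
  EXPLAIN SELECT ml.uid, count(ml.jid) AS cnt INTO TEMP tempml2
  FROM ml, tempjr1180108653561 tj
  WHERE ml.jid = tj.id AND ml.status IN (2,5,20)
  GROUP BY ml.uid;
  -- then the same statement again with EXPLAIN ANALYSE
  RESET enable_hashjoin;
  SET enable_hashagg = off;
  -- and repeat the EXPLAIN / EXPLAIN ANALYSE pair

Bear in mind that EXPLAIN ANALYSE actually executes the query, so it will
create the temp table (and use the memory) as it runs.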
--
Richard Huxton
Archonet Ltd