| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
|---|---|
| To: | Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com> | 
| Cc: | Patrick Rotsaert <patrick(dot)rotsaert(at)arrowup(dot)be>, pgsql-bugs(at)postgresql(dot)org | 
| Subject: | Re: BUG #2225: Backend crash -- BIG table | 
| Date: | 2006-02-03 17:42:56 | 
| Message-ID: | 27525.1138988576@sss.pgh.pa.us | 
| Lists: | pgsql-bugs | 
Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com> writes:
>> pointspp=# explain select trid, count(*) from pptran group by trid
>> having count(*) > 1;
>> QUERY PLAN
>> --------------------------------------------------------------------------
>> HashAggregate  (cost=1311899.28..1311902.78 rows=200 width=18)
>>   Filter: (count(*) > 1)
>>   ->  Seq Scan on pptran  (cost=0.00..1039731.02 rows=36289102 width=18)
>> (3 rows)
>>> Failing that, how many rows should the above return?
>> That is exactly what I am trying to find out. I can only guess, but it
>> should not be more than a few tens of thousands of rows.
The problem is that the HashAgg will have to maintain a counter for
every distinct value of trid, not just those that occur more than
once.  So if there are a huge number of one-time-only values you could
still blow out memory (and HashAgg doesn't currently know how to spill
to disk).
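[Editor's sketch, not part of the original message: one way to sidestep the memory blowout is to disable hash aggregation for the session, forcing a sort-based GroupAggregate, which can spill to disk through an external sort. The query is taken verbatim from the quoted plan.]

```sql
-- Workaround sketch: with enable_hashagg off, the planner falls back to
-- Sort + GroupAggregate, which spills to disk instead of keeping a
-- counter for every distinct trid in memory.
SET enable_hashagg = off;

EXPLAIN
SELECT trid, count(*)
FROM pptran
GROUP BY trid
HAVING count(*) > 1;
```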
That "rows=200" estimate looks suspiciously like a default.  Has this
table been ANALYZEd recently?  I'd expect the planner not to choose
HashAgg if it has a more realistic estimate of the number of groups.
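[Editor's sketch of the suggested check, using the table name from the quoted plan; the exact statements are implied by the advice above rather than quoted from the thread.]

```sql
-- Refresh the statistics the planner uses to estimate the number of groups.
ANALYZE pptran;

-- Re-run the EXPLAIN; with a realistic group-count estimate (rather than
-- the default rows=200), the planner should avoid HashAggregate on its own.
EXPLAIN
SELECT trid, count(*)
FROM pptran
GROUP BY trid
HAVING count(*) > 1;
```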
regards, tom lane