Quick Links

Re: Searching for Duplicates and Hosed the System

From:	Bill Thoen <bthoen(at)gisnet(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Searching for Duplicates and Hosed the System
Date:	2007-08-19 19:40:18
Message-ID:	20070819194018.GA16687@www.gisnet.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

Tom, here's the "explain" results: Does this help explain what went wrong?
(And yes, I think there will be a *lot* of groups.)

explain select count(*) as count,fips_st_cd, fips_cnty_cd, farm_nbr,
tract_nbr, field_nbr,crop_cd,crop_status_cd,practice_cd,seq_nbr from
compliance_2006 group by fips_st_cd, fips_cnty_cd, farm_nbr, tract_nbr,
field_nbr,crop_cd,crop_status_cd,practice_cd,seq_nbr order by 1 desc;

QUERY PLAN
--------------------------------------------------------
Sort (cost=15119390.46..15123902.54 rows=1804832 width=160)
Sort Key: count(*)
-> GroupAggregate (cost=13782933.29..14301822.43 rows=1804832
width=160)
-> Sort (cost=13782933.29..13828054.08 rows=18048318 width=160)
Sort Key: fips_st_cd, fips_cnty_cd, farm_nbr, tract_nbr,
field_nbr, crop_cd, crop_status_cd, practice_cd, seq_nbr
-> Seq Scan on compliance_2006 (cost=0.00..1039927.18
rows=18048318 width=160)
(6 rows)

On Sun, Aug 19, 2007 at 01:19:51PM -0400, Tom Lane wrote:
> Bill Thoen <bthoen(at)gisnet(dot)com> writes:
> > I knew this would take some time, but what I didn't expect was that about
> > an hour into the select, my mouse and keyboard locked up and also I
> > couldn't log in from another computer via SSH. This is a Linux machine
> > running Fedora Core 6 and PostgresQL is 8.1.4. There's about 50GB free on
> > the disc too.
>
> > I finally had to shut the power off and reboot to regain control of my
> > computer (that wasn't good idea, either, but eventually I got everything
> > working again.)
>
> I've seen Fedora go nuts like that when it ran out of memory. Once it
> starts to swap heavily, performance goes into the tank; and once the
> kernel realizes it's in memory trouble, it starts to kill processes
> more or less at random. That might explain why ssh stopped working.
>
> One thing to do to make it more robust is to disable memory overcommit.
> I suspect also that configuring it with lots of swap space is
> counterproductive, because that just encourages the kernel to allow lots
> of swapping. I haven't actually experimented with that part though.
>
> As for why PG ran the system out of memory, I suspect that the planner
> drastically underestimated the number of groups to be created by your
> GROUP BY, and thought it could get away with a hash aggregation. We
> don't currently have any provision for spilling hash aggregation to
> disk, so if there's a very large number of groups the table just gets
> very big :-(. The planner is not supposed to choose hash agg if the
> estimated table size exceeds work_mem ... but if it had out-of-date
> statistics to work with it might have gotten the wrong answer. Have
> you ANALYZEd this table recently? What does EXPLAIN show as the
> estimated number of result rows?
>
> regards, tom lane
>

In response to

Re: Searching for Duplicates and Hosed the System at 2007-08-19 17:19:51 from Tom Lane

Responses

Re: Searching for Duplicates and Hosed the System at 2007-08-20 03:30:24 from Tom Lane

Browse pgsql-general by date

	From	Date	Subject
Next Message	Michael Glaesemann	2007-08-19 20:22:33	Re: entry log
Previous Message	Tino Wildenhain	2007-08-19 19:15:12	Re: Transactional DDL