Re: Searching for Duplicates and Hosed the System

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bill Thoen <bthoen(at)gisnet(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Searching for Duplicates and Hosed the System
Date: 2007-08-19 17:19:51
Message-ID: 19142.1187543991@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Bill Thoen <bthoen(at)gisnet(dot)com> writes:
> I knew this would take some time, but what I didn't expect was that about
> an hour into the select, my mouse and keyboard locked up and also I
> couldn't log in from another computer via SSH. This is a Linux machine
> running Fedora Core 6 and PostgresQL is 8.1.4. There's about 50GB free on
> the disc too.

> I finally had to shut the power off and reboot to regain control of my
> computer (that wasn't good idea, either, but eventually I got everything
> working again.)

I've seen Fedora go nuts like that when it ran out of memory. Once it
starts to swap heavily, performance goes into the tank; and once the
kernel realizes it's in memory trouble, it starts to kill processes
more or less at random. That might explain why ssh stopped working.

One thing to do to make it more robust is to disable memory overcommit.
I suspect also that configuring it with lots of swap space is
counterproductive, because that just encourages the kernel to allow lots
of swapping. I haven't actually experimented with that part though.

As for why PG ran the system out of memory, I suspect that the planner
drastically underestimated the number of groups to be created by your
GROUP BY, and thought it could get away with a hash aggregation. We
don't currently have any provision for spilling hash aggregation to
disk, so if there's a very large number of groups the table just gets
very big :-(. The planner is not supposed to choose hash agg if the
estimated table size exceeds work_mem ... but if it had out-of-date
statistics to work with it might have gotten the wrong answer. Have
you ANALYZEd this table recently? What does EXPLAIN show as the
estimated number of result rows?

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Harpreet Dhaliwal 2007-08-19 17:25:48 Re: Transactional DDL
Previous Message Webb Sprague 2007-08-19 17:16:30 Re: Postgresql performance in production environment