Searching for Duplicates and Hosed the System

From: Bill Thoen <bthoen(at)gisnet(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Searching for Duplicates and Hosed the System
Date: 2007-08-19 16:44:51
Message-ID: 20070819164450.GA15623@www.gisnet.com
Lists: pgsql-general

I'm new to PostgreSQL and I ran into a problem I don't want to repeat. I have
a database with a little more than 18 million records that takes up about
3GB. I need to check to see if there are duplicate records, so I tried a
command like this:

SELECT count(*) AS count, fld1, fld2, fld3, fld4 FROM MyTable
GROUP BY fld1, fld2, fld3, fld4
ORDER BY 1 DESC;
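
If only the duplicated rows are of interest, a variant of the same query can drop the groups that occur once. This is a minimal sketch using the same table and column names as above. It assumes a "duplicate" means all four fields match, and it still has to group every row, so it is not expected to make the query any lighter on the machine; it only shortens the result.

```sql
-- Same grouping as above, but report only groups that occur more than once.
SELECT count(*) AS count, fld1, fld2, fld3, fld4 FROM MyTable
GROUP BY fld1, fld2, fld3, fld4
HAVING count(*) > 1   -- keep only combinations with at least two rows
ORDER BY 1 DESC;
```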

I knew this would take some time, but what I didn't expect was that about
an hour into the select, my mouse and keyboard locked up and I couldn't
log in from another computer via SSH either. This is a Linux machine
running Fedora Core 6, and PostgreSQL is version 8.1.4. There's about 50GB
free on the disk too.

I finally had to shut the power off and reboot to regain control of my
computer (that wasn't a good idea, either, but eventually I got everything
working again).

Is this normal behavior for PG with large databases? Did I misconfigure
something? Does anyone know what might be wrong?

- Bill Thoen
