Re: Searching for Duplicates and Hosed the System

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bill Thoen <bthoen(at)gisnet(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Searching for Duplicates and Hosed the System
Date: 2007-08-21 05:04:55
Message-ID: 21792.1187672695@sss.pgh.pa.us
Lists: pgsql-general

Bill Thoen <bthoen(at)gisnet(dot)com> writes:
> I ran VACUUM ANALYZE just before I launched this and there were no other
> postgres jobs running. I'm the only user as well. I also ran EXPLAIN
> prior to the run and got this:

> Nested Loop  (cost=11.71..28800.34 rows=7219 width=584)
>   ->  Seq Scan on ers_regions e  (cost=0.00..71.90 rows=16 width=28)
>         Filter: (region = 1)
>   ->  Bitmap Heap Scan on compliance_2006 a  (cost=11.71..1788.76 rows=451 width=584)
>         Recheck Cond: ((a.fips_st_cd = "outer".fips_st) AND (a.fips_cnty_cd = "outer".fips_cou))
>         ->  Bitmap Index Scan on key_tract  (cost=0.00..11.71 rows=451 width=0)
>               Index Cond: ((a.fips_st_cd = "outer".fips_st) AND (a.fips_cnty_cd = "outer".fips_cou))
> (7 rows)

Do those estimated row counts look sane to you --- in particular the
estimate of 7219 rows out of the whole query?

AFAICS a plan of this shape simply cannot consume a huge amount of
memory on the server, no matter how badly off the rowcount estimates
are. However, if it sends enough rows to the client, the *client* side
could be having a problem absorbing the data. libpq is designed to
try to absorb the full result rowset --- the API it provides is not
amenable to partial result sets.
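
To illustrate (this is a sketch, not your actual code --- the SELECT text
stands in for whatever query you ran, and it assumes an already-open
PGconn), a plain libpq client does roughly this, and PQexec does not
return until every row of the result has been buffered on the client:

    #include <libpq-fe.h>

    /* PQexec buffers the entire result set in client memory before it
     * returns, so a join that produces millions of wide rows can exhaust
     * the client's RAM even though the server side is fine. */
    static void
    run_big_query(PGconn *conn)
    {
        PGresult *res = PQexec(conn, "SELECT ...");   /* the big join */

        if (PQresultStatus(res) == PGRES_TUPLES_OK)
        {
            int ntuples = PQntuples(res);  /* all rows are already in memory here */

            /* ... walk the rows with PQgetvalue(res, i, j) ... */
        }
        PQclear(res);
    }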

> I looked in the log and saw this:
> LOG: transaction ID wrap limit is 1073746500, limited by database "postgres"
> LOG: transaction ID wrap limit is 1073746500, limited by database "postgres"
> LOG: could not send data to client: Broken pipe

Those first two messages are unrelated --- they are just routine
autovacuum output. The third one says that a client process crashed.
So now I'm thinking that the memory-overrun problem was on the client
side.

If you need to deal with very large result sets, the standard advice
is to use a cursor so you can pull a few hundred or thousand rows
at a time via FETCH.
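
A minimal sketch of that pattern in libpq terms (the cursor name and the
batch size of 1000 are arbitrary, and the SELECT is again a stand-in for
your real query):

    #include <stdio.h>
    #include <libpq-fe.h>

    /* Assumes conn is an already-open PGconn.  Instead of pulling the
     * whole result at once, declare a cursor and FETCH one batch at a
     * time, so the client never holds more than one batch in memory. */
    static void
    fetch_in_batches(PGconn *conn)
    {
        PGresult *res;

        PQclear(PQexec(conn, "BEGIN"));
        PQclear(PQexec(conn, "DECLARE bigcur CURSOR FOR SELECT ..."));

        for (;;)
        {
            res = PQexec(conn, "FETCH 1000 FROM bigcur");
            if (PQresultStatus(res) != PGRES_TUPLES_OK)
            {
                fprintf(stderr, "FETCH failed: %s", PQerrorMessage(conn));
                PQclear(res);
                break;
            }
            if (PQntuples(res) == 0)       /* no rows left */
            {
                PQclear(res);
                break;
            }
            /* ... process this batch with PQgetvalue(res, i, j) ... */
            PQclear(res);
        }

        PQclear(PQexec(conn, "CLOSE bigcur"));
        PQclear(PQexec(conn, "COMMIT"));
    }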

regards, tom lane
