From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bill Thoen <bthoen(at)gisnet(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Searching for Duplicates and Hosed the System
Date: 2007-08-21 05:04:55
Message-ID: 21792.1187672695@sss.pgh.pa.us
Lists: pgsql-general

Bill Thoen <bthoen(at)gisnet(dot)com> writes:
> I ran VACUUM ANALYZE just before I launched this and there were no other
> postgres jobs running. I'm the only user as well. I also ran EXPLAIN
> prior to the run and got this:
>  Nested Loop  (cost=11.71..28800.34 rows=7219 width=584)
>    ->  Seq Scan on ers_regions e  (cost=0.00..71.90 rows=16 width=28)
>          Filter: (region = 1)
>    ->  Bitmap Heap Scan on compliance_2006 a  (cost=11.71..1788.76 rows=451 width=584)
>          Recheck Cond: ((a.fips_st_cd = "outer".fips_st) AND (a.fips_cnty_cd = "outer".fips_cou))
>          ->  Bitmap Index Scan on key_tract  (cost=0.00..11.71 rows=451 width=0)
>                Index Cond: ((a.fips_st_cd = "outer".fips_st) AND (a.fips_cnty_cd = "outer".fips_cou))
> (7 rows)
Do those estimated row counts look sane to you --- in particular the
estimate of 7219 rows out of the whole query?
AFAICS a plan of this shape simply cannot consume a huge amount of
memory on the server, no matter how badly off the rowcount estimates
are. However, if it sends enough rows to the client, the *client* side
could be having a problem absorbing the data. libpq is designed to
try to absorb the full result rowset --- the API it provides is not
amenable to partial result sets.
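
As a quick sanity check on both points, it may be worth counting how many
rows the join actually produces.  This is only a sketch: the query below is
reconstructed from the conditions shown in the plan, since the original
statement isn't quoted here.

    -- join and filter conditions copied from the EXPLAIN output above;
    -- substitute your actual query if it differs
    SELECT count(*)
      FROM ers_regions e
      JOIN compliance_2006 a
        ON a.fips_st_cd = e.fips_st
       AND a.fips_cnty_cd = e.fips_cou
     WHERE e.region = 1;

If that count comes back far larger than 7219, the client was being asked
to absorb a much bigger result set than the planner expected.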
> I looked in the log and saw this:
> LOG: transaction ID wrap limit is 1073746500, limited by database "postgres"
> LOG: transaction ID wrap limit is 1073746500, limited by database "postgres"
> LOG: could not send data to client: Broken pipe
Those first two messages are unrelated --- they are just routine
autovacuum output. The third one says that a client process crashed.
So now I'm thinking that the memory-overrun problem was on the client
side.
If you need to deal with very large result sets, the standard advice
is to use a cursor so you can pull a few hundred or thousand rows
at a time via FETCH.
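
For instance, something along these lines (the cursor name and fetch size
are arbitrary, and the SELECT is reconstructed from the plan above rather
than taken from your original statement):

    BEGIN;
    DECLARE bigjoin CURSOR FOR
        -- reconstructed from the EXPLAIN output; use your real query here
        SELECT *
          FROM ers_regions e
          JOIN compliance_2006 a
            ON a.fips_st_cd = e.fips_st
           AND a.fips_cnty_cd = e.fips_cou
         WHERE e.region = 1;
    FETCH FORWARD 1000 FROM bigjoin;  -- repeat until it returns no rows
    CLOSE bigjoin;
    COMMIT;

That way the client only has to hold a thousand rows or so in memory at
any one time.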
regards, tom lane