From: El-Lotso <el(dot)lotso(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: el(dot)lotso(at)gmail(dot)com
Subject: 500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running
Date: 2007-09-11 16:57:48
Message-ID: 1189529868.17184.17.camel@neuromancer.home.net
Lists: pgsql-performance

Hi,

I'd appreciate it if someone could offer some pointers on this.

PG 8.2.4, 1.4 GHz Centrino (single core), 1.5 GB RAM, 5400 rpm laptop HD

3 main tables which have already been selected "out" into separate tables
(using create table foo as select * from foo_main where x = y)
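
One thing worth double-checking with tables built that way: a freshly
created table has no planner statistics until it is analyzed, which by
itself can produce badly wrong plans. A minimal sketch, with the
hypothetical names from above:

create table foo as select * from foo_main where x = y;
analyze foo;  -- without this the planner falls back to default row estimates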

These test tables contain only a very small subset of the main tables'
data (max 1k to 10k rows vs 1.5 to 7 million rows in the main tables).

Table definitions and the actual query are attached. (Names have been
altered to protect the innocent.)

I've played around with some tweaking of the postgresql.conf settings per
guidance from jdavis (in IRC) without much (any) improvement. Also tried
re-writing the queries to NOT use subselects (per depesz, also in IRC),
which likewise yielded nothing spectacular.

The only thing I noticed was that when the subqueries combine more than
3 tables, PG chokes. With only 3 joined tables per subquery, the
results come out fast, even for 6K rows.
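
Roughly, the shape that chokes is a join of subqueries where one
subquery pulls in a 4th table (names invented for illustration; the
real query is in the attachment):

select s1.id, s1.a, s2.b
from (select t1.id, t2.a
      from t1
      join t2 on t2.id = t1.id
      join t3 on t3.id = t1.id
      join t4 on t4.id = t1.id) s1   -- 4 tables joined: this is where it chokes
left outer join
     (select t5.id, t6.b
      from t5
      join t6 on t6.id = t5.id
      join t7 on t7.id = t5.id) s2   -- 3 tables joined: this shape stays fast
on s2.id = s1.id;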

But if the subqueries (which, run independently, execute fast and
return results in 1 to 10 secs) are done separately and placed into
temp tables, and then finally joined together using a query such as

select a.a, b.b, c.c
from a
inner join b on (x = x)
left outer join c on (x = y)

then it would also be fast.
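
Spelled out with the same hypothetical names as the sketch above, the
fast workaround is something like:

-- materialize each subquery first; each runs in 1 to 10 secs on its own
create temp table s1 as
select t1.id, t2.a
from t1
join t2 on t2.id = t1.id
join t3 on t3.id = t1.id
join t4 on t4.id = t1.id;

create temp table s2 as
select t5.id, t6.b
from t5
join t6 on t6.id = t5.id
join t7 on t7.id = t5.id;

analyze s1;  -- give the planner real row counts for the temp tables
analyze s2;

-- then the final join over the materialized pieces runs quickly
select s1.id, s1.a, s2.b
from s1
left outer join s2 on s2.id = s1.id;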

Settings tried (postgresql.conf):

work_mem = 8MB / 32MB / 128MB (32MB default in my setup)
effective_cache_size = 128MB / 500MB (500MB default)
shared_buffers = 200MB
geqo_threshold = 5 (default 12)
geqo_effort = 2 (default 5)
random_page_cost = 8.0 (default 4)
maintenance_work_mem = 64MB
join_collapse_limit = 1/8/15 (8 default)
from_collapse_limit = 1/8/15 (8 default)
enable_nestloop = off (on by default)
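
For quick experiments, everything above except shared_buffers can also
be flipped per-session instead of editing postgresql.conf, e.g.:

set work_mem = '32MB';
set effective_cache_size = '500MB';
set geqo_threshold = 5;
set geqo_effort = 2;
set random_page_cost = 8.0;
set maintenance_work_mem = '64MB';
set join_collapse_limit = 8;
set from_collapse_limit = 8;
set enable_nestloop = off;
-- shared_buffers needs a server restart, so it stays in postgresql.conf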

Based on current performance, even with a small number of rows in the
individual tables (max 20k), I can't even get a result out in 2 hours
(with more than 3 tables joined per subquery), which is making me
re-think PG's usefulness.

BTW, I also tried the CVS STABLE branch (8.2.4).

Attachment Content-Type Size
PG_long_running_query.txt text/plain 12.0 KB
