From: Richard Huxton <dev(at)archonet(dot)com>
To: philb(at)vodafone(dot)ie
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query plan for very large number of joins
Date: 2005-06-02 16:02:28
Message-ID: 429F2D94.3030805@archonet.com
Lists: pgsql-performance

philb(at)vodafone(dot)ie wrote:
> Hi,
>
> I am using PostgreSQL (7.4) with a schema that was generated
> automatically (using hibernate). The schema consists of about 650
> relations. One particular query (also generated automatically)
> consists of left joining approximately 350 tables.
May I be the first to offer an "ouch"!
> At this stage, most tables are empty and those with values have less
> than 50 entries. The query takes about 90 seconds to execute (on a
> P4, 2.6GHz).
>
> All of the relations have a primary key which is indexed and all of
> the joins are on foreign keys which are explicitly declared. I've
> checked the obvious tunables (effective_cache_size, shared_memory and
> sort_buffer) but changing these has had no effect. The system has a
> total of 750MB RAM, I've varied the shared memory up to 256MB and the
> sort buffer up to 128MB without affecting the performance.
Of those, sort_mem is the only one I can see helping with a single query.
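For what it's worth, you can set it per-session rather than server-wide (a rough sketch - note that on 7.4 the value is in kilobytes):

    -- sort_mem is measured in KB in 7.4 (renamed work_mem in 8.0) and
    -- applies per sort operation, so a per-session SET is safer than
    -- raising it for every backend in postgresql.conf.
    SET sort_mem = 131072;  -- 128MB, the figure you already tried

    -- Then re-run the problem query and compare timings.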
> Running the query as a JDBC prepared statement indicates that the
> query optimiser is spending a negligible amount of time on the task
> (~ 36 ms) compared to the executor (~ 90 seconds). The output of
> EXPLAIN indicates (AFAICT) that all of the joins are of type "Nested
> Loop Left Join" and all of the scans are of type "Seq Scan". I have
> refrained from posting the query and the query plan since these are
> 80K and 100K apiece but if anyone wants to see them I can certainly
> forward them on.
Well, if most tables are small then a seq-scan makes sense. Does it look
like it's estimating the number of rows badly anywhere? I'm not sure the
list will accept attachments that large - is it possible to upload them
somewhere accessible?
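In the meantime, you can check the estimates yourself by comparing estimated against actual row counts (a rough sketch - the table names here are made up):

    -- Refresh the planner's statistics first, in case they're stale.
    ANALYZE;

    -- EXPLAIN ANALYZE prints estimated rows alongside actual rows;
    -- a large mismatch on any join is a clue the planner is off.
    EXPLAIN ANALYZE
    SELECT *
    FROM parent p
    LEFT JOIN child c ON c.parent_id = p.id;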
> My (uninformed) suspicion is that the optimiser has failed over to
> the default plan on the basis of the number of tables in the join. My
> question is, is there anyone out there using PostgreSQL with this
> size of schema? Is there anything that can be done to bring about the
> order of magnitude increase in speed that I need?
Well - the genetic planner must surely be kicking in here (see the
run-time configuration chapter of the manuals, query-planning,
geqo_threshold). However, I'm not sure how much leeway there is in
planning a largely left-joined query.
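Still, the relevant knobs are easy to poke at (untested sketch, 7.4 settings):

    -- See whether the genetic optimiser is enabled and when it triggers.
    SHOW geqo;            -- on by default
    SHOW geqo_threshold;  -- GEQO is used once a query joins this many items

    -- One experiment: disable GEQO so the standard planner handles the
    -- query. Caution: with hundreds of relations, exhaustive planning
    -- could take a very long time.
    SET geqo = off;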
It could be there's some overhead in the executor that's only noticeable
with hundreds of tables involved; at 90 seconds for ~350 joins you're
running at about 0.25 secs per join.
I take it you have no control over the schema or query, so there's not
much fiddling you can do. You've tried sort_mem, so there are only two
things I can think of:
1. Try the various enable_xxx config settings and see if disabling
seq-scans or the relevant join type does anything (I'm not sure it will)
- there's a sketch after this list.
2. Try against 8.0 - there may be some improvement there.
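That first experiment looks roughly like this (untested; the settings only last for the session, so reconnecting undoes them):

    -- These don't disable the methods outright; they just make the
    -- planner cost them very highly, so it picks alternatives where
    -- any exist.
    SET enable_seqscan = off;   -- push towards index scans
    SET enable_nestloop = off;  -- push towards merge/hash joins

    -- Then EXPLAIN ANALYZE the query again and compare plans and timings.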
Other people on this list have experience with larger systems than mine,
so they may be able to help more.
--
Richard Huxton
Archonet Ltd