Re: Slow Multi-joins performance [DEVELOPERS attn please]

From: "Ross J(dot) Reedstrom" <reedstrm(at)rice(dot)edu>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: jlparkinson(at)bigpond(dot)com, pgsql-sql(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Slow Multi-joins performance [DEVELOPERS attn please]
Date: 2002-09-09 16:37:16
Message-ID: 20020909163716.GB19968@rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Richard -
Your analysis of this looks right on, to me. With current code,
if you put in explicit JOINS, the table get joined in that order,
no questions. By specifying an all JOIN version, you've made the
optimizers job very easy: only one plan to consider.

Your point about realistic data and complexity of queries is a good one.
There has been some recent work on doing something to cache query plans,
so if the same query gets run a lot, you only pay the planning cost a
few times. Not sure hoe much of that code (if any) made it into 7.3.

As an aside, the EXPLAIN text shows row estimates of 10, when we _know_
the tables have 1 row each, so VACUUM ANALYZE needs to be run. Doing
so (on a 7.1.2 datbse, BTW) cuts the measured execution time in half
(though not to instantanious, since planning still occurs) Letting the
planner/optimzer know as much as possible is almost always a good thing.

Ross

On Mon, Sep 09, 2002 at 04:24:08PM +0100, Richard Huxton wrote:
> On Friday 06 Sep 2002 11:59 am, jlparkinson(at)bigpond(dot)com wrote:

<snip artificial 13 way join example>

> Interesting - I get something similar here. If I rewrite the view with
> explicit joins as below:
>
> SELECT t.id, a.name AS ta, b.name AS tb ... FROM t JOIN a ON t.ta=a.id JOIN b
> ON t.tb=b.id ...
>
> it returns instantly. Running an EXPLAIN ANALYSE, both have similar query
> plans with twelve nested joins and 13 seq scans (as you'd expect for tables
> with 1 row each). The only apparent difference is the order of the seq scans.
> The best bit is the
>
> Total runtime: 4.32 msec (original)
> Total runtime: 5.32 msec (explicit JOINs)
>
> Which says to me that your form is fine. Testing says otherwise, so there must
> be some element of the query that is not being accounted for in EXPLAIN
> ANALYSE. Your log shows the genetic algorithm (geqo_main line) kicking in
> because it sees a complex query and it could be that this is the problem -
> PostgreSQL takes a look at the 13-way join and thinks it's going to be very
> expensive. If you had a genuinely complex query, the time to analyse options
> would be a benefit, but here I'm guessing it's not. Perhaps try it with
> increasing amounts of data and more restrictions and see if performance stays
> constant.

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Peter Atkins 2002-09-09 16:49:38 Changing Column Type
Previous Message Masaru Sugawara 2002-09-09 16:23:29 Re: the best way to get the topest 3 record in every group