Re: left outer join terrible slow compared to inner join

From: "Clay Luther" <claycle(at)cisco(dot)com>
To: "Greg Stark" <gsstark(at)mit(dot)edu>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: left outer join terrible slow compared to inner join
Date: 2003-08-28 21:59:58
Message-ID: F67EB38120F7BB4BB972C78609580207108E5F@ipcbu-exchange.amer.unity.cisco.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> That said. My first guess as to the real problem. Of the 37
> tables 36 of them
> aren't being accessed using indexes. Do you have indexes on
> the join columns?
> Perhaps you should? Postgres performs better when it has indexe

Heh...well, first let me say:

1) Our database is highly normalized.
2) All joins in the query are performed across indeces.
3) It IS a huge query.

There is a reason to the madness. Namely, this query was driven by a client application requirement. In a particular operational case, the application needed to be able to say "give me all this information now!" without making round trips to the database. The query itself has grown over the years (last time I looked at it, it was only 24 joins, not 37). But, as I said before,

1) It works
2) It works VERY fast (in SQLServer)
3) It works in production and has for years now

When I looked at the query for the first time in years last week, here is an email I sent to other people on my team:

--snip--
Here's the ugly one, the station-d statement:

How or why we ever wrote such a monster is beyond me :-)
--snip--

But, it exists, it works, and it currently not scheduled to be changed. Such is the nature of inertia.

:-)

cwl

> -----Original Message-----
> From: Greg Stark [mailto:gsstark(at)mit(dot)edu]
> Sent: Thursday, August 28, 2003 4:32 PM
> To: pgsql-general(at)postgresql(dot)org
> Subject: Re: [GENERAL] left outer join terrible slow compared to inner
> join
>
>
> "Clay Luther" <claycle(at)cisco(dot)com> writes:
>
> > Interstingly enough, the EXPLAIN ANALYZE itself took 90+ seconds:
>
> "explain" would produce just the plan, normally in a few ms.
> This query might
> take a while though.
>
> "explain analyze" says to produce the plan and then actually
> run the query and
> annotate the plan with the actual timing results at each
> node. Note the
> "actual time" labels on each row. So it's not suprising that
> it took 90s.
>
>
> Now, uh, there are 37 tables involved in this query. That's
> kind of a lot.
> Like, really, a lot. It's possible this is a sane, if
> extremely normalized
> design, but well, still. 37 is a big number.
>
> Postgres has to consider 37 factorial different ways of
> combining these
> tables. or about
> 13,763,750,000,000,000,000,000,000,000,000,000,000,000,000
> different combinations. That makes it harder for it to come
> up with the best
> combination. You might consider rewriting it to use the ANSI
> join syntax "LEFT
> JOIN" and "RIGHT JOIN" if you haven't already. That might help it out.
>
>
> That said. My first guess as to the real problem. Of the 37
> tables 36 of them
> aren't being accessed using indexes. Do you have indexes on
> the join columns?
> Perhaps you should? Postgres performs better when it has indexes.
>
>
> I'm a bit puzzled how one could get up to 37 tables in a
> single query other
> than just having taken normalization a bit too far. But if
> that was the
> thinking then I would expect the joins to be on the primary
> keys of all the
> tables, which would presumably have indexes. So, well, I
> guess I'll just stay
> puzzled.
>
> --
> greg
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to
> majordomo(at)postgresql(dot)org
>

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ron Johnson 2003-08-28 22:04:19 Re: Replication Ideas
Previous Message Williams, Travis L, NEO 2003-08-28 21:55:27 Join question