CROSS JOIN performance

From: Andy Chambers <achambers(at)mcna(dot)net>
To: pgsql <pgsql-general(at)postgresql(dot)org>
Subject: CROSS JOIN performance
Date: 2012-02-21 13:10:43
Message-ID: CAAfW55o8Duta-GxHqc84AE=BMZOkJe6tPbANDWZjyb7mC0o2Zg@mail.gmail.com
Lists: pgsql-general

Hi,

In porting a big MySQL app to PostgreSQL, we're finding lots of
queries like

select foo
from (foo f, bar b)
left join caz c on c.id = f.caz_id
where f.id = b.foo_id

I've seen the message where Tom explains why this is invalid in ANSI
SQL, so I converted it to

select foo
from foo f CROSS JOIN bar b
left join caz c on c.id = f.caz_id
where f.id = b.foo_id

...and it works. However, it is sometimes quite slow. When we've looked
into the slow ones, we've found that changing it again to

select foo
from foo f INNER JOIN bar b ON f.id = b.foo_id
left join caz c on c.id = f.caz_id

makes it perform much better.

Furthermore, we're starting to find that the third form performs
significantly better than the second *ONLY* when the CROSS JOINs are
followed by more joins (as in this case). If no more tables are
being joined, changing to the third form yields no performance
gain.

Are these three queries logically equivalent (well, at least the
latter two, since the first isn't valid SQL)? If so, does it make
sense that the optimizer has difficulty with the second case?
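
For what it's worth, one quick way to sanity-check the equivalence of
the latter two forms is to run both against a small dataset and compare
the rows. The sketch below does that with Python's built-in sqlite3
module rather than PostgreSQL, so it only speaks to the result sets and
says nothing about the plans or timings (that would need EXPLAIN ANALYZE
on the real database). The table contents and the selected columns are
made up for illustration, and the join condition on caz is assumed to be
c.id = f.caz_id.

```python
import sqlite3

# In-memory SQLite database standing in for the real schema (assumption:
# only the columns named in the queries above exist).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE caz (id INTEGER PRIMARY KEY);
    CREATE TABLE foo (id INTEGER PRIMARY KEY, caz_id INTEGER);
    CREATE TABLE bar (id INTEGER PRIMARY KEY, foo_id INTEGER);

    -- Hypothetical sample rows: foo 2 has no caz, foo 3 points at a
    -- missing caz and has no bar, bar 103 points at a missing foo.
    INSERT INTO caz VALUES (10);
    INSERT INTO foo VALUES (1, 10), (2, NULL), (3, 99);
    INSERT INTO bar VALUES (100, 1), (101, 1), (102, 2), (103, 7);
""")

# Second form: CROSS JOIN with the join condition in WHERE.
cross_join_sql = """
    SELECT f.id, b.id, c.id
    FROM foo f CROSS JOIN bar b
    LEFT JOIN caz c ON c.id = f.caz_id
    WHERE f.id = b.foo_id
    ORDER BY f.id, b.id
"""

# Third form: INNER JOIN with the join condition in ON.
inner_join_sql = """
    SELECT f.id, b.id, c.id
    FROM foo f INNER JOIN bar b ON f.id = b.foo_id
    LEFT JOIN caz c ON c.id = f.caz_id
    ORDER BY f.id, b.id
"""

rows_cross = conn.execute(cross_join_sql).fetchall()
rows_inner = conn.execute(inner_join_sql).fetchall()

# Both forms should produce the same rows on this data.
print(rows_cross == rows_inner)
print(rows_cross)
```

Matching rows on a toy dataset is of course not a proof, but it is a
cheap check before comparing the two plans on the real tables.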

Cheers,
Andy

--
Andy Chambers
