Re: multiple joins + Order by + LIMIT query performance issue

From: Shaun Thomas <sthomas(at)leapfrogonline(dot)com>
To: Antoine Baudoux <ab(at)taktik(dot)be>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: multiple joins + Order by + LIMIT query performance issue
Date: 2008-05-06 16:43:40
Message-ID: 1210092220.14833.33.camel@berners-lee
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, 2008-05-06 at 16:03 +0100, Antoine Baudoux wrote:

> My understanding is that in the first case the sort is
> done after all the table joins and filtering, but in the
> second case ALL the rows in t_event are scanned and sorted
> before the join.

You've actually run into a problem that's bitten us in the ass a couple
of times. The problem with your second query is that it's *too*
efficient. You'll notice the first plan uses a bevy of nest-loops,
which is very risky if the row estimates are not really really accurate.
The planner says "Hey, customer_id=1 could be several rows in the
t_network table, but not too many... I better check them one by one."
I've turned off nest-loops sometimes to avoid queries that would run
several hours due to mis-estimation, but it looks like yours was just
fine.

The second query says "Awesome! Only one network... I can just search
the index of t_event backwards for this small result set!"

But here's the rub... try your query *without* the limit clause, and you
may find it's actually faster, because the planner suddenly thinks it
will have to scan the whole table, so it choses an alternate plan
(probably back to the nest-loop). Alternatively, take off the order-by
clause, and it'll remove the slow backwards index-scan.

I'm not sure what causes this, but the problem with indexes is that
they're not necessarily in the order you want unless you also cluster
them, so a backwards index scan is almost always the wrong answer.
Personally I consider this a bug, and it's been around since at least
the 8.1 tree. The only real answer is that you have a fast version of
the query, so try and play with it until it acts the way you want.

--

Shaun Thomas
Database Administrator

Leapfrog Online
807 Greenwood Street
Evanston, IL 60201
Tel. 847-440-8253
Fax. 847-570-5750
www.leapfrogonline.com

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Steve Atkins 2008-05-06 16:56:16 Re: What constitutes a complex query
Previous Message Shaun Thomas 2008-05-06 16:43:29 Re: need to speed up query