Re: Query performance issue

From: "Jonathan Gray" <jgray(at)streamy(dot)com>
To: "'Chris'" <dmagick(at)gmail(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Query performance issue
Date: 2007-07-24 09:50:18
Message-ID: 0f3101c7cdd8$0b359740$21a0c5c0$@com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

That particular plan is our goal because we've "hacked" it together to
perform better than the normal sql plans. Analytically it makes sense to
approach this particular problem in this way because it is relatively
invariant to the distributions and sizes of the tables (with only having to
deal with increased index size).

Also, changing around the query doesn't change the query plan at all. The
planner is intelligent enough to figure out what it really needs to join on
despite my poor query writing. I originally had it this way to ensure my
(customerid,orderstamp) conditions were in the correct order but again
appears to not matter.

I will try to get a more complex/sophisticated test case running. I'm not
able to post my actual structure or queries but I'll try to produce a better
example of the other (multiple table) case tomorrow.

Thanks.

Jonathan Gray

-----Original Message-----
From: Chris [mailto:dmagick(at)gmail(dot)com]
Sent: Tuesday, July 24, 2007 2:36 AM
To: Jonathan Gray
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] Query performance issue

Jonathan Gray wrote:
> Chris,
>
> Creating indexes on the customerclass table does speed up the queries but
> still does not create the plan we are looking for (using the double index
> with a backward index scan on the orders table).

Stupid question - why is that particular plan your "goal" plan?

> The plans we now get, with times on par or slightly better than with the
> plpgsql hack, are:
>
> EXPLAIN ANALYZE
> SELECT o.orderid,o.orderstamp FROM indextest.orders o
> INNER JOIN indextest.customerclass cc ON (cc.classid = 2)
> WHERE o.customerid = cc.customerid ORDER BY o.orderstamp DESC LIMIT 5;

Didn't notice this before...

Shouldn't this be:

INNER JOIN indextest.customerclass cc ON (o.customerid = cc.customerid)
WHERE cc.classid = 2

ie join on the common field not the classid one which doesn't appear in
the 2nd table?

> As I said, this is a hypothetical test case we have arrived at that
> describes our situation as best as we can given a simple case. We're
> interested in potential issues with the approach, why postgres would not
> attempt something like it, and how we might go about implementing it
> ourselves at a lower level than we currently have (in SPI, libpq, etc).
>
> If it could be generalized then we could use it in cases where we aren't
> pulling from just one table (the orders table) but rather trying to merge,
> in sorted order, results from different conditions on different tables.
> Right now we use something like the plpgsql or plpythonu functions in the
> example and they outperform our regular SQL queries by a fairly
significant
> margin.

I'm sure if you posted the queries you are running with relevant info
you'd get some help ;)

--
Postgresql & php tutorials
http://www.designmagick.com/

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Marc Mamin 2007-07-24 14:38:43 Re: Postgres configuration for 64 CPUs, 128 GB RAM...
Previous Message Chris 2007-07-24 09:36:25 Re: Query performance issue