Re: Bottleneck?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ip Wing Kin John <wkipjohn(at)gmail(dot)com>
Cc: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Bottleneck?
Date: 2009-08-10 12:02:02
Message-ID: 603c8f070908100502if38c0d9qe817c26b96006aea@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Aug 10, 2009 at 2:22 AM, Ip Wing Kin John<wkipjohn(at)gmail(dot)com> wrote:
> Hi Scott,
>
> Thanks for you suggestion. I have follow your suggestion by disable
> nestloop and have a substantial improvement. Takes 51s now. I have
> attached the new query plan in another file.
>
> What I want to ask is, is there any other way to hint the planner to
> choose to use merge join rather than nested loop by modifying my SQL?
> I did try to sort my second inner join by the join condition, but the
> planner still prefer to use nested loop.
>
> As I am afraid changing the system wide configuration will have some
> side effect on my other queries.
>
> Here is my SQL.
>
> select * from dummymediastatus rec INNER JOIN ( SELECT volumeGUID ,
> MAX(startDatetime) AS msdt FROM dummymediastatus INNER JOIN ( select *
> from getcurrentguids(1249281281666,'hardware.volume',null,null) ) AS
> cfg ON ( cfg.getcurrentguids = volumeGUID) WHERE startDatetime <=
> 1249281281666 GROUP BY volumeGUID ) AS rec2 ON (  rec.volumeGUID =
> rec2.volumeGUID AND  rec.startDatetime = rec2.msdt ) where  (  ( 1>0
> and 1>0 )  and  rec.acsGUID in ( SELECT * FROM resolve('acs0') ) )
> order by rec.startDatetime DESC,rec.id DESC;

It looks to me like a big chunk of your problem is here:

-> Function Scan on getcurrentguids (cost=0.00..260 .00 rows=1000
width=32) (actual time=977.013..997.404 rows=80000 loops=1)

The planner's estimate of the number of rows is off by a factor of 80
here. You should probably think about inlining the SQL contained
inside that function, if possible. You might also want to look at the
"rows" setting of CREATE OR REPLACE FUNCTION.

As tempting as it is to encapsulate some of your logic into a
set-returning function of some sort, as you've done here, I've found
that it tends to suck. Even if you fix the row estimate, the planner
will still estimate join selectivity etc. poorly for those rows
because, of course, there are no statistics.

...Robert

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2009-08-10 15:19:18 Re: SQL select query becomes slow when using limit (with no offset)
Previous Message Scott Marlowe 2009-08-10 07:07:40 Re: Bottleneck?