Re: Join query on 1M row table slow

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: CSN <cool_screen_name90001(at)yahoo(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Join query on 1M row table slow
Date: 2004-02-10 22:20:06
Message-ID: Pine.LNX.4.33.0402101515350.29717-100000@css120.ihs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Well, it looks like the number of rows estimate for the nested loop in the
first query and the hash agg in the second are off by a factor 3 for the
first query, and a factor of 20 for the second. Try running number 1 with
set enable_nestloop = off
and see if the first one gets faster.

You might also wanna try turning off hash aggregate on the second one and
see how that works.

upping the analyze target on those two tables may help a bit too.

On Tue, 10 Feb 2004, CSN wrote:

>
> Doh! Yeah, now I remember ;)
>
> QUERY 1:
>
> => explain analyze select p.* from product_categories
> pc inner join products p on pc.product_id = p.id where
> pc.category_id = 1016 order by p.title limit 25 offset
> 0;
>
> QUERY PLAN
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=9595.99..9596.05 rows=25 width=290)
> (actual time=514.808..514.942 rows=25 loops=1)
> -> Sort (cost=9595.99..9598.45 rows=986
> width=290) (actual time=514.794..514.868 rows=25
> loops=1)
> Sort Key: p.title
> -> Nested Loop (cost=0.00..9546.96 rows=986
> width=290) (actual time=0.672..421.732 rows=2358
> loops=1)
> -> Index Scan using idx_pc_category_id
> on product_categories pc (cost=0.00..3607.28 rows=986
> width=4) (actual time=0.343..125.762 rows=2358
> loops=1)
> Index Cond: (category_id = 1016)
> -> Index Scan using pkey_products_id
> on products p (cost=0.00..6.01 rows=1 width=290)
> (actual time=0.075..0.083 rows=1 loops=2358)
> Index Cond: ("outer".product_id =
> p.id)
> Total runtime: 516.174 ms
> (9 rows)
>
>
> QUERY 2:
>
> => explain analyze select p.* from products p where
> p.id in ( select product_id from product_categories pc
> where pc.category_id = 1016) order by p.title limit 25
> offset 0;
>
> QUERY PLAN
>
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=4282.18..4282.24 rows=25 width=290)
> (actual time=447.852..447.979 rows=25 loops=1)
> -> Sort (cost=4282.18..4282.46 rows=111
> width=290) (actual time=447.836..447.904 rows=25
> loops=1)
> Sort Key: p.title
> -> Nested Loop (cost=3609.75..4278.41
> rows=111 width=290) (actual time=104.256..358.182
> rows=2358 loops=1)
> -> HashAggregate
> (cost=3609.75..3609.75 rows=111 width=4) (actual
> time=103.922..114.836 rows=2358 loops=1)
> -> Index Scan using
> idx_pc_category_id on product_categories pc
> (cost=0.00..3607.28 rows=986 width=4) (actual
> time=0.401..92.253 rows=2358 loops=1)
> Index Cond: (category_id =
> 1016)
> -> Index Scan using pkey_products_id
> on products p (cost=0.00..6.01 rows=1 width=290)
> (actual time=0.069..0.076 rows=1 loops=2358)
> Index Cond: (p.id =
> "outer".product_id)
> Total runtime: 449.370 ms
> (10 rows)
>
>
> -CSN
>
>
> --- "scott.marlowe" <scott(dot)marlowe(at)ihs(dot)com> wrote:
> > On Tue, 10 Feb 2004, CSN wrote:
> >
> > > > 2. Vacuum analyze the tables concerned and post
> > the
> > > > output of EXPLAIN ANALYSE
> > > > with your query.
> > >
> > > => explain analyze;
> > >
> > > results in:
> > >
> > > ERROR: syntax error at or near ";" at character
> > 16
> >
> > No silly. you do:
> >
> > explain analyze select ... (rest of the query...)
> >
> > and it runs the query and tells you how long each
> > bit took and what it
> > THOUGHT it would get back in terms of number of rows
> > and what it actually
> > got back.
> >
> > Let us know...
> >
>
>
> __________________________________
> Do you Yahoo!?
> Yahoo! Finance: Get your refund fast by filing online.
> http://taxes.yahoo.com/filing.html
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ed L. 2004-02-10 22:20:09 Re: DB cache size strategies
Previous Message CSN 2004-02-10 22:04:49 Re: Join query on 1M row table slow