Re: Help with rewriting query

From: Bruno Wolff III <bruno(at)wolff(dot)to>
To: Junaili Lie <junaili(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Help with rewriting query
Date: 2005-06-09 02:59:07
Message-ID: 20050609025907.GB6139@wolff.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, Jun 08, 2005 at 15:48:27 -0700,
Junaili Lie <junaili(at)gmail(dot)com> wrote:
> Hi,
> The suggested query below took forever when I tried it.
> In addition, as suggested by Tobias, I also tried to create index on
> food(p_id, id), but still no goal (same query plan).
> Here is the explain:
> TEST1=# explain select f.p_id, max(f.id) from Food f, Person p where
> (f.p_id = p.id) group by p.id;

The above is going to require reading all the food table (assuming no
orphaned records), so the plan below seems reasonable.

> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------
> GroupAggregate (cost=0.00..214585.51 rows=569 width=16)
> -> Merge Join (cost=0.00..200163.50 rows=2884117 width=16)
> Merge Cond: ("outer".id = "inner".p_id)
> -> Index Scan using person_pkey on person p
> (cost=0.00..25.17 rows=569 width=8)
> -> Index Scan using person_id_food_index on food f
> (cost=0.00..164085.54 rows=2884117 width=16)
> (5 rows)
>
>
>
>
> TEST1=# explain select p.id, (Select f.id from food f where
> f.p_id=p.id order by f.id desc limit 1) from person p;

Using a subselect seems to be the best hope of getting better performance.
I think you almost got it right, but in order to use the index on
(p_id, id) you need to order by f.p_id desc, f.id desc. Postgres won't
deduce this index can be used because f.p_id is constant in the subselect,
you need to give it some help.

> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------
> Seq Scan on Person p (cost=100000000.00..100007015.24 rows=569 width=8)
> SubPlan
> -> Limit (cost=0.00..12.31 rows=1 width=8)
> -> Index Scan Backward using food_pkey on food f
> (cost=0.00..111261.90 rows=9042 width=8)
> Filter: (p_id = $0)
> (5 rows)
>
> any ideas or suggestions is appreciate.
>
>
> On 6/8/05, Tobias Brox <tobias(at)nordicbet(dot)com> wrote:
> > [Junaili Lie - Wed at 12:34:32PM -0700]
> > > select f.p_id, max(f.id) from person p, food f where p.id=f.p_id group
> > > by f.p_id will work.
> > > But I understand this is not the most efficient way. Is there another
> > > way to rewrite this query? (maybe one that involves order by desc
> > > limit 1)
> >
> > eventually, try something like
> >
> > select p.id,(select f.id from food f where f.p_id=p.id order by f.id desc limit 1)
> > from person p
> >
> > not tested, no warranties.
> >
> > Since subqueries can be inefficient, use "explain analyze" to see which one
> > is actually better.
> >
> > This issue will be solved in future versions of postgresql.
> >
> > --
> > Tobias Brox, +47-91700050
> > Tallinn
> >
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
> joining column's datatypes do not match

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Steve Pollard 2005-06-09 03:56:57 Re: Importing from pg_dump slow, low Disk IO
Previous Message Neil Conway 2005-06-09 00:19:51 Re: Postgresql on an AMD64 machine