Re: Join the same row

From: Edison Azzi <edisonazzi(at)terra(dot)com(dot)br>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Join the same row
Date: 2005-12-07 18:45:12
Message-ID: 43972DB8.90400@terra.com.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Richard Huxton escreveu:

> Edison Azzi wrote:
>
>> Hi,
>>
>> I´m trying to optimize some selects between 2 tables and the best way
>> I found was
>> alter the first table and add the fields of the 2nd table. I adjusted
>> the contents and
>> now a have only one table with all info that I need. Now resides my
>> problem, because
>> of legacy queries I decided to make a Rule that replace the 2nd table.
>>
>> Until now all worked well, but I found when I make a join between de
>> result
>> table and de Rule, even tought is the same row in the same table, the
>> optimizer
>> generete two access for the same row:
>> cta_pag is the table and ctapag_adm is the rule.
>>
>> CREATE OR REPLACE RULE "_RETURN" AS
>> ON SELECT TO ctapag_adm DO INSTEAD SELECT cta_pag.nrlancto,
>> cta_pag.codconta, cta_pag.frequencia, cta_pag.nrlanctopai
>> FROM cta_pag
>> WHERE cta_pag.origem = 'A'::bpchar;
>>
>> This is one of the legacy queries:
>>
>> select * from cta_pag p , ctapag_adm a where a.nrlancto= p.nrlancto
>> and p.nrlancto = 21861;
>
>
> OK - and you get a self-join (which is what you asked for, but you'd
> like the planner to notice that it might not be necessary).
>
>> Resulting in twice the time for accessing.
>>
>> Acessing just on time the same row:
>>
>> select * from cta_pag p where p.nrlancto = 21861
>
>
> This isn't the same query though. Your rule has an additional
> condition origem='A'. This means it wouldn't be correct to eliminate
> the self-join even if the planner could.
>
>> Is there a way to force the optimizer to understand that is the
>> same row?
>
>
> However, even if you removed the condition on origem, I don't think
> the planner will notice that it can eliminate the join. It's just too
> unusual a case for the planner to have a rule for it.
>
> I might be wrong about the planner - I'm just another user. One of the
> developers may correct me.

You are rigth, the planner will not eliminate the join, see:

select * from cta_pag a, cta_pag p where a.nrlancto=p.nrlancto and
p.nrlancto = 21861;

EXPLAIN:
Nested Loop (cost=0.00..11.48 rows=1 width=816)
-> Index Scan using cta_pag_pk on cta_pag a (cost=0.00..5.74 rows=1
width=408)
Index Cond: (21861::numeric = nrlancto)
-> Index Scan using cta_pag_pk on cta_pag p (cost=0.00..5.74 rows=1
width=408)
Index Cond: (nrlancto = 21861::numeric)

I know that this is too unusual case, but I hoped that the planner could
deal
with this condition. I´m trying to speed up without have to rewrite a
bunch of
queries. Now I'll have to think another way to work around this issue.

Thanks,

Edison.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Huxton 2005-12-07 18:55:04 Re: Join the same row
Previous Message Stephan Vollmer 2005-12-07 18:33:37 Re: First query is slow, subsequent queries fast