From: | "Nicolas Barbier" <nicolas(dot)barbier(at)gmail(dot)com> |
---|---|
To: | Ottó Havasvölgyi <havasvolgyi(dot)otto(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Eliminating unnecessary left joins |
Date: | 2007-04-08 12:02:26 |
Message-ID: | b0f3f5a10704080502r25e85476o483dbbc403cc31ff@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
2007/4/7, Ottó Havasvölgyi <havasvolgyi(dot)otto(at)gmail(dot)com>:
> My simple example:
>
> Class hierarchy and fields:
> Shape (ID, X, Y)
> |
> +-Circle (ID, Radius)
> |
> +-Rectangle (ID, Width, Height)
>
> The mapper creates 3 tables with the columns next to the class name.
> And it creates 3 views. One of them:
>
> RectangleView: SELECT r."ID" as "ID", s."X" as "X", s."Y" as "Y", r."Width"
> as "Width", r."Height" as "Height" FROM "Rectangle" r LEFT JOIN "Shape" s ON
> ( r.ID=s.ID)
I find this view definition a bit strange: why is there a left outer
join? I expect there to be a FK from Rectangle.ID to Shape.ID ("all
rectangles are shapes"), which makes the definition totally equivalent
with one in which a normal join is used (whether attributes of Shape
are used or not).
The main use case I see for the original optimization is ORMs that
join in a whole hierarchy, even when only a part of it is needed. I
guess that that is rather common. The ORM that I use does exactly
this, because the main target-DBMSs (MS-SQL and Oracle) do the
optimization for it.
Example (somewhat less contrived than my previous one):
Imagine an implementation of the typical "books that are borrowed by
people" n-m relationship, using three tables ("Book", "Borrowed",
"Person"). Let's find all books that have been borrowed by a certain
person.
The "non-ORM" version would be something like:
SELECT Book.*
FROM
Book
JOIN Borrowed ON Borrowed.book_id = Book.id
WHERE Borrowed.person_id = <x>;
Now assume that Borrowed is a class hierarchy mapped into multiple
tables by a typical ORM. The query would probably become something
like:
SELECT Book.*
FROM
Book
JOIN Borrowed_Parent ON Borrowed_Parent.book_id = Book.id
LEFT JOIN Borrowed_Child1 ON Borrowed_Child1.id = Borrowed_Parent.id
LEFT JOIN Borrowed_Child2 ON Borrowed_Child2.id = Borrowed_Parent.id
(...)
WHERE Borrowed_Parent.person_id = <x>;
It is clear that the children of the hierarchy are needlessly joined
in (as the only attribute that is actually needed is person_id, which
is on the parent level). It is not always trivial for the ORM to find
that out, without writing stuff that looks suspiciously similar to a
DBMS optimizer.
Maybe it is debatable whether this optimization should be done by the
application (i.e. the ORM) or by the DBMS. I am personally in favor of
doing it in the DBMS.
greetings,
Nicolas
--
Nicolas Barbier
http://www.gnu.org/philosophy/no-word-attachments.html
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2007-04-08 15:05:37 | Re: Reviewers Guide to Deferred Transactions/TransactionGuarantee |
Previous Message | Simon Riggs | 2007-04-08 08:46:45 | Re: Reviewers Guide to Deferred Transactions/TransactionGuarantee |