Re: Help with strange join

From: Mike Rylander <mrylander(at)gmail(dot)com>
To: Victor Spång Arthursson <victor(at)tosti(dot)dk>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Help with strange join
Date: 2005-02-07 12:30:49
Message-ID: b918cf3d0502070430617d7dd0@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, 7 Feb 2005 12:34:39 +0100, Victor Spång Arthursson
<victor(at)tosti(dot)dk> wrote:
>
> 2005-02-04 kl. 20.36 skrev Mike Rylander:
>
> > How about:
> >
> > SELECT r.* FROM opskrifter r JOIN opskrifter_content c ON (r.nummer =
> > c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id)
> > WHERE EXISTS ( SELECT l.relid FROM languages l WHERE l.relid =
> > i.betegnelse GROUP BY l.relid HAVING COUNT(DIStINCT l.lang) = 1);
>
> Hi and thanks for your answer!

No problem. Did it do what you expect?

>
> Could you please try to break this SQL down and try to explain what it
> does, and also show how to change language…
>

Sure. I guess that the EXISTS version was faster? In any case:

SELECT r.* -- you can add more columns here
-- from any table but languages

-- I assume the joins are correct and transparent. I just
-- attempted to follow your example. I wasn't sure about
-- the join from ingredients to languages since the example
-- didn't have any exact matches, but the format of the relid
-- column looked right.
FROM opskrifter r
JOIN opskrifter_content c ON (r.nummer =c.opskrift)
JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id)

WHERE
EXISTS ( -- EXISTS runs a subselect and if ANY rows
are returned
-- it becomes true
http://www.postgresql.org/docs/8.0/interactive/functions-subquery.html#AEN12496

-- This query, if run by itself, will
give you all the "relid"s from
-- "languages" where ALL the rows in
the group have the same
-- "lang" attribute. The HAVING
clause is the key. It looks at
-- each group of rows with the same
relid as a virtual subtable
-- and counts the number of distinct
"lang" columns. If I
-- understood your question
correctly, any partially translated
-- ingredient lists will have more
than one distinct "lang". In
-- order to select a specific
language you would need to
-- wrap this in another because of
GROUP BY restrictions, and
-- use an aggregate to return the
"lang" attribute.

SELECT lang FROM (
SELECT l.relid, MAX(lang) as lang
FROM languages l
WHERE l.relid = i.betegnelse
GROUP BY l.relid
HAVING COUNT(DIStINCT l.lang) = 1
) s2
WHERE l.lang = 'DE'
);

To generalize this, just replace
WHERE l.lang = 'DE'
with
WHERE l.lang = ?
and pass the language in as a parameter to the query. The JOIN
version would be much easier to limit. BTW, there was a typo in the
first post in the JOIN version. It's fixed below. Just change it
from

SELECT r.* FROM opskrifter r JOIN opskrifter_content c ON (r.nummer =
c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id)
JOIN ( SELECT l.relid FROM languages l GROUP BY l.relid HAVING
COUNT(DISINCT l.lang) = 1) s ON (l.relid = i.betegnelse);

to

SELECT r.* FROM opskrifter r JOIN opskrifter_content c ON (r.nummer =
c.opskrift) JOIN opskrifter_ingredienser i ON (c.ingrediens = i.id)
JOIN ( SELECT l.relid FROM languages l GROUP BY l.relid HAVING
COUNT(DISINCT l.lang) = 1) s ON (l.relid = i.betegnelse AND l.lang = 'DE');

I am curious as to which is faster for you. If "languages" is small I
would expect the EXISTS version to be faster, but for a large
"languages" the JOIN should be faster since it only generates the
subselect result set once.

--
Mike Rylander
mrylander(at)gmail(dot)com
GPLS -- PINES Development
Database Developer
http://open-ils.org

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Shaun Clements 2005-02-07 13:22:15 Problem performing a restore of a data schema in Windows
Previous Message Victor Spång Arthursson 2005-02-07 11:34:39 Re: Help with strange join