From: | Jay Levitt <jay(dot)levitt(at)gmail(dot)com> |
---|---|
To: | Alban Hertroys <haramrae(at)gmail(dot)com> |
Cc: | Merlin Moncure <mmoncure(at)gmail(dot)com>, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>, David Johnston <polobo(at)yahoo(dot)com>, Robert James <srobertjames(at)gmail(dot)com>, Postgres General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Controlling complexity in queries |
Date: | 2011-12-15 04:11:33 |
Message-ID: | 4EE97375.4080203@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Alban Hertroys wrote:
>> select questions.id from questions
>> join (
>> select u.id from users as u
>> group by u.id
>> ) as s
>> on s.id = questions.user_id
>> where questions.id = 1;
> You could write that as:
>
> select questions.id
> from questions as q
> where exists (select 1 from users as u where u.id = q.user_id)
> and questions.id = 1;
>
> That's basically what you are doing, checking that a user with a given id
> from the questions table exists in the users table.
>
> That said, wouldn't a foreign key constraint help you even better? If
> questions.user_id is required to refer to an existing users.id (by an FK
> constraint), than the check in the query becomes moot.
Ahh, I see.. yes, this query is just the smallest possible query that
exhibits the same not-using-the-index behavior as the real query, which
needs columns from both questions and users, and thus needs the join. (And
it has aggregates, and needs the GROUP BY too.) There already is a
constraint, questions.user_id always refers to a real users.id, etc.
This is actually a great case where relational thinking does NOT map well to
functional composability; as Tom Lane pointed out, the solution is just "add
the WHERE clause to the subquery too." But the subquery is in a function
that doesn't *know* it's being restricted, and (to me) shouldn't have to
know; that's what the optimizer does for a living.
FWIW, and this may help the OP, my plan for tackling the "but I want
readability AND performance" issue is to
1. write a monolithic, optimized, incomprehensible version of the query
2. maintain the pretty functions alongside it
3. Write unit tests that confirm that the output of #1 and #2 is identical.
Kinda like how gcc builds gcc and verifies that the output is the same as
gcc building gcc building gcc.
Jay
From | Date | Subject | |
---|---|---|---|
Next Message | Chris Angelico | 2011-12-15 04:14:58 | Re: Philosophical question |
Previous Message | Edson Richter | 2011-12-15 02:15:21 | Re: Philosophical question |