From: | Scott Marlowe <smarlowe(at)g2switchworks(dot)com> |
---|---|
To: | Bill Moseley <moseley(at)hank(dot)org> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: GROUP BY requirement |
Date: | 2005-08-30 18:35:25 |
Message-ID: | 1125426924.28179.78.camel@state.g2switchworks.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Fri, 2005-08-26 at 14:39, Bill Moseley wrote:
> I'm wondering if adding a GROUP BY (as required by Postgres) will
> change the results of a select on a view.
>
> I have the following view which joins a "class" with a teacher. A
> teacher is a "person" and I have an "instructors" link table.
>
> CREATE VIEW class_list (id, class_time, instructor )
> AS
> SELECT DISTINCT ON(class.id)
> class.id, class.class_time, person.first_name
>
> FROM class, instructors, person
> WHERE instructors.person = person.id
> AND class.id = instructors.class;
>
> I also have a table "registration" that links students with a class.
> The registration table has a "reg_status" column to say if they are
> confirmed or on the wait_list. So when showing the above I'd also
> like to see how many students are confirmed and on the wait_list.
>
> DROP VIEW cl;
> CREATE VIEW cl (id, class_time, instructor,
> confirmed_cnt, wait_list_cnt)
> AS
> SELECT DISTINCT ON(class.id)
> class.id, class.class_time, person.first_name,
> sum (CASE WHEN registration.reg_status = 1 THEN 1 ELSE 0 END) as confirmed_cnt,
> sum (CASE WHEN registration.reg_status = 2 THEN 1 ELSE 0 END) as wait_list_cnt,
>
> FROM class, instructors, person, registration
> WHERE instructors.person = person.id
> AND class.id = instructors.class
> AND class.id = registration.class
>
> GROUP BY class.id, class.class_time, person.first_name;
>
> PostgreSQL requires the GROUP BY. But, I'm not clear how the GROUP BY
> might change the results between the two views above.
>
> http://www.postgresql.org/docs/8.0/static/sql-select.html#SQL-GROUPBY
>
> says:
>
> When GROUP BY is present, it is not valid for the SELECT list
> expressions to refer to ungrouped columns except within aggregate
> functions, since there would be more than one possible value to
> return for an ungrouped column.
>
> Frankly, I cannot see how it might change results of a select between
> the two views. Am I missing something?
OK, distinct on suffers from this problem. Given the following simple
dataset:
mytable:
a | b
------
1 | 0
1 | 1
select distinct on (a) a,b from mytable;
One can see how the possible results are:
1,0 and 1,1, right? All depending on the order in which they are
fetched.
The same would be true if you could do a group by on a and select b:
select a,b from mytable group by a;
Right?
Now, if it's impossible for your dataset to return such sets, due to the
way it's built, it is likely not fully normalized. I.e. you have data
like this:
classid | instructorname | moreinfo...
--------------------------------------
1 | 'John Smith' | 'information'
1 | 'John Smith' | 'even more information'
and so on. Or your join is creating such a data set.
From | Date | Subject | |
---|---|---|---|
Next Message | Scott Marlowe | 2005-08-30 18:48:35 | Re: POSS. FEATURE REQ: "Dynamic" Views |
Previous Message | Andrew Sullivan | 2005-08-30 17:56:25 | Re: Postgresql replication |