Re: Need a SQL to create sets of hobbies

From: "Aaron Bono" <postgresql(at)aranya(dot)com>
To: CN <cnliou9(at)fastmail(dot)fm>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Need a SQL to create sets of hobbies
Date: 2006-09-20 18:38:05
Message-ID: bf05e51c0609201138t6e453c5pf85863a3a422cf08@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On 9/20/06, CN <cnliou9(at)fastmail(dot)fm> wrote:
>
> Hi!
>
> CREATE TABLE x1 (name TEXT, hobby TEXT, PRIMARY KEY(name,hobby));
>
> INSERT INTO x1 VALUES('John','music');
> INSERT INTO x1 VALUES('John','arts');
> INSERT INTO x1 VALUES('Bob','arts');
> INSERT INTO x1 VALUES('Bob','music');
> INSERT INTO x1 VALUES('Rocky','copmputer');
> INSERT INTO x1 VALUES('Steve','arts');
> INSERT INTO x1 VALUES('Steve','football');
> INSERT INTO x1 VALUES('Tom','computer');
> INSERT INTO x1 VALUES('Tom','music');
>
> select * from x1;
>
> name | hobby
> -------+----------
> John | music
> John | arts
> Bob | arts
> Bob | music
> Rocky | computer
> Steve | arts
> Steve | football
> Tom | computer
> Tom | music
> (9 rows)
>
> John and Bob have the same hobbies - music and arts. So music and arts
> are treated as one set of hobbies.
>
> Rocky has an unique set of interest - computer.
>
> Steve also likes arts just as John and Bob do, but he also has an
> exclusive interest - football. Thus, his set of hobbies is unique -
> arts, football.
>
> One of Tom's hobbies, music, overlaps those of John, Bob, and Rocky; but
> computer does not. Hence his hobbies, computer and music, forms a new
> set of hobbies.
>
> Now we have 4 sets of hobbies:
>
> set 1: music, arts
> set 2: computer
> set 3: arts, football
> set 4: computer, music
>
> I am looking for an SQL that creates sets of hobbies in table x2 by
> selecting from table x1:
>
> CREATE TABLE x2 (sid INTEGER, hobby TEXT, PRIMARY KEY(sid,hobby));
>
> and makes x2 contain rows:
>
> sid | hobby
> -------+----------
> 1 | music
> 1 | arts
> 2 | computer
> 3 | arts
> 3 | football
> 4 | computer
> 4 | music
>
> where gid starts from 1.

Your best design is to break this into 3 tables:

person (
person_id,
person_name
)

hobby (
hobby_id,
hobby_name
)

person_hobby (
person_id,
hobby_id
)

Then you can get the list of hobbies for each person like this:

SELECT
person.person_id,
array(
SELECT hobby_name
FROM person_hobby
INNER JOIN hobby ON (hobby.hobby_id = person_hobby.hobby_id)
WHERE person_hobby.person_id = person.person_id
ORDER BY hobby_name
) AS hobby_list
FROM person

Then do this to flatten it back out:

SELECT
-- Gives you a unique id though using a SERIAL on a table would be
better
min(mysub.person_id),
hobby.hobby_name
FROM (
SELECT
person.person_id,
array(
SELECT hobby_name
FROM person_hobby
INNER JOIN hobby ON (hobby.hobby_id = person_hobby.hobby_id)
WHERE person_hobby.person_id = person.person_id
ORDER BY hobby_name
) AS hobby_list
FROM person
) mysub
INNER JOIN person_hobby ON (person_hobby.person_id = mysub.person_id)
INNER JOIN hobby ON (hobby.hobby_id = person_hobby.hobby_id)
GROUP BY
mysub.hobby_list,
hobby.hobby_name

I did not try it so it may require a little tweaking to work. Also, I don't
know what the performance would be like.

==================================================================
Aaron Bono
Aranya Software Technologies, Inc.
http://www.aranya.com
http://codeelixir.com
==================================================================

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message TJ O'Donnell 2006-09-20 22:00:32 ERROR: could not write block 196261 of temporary file: No space left
Previous Message CG 2006-09-20 18:10:42 Re: Nested loops are killing throughput