Re: Netflix Prize data

From: "Mark Woodward" <pgsql(at)mohawksoft(dot)com>
To: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Netflix Prize data
Date: 2006-10-04 22:51:22
Message-ID: 21733.24.91.171.78.1160002282.squirrel@mail.mohawksoft.com
Lists: pgsql-hackers

>> I signed up for the Netflix Prize. (www.netflixprize.com)
>> and downloaded their data and have imported it into PostgreSQL.
>> Here is how I created the table:
>
> I signed up as well, but have the table as follows:
>
> CREATE TABLE rating (
> movie SMALLINT NOT NULL,
> person INTEGER NOT NULL,
> rating SMALLINT NOT NULL,
> viewed DATE NOT NULL
> );
>
> I also recommend not loading the entire file until you get further
> along in the algorithm solution. :)
>
> Not that I have time to really play with this....

As luck would have it, I wrote a recommendations system based on music
ratings a few years ago.

After reading the NYT article, it seems as though one or more of the guys
behind "Net Perceptions" either helped Netflix or built their system; I'm
not sure which. I wrote my system because Net Perceptions was too slow and
did a lousy job.

I think the notion of "communities" in general is an interesting study in
statistics, but everything I've seen in the form of bad recommendations
shows that while [N] people may share certain tastes, that doesn't
necessarily mean that what one likes, the others will too. This approach is
especially flawed with movie rentals because there is seldom a 1:1 ratio of
movies to people. There are often multiple people in a household. Also,
movies are almost always rented for multiple people.

Anyway, good luck! (Not better than me, of course :-)
