| From: | "Greg Sabino Mullane" <greg(at)turnstep(dot)com> |
|---|---|
| To: | pgsql-sql(at)postgresql(dot)org |
| Subject: | Re: people who buy A, also buy C, D, E |
| Date: | 2005-06-28 23:53:39 |
| Message-ID: | c5c5417bcc537847c105ade5e360341c@biglumber.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-sql |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
>>> The goal of my query is: given a book, what did other people who
>>> bought this book also buy? I plan the list the 5 most popular such
>>> books.
>
> SELECT b.product_id, sum(quantity) as rank FROM ordered_products a,
> ordered_products b WHERE a.product_id=(the product id) AND
> b.order_id=a.order_id AND b.product_id != a.product_id GROUP BY
> b.product_id ORDER BY rank DESC LIMIT 6;
I don't think this is exactly what the original poster had in mind:
we want a ranking of a dynamically generated subset of all possible
products (e.g. books). So if someone buys "Harry Potter and the Proprietary
Database", then only the books bought by people who also bought /that/
book are considered, ranked, and ordered. There's not a lot of caching that
can be effectively done, due to the high number of combinations and large
potential for change.
> table ordered_products: order_id, product_id, quantity
I'm not sure where you are getting "quantity" from: as near as I
can tell, this will always be a quantity of 1: one person ordering
one item.
- --
Greg Sabino Mullane greg(at)turnstep(dot)com
PGP Key: 0x14964AC8 200506281946
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----
iD8DBQFCweKavJuQZxSWSsgRAkmHAJ9fQ+Degs6jSrGRozEoI35F8nlyBACfYm2u
QgawxHOij5FHVd0FopW25IU=
=r5eo
-----END PGP SIGNATURE-----
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Greg Sabino Mullane | 2005-06-29 00:13:49 | Re: ORDER records based on parameters in IN clause |
| Previous Message | Nick Johnson | 2005-06-28 19:55:23 | Re: ENUM like data type |