determining how many products are in how many categories

From: David Garamond <lists(at)zara(dot)6(dot)isreserved(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: determining how many products are in how many categories
Date: 2004-02-08 17:09:53
Message-ID: 40266D61.3070304@zara.6.isreserved.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

# product table (simplified):
create table p (
id char(22) not null primary key,
name text,
desc text
);

# product category table (simpl.):
create table pc (
id char(22) not null primary key,
name text,
desc text
);

# table that maps products into categories:
create table p_pc (
id char(22) not null primary key,
pid char(22) not null references p(id),
pcid char(22) not null references pc(id)
);
create index i_ppc_pid on p_pc(pid);
create index i_ppc_pcid on p_pc(pcid);
create unique index i_ppc_pid_pcid on p_pc(pid, pcid);

There are about 50k rows in p, 40k in pc, and 140k in p_pc. Most
products are only placed in 1 category, some in 2, fewer in 3, and fewer
stills in 4, 5, 6 categories.

I want to know how many products are placed in 1 category, how many in
2, and so on. Here's my query:

select count(pid),num_cat from (
select pid,count(cid) as num_cat
from ppc group by pid) as f
group by num_cat;

A sample output (4000 products are categorized in 5 different places,
4998 in 4, and so on):

count | num_cat
-------+---------
4000 | 5
4998 | 4
7502 | 3
10001 | 2
17499 | 1
(5 rows)

However, this query is taking about 2.5 minutes. Any idea on how to make
it faster?

--
dave

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Philippe Lang 2004-02-09 07:46:40 Disable/Enable Trigger?
Previous Message Tom Lane 2004-02-07 20:53:55 Re: Storing a range of numbers