From: | adamcrume(at)hotmail(dot)com (Adam) |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | count of occurences |
Date: | 2001-09-12 22:16:39 |
Message-ID: | 5f64c126.0109121416.7f73bc3f@posting.google.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-hackers |
I help run a job database and have a table of search records. I want
a query that will return the top 10 jobs by search frequency. I'm
familiar with ORDER BY and LIMIT, so I basically need this:
Given a table search_records:
job_num
-------
1
2
2
3
4
4
4
I want a query that will return:
job_num | count
--------+------
1 |1
2 |2
3 |1
4 |3
I tried
select distinct job_num, (select count(*) from search_records j where
j.job_num=k.job_num) from search_records k
but it is horribly slow (it takes several minutes on a table of about
25k rows!). I assume it scans the entire table for every job_num in
order to count the number of occurences of that job_num, taking order
n^2 time. Since I can easily use job_num as an index (being integers
from 0 to roughly 400 so far) I could just do a "select * from
search_records" and do the counting in PHP (our HTML pre-processor) in
order n time. However, I don't know how to do an order n*log(n) sort
in PHP, just n^2, so there would still be an efficiency problem.
I have Postgresql 7.0.3.
Help is of course greatly appreciated.
From | Date | Subject | |
---|---|---|---|
Next Message | Graham Leggett | 2001-09-12 22:55:30 | Re: Error: Bad Timestamp Format |
Previous Message | Barry Lind | 2001-09-12 22:07:01 | Re: Error: Bad Timestamp Format |
From | Date | Subject | |
---|---|---|---|
Next Message | Jim Buttafuoco | 2001-09-13 00:18:46 | Re: Index location patch for review (more pgbench results) |
Previous Message | Rene Pijlman | 2001-09-12 21:41:39 | Re: Timezones and time/timestamp values in FE/BE protocol |