From: | Chris Smith <chris(at)interspire(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | rewrite count distinct query |
Date: | 2004-12-06 07:00:50 |
Message-ID: | 41B403A2.20000@interspire.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi all,
'Scuse the long post :) I'm trying to include all relevant info..
I'm trying to work out a better way to approach a query, any tips are
greatly appreciated.
The relevant tables:
db=# \d tp_conversions
Table "public.tp_conversions"
Column | Type | Modifiers
---------------+------------------------+--------------------
conversionid | integer | not null default 0
type | character varying(10) |
name | character varying(255) |
amount | double precision |
cookieid | character varying(32) |
currtime | integer |
ip | character varying(20) |
origintype | character varying(20) |
originfrom | character varying(255) |
origindetails | character varying(255) |
userid | integer |
Indexes:
"tp_conversions_pkey" primary key, btree (conversionid)
"conv_origindetails" btree (origindetails)
"conv_originfrom" btree (originfrom)
"conv_origintype" btree (origintype)
"conv_time" btree (currtime)
"conv_userid" btree (userid)
trackpoint=# SELECT count(*) from tp_conversions;
count
-------
261
(1 row)
db=# \d tp_search
Table "public.tp_search"
Column | Type | Modifiers
------------------+------------------------+--------------------
searchid | integer | not null default 0
searchenginename | character varying(255) |
keywords | character varying(255) |
currtime | integer |
ip | character varying(20) |
landingpage | character varying(255) |
cookieid | character varying(32) |
userid | integer |
Indexes:
"tp_search_pkey" primary key, btree (searchid)
"search_cookieid" btree (cookieid)
"search_keywords" btree (keywords)
"search_searchenginename" btree (searchenginename)
"search_userid" btree (userid)
trackpoint=# SELECT count(*) from tp_search;
count
-------
5086
(1 row)
What I'm trying to do...
Work out the number of conversions for each search origin.
This query works:
select
count(distinct conversionid) as convcount,
count(distinct searchid) as searchcount,
(count(distinct conversionid) / count(distinct searchid)) as perc,
s.searchenginename
from tp_conversions c, tp_search s
where
c.origintype='search' and s.searchenginename=c.originfrom and
s.userid=c.userid and c.userid=1
group by searchenginename
order by convcount desc;
convcount | searchcount | perc | searchenginename
-----------+-------------+------+------------------
15 | 2884 | 0 | Google
1 | 110 | 0 | Google AU
2 | 308 | 0 | Google CA
1 | 25 | 0 | Google CL
1 | 143 | 0 | Google DE
1 | 117 | 0 | Google IN
1 | 26 | 0 | Google NZ
3 | 49 | 0 | Google RO
1 | 60 | 0 | Google TH
2 | 174 | 0 | Yahoo
(10 rows)
However the percentage is wrong.
I can cast one to a float:
(count(distinct conversionid) / count(distinct searchid)::float)
and it'll give me a better percentage:
convcount | searchcount | perc | searchenginename
-----------+-------------+---------------------+------------------
15 | 2884 | 0.00520110957004161 | Google
3 | 49 | 0.0612244897959184 | Google RO
2 | 308 | 0.00649350649350649 | Google CA
2 | 174 | 0.0114942528735632 | Yahoo
1 | 110 | 0.00909090909090909 | Google AU
1 | 25 | 0.04 | Google CL
1 | 143 | 0.00699300699300699 | Google DE
1 | 117 | 0.00854700854700855 | Google IN
1 | 26 | 0.0384615384615385 | Google NZ
1 | 60 | 0.0166666666666667 | Google TH
(10 rows)
(I think the answer to this is 'no' but I'm going to ask anyway :P)
Is there an easier way to get the more-detailed percentage (it's meant
to work in multiple databases - so casting to a float won't work for
other db's) ?
More importantly... Is there a better way to write the query (I don't
like the count(distinct...) but it works and gives the right info) ?
I tried to do it with a union:
SELECT
count(searchid),
searchenginename
from tp_search s
where userid=1
group by searchenginename
union
select
count(conversionid),
originfrom
from tp_conversions c
where c.userid=1
group by originfrom;
but then realised that getting the data out with php would be a
nightmare (plus I can't get the percentages).
Lastly:
db=# SELECT version();
version
---------------------------------------------------------------
PostgreSQL 7.4.3 on i386-pc-linux-gnu, compiled by GCC 2.95.4
(1 row)
(I know it's a little out of date, upgrading later this week).
Any suggestions/hints/tips welcome :)
Thanks,
Chris.
From | Date | Subject | |
---|---|---|---|
Next Message | Rolf stvik | 2004-12-06 07:01:47 | Re: 3rd RFD: comp.databases.postgresql (was: |
Previous Message | Michael Fuhr | 2004-12-06 06:51:03 | Re: [HACKERS] DBD::PgSPI 0.02 |