From: | Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> |
---|---|
To: | Tory M Blue <tmblue(at)gmail(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL |
Date: | 2010-01-23 02:25:17 |
Message-ID: | 4B5A5E0D.4000003@postnewspapers.com.au |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 23/01/2010 1:59 AM, Tory M Blue wrote:
> It's this query, run 6000 times with a diff makeid's /
> /
>
> /SELECT COUNT(DISTINCT uid ) AS active_users FROM
> pixelpool.userstats WHERE makeid ='bmw-ferman' AND tagged =true/
>
> / Plan/
>
> / "Aggregate (cost=49467.00..49467.01 rows=1 width=8)"/
>
> / " -> Bitmap Heap Scan on userstats (cost=363.49..49434.06
> rows=13175 width=8)"/
>
> / " Recheck Cond: (makeid = 'b1mw-ferman'::text)"/
>
> / " Filter: tagged"/
>
> / " -> Bitmap Index Scan on idx_retargetuserstats_makeidtag
> (cost=0.00..360.20 rows=13175 width=0)"/
>
> / " Index Cond: ((makeid = 'b1mw-ferman'::text) AND (tagged
> = true))"/
Try:
- Adding a partial index on makeid, eg:
CREATE INDEX userstats_makeid_where_tagged_idx
ON userstats (makeid) WHERE (tagged);
- Instead of repeating the query 6000 times in a loop, collect the data
in one pass by joining against a temp table containing the makeids of
interest.
SELECT COUNT(DISTINCT u.uid) AS active_users
FROM pixelpool.userstats u
INNER JOIN temp_makeids m ON (u.makeid = m.makeid)
WHERE u.tagged = true;
(If the 6000 repeats are really a correlated subquery part of a bigger
query you still haven't shown, then you might be able to avoid 6000
individual passes by adjusting your outer query instead).
--
Craig Ringer
From | Date | Subject | |
---|---|---|---|
Next Message | Scott Carey | 2010-01-23 02:51:24 | Re: Inserting 8MB bytea: just 25% of disk perf used? |
Previous Message | fkater@googlemail.com | 2010-01-22 20:42:03 | Re: Inserting 8MB bytea: just 25% of disk perf used? |