From: | Casey Duncan <casey(at)pandora(dot)com> |
---|---|
To: | Michael Fuhr <mike(at)fuhr(dot)org> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Possible infinite loop in query using bitmap scans |
Date: | 2006-03-14 01:25:49 |
Message-ID: | cb63857e8cb866fca0b5daa088322c12@pandora.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Mar 13, 2006, at 9:50 AM, Michael Fuhr wrote:
> On Sun, Mar 12, 2006 at 11:36:23PM -0800, Casey Duncan wrote:
>> SELECT count(*) FROM webhits
>> WHERE path LIKE '/radio/tuner_%.swf' AND status = 200
>> AND date_recorded >= '3/10/2006'::TIMESTAMP
>> AND date_recorded < '3/11/2006'::TIMESTAMP;
> [...]
>> Aggregate (cost=794775.08..794775.09 rows=1 width=0)
> [...]
>> According to the planner it should take <15 minutes which is typical
>> in
>> practice.
>
> The planner's cost estimate is in units of disk page fetches, not
> time. The above estimate isn't 794775.09 ms (~13.25 min) but rather
> 794775.09 times the cost of a single page fetch, however much that
> is. See "Using EXPLAIN" in the "Performance Tips" chapter of the
> documentation.
>
> http://www.postgresql.org/docs/8.1/interactive/performance-
> tips.html#USING-EXPLAIN
Doh! I should've known that.
>> About half the times it runs, however, it never terminates
>> (even after days) and just spins consuming 99+% of CPU with no disk
>> activity. This query was never a problem in postgres versions < 8.1.2,
>> however the data has grown substantially since that time. I notice it
>> uses the recent in-memory bitmap feature, so I wondered if it was
>> exposing a bug.
>
> If the problem happens half the time then you have a somewhat
> repeatable test case. Do you get more consistent performance if
> you set enable_bitmapscan to off? What's the query plan if you do
> that?
Here's the plan with bitmap scans off:
QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
----------------------
Aggregate (cost=891363.71..891363.72 rows=1 width=0)
-> Index Scan using webhits_date_idx1 on webhits
(cost=0.00..891360.30 rows=1362 width=0)
Index Cond: ((date_recorded >= '2006-03-10
00:00:00'::timestamp without time zone) AND (date_recorded <
'2006-03-11 00:00:00'::timestamp without time zone))
Filter: (((path)::text ~~ '/radio/tuner_%.swf'::text) AND
(status = 200))
(4 rows)
The query runs to completion this way in about 40 minutes. I turned
bitmap scans back on and it hangs again (I ran it for about 5 hours).
> If you narrow the search criteria so the query returns fewer rows,
> do you still see the problem? Can you identify a "sour spot" where
> the problem starts to happen?
I'll do that tomorrow, and let you know.
>> If I restart the postmaster, the query will complete in the expected
>> time.
>
> Does the problem eventually start happening again? If so, after
> how long? How did you determine that the restart is relevant? Do
> you consistently see different (presumably better) performance after
> a restart than if you don't restart?
This is a production box, so I can't restart it whenever I want and I
haven't yet reproduced it elsewhere -- the data base size makes that
cumbersome at best -- but once after it hung up, I restarted postgres
and the report ran to completion for a couple of days then started
hanging again. Today it seems to pretty consistently hang, I'll see if
I can restart it overnight and test it again.
-Casey
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2006-03-14 01:38:21 | Re: Relation 'pg_largeobject' does not exist |
Previous Message | Brandon Keepers | 2006-03-14 01:24:26 | Relation 'pg_largeobject' does not exist |