Re: query looping?

From: Brian Cox <brian(dot)cox(at)ca(dot)com>
To: "Robert Haas [robertmhaas(at)gmail(dot)com]" <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: query looping?
Date: 2010-01-05 21:05:58
Message-ID: 4B43A9B6.30208@ca.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> SELECT SUM(1) FROM ts_stats_transetgroup_user_weekly b WHERE
> ts_interval_start_time > [value] AND ts_interval_start_time < [value];
>
> ...and similarly for the bitmap index scan.
cemdb=> SELECT SUM(1) FROM ts_stats_transetgroup_user_weekly b WHERE
ts_interval_start_time >= '2009-12-28' AND ts_interval_start_time <
'2010-01-04';
sum
-------
89758
(1 row)

cemdb=> select sum(1) from ts_stats_transet_user_interval where
ts_interval_start_time >= '2009-01-03' and ts_interval_start_time <
'2009-01-03 08:00';
sum
-----

(1 row)

cemdb=> select sum(1) from ts_stats_transet_user_interval where
ts_interval_start_time >= '2010-01-03' and ts_interval_start_time <
'2010-01-03 08:00';
sum
--------
800000
(1 row)

the estimates in the 1st query plan are OK (since they are the "same").
The 2nd, however, look to be too low. FYI, this query finally completed,
so it wasn't looping but the query plan is very poor:

[24585-cemdb-admin-2010-01-05 10:54:49.511 PST]LOG: duration:
124676746.863 ms execute <unnamed>: select count(distinct b.ts_id) from
ts_stats_transetgroup_user_weekly b, ts_stats_transet_user_interval c,
ts_transetgroup_transets_map m where b.ts_transet_group_id =
m.ts_transet_group_id and m.ts_transet_incarnation_id =
c.ts_transet_incarnation_id and c.ts_user_incarnation_id =
b.ts_user_incarnation_id and c.ts_interval_start_time >= $1 and
c.ts_interval_start_time < $2 and b.ts_interval_start_time >= $3 and
b.ts_interval_start_time < $4
[24585-cemdb-admin-2010-01-05 10:54:49.511 PST]DETAIL: parameters: $1 =
'2010-01-03 00:00:00-08', $2 = '2010-01-03 08:00:00-08', $3 =
'2010-01-01 00:00:00-08', $4 = '2010-01-04 00:00:00-08'

compare to:

[root(at)rdl64xeoserv01 log]# time PGPASSWORD=**** psql -U admin -d cemdb
-c "select count(distinct b.ts_id) from
ts_stats_transetgroup_user_weekly b, ts_stats_transet_user_interval c,
ts_transetgroup_transets_map m where b.ts_transet_group_id =
m.ts_transet_group_id and m.ts_transet_incarnation_id =
c.ts_transet_incarnation_id and c.ts_user_incarnation_id =
b.ts_user_incarnation_id and c.ts_interval_start_time >= '2010-01-03
00:00' and c.ts_interval_start_time < '2010-01-03 08:00' and
b.ts_interval_start_time >= '2009-12-28 00:00' and
b.ts_interval_start_time < '2010-01-04 00:00'"
count
-------
89758
(1 row)

real 0m3.804s
user 0m0.001s
sys 0m0.003s

so why the former ~40,000 times slower?

Thanks,
Brian

Browse pgsql-performance by date

  From Date Subject
Next Message Brian Cox 2010-01-05 21:33:08 Re: query looping?
Previous Message Keresztury Balázs 2010-01-05 17:38:23 Re: forced sequential scan when condition has current_user