From: | chidamparam muthusamy <mchidamparam(at)gmail(dot)com> |
---|---|
To: | Richard Huxton <dev(at)archonet(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: postgres performance |
Date: | 2013-12-07 10:30:24 |
Message-ID: | CABUk_4gemzU+vFz_9tZ-0oueXHgupAeB1CrYcjXNX0KooQqidg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
hi,
thank you so much for the input.
Can you please clarify the following points:
*1. Output of BitmapAnd = 303660 rows*
-> BitmapAnd (cost=539314.51..539314.51 rows=303660 width=0) (actual
time=9083.085..9083.085 rows=0 loops=1)
-> Bitmap Index Scan on groupid_index
(cost=0.00..164070.62 rows=7998674 width=0) (actual
time=2303.788..2303.788 rows=7840766 loops=1)
Index Cond: ((detailed_report.group_id)::text =
'CHOICE'::text)
-> Bitmap Index Scan on client_index
(cost=0.00..175870.62 rows=7998674 width=0) (actual
time=2879.691..2879.691 rows=7840113 loops=1)
Index Cond: ((detailed_report.client)::text =
'ChoiceFone'::text)
-> Bitmap Index Scan on partial_endtime_index
(cost=0.00..199145.02 rows=9573259 width=0) (actual
time=1754.044..1754.044 rows=9476589 loops=1)
Index Cond: ((detailed_report.end_time >=
'2013-05-01 00:00:00+00'::timestamp with time zone) AND
(detailed_report.end_time < '2013-06-01 00:00:00+00'::timestamp wi
th time zone))
*2. In the Next outer node Bitmap Heap Scan, estimated rows = 303660 and
actual rows = 2958392, why huge difference ? How to bring it down. *
Bitmap Heap Scan on public.detailed_report (cost=539314.51..1544589.52
rows=303660 width=44) (actual time=9619.913..51757.911 rows=2958392 loops=1)
*3. what is the cause for Recheck, is it possible to reduce the time taken
for Recheck ?*
Recheck Cond: (((detailed_report.group_id)::text = 'CHOICE'::text) AND
((detailed_report.client)::text = 'ChoiceFone'::text) AND
(detailed_report.end_time >= '2013-05-01 00:00:
00+00'::timestamp with time zone) AND (detailed_report.end_time <
'2013-06-01 00:00:00+00'::timestamp with time zone))
thanks
On Sat, Dec 7, 2013 at 12:07 AM, Richard Huxton <dev(at)archonet(dot)com> wrote:
> On 06/12/13 17:36, chidamparam muthusamy wrote:
>
> I rather think Alan is right - you either want a lot more RAM or faster
> disks. Have a look at your first query...
>
>
> Query:
>> EXPLAIN (analyze, buffers) SELECT text(client) as client, text(gateway)
>> as gateway,count(*)::bigint as total_calls, (avg(duration)/1000.0)
>> ::numeric(10,2) as acd, (avg(pdd)) ::numeric(10,2) as pdd,
>> sum(call_duration_recv)/1000.0 as duration_recv,
>> sum(call_duration_pay)/1000.0 as duration_pay, sum(call_amount_recv) as
>> call_amount_recv, sum(call_amount_pay) as call_amount_
>> pay FROM detailed_report WHERE end_time>='2013-05-01 00:00' and
>> end_time<'2013-07-01 00:00' and group_id='admin' and client ='CHOICE'
>> GROUP by client, gateway ORDER BY call_amount_recv DESC;
>>
>
> QUERY PLAN
>> ------------------------------------------------------
>> Sort (cost=3422863.06..3422868.69 rows=2254 width=44) (actual
>> time=137852.474..137852.474 rows=5 loops=1)
>> Sort Key: (sum(call_amount_recv))
>> Sort Method: quicksort Memory: 25kB
>> Buffers: shared read=2491664
>>
>
> -> HashAggregate (cost=3422664.28..3422737.53 rows=2254 width=44)
>> (actual time=137852.402..137852.454 rows=5 loops=1)
>> Buffers: shared read=2491664
>>
>
> -> Bitmap Heap Scan on detailed_report (cost=644828.11..3399506.87
>> rows=1029218 width=44) (actual time=4499.558..125443.122 rows=5248227
>> loops=1)
>> Recheck Cond: ((end_time >= '2013-05-01 00:00:00+00'::timestamp with
>> time zone) AND (end_time < '2013-07-01 00:00:00+00'::timestamp with time
>> zone) AND ((group_id)::text = 'adm
>> in'::text) AND ((client)::text = 'CHOICE'::text))
>> Buffers: shared read=2491664
>>
>
> -> Bitmap Index Scan on endtime_groupid_client_tsidx_detail_report
>> (cost=0.00..644570.81 rows=1029218 width=0) (actual
>> time=3418.754..3418.754 rows=5248227 loops=1)
>> Index Cond: ((end_time >= '2013-05-01 00:00:00+00'::timestamp with time
>> zone) AND (end_time < '2013-07-01 00:00:00+00'::timestamp with time
>> zone) AND ((group_id)::text =
>> 'admin'::text) AND ((client)::text = 'CHOICE'::text))
>> Buffers: shared read=95055
>>
>
> Total runtime: *137868.946 ms*
>> (13 rows)
>>
>
> The index is being used, but most of your time is going on the "Bitmap
> Heap Scan". You're processing 5.2 million rows in about 120 seconds -
> that's about 43 rows per millisecond - not too bad. It's not getting any
> cache hits though, it's having to read all the blocks. Looking at the
> number of blocks, that's ~2.5 million at 8KB each or about 20GB. You just
> don't have the RAM to cache that.
>
> If you have lots of similar reporting queries to run, you might get away
> with dropping the index and letting them run in parallel. Each individual
> query would be slow but they should be smart enough to share each other's
> sequential scans - the disks would basically be looping through you data
> continuously.
>
> --
> Richard Huxton
> Archonet Ltd
>
From | Date | Subject | |
---|---|---|---|
Next Message | Oliver Seemann | 2013-12-07 12:43:18 | Re: One huge db vs many small dbs |
Previous Message | Mack Talcott | 2013-12-07 03:21:20 | Debugging shared memory issues on CentOS |