Quick Links

Re: postgres performance

From:	chidamparam muthusamy <mchidamparam(at)gmail(dot)com>
To:	Richard Huxton <dev(at)archonet(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: postgres performance
Date:	2013-12-07 10:30:24
Message-ID:	CABUk_4gemzU+vFz_9tZ-0oueXHgupAeB1CrYcjXNX0KooQqidg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

hi,
thank you so much for the input.
Can you please clarify the following points:
*1. Output of BitmapAnd = 303660 rows*
-> BitmapAnd (cost=539314.51..539314.51 rows=303660 width=0) (actual
time=9083.085..9083.085 rows=0 loops=1)
-> Bitmap Index Scan on groupid_index
(cost=0.00..164070.62 rows=7998674 width=0) (actual
time=2303.788..2303.788 rows=7840766 loops=1)
Index Cond: ((detailed_report.group_id)::text =
'CHOICE'::text)
-> Bitmap Index Scan on client_index
(cost=0.00..175870.62 rows=7998674 width=0) (actual
time=2879.691..2879.691 rows=7840113 loops=1)
Index Cond: ((detailed_report.client)::text =
'ChoiceFone'::text)
-> Bitmap Index Scan on partial_endtime_index
(cost=0.00..199145.02 rows=9573259 width=0) (actual
time=1754.044..1754.044 rows=9476589 loops=1)
Index Cond: ((detailed_report.end_time >=
'2013-05-01 00:00:00+00'::timestamp with time zone) AND
(detailed_report.end_time < '2013-06-01 00:00:00+00'::timestamp wi
th time zone))

*2. In the Next outer node Bitmap Heap Scan, estimated rows = 303660 and
actual rows = 2958392, why huge difference ? How to bring it down. *
Bitmap Heap Scan on public.detailed_report (cost=539314.51..1544589.52
rows=303660 width=44) (actual time=9619.913..51757.911 rows=2958392 loops=1)

*3. what is the cause for Recheck, is it possible to reduce the time taken
for Recheck ?*
Recheck Cond: (((detailed_report.group_id)::text = 'CHOICE'::text) AND
((detailed_report.client)::text = 'ChoiceFone'::text) AND
(detailed_report.end_time >= '2013-05-01 00:00:
00+00'::timestamp with time zone) AND (detailed_report.end_time <
'2013-06-01 00:00:00+00'::timestamp with time zone))

thanks

On Sat, Dec 7, 2013 at 12:07 AM, Richard Huxton <dev(at)archonet(dot)com> wrote:

> On 06/12/13 17:36, chidamparam muthusamy wrote:
>
> I rather think Alan is right - you either want a lot more RAM or faster
> disks. Have a look at your first query...
>
>
> Query:
>> EXPLAIN (analyze, buffers) SELECT text(client) as client, text(gateway)
>> as gateway,count(*)::bigint as total_calls, (avg(duration)/1000.0)
>> ::numeric(10,2) as acd, (avg(pdd)) ::numeric(10,2) as pdd,
>> sum(call_duration_recv)/1000.0 as duration_recv,
>> sum(call_duration_pay)/1000.0 as duration_pay, sum(call_amount_recv) as
>> call_amount_recv, sum(call_amount_pay) as call_amount_
>> pay FROM detailed_report WHERE end_time>='2013-05-01 00:00' and
>> end_time<'2013-07-01 00:00' and group_id='admin' and client ='CHOICE'
>> GROUP by client, gateway ORDER BY call_amount_recv DESC;
>>
>
> QUERY PLAN
>> ------------------------------------------------------
>> Sort (cost=3422863.06..3422868.69 rows=2254 width=44) (actual
>> time=137852.474..137852.474 rows=5 loops=1)
>> Sort Key: (sum(call_amount_recv))
>> Sort Method: quicksort Memory: 25kB
>> Buffers: shared read=2491664
>>
>
> -> HashAggregate (cost=3422664.28..3422737.53 rows=2254 width=44)
>> (actual time=137852.402..137852.454 rows=5 loops=1)
>> Buffers: shared read=2491664
>>
>
> -> Bitmap Heap Scan on detailed_report (cost=644828.11..3399506.87
>> rows=1029218 width=44) (actual time=4499.558..125443.122 rows=5248227
>> loops=1)
>> Recheck Cond: ((end_time >= '2013-05-01 00:00:00+00'::timestamp with
>> time zone) AND (end_time < '2013-07-01 00:00:00+00'::timestamp with time
>> zone) AND ((group_id)::text = 'adm
>> in'::text) AND ((client)::text = 'CHOICE'::text))
>> Buffers: shared read=2491664
>>
>
> -> Bitmap Index Scan on endtime_groupid_client_tsidx_detail_report
>> (cost=0.00..644570.81 rows=1029218 width=0) (actual
>> time=3418.754..3418.754 rows=5248227 loops=1)
>> Index Cond: ((end_time >= '2013-05-01 00:00:00+00'::timestamp with time
>> zone) AND (end_time < '2013-07-01 00:00:00+00'::timestamp with time
>> zone) AND ((group_id)::text =
>> 'admin'::text) AND ((client)::text = 'CHOICE'::text))
>> Buffers: shared read=95055
>>
>
> Total runtime: *137868.946 ms*
>> (13 rows)
>>
>
> The index is being used, but most of your time is going on the "Bitmap
> Heap Scan". You're processing 5.2 million rows in about 120 seconds -
> that's about 43 rows per millisecond - not too bad. It's not getting any
> cache hits though, it's having to read all the blocks. Looking at the
> number of blocks, that's ~2.5 million at 8KB each or about 20GB. You just
> don't have the RAM to cache that.
>
> If you have lots of similar reporting queries to run, you might get away
> with dropping the index and letting them run in parallel. Each individual
> query would be slow but they should be smart enough to share each other's
> sequential scans - the disks would basically be looping through you data
> continuously.
>
> --
> Richard Huxton
> Archonet Ltd
>

In response to

Re: postgres performance at 2013-12-06 18:37:13 from Richard Huxton

Responses

Re: postgres performance at 2013-12-07 12:55:41 from desmodemone

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Oliver Seemann	2013-12-07 12:43:18	Re: One huge db vs many small dbs
Previous Message	Mack Talcott	2013-12-07 03:21:20	Debugging shared memory issues on CentOS