Re: slow query with inline function on AWS RDS with RDS 24x large

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Ayub Khan <ayub(dot)hp(at)gmail(dot)com>
Cc: Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: slow query with inline function on AWS RDS with RDS 24x large
Date: 2021-06-04 08:40:30
Message-ID: CAFj8pRDzkBpjiHNqfHh1pwkDTi0gqnEGWdNP-pnHLyiAeuwqiA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi

pá 4. 6. 2021 v 10:32 odesílatel Ayub Khan <ayub(dot)hp(at)gmail(dot)com> napsal:

> BRIN index is only on the date_time column, I even tried with btree index
> with no performance gains.
>

-> Bitmap Heap Scan on restaurant_order ro
(cost=5427.94..3353966.60 rows=19275986 width=108) (actual
time=1036.793..1036.793 rows=0 loops=1)
Recheck Cond: ((date_time >=
start_date.start_date) AND (date_time <= '2021-06-04
08:05:32.784199+00'::timestamp with time zone))
Rows Removed by Index Recheck: 5039976
Heap Blocks: lossy=275230

When the most rows are removed in recheck, then the effectivity of the
index is not good

Pavel

>
> On Fri, Jun 4, 2021 at 11:23 AM Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
>
>>
>>
>> pá 4. 6. 2021 v 10:07 odesílatel Ayub Khan <ayub(dot)hp(at)gmail(dot)com> napsal:
>>
>>>
>>> below query is slow even with no data
>>>
>>>
>>> explain ANALYZE
>>>
>>> WITH business AS( SELECT * FROM get_businessday_utc_f() start_date)
>>> SELECT ro.order_id,
>>> ro.date_time,
>>> round(ro.order_amount, 2) AS order_amount,
>>> b.branch_id,
>>> b.branch_name,
>>> st_x(b.location) AS from_x,
>>> st_y(b.location) AS from_y,
>>> b.user_id AS branch_user_id,
>>> b.contact_info,
>>> r.restaurant_id,
>>> c.city_id,
>>> c.city_name,
>>> c.city_name_ar,
>>> st_linefromtext(((((((('LINESTRING('::text || st_x(b.location)) || ' '::text) || st_y(b.location)) || ','::text) || st_x(ro.location_geometry)) || ' '::text) || st_y(ro.location_geometry)) || ')'::text, 28355) AS from_to,
>>> to_char(ro.date_time, 'HH24:MI'::text) AS order_time,
>>> ro.customer_comment,
>>> 'N'::text AS is_new_customer,
>>> ro.picked_up_time,
>>> ro.driver_assigned_date_time,
>>> oom.offer_amount,
>>> oom.offer_type_code AS offer_type,
>>> ro.uk_vat
>>> FROM business, restaurant_order ro
>>>
>>> JOIN branch b ON b.branch_id = ro.branch_id
>>> JOIN restaurant r ON r.restaurant_id = b.restaurant_id
>>> JOIN city c ON c.city_id = b.city_id
>>> LEFT JOIN order_offer_map oom using (order_id)
>>> WHERE ro.date_time >= business.start_date AND ro.date_time<= f_now_immutable_with_tz();
>>>
>>>
>>>
>>> Hash Left Join (cost=55497.32..5417639.59 rows=5397276 width=291) (actual time=1056.926..1056.934 rows=0 loops=1)
>>> Hash Cond: (ro.order_id = oom.order_id)
>>> -> Hash Join (cost=6584.61..3674143.44 rows=5397276 width=209) (actual time=1056.926..1056.932 rows=0 loops=1)
>>> Hash Cond: (ro.branch_id = b.branch_id)
>>> -> Nested Loop (cost=5427.94..3546726.47 rows=19275986 width=108) (actual time=1036.809..1036.810 rows=0 loops=1)
>>> -> Function Scan on start_date (cost=0.00..0.01 rows=1 width=8) (actual time=0.006..0.008 rows=1 loops=1)
>>> -> Bitmap Heap Scan on restaurant_order ro (cost=5427.94..3353966.60 rows=19275986 width=108) (actual time=1036.793..1036.793 rows=0 loops=1)
>>> Recheck Cond: ((date_time >= start_date.start_date) AND (date_time <= '2021-06-04 08:05:32.784199+00'::timestamp with time zone))
>>> Rows Removed by Index Recheck: 5039976
>>> Heap Blocks: lossy=275230
>>> -> Bitmap Index Scan on rest_ord_date_brin (cost=0.00..608.94 rows=19359111 width=0) (actual time=14.037..14.038 rows=2917120 loops=1)
>>> Index Cond: ((date_time >= start_date.start_date) AND (date_time <= '2021-06-04 08:05:32.784199+00'::timestamp with time zone))
>>> -> Hash (cost=1083.35..1083.35 rows=5866 width=109) (actual time=20.106..20.109 rows=20949 loops=1)
>>> Buckets: 32768 (originally 8192) Batches: 1 (originally 1) Memory Usage: 3112kB
>>> -> Hash Join (cost=343.29..1083.35 rows=5866 width=109) (actual time=1.620..14.539 rows=20949 loops=1)
>>> Hash Cond: (b.restaurant_id = r.restaurant_id)
>>> -> Hash Join (cost=2.26..726.91 rows=5866 width=109) (actual time=0.029..8.597 rows=20949 loops=1)
>>> Hash Cond: (b.city_id = c.city_id)
>>> -> Seq Scan on branch b (cost=0.00..668.49 rows=20949 width=88) (actual time=0.004..1.609 rows=20949 loops=1)
>>> -> Hash (cost=1.56..1.56 rows=56 width=29) (actual time=0.020..0.021 rows=56 loops=1)
>>> Buckets: 1024 Batches: 1 Memory Usage: 12kB
>>> -> Seq Scan on city c (cost=0.00..1.56 rows=56 width=29) (actual time=0.004..0.010 rows=56 loops=1)
>>> -> Hash (cost=233.42..233.42 rows=8609 width=8) (actual time=1.575..1.575 rows=8609 loops=1)
>>> Buckets: 16384 Batches: 1 Memory Usage: 465kB
>>> -> Index Only Scan using "restaurant_idx$$_274b003d" on restaurant r (cost=0.29..233.42 rows=8609 width=8) (actual time=0.006..0.684 rows=8609 loops=1)
>>> Heap Fetches: 0
>>> -> Hash (cost=33000.09..33000.09 rows=1273009 width=13) (never executed)
>>> -> Seq Scan on order_offer_map oom (cost=0.00..33000.09 rows=1273009 width=13) (never executed)
>>> Planning Time: 1.180 ms
>>> Execution Time: 1057.535 ms
>>>
>>> could some one explain why it is slow, if I insert 50k records the execution time reaches 20 seconds
>>>
>>>
>> -> Bitmap Heap Scan on restaurant_order ro
>> (cost=5427.94..3353966.60 rows=19275986 width=108) (actual
>> time=1036.793..1036.793 rows=0 loops=1)
>> Recheck Cond: ((date_time >= start_date.start_date)
>> AND (date_time <= '2021-06-04 08:05:32.784199+00'::timestamp with time
>> zone))
>> Rows Removed by Index Recheck: 5039976
>> Heap Blocks: lossy=275230
>> -> Bitmap Index Scan on rest_ord_date_brin
>> (cost=0.00..608.94 rows=19359111 width=0) (actual time=14.037..14.038
>> rows=2917120 loops=1)
>> Index Cond: ((date_time >=
>> start_date.start_date) AND (date_time <= '2021-06-04
>> 08:05:32.784199+00'::timestamp with time zone))
>>
>> Looks so the BRIN index is not in good condition. Maybe you need reindex,
>> maybe BRIN index is not good format for your data.
>>
>> There are lot of data - few millions of rows
>>
>> Regards
>>
>> Pavel
>>
>>
>
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Ayub Khan 2021-06-04 08:51:13 Re: slow query with inline function on AWS RDS with RDS 24x large
Previous Message Ayub Khan 2021-06-04 08:32:24 Re: slow query with inline function on AWS RDS with RDS 24x large