From: | Ayub Khan <ayub(dot)hp(at)gmail(dot)com> |
---|---|
To: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
Cc: | Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org> |
Subject: | Re: slow query with inline function on AWS RDS with RDS 24x large |
Date: | 2021-06-04 08:51:13 |
Message-ID: | CAHdeyE+BJUwXiyk7Xb5bdbsbke6BMUdH8bEeX6GTtpzmPhXkqA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
You are right, I dropped BRIN and created btree and the performance on 0
rows matching criteria table is good, below is the plan with BTREE. I will
test by inserting lot of data.
Hash Join (cost=50186.91..3765911.10 rows=5397411 width=291) (actual
time=1.501..1.504 rows=0 loops=1)
Hash Cond: (b.restaurant_id = r.restaurant_id)
-> Hash Left Join (cost=49845.88..2078197.48 rows=5397411 width=216)
(actual time=0.079..0.081 rows=0 loops=1)
Hash Cond: (ro.order_id = oom.order_id)
-> Hash Join (cost=933.18..2007856.35 rows=5397411 width=209)
(actual time=0.078..0.080 rows=0 loops=1)
Hash Cond: (b.city_id = c.city_id)
-> Hash Join (cost=930.92..1956181.11 rows=19276467
width=188) (actual time=0.048..0.050 rows=0 loops=1)
Hash Cond: (ro.branch_id = b.branch_id)
-> Nested Loop (cost=0.56..1904639.80 rows=19276467
width=108) (actual time=0.048..0.048 rows=0 loops=1)
-> Function Scan on start_date (cost=0.00..0.01
rows=1 width=8) (actual time=0.002..0.003 rows=1 loops=1)
-> Index Scan using rest_ord_date_brin on
restaurant_order ro (cost=0.56..1711875.12 rows=19276467 width=108)
(actual time=0.042..0.042 rows=0 loops=1)
Index Cond: ((date_time >=
start_date.start_date) AND (date_time <= '2021-06-04
08:48:45.377833+00'::timestamp with time zone))
-> Hash (cost=668.49..668.49 rows=20949 width=88)
(never executed)
-> Seq Scan on branch b (cost=0.00..668.49
rows=20949 width=88) (never executed)
-> Hash (cost=1.56..1.56 rows=56 width=29) (actual
time=0.026..0.027 rows=56 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on city c (cost=0.00..1.56 rows=56
width=29) (actual time=0.009..0.016 rows=56 loops=1)
-> Hash (cost=33000.09..33000.09 rows=1273009 width=13) (never
executed)
-> Seq Scan on order_offer_map oom (cost=0.00..33000.09
rows=1273009 width=13) (never executed)
-> Hash (cost=233.42..233.42 rows=8609 width=8) (actual
time=1.403..1.403 rows=8609 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 465kB
-> Index Only Scan using "restaurant_idx$$_274b003d" on restaurant
r (cost=0.29..233.42 rows=8609 width=8) (actual time=0.007..0.634
rows=8609 loops=1)
Heap Fetches: 0
Planning Time: 1.352 ms
Execution Time: 1.571 ms
On Fri, Jun 4, 2021 at 11:41 AM Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
wrote:
> Hi
>
>
> pá 4. 6. 2021 v 10:32 odesílatel Ayub Khan <ayub(dot)hp(at)gmail(dot)com> napsal:
>
>> BRIN index is only on the date_time column, I even tried with btree index
>> with no performance gains.
>>
>
> -> Bitmap Heap Scan on restaurant_order ro (cost=5427.94..3353966.60 rows=19275986 width=108) (actual time=1036.793..1036.793 rows=0 loops=1)
> Recheck Cond: ((date_time >= start_date.start_date) AND (date_time <= '2021-06-04 08:05:32.784199+00'::timestamp with time zone))
> Rows Removed by Index Recheck: 5039976
> Heap Blocks: lossy=275230
>
> When the most rows are removed in recheck, then the effectivity of the index is not good
>
> Pavel
>
>
>
>
>>
>> On Fri, Jun 4, 2021 at 11:23 AM Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
>> wrote:
>>
>>>
>>>
>>> pá 4. 6. 2021 v 10:07 odesílatel Ayub Khan <ayub(dot)hp(at)gmail(dot)com> napsal:
>>>
>>>>
>>>> below query is slow even with no data
>>>>
>>>>
>>>> explain ANALYZE
>>>>
>>>> WITH business AS( SELECT * FROM get_businessday_utc_f() start_date)
>>>> SELECT ro.order_id,
>>>> ro.date_time,
>>>> round(ro.order_amount, 2) AS order_amount,
>>>> b.branch_id,
>>>> b.branch_name,
>>>> st_x(b.location) AS from_x,
>>>> st_y(b.location) AS from_y,
>>>> b.user_id AS branch_user_id,
>>>> b.contact_info,
>>>> r.restaurant_id,
>>>> c.city_id,
>>>> c.city_name,
>>>> c.city_name_ar,
>>>> st_linefromtext(((((((('LINESTRING('::text || st_x(b.location)) || ' '::text) || st_y(b.location)) || ','::text) || st_x(ro.location_geometry)) || ' '::text) || st_y(ro.location_geometry)) || ')'::text, 28355) AS from_to,
>>>> to_char(ro.date_time, 'HH24:MI'::text) AS order_time,
>>>> ro.customer_comment,
>>>> 'N'::text AS is_new_customer,
>>>> ro.picked_up_time,
>>>> ro.driver_assigned_date_time,
>>>> oom.offer_amount,
>>>> oom.offer_type_code AS offer_type,
>>>> ro.uk_vat
>>>> FROM business, restaurant_order ro
>>>>
>>>> JOIN branch b ON b.branch_id = ro.branch_id
>>>> JOIN restaurant r ON r.restaurant_id = b.restaurant_id
>>>> JOIN city c ON c.city_id = b.city_id
>>>> LEFT JOIN order_offer_map oom using (order_id)
>>>> WHERE ro.date_time >= business.start_date AND ro.date_time<= f_now_immutable_with_tz();
>>>>
>>>>
>>>>
>>>> Hash Left Join (cost=55497.32..5417639.59 rows=5397276 width=291) (actual time=1056.926..1056.934 rows=0 loops=1)
>>>> Hash Cond: (ro.order_id = oom.order_id)
>>>> -> Hash Join (cost=6584.61..3674143.44 rows=5397276 width=209) (actual time=1056.926..1056.932 rows=0 loops=1)
>>>> Hash Cond: (ro.branch_id = b.branch_id)
>>>> -> Nested Loop (cost=5427.94..3546726.47 rows=19275986 width=108) (actual time=1036.809..1036.810 rows=0 loops=1)
>>>> -> Function Scan on start_date (cost=0.00..0.01 rows=1 width=8) (actual time=0.006..0.008 rows=1 loops=1)
>>>> -> Bitmap Heap Scan on restaurant_order ro (cost=5427.94..3353966.60 rows=19275986 width=108) (actual time=1036.793..1036.793 rows=0 loops=1)
>>>> Recheck Cond: ((date_time >= start_date.start_date) AND (date_time <= '2021-06-04 08:05:32.784199+00'::timestamp with time zone))
>>>> Rows Removed by Index Recheck: 5039976
>>>> Heap Blocks: lossy=275230
>>>> -> Bitmap Index Scan on rest_ord_date_brin (cost=0.00..608.94 rows=19359111 width=0) (actual time=14.037..14.038 rows=2917120 loops=1)
>>>> Index Cond: ((date_time >= start_date.start_date) AND (date_time <= '2021-06-04 08:05:32.784199+00'::timestamp with time zone))
>>>> -> Hash (cost=1083.35..1083.35 rows=5866 width=109) (actual time=20.106..20.109 rows=20949 loops=1)
>>>> Buckets: 32768 (originally 8192) Batches: 1 (originally 1) Memory Usage: 3112kB
>>>> -> Hash Join (cost=343.29..1083.35 rows=5866 width=109) (actual time=1.620..14.539 rows=20949 loops=1)
>>>> Hash Cond: (b.restaurant_id = r.restaurant_id)
>>>> -> Hash Join (cost=2.26..726.91 rows=5866 width=109) (actual time=0.029..8.597 rows=20949 loops=1)
>>>> Hash Cond: (b.city_id = c.city_id)
>>>> -> Seq Scan on branch b (cost=0.00..668.49 rows=20949 width=88) (actual time=0.004..1.609 rows=20949 loops=1)
>>>> -> Hash (cost=1.56..1.56 rows=56 width=29) (actual time=0.020..0.021 rows=56 loops=1)
>>>> Buckets: 1024 Batches: 1 Memory Usage: 12kB
>>>> -> Seq Scan on city c (cost=0.00..1.56 rows=56 width=29) (actual time=0.004..0.010 rows=56 loops=1)
>>>> -> Hash (cost=233.42..233.42 rows=8609 width=8) (actual time=1.575..1.575 rows=8609 loops=1)
>>>> Buckets: 16384 Batches: 1 Memory Usage: 465kB
>>>> -> Index Only Scan using "restaurant_idx$$_274b003d" on restaurant r (cost=0.29..233.42 rows=8609 width=8) (actual time=0.006..0.684 rows=8609 loops=1)
>>>> Heap Fetches: 0
>>>> -> Hash (cost=33000.09..33000.09 rows=1273009 width=13) (never executed)
>>>> -> Seq Scan on order_offer_map oom (cost=0.00..33000.09 rows=1273009 width=13) (never executed)
>>>> Planning Time: 1.180 ms
>>>> Execution Time: 1057.535 ms
>>>>
>>>> could some one explain why it is slow, if I insert 50k records the execution time reaches 20 seconds
>>>>
>>>>
>>> -> Bitmap Heap Scan on restaurant_order ro
>>> (cost=5427.94..3353966.60 rows=19275986 width=108) (actual
>>> time=1036.793..1036.793 rows=0 loops=1)
>>> Recheck Cond: ((date_time >= start_date.start_date)
>>> AND (date_time <= '2021-06-04 08:05:32.784199+00'::timestamp with time
>>> zone))
>>> Rows Removed by Index Recheck: 5039976
>>> Heap Blocks: lossy=275230
>>> -> Bitmap Index Scan on rest_ord_date_brin
>>> (cost=0.00..608.94 rows=19359111 width=0) (actual time=14.037..14.038
>>> rows=2917120 loops=1)
>>> Index Cond: ((date_time >=
>>> start_date.start_date) AND (date_time <= '2021-06-04
>>> 08:05:32.784199+00'::timestamp with time zone))
>>>
>>> Looks so the BRIN index is not in good condition. Maybe you need
>>> reindex, maybe BRIN index is not good format for your data.
>>>
>>> There are lot of data - few millions of rows
>>>
>>> Regards
>>>
>>> Pavel
>>>
>>>
>>
>>
>>
--
--------------------------------------------------------------------
Sun Certified Enterprise Architect 1.5
Sun Certified Java Programmer 1.4
Microsoft Certified Systems Engineer 2000
http://in.linkedin.com/pub/ayub-khan/a/811/b81
mobile:+966-502674604
----------------------------------------------------------------------
It is proved that Hard Work and kowledge will get you close but attitude
will get you there. However, it's the Love
of God that will put you over the top!!
From | Date | Subject | |
---|---|---|---|
Next Message | Taras Savchuk | 2021-06-04 11:53:12 | PgSQL 12 on WinSrv ~3x faster than on Linux |
Previous Message | Pavel Stehule | 2021-06-04 08:40:30 | Re: slow query with inline function on AWS RDS with RDS 24x large |