Re: Increasing work_mem slows down query, why?

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Silvio Moioli <moio(at)suse(dot)de>
Cc: Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Increasing work_mem slows down query, why?
Date: 2020-03-30 10:12:33
Message-ID: CAFj8pRADkvF1BwpX7Dv3cYOkcEFAHNNcRX1rM3eqbkGe7rbTbg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

po 30. 3. 2020 v 10:12 odesílatel Silvio Moioli <moio(at)suse(dot)de> napsal:

> On 3/30/20 8:56 AM, Pavel Stehule wrote:
> > please, can you send explain in text form?
>
> Sure. With work_mem = 80MB:
>
>
> QUERY PLAN
>
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique (cost=608228.26..608228.27 rows=2 width=36) (actual
> time=13360.241..13360.454 rows=1100 loops=1)
> Output: wanted_capability.ordering, inserted_capability.id
> Buffers: shared hit=14448
> CTE wanted_capability
> -> Values Scan on "*VALUES*" (cost=0.00..552.75 rows=1100 width=68)
> (actual time=0.001..0.246 rows=1100 loops=1)
> Output: "*VALUES*".column1, "*VALUES*".column2,
> "*VALUES*".column3
> CTE missing_capability
> -> Merge Left Join (cost=300263.57..303282.17 rows=1 width=68)
> (actual time=6686.320..6686.320 rows=0 loops=1)
> Output: wanted_capability_2.ordering, wanted_capability_2.name,
> wanted_capability_2.version
> Merge Cond: (wanted_capability_2.name = (
> rhnpackagecapability_1.name)::text)
> Join Filter: (NOT (wanted_capability_2.version IS DISTINCT FROM
> (rhnpackagecapability_1.version)::text))
> Filter: (rhnpackagecapability_1.id IS NULL)
> Rows Removed by Filter: 1100
> Buffers: shared hit=7222
> -> Sort (cost=1155.57..1158.32 rows=1100 width=68) (actual
> time=10.011..10.053 rows=1100 loops=1)
> Output: wanted_capability_2.ordering,
> wanted_capability_2.name, wanted_capability_2.version
> Sort Key: wanted_capability_2.name
> Sort Method: quicksort Memory: 203kB
> Buffers: shared hit=5
> -> CTE Scan on wanted_capability wanted_capability_2
> (cost=0.00..1100.00 rows=1100 width=68) (actual time=0.000..0.516 rows=1100
> loops=1)
> Output: wanted_capability_2.ordering,
> wanted_capability_2.name, wanted_capability_2.version
> -> Sort (cost=299108.00..300335.41 rows=490964 width=79)
> (actual time=6475.147..6494.111 rows=462600 loops=1)
> Output: rhnpackagecapability_1.name,
> rhnpackagecapability_1.version, rhnpackagecapability_1.id
> Sort Key: rhnpackagecapability_1.name
> Sort Method: quicksort Memory: 79862kB
> Buffers: shared hit=7217
> -> Seq Scan on public.rhnpackagecapability
> rhnpackagecapability_1 (cost=0.00..252699.00 rows=490964 width=79) (actual
> time=0.016..59.976 rows=490964 loops=1)
> Output: rhnpackagecapability_1.name,
> rhnpackagecapability_1.version, rhnpackagecapability_1.id
> Buffers: shared hit=7217
> CTE inserted_capability
> -> Insert on public.rhnpackagecapability rhnpackagecapability_2
> (cost=0.00..1.51 rows=1 width=1080) (actual time=6686.322..6686.322 rows=0
> loops=1)
> Output: rhnpackagecapability_2.id, rhnpackagecapability_2.name,
> rhnpackagecapability_2.version
> Conflict Resolution: NOTHING
> Tuples Inserted: 0
> Conflicting Tuples: 0
> Buffers: shared hit=7222
> -> Subquery Scan on "*SELECT*" (cost=0.00..1.51 rows=1
> width=1080) (actual time=6686.321..6686.321 rows=0 loops=1)
> Output: "*SELECT*".nextval, "*SELECT*".name,
> "*SELECT*".version, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP
> Buffers: shared hit=7222
> -> CTE Scan on missing_capability (cost=0.00..1.00
> rows=1 width=72) (actual time=6686.320..6686.320 rows=0 loops=1)
> Output:
> nextval('rhn_pkg_capability_id_seq'::regclass), missing_capability.name,
> missing_capability.version
> Buffers: shared hit=7222
> -> Sort (cost=304391.82..304391.83 rows=2 width=36) (actual
> time=13360.240..13360.283 rows=1100 loops=1)
> Output: wanted_capability.ordering, inserted_capability.id
> Sort Key: wanted_capability.ordering, inserted_capability.id
> Sort Method: quicksort Memory: 100kB
> Buffers: shared hit=14448
> -> Append (cost=1.50..304391.81 rows=2 width=36) (actual
> time=13357.167..13360.051 rows=1100 loops=1)
> Buffers: shared hit=14442
> -> Hash Join (cost=1.50..1108.64 rows=1 width=36) (actual
> time=6686.340..6686.340 rows=0 loops=1)
> Output: wanted_capability.ordering,
> inserted_capability.id
> Hash Cond: (wanted_capability.name = (
> inserted_capability.name)::text)
> Join Filter: (NOT (wanted_capability.version IS
> DISTINCT FROM (inserted_capability.version)::text))
> Buffers: shared hit=7225
> -> CTE Scan on wanted_capability
> (cost=0.00..1100.00 rows=1100 width=68) (actual time=0.002..0.002 rows=1
> loops=1)
> Output: wanted_capability.ordering,
> wanted_capability.name, wanted_capability.version
> -> Hash (cost=1.00..1.00 rows=1 width=1064) (actual
> time=6686.323..6686.323 rows=0 loops=1)
> Output: inserted_capability.id,
> inserted_capability.name, inserted_capability.version
> Buckets: 1024 Batches: 1 Memory Usage: 8kB
> Buffers: shared hit=7222
> -> CTE Scan on inserted_capability
> (cost=0.00..1.00 rows=1 width=1064) (actual time=6686.322..6686.322 rows=0
> loops=1)
> Output: inserted_capability.id,
> inserted_capability.name, inserted_capability.version
> Buffers: shared hit=7222
> -> Merge Join (cost=300263.57..303282.17 rows=1 width=10)
> (actual time=6670.825..6673.642 rows=1100 loops=1)
> Output: wanted_capability_1.ordering,
> rhnpackagecapability.id
> Merge Cond: (wanted_capability_1.name = (
> rhnpackagecapability.name)::text)
> Join Filter: (NOT (wanted_capability_1.version IS
> DISTINCT FROM (rhnpackagecapability.version)::text))
> Buffers: shared hit=7217
> -> Sort (cost=1155.57..1158.32 rows=1100 width=68)
> (actual time=9.430..9.474 rows=1100 loops=1)
> Output: wanted_capability_1.ordering,
> wanted_capability_1.name, wanted_capability_1.version
> Sort Key: wanted_capability_1.name
> Sort Method: quicksort Memory: 203kB
> -> CTE Scan on wanted_capability
> wanted_capability_1 (cost=0.00..1100.00 rows=1100 width=68) (actual
> time=0.001..0.066 rows=1100 loops=1)
> Output: wanted_capability_1.ordering,
> wanted_capability_1.name, wanted_capability_1.version
> -> Sort (cost=299108.00..300335.41 rows=490964
> width=79) (actual time=6458.988..6477.151 rows=462600 loops=1)
> Output: rhnpackagecapability.id,
> rhnpackagecapability.name, rhnpackagecapability.version
> Sort Key: rhnpackagecapability.name
> Sort Method: quicksort Memory: 79862kB
> Buffers: shared hit=7217
> -> Seq Scan on public.rhnpackagecapability
> (cost=0.00..252699.00 rows=490964 width=79) (actual time=0.012..50.467
> rows=490964 loops=1)
> Output: rhnpackagecapability.id,
> rhnpackagecapability.name, rhnpackagecapability.version
> Buffers: shared hit=7217
> Planning time: 2.110 ms
> Execution time: 13362.965 ms
>
>
>
> With work_mem = 5MB:
>
>
> QUERY PLAN
>
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique (cost=648953.89..648953.91 rows=2 width=36) (actual
> time=221.127..221.337 rows=1100 loops=1)
> Output: wanted_capability.ordering, inserted_capability.id
> Buffers: shared hit=7226 read=7217
> CTE wanted_capability
> -> Values Scan on "*VALUES*" (cost=0.00..552.75 rows=1100 width=68)
> (actual time=0.001..0.266 rows=1100 loops=1)
> Output: "*VALUES*".column1, "*VALUES*".column2,
> "*VALUES*".column3
> CTE missing_capability
> -> Hash Right Join (cost=1652.75..323644.99 rows=1 width=68)
> (actual time=137.544..137.544 rows=0 loops=1)
> Output: wanted_capability_2.ordering, wanted_capability_2.name,
> wanted_capability_2.version
> Hash Cond: ((rhnpackagecapability_1.name)::text =
> wanted_capability_2.name)
> Join Filter: (NOT (wanted_capability_2.version IS DISTINCT FROM
> (rhnpackagecapability_1.version)::text))
> Filter: (rhnpackagecapability_1.id IS NULL)
> Rows Removed by Filter: 1100
> Buffers: shared read=7217
> -> Seq Scan on public.rhnpackagecapability
> rhnpackagecapability_1 (cost=0.00..252699.00 rows=490964 width=79) (actual
> time=0.016..77.305 rows=490964 loops=1)
> Output: rhnpackagecapability_1.id,
> rhnpackagecapability_1.name, rhnpackagecapability_1.version,
> rhnpackagecapability_1.created, rhnpackagecapability_1.modified
> Buffers: shared read=7217
> -> Hash (cost=1100.00..1100.00 rows=1100 width=68) (actual
> time=0.812..0.812 rows=1100 loops=1)
> Output: wanted_capability_2.ordering,
> wanted_capability_2.name, wanted_capability_2.version
> Buckets: 2048 Batches: 1 Memory Usage: 134kB
> -> CTE Scan on wanted_capability wanted_capability_2
> (cost=0.00..1100.00 rows=1100 width=68) (actual time=0.000..0.574 rows=1100
> loops=1)
> Output: wanted_capability_2.ordering,
> wanted_capability_2.name, wanted_capability_2.version
> CTE inserted_capability
> -> Insert on public.rhnpackagecapability rhnpackagecapability_2
> (cost=0.00..1.51 rows=1 width=1080) (actual time=137.546..137.546 rows=0
> loops=1)
> Output: rhnpackagecapability_2.id, rhnpackagecapability_2.name,
> rhnpackagecapability_2.version
> Conflict Resolution: NOTHING
> Tuples Inserted: 0
> Conflicting Tuples: 0
> Buffers: shared read=7217
> -> Subquery Scan on "*SELECT*" (cost=0.00..1.51 rows=1
> width=1080) (actual time=137.545..137.545 rows=0 loops=1)
> Output: "*SELECT*".nextval, "*SELECT*".name,
> "*SELECT*".version, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP
> Buffers: shared read=7217
> -> CTE Scan on missing_capability (cost=0.00..1.00
> rows=1 width=72) (actual time=137.544..137.545 rows=0 loops=1)
> Output:
> nextval('rhn_pkg_capability_id_seq'::regclass), missing_capability.name,
> missing_capability.version
> Buffers: shared read=7217
> -> Sort (cost=324754.64..324754.65 rows=2 width=36) (actual
> time=221.126..221.165 rows=1100 loops=1)
> Output: wanted_capability.ordering, inserted_capability.id
> Sort Key: wanted_capability.ordering, inserted_capability.id
> Sort Method: quicksort Memory: 100kB
> Buffers: shared hit=7226 read=7217
> -> Append (cost=1.50..324754.63 rows=2 width=36) (actual
> time=169.421..220.870 rows=1100 loops=1)
> Buffers: shared hit=7220 read=7217
> -> Hash Join (cost=1.50..1108.64 rows=1 width=36) (actual
> time=137.573..137.573 rows=0 loops=1)
> Output: wanted_capability.ordering,
> inserted_capability.id
> Hash Cond: (wanted_capability.name = (
> inserted_capability.name)::text)
> Join Filter: (NOT (wanted_capability.version IS
> DISTINCT FROM (inserted_capability.version)::text))
> Buffers: shared hit=3 read=7217
> -> CTE Scan on wanted_capability
> (cost=0.00..1100.00 rows=1100 width=68) (actual time=0.002..0.002 rows=1
> loops=1)
> Output: wanted_capability.ordering,
> wanted_capability.name, wanted_capability.version
> -> Hash (cost=1.00..1.00 rows=1 width=1064) (actual
> time=137.547..137.547 rows=0 loops=1)
> Output: inserted_capability.id,
> inserted_capability.name, inserted_capability.version
> Buckets: 1024 Batches: 1 Memory Usage: 8kB
> Buffers: shared read=7217
> -> CTE Scan on inserted_capability
> (cost=0.00..1.00 rows=1 width=1064) (actual time=137.547..137.547 rows=0
> loops=1)
> Output: inserted_capability.id,
> inserted_capability.name, inserted_capability.version
> Buffers: shared read=7217
> -> Hash Join (cost=1652.75..323644.99 rows=1 width=10)
> (actual time=31.846..83.234 rows=1100 loops=1)
> Output: wanted_capability_1.ordering,
> rhnpackagecapability.id
> Hash Cond: ((rhnpackagecapability.name)::text =
> wanted_capability_1.name)
> Join Filter: (NOT (wanted_capability_1.version IS
> DISTINCT FROM (rhnpackagecapability.version)::text))
> Buffers: shared hit=7217
> -> Seq Scan on public.rhnpackagecapability
> (cost=0.00..252699.00 rows=490964 width=79) (actual time=0.007..29.702
> rows=490964 loops=1)
> Output: rhnpackagecapability.id,
> rhnpackagecapability.name, rhnpackagecapability.version,
> rhnpackagecapability.created, rhnpackagecapability.modified
> Buffers: shared hit=7217
> -> Hash (cost=1100.00..1100.00 rows=1100 width=68)
> (actual time=0.257..0.257 rows=1100 loops=1)
> Output: wanted_capability_1.ordering,
> wanted_capability_1.name, wanted_capability_1.version
> Buckets: 2048 Batches: 1 Memory Usage: 134kB
> -> CTE Scan on wanted_capability
> wanted_capability_1 (cost=0.00..1100.00 rows=1100 width=68) (actual
> time=0.001..0.067 rows=1100 loops=1)
> Output: wanted_capability_1.ordering,
> wanted_capability_1.name, wanted_capability_1.version
> Planning time: 3.232 ms
> Execution time: 221.668 ms
>
> > Probably, there is a problem in wrong estimation.
>
> Yes, that's what I would also assume.
>
> > What can be expected because CTE is optimization fence in this version
> I am aware of that, but would not expect it to really be a problem in this
> specific case. Fact that CTE is an optimization fence is true regardless of
> work_mem, so ATM I cannot see why it would lead to slow down the query in
> high work_mem case.
>
> I am sure I am still missing something...
>

This parts looks strange

-> Sort (cost=299108.00..300335.41 rows=490964
width=79) (actual time=6458.988..6477.151 rows=462600 loops=1)
Output: rhnpackagecapability.id,
rhnpackagecapability.name, rhnpackagecapability.version
Sort Key: rhnpackagecapability.name
Sort Method: quicksort Memory: 79862kB
Buffers: shared hit=7217
-> Seq Scan on public.rhnpackagecapability
(cost=0.00..252699.00 rows=490964 width=79) (actual time=0.012..50.467
rows=490964 loops=1)
Output: rhnpackagecapability.id,
rhnpackagecapability.name, rhnpackagecapability.version
Buffers: shared hit=7217

I did some test case

postgres=# explain (analyze, buffers) select * from foo2 join foo3 on
foo2.name = foo3.name;
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------
Hash Join (cost=188.62..33869.93 rows=866330 width=78) (actual
time=6.247..369.081 rows=934000 loops=1)
Hash Cond: (foo2.name = foo3.name)
Buffers: shared hit=2224 read=4092
-> Seq Scan on foo2 (cost=0.00..12518.00 rows=625000 width=48) (actual
time=0.095..70.174 rows=625000 loops=1)
Buffers: shared hit=2176 read=4092
-> Hash (cost=110.50..110.50 rows=6250 width=30) (actual
time=6.116..6.116 rows=6250 loops=1)
Buckets: 8192 Batches: 1 Memory Usage: 447kB
Buffers: shared hit=48
-> Seq Scan on foo3 (cost=0.00..110.50 rows=6250 width=30)
(actual time=0.014..1.801 rows=6250 loops=1)
Buffers: shared hit=48
Planning Time: 1.190 ms
Execution Time: 414.264 ms
(12 rows)

postgres=# explain (analyze, buffers) select * from foo2 join foo3 on
foo2.name = foo3.name;
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=73189.73..86215.92 rows=866330 width=78) (actual
time=1499.805..1835.262 rows=934000 loops=1)
Merge Cond: (foo3.name = foo2.name)
Buffers: shared hit=2261 read=4060, temp read=13104 written=10023
-> Sort (cost=504.55..520.18 rows=6250 width=30) (actual
time=21.313..21.895 rows=6250 loops=1)
Sort Key: foo3.name
Sort Method: quicksort Memory: 763kB
Buffers: shared hit=53
-> Seq Scan on foo3 (cost=0.00..110.50 rows=6250 width=30)
(actual time=0.017..1.802 rows=6250 loops=1)
Buffers: shared hit=48
-> Sort (cost=72685.18..74247.68 rows=625000 width=48) (actual
time=1478.480..1602.358 rows=933999 loops=1)
Sort Key: foo2.name
Sort Method: external sort Disk: 40088kB
Buffers: shared hit=2208 read=4060, temp read=12196 written=10023
-> Seq Scan on foo2 (cost=0.00..12518.00 rows=625000 width=48)
(actual time=0.039..63.340 rows=625000 loops=1)
Buffers: shared hit=2208 read=4060
Planning Time: 1.116 ms
Execution Time: 1884.985 ms
(17 rows)

And looks little bit strange the cost on seq scan on foo2 12K against cost
of your public.rhnpackagecapability - 252K.

Do you have some planner variables changed - like seq_page_cost?

I did some tests and it looks so a penalization for sort long keys is not
too high. In your case it is reason why sort is very slow (probably due
slow locales). Then the cost of hash join and sort is similar, although in
reality it is not true.

On your plan is strange the cost of seq scan. It is surprisingly high.

Regards

Pavel

> Thanks!
>
> Regards,
> --
> Silvio Moioli
> SUSE Manager Development Team
>
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Silvio Moioli 2020-03-30 13:09:12 Re: Increasing work_mem slows down query, why?
Previous Message Silvio Moioli 2020-03-30 08:12:42 Re: Increasing work_mem slows down query, why?