From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: OOM with many sorts
Date: 2019-07-08 16:44:01
Message-ID: 20190708164401.GA22387@telsasoft.com
Lists: pgsql-general
One of our instances was killed:
| Out of memory: Kill process 27670 (postmaster) score 381 or sacrifice child
| Killed process 27670 (postmaster) total-vm:13207796kB, anon-rss:5736kB, file-rss:0kB, shmem-rss:12421696kB
The VM has 32GB RAM, shared_buffers=12GB, work_mem=128MB, and is running only pg11.3.
Trying to diagnose the issue by running a query on a narrower range of dates,
it looks to me like it's due to many Sort nodes per worker.
Part of the query plan looks like this:
[...]
->  Finalize GroupAggregate (actual rows=1788 loops=1)
      Group Key: huawei_umts_ucell_20181217.sect_id, (to_char(huawei_umts_ucell_20181217.start_time, 'YYYY-MM'::text))
      ->  Gather Merge (actual rows=56921 loops=1)
            Workers Planned: 2
            Workers Launched: 2
            ->  Sort (actual rows=18974 loops=3)
                  Sort Key: huawei_umts_ucell_20181217.sect_id, (to_char(huawei_umts_ucell_20181217.start_time, 'YYYY-MM'::text))
                  Sort Method: quicksort  Memory: 36499kB
                  Worker 0:  Sort Method: quicksort  Memory: 40275kB
                  Worker 1:  Sort Method: quicksort  Memory: 40263kB
                  ->  Parallel Append (actual rows=18974 loops=3)
                        ->  Partial GroupAggregate (actual rows=1783 loops=1)
                              Group Key: huawei_umts_ucell_20181217.sect_id, (to_char(huawei_umts_ucell_20181217.start_time, 'YYYY-MM'::text))
                              ->  Sort (actual rows=42792 loops=1)
                                    Sort Key: huawei_umts_ucell_20181217.sect_id, (to_char(huawei_umts_ucell_20181217.start_time, 'YYYY-MM'::text))
                                    Worker 0:  Sort Method: quicksort  Memory: 44997kB
                                    ->  Parallel Seq Scan on huawei_umts_ucell_20181217 (actual rows=42792 loops=1)
                                          Filter: ((start_time >= '2018-12-01 00:00:00'::timestamp without time zone) AND (start_time < '2019-01-01 00:00:00'::timestamp without time zone))
                        ->  Partial GroupAggregate (actual rows=1783 loops=1)
                              Group Key: huawei_umts_ucell_20181216.sect_id, (to_char(huawei_umts_ucell_20181216.start_time, 'YYYY-MM'::text))
                              ->  Sort (actual rows=42792 loops=1)
                                    Sort Key: huawei_umts_ucell_20181216.sect_id, (to_char(huawei_umts_ucell_20181216.start_time, 'YYYY-MM'::text))
                                    Worker 1:  Sort Method: quicksort  Memory: 44997kB
                                    ->  Parallel Seq Scan on huawei_umts_ucell_20181216 (actual rows=42792 loops=1)
                                          Filter: ((start_time >= '2018-12-01 00:00:00'::timestamp without time zone) AND (start_time < '2019-01-01 00:00:00'::timestamp without time zone))
[...many partial GroupAggregate/Sort repeated here for ~200 child tables...]
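For context, the rough shape of the query, reconstructed from the plan's group keys
and filter, is something like the following (the parent table name and the count(*)
are just placeholders, not the real report's select list):

SELECT sect_id, to_char(start_time, 'YYYY-MM'), count(*)
FROM huawei_umts_ucell      -- assumed parent of the ~200 child tables
WHERE start_time >= '2018-12-01' AND start_time < '2019-01-01'
GROUP BY 1, 2;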
This isn't the first time I've had to reduce work_mem on a parallel query to
avoid OOM, but it seems unreasonable to be concerned with 50MB sorts.
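For reference, the per-session workaround I mean is just lowering work_mem (and,
if needed, parallelism) before running the report; the values here are only illustrative:

SET work_mem = '32MB';                      -- illustrative, well below the 128MB default here
SET max_parallel_workers_per_gather = 0;    -- optional, avoids per-worker copies of the sorts
-- run the report query, then:
RESET work_mem;
RESET max_parallel_workers_per_gather;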
It looks like the results of each Sort node stay in RAM during processing of
additional Sort nodes (is that required?).
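If they do all stay resident, a rough worst-case estimate (assuming ~40MB per
partition-level sort, as in the excerpt above) is:

  ~200 partitions × ~40MB per sort ≈ 8GB of sort memory across the leader + 2 workers

on top of 12GB of shared_buffers on a 32GB VM, even though each individual sort
stays well under work_mem=128MB.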
I SET max_parallel_workers_per_gather=0 and dumped mem context:
...
PortalContext: 203948152 total in 25756 blocks; 176600 free (319 chunks); 203771552 used:
...
Grand total: 1918118632 bytes in 30716 blocks; 192472208 free (3685 chunks); 1725646424 used
The total shows ~1700MB but the largest context is only 200MB.
"Caller tuples" accounts for 1400MB.
awk '/Caller tuples/{s+=$3}END{print s/1024^3}' /tmp/mem-context
1.44043
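(For anyone wanting to reproduce that: on pg11 the only way I know to get such a
dump is to attach gdb to the backend and call MemoryContextStats, whose output goes
to the server's stderr/log; roughly:

gdb --batch -p $BACKEND_PID -ex 'call MemoryContextStats(TopMemoryContext)'

with $BACKEND_PID standing in for the pid of the backend running the query.)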
I'm attaching full plan and mem context.
This doesn't seem to have changed under pg12b2, which is concerning since PG12
advertises that it gracefully supports "thousands" of partitions. It seems to
me that would require tiny work_mem, which would be devastating to some
workloads. This is not a contrived test case; it's one of our reports run
across ~10 months of history with weekly granularity.
Thanks in advance for any advice.
Justin
Attachment | Content-Type | Size
---|---|---
sql-2019-07-03.5.ex-an | text/plain | 33.7 KB
mem-context | text/plain | 24.9 KB