AW: Re: BUG #18471: Possible JIT memory leak resulting in signal 11:Segmentation fault on ARM

From: joachim(dot)haecker-becker(at)arcor(dot)de
To: "Dmitry Dolgov" <9erthalion6(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: AW: Re: BUG #18471: Possible JIT memory leak resulting in signal 11:Segmentation fault on ARM
Date: 2024-05-22 06:40:07
Message-ID: 673ce6ea298c4f989a34d63e1d13a8fa@arcor.de
Lists: pgsql-bugs

Hi Dmitry,

thanks for looking into this.

Maybe it is a combination of JIT and some other postgres config changes we have in our environment?
I will try to reproduce with a blank config and only change the JIT settings.

These are the settings where the source is <> default:

name                                   setting             unit
autovacuum_analyze_scale_factor        0.03
autovacuum_max_workers                 6
autovacuum_naptime                     300                 s
autovacuum_vacuum_insert_scale_factor  0.05
autovacuum_vacuum_scale_factor         0.03
autovacuum_vacuum_threshold            1000
client_connection_check_interval       30000               ms
default_text_search_config             pg_catalog.english
dynamic_shared_memory_type             posix
effective_cache_size                   1048576             8kB
enable_partitionwise_aggregate         on
enable_partitionwise_join              on
hash_mem_multiplier                    1.5
jit                                    on
jit_above_cost                         1
jit_inline_above_cost                  1
jit_optimize_above_cost                1
listen_addresses                       *
log_destination                        jsonlog
log_file_mode                          640
log_lock_waits                         on
log_rotation_size                      102400              kB
log_timezone                           Etc/UTC
logging_collector                      on
maintenance_work_mem                   1048576             kB
max_connections                        150
max_locks_per_transaction              1024
max_parallel_workers                   8
max_parallel_workers_per_gather        2
max_wal_size                           2048                MB
min_wal_size                           80                  MB
random_page_cost                       1
shared_buffers                         786432              8kB
TimeZone                               Etc/UTC
work_mem                               512000              kB

The docker container has a 6gb shm_size.

Let me know if there is anything else I can provide to get this resolved.

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Sent: 21.05.2024 18:08
To: <joachim(dot)haecker-becker(at)arcor(dot)de>, <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #18471: Possible JIT memory leak resulting in signal 11:Segmentation fault on ARM

> On Fri, May 17, 2024 at 01:13:06PM +0000, PG Bug reporting form wrote:
> The following bug has been logged on the website:
>
> Bug reference: 18471
> Logged by: Joachim Haecker-Becker
> Email address: joachim(dot)haecker-becker(at)arcor(dot)de
> PostgreSQL version: 16.3
> Operating system: Debian Bookworm
> Description:
>
> We have a reproducible way to force a postgres process to consume more and
> more RAM until it crashes on ARM.
> The same works on X86 without any issue.
> With jit=off it runs on ARM as well.
>
> We run into this situation in a real-life database situation with a lot of
> joins and aggregate functions.
> The following code is just a mock to reproduce a similar situation without
> needing access to our real data.
> This issue blocks us from upgrading our ARM-hosted databases to anything
> newer than 14.7.

I think it would be useful to know how much of a memory difference we are
talking about and, just to make everything clear, how exactly postgres
crashes (OOM kill I assume)? It's important to differentiate between the
case "ARM with jit crashes, ARM without jit doesn't" and "ARM with jit
crashes, ARM without jit crashes with even more columns" (the same goes
for x86).

I've tried to reproduce it on an arm64 VM (16.3 build with llvm 17), and
although I could observe some difference in memory consumption between
JIT on/off, it wasn't huge (around 10% or so). Running it under
valgrind shows only complaints about memory allocated for bitcode
modules, which is expected -- as far as I recall postgres is somewhat
wasteful when it comes to allocating memory for those modules, even more
so for parallel workers. This is the case here, where there is a growing
number of parallel hash workers. This would not explain any difference
from x86 of course, but there might be different baseline memory
consumption for different architectures.
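
For reference, the memory-related values in the settings table above can be converted to absolute sizes. This is only a sketch: it assumes the units are multiples of 1024 bytes (kB = 1024 bytes, 8kB = 8192 bytes), and the names SETTINGS, UNIT_BYTES, to_bytes and to_mib are made up for illustration and are not part of PostgreSQL.

```python
# Values copied from the settings table: name -> (setting, unit).
SETTINGS = {
    "shared_buffers": (786432, "8kB"),
    "effective_cache_size": (1048576, "8kB"),
    "work_mem": (512000, "kB"),
    "maintenance_work_mem": (1048576, "kB"),
}

# Assumption: memory units are binary multiples (1 kB = 1024 bytes).
UNIT_BYTES = {"kB": 1024, "8kB": 8 * 1024, "MB": 1024 ** 2}


def to_bytes(value, unit):
    """Convert a (setting, unit) pair to a size in bytes."""
    return value * UNIT_BYTES[unit]


def to_mib(value, unit):
    """Convert a (setting, unit) pair to a size in MiB."""
    return to_bytes(value, unit) / 1024 ** 2


if __name__ == "__main__":
    for name, (value, unit) in SETTINGS.items():
        print(f"{name}: {to_mib(value, unit):.0f} MiB")
```

Under these assumptions shared_buffers comes to 6144 MiB (6 GiB), which matches the container's 6gb shm_size if that figure is read as GiB, and work_mem comes to 500 MiB.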

