From: | Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Segfault in jit tuple deforming on arm64 due to LLVM issue |
Date: | 2024-08-26 14:16:41 |
Message-ID: | CAO6_XqqFuE7eo1kq58eieF4UGYFe89KD0Uab4UxVNFsk1-HqgA@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Aug 26, 2024 at 4:33 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> IIUC this one is a random and rare crash depending on malloc() and
> perhaps also the working size of your virtual memory dart board.
> (Annoyingly, I had tried to reproduce this quite a few times on small ARM
> systems when earlier reports came in, d'oh!).
allocateMappedMemory, which is used when creating sections, eventually
calls mmap[1], not malloc. So the amount of shared memory configured
may be a factor in triggering the issue.
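To illustrate why mmap placement matters, here is a minimal sketch in C of measuring how far apart a module's sections ended up. The names section_range, section_span and sections_within_reach are hypothetical, not LLVM or PostgreSQL APIs, and the 4GB limit is an assumption based on the reach of an AArch64 ADRP instruction; the limit that actually applies depends on the relocations LLVM emits.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit: an AArch64 ADRP can reach +/- 4GB from the referencing page. */
#define MAX_SECTION_SPAN (UINT64_C(1) << 32)

/* Hypothetical description of one mapped section. */
typedef struct
{
    uintptr_t start;    /* address returned by the allocator */
    size_t    size;     /* size of the section in bytes */
} section_range;

/* Distance from the lowest section start to the highest section end. */
static uint64_t
section_span(const section_range *s, size_t n)
{
    uintptr_t lo = UINTPTR_MAX;
    uintptr_t hi = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (s[i].start < lo)
            lo = s[i].start;
        if (s[i].start + s[i].size > hi)
            hi = s[i].start + s[i].size;
    }
    return n == 0 ? 0 : (uint64_t) (hi - lo);
}

/* True if every section can reach every other one under the assumed limit. */
static bool
sections_within_reach(const section_range *s, size_t n)
{
    return section_span(s, n) < MAX_SECTION_SPAN;
}
```

With a large shared memory mapping in the address space, separately mmap'ed sections are more likely to land on either side of it, which is when a check like this would fail.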
My first attempts to reproduce the issue from scratch weren't
successful either. However, trying again with different values of
shared_buffers, I've managed to trigger the issue somewhat reliably.
On a clean Ubuntu jammy install, I've compiled the current PostgreSQL
REL_14_STABLE (6bc2bfc3) with the following options:
CLANG=clang-14 ../configure --enable-cassert --enable-debug --prefix ~/.local/ --with-llvm
Set "shared_buffers = '4GB'" in the configuration. More may be needed
but 4GB was enough for me.
Create a table with multiple partitions with pgbench. The goal is to
have a jit module big enough to trigger the issue.
pgbench -i --partitions=64
Then run the following query with jit forcefully enabled:
psql options=-cjit_above_cost=0 -c 'SELECT count(bid) from pgbench_accounts;'
If the issue is triggered, the backend should either segfault or get
stuck in an infinite loop.
> Ultimately, if it doesn't work, and doesn't get fixed, it's hard for
> us to do much about it. But hmm, this is probably madness... I wonder
> if it would be feasible to detect address span overflow ourselves at a
> useful time, as a kind of band-aid defence...
There's a possible alternative, but it's definitely in the same
category as the hot-patching idea. llvmjit uses
LLVMOrcCreateRTDyldObjectLinkingLayerWithSectionMemoryManager to
create the ObjectLinkingLayer, which is then created with the default
SectionMemoryManager[2]. It should be possible to provide a modified
SectionMemoryManager that allocates all sections in a single block,
and this could be restricted to the arm64 architecture. A part of me
tells me this is probably a bad idea but, on the other hand, LLVM
provides this way to plug in a custom allocator and it would fix the
issue...
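To illustrate the "single block" idea, here is a minimal sketch in plain C of reserving one contiguous mapping and carving sections out of it. This is not the LLVM SectionMemoryManager interface; section_block and the block_* functions are hypothetical names, and a real memory manager would also need to set per-section page permissions (for example with mprotect) before the code is executed.

```c
#define _DEFAULT_SOURCE     /* for MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical bump allocator over a single mmap'ed region. */
typedef struct
{
    uint8_t *base;      /* start of the reserved block */
    size_t   size;      /* total bytes reserved */
    size_t   used;      /* bytes handed out so far */
} section_block;

/* Reserve one contiguous read/write mapping for all sections of a module. */
static int
block_reserve(section_block *b, size_t total)
{
    void *p = mmap(NULL, total, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;
    b->base = p;
    b->size = total;
    b->used = 0;
    return 0;
}

/* Carve an aligned section out of the block; align must be a power of two. */
static void *
block_alloc_section(section_block *b, size_t size, size_t align)
{
    size_t off = (b->used + align - 1) & ~(align - 1);

    if (off > b->size || size > b->size - off)
        return NULL;    /* block exhausted */
    b->used = off + size;
    return b->base + off;
}

/* Give the whole block back to the kernel. */
static void
block_release(section_block *b)
{
    if (b->base != NULL)
        munmap(b->base, b->size);
    b->base = NULL;
    b->size = 0;
    b->used = 0;
}
```

Because every section comes from the same mapping, the distance between any two of them is bounded by the size passed to block_reserve, regardless of where mmap places the block.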
[1] https://github.com/llvm/llvm-project/blob/release/14.x/llvm/lib/Support/Unix/Memory.inc#L115-L117
[2] https://github.com/llvm/llvm-project/blob/release/14.x/llvm/lib/ExecutionEngine/Orc/OrcV2CBindings.cpp#L967-L973