Re: Segfault in jit tuple deforming on arm64 due to LLVM issue

From: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Segfault in jit tuple deforming on arm64 due to LLVM issue
Date: 2024-08-29 12:18:01
Message-ID: CAO6_XqqxEQ=JY+tYO-KQn3_pKQ3O-mPojcwG54L5eptiu1cSJQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Aug 28, 2024 at 12:24 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> 2. I tested against LLVM 10-18, and found that 10 and 11 lack some
> needed symbols. So I just hid this code from them. Even though our
> stable branches support those and even older versions, I am not sure
> if it's worth trying to do something about that for EOL'd distros that
> no one has ever complained about. I am willing to try harder if
> someone thinks that's important...

I would also assume that people using arm64 are more likely to use
recent versions than not.

I've done some additional tests on different LLVM versions with both
the unpatched version (to make sure the crash was triggered) and the
patched version. I'm attaching the test scripts I used for reference.
They target a Kubernetes pod since that was the easiest way for me to
get an Ubuntu Jammy test environment:
- setup_pod.sh: installs the necessary packages, gets multiple LLVM
versions, then fetches and compiles the master and patched versions of
postgres against each LLVM version
- run_test.sh: goes through all LLVM versions for both unpatched and
patched postgres and runs test_script.sh
- test_script.sh: runs inside the pod to set up the db with the
necessary tables and check whether the crash happens
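
For readers without the attachments, the test matrix described above can
be sketched roughly as follows. This is not the attached run_test.sh:
the function names run_one and run_matrix are hypothetical, and run_one
only echoes what it would do instead of executing test_script.sh inside
the pod. The version list is taken from the output further down.

```shell
#!/bin/sh
# Hypothetical sketch of the test matrix; NOT the attached run_test.sh.

# Stand-in for "run test_script.sh in the pod for this build and LLVM version".
# It only prints what would be tested (assumption: the real script does more).
run_one() {
    build="$1"
    version="$2"
    echo "would test $build build against LLVM $version"
}

# Iterate over both builds and every LLVM version that was tested.
run_matrix() {
    for build in unpatched patched; do
        for version in 19 18 17 16 15 14 13; do
            run_one "$build" "$version"
        done
    done
}

run_matrix
```

Keeping the build in the outer loop matches the grouping of the output
below: all unpatched results first, then all patched results.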

This generated the following output:
Test unpatched version on LLVM 19, : Crash triggered
Test unpatched version on LLVM 18, libLLVM-18.so.18.1: Crash triggered
Test unpatched version on LLVM 17, libLLVM-17.so.1: Crash triggered
Test unpatched version on LLVM 16, libLLVM-16.so.1: Crash triggered
Test unpatched version on LLVM 15, libLLVM-15.so.1: Crash triggered
Test unpatched version on LLVM 14, libLLVM-14.so.1: Crash triggered
Test unpatched version on LLVM 13, libLLVM-13.so.1: Crash triggered

Test patched version on LLVM 19, : Query ran successfully
Test patched version on LLVM 18, libLLVM-18.so.18.1: Query ran successfully
Test patched version on LLVM 17, libLLVM-17.so.1: Query ran successfully
Test patched version on LLVM 16, libLLVM-16.so.1: Query ran successfully
Test patched version on LLVM 15, libLLVM-15.so.1: Query ran successfully
Test patched version on LLVM 14, libLLVM-14.so.1: Query ran successfully
Test patched version on LLVM 13, libLLVM-13.so.1: Query ran successfully

I print the libLLVM linked to llvm.jit in the output to double check
that I'm testing against the correct version. The LLVM 19 package only
provides static libraries (probably because it's still a release
candidate?), so it shows as empty in the output. There was no LLVM 12
available when using the llvm.sh script, so I couldn't test it. As for
the results, the unpatched PG builds all crashed as expected, while the
patched builds were able to run the query successfully.
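
One way to get the linked libLLVM name is to filter `ldd` output, as in
the minimal sketch below. The helper name llvm_soname is hypothetical,
and the usage line assumes the JIT provider is installed as llvmjit.so
in the directory reported by `pg_config --pkglibdir`; the attached
scripts may do this differently. With a statically linked LLVM there is
no libLLVM line in the `ldd` output, so the helper prints nothing,
which is consistent with the empty field in the LLVM 19 rows.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and print the first
# library name starting with "libLLVM". Prints nothing if none is found
# (for example when LLVM is linked statically).
llvm_soname() {
    awk '$1 ~ /^libLLVM/ { print $1; exit }'
}

# Usage (assumes llvmjit.so lives in pkglibdir):
#   ldd "$(pg_config --pkglibdir)/llvmjit.so" | llvm_soname
```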

> Next, I think we should wait to see if the LLVM project commits that
> PR, this so that we can sync with their 19.x stable branch, instead of
> using code from a PR. Our next minor release is in November, so we
> have some time. If they don't commit it, we can consider it anyway: I
> mean, it's crashing all over the place in production, and we see that
> other projects are shipping this code already.

The PR[1] just received an approval and it sounds like they are ok to
eventually merge it.

[1] https://github.com/llvm/llvm-project/pull/71968

Attachment       Content-Type   Size
test_script.sh   text/x-sh      974 bytes
run_test.sh      text/x-sh      1.2 KB
setup_pod.sh     text/x-sh      2.8 KB
