Re: BUG #16971: Incompatible datalayout errors with llvmjit

From: Andres Freund <andres(at)anarazel(dot)de>
To: tstellar(at)redhat(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: BUG #16971: Incompatible datalayout errors with llvmjit
Date: 2021-04-20 19:29:37
Message-ID: 20210420192937.3zu4wpdemxwfvo4u@alap3.anarazel.de
Lists: pgsql-bugs

Hi,

Thanks to tgl for pointing me to this thread...

On 2021-04-19 18:29:52 +0000, PG Bug reporting form wrote:
> In our Fedora builds, we are getting errors[1] in the postgresql tests due
> to incompatible datalayouts between the JIT engine and the LLVM modules
> being compiled. The problem is that the JIT engine is being created with
> host specific CPU and features, while the datalayout for the compiled module
> is being taken from llvmjit_types.bc which is compiled without any specified
> CPU type or features.

It's very odd that features would change the data layout - analogizing
with plain C code that'd mean that you cannot link a binary compiled
with something like -mavx2 against a library compiled without. To me
this smells like a bug somewhere lower level.

Reformatting the error yields:
ERROR: failed to JIT module: Added modules have incompatible data layouts:
E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64 (module) vs
E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64 (jit)

The -v128:64 is about how to align vectors. Skimming the relevant LLVM
code, I don't see why it'd be included in JITed code but not native code.
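
To make the difference between the two strings easier to see, here is a
minimal sketch that splits each layout on '-' and reports the fields present
on only one side. The function names are made up for illustration; this is
plain string handling, not anything from LLVM or the PG tree.

```python
from __future__ import annotations


def layout_fields(layout: str) -> list[str]:
    """Split an LLVM data layout string into its '-'-separated specifications."""
    return [field for field in layout.split("-") if field]


def layout_diff(module_layout: str, jit_layout: str) -> tuple[list[str], list[str]]:
    """Return (only_in_module, only_in_jit), preserving the original field order."""
    module_fields = layout_fields(module_layout)
    jit_fields = layout_fields(jit_layout)
    only_in_module = [f for f in module_fields if f not in jit_fields]
    only_in_jit = [f for f in jit_fields if f not in module_fields]
    return only_in_module, only_in_jit


# The two layouts from the error message, written without any padding.
MODULE_LAYOUT = "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64"
JIT_LAYOUT = "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64"

if __name__ == "__main__":
    only_in_module, only_in_jit = layout_diff(MODULE_LAYOUT, JIT_LAYOUT)
    print("only in module:", only_in_module)
    print("only in jit:   ", only_in_jit)
```

For these two strings the only field that differs is v128:64, which is
present on the JIT side and absent from the module.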

Just to be sure, I take it that
checking for llvm-config... /usr/bin/llvm-config
checking for clang... /usr/bin/clang
are for llvm & clang compiled from the same code?

I temporarily had access to s390x to debug an unrelated issue in the
past, and there I did hit this problem, but IIRC only because of clang
vs llvm version mismatches.
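
A rough way to check for that kind of mismatch on the build host, as a
sketch only: it compares the major versions reported by `llvm-config
--version` and `clang --version`. Matching major versions do not prove both
were built from the same source, and the helper names here are hypothetical.

```python
import re
import shutil
import subprocess
from typing import Optional


def major_version(text: str) -> Optional[int]:
    """Extract the major version from the first dotted version number in text."""
    match = re.search(r"(\d+)\.\d+", text)
    return int(match.group(1)) if match else None


def tool_major_version(tool: str) -> Optional[int]:
    """Run `<tool> --version`; return its major version, or None if the tool is missing."""
    path = shutil.which(tool)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return major_version(result.stdout)


if __name__ == "__main__":
    llvm = tool_major_version("llvm-config")
    clang = tool_major_version("clang")
    print(f"llvm-config major: {llvm}, clang major: {clang}")
    if llvm is not None and clang is not None and llvm != clang:
        print("WARNING: clang and LLVM major versions differ")
```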

Unfortunately it is a bit hard to debug without access to an s390x
box... Can you provide that? If not I might ask the Debian folks for
access to one of their porter machines.

FWIW, the Debian s390x build seems to succeed at the moment:
https://buildd.debian.org/status/fetch.php?pkg=postgresql-13&arch=s390x&ver=13.2-1&stamp=1613044202&raw=0

> One way to fix this would be to add -march=native to the %.bc rules in
> src/Makefile.global.in. However, this will only work when the build system
> and the run system are the same.

> I think to fix this correctly, the llvmjit_types.bc file will need to be
> compiled when the JIT engine is initialized at runtime, so that it can use
> the same datalayout as the JIT engine.

That'd require headers to be present, which I don't think we should
require... And more importantly, it'd not go very far, because we also
have a lot of other .bc files that will have the layout embedded (for
inlining functions/operators into JITed code). Which'd then require all
the source code to be present and to be compiled into bitcode. Not an,
uh, satisfying option.

> [1] https://kojipkgs.fedoraproject.org//work/tasks/2182/66082182/build.log

Random thing I noticed while scrolling through the log:
> configure: WARNING: unrecognized options: --disable-dependency-tracking

PG requires dependency tracking to be explicitly enabled, and it's a
different flag name (--enable-depend).

Greetings,

Andres Freund
