Re: [RFC] building postgres with meson

From: Josef Šimánek <josef(dot)simanek(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] building postgres with meson
Date: 2021-10-12 15:21:50
Message-ID: CAFp7QwrqaRkMju9FHfZ35b2Xz0ATa035Bdw1ZXgrJ4xpr3TN4g@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 12, 2021 at 10:37, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> For the last year or so I've on and off tinkered with $subject. I think
> it's in a state worth sharing now. First, let's look at a little
> comparison.
>
> My workstation:
>
> non-cached configure:
> current: 11.80s
> meson: 6.67s
>
> non-cached build (world-bin):
> current: 40.46s
> ninja: 7.31s
>
> no-change build:
> current: 1.17s
> ninja: 0.06s
>
> test world:
> current: 105s
> meson: 63s
>
>
> What actually started to motivate me however were the long times windows
> builds took to come back with test results. On CI, with the same machine
> config:
>
> build:
> current: 202s (doesn't include genbki etc)
> meson+ninja: 140s
> meson+msbuild: 206s
>
>
> test:
> current: 1323s (many commands)
> meson: 903s (single command)
>
> (note that the test comparison isn't quite fair - there's a few tests
> missing, but it's just small contrib ones afaik)
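
To put the quoted timings on a common scale, the ratios can be computed directly. This is only a convenience sketch: the `speedup` helper is a made-up name, and the numbers are copied from the tables above, not re-measured.

```shell
#!/bin/sh
# speedup OLD NEW: print how many times faster NEW is than OLD (hypothetical helper).
speedup() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2f\n", old / new }'
}

# label, current seconds, meson/ninja seconds (figures taken from the mail above)
while read -r label old new; do
    printf '%-22s %sx\n' "$label" "$(speedup "$old" "$new")"
done <<'EOF'
configure 11.80 6.67
clean-build 40.46 7.31
no-change-build 1.17 0.06
test-world 105 63
windows-build-ninja 202 140
windows-build-msbuild 202 206
windows-test 1323 903
EOF
```

The clean build comes out at roughly 5.5x and the no-change build at nearly 20x, while the msbuild backend lands slightly below 1x, i.e. marginally slower than the current Windows build.
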
>
>
> The biggest difference to me however is not the speed, but how readable
> the output is.
>
> Running the tests with meson in a terminal shows the number of tests
> that completed out of how many total, how much time has passed, and how
> long the currently running tests have already been running.
>
> At the end of a testrun a count of tests is shown:
>
> 188/189 postgresql:tap+pg_basebackup / pg_basebackup/t/010_pg_basebackup.pl OK 39.51s 110 subtests passed
> 189/189 postgresql:isolation+snapshot_too_old / snapshot_too_old/isolation OK 62.93s
>
>
> Ok: 188
> Expected Fail: 0
> Fail: 1
> Unexpected Pass: 0
> Skipped: 0
> Timeout: 0
>
> Full log written to /tmp/meson/meson-logs/testlog.txt
>
>
> The log has the output of the tests and ends with:
>
> Summary of Failures:
> 120/189 postgresql:tap+recovery / recovery/t/007_sync_rep.pl ERROR 7.16s (exit status 255 or signal 127 SIGinvalid)
>
>
> Quite the difference to make check-world -jnn output.
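
Because the summary block quoted above has one "Label: count" pair per line, a script can pick out a single counter without much effort. The following is a rough sketch that assumes the log keeps exactly that layout; `summary_count` is a hypothetical helper, and the sample text is copied from the quoted output rather than from a real log file.

```shell
#!/bin/sh
# summary_count LABEL: read a meson test summary on stdin and print the
# count for LABEL (0 if the label is absent). Hypothetical helper.
summary_count() {
    awk -F': *' -v label="$1" '
        { sub(/^[ \t]+/, "") }          # tolerate indented summary lines
        $1 == label { n = $2 }          # exact match, so "Fail" != "Expected Fail"
        END { print n + 0 }
    '
}

# Sample copied from the summary quoted above.
sample='Ok: 188
Expected Fail: 0
Fail: 1
Unexpected Pass: 0
Skipped: 0
Timeout: 0'

failures=$(printf '%s\n' "$sample" | summary_count Fail)
[ "$failures" -eq 0 ] || echo "$failures test(s) failed"
```

Against a real run, the same function could be fed the log file named in the output (for example /tmp/meson/meson-logs/testlog.txt in the quoted session).
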
>
>
> So, now that the teasing is done, let me explain a bit what led me down
> this path:
>
> Autoconf + make is not being actively developed. Especially autoconf is
> *barely* in maintenance mode - despite many shortcomings and bugs. It's
> also technology that very few want to use - autoconf m4 is scary, and
> it's scarier for people that started more recently than a lot of us
> committers for example.
>
> Recursive make as we use it is hard to get right. One reason the clean
> make build is so slow compared to meson is that we had to resort to
> .NOTPARALLEL to handle dependencies in a bunch of places. And despite
> that, I quite regularly see incremental build failures that can be
> resolved by retrying the build.
>
> While we have incremental builds via --enable-depend, they don't work
> that reliably (i.e. they miss necessary rebuilds) and yet are often too
> aggressive. More modern build systems can keep track of the precise
> command used to build a target and rebuild it when that command changes.
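
To illustrate the idea of rebuilding when the command changes, here is a minimal sketch in plain shell. It is not how meson or ninja implement it (as far as I know, ninja keeps a hash of each command line in its build log); the `build` helper and the `.cmd` stamp file are invented for this example, and the usual timestamp checks are left out on purpose.

```shell
#!/bin/sh
# Sketch only: decide whether to rebuild from the recorded command line.
# Timestamp/dependency checks are deliberately left out.

# build TARGET COMMAND...  (hypothetical helper, not part of meson or ninja)
build() {
    target=$1
    shift
    stamp=$target.cmd            # remembers the command that produced TARGET

    # Up to date only if the target exists and was built by this exact command.
    if [ -e "$target" ] && [ -e "$stamp" ] && [ "$(cat "$stamp")" = "$*" ]; then
        echo "up to date: $target"
        return 0
    fi

    if "$@"; then
        printf '%s\n' "$*" > "$stamp"
        echo "rebuilt $target"
    else
        echo "failed: $target" >&2
        return 1
    fi
}

# Usage in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
echo 'int main(void) { return 0; }' > in.c
build out.c cp in.c out.c       # first build
build out.c cp in.c out.c       # same command: nothing to do
build out.c cp -p in.c out.c    # command changed: rebuilt
```

A make rule that only compares file timestamps would treat the third call as up to date, which is the kind of missed rebuild described above.
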
>
>
> We also don't just have the autoconf / make buildsystem, there's also
> the msvc project generator - something most of us unix-y folks do not
> like to touch. I think that, combined with there being no easy way to
> run all tests, and it being just different, really hurt our windows
> developer appeal (and subsequently the quality of postgres on
> windows). I'm not saying this to ding the project generator - that was
> well before there were decent "meta" buildsystems out there (and in some
> ways it is a small one itself).
>
>
> The last big issue I have with the current situation is that there's no
> good test integration. make check-world output is essentially unreadable
> / not automatically parseable. Which led to the buildfarm having a
> separate list of things it needs to test, so that failures can be
> pinpointed and paired with appropriate logs. That approach unfortunately
> doesn't scale well to multi-core CPUs, slowing down the buildfarm by a
> fair bit.
>
>
> This all led me to experiment with improvements. I tried a few
> somewhat crazy but incremental things like converting our buildsystem to
> non-recursive make (I got it to build the backend, but it's too hard to
> do manually I think), or to not run tests during the recursive make
> check-world, but to append commands to a list of tests, that then is run
> by a helper (can kinda be made to work). In the end I concluded that
> the amount of time we'd need to invest to maintain our more-and-more
> custom buildsystem going forward doesn't make sense.
>
>
> Which led me to look around and analyze which other buildsystems there
> are that could make some sense for us. The halfway decent list includes,
> I think:
> 1) cmake
> 2) bazel
> 3) meson
>
>
> cmake would be a decent choice, I think. However, I just can't fully
> warm up to it. Something about it just doesn't quite sit right with
> me. That's not a good enough reason to prevent others from suggesting to
> use it, but it's good enough to justify not investing a lot of time in
> it myself.
>
> Bazel has some nice architectural properties. But it requires a JVM to
> run - I think that basically makes it unsuitable for us. And the build
> information seems quite arduous to maintain too.
>
> Which left me with meson. It is a meta-buildsystem that can do the
> actual work of building via ninja (the most common one, also targeted by
> cmake), msbuild (visual studio project files, important for GUI work)
> and xcode projects (I assume that's for a macos IDE, but I haven't tried
> to use it). Meson roughly does what autoconf+automake did, in a
> python-esque DSL, and outputs build-instructions for ninja / msbuild /
> xcode. One interesting bit is that meson itself is written in python
> (and fairly easy to contribute to - I got a few changes in now).
>
>
> I don't think meson is perfect architecturally - e.g. its insistence on
> not having functions makes it a bit harder to avoid duplicating code.
> There are some user-interface oddities that are now hard to fix fully,
> due to the fairly wide usage. But all-in-all it's pretty nice to use.
>
>
> It's worth calling out that a lot of large open source projects have been
> / are migrating to meson. qemu/kvm, mesa (core part of graphics stack on
> linux and also widely used in other platforms), a good chunk of GNOME,
> and quite a few more. Due to that it seems unlikely to be abandoned
> soon.
>
>
> As far as I can tell the only OS that postgres currently supports that
> meson doesn't support is HPUX. It'd likely be fairly easy to add
> gcc-on-hpux support, and a chunk more work to add support for the
> proprietary compilers.
>
>
> The attached patch (meson support is 0016, the rest is prerequisites
> that aren't that interesting at this stage) converts most of postgres to
> meson. There's a few missing contrib modules, only about half the
> optional library dependencies are implemented, and I've only built on
> x64. It builds on freebsd, linux, macos and windows (both ninja and
> msbuild) and cross builds from linux to windows. Thomas helped make the
> freebsd / macos pieces a reality, thanks!
>
> I took a number of shortcuts (although there used to be a *lot*
> more). So this shouldn't be reviewed to the normal standard of the
> community - it's a prototype. But I think it's in a complete enough
> shape that it allows a well-informed evaluation.
>
> What doesn't yet work / build:
>
> - plenty of optional libraries, contrib, NLS, docs build
>
> - PGXS - and I don't yet know what to best do about it. One
> backward-compatible way would be to continue to use makefiles for pgxs,
> but do the necessary replacement of Makefile.global.in via meson (and
> not use that for postgres' own build). But that doesn't really
> provide a nicer path for building postgres extensions on windows, so
> it'd definitely not be a long-term path.
>
> - JIT bitcode generation for anything but src/backend.
>
> - anything but modern-ish x86. That's probably a small amount of work,
> but something that needs to be done.
>
> - exporting all symbols for extension modules on windows (the stuff for
> postgres is implemented). Instead I marked the relevant symbols as
> __declspec(dllexport). I think we should do that regardless of the
> buildsystem change. Restricting symbol visibility via gcc's
> -fvisibility=hidden for extensions results in a substantially reduced
> number of exported symbols, and even reduces object size (and I think
> improves the code too). I'll send an email about that separately.
>
>
>
>
> There's a lot more stuff to talk about, but I'll stop with a small bit
> of instructions below:
>
>
> Demo / instructions:
> # Get code
> git remote add andres git@github.com:anarazel/postgres.git
> git fetch andres
> git checkout --track andres/meson
>
> # setup build directory
> meson setup build --buildtype debug
> cd build
>
> # build (automatically uses as many cores as available)
> ninja

I'm getting errors at this step. You can find my output at
https://pastebin.com/Ar5VqfFG. Setup went well without errors. Is that
expected for now?

> # change configuration, build again
> meson configure -Dssl=openssl
> ninja
>
> # run all tests
> meson test
>
> # run just recovery tests
> meson test --suite setup --suite recovery
>
> # list tests
> meson test --list
>
>
> Greetings,
>
> Andres Freund
