Re: Speed up clean meson builds by ~25%

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up clean meson builds by ~25%
Date: 2024-04-09 22:33:10
Message-ID: 20240409223310.q7o4tctltsnqcm4j@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2024-04-09 17:13:52 +1200, Thomas Munro wrote:
> On Tue, Apr 9, 2024 at 5:01 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> > On Mon, Apr 08, 2024 at 12:23:56PM +0300, Nazir Bilal Yavuz wrote:
> > > On Mon, 8 Apr 2024 at 11:00, Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:
> > >> As I wrote in [1], I didn't observe the issue with clang-18, so maybe it
> > >> is fixed already.
> > >> Perhaps it's worth rechecking...
> > >>
> > >> [1] https://www.postgresql.org/message-id/d2bf3727-bae4-3aee-65f6-caec2c4ebaa8%40gmail.com
> > >
> > > I had this problem on my local computer. My build times are:
> > >
> > > gcc: 20s
> > > clang-15: 24s
> > > clang-16: 105s
> > > clang-17: 111s
> > > clang-18: 25s
> >
> > Interesting. A parallel build of ecpg shows similar numbers here:
> > clang-16: 101s
> > clang-17: 112s
> > clang-18: 14s
> > gcc: 10s
>
> I don't expect it to get fixed BTW, because it's present in 16.0.6,
> and .6 is the terminal release, if I understand their system
> correctly. They're currently only doing bug fixes for 18, and even
> there not for much longer. Interesting that not everyone saw this at
> first, perhaps the bug arrived in a minor release that some people
> didn't have yet? Or perhaps there is something special required to
> trigger it?

I think we need to do something about the compile time of this file, even with
gcc. Our main grammar already is an issue and stacking all the ecpg stuff on
top makes it considerably worse.

ISTM there's a bunch of pretty pointless stuff in the generated preproc.y,
which does seem to have some impact on compile time. E.g. a good bit of the
file is just stuff like

reserved_keyword:
    ALL
    {
        $$ = mm_strdup("all");
    }
    ...

Why are we strduping all of these? We could just use the value of the token
instead, rather than forcing the compiler to generate branches for all the
individual keywords etc.

I don't know off-hand if the keyword lookup machinery ends up with an
uppercase keyword, but if so, that'd be easy enough to change.
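
To make the idea concrete, here is a minimal C sketch of "use the value of the
token": one shared code path builds the string from whatever text the scanner
matched, folding it to lower case in case the lookup hands back an uppercase
keyword. The helper name keyword_token_value() is hypothetical and not part of
ecpg, and plain malloc() stands in for ecpg's own allocation wrappers.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not an ecpg function): build the semantic value of a
 * keyword token from the text the scanner matched.  The text is folded to
 * lower case so the result matches what the per-keyword actions produce,
 * e.g. "ALL" becomes "all".  Returns NULL if allocation fails.
 */
static char *
keyword_token_value(const char *matched_text)
{
    size_t      len = strlen(matched_text);
    char       *result = malloc(len + 1);

    if (result == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++)
        result[i] = (char) tolower((unsigned char) matched_text[i]);
    result[len] = '\0';

    return result;
}
```

With something like this called once where the token is produced, the keyword
rules would need no action of their own, so the parser would carry one
allocation call site instead of one per keyword.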

It looks to me like the many calls to mm_strdup() might actually be what's
driving clang nuts. I hacked up preproc.y to not need those calls for
    unreserved_keyword
    col_name_keyword
    type_func_name_keyword
    reserved_keyword
    bare_label_keyword
by removing the actions and defining those tokens to be of type str. There are
many more such calls that could be dealt with similarly.

That alone reduced compile times with
    clang-16 -O1 from  18.268s to  12.516s
    clang-16 -O2 from 345.188s to 158.084s
    clang-19 -O2 from  26.018s to  15.200s

I suspect what is happening is that clang tries to reduce the number of call
sites of mm_strdup() by separating the argument setup from the function
call. That leads to a control flow graph with *many* incoming edges to the
basic block containing the function call to mm_strdup(), triggering some
normally harmless O(N^2) behavior or such.
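
For illustration only, here is a small C sketch of the two code shapes involved.
It is not the generated parser: the rule numbers, dup_string() (a stand-in for
mm_strdup()) and both function names are made up, and whether clang really
merges the calls this way is the suspicion above, not something this code
demonstrates.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for mm_strdup(); returns NULL on allocation failure. */
static char *
dup_string(const char *s)
{
    size_t      len = strlen(s) + 1;
    char       *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Hypothetical rule numbers; the real parser has hundreds of such cases. */
enum
{
    RULE_ALL,
    RULE_ANALYSE,
    RULE_AND,
    RULE_COUNT
};

/*
 * Shape A: one call per rule, like a per-keyword grammar action.  Every case
 * differs only in its argument, so an optimizer may try to share one call
 * block among all cases, giving that block one incoming edge per case.
 */
static char *
action_per_rule(int rule)
{
    char       *val = NULL;

    switch (rule)
    {
        case RULE_ALL:
            val = dup_string("all");
            break;
        case RULE_ANALYSE:
            val = dup_string("analyse");
            break;
        case RULE_AND:
            val = dup_string("and");
            break;
    }
    return val;
}

/* Shape B: the keyword text is data, so there is a single call site. */
static const char *const rule_text[RULE_COUNT] = {"all", "analyse", "and"};

static char *
action_from_table(int rule)
{
    if (rule < 0 || rule >= RULE_COUNT)
        return NULL;
    return dup_string(rule_text[rule]);
}
```

Both functions return the same strings; the difference is only in how many
branches and call sites the compiler has to reason about as the number of
keywords grows.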

Greetings,

Andres Freund
