From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: autovectorize page checksum code included elsewhere |
Date: | 2023-11-12 01:00:14 |
Message-ID: | 20231112010014.v3pjqh2aedjol7ck@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2023-11-06 20:47:34 -0600, Nathan Bossart wrote:
> Separately, I'm wondering whether we should consider using CFLAGS_VECTORIZE
> on the whole tree. Commit fdea253 seems to be responsible for introducing
> this targeted autovectorization strategy, and AFAICT this was just done to
> minimize the impact elsewhere while optimizing page checksums. Are there
> fundamental problems with adding CFLAGS_VECTORIZE everywhere? Or is it
> just waiting on someone to do the analysis/benchmarking?
Historically, vectorization sometimes ended up hurting in a bunch of
places. But I think that was in the gcc 4 era, which has long since
passed.
IME these days using -O3 yields decent improvements over -O2 when used
tree-wide. Even if there are perhaps a few isolated cases where the code is a
bit worse, they're far outweighed by the improved code.
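To make concrete what kind of code the vectorizer is aimed at, here is a minimal sketch of a loop with independent lanes, loosely in the spirit of a multi-lane checksum. The function name `lane_sums`, the lane count, and the FNV-style multiply are illustrative assumptions, not PostgreSQL's actual checksum code. Whether a given compiler vectorizes it depends on version, target, and flags; with gcc, `-fopt-info-vec` can be used to check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_LANES 32              /* hypothetical lane count */
#define MIX_PRIME 16777619u     /* FNV prime, used here only as an example */

/*
 * Hypothetical helper: fold `nrows` rows of N_LANES 32-bit words into
 * N_LANES running values. Each lane j depends only on its own previous
 * value, so the inner loop has no cross-iteration dependency and is a
 * natural candidate for autovectorization.
 */
void lane_sums(const uint32_t *data, size_t nrows, uint32_t sums[N_LANES])
{
    assert(data != NULL && sums != NULL);

    for (size_t j = 0; j < N_LANES; j++)
        sums[j] = 0;

    for (size_t i = 0; i < nrows; i++)
        for (size_t j = 0; j < N_LANES; j++)
            sums[j] = (sums[j] ^ data[i * N_LANES + j]) * MIX_PRIME;
}
```

Compiling this file once with `gcc -O2 -c` and once with `gcc -O3 -c` (or `-O2 -ftree-vectorize`) and comparing the generated assembly is one way to see the difference being discussed.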
Compile-time wise it's noticeably slower, but not catastrophically so. On an
older but decent laptop, while on battery:
O2:
800.29user 41.99system 0:59.17elapsed 1423%CPU (0avgtext+0avgdata 282324maxresident)k
152inputs+4408176outputs (95major+13359282minor)pagefaults 0swaps
O3:
911.80user 44.71system 1:06.79elapsed 1431%CPU (0avgtext+0avgdata 278660maxresident)k
82624inputs+4571480outputs (571major+14004898minor)pagefaults 0swaps
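For a sense of scale, the two runs above can be reduced to a relative slowdown. The sketch below only does the arithmetic on the numbers shown; the helper names `elapsed_seconds` and `pct_increase` are made up for illustration.

```c
#include <assert.h>

/* Convert a time(1)-style "minutes:seconds" elapsed value to seconds. */
double elapsed_seconds(int minutes, double seconds)
{
    assert(minutes >= 0 && seconds >= 0.0);
    return minutes * 60.0 + seconds;
}

/* Percentage by which `candidate` exceeds `baseline`. */
double pct_increase(double baseline, double candidate)
{
    assert(baseline > 0.0);
    return (candidate - baseline) / baseline * 100.0;
}
```

With the figures above, `pct_increase(elapsed_seconds(0, 59.17), elapsed_seconds(1, 6.79))` comes out at roughly 12.9 (wall-clock), and `pct_increase(800.29, 911.80)` at roughly 13.9 (user CPU time). This is a single run on one machine, so it is indicative only.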
Greetings,
Andres Freund