Re: CI and test improvements

From: Andres Freund <andres(at)anarazel(dot)de>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Noah Misch <noah(at)leadboat(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, samay sharma <smilingsamay(at)gmail(dot)com>
Subject: Re: CI and test improvements
Date: 2022-11-05 01:59:46
Message-ID: 20221105015946.yrxijqb7h4rqhp6d@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-11-04 18:54:12 -0500, Justin Pryzby wrote:
> Subject: [PATCH 1/8] meson: PROVE is not required
> Subject: [PATCH 3/8] meson: rename 'main' tasks to 'regress' and 'isolation'

Pushed, thanks for the patches.

> Subject: [PATCH 2/8] meson: other fixes for cygwin
>
> XXX: what about HAVE_BUGGY_STRTOF ?

What are you wondering about here? Shouldn't that continue to be set via
src/include/port/cygwin.h?

> diff --git a/src/test/regress/meson.build b/src/test/regress/meson.build
> index 3dcfc11278f..6ec3c77af53 100644
> --- a/src/test/regress/meson.build
> +++ b/src/test/regress/meson.build
> @@ -10,7 +10,7 @@ regress_sources = pg_regress_c + files(
> # patterns like ".*-.*-mingw.*". We probably can do better, but for now just
> # replace 'gcc' with 'mingw' on windows.
> host_tuple_cc = cc.get_id()
> -if host_system == 'windows' and host_tuple_cc == 'gcc'
> +if host_system in ['windows', 'cygwin'] and host_tuple_cc == 'gcc'
> host_tuple_cc = 'mingw'
> endif

This doesn't quite seem right - shouldn't it say cywin? Not that it makes a
difference right now, given the contents of resultmap:
float4:out:.*-.*-cygwin.*=float4-misrounded-input.out
float4:out:.*-.*-mingw.*=float4-misrounded-input.out

> From 0acbbd2fdd97bbafc5c4552e26f92d52c597eea9 Mon Sep 17 00:00:00 2001
> From: Justin Pryzby <pryzbyj(at)telsasoft(dot)com>
> Date: Wed, 25 May 2022 21:53:22 -0500
> Subject: [PATCH 4/8] cirrus/windows: add compiler_warnings_script
>
> I'm not sure how to write this test in windows shell; it's also not easy to
> write it in posix sh, since windows shell is somehow interpretting && and ||...
>
> https://www.postgresql.org/message-id/20220212212310.f645c6vw3njkgxka%40alap3.anarazel.de
>
> See also:
> 8a1ce5e54f6d144e4f8e19af7c767b026ee0c956
> https://cirrus-ci.com/task/6241060062494720
> https://cirrus-ci.com/task/6496366607204352
>
> ci-os-only: windows
> ---
> .cirrus.yml | 10 +++++++++-
> src/tools/ci/windows-compiler-warnings | 24 ++++++++++++++++++++++++
> 2 files changed, 33 insertions(+), 1 deletion(-)
> create mode 100755 src/tools/ci/windows-compiler-warnings
>
> diff --git a/.cirrus.yml b/.cirrus.yml
> index 9f2282471a9..99ac09dc679 100644
> --- a/.cirrus.yml
> +++ b/.cirrus.yml
> @@ -451,12 +451,20 @@ task:
>
> build_script: |
> vcvarsall x64
> - ninja -C build
> + ninja -C build |tee build/meson-logs/build.txt
> + REM Since pipes lose exit status of the preceding command, rerun compilation,
> + REM without the pipe exiting now if it fails, rather than trying to run checks
> + ninja -C build > nul

This seems mighty grotty :(. but I guess it's quick enough not worry about,
and I can't come up with a better plan.

It doesn't seem quite right to redirect into meson-logs/ to me, my
interpretation is that that's "meson's namespace". Why not just store it in
build/?

> From e85fe83fd8a4b4c79a96d2bf66cd6a5e1bdfcd1e Mon Sep 17 00:00:00 2001
> From: Justin Pryzby <pryzbyj(at)telsasoft(dot)com>
> Date: Sat, 26 Feb 2022 19:34:35 -0600
> Subject: [PATCH 5/8] cirrus: build docs as a separate task..
>
> This will run the doc build if any docs have changed, even if Linux
> fails, to allow catch doc build failures.
>
> This'll automatically show up as a separate "column" on cfbot.
>
> Also, in the future, this will hopefully upload each patch's changed HTML docs
> as an artifact, for easy review.
>
> Note that this is currently building docs with both autoconf and meson.
>
> ci-os-only: html
> ---
> .cirrus.yml | 62 +++++++++++++++++++++++++++++++++++++----------------
> 1 file changed, 44 insertions(+), 18 deletions(-)
>
> diff --git a/.cirrus.yml b/.cirrus.yml
> index 99ac09dc679..37fd79e5b77 100644
> --- a/.cirrus.yml
> +++ b/.cirrus.yml
> @@ -472,6 +472,9 @@ task:
> type: text/plain
>
>
> +###
> +# Test that code can be built with gcc/clang without warnings
> +###
> task:
> name: CompilerWarnings
>
> @@ -515,10 +518,6 @@ task:
> #apt-get update
> #DEBIAN_FRONTEND=noninteractive apt-get -y install ...
>
> - ###
> - # Test that code can be built with gcc/clang without warnings
> - ###
> -

Why remove this?

> setup_script: echo "COPT=-Werror" > src/Makefile.custom
>
> # Trace probes have a history of getting accidentally broken. Use the
> @@ -580,20 +579,6 @@ task:
> make -s -j${BUILD_JOBS} clean
> time make -s -j${BUILD_JOBS} world-bin
>
> - ###
> - # Verify docs can be built
> - ###
> - # XXX: Only do this if there have been changes in doc/ since last build
> - always:
> - docs_build_script: |
> - time ./configure \
> - --cache gcc.cache \
> - CC="ccache gcc" \
> - CXX="ccache g++" \
> - CLANG="ccache clang"
> - make -s -j${BUILD_JOBS} clean
> - time make -s -j${BUILD_JOBS} -C doc
> -
> ###
> # Verify headerscheck / cpluspluscheck succeed
> #
> @@ -617,3 +602,44 @@ task:
>
> always:
> upload_caches: ccache
> +
> +
> +###
> +# Verify docs can be built
> +# changesInclude() will skip this task if none of the commits since
> +# CIRRUS_LAST_GREEN_CHANGE touched any relevant files. The comparison appears
> +# to be like "git log a..b -- ./file", not "git diff a..b -- ./file"
> +###
> +
> +task:
> + name: Documentation
> +
> + env:
> + CPUS: 1
> + BUILD_JOBS: 1
> +
> + only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*(docs|html).*'
> + skip: "!changesInclude('.cirrus.yml', 'doc/**')"

Perhaps we should introduce something other than ci-os-only if we want that to
include things like "docs and html". At least this should update
src/tools/ci/README.

> + sysinfo_script: |
> + id
> + uname -a
> + cat /proc/cmdline
> + ulimit -a -H && ulimit -a -S
> + export

I think we can skip this here.

> + # Exercise HTML and other docs:
> + ninja_docs_build_script: |
> + mkdir build.ninja
> + cd build.ninja

Perhaps build-ninja instead? build.ninja is the filename for ninja's build
instructions, so it seems a bit confusing.

> From adebe93a4409990e929f2775d45c6613134a4243 Mon Sep 17 00:00:00 2001
> From: Justin Pryzby <pryzbyj(at)telsasoft(dot)com>
> Date: Tue, 26 Jul 2022 20:30:02 -0500
> Subject: [PATCH 6/8] cirrus/ccache: add explicit cache keys..
>
> Since otherwise, building with ci-os-only will probably fail to use the
> normal cache, since the cache key is computed using both the task name
> and its *index* in the list of caches (internal/executor/cache.go:184).

Seems like this would potentially better addressed by reporting a bug to the
cirrus folks?

> ccache_cache:
> folder: ${CCACHE_DIR}
> + fingerprint_key: ccache/linux
> + reupload_on_changes: true

There's enough copies of this to make it worth deduplicating. If we use
something like
fingerprint_script: echo ccache/$CIRRUS_BRANCH/$CIRRUS_OS
we can use a yaml ref?

I think you experimented with creating a 'base' ccache dir (e.g. on the master
branch) and then using branch specific secondar caches? How did that turn out?
I think cfbot's caches constantly get removed due to overrunning the global
space.

> From f16739bc5d2087847129baf663aa25fa9edb8449 Mon Sep 17 00:00:00 2001
> From: Justin Pryzby <pryzbyj(at)telsasoft(dot)com>
> Date: Sun, 3 Apr 2022 00:10:20 -0500
> Subject: [PATCH 7/8] cirrus/ccache: disable compression and show stats

> Since v4.0, ccache enables zstd compression by default, saving roughly
> 2x-3x. But, cirrus caches are compressed as tar.gz, so we could disable
> ccache compression, allowing cirrus to gzip the uncompressed data
> (better than ccache's default of zstd-1).
>
> With default compression enabled (https://cirrus-ci.com/task/6692342840164352)
> linux/debian/bullseye has 4.2; 99MB cirrus cache; cache_size_kibibyte 109616
> macos has 4.5.1: 47MB cirrus cache; cache_size_kibibyte 52500
> freebsd has 3.7.12: 42MB cirrus cache; cache_size_kibibyte 134064
> XXX windows has 4.7.2; 180MB cirrus cache; cache_size_kibibyte 51179
> todo: compiler warnings
>
> With compression disabled (https://cirrus-ci.com/task/4614182514458624)
> linux: 91MB cirrus cache; cache_size_kibibyte 316136
> macos: 41MB cirrus cache; cache_size_kibibyte 118068
> windows: 42MB cirrus cache; cache_size_kibibyte 134064
> freebsd is the same

I'm still somewhat doubtful this is a good idea. The mingw cache is huge, for
example, and all that additional IO and memory usage is bound to show up.

> The stats should be shown and/or logged.
> ccache --show-stats shows the *cumulative* stats (including prior
> compilations)
> ccache --zero-stats clears out not only the global stats, but the
> per-file cache stats (from which the global stats are derived) - which
> obviously makes the cache work poorly.
>
> Note that ccache 4.4 supports CCACHE_STATSLOG, which seems ideal.
> The log should be written *outside* the ccache folder - it shouldn't be
> preserved across cirrusci task invocations.

I assume we don't have a new enough ccache everywhere yet?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-11-05 02:41:46 Re: Add tracking of backend memory allocated to pg_stat_activity
Previous Message Andres Freund 2022-11-05 00:38:28 Re: Direct I/O