how to speed up 002_pg_upgrade.pl and 025_stream_regress.pl under valgrind

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: how to speed up 002_pg_upgrade.pl and 025_stream_regress.pl under valgrind
Date: 2024-09-15 18:20:01
Message-ID: a8472297-2975-4760-9757-2f0dc477371d@vondra.me
Lists: pgsql-hackers

Hi,

I've been doing a lot of tests under valgrind lately, and it made me
acutely aware of how long check-world takes. I realize valgrind is
inherently expensive and slow, and maybe the reasonable reply is to just
run a couple tests that are "interesting" for a patch ...

Anyway, I did a simple experiment - I ran check-world with timing info
for TAP tests, both with and without valgrind, and if I plot the results
I get the attached charts (same data, second one has log-scale axes).

The basic rule is that valgrind means a very consistent 100x slowdown. I
guess it might vary a bit depending on compile flags, but not much.

But two tests very clearly stand out - not by slowdown, which is
perfectly in line with the 100x figure - but by total duration. I've
labeled them on the linear-scale chart.
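
To make the distinction between "slowdown" and "total duration" concrete, here is a rough Python sketch of that kind of analysis. The helper names (summarize, longest_tests) and all the numbers in the sample are made-up placeholders for illustration, not the measurements behind the attached charts.

```python
# Illustrative sketch only. The timings below are placeholders, not the
# measurements from the check-world runs described in this mail.

def summarize(timings):
    """timings maps test name -> (regular_seconds, valgrind_seconds).

    Returns a list of (name, slowdown_factor, valgrind_seconds) tuples.
    """
    rows = []
    for name, (regular, valgrind) in timings.items():
        rows.append((name, valgrind / regular, valgrind))
    return rows


def longest_tests(timings, top_n=2):
    """Return the names of the top_n tests by total valgrind duration."""
    rows = summarize(timings)
    rows.sort(key=lambda row: row[2], reverse=True)
    return [name for name, _, _ in rows[:top_n]]


if __name__ == "__main__":
    # Hypothetical sample: every test has the same 100x slowdown, yet two
    # of them dominate the total run time.
    sample = {
        "002_pg_upgrade": (55.0, 5500.0),
        "027_stream_regress": (50.0, 5000.0),
        "hypothetical_small_test_a": (2.0, 200.0),
        "hypothetical_small_test_b": (0.5, 50.0),
    }
    for name, slowdown, total in summarize(sample):
        print(f"{name:28s} slowdown={slowdown:6.1f}x  valgrind={total / 60:6.1f} min")
    print("longest:", longest_tests(sample))
```

The point is that sorting by the slowdown factor would not single out any test, while sorting by absolute duration does.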

It's 002_pg_upgrade and 027_stream_regress. I guess the reasons for the
slowness are pretty clear - those are massive tests. pg_upgrade creates,
dumps and restores many objects (thousands?), stream_regress runs the
whole regress test suite on primary, and cross-checks what gets
replicated to standby. So it's somewhat expected that they are slow.

Still, I wonder if there might be a faster way to do these tests, because
these two tests alone add close to 3h to the valgrind run. Of course,
it's not just about valgrind - these tests are slow even in regular
runs, taking almost a minute each, but it's a different scale (minutes
instead of hours). Would be nice to speed it up too, though.

I don't have a great idea how to speed up these tests, unfortunately.
But one of the problems is that all the TAP tests run serially - one
after another. Could we instead run them in parallel? The tests set up
their "private" clusters anyway, right?
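
Just to illustrate the general idea (this is not how the build system runs TAP tests, and not a proposed patch), here is a minimal Python sketch of running independent test commands concurrently. The function names and the placeholder commands are assumptions made up for the example; real test invocations would have to be substituted, and it only works if the tests really share no state.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(cmd):
    """Run one test command and return (cmd, exit_code)."""
    proc = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return cmd, proc.returncode


def run_parallel(commands, jobs=4):
    """Run the commands with up to `jobs` running at once.

    Returns a list of (cmd, exit_code) in the same order as `commands`.
    """
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Placeholder commands standing in for real test scripts: each one
    # just sleeps for a second and exits successfully.
    commands = [
        [sys.executable, "-c", "import time; time.sleep(1)"]
        for _ in range(4)
    ]
    results = run_parallel(commands, jobs=4)
    failed = [cmd for cmd, rc in results if rc != 0]
    print(f"{len(results) - len(failed)} passed, {len(failed)} failed")
```

With enough free cores the wall-clock time is then bounded by the longest single test rather than by the sum of all of them, which is why the two big tests would still matter even with parallelism.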

regards

--
Tomas Vondra

Attachment Content-Type Size
image/png 30.2 KB
image/png 37.7 KB
