From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net> |
Subject: | Re: Robocopy might be not robust enough for never-ending testing on Windows |
Date: | 2024-09-16 06:00:00 |
Message-ID: | 8b724988-ba94-25b4-8064-068b6c4b0520@gmail.com |
Lists: | pgsql-hackers |
Hello Thomas,
14.09.2024 23:32, Thomas Munro wrote:
> On Sun, Sep 15, 2024 at 1:00 AM Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:
>> (That is, 0.1-0.2 MB leaks per one robocopy run.)
>>
>> I observed this on Windows 10 (Version 10.0.19045.4780), with all updates
>> installed, but not on Windows Server 2016 (10.0.14393.0). Moreover, using
>> robocopy v14393 on Windows 10 doesn't affect the issue.
> I don't understand Windows but that seems pretty weird to me, as it
> seems to imply that a driver or something fairly low level inside the
> kernel is leaking objects (at least by simple minded analogies to
> operating systems I understand better). Either that or robocopy.exe
> has userspace stuff involving at least one thread still running
> somewhere after it's exited, but that seems unlikely as I guess you'd
> have noticed that...
Yes, I see no robocopy process left after the test, and I think userspace
threads would not survive logoff.
> Just a thought, from surveying the block cloning landscape across
> OSes and filesystems while looking into clone-based CREATE DATABASE
> (CF #4886) and also while thinking about the new TAP test initdb
> template copy trick: robocopy.exe tries to use Windows' block
> cloning magic, just like cp on recent Linux and FreeBSD systems (at
> one point I was wondering if that was causing some funky extra flush
> stalls on some systems, I need to come back to that...). It probably
> doesn't actually work unless you have Windows 11 kernel with DevDrive
> enabled (from reading, no Windows here), but I guess it still probably
> uses the new system interfaces, probably something like CopyFileEx().
> Does it still leak if you use /nooffload or /noclone?
I tested the following (with the script above):
Windows 10 (Version 10.0.19045.4780):
robocopy.exe (10.0.19041.4717) /NOOFFLOAD
iteration 1
496611328
...
iteration 1000
609701888
That is, it leaks (about 0.1 MB per run).
/NOCLONE is not supported by that robocopy version:
ERROR : Invalid Parameter #1 : "/NOCLONE"
Then, Windows 11 (Version 10.0.22000.613), robocopy 10.0.22000.469:
iteration 1
141217792
...
iteration 996
151670784
...
iteration 997
152817664
...
iteration 1000
151674880
That is, it doesn't leak.
robocopy.exe /NOOFFLOAD
iteration 1
152666112
...
iteration 1000
153341952
No leak.
/NOCLONE is not supported by that robocopy version either.
Then I updated that Windows 11 to Version 10.0.22000.2538 (with KB5031358),
robocopy 10.0.22000.1516:
iteration 1
122753024
...
iteration 1000
244674560
It does leak.
robocopy /NOOFFLOAD
iteration 1
167522304
...
iteration 1000
283484160
It leaks as well.
Finally, I installed the newest Windows 11, Version 10.0.22631.4169, with
robocopy 10.0.22621.3672:
Non-paged pool increased from 133 to 380 MB after 1000 robocopy runs.
robocopy /NOOFFLOAD leaks too.
/NOCLONE is not supported by that robocopy version either.
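
The measurement script mentioned above is not reproduced in this message. Purely as an illustration, a loop of that kind could be sketched in Python as below: it runs robocopy repeatedly and prints the non-paged pool size after each run, in the same "iteration N / bytes" shape as the output quoted here. The source path, the fresh temporary destination per run, the `/E` flag, and all function names are assumptions and are not taken from the original script. `GetPerformanceInfo` and its `KernelNonpaged` field (counted in pages) are the documented Win32 interface for reading that counter.

```python
import ctypes
import shutil
import subprocess
import sys
import tempfile


class PerformanceInformation(ctypes.Structure):
    """Mirror of the Win32 PERFORMANCE_INFORMATION structure."""
    _fields_ = [
        ("cb", ctypes.c_uint32),
        ("CommitTotal", ctypes.c_size_t),
        ("CommitLimit", ctypes.c_size_t),
        ("CommitPeak", ctypes.c_size_t),
        ("PhysicalTotal", ctypes.c_size_t),
        ("PhysicalAvailable", ctypes.c_size_t),
        ("SystemCache", ctypes.c_size_t),
        ("KernelTotal", ctypes.c_size_t),
        ("KernelPaged", ctypes.c_size_t),
        ("KernelNonpaged", ctypes.c_size_t),  # reported in pages, not bytes
        ("PageSize", ctypes.c_size_t),
        ("HandleCount", ctypes.c_uint32),
        ("ProcessCount", ctypes.c_uint32),
        ("ThreadCount", ctypes.c_uint32),
    ]


def nonpaged_pool_bytes() -> int:
    """Return the current non-paged pool size in bytes (Windows only)."""
    psapi = ctypes.WinDLL("psapi", use_last_error=True)
    psapi.GetPerformanceInfo.argtypes = [
        ctypes.POINTER(PerformanceInformation), ctypes.c_uint32]
    psapi.GetPerformanceInfo.restype = ctypes.c_int
    info = PerformanceInformation()
    info.cb = ctypes.sizeof(info)
    if not psapi.GetPerformanceInfo(ctypes.byref(info), info.cb):
        raise ctypes.WinError(ctypes.get_last_error())
    return info.KernelNonpaged * info.PageSize


def run_iterations(src: str, count: int, extra_flags=()) -> None:
    """Copy src with robocopy `count` times, printing pool size after each run."""
    for i in range(1, count + 1):
        dst = tempfile.mkdtemp(prefix="robocopy_test_")  # hypothetical layout
        try:
            result = subprocess.run(
                ["robocopy", src, dst, "/E", *extra_flags],
                stdout=subprocess.DEVNULL,
            )
            # robocopy exit codes below 8 mean the copy itself succeeded.
            if result.returncode >= 8:
                raise RuntimeError(f"robocopy failed with code {result.returncode}")
        finally:
            shutil.rmtree(dst, ignore_errors=True)
        print(f"iteration {i}")
        print(nonpaged_pool_bytes())


if __name__ == "__main__" and sys.platform == "win32":
    # Placeholder source directory; replace with a real one before running.
    run_iterations(r"C:\path\to\template_dir", 1000, ["/NOOFFLOAD"])
```

Passing an empty flag list instead of `["/NOOFFLOAD"]` would correspond to the default robocopy runs above.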
So this leak looks like a recent defect that is still present.
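
To put the figures side by side, the first and last readings of each run can be turned into an average growth per robocopy invocation. The sketch below uses only the numbers quoted in this message and assumes they are byte counts of the non-paged pool (the message states this explicitly only for the last test). The labels and the helper name are illustrative.

```python
# (label, reading at iteration 1, reading at iteration 1000), copied from the
# results in this message; values are assumed to be bytes.
RUNS = [
    ("Windows 10 19045.4780, /NOOFFLOAD", 496611328, 609701888),
    ("Windows 11 22000.613, default", 141217792, 151674880),
    ("Windows 11 22000.613, /NOOFFLOAD", 152666112, 153341952),
    ("Windows 11 22000.2538, default", 122753024, 244674560),
    ("Windows 11 22000.2538, /NOOFFLOAD", 167522304, 283484160),
]

ITERATIONS = 1000


def growth_per_run_kib(first: int, last: int, iterations: int = ITERATIONS) -> float:
    """Average growth in KiB per run between the first and last reading."""
    # There are (iterations - 1) runs between reading 1 and reading N.
    return (last - first) / (iterations - 1) / 1024


for label, first, last in RUNS:
    total_mib = (last - first) / (1024 * 1024)
    print(f"{label}: +{total_mib:.1f} MiB total, "
          f"{growth_per_run_kib(first, last):.1f} KiB per run")
```

An average like this hides fluctuation between iterations (visible in the 22000.613 readings at iterations 996, 997 and 1000), so it is only a coarse summary and not a leak detector in itself.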
(Sorry for the delay; fighting with OS updates/installation took me a while.)
Best regards,
Alexander