Re: Robocopy might be not robust enough for never-ending testing on Windows

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: Robocopy might be not robust enough for never-ending testing on Windows
Date: 2024-09-17 01:01:17
Message-ID: CA+hUKGJ3iRm2HfRfwb25K0vk4teD4Oi39UEBogT9th3vN76vKg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 16, 2024 at 6:00 PM Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:
> So this leak looks like a recent and still existing defect.

From my cartoon-like understanding of Windows, I would guess that if
event handles created by a program are leaked after it has exited, it
would normally imply that they've been duplicated somewhere else that
is still running (for example see the way that PostgreSQL's
dsm_impl_pin_segment() calls DuplicateHandle() to give a copy to the
postmaster, so that the memory segment continues to exist after the
backend exits), and if it's that, you'd be able to see the handle
count going up in the process monitor for some longer running process
somewhere (as seen in this report from the Chrome hackers[1]). And if
it's not that, then I would guess it would have to be a kernel bug
because something outside userspace must be holding onto/leaking
handles. But I don't really understand Windows beyond trying to debug
PostgreSQL at a distance, so my guesses may be way off. If we wanted
to try to find a Windows expert to look at a standalone repro, does
your PS script work with *any* source directory, or is there something
about the initdb template, in which case could you post it in a .zip
file so that a non-PostgreSQL person could see the failure mode?

[1] https://randomascii.wordpress.com/2021/07/25/finding-windows-handle-leaks-in-chromium-and-others/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-09-17 01:27:24 Re: Introduce XID age and inactive timeout based replication slot invalidation
Previous Message Masahiko Sawada 2024-09-17 00:38:16 Re: Conflict detection for update_deleted in logical replication