Re: pgsql: Fix unlink() for STATUS_DELETE_PENDING on Windows.

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Thomas Munro <tmunro(at)postgresql(dot)org>, pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Fix unlink() for STATUS_DELETE_PENDING on Windows.
Date: 2022-10-25 23:57:54
Message-ID: 20221025235753.GT16921@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

On Wed, Oct 26, 2022 at 11:15:16AM +1300, Thomas Munro wrote:
> On Wed, Oct 26, 2022 at 10:31 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > On Tue, Oct 25, 2022 at 04:21:02PM -0500, Justin Pryzby wrote:
> > > > Restore that behavior with an explicit check, to see if it fixes the
> > > > occasional 'directory not empty' failures seen in the pg_upgrade tests
> > > > on CI. Further improvements are possible with proposed upgrades to
> > > > modern Windows APIs that would replace this convoluted code.
>
> > > If I'm not wrong, this didn't fix the issue you said it fixed.
> >
> > s/said/hoped/sorry
>
> Drat. More theories needed then. Perhaps it has nothing to do with
> my recent changes. I am starting to wonder if we should have an
> rmdir() wrapper that waits for zombie files to go away, and spits out
> the name of the file that's in the way, and also the name/pid of the
> process that has it open[1] if it times out, so we have a fighting
> chance of debugging this type of stuff.
>
> [1] https://devblogs.microsoft.com/oldnewthing/20120217-00/?p=8283

FWIW, your description of the problem sounded a bit off to me.

You seemed to say that the problem is with rmdir() on a directory, which
contains a file which was "recently removed", but still opened by
something.

But as I recall, at the point that pg_upgrade is running rmdir(), the
contained files have been unlinked, and the postgres process has ended:

| # Running: pg_upgrade --no-sync -d C:\cirrus\build/testrun/pg_upgrade/002_pg_upgrade\data/t_002_pg_upgrade_old_node_data/pgdata -D C:\cirrus\build/testrun/pg_upgrade/002_pg_upgrade\data/t_002_pg_upgrade_new_node_data/pgdata -b C:/cirrus/build/tmp_install/bin -B C:/cirrus/build/tmp_install/bin -s C:/Users/ContainerAdministrator/AppData/Local/Temp/f67bWmckRH -p 57611 -P 57612 --check
| Performing Consistency Checks
| -----------------------------
| Checking cluster versions ok
| Checking database user is the install user ok
| Checking database connection settings ok
| Checking for prepared transactions ok
| Checking for system-defined composite types in user tables ok
| Checking for reg* data types in user tables ok
| Checking for contrib/isn with bigint-passing mismatch ok
| Checking for presence of required libraries ok
| Checking database user is the install user ok
| Checking for prepared transactions ok
| Checking for new cluster tablespace directories ok
|
| *Clusters are compatible*
| pg_upgrade: warning: could not remove file or directory "C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/t_002_pg_upgrade_new_node_data/pgdata/pg_upgrade_output.d/20220925T193346.158/log": Directory not empty
| pg_upgrade: warning: could not remove file or directory "C:/cirrus/build/testrun/pg_upgrade/002_pg_upgrade/data/t_002_pg_upgrade_new_node_data/pgdata/pg_upgrade_output.d/20220925T193346.158": Directory not empty

My thinking has always been that this is essentially a bug/deficiency in
the windows FS. It seems absurd that unlink/rmdir would be
asynchronous.

It'd be swell if we could use a separate device in CI, to be used for
running tests. Bonus points if it supports COW.

--
Justin

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Michael Paquier 2022-10-26 00:42:04 pgsql: Fix ordering issue with WAL operations in GIN fast insert path
Previous Message Thomas Munro 2022-10-25 22:15:16 Re: pgsql: Fix unlink() for STATUS_DELETE_PENDING on Windows.