Re: pg11+: pg_ls_*dir LIMIT 1: temporary files .. not closed at end-of-transaction

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: pg11+: pg_ls_*dir LIMIT 1: temporary files .. not closed at end-of-transaction
Date: 2020-03-28 17:13:54
Message-ID: 24244.1585415634@sss.pgh.pa.us
Lists: pgsql-hackers

Justin Pryzby <pryzby(at)telsasoft(dot)com> writes:
> Thanks for fixing my test case and pushing.

The buildfarm just showed up another instability in the test cases
we added:

=========================== regression.diffs ================
diff -U3 /home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/../pgsql/src/test/regress/expected/misc_functions.out /home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/src/bin/pg_upgrade/tmp_check/regress/results/misc_functions.out
--- /home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/../pgsql/src/test/regress/expected/misc_functions.out 2020-03-17 08:14:50.292037956 +0100
+++ /home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/src/bin/pg_upgrade/tmp_check/regress/results/misc_functions.out 2020-03-28 13:55:12.490024822 +0100
@@ -169,11 +169,7 @@

select (w).size = :segsize as ok
from (select pg_ls_waldir() w) ss where length((w).name) = 24 limit 1;
- ok
-----
- t
-(1 row)
-
+ERROR: could not stat file "pg_wal/000000010000000000000078": No such file or directory
select count(*) >= 0 as ok from pg_ls_archive_statusdir();
ok
----

It's pretty obvious what happened here: concurrent activity renamed or
removed the WAL segment between when we saw it in the directory and
when we tried to stat() it.

This seems like it would be just as much of a hazard for field usage
as it is for regression testing, so I propose that we fix these
directory-scanning functions to silently ignore ENOENT failures from
stat(). Are there any for which we should not do that?

regards, tom lane
