From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Kenneth Marshall <ktm(at)rice(dot)edu>, Larry Rosenman <ler(at)lerctr(dot)org>, Pgsql hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Strange issue with NFS mounted PGDATA on ugreen NAS |
Date: | 2025-01-02 15:38:29 |
Message-ID: | 267244.1735832309@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> I now suspect this specific readdir() problem is in FreeBSD's NFS
> client. See below. There have also been reports of missed files from
> (IIRC) Linux clients without much analysis, but that doesn't seem too
> actionable from here unless someone can come up with a repro or at
> least some solid details to investigate; those involved unspecified
> (possibly appliance/cloud) NFS and CIFS file servers.
I forgot to report back, but yesterday I spent some time trying,
unsuccessfully, to reproduce the problem with a macOS client and an
NFS server using btrfs (a Synology NAS running some no-name version
of Linux). So that lends additional weight to your conclusion that it
isn't specifically a btrfs bug.
> I see this issue here with a FreeBSD client talking to a Debian server
> exporting BTRFS or XFS, even with dirreadsize set high so that
> multi-request paging is not expected. Looking at Wireshark and the
> NFS spec (disclaimer: I have never studied NFS at this level before,
> addito salis grano), what I see is a READDIR request with cookie=0
> (good), and which receives a response containing the whole directory
> listing and a final entry marker eof=1 (good), but then FreeBSD
> unexpectedly (to me) sends *another* READDIR request with cookie=662,
> which is a real cookie that was received somewhere in the middle of
> the first response on the entry for "13816_fsm", and that entry was
> followed by an entry for "13816_vm". The second request gets a
> response that begins at "13816_vm" (correct on the server's part).
> Then the client sends REMOVE (unlink) requests for some but not all of
> the files, including "13816_fsm" but not "13816_vm". Then it sends
> yet another READDIR request with cookie=0 (meaning go from the top),
> and gets a non-empty directory listing, but immediately sends RMDIR,
> which unsurprisingly fails NFS3ERR_NOTEMPTY. So my best guess so far
> is that FreeBSD's NFS client must be corrupting its directory cache
> when files are unlinked, and it's not the server's fault. I don't see
> any obvious problem with the way the cookies work. Seems like
> material for a minimised bug report elsewhere, and not our issue.
Yeah, that seems pretty damning. Thanks for looking into it.
regards, tom lane
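
Illustrative sketch (not from the thread): the trace quoted above shows a client interleaving READDIR and REMOVE requests before a failing RMDIR. POSIX leaves it unspecified whether readdir() returns entries for files removed after opendir(), so one defensive pattern an application could use is to snapshot the whole listing first and only then unlink. The C sketch below assumes a flat directory containing only plain files; the name remove_flat_dir is hypothetical, and this is not presented as PostgreSQL's code or as a fix for the NFS client behaviour described.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: remove every entry of a flat directory, then the
 * directory itself.  All names are collected before the first unlink(),
 * so the result does not depend on how readdir() behaves while the
 * directory is being modified.  Returns 0 on success, -1 on failure.
 */
int
remove_flat_dir(const char *path)
{
    DIR        *dir;
    struct dirent *de;
    char      **names = NULL;
    size_t      n = 0;
    size_t      cap = 0;
    int         rc = -1;
    int         saved_errno;

    assert(path != NULL);

    dir = opendir(path);
    if (dir == NULL)
        return -1;

    /* Pass 1: snapshot the directory listing without modifying it. */
    errno = 0;
    while ((de = readdir(dir)) != NULL)
    {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        if (n == cap)
        {
            size_t      newcap = cap ? cap * 2 : 16;
            char      **tmp = realloc(names, newcap * sizeof(char *));

            if (tmp == NULL)
                goto done;
            names = tmp;
            cap = newcap;
        }
        names[n] = strdup(de->d_name);
        if (names[n] == NULL)
            goto done;
        n++;
        errno = 0;              /* so a later NULL from readdir() is unambiguous */
    }
    if (errno != 0)
        goto done;              /* readdir() itself failed */

    /* Pass 2: unlink each collected name. */
    rc = 0;
    for (size_t i = 0; i < n; i++)
    {
        char        buf[4096];
        int         len = snprintf(buf, sizeof(buf), "%s/%s", path, names[i]);

        if (len < 0 || (size_t) len >= sizeof(buf))
        {
            errno = ENAMETOOLONG;
            rc = -1;
            break;
        }
        if (unlink(buf) != 0)
        {
            rc = -1;
            break;
        }
    }

done:
    saved_errno = errno;
    closedir(dir);
    for (size_t i = 0; i < n; i++)
        free(names[i]);
    free(names);

    if (rc == 0)
        return rmdir(path);     /* fails with ENOTEMPTY if anything was missed */

    errno = saved_errno;
    return -1;
}
```

A caller would invoke remove_flat_dir("some/dir") and check for a zero return. If the final rmdir() still fails with ENOTEMPTY, the listing obtained in pass 1 was incomplete.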