Re: Windows pg_basebackup unable to create >2GB pg_wal.tar tarballs ("could not close file: Invalid argument" when creating pg_wal.tar of size ~ 2^31 bytes)

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Davinder Singh <davinder(dot)singh(at)enterprisedb(dot)com>, Dilip Kumar <dilip(dot)kumar(at)enterprisedb(dot)com>
Subject: Re: Windows pg_basebackup unable to create >2GB pg_wal.tar tarballs ("could not close file: Invalid argument" when creating pg_wal.tar of size ~ 2^31 bytes)
Date: 2024-11-22 23:43:48
Message-ID: CA+hUKGLpkcNRkrwMCXuJu1jc5-38bvN-yEq5B0FpSV91U9ZwMQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 22, 2024 at 10:55 PM Jakub Wartak
<jakub(dot)wartak(at)enterprisedb(dot)com> wrote:
> but with attached _lseeki64 dirty patch I've got success and resulting tarball greater than 2^31 too:
>
> C:\compiledpg\bin>pg_basebackup.exe -U postgres -D "c:\backup7" -F t -P -X stream -c fast --compress=none --create-slot --slot=slot11
> Password:
> 18134035/18134035 kB (100%), 1/1 tablespace
> C:\compiledpg\bin>
> C:\compiledpg\bin>dir c:\backup7\pg_wal.tar
> [..]
> 11/22/2024 10:37 AM 4,026,778,112 pg_wal.tar
>
> PoC patch is attached and I'll register CFentry to see how the tests will go (all of this MSVC/MinGW stuff is frightening to me, of course this will fail on non-Win32...).

Great news.

This will indeed require solving a few archeology puzzles to
back-patch. For lseek(), I think it should be OK to just point it to
_lseeki64() in win32_port.h as I showed earlier, because _lseeki64()
was present in ancient msvcrt.dll. It's technically possible that an
extension might be upset by the change but it'd have to be doing
something pretty strange, like using lseek's type, or other such
assumptions, not terribly likely. A more cautious change would do
that just inside that translation unit to avoid upsetting anything
else (or fixing anything else...), but that's probably too cautious,
let's just do it in win32_port.h.

For ftruncate(), though you didn't see a problem with that, Jacob is
quite right that it must be able to corrupt stuff, and I retract that
replacement code I showed earlier, it's not really OK to move the file
position around. I would like to use _s_chsize() instead, but it
arrived in msvcr80.dll so I don't think old MinGW can use it. Here's
a new idea (sketch code not tested):

static inline int
ftruncate(int fd, pgoff_t length)
{
#if defined(_UCRT) || defined(_MSC_VER)
/* MinGW + UCRT and all supported MSVC versions have this. */
errno = _s_chsize(fd, length);
return errno == 0 ? 0 : -1;
#else
/* MinGW + ancient msvcrt.dll has only _chsize, limited by off_t (long). */
if (length > LONG_MAX)
{
/* A clear error is better than silent corruption. */
errno = EFBIG;
return - 1;
}
return _chsize(fd, length);
#endif
}

I'm not sure if we'd ever know if we broke MinGW + msvcrt.dll in the
back branches, ie if any computer actually existing on earth would
ever compile and run the second branch, and if so, be used for serious
database work and then be able to reach the failure case. I think an
honest but disappointing errno is probably OK in that hypothetical
case. It's hard to know how far to go when programming ghost
computers. Thoughts, anyone?

(Obviously an exorcism is overdue in master, working on it...)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-11-23 02:44:34 Re: Allow non-superuser to cancel superuser tasks.
Previous Message Andres Freund 2024-11-22 23:13:16 Re: Allow non-superuser to cancel superuser tasks.