Re: Error while copying a large file in pg_rewind

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Error while copying a large file in pg_rewind
Date: 2017-07-21 06:22:28
Message-ID: CAB7nPqS10xkwam1-Gg=CcW_RE-stki=_w=sYk1pbv_D5nqk3pg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 20, 2017 at 9:31 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I was initially surprised that your testing managed to pass, but then
> I noticed that this sanity test is using && where it should really be
> using ||; it will only fail if ALL of the data types are wrong. Oops.

Oh, oh. If this was right from the beginning I would have hit this
issue. Yes that's a clear oversight of the original code.

> This code plays pretty fast and loose with off_t vs. size_t vs. uint64
> vs. int64, but those definitely don't all agree in signedness and may
> not even agree in width (size_t, at least, might be 4 bytes). We use
> stat() to get the size of a file as an off_t, and then store it into a
> uint64. Perhaps off_t is always uint64 anyway, but I'm not sure.
> Then, that value gets passed to fetch_file_range(), still as a uint64,
> and it then gets sent to the server to be stored into an int8 column,
> which is signed rather than unsigned. That will error out if there's
> a file size >= 2^63, but filesystem holding more than 8 exabytes are
> unusual and will probably remain so for some time. The server sends
> back an int64 which we pass to write_target_range(), whose argument is
> declared as off_t, which is where we started. I'm not sure there's
> any real problem with all of that, but it's a little nervous-making
> that we keep switching types. Similarly, libpqProcessFileList gets
> the file size as an int64 and then passes it to process_source_file()
> which is expecting size_t. So we manage to use 4 different data types
> to represent a file size. On most systems, all of them except SQL's
> int8 are likely to be 64-bit unsigned integers, but I'm not sure what
> will happen on obscure platforms.

Yes, I felt uneasy as well when hacking the patch with all the type
switches that have been done, but I kept my focus on bringing out a
minimally invasive patch to fix the issue and any other holes I found.
One thing that I think could be added in the patch is an overflow
check in libpqGetFile(), because I think that we still want to use
size_t for this code path. What do you think?

Note I am not much worrying on signedness of the values yet (Postgres
still lacks a proper in-core unsigned type, this gives an argument for
one), as long as the value are correctly stored on 8 bytes, and we
still have some margin until we reach that. We could add a comment in
the code to underline that assumption, though I am not sure that this
is really necessary...

> Still, it can't be worse than the status quo, where instead of int64
> we're using int and int32, so maybe we ought to back-patch it as-is
> for now and look at any further cleanup that is needed as a
> master-only improvement.

Yes. I don't like playing much with the variable types on
back-branches, as long as the initial amount of bytes is large enough
we will be safe for some time.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rafia Sabih 2017-07-21 06:24:33 Re: Partition-wise join for join between (declaratively) partitioned tables
Previous Message Rafia Sabih 2017-07-21 06:12:28 Re: Partition-wise join for join between (declaratively) partitioned tables