Re: pg_dump and large files - is this a problem?

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Philip Warner <pjw(at)rhyme(dot)com(dot)au>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Giles Lean <giles(at)nemeton(dot)com(dot)au>
Subject: Re: pg_dump and large files - is this a problem?
Date: 2002-10-23 05:02:22
Message-ID: 200210230502.g9N52Ms21420@candle.pha.pa.us
Lists: pgsql-hackers


OK, you are saying if we don't have fseeko(), there is no reason to use
off_t, and we may as well use long. What limitations does that impose,
and are the limitations clear to the user?

What has me confused is that I only see two places that use a non-zero
fseeko, and in those cases, there is a non-fseeko code path that does
the same thing, or the call isn't actually required. Both cases are in
pg_dump/pg_dump_custom.c. It appears seeking in the file is an
optimization that avoids having to read every block. That is fine, but
we shouldn't introduce failure cases to do that.
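
To illustrate the two code paths, here is a rough sketch (not the actual
pg_dump code; skip_bytes and can_seek are made-up names) of skipping a
block either by seeking past it or by reading and discarding it:

```c
#include <stdio.h>

/*
 * Skip `len` bytes forward in `fp`.
 * With can_seek set, jump with fseek(); otherwise read and discard.
 * Returns 0 on success, -1 on failure.
 */
int
skip_bytes(FILE *fp, long len, int can_seek)
{
    char        buf[4096];

    if (can_seek)
        return fseek(fp, len, SEEK_CUR) == 0 ? 0 : -1;

    while (len > 0)
    {
        size_t      want = len < (long) sizeof(buf) ? (size_t) len : sizeof(buf);
        size_t      got = fread(buf, 1, want, fp);

        if (got == 0)
            return -1;          /* EOF or read error before len bytes */
        len -= (long) got;
    }
    return 0;
}
```

Both paths leave the stream at the same position when the data is there;
the read path is just slower. One difference in this sketch: fseek() can
succeed past end-of-file, while the read path reports an error.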

If BSD/OS is the only problem OS, I can deal with that, but I have no
idea if other OS's have the same limitation, and because of the way our
code is written now, we are not even checking to see if there is a problem.

I did some poking around, and on BSD/OS, fgetpos/fsetpos use fpos_t,
which is actually off_t, and interestingly, lseek() uses off_t too.
Seems only fseek/ftell are limited to long. I can easily implement
fseeko/ftello using fgetpos/fsetpos, but that is only one OS.
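
A rough sketch of what such replacements might look like. It assumes
fpos_t is a plain integer type the same width as off_t, which is true
only on some platforms, so that path is guarded by FPOS_T_IS_OFF_T, a
hypothetical macro nothing defines today. The names sketch_fseeko and
sketch_ftello are made up too. Without the macro it falls back to
fseek/ftell and refuses offsets that do not fit in a long:

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <sys/types.h>

int
sketch_fseeko(FILE *fp, off_t offset, int whence)
{
#ifdef FPOS_T_IS_OFF_T          /* hypothetical: fpos_t is an integer off_t */
    fpos_t      pos;

    switch (whence)
    {
        case SEEK_SET:
            pos = offset;
            break;
        case SEEK_CUR:
            if (fgetpos(fp, &pos) != 0)
                return -1;
            pos += offset;
            break;
        case SEEK_END:
            /* move to the end first, then read the position back */
            if (fseek(fp, 0L, SEEK_END) != 0 || fgetpos(fp, &pos) != 0)
                return -1;
            pos += offset;
            break;
        default:
            errno = EINVAL;
            return -1;
    }
    return fsetpos(fp, &pos) == 0 ? 0 : -1;
#else
    /* fallback: plain fseek(), limited to what a long can hold */
    if (offset > LONG_MAX || offset < LONG_MIN)
    {
        errno = EOVERFLOW;
        return -1;
    }
    return fseek(fp, (long) offset, whence);
#endif
}

off_t
sketch_ftello(FILE *fp)
{
#ifdef FPOS_T_IS_OFF_T
    fpos_t      pos;

    if (fgetpos(fp, &pos) != 0)
        return (off_t) -1;
    return (off_t) pos;
#else
    return (off_t) ftell(fp);
#endif
}
```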

One idea would be to patch up BSD/OS in backend/port/bsdi and add a
configure test that actually fails if fseeko doesn't exist _and_
sizeof(off_t) > sizeof(long). That would at least catch OS's before
they make >2gig backups that can't be restored.
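
The condition such a test would check can be sketched as below. This is
only an illustration of the logic, not an actual configure macro; the
function names are hypothetical, and whether fseeko exists is passed in
as a flag rather than detected:

```c
#include <stddef.h>
#include <sys/types.h>

/*
 * Returns 1 when stdio seeking can address every offset an off_t can
 * hold: either fseeko exists, or off_t is no wider than long.
 */
int
seek_offsets_are_safe(int have_fseeko, size_t off_t_size, size_t long_size)
{
    if (have_fseeko)
        return 1;
    return off_t_size <= long_size;
}

/* Exit status for a configure-style probe: 0 = OK, 1 = fail. */
int
probe_exit_status(int have_fseeko)
{
    return seek_offsets_are_safe(have_fseeko,
                                 sizeof(off_t), sizeof(long)) ? 0 : 1;
}
```

A platform like the one described above (no fseeko, 8-byte off_t, 4-byte
long) would make the first function return 0, and configure would stop.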

---------------------------------------------------------------------------

Philip Warner wrote:
> At 10:46 PM 22/10/2002 -0400, Bruce Momjian wrote:
> >Uh, not exactly. I have off_t as a quad, and I don't have fseeko, so
> >the above conditional doesn't work. I want to use off_t, but can't use
> >fseek().
>
> Then when you create dumps, they will be invalid since I assume that ftello
> is also broken in the same way. You need to fix _getFilePos as well. And
> any other place that uses an off_t needs to be looked at very carefully.
> The code was written assuming that if 'hasSeek' was set, then we could
> trust it.
>
> Given that you say you do have support for some kind of 64 bit offset, I
> would be a lot happier with these changes if you did something akin to my
> original suggestion:
>
> #if defined(HAVE_FSEEKO)
> #define FILE_OFFSET off_t
> #define FSEEK fseeko
> #elif defined(HAVE_SOME_OTHER_FSEEK)
> #define FILE_OFFSET some_other_offset
> #define FSEEK some_other_fseek
> #else
> #define FILE_OFFSET long
> #define FSEEK fseek
> #endif
>
> ...assuming you have a non-broken 64 bit fseek/tell pair, then this will
> work in all cases, and make the code a lot less ugly (assuming of course
> the non-broken version can be shifted).
>
>
>
> ----------------------------------------------------------------
> Philip Warner | __---_____
> Albatross Consulting Pty. Ltd. |----/ - \
> (A.B.N. 75 008 659 498) | /(@) ______---_
> Tel: (+61) 0500 83 82 81 | _________ \
> Fax: (+61) 0500 83 82 82 | ___________ |
> Http://www.rhyme.com.au | / \|
> | --________--
> PGP key available upon request, | /
> and from pgp5.ai.mit.edu:11371 |/
>

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
