Re: 64-bit API for large objects

From: Jeremy Drake <pgsql(at)jdrake(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Mark Dilger <pgsql(at)markdilger(dot)com>
Subject: Re: 64-bit API for large objects
Date: 2005-09-24 19:47:26
Message-ID: Pine.LNX.4.63.0509241229030.5158@garibaldi.apptechsys.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 24 Sep 2005, Tom Lane wrote:

> Jeremy Drake <pgsql(at)jdrake(dot)com> writes:
> > In any case, are there any comments on the changes below libpq (the
> > functions visible to queries on down)?
>
> Within the backend, I don't see the point in maintaining a distinction
> between 32- and 64-bit APIs for inv_api.c: you should just switch 'em
> to use int64. You did it that way for inv_getsize but then proceeded
> to add separate inv_seek64/tell64 functions, which is inconsistent.

Right. I did it the way you describe my first cut (I did this several
times and changed my mind and started over). I was concerned (perhaps
needlessly) about breaking the signatures of the inv_* functions which are
visible outside of the .c file. This is why I did the getsize differently
- it is static. But it sounds like there is no concern about changing the
signatures of these functions, so I will change my patch to not maintain
the seperate code paths in inv_api.c.

> The submitted version of lo_tell64 isn't even calling the right one :-(

Oops. That's what I get for lots of copy/paste and doing it multiple
times... Bonehead mistake, thanks for catching it.

> The 32-bit version of lo_tell will need to check and error out if the
> value it'd need to return doesn't fit in int32. (Come to think of it,
> so should the 32-bit version of lo_lseek.)

That makes sense. Or it could return some value (INT_MAX?) which could
mean that it is outside the range, so someone could still get at the data
even if they are using a backwards client box? I don't know if that makes
sense at all, it sounds pretty wacky since these clients would have no
way of knowing where they are in the file. Erroring would probably be
best.

> All of the LO code needs to be eyeballed to make sure it still behaves
> sanely if "int64" is really only 32 bits.

Of course.

> It would probably be a good idea also to introduce some overflow checks
> to detect cases where the current LO offset would overflow int64 after a
> read, write, or seek. This is missing from the existing code :-(
> It is possible to code overflow checks that work regardless of the size
> of "int64"; see int8.c for some inspiration.

Yes. That would be good. These would be in the inv_* functions, correct?

> I'd suggest also that the
> offset be defined as signed not unsigned (int64 not uint64) as this will
> simplify the overflow checks and eliminate the prospect of scenarios
> where lo_tell64 cannot return a correct value.

I intended to do that. I think the only place I made uint64 vs int64 was
the getsize function, and that could be int64 also. I will look at that
code and make sure I am not mixing them in ways that are not necessary and
useful.

I will take these suggestions and make another revision of the patch
shortly. Also, I was considering exposing up an lo_getsize or lo_stat
function which would tell you how big a large object was without having to
seek to the end and look at the return value, requiring you to have the
large object open at the time, to loose your old seek position (unless you
do a tell first), and requires several more server round-trips (if not
open, would involve open/seek/close, if open, could require
tell/seek/seek).

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeremy Drake 2005-09-24 20:00:10 Re: 64-bit API for large objects
Previous Message Andrew Dunstan 2005-09-24 19:14:03 Re: 2 forks for md5?