Re: Re: COPY BINARY file format proposal

From: Philip Warner <pjw(at)rhyme(dot)com(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: COPY BINARY file format proposal
Date: 2000-12-07 02:34:03
Message-ID: 3.0.5.32.20001207133403.02d6acc0@mail.rhyme.com.au
Lists: pgsql-hackers

At 21:12 6/12/00 -0500, Tom Lane wrote:
>Philip Warner <pjw(at)rhyme(dot)com(dot)au> writes:
>>> OK, we can do it that way. I'm still going to pick a magic number that
>>> looks different depending on endianness, however ;-).
>
>> What does the smiley mean in this context?
>
>Just thinking that the only way an endianness flag inside the header
>would be useful is if we pick a magic number that's a bytewise
>palindrome.

You could just read the 1st, 2nd, 3rd, etc. bytes and require that they be
'P', 'G', 'C', 'P', 'Y' or some such. I *think* reading five bytes and
doing a strcmp works... i.e. don't rely on the integer value, use a string.
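
A minimal sketch of the byte-string check suggested above, not the format that
was eventually adopted. The signature bytes are the ones floated in this
message; the names COPY_SIGNATURE and has_copy_signature are hypothetical. It
uses memcmp rather than strcmp, on the assumption that the five bytes read from
the file are not NUL-terminated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signature: the message only suggests these bytes "or some such". */
static const unsigned char COPY_SIGNATURE[] = { 'P', 'G', 'C', 'P', 'Y' };
#define COPY_SIGNATURE_LEN (sizeof COPY_SIGNATURE)

/*
 * Returns true if the first COPY_SIGNATURE_LEN bytes of buf match the
 * signature. The comparison is bytewise, so the result does not depend on
 * the endianness of the machine that wrote or reads the file.
 */
static bool has_copy_signature(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    if (len < COPY_SIGNATURE_LEN)
        return false;   /* too short to contain the signature */

    /* memcmp, not strcmp: raw file bytes carry no terminating NUL. */
    return memcmp(buf, COPY_SIGNATURE, COPY_SIGNATURE_LEN) == 0;
}
```

A reader would call has_copy_signature() on the first bytes of the file and
reject the file if it returns false, before interpreting any integer fields.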

>> - floating point representation (for portability)
>
>Specified how? (For that matter, determined how?)

I'd recommend a crystal ball. You did ask a question about the future ;-}.

>> - flag for compressed or uncompressed toast fields (I assume you dump them
>> uncompressed?)
>
>Yes, I want COPY to force 'em to uncompressed so as to avoid problems
>with cross-version changes of compression algorithm. (Right at the
>moment it gets that wrong.)

Sounds reasonable, but there could be an advantage in allowing a binary
compressed dump for short-term work.

>> - version number may be important if we dump a subset of fields (ie. we'll
>> need to store the field names somewhere).
>
>No we don't. ASCII COPY format doesn't store field names either ... at
>least not as part of the data stream ... and should not IMHO. Don't you
>want to be able to reload into a table that you've changed the column
>names of?

Storing the names is essential if we ever allow subsets of columns - even if
it is only for displaying information to the user. If I dump 5 out of 7
columns then rename half of them, I'd say I'm asking for trouble. At least
with the names available, you have a chance of working out what goes where.
But again, without copy-a-subset-of-columns, this also requires a crystal ball.

It all gets back to whether it's a good idea to overload a magic number.
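
For reference, a rough sketch of what "overloading" the magic number means
here: an integer written in the writer's native byte order appears
byte-reversed to a reader of the opposite endianness, so one field serves both
as file signature and as endianness flag. The value COPY_MAGIC and the names
swap32 and classify_magic are hypothetical and are not taken from the proposal.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical magic value; deliberately not a bytewise palindrome. */
#define COPY_MAGIC 0x01020304u

typedef enum { ORDER_SAME, ORDER_SWAPPED, ORDER_UNKNOWN } byte_order;

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/*
 * Interpret four raw header bytes as a native integer and compare against
 * the magic value in both byte orders.
 */
static byte_order classify_magic(const unsigned char raw[4])
{
    uint32_t v;

    memcpy(&v, raw, sizeof v);   /* avoids unaligned or aliased access */
    if (v == COPY_MAGIC)
        return ORDER_SAME;       /* writer had the reader's byte order */
    if (swap32(v) == COPY_MAGIC)
        return ORDER_SWAPPED;    /* writer had the opposite byte order */
    return ORDER_UNKNOWN;        /* not this file format at all */
}
```

With a palindromic value such as 0x01020201, swap32() returns its input
unchanged, so classify_magic() could never report ORDER_SWAPPED; only in that
case would a separate endianness flag in the header carry any information.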

----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/
