COPY BINARY is broken...

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: COPY BINARY is broken...
Date: 2000-12-01 22:35:33
Message-ID: 15452.975710133@sss.pgh.pa.us
Lists: pgsql-hackers

I've just noticed that COPY BINARY is pretty thoroughly broken by TOAST,
because what it does is to dump out verbatim the bytes making up each
tuple of the relation. In the case of a moved-off value, you'll get
the toast reference, which is not going to be too helpful for reloading
the table data. In the case of a compressed-in-line datum, you'll at
least have all the data there, but the COPY BINARY reader will crash
and burn when it sees it.

Fixing this while retaining backwards compatibility with the existing
COPY BINARY file format is possible, but it seems rather a headache:
we'd need to detoast all the toasted columns, then heap_formtuple a
new tuple containing the expanded data, and finally write that out.
(Can't do it on a field-by-field basis because the file format requires
the total tuple size to precede the tuple data.) Kind of ugly.
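To make the size-first constraint concrete, here is a minimal sketch in Python. It is a toy model, not PostgreSQL code: a field is a hypothetical `(is_compressed, payload)` pair, zlib stands in for TOAST compression, and `expand` and `write_tuple_size_first` are invented names. The point is only that every field has to be expanded before the first byte of the tuple can be written.

```python
import io
import struct
import zlib


def expand(field):
    """Return the raw bytes of a field, decompressing it if needed.

    A field is modelled as an (is_compressed, payload) pair. This is a toy
    stand-in for a toasted datum, not PostgreSQL's real representation.
    """
    is_compressed, payload = field
    return zlib.decompress(payload) if is_compressed else payload


def write_tuple_size_first(out, fields):
    """Write one tuple as <4-byte total size><field bytes...>."""
    # The size word comes first and depends on every field, so all fields
    # must be expanded and held in memory before anything is written.
    expanded = [expand(f) for f in fields]
    total = sum(len(data) for data in expanded)
    out.write(struct.pack("<I", total))
    for data in expanded:
        out.write(data)


buf = io.BytesIO()
fields = [(False, b"id=1"), (True, zlib.compress(b"x" * 1000))]
write_tuple_size_first(buf, fields)
```

Under these assumptions the writer cannot stream one field at a time: the list `expanded` plays the role of the rebuilt tuple that has to exist in full before output starts.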

The existing COPY BINARY file format is entirely brain-dead anyway; for
example, it wants the number of tuples to be stored at the front, which
means we have to scan the whole relation an extra time to get that info.
Its handling of nulls is bizarre, too. I'm thinking this might be a
good time to abandon backwards compatibility and switch to a format
that's a little easier to read and write. Does anyone have an opinion
pro or con about that?
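For comparison, here is a sketch of what a format without a leading tuple count could look like. The message does not specify a replacement format, so everything in it is an assumption made for illustration: the per-tuple field count, the per-field length word, the use of -1 to mark a null field, and the -1 trailer that ends the file. It shows only that such a layout can be written in a single pass and handles nulls explicitly.

```python
import io
import struct

END_OF_DATA = -1  # hypothetical trailer written in place of a field count
NULL_FIELD = -1   # hypothetical length word marking a null field


def write_rows(out, rows):
    """Write rows in one pass; the number of rows is never needed up front."""
    for row in rows:
        out.write(struct.pack("<h", len(row)))       # field count
        for value in row:
            if value is None:
                out.write(struct.pack("<i", NULL_FIELD))
            else:
                out.write(struct.pack("<i", len(value)))  # per-field length
                out.write(value)
    out.write(struct.pack("<h", END_OF_DATA))


def read_rows(inp):
    """Yield rows until the end-of-data trailer is seen."""
    while True:
        (nfields,) = struct.unpack("<h", inp.read(2))
        if nfields == END_OF_DATA:
            return
        row = []
        for _ in range(nfields):
            (length,) = struct.unpack("<i", inp.read(4))
            row.append(None if length == NULL_FIELD else inp.read(length))
        yield row


rows = [[b"1", b"alpha"], [b"2", None]]
buf = io.BytesIO()
write_rows(buf, iter(rows))  # an iterator suffices: no count is required
buf.seek(0)
decoded = list(read_rows(buf))
```

Because each field carries its own length, a writer in this style could also expand and emit one field at a time rather than rebuilding the whole tuple first.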

regards, tom lane
