pgtcl large object read/write corrupts binary data

From: ljb <ljb220(at)mindspring(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: pgtcl large object read/write corrupts binary data
Date: 2003-10-27 02:33:58
Message-ID: bni06l$2jh$2@news.hub.org
Lists: pgsql-bugs pgsql-interfaces

[Using PostgreSQL-7.3.4 and -7.4beta5, Tcl-8.4.x.]

Binary data written to a Large Object with libpgtcl's pg_lo_write is
corrupted. Tcl is mangling the data - something to do with UTF-8
conversion. Example: 0x80 becomes 0xc2 0x80, and 0xff becomes 0xc3 0xbf.
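The byte values above are what you get if each input byte is treated as a character in the range U+0000..U+00FF and then encoded as UTF-8. Here is a minimal C sketch of that encoding, for illustration only. The helper name bytes_to_utf8 is made up for this example and is not part of Tcl or libpgtcl.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from Tcl or libpgtcl): encode each input byte
 * as the code point U+0000..U+00FF in UTF-8.
 * dst must have room for 2 * n bytes. Returns the number of bytes written.
 */
size_t bytes_to_utf8(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t out = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned char b = src[i];

        if (b < 0x80) {
            dst[out++] = b;                                   /* ASCII: copied unchanged */
        } else {
            dst[out++] = (unsigned char)(0xC0 | (b >> 6));    /* lead byte: 0xC2 or 0xC3 */
            dst[out++] = (unsigned char)(0x80 | (b & 0x3F));  /* continuation byte */
        }
    }
    return out;
}
```

Feeding it the six bytes 0x80 0xff 0x7a 0x7a 0x7a 0x7a gives eight bytes starting 0xc2 0x80 0xc3 0xbf, so a write limited to the original length of 6 would store only the first six bytes of the expanded form.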

The problem with pg_lo_read is more subtle. If you compare the expected and
actual data with == or [string equal], they do not match, but if you check
byte by byte, or write the two values to files, they do match. I believe
this is happening because pg_lo_read is returning an object which is
inconsistent between its Tcl "string rep" and internal byte array.
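To make the suspected mismatch concrete, here is a small C sketch of one way it could arise. It is a guess at the mechanism, not a description of Tcl's actual code paths. It assumes the script's value is held as UTF-8 text while the value from pg_lo_read holds the raw bytes, and that writing to a binary channel reduces both to plain bytes. The function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical lenient decoder: a valid two-byte UTF-8 sequence for
 * U+0080..U+00FF collapses to one byte; any other byte passes through as-is.
 * dst must have room for n bytes. Returns the number of bytes written.
 */
size_t utf8_to_bytes_lenient(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t out = 0;
    size_t i = 0;

    while (i < n) {
        unsigned char b = src[i];

        if ((b == 0xC2 || b == 0xC3) && i + 1 < n && (src[i + 1] & 0xC0) == 0x80) {
            dst[out++] = (unsigned char)(((b & 0x03) << 6) | (src[i + 1] & 0x3F));
            i += 2;
        } else {
            dst[out++] = b;   /* ASCII, or a stray byte that is not valid UTF-8 */
            i += 1;
        }
    }
    return out;
}

/* Returns 1 if the two buffers have the same length and contents, else 0. */
int same_bytes(const unsigned char *a, size_t an, const unsigned char *b, size_t bn)
{
    return an == bn && memcmp(a, b, an) == 0;
}
```

Under these assumptions, the text form of the expected data (c2 80 c3 bf 7a 7a 7a 7a) and the raw buffer (80 ff 7a 7a 7a 7a) compare as different, yet both decode to the same six bytes, which would be consistent with "Differ" on comparison and identical files on disk.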

Here are 2 test scripts to show the problem. They assume your environment
variables are set up to allow a connection to PostgreSQL with an empty
'conninfo' string.

Quick test script for pg_lo_write problem:
========================
# Write to large object with pg_lo_write, export with pg_lo_export:
set data "\x80\xffzzzz"
set datalen 6
set conn [pg_connect -conninfo ""]
pg_execute $conn begin
set loid [pg_lo_creat $conn INV_READ|INV_WRITE]
set lofd [pg_lo_open $conn $loid w]
pg_lo_write $conn $lofd $data $datalen
pg_lo_close $conn $lofd
pg_lo_export $conn $loid lo.out
pg_lo_unlink $conn $loid
pg_execute $conn commit
pg_disconnect $conn
========================
Run this script with pgtclsh, then hexdump the file "lo.out".
Expected result: file contains "0x80 0xff 0x7a 0x7a 0x7a 0x7a"
Observed result: file contains "0xc2 0x80 0xc3 0xbf 0x7a 0x7a"

Quick test script for pg_lo_read problem:
========================
# Import large object with pg_lo_import, read back with pg_lo_read:
set data "\x80\xffzzzz"
set datalen 6
set f [open lo.in w]
fconfigure $f -translation binary
puts -nonewline $f $data
close $f
set conn [pg_connect -conninfo ""]
pg_execute $conn begin
set loid [pg_lo_import $conn lo.in]
set lofd [pg_lo_open $conn $loid r]
pg_lo_read $conn $lofd buf $datalen
pg_lo_close $conn $lofd
pg_lo_unlink $conn $loid
pg_execute $conn commit
pg_disconnect $conn
if {[string equal $buf $data]} { puts Match } else { puts Differ }
set f [open lo.in2 w]
fconfigure $f -translation binary
puts -nonewline $f $buf
close $f
========================
Run this script with pgtclsh.
Expected result: prints "Match"
Observed result: prints "Differ"
But hexdump the files "lo.in" and "lo.in2" to see identical contents.

Proposed Patch: (I think this requires Tcl >= 8.1)
===================
--- src/interfaces/libpgtcl/pgtclCmds.c.orig 2003-08-03 22:40:16.000000000 -0400
+++ src/interfaces/libpgtcl/pgtclCmds.c 2003-10-25 20:36:58.000000000 -0400
@@ -1215,7 +1215,7 @@
     buf = ckalloc(len + 1);
 
     nbytes = lo_read(conn, fd, buf, len);
-    bufObj = Tcl_NewStringObj(buf, nbytes);
+    bufObj = Tcl_NewByteArrayObj(buf, nbytes);
 
     if (Tcl_ObjSetVar2(interp, bufVar, NULL, bufObj,
                        TCL_LEAVE_ERR_MSG | TCL_PARSE_PART1) == NULL)
@@ -1307,7 +1307,7 @@
     if (Tcl_GetIntFromObj(interp, objv[2], &fd) != TCL_OK)
         return TCL_ERROR;
 
-    buf = Tcl_GetStringFromObj(objv[3], &nbytes);
+    buf = Tcl_GetByteArrayFromObj(objv[3], &nbytes);
 
     if (Tcl_GetIntFromObj(interp, objv[4], &len) != TCL_OK)
         return TCL_ERROR;
===================
