Progress on char(n) default-value problem

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Progress on char(n) default-value problem
Date: 1999-05-14 00:56:58
Message-ID: 13915.926643418@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

It looks like the problem is that the default value is getting inserted
without benefit of conversion, ie, whatever the given text is will get
dropped into the finished tuple without padding/truncation to the
specified char(n) length.

Later, when we try to read out the tuple, the tuple access routines
figure they know how big a char(n) is, so they don't actually look
to see what the varlena count is. This results in misalignment of
following fields, causing either wrong data readout or a full-bore
crash.

Test case:

CREATE TABLE test (
plt int2 PRIMARY KEY,
state CHAR(5) NOT NULL DEFAULT 'new',
used boolean NOT NULL DEFAULT 'f',
id int4
);

INSERT INTO test (plt, id) VALUES (2, 3);

Examination of the stored tuple shows it contains 32 bytes of data:

0x400d7f30: 0x00 0x02 0x00 0x00 0x00 0x00 0x00 0x07
0x400d7f38: 0x6e 0x65 0x77 0x00 0x00 0x00 0x00 0x03

which deconstructs as follows:

00 02 int2 '2' (bigendian hardware here)
00 00 pad space to align varlena char field to long boundary
00 00 00 07 varlena header, size 7 => 3 bytes of actual data (whoops)
6e 65 77 ASCII 'new'
00 boolean 'f' (no pad needed for bool)
00 00 00 03 int4 '3' (no pad, it's on a long boundary already)

But the tuple readout routines will assume without looking that char(5)
occupies 9 bytes altogether, so they pick up the bool field 2 bytes over
from where it actually was put and pick up the int4 field 4 bytes over
from where it should be (due to alignment); result is garbage. If there
were another varlena field after the char(n) field, they'd pick up a
wrong field length and probably crash.

So, the question still remains "where and why"? My guess at this point
is that this is a bad side-effect of the fact that text and char(n) are
considered binary-equivalent. Probably, whatever bit of code ought to
be coercing the default value into the correct type for the column is
deciding that it doesn't have to do anything because they're already
equivalent types. I'm not sure where to look for that code (help
anyone?). But I am sure that it needs to be coercing the value to the
specified number of characters for char(n).

It also strikes me that there should be a check in the low-level
tuple construction routines that what they are handed for a char(n)
field is the right length. If tuple readout is going to assume that
char(n) is always n bytes of data, good software engineering dictates
that the tuple-writing code ought to enforce that assumption. At
the very least there should be an Assert() for it.

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 1999-05-14 00:58:16 Re: [HACKERS] Report on NetBSD/mac port of Postgres 6.4.2
Previous Message Tom Lane 1999-05-14 00:01:45 Some progress on INSERT/SELECT/GROUP BY bugs