From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Vlad Romascanu <vromascanu(at)accurev(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: One-off attempt at catalog hacking to turn bytea column into text |
Date: | 2011-05-11 08:11:08 |
Message-ID: | 20110511081108.GA26552@tornado.gateway.2wire.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Tue, May 10, 2011 at 04:31:37PM -0400, Vlad Romascanu wrote:
> As a one-off attempt to change a large table's 'bytea' column to
> 'text' with minimal I/O (where the 'bytea' contents is already valid
> UTF8 and the database encoding is also UTF8, and the column is not
> part of any index or anything involving collation), how unsafe is the
> following?
>
> UPDATE pg_attribute SET atttypid='text'::regtype::oid WHERE
> attrelid=('schema_name.table_name')::regclass AND attname='col_name'
> AND atttypid='bytea'::regtype::oid;
Do a "LOCK TABLE schema_name.table_name" earlier in the same transaction.
Any of the following potentially complicates things:
- data is not valid text in the server encoding, including NUL bytes
- use in an index (you mentioned this does not apply)
- use in a CHECK or FOREIGN KEY constraint
- default expression on the column
- views or rules referencing the column
- referenced directly in a trigger definition (UPDATE OF col_name, WHEN ...)
A useful, though not foolproof, way to check whether you've snagged on any of
those is to take schema dumps of a test database after (a) doing the change
normally and (b) doing it this way, then compare.
> Additionally, if the 'bytea' happenned to also explicitly contain a
> trailing NUL prior to the "conversion" (i.e. after the hack, the last
> byte in the 'text' value would be NUL), would there be any obvious
> problems with the above hack?
Yes; this will break things in the general case. Two text datums that differ
only in the presence of this trailing NUL will compare as unequal by texteq().
The octet_length() function will include the NUL byte in its count. If you use
the column in very restricted ways, you might end up okay. Note that you could
avoid these problems by updating the bytea columns to remove these NUL bytes
before making the catalog update. It could still be much I/O, but at least you
wouldn't have a table lock while doing it.
nm
From | Date | Subject | |
---|---|---|---|
Next Message | Stanislav Raskin | 2011-05-11 11:19:38 | full text search to_tsquery performance with ispell dictionary |
Previous Message | mephysto | 2011-05-11 08:03:25 | Custom types and JDBC |