May I have som syntactic sugar, please ?

From: Emmanuel Charpentier <charpent(at)bacbuc(dot)dyndns(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: May I have som syntactic sugar, please ?
Date: 2003-02-10 21:54:53
Message-ID: b2973c$1jue$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The recent addition of drop column makes possible to "change" column type.
E. g. : I have written a small utility reading Medline queries results and
dumping them in a Postgres database for further analysis.

[ Two remarks before starting ; 1) I *know* that I'm reinventing the wheel
and that you can get those results in a nice XML format ; but the problem
is that I do not (yet) know XLM, and have no time writing and using an XML
parser, while I know how to parse a simple line-oriented format ; and 2)
no, the possible fields are not a known fixed set : Medline has alreadu
changer it's formats thrice in the las two years, so the reading routines
have to be dynamic. Alas ... ]

These results are a "field-by-field" enumeration. i.e eachrecord is a set
of <tag> - <Value> pairs (well, not exactly : some fields have more than
one line, but I'm scketchy ...). Some fields may be missing, some other may
be repeated.

The obvious solution is to create a temporary table with a set of keys
identifying record, tag and file source and a "value" column, which has
better to be of "text" type.

It is almost trivial to create (programactically) the structure of the main
table (a record identifier, one or two columns identifying the source, an a
column for each field present at most once in the DB) and the auxilliary
tables containing repeated fields.

But this table has initially all values "text", which is a) Horribly
inefficient, b) hard to use from outer programs (M$ tools will choke on an
indexed text field) and c) quite inadapted to certain queries (e. g.
comparisons of dates.

Furthermore, this kind of modifications cannot always be made t read-time.
It happens that only the *semantics* of the problem at hand allows to
choose the "right* column type. In my example, the date of last Medline
indexing is formally a timestamp ; but for all practical purposes, a date
is more than sufficient.

"drop columns" allows to fix this dynmically, along the line of "alter
table x add column new_y newtype; update x set new_y=y::newtype, alter
table x drop column y; alter table x rename column new_y to y;". This has
two side effects :

The first one is unimportant (at least from a relational point of view) :
the order of columns in the table is lost.

The other one is *most* important : any Postgres object using the column
definition (I'm thinking of indexes, views and rules but there might be
others) will have to be redefined.

What I'm thinking of would be something along the line of "alter table x
alter column y retype to newtype", doing essentially the same as above, but
*keeping the old column id* and *propagating* changes as needed in defined
view and indices (maybe with a "cascade" option, deleting unupdtable
objects as needed ?).

Am I dreaming awake ? Or is this implementable ? An sufficiently "generally
useful" ?

Comments ?

Emmanuel Charpentier
--
Emmanuel Charpentier

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Brown 2003-02-11 02:45:19 Re: Irix 6.2, Postgres 7.3.1, some brokenness
Previous Message Peter Eisentraut 2003-02-10 19:13:29 Re: Irix 6.2, Postgres 7.3.1, some brokenness