From: | Darren Duncan <darren(at)darrenduncan(dot)net> |
---|---|
To: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
Cc: | Andrew Dunstan <andrew(at)dunslane(dot)net>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: VARIANT / ANYTYPE datatype |
Date: | 2011-05-06 21:54:36 |
Message-ID: | 4DC46E1C.3070405@darrenduncan.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thanks for picking up on my mentioning union types; I wasn't sure if anyone did.
Merlin Moncure wrote:
> On Fri, May 6, 2011 at 3:18 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>
>> On 05/06/2011 04:08 PM, Alvaro Herrera wrote:
>>> Excerpts from Darren Duncan's message of mié may 04 15:33:33 -0300 2011:
>>>
>>>> I see VARIANT/ANYTYPE as the most general case of supporting union types,
>>>> which,
>>>> say, could have more specific examples of "allow any number or date here
>>>> but
>>>> nothing else". If VARIANT is supported, unions in general ought to be
>>>> also.
>>> Okay, so aside from the performance (storage reduction) gained, there's
>>> this argument for having variant/union types. It seems to me that this
>>> is indeed possible to build. Completely general VARIANT, though, is
>>> rather complex. A declared union, where you specify exactly which types
>>> can be part of the union, can be catalogued, so that the system knows
>>> exactly where to look when a type needs to be modified. A general
>>> VARIANT however looks complex to me to solve.
>>>
>>> The problem is this: if an user attempts to drop a type, and this type
>>> is used in a variant somewhere, we would lose the stored data. So the
>>> drop needs to be aborted. Similarly, if we alter a type (easy example:
>>> a composite type) used in a variant, we need to cascade to modify all
>>> rows using that composite.
>>>
>>> If the unions that use a certain type are catalogued, we at least know
>>> what tables to scan to cascade.
>>>
>>> In a general variant, the system catalogs do not have the information of
>>> what type each variant masquerades as. We would need to examine the
>>> variant's masqueraded types on each insert; if the current type is not
>>> found, add it. This seems a bit expensive.
>> So how is a declared union going to look and operate? Something like this?
>>
>> CREATE TYPE foo AS UNION (ival int, tval text, tsval timestamptz):
>> CREATE TABLE bar (myunion foo);
>> INSERT INTO bar (myunion) VALUES (ival=>1), (tval=>'some text');
>> UPDATE bar SET myunion.tsval = CURRENT_TIMESTAMP;
>>
>>
>> Something like that could actually be quite nice for a number of purposes.
>
> using your hypothetical example, could you cast types to the union?
>
> select 1::int::foo;
>
> record types would presumably work as well? you could do a lot of
> *really* neat stuff that way...
Like with other respondents to this topic, I consider it much more useful and
important, as well as practical, to support explicitly defined type unions than
a wide-open ANYTYPE.
As for how it would operate: The example above implies a union type
implementation that is like C's concept of such, where you have to explicitly
state how you want the value bit pattern to be interpreted, by naming ival/etc,
rather than the DBMS just knowing that a particular value is of a specific type,
because per-value stored metadata says so (like with SQLite).
While that might be best in practice for implementation, I had envisioned
something more like set unions, so you could instead do it like this:
CREATE TYPE foo AS UNION (int, text, timestamptz):
CREATE TABLE bar (myunion foo);
INSERT INTO bar (myunion) VALUES (1), ('some text');
UPDATE bar SET myunion = CURRENT_TIMESTAMP;
Unlike a record type, where multiple attributes may have the same time,
presumably with a union, they would all be distinct, and so you could use the
type name itself to refer to each option; you don't have to make up "ival" or
whatever ... unless there are situations where types don't have names.
When doing operations that are type-generic, such as equality test or
assignment, especially assignment, between 2 things that are both declared to be
of type foo, you could just do it with no verbosity same as if you were doing 2
int or text or whatever.
When doing operations specific to int or text or whatever, or if you are
assigning a foo-declared thing to an int/text-declared thing, then you'd want an
explicit cast or assertion, such as "select myunion::int + 3 as answer from bar
where is_a(myunion,'int')".
Whether you want an explicit cast to go the other way, I would assume you don't
need to, like when you have DOMAINs; eg, I would expect the 4th line above to
just work, because the system knows the type of CURRENT_TIMESTAMP and it knows
that this is a member of the union type of myunion. I see a UNION type as being
like a DOMAIN type in reverse.
-- Darren Duncan
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2011-05-06 21:58:04 | Re: patch for new feature: Buffer Cache Hibernation |
Previous Message | Robert Haas | 2011-05-06 21:54:09 | Re: Prefered Types |