Re: Inconsistent behavior on Array & Is Null?

From: Joe Conway <mail(at)joeconway(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Inconsistent behavior on Array & Is Null?
Date: 2004-04-03 20:03:14
Message-ID: 406F1882.2040904@joeconway.com
Lists: pgsql-hackers

Greg Stark wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>>I agree. I had always envisioned something exactly like that once we supported
>>NULL elements. As far as the implementation goes, I think it would be very
>>similar to tuples -- a null bitmask that would exist if any elements are NULL.
>
> Well, you might still want to store an internal "all indexes below this are
> null" marker. That way update foo set a[1000]=1 doesn't require storing even a
> bitmap for the first 999 elements. Maintaining the bitmap might be kind of a
> pain anyway, though, because unlike tuples the array size isn't constant.

I don't think it will be worth the complication to do anything other than a
straight bitmap -- at least not on the first attempt.
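To make the "straight bitmap" idea concrete, here is a small sketch in Python (the eventual backend code would of course be C packed into the varlena format; the class name and zero-based indexing here are illustrative assumptions, not the proposed implementation):

```python
# Sketch of a "straight bitmap" null representation for array elements:
# values are stored densely, and a separate bitmap (one bit per element)
# records which positions are NULL -- analogous to a tuple's null bitmask.

class NullBitmapArray:
    def __init__(self, elements):
        self.values = list(elements)
        self.null_bitmap = 0
        for i, v in enumerate(elements):
            if v is None:
                # Set bit i when element i is NULL.
                self.null_bitmap |= (1 << i)

    def is_null(self, i):
        # Note: zero-based here for simplicity; SQL arrays default to 1-based.
        return bool(self.null_bitmap & (1 << i))

    def get(self, i):
        return None if self.is_null(i) else self.values[i]

a = NullBitmapArray([1, None, 3])
print(a.is_null(1))  # True
print(a.get(2))      # 3
```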

>>A related question is how to deal with non-existing array elements. Until now,
>>you could do:
>
> I would have to think about it some more, but my first reaction is that
> looking up [0] should generate an error if there can never be a valid entry at
> [0]. But looking up indexes above the highest index should return NULL.
>
> There are two broad use cases I see for arrays. Using them to represent tuples
> where a[i] means something specific for each i, and using them to represent
> sets where order doesn't matter.
>
> In the former case I might want to initialize my column to an empty array and
> set only the relevant elements as needed. In that case, returning NULL for
> entries that haven't been set yet, whether they're above or below the last
> entry set, is most consistent.

Maybe, but you're still going to need to explicitly set the real upper
bound element in order for the length/cardinality to be correct. In
other words, if you really want an array with elements 1 to 1000, where 2
through 1000 are NULL, you'll need to explicitly set a[1000] = NULL;
otherwise we'll have no way of knowing that you really want 1000
elements. Perhaps we'll want some kind of array_init function to create
an array of a given size filled with all NULL elements (or even some
arbitrary constant element).
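The proposed array_init is only a suggestion at this point; no such function exists yet. A sketch of its intended semantics, modeled in Python (the name, argument order, and default fill are assumptions):

```python
# Hypothetical semantics for the proposed array_init(size, fill):
# produce an array of 'size' elements, each set to 'fill' (NULL by
# default), so the upper bound -- and thus the length -- is well defined
# without the caller having to assign the last element explicitly.

def array_init(size, fill=None):
    return [fill] * size

a = array_init(1000)   # 1000 NULL elements; length is unambiguously 1000
b = array_init(3, 0)   # three elements, all the constant 0
print(len(a))  # 1000
print(b)       # [0, 0, 0]
```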

Given the preceding, I'd think it would make more sense to throw an
error whenever trying to access an element beyond the length.

> In the latter case you really don't want to be looking up anything past the
> end and don't want to be storing NULLs at all. So it doesn't really matter
> what the behaviour is for referencing elements past the end, but you might
> conceivably want to write code like "while (e = a[i++]) ...".

See reasoning above. And if you did somehow wind up with a "real"
NULL element in this scenario, you'd never know about it. The looping
could always be:

    while (i++ <= length)

or

    for (i = 1; i <= length; i++)
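The hazard with the sentinel-style loop "while (e = a[i++])" is exactly the one described: a genuine NULL element silently terminates the loop early. A short Python illustration of the difference (the sample array is made up):

```python
# Why looping by length beats looping until a NULL sentinel:
# a "real" NULL element silently terminates the sentinel loop,
# hiding every element after it.

a = [10, None, 30]

# Sentinel style: stops at the first NULL and never sees 30.
seen_sentinel = []
i = 0
while i < len(a) and a[i] is not None:
    seen_sentinel.append(a[i])
    i += 1

# Length style: visits every element, NULL or not.
seen_by_length = [a[i] for i in range(len(a))]

print(seen_sentinel)   # [10]
print(seen_by_length)  # [10, None, 30]
```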

Joe
