Re: 7.2 - changed array_out() - quotes vs no quotes

From: David Gould <dg(at)nextbus(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Elein <elein(at)mail(dot)office(dot)nextbus(dot)com>, Laurette Cisneros <laurette(at)mail(dot)office(dot)nextbus(dot)com>, David Gould <dg(at)mail(dot)office(dot)nextbus(dot)com>
Subject: Re: 7.2 - changed array_out() - quotes vs no quotes
Date: 2002-02-08 08:02:36
Message-ID: 20020208000236.A27779@crown.corp.nextbus.com
Lists: pgsql-hackers

On Fri, Feb 08, 2002 at 12:29:46AM -0500, Tom Lane wrote:
> David Gould <dg(at)nextbus(dot)com> writes:
> > Somewhere after 7.2b2 (it looks like for 7.2b4) a change was made to
> > array_out() in:
>
> I will take the blame for that.

Well, it would have been nice if it had been in the release notes...

> > I view this change as a bug and would like to see it backed out.
>
> The old behavior was certainly broken and I will not accept a proposal
> to back out the change entirely.

I am not defending the old behaviour, just the fact that it _was_
predictable. It was simple to know which types are pass by ref and it is
simple to allow for it. I am not saying that pass by ref is the right
criterion, just that predictable is good.

> What you are really saying is that you'd prefer the choice of quotes or
> no quotes to be driven by the datatype rather than by the data value.

Yes. I think it is not excessive to insist that types have stable,
predictable representations. The other types do, why should arrays be
even more special?

> That's a legitimate gripe, but where shall we get the knowledge of whether
> the datatype might sometimes emit strings that need quoting? Using the

Hmmm, maybe the type designers should deal with this. And if they don't,
well, quote 'em.

Or, you don't even need the quotes, you could just promise never
to insert white space and to always escape embedded commas and curly braces.
So a dumb client could simply split on un-escaped commas and be done.
I could live with that as long as it never changed again, at least without
some warning.
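
As a rough illustration of how little a client would need under that promise,
here is a minimal Python sketch. It assumes a hypothetical output format (not
what any release actually emits): one-dimensional arrays only, no quotes, no
padding white space, and a backslash in front of every literal comma, curly
brace, or backslash. The function name split_flat_array is made up for this
example.

```python
def split_flat_array(text):
    """Split a flat array literal such as {a,b\\,c} into its elements.

    Assumed (hypothetical) format: no quotes, no padding white space, and a
    backslash before any literal comma, curly brace, or backslash.
    Note that {} is read as an empty array, so this format cannot tell an
    empty array apart from an array holding one empty string.
    """
    if len(text) < 2 or text[0] != "{" or text[-1] != "}":
        raise ValueError("expected a literal wrapped in curly braces")
    body = text[1:-1]
    if body == "":
        return []

    elements = []
    current = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # An escape: take the next character literally.
            if i + 1 >= len(body):
                raise ValueError("dangling backslash")
            current.append(body[i + 1])
            i += 2
        elif ch == ",":
            # An un-escaped comma ends the current element.
            elements.append("".join(current))
            current = []
            i += 1
        elif ch in "{}":
            # Un-escaped braces would mean nesting, which this sketch skips.
            raise ValueError("nested arrays are not handled")
        else:
            current.append(ch)
            i += 1
    elements.append("".join(current))
    return elements
```

With this, split_flat_array("{a,b,c}") gives three elements and an escaped
comma stays inside its element. Multi-dimensional arrays would need a little
more, but the splitting rule itself stays the same.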

> pass-by-reference flag is *completely* wrong; the fact that it chanced
> not to fail in your application does not make it less wrong.

"chanced not to fail". Nice.

> The only way I could see to make the behavior totally predictable at
> the datatype level (while not being broken) is to always quote every
> array element. However, that would likely break some other people's

Fine with me. That is what it did before.

> oversimplified parsers, so I'm not convinced it's a net win. Perhaps
> you should fix your application code rather than relying on a
> never-documented behavior.

"relying on a never-documented behavior" ..., so it is now my fault there
is no documentation? Heh.

Besides, this isn't even strictly true: in the HTML documentation
(doc/arrays.html "chapter 6") that ships with 7.2 there is a nice example:

SELECT schedule[1:2][1:1] FROM sal_emp WHERE name = 'Bill';

schedule
--------------------
{{"meeting"},{""}}
(1 row)

which shows quotes around the strings. Likewise, the doc quotes strings in
every example of an insert. Further, it says:

"Observe that to write an array value, we enclose the element values
within curly braces and separate them by commas. If you know C, this
is not unlike the syntax for initializing structures."

This suggests that the syntax should be similar to C, which requires
strings to be quoted.

And, since the "undocumented behaviour" has been there since the dawn
of time, or at least since:

src/backend/utils/adt/arrayfuncs.c
Revision 1.1, Tue Jul 9 06:22:03 1996 UTC (5 years, 7 months ago) by scrappy

it is likely that _not_ _changing_ _it_ would not break any of the last
six years' worth of existing "other people's oversimplified parsers".

... breathes ...

I don't mean to be rude about this, I have a lot of respect for postgres,
all of the postgres developers and the overall process.

But to slip a client-visible change, late in a beta cycle, into a specific
format that has been stable since UC Berkeley freed the code, and then
to suggest that it is just some silly user relying on
"never-documented behavior" is almost comical.

Seriously, one point of a database is to insulate client applications
from the exact representation and layout of the data. That is not
accomplished by making arbitrary changes to simple things like strings
so that they take yards and yards of code to parse.

I think a clearly explained rule about what types are quoted (and either
quoted or not quoted, not sometimes this and sometimes that) would be
a nice addition to the documentation, especially if the code then
was made to match it and the change was in the release notes.
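
To make the kind of rule I mean concrete, here is a small Python sketch of
type-driven quoting. It is only an illustration, not how array_out() works:
the table of always-quoted type names and the function format_array are
hypothetical, and it handles one-dimensional arrays of already-formatted
element strings only.

```python
# Hypothetical table: element types whose values are always quoted.
ALWAYS_QUOTED_TYPES = {"text", "varchar", "bpchar"}


def format_array(elements, type_name):
    """Format element strings as an array literal.

    Quoting depends only on the element type, never on the data value,
    so a client can know the output shape in advance.
    """
    quote = type_name in ALWAYS_QUOTED_TYPES
    parts = []
    for value in elements:
        if quote:
            # Escape backslashes first, then double quotes.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            parts.append('"' + escaped + '"')
        else:
            parts.append(value)
    return "{" + ",".join(parts) + "}"
```

Under this rule a text array always comes back quoted, even when an element
is empty or contains nothing special, and an int4 array never does.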

But, as it stands, I still see a bug.

-dg

--
David Gould dg(at)nextbus(dot)com (or davidg(at)dnai(dot)com)
'Some people, when confronted with a problem, think "I know, I'll use
regular expressions". Now they have two problems.' -- Jamie Zawinski
