Re: patch (for 9.1) string functions

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Erik Rijkers <er(at)xs4all(dot)nl>
Subject: Re: patch (for 9.1) string functions
Date: 2010-07-26 15:07:50
Message-ID: AANLkTikTma8hZfsFyDtN+We1fXW8Zj82KPyi0X_uv6XQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 26, 2010 at 10:39 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Mon, Jul 26, 2010 at 9:26 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Mon, Jul 26, 2010 at 9:10 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>>>> CONCAT('foo', NULL) => 'foo' really the behavior that everyone else
>>>> implements here?  And why does CONCAT() take a variadic "ANY"
>>>> argument?  Shouldn't that be variadic TEXT?
>>>
>>> What does that accomplish, besides forcing you to sprinkle every
>>> concat call with text casts (maybe that's not a bad thing?)?
>>
>> You could ask the same thing about the existing || operator.  And in
>> fact, we used to have that behavior.  We changed it in 8.3.  Perhaps
>> that was a good decision and perhaps it wasn't, but I don't think
>> using CONCAT() to make an end-run around that decision is the way to
>> go.
>
> It was absolutely a good decision because it prevented type inference
> in ways that were ambiguous or surprising (for a canonical case see:
> http://www.mail-archive.com/pgsql-general(at)postgresql(dot)org/msg93224.html)
>
> || operator is still pretty tolerant in the 8.3+ world.
> select interval_col || bool_col; -- error
> select interval_col::text || bool_col; -- text concatenation
> select text_col || interval_col || bool_col; -- text concatenation
>
> variadic text would require text casts on EVERY non 'unknown' argument
> which drops it below the threshold of usefulness IMO -- it would be
> far stricter than vanilla || concatenation.  Ok, pure bikeshed here
> (shutting my trap now!), but concat() is one of those wonder functions
> that you want to make as usable and terse as possible.  I don't see
> the value in making it overly strict.

I'm just very skeptical that we should give our functions argument
types that are essentially fantasy. CONCAT() doesn't concatenate
integers or intervals or boxes: it concatenates strings, and only
strings. Surely the right fix, if our casting rules are too
restrictive, is to fix the casting rules; not to start adding a bunch
of kludgery in every function we define.

The problem with your canonical example (and the other examples of
this type I've seen) is that they are underspecified. Given two
identically-named operators, one of which accepts types T1 and T2, and
the other of which accepts types T3 and T4, what is one to do with T1
OP T4? You can cast T1 to T3 and call the first operator or you can
cast T4 to T2 and call the second one, and it's really not clear which
is right, so you had better thrown an error. The same applies to
functions: given foo(text) and foo(date) and the call
foo('2010-04-15'), you had better complain, or you may end up with a
very sad user. But sometimes our current casting rules require casts
in situations where they don't seem necessary, such as
LPAD(integer_column, '0', 5). That's not ambiguous because there's
only one definition of LPAD, and the chances that the user will be
unpleasantly surprised if you call it seem quite low.

In other words, this problem is not unique to CONCAT().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Merlin Moncure 2010-07-26 15:39:40 Re: patch (for 9.1) string functions
Previous Message Robert Haas 2010-07-26 14:45:36 Re: SSL cipher and version