From: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | david(dot)g(dot)johnston(at)gmail(dot)com |
Cc: | tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Typmod associated with multi-row VALUES constructs |
Date: | 2016-12-06 01:36:39 |
Message-ID: | 20161206.103639.203449204.horiguchi.kyotaro@lab.ntt.co.jp |
Lists: | pgsql-hackers |
Hello,
At Mon, 5 Dec 2016 14:42:39 -0700, "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> wrote in <CAKFQuwZXyyPLaO0wyn94WihcjZCUsv8nr0FsCFrQ=oO1DkpBuA(at)mail(dot)gmail(dot)com>
> On Mon, Dec 5, 2016 at 2:22 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> > "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
> > > On Mon, Dec 5, 2016 at 1:08 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > >> In order to fix this, we first have to decide what the semantics ought
> > >> to be. I think there are two plausible definitions:
> > >> 1. If all the expressions in the VALUES column share the same typmod,
> > >> use that typmod, else use -1.
> > >> 2. Use -1 whenever there is more than one VALUES row.
> >
> > > Can we be precise enough to perform #2 if the top-level (or immediate
> > > parent) command is an INSERT - the existing table is going to enforce its
> > > own typemod anyway, otherwise go with #1?
> >
> > I dunno if that's "precise" or just "randomly inconsistent" ;-)
> >
>
> :)
>
> How does "targeted optimization" sound?
(Sorry, I don't understand what "targeted optimization" means here.)
FWIW, unlike the UNION case, I don't see a reason why the rows in
a VALUES clause should share anything with one another. In
principle neither typmod nor even type needs to be shared. (In
practice the type is shared, though.)
On the other hand, if we made every value strictly typed (I
mean that every value carries its own type information along
with it), VALUES could also respect those strict types. But
currently the following command ignores the type of the first value:
=# select 'bar'::varchar(4) || 'eeee';
 ?column?
----------
 bareeee
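A minimal sketch of why the result above is longer than four characters, assuming a stock PostgreSQL session: as far as I can tell, `||` resolves to a concatenation operator that returns text, so the length limit of the varchar(4) operand does not carry over to the result. Only the built-in pg_typeof is used; nothing else is assumed.

```sql
-- The concatenation returns text, which carries no length limit.
select pg_typeof('bar'::varchar(4) || 'eeee');      -- text

-- The typmod only takes effect when the value is cast to varchar(4);
-- an explicit cast silently truncates to four characters.
select ('bar'::varchar(4) || 'eeee')::varchar(4);   -- bare
```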
> > > Lacking that possibility I'd say that documenting that our treatment of
> > > typemod in VALUES is similar to our treatment of typemod in function
> > > arguments would be acceptable. This suggests a #3 - simply use "-1"
> > > regardless of the number of rows in the VALUES expression.
> >
> > I'm a bit concerned about whether that would introduce overhead that we
> > avoid today, in particular for something like
> >
> > insert into foo (varchar20col) values ('bar'::varchar(20));
> >
> > I think if we throw away the knowledge that the VALUES row produces the
> > right typmod already, we'd end up adding an unnecessary runtime coercion
> > step.
Do you mean something like this?
insert into foo (varchar20col)
select a::varchar(20) from (values ('barrrrrrrrrrrrrrrrrrrrr')) as v (a);
I'm not sure what the SQL standard says here, but what I have
in mind is something like the following:
| FROM (
| VALUES (row 1), .. (row n))
| AS foo (colname *type*, ..)
For this case, that would be:
| create temporary table product_codes as select *
| from (
| values
| ('abcdefg'),
| ('012345678901234567ABCDEFGHIJKLMN')
| ) csv_data (product_code character varying(20));
I have hit errors from writing this invalid syntax several times myself :(
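For comparison, a sketch of two spellings that are accepted today, as far as I know. The type cannot be given in the alias of a VALUES subquery, so it has to come either from a cast or from the target table. The table and column names are taken from the example above, except product_codes_strict, which is a made-up name for illustration. Note that the two variants treat the over-long value differently.

```sql
-- Variant 1: name the column in the alias and cast in the select list.
-- An explicit cast to varchar(20) silently truncates the second value.
create temporary table product_codes as
select product_code::varchar(20) as product_code
from (
    values
        ('abcdefg'),
        ('012345678901234567ABCDEFGHIJKLMN')
) as csv_data (product_code);

-- Variant 2: declare the column type on the table, then insert.
-- The column's typmod is enforced on assignment, so the over-long
-- second value is rejected with an error instead of being truncated.
create temporary table product_codes_strict (product_code varchar(20));
insert into product_codes_strict (product_code)
values
    ('abcdefg'),
    ('012345678901234567ABCDEFGHIJKLMN');
```

Variant 2 is closer to the behavior David asks for (fail rather than truncate), at the cost of a separate CREATE TABLE step.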
> Unnecessary maybe, but wouldn't it be immaterial given we are only able to
> be efficient when inserting exactly one row.
>
> There is also a #4 here to consider - if the first (or any) row is not type
> unknown, and the remaining rows are all unknown, use the type and typemod
> of the known row AND attempt coerce all of the unknowns to that same type.
> I'd suggest this is probably the most user-friendly option (do as I mean,
> not as I say). The OP query would then fail since the second literal is
> too long to fit in a varchar(20) - I would not want the value truncated so
> an actual cast wouldn't work.
1 has a type of int, 1.0 has a type of float and '1' has a type
of text. So I don't see a situation where only the first row is
detectably typed. Or do you mean that only the first row is
explicitly typed? I agree that would be an option, but I prefer
the syntax above (#5?) for the same purpose.
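For reference, a small sketch of how bare literals are typed, using the built-in pg_typeof. As far as I can tell, a decimal literal is reported as numeric, and a quoted literal is reported as unknown rather than text until its context resolves it, which seems to be the case #4 is about.

```sql
select pg_typeof(1)   as int_literal,      -- integer
       pg_typeof(1.0) as decimal_literal,  -- numeric
       pg_typeof('1') as quoted_literal;   -- unknown
```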
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center