From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru> |
Cc: | amul sul <sulamul(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Alex Ignatov <a(dot)ignatov(at)postgrespro(dot)ru>, Bruce Momjian <bruce(at)momjian(dot)us>, amul sul <sul_amul(at)yahoo(dot)co(dot)in>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Bug in to_timestamp(). |
Date: | 2016-11-04 18:36:37 |
Message-ID: | 8744.1478284597@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru> writes:
> I attached new version of the patch, which fix is_char_separator()
> declaration too.
I did some experimenting using
http://rextester.com/l/oracle_online_compiler
It appears that Oracle will consider a single space in the pattern
to match zero or more spaces in the input, as all of these produce
the expected result:
SELECT to_timestamp('2000JUN', 'YYYY MON') FROM dual
SELECT to_timestamp('2000 JUN', 'YYYY MON') FROM dual
SELECT to_timestamp('2000 JUN', 'YYYY MON') FROM dual
Also, a space in the pattern will match a single separator character
in the input, but not multiple separators:
SELECT to_timestamp('2000-JUN', 'YYYY MON') FROM dual
-- ok
SELECT to_timestamp('2000--JUN', 'YYYY MON') FROM dual
ORA-01843: not a valid month
And you can have whitespace along with that single separator:
SELECT to_timestamp('2000 + JUN', 'YYYY MON') FROM dual
-- ok
SELECT to_timestamp('2000 + JUN', 'YYYY MON') FROM dual
-- ok
SELECT to_timestamp('2000 ++ JUN', 'YYYY MON') FROM dual
ORA-01843: not a valid month
You can have leading whitespace, but not leading separators:
SELECT to_timestamp(' 2000 JUN', 'YYYY MON') FROM dual
-- ok
SELECT to_timestamp('/2000 JUN', 'YYYY MON') FROM dual
ORA-01841: (full) year must be between -4713 and +9999, and not be 0
These all work:
SELECT to_timestamp('2000 JUN', 'YYYY/MON') FROM dual
SELECT to_timestamp('2000JUN', 'YYYY/MON') FROM dual
SELECT to_timestamp('2000/JUN', 'YYYY/MON') FROM dual
SELECT to_timestamp('2000-JUN', 'YYYY/MON') FROM dual
but not
SELECT to_timestamp('2000//JUN', 'YYYY/MON') FROM dual
ORA-01843: not a valid month
SELECT to_timestamp('2000--JUN', 'YYYY/MON') FROM dual
ORA-01843: not a valid month
which makes it look a lot like Oracle treats separator characters in the
pattern the same as spaces (but I haven't checked their documentation to
confirm that).
The proposed patch doesn't seem to me to be trying to follow
these Oracle behaviors, but I think there is very little reason for
changing any of this stuff unless we move it closer to Oracle.
Some other nitpicking:
* I think the is-separator function would be better coded like
static bool
is_separator_char(const char *str)
{
/* ASCII printable character, but not letter or digit */
return (*str > 0x20 && *str < 0x7F &&
!(*str >= 'A' && *str <= 'Z') &&
!(*str >= 'a' && *str <= 'z') &&
!(*str >= '0' && *str <= '9'));
}
The previous way is neither readable nor remarkably efficient, and it
knows much more about the ASCII character set than it needs to.
* Don't forget the cast to unsigned char when using isspace() or other
<ctype.h> functions.
* I do not see the reason for throwing an error here:
+ /* Previous character was a backslash */
+ if (in_backslash)
+ {
+ /* After backslash should go non-space character */
+ if (isspace(*str))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid escape sequence")));
+ in_backslash = false;
Why shouldn't backslash-space be a valid quoting sequence?
I'll set this back to Waiting on Author.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2016-11-04 19:15:33 | Re: Mention column name in error messages |
Previous Message | Doug Doole | 2016-11-04 17:52:25 | Re: Improving executor performance |