Re: PostgreSQL Gotchas

From: Chris Travers <chris(at)travelamericas(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Chris Travers <chris(at)travelamericas(dot)com>, Scott Marlowe <smarlowe(at)g2switchworks(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, "Aly S(dot)P Dharshi" <aly(dot)dharshi(at)telus(dot)net>, "Gavin M(dot) Roy" <gmr(at)ehpg(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: PostgreSQL Gotchas
Date: 2005-10-13 21:04:39
Message-ID: 434ECBE7.7010608@travelamericas.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Ok. here are some indepth thoughts after reviewing as many prior threads
as I could find on the archives.

Tom Lane wrote:

>Chris Travers <chris(at)travelamericas(dot)com> writes:
>
>
>>Tom Lane wrote:
>>
>>
>>>Since the end reward for all this work would be having to read CATALOGS
>>>WRITTEN IN ALL UPPER CASE, none of the key developers seem very
>>>interested ...
>>>
>>>
>>>
>>Why would this be required?
>>
>>
>
>If you write, say,
>
> select max(relpages) from pg_class;
>
>
I have gone back and read the previous discussion and I still do not
understand what the real impediment is (to at least putting it on the
TODO list, at least). I see a lot of reasons that don't make any sense
to me, and have come across one substantial obstacle that has not been
mentioned yet to my knowledge.

I understood this exmaple from your previous comment. But aside from
the aggregate issue, I fail to see why fixing it is a requirement.
Perhaps I am being unclear in my thoughts and we are talking past eachother.

If I write that statement, and it gets back an error saying that no
table is named PG_CLASS, that is my fault as an individual developer.
After all I set the configuration option to fold to uppercase, right?

The relevant question is very simple. Where does one draw the line
between between the responsibility of the programmer and the
responsibility of the DBA? Personally I think it is important to offer
modes that offer as much standards-compliance as possible.

From the previous discussion, it was mentioned that the backend treats
identifiers as quoted. It seems to me that this should make it *easier*
rather than *harder* to impliment because this is largely a change
regarding what a given SQL statement means. I.e. most of the work I had
foreseen has already been done. (My proposal would have been to have
the backend treat identifiers used internally as already double-quoted.)

If you are folding to lower case, your example, select max(relpages)
from pg_class, is the same as SELECT MAX("relpages") FROM "pg_class";

If you are folding to upper case, your example is the same as SELECT
MAX("RELPAGES") FROM "PG_CLASS"; Of course, we don't expect this to
give us any results today. In essence, I don't see why we would expect
this to return any results. If you issue an incorrect SQL statement,
whose fault is that?

Now, the one place where this might create a problem is in the
information_schema. The problem here is not the same as any issue I
have seen discussed before, but the fact that case folding could create
non-standard behavior here absent other changes. The only option I see
here is to create a second INFORMATION_SCHEMA with upper case view names.

>and the lexer thinks that it should fold unquoted identifiers to upper
>case, then the catalog entries defining these names had better read
>PG_CLASS, RELPAGES, and MAX, not the lower-case names they contain
>today.
>
Ok. so the only problem out of these three that I see is with MAX.
PG_CLASS and RELPAGES are the responsibility of the developer, IMHO.
Also, with functions and aggregates, this is not the problem that it is
with tables (as the name isn't usually sent back to the client), so I
don't know how much logic it would be to differentiate between
table/column names which might need to be folded and
function/aggregates which could continue the way that they are currently
done at least for now. It might also be possible to create duplicate
entries to the catelogs for builtin functions/aggregates in the catalogs
so that this case folding is not causing the same type of problem.
However, for builtin functions/aggregates, I am not sure if this is
likely to have any significant performance hit.

> So this wouldn't be something you could flip on-the-fly --- at
>the latest, an installation would have to commit to upper or lower case
>at initdb time, because the initial contents of all the system catalogs
>would need to match the choice.
>
>
Ok, so I see the objection basically being that changing the semantics
of the SQL statement would need to be done in a way that prevents
user-issued queries from having to know which way this is done. There
is no way around this objection because it is inconsistant with the very
idea of semantic changes to the SQL parser. Yet we have done this in
the past in areas I have previously mentioned. So the question really
is what is really required to make this work in a semantically clean
way. IMO, the following requirement is acceptable:

SELECT MAX("relpages") FROM "pg_class"

But the following is not:

SELECT "max"("relpages") FROM "pg_class"

However, I am sure that there will be people who don't see that as OK.
So, I would suggest that if they neeed to avoid quoting relpages and
pg_class, then they can create a PG_CATALOG schema and a PG_CLASS view.
Maybe this could be a pgfoundry project even. But I don't see it as a
requirement of the core team. The bigger issue is with "MAX" v. "max"
and in "INFORMATION_SCHEMA" v. "information_schema." This might
require duplicate entries in the system catalogs.

>Please read the previous discussions on the topic, if you want to
>pontificate about it.
>
>
Is there still any specific reason why this does not belong on the TODO
list?

I am not arguing that this should be a priority in development. I am
however arguing that since the behavior is non-standard, it might be
worth acknowledging this and at least suggesting that it should be fixed
at some indefinite point in the future.... If this was a requirement
for me, I would hire someone to make the changes and submit a patch...
It is just an attempt to ensure that it is on the roadmap at this time.

Best Wishes,
Chris Travers
Metatron Technology Consulting

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Chris Travers 2005-10-13 21:16:00 Re: PostgreSQL Gotchas
Previous Message han.holl 2005-10-13 20:49:39 Re: How to inject knowledge into a Postgres database