From: | Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Lexing with different charsets |
Date: | 2004-04-13 16:57:48 |
Message-ID: | Pine.LNX.4.44.0404131843530.4551-100000@zigo.dhs.org |
Lists: | pgsql-hackers |
I've spent some more time reading specs today. Together with Peter E's
explanation (Thanks!) I think I've got a fairly good understanding of the
parts talking about locales now.
My next question is about lexing. The spec says that one can use strings
of different charsets in the queries, like:
... WHERE field1 = _latin1'FooBar' and field2 = _utf8'Åäö'
I can see that the lexer either needs to be taught about all the
different charsets or this is not going to work very well.
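
To make the problem concrete, here is a minimal sketch of a byte-level scan
for such literals. It is an illustration only, not how the PostgreSQL lexer
works: the regular expression and the function name decode_literals are
made up for this example, the charset names are simply passed to Python's
codec registry, and doubled-quote escapes are ignored. It only works for
charsets in which the byte 0x27 can never be anything but a quote.

```python
import codecs
import re

# Hypothetical pattern: an introducer such as _latin1 or _utf8,
# immediately followed by a single-quoted literal. Escaped quotes
# ('') are deliberately not handled in this sketch.
INTRODUCED_LITERAL = re.compile(rb"_([A-Za-z0-9]+)'([^']*)'")


def decode_literals(query: bytes) -> list[tuple[str, str]]:
    """Return (charset, decoded text) for each introduced literal."""
    results = []
    for match in INTRODUCED_LITERAL.finditer(query):
        charset = match.group(1).decode("ascii")
        codecs.lookup(charset)  # raises LookupError for an unknown charset
        results.append((charset, match.group(2).decode(charset)))
    return results


# The example query from above, with the second literal stored as utf-8.
query = (
    b"... WHERE field1 = _latin1'FooBar' and field2 = _utf8'"
    + "Åäö".encode("utf-8")
    + b"'"
)
literals = decode_literals(query)
```

Each literal has to be decoded with its own charset, so the scanner needs
at least a table of the charsets it is willing to accept.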
What if one wants to include a string in utf-16 in the query? The lexer
cannot handle that without understanding utf-16. The query itself can also
be in different charsets. If it's in utf-8, for example, then we cannot
embed latin1 strings and still have a query that validates as utf-8. With
the above we cannot think of the query as being in a single charset
anymore. That's strange but okay, I guess.
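
Both points can be checked with a few lines of code. This is only a sketch
under assumptions: the query is built by hand as raw bytes, and the helper
is_valid_utf8 is a made-up name, not part of any PostgreSQL interface.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as utf-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A query whose first literal is stored as latin1 and whose second
# literal is stored as utf-8.
mixed_query = (
    b"... WHERE field1 = _latin1'"
    + "Åäö".encode("latin-1")
    + b"' and field2 = _utf8'"
    + "Åäö".encode("utf-8")
    + b"'"
)

# The same query with both literals stored as utf-8.
utf8_only_query = (
    b"... WHERE field1 = _utf8'"
    + "Åäö".encode("utf-8")
    + b"' and field2 = _utf8'"
    + "Åäö".encode("utf-8")
    + b"'"
)

mixed_is_valid = is_valid_utf8(mixed_query)
utf8_only_is_valid = is_valid_utf8(utf8_only_query)

# In utf-16 the byte 0x27 (the ASCII quote) can occur inside another
# character, so a byte-oriented lexer cannot find the end of the
# literal. U+0127 (ħ) is one such character.
utf16_bytes = "ħ".encode("utf-16-le")
contains_quote_byte = b"'" in utf16_bytes
```

The latin1 bytes make the whole query fail utf-8 validation, and the utf-16
bytes contain something that looks like a closing quote to a lexer that
does not know the charset.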
The new wire protocol allows us to send data separately from the query,
which is nice, but the standard talks about strings as above, so it's not
a solution to the problem.
Maybe I should have addressed this to Peter directly :-)
--
/Dennis Björklund