Re: UTF8 national character data type support WIP patch and list of open issues.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Boguk, Maksym" <maksymb(at)fast(dot)au(dot)fujitsu(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UTF8 national character data type support WIP patch and list of open issues.
Date: 2013-09-04 14:28:42
Message-ID: 904.1378304922@sss.pgh.pa.us
Lists: pgsql-hackers

"Boguk, Maksym" <maksymb(at)fast(dot)au(dot)fujitsu(dot)com> writes:
> Hi, my task is implementing ANSI NATIONAL character string types as
> part of PostgreSQL core.

No, that's not a given. You have a problem to solve, i.e., store some UTF8
strings in a database that's mostly just 1-byte data. It is not clear
that NATIONAL CHARACTER is the best solution to that problem. And I don't
think that you're going to convince anybody that this is an improvement in
spec compliance, because there's too much gap between what you're doing
here and what it says in the spec.

> Both of these approaches require a dump/restore of the whole database,
> which is not always an option.

That's a disadvantage, agreed, but it's not a large enough one to reject
the approach, because what you want to do also has very significant
disadvantages.

I think it is extremely likely that we will end up rejecting a patch based
on NATIONAL CHARACTER altogether. It will require too much duplicative
code, it will require too many application-side changes to make use of the
functionality, and it will break any applications that are relying on the
current behavior of that syntax. But the real problem is that you're
commandeering syntax defined in the SQL spec for what is in the end quite
a narrow usage. I agree that the use-case will be very handy for some
applications ... but if we were ever to try to achieve real spec
compliance for the SQL features around character sets, this doesn't look
like a step on the way to that.

I think you'd be well advised to take a hard look at the
specialized-database-encoding approach. From here it looks like a 99%
solution for about 1% of the effort; and since it would be quite
noninvasive to the system as a whole, it's unlikely that such a patch
would get rejected.

regards, tom lane
