Re: UTF8 national character data type support WIP patch and list of open issues.

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Boguk, Maksym" <maksymb(at)fast(dot)au(dot)fujitsu(dot)com>
Cc:	Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: UTF8 national character data type support WIP patch and list of open issues.
Date:	2013-09-04 14:28:42
Message-ID:	904.1378304922@sss.pgh.pa.us
Lists:	pgsql-hackers

"Boguk, Maksym" <maksymb(at)fast(dot)au(dot)fujitsu(dot)com> writes:
> Hi, my task is implementing ANSI NATIONAL character string types as
> part of PostgreSQL core.

No, that's not a given. You have a problem to solve, i.e., store some UTF8
strings in a database that's mostly just 1-byte data. It is not clear
that NATIONAL CHARACTER is the best solution to that problem. And I don't
think that you're going to convince anybody that this is an improvement in
spec compliance, because there's too much gap between what you're doing
here and what it says in the spec.

> Both of these approaches require a dump/restore of the whole database,
> which is not always an option.

That's a disadvantage, agreed, but it's not a large enough one to reject
the approach, because what you want to do also has very significant
disadvantages.

I think it is extremely likely that we will end up rejecting a patch based
on NATIONAL CHARACTER altogether. It will require too much duplicative
code, it requires too many application-side changes to make use of the
functionality, and it will break any applications that are relying on the
current behavior of that syntax. But the real problem is that you're
commandeering syntax defined in the SQL spec for what is in the end quite
a narrow usage. I agree that the use-case will be very handy for some
applications ... but if we were ever to try to achieve real spec
compliance for the SQL features around character sets, this doesn't look
like a step on the way to that.

I think you'd be well advised to take a hard look at the
specialized-database-encoding approach. From here it looks like a 99%
solution for about 1% of the effort; and since it would be quite
uninvasive to the system as a whole, it's unlikely that such a patch
would get rejected.

regards, tom lane

In response to

Re: UTF8 national character data type support WIP patch and list of open issues. at 2013-09-04 00:13:09 from Boguk, Maksym

Responses

Re: UTF8 national character data type support WIP patch and list of open issues. at 2013-09-16 12:49:52 from MauMau
