From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: What is the maximum encoding-conversion growth rate, anyway? |
Date: | 2007-07-18 15:09:10 |
Message-ID: | 200707181509.l6IF9AE12790@momjian.us |
Lists: | pgsql-hackers |
This has been saved for the 8.4 release:
http://momjian.postgresql.org/cgi-bin/pgpatches_hold
---------------------------------------------------------------------------
Tatsuo Ishii wrote:
> The conclusion of the discussion appears to be that we could safely
> reduce MAX_CONVERSION_GROWTH from 4 to 3 with all existing built-in
> conversions.
>
> However, since user-defined conversions could have an arbitrary growth
> rate, it would probably be better to leave it as it is now.
>
> For 8.4, maybe we could change the conversion function's signature so that
> we don't need a fixed conversion rate, as Tom suggested.
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
>
> > Where are we on this?
> >
> > ---------------------------------------------------------------------------
> >
> > Tom Lane wrote:
> > > I just rearranged the code in mbutils.c a little bit to make it more
> > > robust if conversion of an over-length string is attempted, and noted
> > > this comment:
> > >
> > > /*
> > > * When converting strings between different encodings, we assume that space
> > > * for converted result is 4-to-1 growth in the worst case. The rates for
> > > * currently supported encoding pairs are within 3 (SJIS JIS X0201 half-width
> > > * kana -> UTF8 is the worst case). So "4" should be enough for the moment.
> > > *
> > > * Note that this is not the same as the maximum character width in any
> > > * particular encoding.
> > > */
> > > #define MAX_CONVERSION_GROWTH 4
> > >
> > > It strikes me that this is overly pessimistic, since we do not support
> > > 5- or 6-byte UTF8 characters, and AFAICS there are no 1-byte characters
> > > in any supported encoding that require 4 bytes in another. Could we
> > > reduce the multiplier to 3? Or even 2? This has a direct impact on the
> > > longest COPY lines we can support, so I'd like it not to be larger than
> > > necessary.
> > >
> > > regards, tom lane
> >
> > --
> > Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
> > EnterpriseDB http://www.enterprisedb.com
> >
> > + If your life is a hard drive, Christ can be your backup. +
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
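
To make the trade-off in the quoted discussion concrete, here is a minimal sketch of how the growth multiplier bounds the longest string that can be converted. It is not the code in mbutils.c. The constant SKETCH_MAX_ALLOC_SIZE is an assumption chosen to match PostgreSQL's 1 GB - 1 allocation limit, and all three helper functions are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stands in for PostgreSQL's 1 GB - 1 allocation limit. */
#define SKETCH_MAX_ALLOC_SIZE ((size_t) 0x3fffffff)

/*
 * Hypothetical helper: longest input (in bytes) whose worst-case converted
 * form, plus a trailing NUL byte, still fits in a single allocation.
 */
static size_t
max_convertible_input(size_t growth)
{
    assert(growth > 0);
    return (SKETCH_MAX_ALLOC_SIZE - 1) / growth;
}

/*
 * Hypothetical helper: bytes to reserve when converting len input bytes
 * under a fixed worst-case growth factor (one extra byte for the NUL).
 */
static size_t
conversion_buffer_size(size_t len, size_t growth)
{
    assert(growth > 0);
    return len * growth + 1;
}

/*
 * Hypothetical helper: number of bytes UTF-8 needs for one code point,
 * limited to the 1- to 4-byte forms.
 */
static size_t
utf8_encoded_length(unsigned long codepoint)
{
    if (codepoint < 0x80)
        return 1;
    if (codepoint < 0x800)
        return 2;
    if (codepoint < 0x10000)
        return 3;
    return 4;
}
```

Under these assumptions, a multiplier of 4 caps the input at roughly 256 MB, 3 at roughly 341 MB, and 2 at roughly 512 MB. The worst case cited in the comment is visible with utf8_encoded_length(0xFF71): a half-width katakana character occupies one byte in SJIS but three bytes in UTF-8.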