From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
Cc: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: ICU integration |
Date: | 2017-02-16 06:13:57 |
Message-ID: | CAH2-Wzkt_bf5yBeoNMB95NG40tg_UNNvCpYLzhmAoNijG3LMPA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Feb 15, 2017 at 9:17 PM, Peter Eisentraut
<peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> I have changed it to use ucol_nextSortKeyPart() when appropriate.
I think it would be fine if the second last argument was
"sizeof(Datum)", not "Min(sizeof(Datum), sss->buflen2)" here:
> + if (GetDatabaseEncoding() == PG_UTF8)
> + {
> + UCharIterator iter;
> + uint32_t state[2];
> + UErrorCode status;
> +
> + uiter_setUTF8(&iter, sss->buf1, len);
> + state[0] = state[1] = 0; /* won't need that again */
> + status = U_ZERO_ERROR;
> + bsize = ucol_nextSortKeyPart(sss->locale->info.icu.ucol,
> + &iter,
> + state,
> + (uint8_t *) sss->buf2,
> + Min(sizeof(Datum), sss->buflen2),
> + &status);
> + if (U_FAILURE(status))
> + ereport(ERROR,
> + (errmsg("sort key generation failed: %s", u_errorName(status))));
> + }
Does this really need to be in the strxfrm() loop at all? I don't see
why you need to ever retry here.
There is an issue of style here:
> - /* Just like strcoll(), strxfrm() expects a NUL-terminated string */
> memcpy(sss->buf1, authoritative_data, len);
> + /* Just like strcoll(), strxfrm() expects a NUL-terminated string.
> + * Not necessary for ICU, but doesn't hurt. */
I see that you have preserved strcoll() comparison caching (or should
I say ucol_strcollUTF8() comparison caching?), at the cost of having
to keep around a buffer which we must continue to copy every text
string into within varstr_abbrev_convert(). That was probably the
right trade-off. It doesn't hurt that it required minimal divergence
within new ICU code, too.
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Justin Workman | 2017-02-16 06:15:49 | Re: Possible issue with expanded object infrastructure on Postgres 9.6.1 |
Previous Message | Tom Lane | 2017-02-16 06:05:28 | Re: Patch to implement pg_current_logfile() function |