From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-committers(at)postgresql(dot)org |
Subject: | pgsql: Get rid of USE_WIDE_UPPER_LOWER dependency in trigram constructi |
Date: | 2013-04-07 18:47:07 |
Message-ID: | E1UOucN-0001vC-Py@gemulon.postgresql.org |
Lists: | pgsql-committers pgsql-hackers |
Get rid of USE_WIDE_UPPER_LOWER dependency in trigram construction.

contrib/pg_trgm's make_trigrams() was coded to ignore multibyte character
boundaries and just make trigrams from bytes if USE_WIDE_UPPER_LOWER wasn't
defined. This is a bit odd, since there's no obvious reason why trigram
compaction rules should depend on the presence of towlower() and friends.
What's more, there was an Assert() that would fail if that code path was
fed any multibyte characters.

We need to do something about this since the pending regex-indexing patch
has an assumption that you get just one "trgm" from any three characters.
The best solution seems to be to remove the USE_WIDE_UPPER_LOWER
dependency, which shouldn't really have been there in the first place.
The second loop in make_trigrams() is now just a fast path and not a
potentially incompatible algorithm.

If there is anybody still using Postgres on machines without wcstombs() or
towlower(), and they have non-ASCII data indexed by pg_trgm, they'll need
to REINDEX those indexes after pg_upgrade to 9.3, else searches may fail
incorrectly. It seems likely that there are no such installations, though.

In passing, rename cnt_trigram to compact_trigram, which seems to better
describe its functionality, and improve make_trigrams' test for whether it
has to use the slow path or not (per a suggestion from Alexander Korotkov).
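
For readers unfamiliar with the code, the idea of "one trgm from any three characters" with a byte-wise fast path can be sketched roughly as below. This is an illustration only, not the code in trgm_op.c: the names ending in _sketch are hypothetical, the input is assumed to be valid UTF-8, the FNV-1a hash is a stand-in for whatever hash the real compaction uses, and the word padding and case folding that pg_trgm performs are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A trigram always occupies three bytes, however wide its characters are. */
typedef unsigned char trgm3[3];

/* Byte length of the UTF-8 character starting at *s (assumes valid UTF-8). */
int utf8_char_len_sketch(const unsigned char *s)
{
    if (*s < 0x80) return 1;
    if ((*s & 0xE0) == 0xC0) return 2;
    if ((*s & 0xF0) == 0xE0) return 3;
    return 4;
}

/* Stand-in hash (32-bit FNV-1a); an assumption, not pg_trgm's actual hash. */
uint32_t fnv1a_sketch(const unsigned char *p, size_t n)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Squeeze the bytes of three characters into exactly three bytes. */
void compact_trigram_sketch(trgm3 out, const unsigned char *str, size_t bytelen)
{
    assert(bytelen >= 3);           /* three characters need at least 3 bytes */
    if (bytelen == 3) {
        memcpy(out, str, 3);        /* all single-byte: store as-is */
    } else {
        uint32_t h = fnv1a_sketch(str, bytelen);
        out[0] = (unsigned char) (h >> 16);
        out[1] = (unsigned char) (h >> 8);
        out[2] = (unsigned char) h;
    }
}

/*
 * Write one trigram per window of three consecutive characters and return
 * how many were written. The caller must provide room for at least
 * (number of characters - 2) entries in out.
 */
size_t make_trigrams_sketch(trgm3 *out, const char *text)
{
    const unsigned char *s = (const unsigned char *) text;
    const unsigned char *end = s + strlen(text);
    size_t n = 0;
    int multibyte = 0;

    /* Decide which path to take: any high-bit byte forces the slow path. */
    for (const unsigned char *p = s; p < end; p++) {
        if (*p >= 0x80) {
            multibyte = 1;
            break;
        }
    }

    if (!multibyte) {
        /* Fast path: every character is one byte, so slide over bytes. */
        for (; (size_t) (end - s) >= 3; s++)
            memcpy(out[n++], s, 3);
        return n;
    }

    /* Slow path: a, b, c are the starts of three consecutive characters. */
    const unsigned char *a = s;
    const unsigned char *b = a + utf8_char_len_sketch(a);
    if (b >= end)
        return 0;
    const unsigned char *c = b + utf8_char_len_sketch(b);
    while (c < end) {
        const unsigned char *d = c + utf8_char_len_sketch(c);
        compact_trigram_sketch(out[n++], a, (size_t) (d - a));
        a = b;
        b = c;
        c = d;
    }
    return n;
}
```

In this sketch a window of three single-byte characters is stored identically on both paths, which is the property that makes the second loop a pure optimization rather than a different algorithm.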

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/7844608e54a3a2e3dee461b00fd6ef028a845d7c

Modified Files
--------------
contrib/pg_trgm/trgm_op.c | 17 ++++++++++-------
1 files changed, 10 insertions(+), 7 deletions(-)