From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Peter Eisentraut <peter_e(at)gmx(dot)net>, Andrew Sullivan <andrew(at)libertyrms(dot)info>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: default locale considered harmful? (was Re: [GENERAL] |
Date: | 2003-05-31 22:18:39 |
Message-ID: | 200305312218.h4VMIee21738@candle.pha.pa.us |
Lists: | pgsql-general pgsql-hackers |
Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > So, my understanding is that you would create something such as:
> > CREATE INDEX iix ON tab (LIKE col)
> > and that does LIKE lookups and knows how to do col LIKE 'abc%', but it
> > can't be used for >= or ORDER BY, but it can be used for equality tests?
>
> Hm. Right at the moment, it wouldn't be used for equality tests unless
> you spelled equality as "a ~=~ b". I wonder whether that's necessary
> though; couldn't we dispense with that operator and use ordinary
> equality as the BTEqual member of these opclasses? Are there any
> locales that claim that not-physically-identical strings are equal?
Let me see if I understand.
Our default indexes will be able to do =, >, <, ORDER BY, and the
special index will be able to do LIKE, ORDER BY, and maybe equals. Do I
have that correct?
Looking at CVS, I see the warning about non-C locales has been removed.
Should we instead mention the new LIKE index method? This is the warning
that was removed:
# (Be sure to maintain the correspondence with locale_is_like_safe() in selfuncs.c.)
if test x`pg_getlocale COLLATE` != xC && test x`pg_getlocale COLLATE` != xPOSIX; then
    echo "This locale setting will prevent the use of indexes for pattern matching"
    echo "operations. If that is a concern, rerun $CMDNAME with the collation order"
    echo "set to \"C\". For more information see the Administrator's Guide."
fi
Doing LIKE with single-byte encodings would be easy because it would take
only 256 comparisons to find the min/max character values, but that doesn't
work with multi-byte encodings, right?
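To make the byte-wise case concrete, here is a rough sketch of how a prefix pattern such as 'abc%' can be turned into a range scan when keys are ordered by plain byte comparison (as in the C or POSIX locale). This is only an illustration, not the backend's code: the names prefix_upper_bound and like_prefix_scan are made up for the example, and a sorted Python list stands in for the index.

```python
from bisect import bisect_left
from typing import List, Optional


def prefix_upper_bound(prefix: bytes) -> Optional[bytes]:
    """Return the smallest byte string that sorts after every string
    starting with `prefix`, or None if no such bound exists."""
    p = bytearray(prefix)
    while p:
        if p[-1] < 0xFF:
            p[-1] += 1          # 'abc' -> 'abd'
            return bytes(p)
        p.pop()                 # trailing 0xFF cannot be incremented; carry left
    return None                 # empty prefix or all 0xFF: scan to the end


def like_prefix_scan(sorted_keys: List[bytes], prefix: bytes) -> List[bytes]:
    """Emulate col LIKE 'prefix%' as prefix <= col < upper_bound on a
    list sorted by byte value (a stand-in for a byte-ordered index)."""
    lo = bisect_left(sorted_keys, prefix)
    upper = prefix_upper_bound(prefix)
    hi = len(sorted_keys) if upper is None else bisect_left(sorted_keys, upper)
    return sorted_keys[lo:hi]


keys = sorted([b"ab", b"abc", b"abcd", b"abc\xff", b"abd", b"b"])
print(like_prefix_scan(keys, b"abc"))
```

The range trick is only valid because the sort order and the byte values agree. Under a locale-aware ordering the strings matching the prefix need not be contiguous between those two bounds, which is why the index cannot be used that way.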
This LIKE/encoding problem is a tricky one because it gives poor
performance with little warning to users.
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073