Re: 7.3 no longer using indexes for LIKE queries

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Matthew Gabeler-Lee <mgabelerlee(at)zycos(dot)com>
Cc: "'pgsql-general(at)postgresql(dot)org'" <pgsql-general(at)postgresql(dot)org>
Subject: Re: 7.3 no longer using indexes for LIKE queries
Date: 2002-12-04 20:41:54
Message-ID: 18114.1039034514@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Matthew Gabeler-Lee <mgabelerlee(at)zycos(dot)com> writes:
> How about, as an intermediate solution, a list of 'sane' locales in which
> the optimization applied to the C/POSIX locale works?

If you provide such a list, we'll be happy to improve locale_is_like_safe().

Offhand though, I suspect that *most* if not all non-C locales have
problems; even en_US, which has no character set issues, still manages
to insist on a multipass sort algorithm :-(. An example on a recent
Linux release:

[tgl(at)g3]$ echo -e "a\na a\naa\na a\nab\na b" | LC_ALL=en_US sort
a
aa
a a
a a
ab
a b

There's no way to use an index ordered like this to look for strings
beginning "a ", because the sorting of spaces depends on what comes
after them.

Making any real dent in the problem will probably require in-depth
analysis of common locale (mis)behaviors. For example, if space sorting
is the only thing that's funny about en_US, it might make sense for us
to support a modified form of the LIKE optimization that doesn't
consider space as a "safe" prefix character (ie, we could index only
for "a" not "a ", even if the pattern is LIKE 'a %').

I have no idea exactly what sort of compromises would be effective
though. Any localedef experts out there?

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Joseph Shraibman 2002-12-04 20:46:31 where did debug_print_query go in 7.3???
Previous Message Matthew Gabeler-Lee 2002-12-04 20:41:23 Re: 7.3 no longer using indexes for LIKE queries