Re: integrated tsearch doesn't work with non utf8 database

From:	"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Teodor Sigaev" <teodor(at)sigaev(dot)ru>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: integrated tsearch doesn't work with non utf8 database
Date:	2007-09-10 14:56:31
Message-ID:	46E55B1F.3090207@enterprisedb.com
Lists:	pgsql-hackers

Tom Lane wrote:
> Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>>> Note the Seq Scan on pg_ts_config_map, with filter on ts_lexize(mapdict,
>>> $1). That means that it will call ts_lexize on every dictionary, which
>>> will try to load every dictionary. And loading danish_stem dictionary
>>> fails in latin2 encoding, because of the problem with the stopword file.
>
>> Attached patch should fix it, I hope.
>
> Uh, how will that help? AFAICS it still has to call ts_lexize with
> every dictionary.

No, ts_lexize is no longer in the seq scan filter. It is now in the sort
key, which is calculated only for those rows that match the filter
'mapcfg=? AND maptokentype=?'. It is pretty kludgey, though: with
different statistics the planner could choose another plan, one that
fails. Rewriting the function in C would be a more robust fix.
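
To illustrate the difference between the two plan shapes, here is a rough Python sketch. It is not PostgreSQL code and not the patch itself: fake_lexize, the toy config_map rows, and both helper functions are hypothetical stand-ins, and the sort key shown is only an assumed shape. The point is just where the failing call happens: in the filter it runs for every row, in the sort key it runs only for rows that already passed the filter.

```python
# Hypothetical stand-ins for illustration only; not PostgreSQL's implementation.
calls = []  # records which dictionaries were "loaded"


def fake_lexize(dict_name, token):
    """Stand-in for ts_lexize: loading danish_stem fails, as in the report."""
    calls.append(dict_name)
    if dict_name == "danish_stem":
        raise RuntimeError("could not load stopword file for danish_stem")
    return [token.lower()]


# Toy rows shaped like pg_ts_config_map:
# (mapcfg, maptokentype, mapseqno, mapdict)
config_map = [
    ("english", 1, 1, "english_stem"),
    ("danish", 1, 1, "danish_stem"),
    ("english", 2, 1, "simple"),
]


def lexize_in_filter(cfg, tokentype, token):
    """Old plan shape: the function is part of the scan filter,
    so it is evaluated for every row, including danish_stem."""
    return [
        row for row in config_map
        if fake_lexize(row[3], token) is not None
        and row[0] == cfg and row[1] == tokentype
    ]


def lexize_in_sort_key(cfg, tokentype, token):
    """New plan shape: filter on mapcfg/maptokentype first, then
    evaluate the function only while sorting the surviving rows."""
    matching = [
        row for row in config_map
        if row[0] == cfg and row[1] == tokentype
    ]
    # Assumed sort key: rows whose dictionary returns a result first,
    # then by mapseqno.
    return sorted(
        matching,
        key=lambda row: (fake_lexize(row[3], token) is None, row[2]),
    )
```

With these toy rows, lexize_in_sort_key("english", 1, "Cats") only ever touches english_stem, while lexize_in_filter("english", 1, "Cats") raises as soon as it reaches the danish_stem row. Nothing in the sketch models the planner, which is exactly the fragile part: a different plan could move the call back in front of the filter.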

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Re: integrated tsearch doesn't work with non utf8 database at 2007-09-10 14:37:26 from Tom Lane

Responses

Re: integrated tsearch doesn't work with non utf8 database at 2007-09-10 16:12:00 from Tom Lane
