Re: integrated tsearch doesn't work with non utf8 database

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc:	"Teodor Sigaev" <teodor(at)sigaev(dot)ru>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: integrated tsearch doesn't work with non utf8 database
Date:	2007-09-10 16:12:00
Message-ID:	6603.1189440720@sss.pgh.pa.us
Lists:	pgsql-hackers

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> Tom Lane wrote:
>> Uh, how will that help? AFAICS it still has to call ts_lexize with
>> every dictionary.

> No, ts_lexize is no longer in the seq scan filter, but in the sort key
> that's calculated only for those rows that match the filter 'mapcfg=?
> AND maptokentype=?'. It is pretty kludgey, though.

Oh, I see: in the original formulation the planner can push the "WHERE
dl.lex IS NOT NULL" clause down into the sorted subquery, and what with
one thing and another that clause ends up getting evaluated first in the
scan filter condition. We could prevent that by increasing ts_lexize's
procost from 1 to 2 (or so), which might be a good thing anyway since
I suppose it's not especially cheap. It's still a klugy solution
though.
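
To make the ordering effect described above concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not the planner's actual code: it assumes filter clauses are evaluated in ascending order of estimated cost, that ties keep their original order, and that evaluation stops at the first clause that fails. The names toy_lexize, Clause, run_filter and count_lexize_calls, and the sample rows, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# Counts how often the "expensive" function runs.
calls = {"lexize": 0}


def toy_lexize(token):
    """Hypothetical stand-in for ts_lexize: returns None when no lexeme is found."""
    calls["lexize"] += 1
    return [token.lower()] if token.isalpha() else None


@dataclass
class Clause:
    name: str
    cost: float
    test: Callable[[dict], bool]


def run_filter(rows, clauses):
    # Assumption: clauses run cheapest first; sorted() is stable, so clauses
    # with equal cost keep the order in which they were listed.
    ordered = sorted(clauses, key=lambda c: c.cost)
    # all() short-circuits, so later clauses are skipped once one fails.
    return [r for r in rows if all(c.test(r) for c in ordered)]


ROWS = [
    {"mapcfg": 1, "maptokentype": 2, "token": "Cat"},
    {"mapcfg": 1, "maptokentype": 2, "token": "123"},
    {"mapcfg": 1, "maptokentype": 3, "token": "dog"},
    {"mapcfg": 2, "maptokentype": 2, "token": "bird"},
    {"mapcfg": 2, "maptokentype": 5, "token": "42"},
    {"mapcfg": 3, "maptokentype": 2, "token": "fish"},
]


def count_lexize_calls(lexize_cost):
    """Run the filter with the given cost for the lexize clause."""
    calls["lexize"] = 0
    clauses = [
        # Listed first, mimicking the pushed-down "lex IS NOT NULL" clause.
        Clause("lex IS NOT NULL", lexize_cost,
               lambda r: toy_lexize(r["token"]) is not None),
        Clause("mapcfg = 1", 1, lambda r: r["mapcfg"] == 1),
        Clause("maptokentype = 2", 1, lambda r: r["maptokentype"] == 2),
    ]
    matched = run_filter(ROWS, clauses)
    return matched, calls["lexize"]


if __name__ == "__main__":
    for cost in (1, 2):
        matched, n = count_lexize_calls(cost)
        print(f"cost={cost}: {n} lexize calls, {len(matched)} matching rows")
```

In this model both settings return the same rows; only the number of toy_lexize calls changes, because at the higher cost the function runs only for rows that already passed the cheap mapcfg and maptokentype tests.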

I think Teodor's solution is wrong as it stands, because if the subquery
finds matches for mapcfg and maptokentype, but none of those rows
produce a non-null ts_lexize result, it will instead emit one row with a
null result, which is not what should happen.

regards, tom lane

In response to

Re: integrated tsearch doesn't work with non utf8 database at 2007-09-10 14:56:31 from Heikki Linnakangas

Responses

Re: integrated tsearch doesn't work with non utf8 database at 2007-09-10 16:21:10 from Teodor Sigaev
