From: | "Dan Langille" <dan(at)langille(dot)org> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: tsvector not giving expected results on one host |
Date: | 2022-12-17 20:33:33 |
Message-ID: | 5cd18dcb-c526-4c1f-ac4f-c56cfbdcd55e@app.fastmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Sat, Dec 17, 2022, at 3:22 PM, Dan Langille wrote:
> On Sat, Dec 17, 2022, at 3:14 PM, Dan Langille wrote:
>> On Sat, Dec 17, 2022, at 2:55 PM, Tom Lane wrote:
>>> "Dan Langille" <dan(at)langille(dot)org> writes:
>>>> pkgmessage_textsearchable2 | tsvector | | | generated always as (to_tsvector('english'::regconfig, translate(pkgmessage, '/'::text, ' '::text))) stored
>>>
>>> That is not likely to play well with this:
>>>
>>>> freshports.org=> show default_text_search_config ;
>>>> default_text_search_config
>>>> ----------------------------
>>>> pg_catalog.simple
>>>
>>> because "english" and "simple" will stem words differently.
>>>
>>> regression=# select websearch_to_tsquery('english', 'example');
>>> websearch_to_tsquery
>>> ----------------------
>>> 'exampl'
>>> (1 row)
>>>
>>> regression=# select websearch_to_tsquery('simple', 'example');
>>> websearch_to_tsquery
>>> ----------------------
>>> 'example'
>>> (1 row)
>>>
>>> If what is in your tsvector is 'exampl', then only the first of
>>> these will match. So IMO the question is not "why is it failing
>>> on prod?", it's "how the heck did it work on the other machine?".
>>> You won't get nice results if websearch_to_tsquery is using a
>>> different TS configuration than to_tsvector did.
>>
>> I think this shows why we are getting the results we see. Credit to ch
>> on IRC for asking this question.
>>
>> The problem host:
>>
>> freshports.org=> select websearch_to_tsquery('example');
>> websearch_to_tsquery
>> ----------------------
>> 'example'
>> (1 row)
>
> Ahh, this explains the differences and as to why it works where it shouldn't?
>
> freshports.org=> select setting, source from pg_settings where name =
> 'default_text_search_config';
> setting | source
> -------------------+---------
> pg_catalog.simple | default
> (1 row)
>
>
>>
>> The hosts on which this search works
>>
>> freshports.devgit=# select websearch_to_tsquery('example');
>> websearch_to_tsquery
>> ----------------------
>> 'exampl'
>> (1 row)
>
>
> freshports.devgit=# select setting, source from pg_settings where name
> = 'default_text_search_config';
> setting | source
> --------------------+--------------------
> pg_catalog.english | configuration file
> (1 row)
>
>
> At least now I know what I can play with to get all hosts in sync.
Here we go, on the problem database, create a new field, based on simple, not english.
ALTER TABLE public.ports
ADD COLUMN pkgmessage_textsearchable3 tsvector generated always as (to_tsvector('simple'::regconfig, translate(pkgmessage, '/'::text, ' '::text))) stored;
Index it:
CREATE INDEX ports_pkgmessage_textsearchable3_idx
ON public.ports USING gin
(pkgmessage_textsearchable3)
TABLESPACE pg_default;
CREATE INDEX
query it:
freshports.org=> SELECT id as port_id, element_pathname(element_id)
FROM ports
WHERE pkgmessage_textsearchable3 @@ websearch_to_tsquery('example');
port_id | element_pathname
---------+----------------------------------------------------------------------
34126 | /ports/head/security/pond
74559 | /ports/branches/2015Q3/emulators/linux_base-c6
60310 | /ports/branches/2020Q4/www/gitlab-ce
38345 | /ports/head/www/gitlab
46842 | /ports/branches/2018Q1/mail/postfix-sasl
51532 | /ports/branches/2019Q1/graphics/drm-legacy-kmod
Success. Thank you Mr Lane.
--
Dan Langille
dan(at)langille(dot)org
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Arnold | 2022-12-17 23:55:16 | Postgres Date Type Value |
Previous Message | Dan Langille | 2022-12-17 20:22:37 | Re: tsvector not giving expected results on one host |