From: | Dan Langille <dan(at)langille(dot)org> |
---|---|
To: | Oleg Bartunov <obartunov(at)postgrespro(dot)ru> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: tsvector not giving expected results on one host |
Date: | 2022-12-22 00:55:13 |
Message-ID: | 50996900-18f9-31c5-5e3f-e6b26245f69b@langille.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Oleg Bartunov wrote on 12/21/22 12:31 PM:
> I
>
> On Wed, Dec 21, 2022 at 1:12 PM Oleg Bartunov <obartunov(at)postgrespro(dot)ru> wrote:
>> Dan,
>>
>> it's always good to specify configuration name in a query to avoid
>> recheck, since
>> websearch_to_tsquery(regconfig, text) is immutable, while
>> websearch_to_tsquery(text) is stable.
I am glad to hear this because one of my decisions was:
* change the query to include configuration name
* change the default configuration name
I think I'm going to start using websearch_to_tsquery(regconfig, text)
> immutable function calculates once in planning time, but stable
> function calculates during running time,
> so the difference may be very big depending on how many tuples found.
Looking at my data, the largest tsvector collection has 453 elements.
About 40 have more
than 170. Total number of rows is about 2400.
Another issue discussed on IRC: why store my tsvector values? Why not
just index them instead?
At present: I have this column:
pkgmessage_textsearchable | tsvector | | | generated always as
(to_tsvector('english'::regconfig, pkgmessage)) stored
with this index: ports_pkgmessage_textsearchable_idx" gin
(pkgmessage_textsearchable)
Instead, I could replace that column and index with this index:
"testing" gin (to_tsvector('english'::regconfig, pkgmessage))
Simple testing showed it was comparable if not slightly faster.
The plan now: implement the index on to_tsvector, not a column, and
start specifying the configuration. That's in the near future.
I've written up this journey at
https://news.freshports.org/2022/12/18/when-tsvector-was-working-as-expected-on-most-hosts-but-not-one/
thank you
>> See the difference:
>>
>> [local]:5433 oleg(at)oleg=# explain (analyze,costs off) select title
>> from apod where websearch_to_tsquery('simple','galaxies') @@ fts;
>> QUERY PLAN
>> ---------------------------------------------------------------------------------------
>> Bitmap Heap Scan on apod (actual time=0.008..0.008 rows=0 loops=1)
>> Recheck Cond: ('''galaxies'''::tsquery @@ fts)
>> -> Bitmap Index Scan on gin_apod_fts_idx (actual time=0.007..0.007
>> rows=0 loops=1)
>> Index Cond: (fts @@ '''galaxies'''::tsquery)
>> Planning Time: 0.134 ms
>> Execution Time: 0.022 ms
>> (6 rows)
>>
>> Time: 0.369 ms
>> [local]:5433 oleg(at)oleg=# explain (analyze,costs off) select title
>> from apod where websearch_to_tsquery('galaxies') @@ fts;
>> QUERY PLAN
>> -----------------------------------------------------------------------------------------
>> Bitmap Heap Scan on apod (actual time=0.107..1.463 rows=493 loops=1)
>> Filter: (websearch_to_tsquery('galaxies'::text) @@ fts)
>> Heap Blocks: exact=276
>> -> Bitmap Index Scan on gin_apod_fts_idx (actual time=0.059..0.059
>> rows=493 loops=1)
>> Index Cond: (fts @@ websearch_to_tsquery('galaxies'::text))
>> Planning Time: 0.125 ms
>> Execution Time: 1.518 ms
>> (7 rows)
--
Dan Langille - dan(at)langille(dot)org
https://langille.org/
From | Date | Subject | |
---|---|---|---|
Next Message | Yi Sun | 2022-12-22 03:00:57 | pg_wal directory max size |
Previous Message | David Gilman | 2022-12-22 00:31:30 | Implementing foreign data wrappers and avoiding n+1 querying |