Re: procost for to_tsvector

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: procost for to_tsvector
Date: 2015-05-02 01:27:24
Message-ID: 87r3qzsrtu.fsf@news-spur.riddles.org.uk
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

>> In the OP, he suggested "on the order of 100". Maybe we could just
>> go with 100.

Tom> I'm OK with that in view of <87h9trs0zm.fsf@news-spur.riddles.org.uk>

Note that the results from that post suggest 100 as a bare minimum;
higher values would be quite reasonable.

Tom> and some experiments of my own, but I wonder why we are only
Tom> thinking of to_tsvector. Isn't to_tsquery, for example, just
Tom> about as expensive? What of other text search functions?

Making the same change for to_tsquery and plainto_tsquery would be
reasonable; that would help with the seqscan cost for cases like
to_tsvector('config',col) @@ to_tsquery('blah') where the non-immutable
form of to_tsquery is used. It doesn't seem to have shown up as an issue
in reports so far because the common usage patterns don't tend to have
it evaluated for each row (either the immutable form is used, or the
to_tsquery is evaluated in a different from-clause item).
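
To make the seqscan arithmetic concrete, here is a rough sketch of how
per-row function costs feed into a sequential scan estimate. It assumes
the default settings seq_page_cost = 1.0, cpu_tuple_cost = 0.01 and
cpu_operator_cost = 0.0025, and a simplified model in which each function
or operator in the filter adds procost * cpu_operator_cost per row. The
table size and the helper name seqscan_cost are hypothetical, chosen only
for illustration; the real planner considers more than this.

```python
# Simplified sketch of sequential-scan cost arithmetic (not planner code).
# Assumed defaults for the cost settings:
SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01
CPU_OPERATOR_COST = 0.0025


def seqscan_cost(pages, rows, procosts):
    """Estimate a seqscan whose filter calls functions with the given procosts."""
    # Each function/operator evaluated per row contributes procost * cpu_operator_cost.
    qual_per_row = sum(procosts) * CPU_OPERATOR_COST
    return pages * SEQ_PAGE_COST + rows * (CPU_TUPLE_COST + qual_per_row)


# Hypothetical table: 10,000 pages, 1,000,000 rows.
PAGES, ROWS = 10_000, 1_000_000

# Filter: to_tsvector('config', col) @@ to_tsquery('blah'), with to_tsquery
# evaluated for every row. Procosts are listed as
# [to_tsvector, to_tsquery, the @@ operator's function].
all_at_one = seqscan_cost(PAGES, ROWS, [1, 1, 1])
tsvector_at_100 = seqscan_cost(PAGES, ROWS, [100, 1, 1])
both_at_100 = seqscan_cost(PAGES, ROWS, [100, 100, 1])

for label, cost in [("all procost 1", all_at_one),
                    ("to_tsvector 100", tsvector_at_100),
                    ("both 100", both_at_100)]:
    print(f"{label:>16}: {cost:.1f}")
```

Under these assumptions, raising only to_tsvector already moves the
estimate by an order of magnitude, and raising to_tsquery as well roughly
doubles it again, but only when to_tsquery really is evaluated per row.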

I don't recall seeing cases of any of the other functions figuring into
planner decisions.

--
Andrew (irc:RhodiumToad)
