Andres Freund <andres(at)anarazel(dot)de> wrote:
> I think you see no real benefit, because your strings are rather
> short - the documents I scanned when noticing the issue were
> rather long.
The document I used in the test which showed the regression was
672,585 characters, containing 10,000 URLs.
> A rather extreme/contrived example:
> postgres=# SELECT 1 FROM to_tsvector(array_to_string(ARRAY(
>     SELECT 'andres(at)anarazel(dot)de http://www.postgresql.org/'::text
>     FROM generate_series(1, 20000) g(i)), ' - '));
The most extreme of your examples uses a 979,996 character string,
which is less than 50% larger than my test. I am, however, able to
see the performance difference for this particular example, so I now
have something to work with. I'm seeing some odd behavior regarding
when a difference shows up and how large it is. Once I can
categorize it better, I'll follow up.
Thanks for the sample which shows the difference.
-Kevin