old bug in full text parser

From:	Oleg Bartunov <obartunov(at)gmail(dot)com>
To:	Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)postgrespro(dot)ru>
Subject:	old bug in full text parser
Date:	2016-02-10 09:28:22
Message-ID:	CAF4Au4wcDyvfFZ3qApnuX=YCfubowxo9xxCH5e1HPBJXVF14eQ@mail.gmail.com
Lists:	pgsql-hackers

It looks like there is a very old bug in the full text parser (somebody
pointed it out to me), which appeared after tsearch2 was moved into core. The
problem is in how the full text parser processes hyphenated words. Our original
idea was to report the hyphenated word itself as well as its parts, and to
ignore the hyphen. That is how tsearch2 worked.
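
To make the intended behaviour concrete, here is a minimal sketch in Python. It is not the parser's actual code: the function name `tokenize_hyphenated` and the regular expression are hypothetical, and the sketch only handles ASCII letters and digits. It reports the hyphenated word as a whole, then each part, and never emits the hyphen as a token.

```python
import re

# Hypothetical pattern: runs of ASCII letters/digits joined by single hyphens.
WORD_RE = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def tokenize_hyphenated(text):
    """Return tokens for text: whole hyphenated word first, then its parts.

    The hyphen itself is never reported as a token.
    """
    tokens = []
    for match in WORD_RE.finditer(text):
        word = match.group(0)
        parts = word.split("-")
        if len(parts) > 1:
            # Report the hyphenated word itself before its parts.
            tokens.append(word)
        tokens.extend(parts)
    return tokens


print(tokenize_hyphenated("four-dot"))
print(tokenize_hyphenated("4-dot"))
```

In this sketch words with digits and words with plain text go through the same path, so '4-dot' and 'four-dot' are treated consistently.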

This behaviour changed after tsearch2 was moved into core:
1. The hyphen is now reported by the parser, which is useless.
2. Hyphenated words containing numbers ('4-dot', 'dot-4') are processed
differently from ones made of plain text words like 'four-dot': the
hyphenated word itself is not reported.

I think we should consider this a bug and produce a fix for all supported
versions.

After investigation we found this commit:

commit 73e6f9d3b61995525785b2f4490b465fe860196b
Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Date: Sat Oct 27 19:03:45 2007 +0000

Change text search parsing rules for hyphenated words so that digit strings
containing decimal points aren't considered part of a hyphenated word.
Sync the hyphenated-word lookahead states with the subsequent part-by-part
reparsing states so that we don't get different answers about how much text
is part of the hyphenated word. Per my gripe of a few days ago.

8.2.23

8.3.23

Regards,
Oleg
