From: | David Kellum <david(at)gravitext(dot)com> |
---|---|
To: | pgsql-bugs <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: BUG #14245: Segfault on weird to_tsquery |
Date: | 2016-07-12 20:54:53 |
Message-ID: | 1468356893.2574.7@smtp.gmail.com |
Lists: | pgsql-bugs pgsql-hackers |
On Tue, Jul 12, 2016 at 12:42 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> david(at)gravitext(dot)com writes:
>> I am doing some (fuzz) testing of full text queries and managed to
>> generate the following case which causes a SEGFAULT on PostgreSQL
>> 9.6 beta1 and beta2:
>> select to_tsquery('!(a & !b) & c') as tsquery
>> This weird query outputs the following on 9.5.2, instead of
>> crashing:
>> "!( !'b' ) & 'c'"
>
> Note that while crashing is certainly not good, the pre-9.6 behavior
> can hardly be called correct either. What happened to 'a'?
'a' is a stopword, dropped by to_tsquery() as described here:
https://www.postgresql.org/docs/9.6/static/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES
> The difference is that while basic tsquery input takes the tokens at
> face value, to_tsquery normalizes each token into a lexeme using the
> specified or default configuration, and discards any tokens that are
> stop words according to the configuration.
...and I believe I want this behavior. Otherwise queries with a stopword
in an '&' condition will not match anything. In truth I have no reason to
want to support this kind of weird double negative on any version, and I
will also look at filtering it out in my code before calling
to_tsquery().
It might be worth noting that these other slightly different cases are
fine on 9.6:
select to_tsquery('!(apple & !b) & c'); ---> !( 'appl' & !'b' ) & 'c'
select to_tsquery('!(apple & !a) & c'); ---> !'appl' & 'c'
Clearly a pretty obscure case, but a crash nonetheless.
> Also, it looks like this is specific to to_tsquery; if you just feed
> the same thing to tsqueryin, it seems fine with it:
>
> # select '!(a & !b) & c'::tsquery;
> tsquery
> -----------------------
> !( 'a' & !'b' ) & 'c'
> (1 row)
Against another test table, English search config, I confirmed that 'a
& ball'::tsquery doesn't match anything, but to_tsquery('a & ball')
does.
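A minimal sketch of that comparison, assuming the built-in 'english'
configuration and a throwaway inline document ('a ball') standing in for
the test table, which is not shown in this thread:

```sql
-- Assumption: the stock 'english' configuration, where 'a' is a stop word.
-- 'a ball' is a hypothetical document used only for illustration.

-- to_tsquery() normalizes tokens and drops the stop word 'a',
-- so only 'ball' has to match.
SELECT to_tsvector('english', 'a ball') @@ to_tsquery('english', 'a & ball');
-- expected: t

-- The plain tsquery cast takes tokens at face value, so 'a' must be
-- present in the tsvector; to_tsvector() already dropped it, so no match.
SELECT to_tsvector('english', 'a ball') @@ 'a & ball'::tsquery;
-- expected: f
```

Passing the configuration name explicitly keeps the result independent of
default_text_search_config.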
Thanks,
David