Re: Behaviour of to_tsquery(stopwords only)

From: Richard Huxton <dev(at)archonet(dot)com>
To: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Behaviour of to_tsquery(stopwords only)
Date: 2008-03-06 11:08:05
Message-ID: 47CFD095.7010405@archonet.com
Lists: pgsql-hackers

Further tsquery comparison fun:

=> SELECT q.qid, q.query, count(*) FROM doc.documents d, util.queries q
WHERE d.words @@ q.query AND (q.query::text=$$'tender'$$) GROUP BY
q.qid, q.query ;
 qid |  query   | count
-----+----------+-------
 195 | 'tender' |   374
 248 | 'tender' |   374
 257 | 'tender' |   374
 332 | 'tender' |   374
 401 | 'tender' |   374
 409 | 'tender' |   374
 519 | 'tender' |   374
 557 | 'tender' |   374
 736 | 'tender' |   374
 749 | 'tender' |   374
 869 | 'tender' |   374
 879 | 'tender' |   374
 926 | 'tender' |   374
(13 rows)

=> SELECT q.query, count(*) FROM doc.documents d, util.queries q WHERE
d.words @@ q.query AND (q.query::text=$$'tender'$$) GROUP BY q.query ;
  query   | count
----------+-------
 'tender' |  1870
 'tender' |  1496
 'tender' |  1496
(3 rows)

It seems that the tsquery is remembering the shape of the original
query, even though the stopwords have been trimmed from it.
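
A minimal check that does not depend on the tables above might look like
the following. It is only a sketch: it assumes the built-in 'english'
text search configuration (in which 'the' and 'this' are stopwords), and
the aliases a, b and the output column names are made up for illustration.

```sql
-- Sketch only: assumes the built-in 'english' configuration.
-- Both inputs should reduce to the single lexeme 'tender' once the
-- stopwords are dropped.
SELECT a.q::text = b.q::text AS same_text,     -- compares the printed form
       a.q = b.q             AS same_tsquery,  -- compares the stored values
       numnode(a.q)          AS nodes_a,       -- lexemes + operators in a
       numnode(b.q)          AS nodes_b        -- lexemes + operators in b
FROM (SELECT to_tsquery('english', 'tender & the & this') AS q) a,
     (SELECT to_tsquery('english', 'tender') AS q) b;
```

If same_text comes back true while same_tsquery is false, the two values
differ only in their stored representation, which would be consistent
with GROUP BY placing them in separate groups.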

=> SELECT q.query, min(qid), max(qid), count(*) FROM doc.documents d,
util.queries q WHERE d.words @@ q.query AND (q.query::text=$$'tender'$$)
GROUP BY q.query ;
  query   | min | max | count
----------+-----+-----+-------
 'tender' | 736 | 926 |  1870   (5 rows aggregated)
 'tender' | 401 | 557 |  1496   (4 rows aggregated)
 'tender' | 195 | 332 |  1496   (4 rows aggregated)
(3 rows)

=> SELECT * FROM util.queries WHERE qid IN (195,248, 257, 332,
401,409,519,557,736,749,869,879,926) ORDER BY qid;
 qid |        words        |  query
-----+---------------------+----------
 195 | can & of & tenders  | 'tender'   (3 clauses)
 248 | tender & the & this | 'tender'   (3 clauses)
 257 | have & tender & for | 'tender'   (3 clauses)
 332 | for & tenders & of  | 'tender'   (3 clauses)
 401 | tender & with       | 'tender'   (2 clauses)
 409 | tenders & to        | 'tender'   (2 clauses)
 519 | tender & to         | 'tender'   (2 clauses)
 557 | tenders & be        | 'tender'   (2 clauses)
 736 | tenderer            | 'tender'   (1 clause)
 749 | tender              | 'tender'   (1 clause)
 869 | tender              | 'tender'   (1 clause)
 879 | tender              | 'tender'   (1 clause)
 926 | tender              | 'tender'   (1 clause)
(13 rows)

So - is this a bug, feature, "feature"?

--
Richard Huxton
Archonet Ltd
