From: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
---|---|
To: | Heikki Linnakangas <heikki(at)enterprisedb(dot)com> |
Cc: | Patches <pgsql-patches(at)postgresql(dot)org> |
Subject: | Re: tsearch refactorings |
Date: | 2007-09-05 15:55:22 |
Message-ID: | 46DED16A.9000505@sigaev.ru |
Lists: | pgsql-patches |
Heikki, I see some strange changes in your patch, not related to tsearch at all:
contrib/pageinspect/pageinspect.sql.in
contrib/pageinspect/rawpage.c
> The usage of the QueryItem struct was very confusing. It was used for
> both operators and operands. For operators, "val" was a single character
> cast to an int4, marking the operator type. For operands, val was the
> CRC-32 of the value. Other fields were used only either for operands or
> for operators. The biggest change in the patch is that I broke the
> QueryItem struct into QueryOperator and QueryOperand. Type was really
...
> - Removed ParseQueryNode struct used internally by makepol and friends.
> push*-functions now construct QueryItems directly.
The unused bytes in QueryItem need to be set to zero; that is a common requirement
for types in pgsql. After allocating space for the tsquery in parse_tsquery, you copy
just sizeof(QueryOperator) bytes and leave sizeof(QueryItem) -
sizeof(QueryOperator) bytes untouched. QueryOperand is the biggest component of the
QueryItem union. I haven't checked other places.
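To illustrate the zeroing point, here is a minimal sketch. The DemoOperator, DemoOperand and DemoItem types and both helper functions are hypothetical stand-ins, not the real QueryItem definitions from the patch. It only shows the general pattern: clear the whole union slot, then fill in the smaller member, so that a bytewise comparison of two logically equal items does not depend on leftover memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts for illustration; the real structs differ. */
typedef struct
{
    int8_t   type;
    int8_t   oper;
    uint32_t left;
} DemoOperator;

typedef struct
{
    int8_t   type;
    uint8_t  weight;
    int32_t  valcrc;
    uint32_t length;
    uint32_t distance;
} DemoOperand;

/* The operand is the larger member, so an operator leaves trailing bytes. */
typedef union
{
    int8_t       type;
    DemoOperator op;
    DemoOperand  val;
} DemoItem;

/*
 * Store an operator into a slot. The memset clears the padding and the
 * bytes that belong only to the larger operand member before the operator
 * fields are assigned (in practice compilers leave those bytes alone
 * on member assignment).
 */
static void
store_operator(DemoItem *slot, int8_t oper, uint32_t left)
{
    assert(slot != NULL);
    memset(slot, 0, sizeof(DemoItem));
    slot->op.type = 2;          /* arbitrary "operator" tag for this sketch */
    slot->op.oper = oper;
    slot->op.left = left;
}

/* Bytewise equality: the kind of check that needs unused bytes zeroed. */
static int
items_bytewise_equal(const DemoItem *a, const DemoItem *b)
{
    return memcmp(a, b, sizeof(DemoItem)) == 0;
}
```

With this pattern, two slots that previously held different garbage compare equal after the same operator is stored into both. Copying only sizeof(DemoOperator) bytes into an uncleared slot would leave the remaining bytes undefined.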
> that? And parse_query always produces trees that are in prefix notation,
> so the left-field is really redundant, but using tsqueryrecv, you could
> inject queries that are not in prefix notation; is there anything in the
> code that depends on that?
It's used by TS_execute for optimization. With pure postfix notation you
have to go through every node. For example:
FALSE FALSE & FALSE &
You have to go to the end of the query to produce the correct result.
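As a rough illustration of that cost, here is a sketch of a plain left-to-right stack evaluator for postfix expressions. The PostfixTok type and eval_postfix function are made up for this example and are not tsearch code. It counts how many tokens it has to read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token type for this sketch. */
typedef enum { TOK_FALSE, TOK_TRUE, TOK_AND, TOK_OR } PostfixTok;

/*
 * Evaluate a well-formed postfix expression with an explicit stack.
 * *visited receives the number of tokens read. An operator only shows
 * up after both of its operands, so nothing can be skipped.
 */
static bool
eval_postfix(const PostfixTok *toks, size_t n, size_t *visited)
{
    bool   stack[64];
    size_t top = 0;

    *visited = 0;
    for (size_t i = 0; i < n; i++)
    {
        (*visited)++;
        if (toks[i] == TOK_FALSE || toks[i] == TOK_TRUE)
        {
            assert(top < 64);
            stack[top++] = (toks[i] == TOK_TRUE);
        }
        else
        {
            assert(top >= 2);
            bool b = stack[--top];
            bool a = stack[--top];

            stack[top++] = (toks[i] == TOK_AND) ? (a && b) : (a || b);
        }
    }
    assert(top == 1);
    return stack[0];
}
```

For the five-token query above, this evaluator reads all five tokens before it can return FALSE.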
In fact, TSQuery is prefix notation with a pointer to the other operand or, in
other words, just a flat view of the tree where the right operand of an operation is
always placed after the operation.
That notation makes it possible to evaluate only one of the operands when that is enough:
& FALSE & FALSE FALSE
1 2 3 4 5 --Nodes
After evaluating the second node you can return FALSE for the whole expression and
skip nodes 3-5. For the query
& TRUE & FALSE FALSE
nodes 1, 2, 3 and 4 need to be evaluated. In most cases checking a QI_VAL node is
much more expensive than checking a QI_OPR node.
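The short-circuit idea can be sketched as follows. This is not the TS_execute code: PrefixNode, its "other" offset field and eval_prefix are hypothetical names, and the evaluation order (the operand stored right after the operator first, then the one reached through the offset) is an assumption chosen to match the node numbering in the example above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type for this sketch. */
typedef enum { N_FALSE, N_TRUE, N_AND, N_OR } NodeKind;

typedef struct
{
    NodeKind kind;
    size_t   other;     /* operators only: offset from this node to the
                         * operand that is NOT stored at node + 1 */
} PrefixNode;

/*
 * Evaluate a prefix-notation array recursively. One operand sits right
 * after the operator; the stored offset jumps straight to the other one,
 * so a whole subtree can be skipped once the result is known.
 * *visited is incremented for every node actually touched.
 */
static bool
eval_prefix(const PrefixNode *node, size_t *visited)
{
    (*visited)++;
    switch (node->kind)
    {
        case N_FALSE:
            return false;
        case N_TRUE:
            return true;
        case N_AND:
            if (!eval_prefix(node + 1, visited))
                return false;   /* short-circuit: skip the other operand */
            return eval_prefix(node + node->other, visited);
        case N_OR:
            if (eval_prefix(node + 1, visited))
                return true;    /* short-circuit: skip the other operand */
            return eval_prefix(node + node->other, visited);
    }
    return false;               /* not reached for valid input */
}
```

For an AND whose first operand is a FALSE leaf, only two nodes are touched. For an AND of a TRUE leaf and a nested AND of two FALSE leaves, four nodes are touched and the last leaf is never evaluated.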
>
> - There's many internal intermediate representations of a query:
> TSQuery, a QTNode-tree, NODE-tree (in tsquery_cleanup.c), prefix
> notation stack of QueryItems (in parser), infix-tree. Could we remove
> some of these?
I have no strong objections. QTNode and NODE are tree-like structures, but
TSQuery is prefix notation for storage in flat memory. NODE is used only
to clean up stop-word placeholders, so it's a binary tree, while QTNode represents
an n-ary tree (with any number of children).
Thank you for your interest in tsearch. After rechecking the problem pointed out
above, I'll commit your patch.
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/