Re: tsearch refactorings

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Patches" <pgsql-patches(at)postgresql(dot)org>
Cc: "Teodor Sigaev" <teodor(at)sigaev(dot)ru>
Subject: Re: tsearch refactorings
Date: 2007-09-05 12:33:32
Message-ID: 46DEA21C.1030603@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

I wrote:
> I have a few more things in mind I'm planning to look into:
>
> - I'm not convinced that there's enough sanity checking in the input
> functions. I think you can send queries into the server using the binary
> protocol that you couldn't produce otherwise. For example, queries with
> multiple VAL nodes, with no operator to connect them. Does that wreak
> havoc in any of the functions?

I added code to check that the query tree is well-formed. It was indeed
possible to send malformed queries in binary mode, which produced all
kinds of strange results.

> And the left-field of QueryOperator is an
> int16, what happens if you have query that exceeds
> that?

Apparently the field wraps around and you get a backend crash when it
tries to do access an array using a negative index. You need to have a
large stack (and increase max_stack_depth now that we have the check for
that in there) to reproduce it. Not sure if it's an exploitable security
vulnerability, but it might be.

I fixed this by making the left-field a uint32. There's no reason to
arbitrarily limit it to 16-bits, and it won't increase the disk/memory
footprint either now that QueryOperator and QueryOperand are separate
structs.

> And parse_query always produces trees that are in prefix notation,
> so the left-field is really redundant, but using tsqueryrecv, you could
> inject queries that are not in prefix notation; is there anything in the
> code that depends on that?

At least infix-function seems to depend on it. By checking that the tree
is well-formed, and the fact that the left operand is implicitly the
next node in the array, we know that it is in prefix notation.

> - There's many internal intermediate representations of a query:
> TSQuery, a QTNode-tree, NODE-tree (in tsquery_cleanup.c), prefix
> notation stack of QueryItems (in parser), infix-tree. Could we remove
> some of these?

I haven't looked into this yet. Might leave it for 8.4.

> - There's a lot of recursive functions. Are they all using
> check_stack_depth?

I added check_stack_depth() call to all recursive functions I found.
Some of them might have a natural limit so that you can't force
arbitrarily deep recursions, but check_stack_depth() is cheap enough
that seems best to just stick it into anything that might be a problem.

Patch attached. It's on top of the tsearch-refactor-2.patch I posted
earlier.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
tsearch-incr-1.patch text/x-diff 11.9 KB

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Chris Browne 2007-09-05 15:11:31 Re: Lazy xid assignment V4
Previous Message Florian G. Pflug 2007-09-05 11:49:38 Re: Lazy xid assignment V4