Re: index scan with index cond on first column doesn't recognize sort order of second column

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: index scan with index cond on first column doesn't recognize sort order of second column
Date: 2003-02-14 04:23:16
Message-ID: 877kc3jxsb.fsf@stark.dyndns.tv
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Greg Stark <gsstark(at)mit(dot)edu> writes:
> > But it might be better ordered. I think you may have missed the original
> > context. In a query like:
> > SELECT * FROM table_a join table_b using (col2) WHERE a.col1 = $0
> > Where there's an index on table_a(col1,col2) could use a merge join without a
> > sort.
>
> Yes. The appropriate way to recognize that (without breaking things)
> is to consider that the indexscan generates pathkeys (col2) because col1
> can be assumed constant. This is not truncate_useless_pathkeys's
> business, however. It would have to be done in the code that creates
> the indexscan path to begin with.

Er, right, I think that's the code I originally quoted from indxpath.c:

/*
* 2. Match the index against non-'or' restriction clauses.
*/
restrictclauses = group_clauses_by_indexkey(rel, index);

/*
* 3. Compute pathkeys describing index's ordering, if any, then
* see how many of them are actually useful for this query.
*/
index_pathkeys = build_index_pathkeys(root, rel, index,
ForwardScanDirection);
index_is_ordered = (index_pathkeys != NIL);
useful_pathkeys = truncate_useless_pathkeys(root, rel,
index_pathkeys);

> (Ideally, we'd mark the path to show that it could be considered to have
> either sort ordering (col1, col2) or sort ordering (col2), so that it could
> match to either requirement. I'm not sure if that's feasible with the
> current datastructures though. Might have to make two copies of the path
> :-()

I don't think that's a problem because the optimizer seems to be doing a
pretty good job of constant propagation and it tracks down the transitive
closure of all the equijoin clauses. So if we have a col1=$0 clause for one
table then any other tables that are equijoined on col1 will also have a
col1=$0 clause. I'm hoping that's been done already, I'm not sure.

> Actually you are barking up the wrong tree entirely; I'm pretty certain
> that truncate_useless_pathkeys *doesn't* remove this pathkey, because it
> should notice that it is relevant to the mergejoin condition.

<tries gdb out on postgres> Hm... <rebuilds postgres with -O0> Hm...

No, truncate_useless_pathkeys actually returns NULL. It appears it decides the
pathkey is actually completely useless for the mergejoin and both
pathkeys_useful_for_merging and pathkeys_useful_for_ordering return 0. I guess
ltruncate returns NULL if the first argument is 0.

I think indxpath.c has to try truncate_useless_pathkeys and if it returns null
try removing any leading elements that it has an = clause for. It should try
truncate_useless_pathkeys after removing each element. I think it can just
stop after finding the first match.

Basic postgres mm question, is there a cdr function, and is it safe to leave
references to the cdr of this list around? It all gets cleaned up with a
ripcord deallocator?

--
greg

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2003-02-14 04:32:42 Re: index scan with index cond on first column doesn't recognize sort order of second column
Previous Message Egyud Csaba 2003-02-14 04:21:23 Fw: set returning functions in v7.3