Re: int8 primary keys still not using index without manual

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tino Wildenhain <tino(at)wildenhain(dot)de>, "Craig O'Shannessy" <craig(at)ucw(dot)com(dot)au>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-general(at)postgresql(dot)org
Subject: Re: int8 primary keys still not using index without manual
Date: 2003-11-07 18:27:05
Message-ID: 200311071827.hA7IR5503637@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > We can try removing most int2 functions and see if that makes such
> > conversions much easier, or try casting incoming constants to int2 to
> > see what happens.
>
> I already did that --- that was exactly the substance of the tests I was
> reporting in that message.
>
> I have been thinking lately that the fundamental approach is wrong
> anyway. Basically the idea was to make the world safe for
> single-datatype index handling by removing all the cross-type comparison
> operators. The reason
>
> SELECT ... WHERE int8col = 42
>
> isn't indexable is that the = operator is int84eq, which is not to be
> found in the set of operators associated with an index on int8. What
> we were thinking was that if we didn't have int84eq then the parser
> would be forced to promote the 42 to int8, and then the comparison using
> int8eq would be recognized as indexable.

Could we not always promote int4 to int8 for indexing purposes? I
realize OID has issues, though, as you mention in that URL:

This algorithm was wrong on both practical and theoretical levels;
in the first place it's not very helpful to only be able to handle
binary-compatible transformations, and in the second place there isn't
any good guarantee that it's not changing the semantics when it replaces
the operator. For instance int4 < and oid < do not act the same.
Depending on equality of operator names was a bad idea even then, and
would be quite unworkable now in the world of schema search paths.

Could we just promote int4 constants to int8 always? I just checked and
2^32-1 is already promoted to int8:

select 4294967295;

so the funny thing is that:

SELECT ... WHERE int8col = 4294967296;

already uses the int8 index.

I think the complex case you mentioned was oid. Let's look at the
various possible constant comparisons against an oid column:

-1 This constant would match no oid, so we could just allow the
sequential scan. I don't think anyone would expect good behavior for
such a comparison.


1 This could be promoted to oid cleanly.

2^31+1 This will come in a int8, so we can just downcast to oid
automatically.

I know your case was "<" comparison. It would be:

SELECT ... WHERE oidcol < -1;
SELECT ... WHERE oidcol < 1;
SELECT ... WHERE oidcol < 2147483649; 2^31+1

These all seem to work, I think. -1 will not use an index, which is OK.

I am concerned about having to add catalog maintenance for every index
case, which seems it could be a lot.

Here is my logic. I am having trouble getting the big picture on this:

int2 fits in the int4 range
int4 fits in the int8 range

oid fits only in the int8 range, some oids fit in int4

This means a valid oid could come in as int4 or int8.

I realize this requires hard-coded comparisons to C include defines to
get the maxium for each type. I know this breaks our type-neutral
style, but in this case, it seems it might be the cleanest way ---
abstracting this out into a table seems too hard.

Now for int2-based indexes. Can't we just downcast constants to int2 if
they fit in the int2 valid range?

> I think this might actually be workable for int8, but it's not going to
> work for int2 without changing the initial typing of small integer
> constants, and we already know that that opens a Pandora's box of other
> problems.
>
> But quite aside from the semantic difficulties of rejiggering all that
> stuff, it's going to break other parts of the optimizer if we do it.
> In particular it will interfere with handling of mergejoins and
> recognizing transitive equality. For example consider
>
> SELECT ... WHERE a.int8col = b.int4col AND b.int4col = 42;
>
> Currently we are able to deduce a.int8col = 42 (where the operator
> is int84eq). If we remove int84eq then the output of the parser for
> this example will look like
>
> SELECT ... WHERE a.int8col = b.int4col::int8 AND b.int4col = 42;
>
> and the transitive equality will not be recognized because
> b.int4col::int8 is not the same expression as b.int4col.

Seems we should keep those cross-type comparisons around for col op col
comparisons, at least, as well as internal optimizer use as you
described.

> So I'm currently thinking we'd be better off not to try to eliminate
> the cross-type comparison operators. Instead we need some solution
> that is narrowly focused on the problem of making a non-indexable
> comparison indexable, by converting a comparison value of the wrong
> datatype into the right datatype locally to the indexscan plan
> generation code. I posted some speculation about that here:
> http://archives.postgresql.org/pgsql-hackers/2003-09/msg00983.php

Agreed.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alvaro Herrera 2003-11-07 18:34:34 Re: Service doesnt UP!!!!
Previous Message Jeff 2003-11-07 18:25:18 Re: Recovery Data Cant Be!!!