From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Tino Wildenhain <tino(at)wildenhain(dot)de>, "Craig O'Shannessy" <craig(at)ucw(dot)com(dot)au>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: int8 primary keys still not using index without manual |
Date: | 2003-11-07 18:27:05 |
Message-ID: | 200311071827.hA7IR5503637@candle.pha.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > We can try removing most int2 functions and see if that makes such
> > conversions much easier, or try casting incoming constants to int2 to
> > see what happens.
>
> I already did that --- that was exactly the substance of the tests I was
> reporting in that message.
>
> I have been thinking lately that the fundamental approach is wrong
> anyway. Basically the idea was to make the world safe for
> single-datatype index handling by removing all the cross-type comparison
> operators. The reason
>
> SELECT ... WHERE int8col = 42
>
> isn't indexable is that the = operator is int84eq, which is not to be
> found in the set of operators associated with an index on int8. What
> we were thinking was that if we didn't have int84eq then the parser
> would be forced to promote the 42 to int8, and then the comparison using
> int8eq would be recognized as indexable.
Could we not always promote int4 to int8 for indexing purposes? I
realize OID has issues, though, as you mention in that URL:
This algorithm was wrong on both practical and theoretical levels;
in the first place it's not very helpful to only be able to handle
binary-compatible transformations, and in the second place there isn't
any good guarantee that it's not changing the semantics when it replaces
the operator. For instance int4 < and oid < do not act the same.
Depending on equality of operator names was a bad idea even then, and
would be quite unworkable now in the world of schema search paths.
Could we just promote int4 constants to int8 always? I just checked and
2^32-1 is already promoted to int8:
select 4294967295;
so the funny thing is that:
SELECT ... WHERE int8col = 4294967296;
already uses the int8 index.
I think the complex case you mentioned was oid. Let's look at the
various possible constant comparisons against an oid column:
-1 This constant would match no oid, so we could just allow the
sequential scan. I don't think anyone would expect good behavior for
such a comparison.
1 This could be promoted to oid cleanly.
2^31+1 This will come in a int8, so we can just downcast to oid
automatically.
I know your case was "<" comparison. It would be:
SELECT ... WHERE oidcol < -1;
SELECT ... WHERE oidcol < 1;
SELECT ... WHERE oidcol < 2147483649; 2^31+1
These all seem to work, I think. -1 will not use an index, which is OK.
I am concerned about having to add catalog maintenance for every index
case, which seems it could be a lot.
Here is my logic. I am having trouble getting the big picture on this:
int2 fits in the int4 range
int4 fits in the int8 range
oid fits only in the int8 range, some oids fit in int4
This means a valid oid could come in as int4 or int8.
I realize this requires hard-coded comparisons to C include defines to
get the maxium for each type. I know this breaks our type-neutral
style, but in this case, it seems it might be the cleanest way ---
abstracting this out into a table seems too hard.
Now for int2-based indexes. Can't we just downcast constants to int2 if
they fit in the int2 valid range?
> I think this might actually be workable for int8, but it's not going to
> work for int2 without changing the initial typing of small integer
> constants, and we already know that that opens a Pandora's box of other
> problems.
>
> But quite aside from the semantic difficulties of rejiggering all that
> stuff, it's going to break other parts of the optimizer if we do it.
> In particular it will interfere with handling of mergejoins and
> recognizing transitive equality. For example consider
>
> SELECT ... WHERE a.int8col = b.int4col AND b.int4col = 42;
>
> Currently we are able to deduce a.int8col = 42 (where the operator
> is int84eq). If we remove int84eq then the output of the parser for
> this example will look like
>
> SELECT ... WHERE a.int8col = b.int4col::int8 AND b.int4col = 42;
>
> and the transitive equality will not be recognized because
> b.int4col::int8 is not the same expression as b.int4col.
Seems we should keep those cross-type comparisons around for col op col
comparisons, at least, as well as internal optimizer use as you
described.
> So I'm currently thinking we'd be better off not to try to eliminate
> the cross-type comparison operators. Instead we need some solution
> that is narrowly focused on the problem of making a non-indexable
> comparison indexable, by converting a comparison value of the wrong
> datatype into the right datatype locally to the indexscan plan
> generation code. I posted some speculation about that here:
> http://archives.postgresql.org/pgsql-hackers/2003-09/msg00983.php
Agreed.
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2003-11-07 18:34:34 | Re: Service doesnt UP!!!! |
Previous Message | Jeff | 2003-11-07 18:25:18 | Re: Recovery Data Cant Be!!! |