From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Zeb Burke-Conte <zebburkeconte(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #16966: Nested loop joining across tables with varchar -> bpchar cast always scans varchar table first |
Date: | 2021-04-16 16:49:50 |
Message-ID: | 4158832.1618591790@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
Zeb Burke-Conte <zebburkeconte(at)gmail(dot)com> writes:
> I still find this quite counterintuitive, since nothing about my query is
> forcing Postgres to cast the varchar column to bpchar instead of the other
> way around. Is there some arcane standard that requires it?
Don't think so. It's a bit of an implementation artifact arising from
the fact that varchar has no operators of its own (it's kind of like a
domain over text, in our implementation). So the parser, faced with
"vc = bp", has to choose whether to use the "text = text" or
"bpchar = bpchar" operator. It settles on the latter, which I believe
is because it has one more exact match to the actual input types than
"text = text". (Cf. the resolution rules in [1], whcih are most certainly
not derived from the SQL standard.) Interestingly, if you'd been
working with "text = bp", then "text = text" would've been picked and
the inefficiency would be on the other side of the join due to needing
to cast the bpchar input.
It may be possible to argue that the SQL spec has a preference here,
but I'm not sure. (IIRC, their text talks about PAD SPACE attributes
rather than a distinct type, so mapping it onto our API takes some
head-scratching.) In any case, PG has had these resolution rules for
twenty years plus, so changing them would be a huge compatibility break.
> Changing this could be a "nice-to-have" although I see how it's not a bug
> per se.
Even if we wanted to change the resolution rules, that would just move
the pain somewhere else, as I indicated above.
It's possible that things could be improved by inventing operators
for "varchar = bpchar", "bpchar = varchar", etc and adding them to
all the right index opclasses. It would be a large amount of work
though and could easily introduce some unexpected side-effects.
Given that we regard bpchar as pretty much of a third-class citizen,
I'm not surprised that nobody has put effort into that.
regards, tom lane
[1] https://www.postgresql.org/docs/current/typeconv-oper.html
From | Date | Subject | |
---|---|---|---|
Next Message | PG Bug reporting form | 2021-04-16 16:52:15 | BUG #16967: Extremely slow update statement in trigger |
Previous Message | Zeb Burke-Conte | 2021-04-16 16:16:43 | Re: BUG #16966: Nested loop joining across tables with varchar -> bpchar cast always scans varchar table first |