Re: Doing better at HINTing an appropriate column within errorMissingColumn()

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>, Ian Barwick <ian(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Jim Nasby <jim(at)nasby(dot)net>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Subject: Re: Doing better at HINTing an appropriate column within errorMissingColumn()
Date: 2014-12-15 17:04:32
Message-ID: 11288.1418663072@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sun, Dec 14, 2014 at 8:24 PM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
>> Moving this patch to CF 2014-12 as work is still going on, note that
>> it is currently marked with Robert as reviewer and that its current
>> status is "Needs review".

> The status here is more like "waiting around to see if anyone else has
> an opinion". The issue is what should happen when you enter a qualified
> name like alvaro.herrera and there is no column named anything like
> herrera in the RTE named alvaro, but there is some OTHER RTE that
> contains a column with a name that is only a small Levenshtein
> distance away from herrera, like roberto.correra. The questions are:

> 1. Should we EVER give a you-might-have-meant hint in a case like this?
> 2. If so, does it matter whether the RTE name is just a bit different
> from the actual RTE or whether it's completely different? In other
> words, might we skip the hint in the above case but give one for
> alvara.correra?

It would be astonishingly silly to not care about the RTE name's distance,
if you ask me. This is supposed to detect typos, not thinkos.

I think there might be some value in a separate heuristic that, when
you typed foo.bar and that doesn't match but there is a baz.bar, suggests
that maybe you meant baz.bar, even if baz is not close typo-wise. This
would be addressing the thinko case not the typo case, so the rules ought
to be quite different --- in particular I doubt that it'd be a good idea
to hint this way if the column names don't match exactly. But in any
case the key point is that this is a different heuristic addressing a
different failure mode. We should not try to make the
Levenshtein-distance heuristic address that case.
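
To make the thinko heuristic concrete, here is a minimal sketch in Python
rather than the parser's C. It assumes the visible RTEs can be modeled as a
dict from alias to column names; that structure and the function name
suggest_other_rte are hypothetical and are not taken from the patch. It
suggests another RTE only when the column name matches exactly, and it
ignores how far apart the RTE names are.

```python
def suggest_other_rte(rte_name, col_name, rtes):
    """Return 'alias.column' hints for other RTEs holding exactly col_name.

    rtes is a hypothetical stand-in for the range table:
    a dict mapping each RTE alias to a list of its column names.
    """
    hints = []
    for alias, columns in rtes.items():
        if alias == rte_name:
            continue  # the named RTE was already searched and did not match
        if col_name in columns:  # exact match only, no fuzzy comparison
            hints.append(f"{alias}.{col_name}")
    return hints


if __name__ == "__main__":
    range_table = {"foo": ["id", "name"], "baz": ["bar", "id"]}
    # foo.bar does not exist, but baz.bar does
    print(suggest_other_rte("foo", "bar", range_table))
```

Nothing here involves edit distance, which is the point: the rule is kept
apart from the typo heuristic.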

So my two cents is that when considering a qualified name, this patch
should weigh Levenshtein distance across the two components equally.
There's no good reason to suppose that typos will attack one name
component more (nor less) than the other.
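
As a rough illustration of weighing both components equally, the sketch
below (Python, not the patch's C code) sums the Levenshtein distances of the
RTE name and the column name and only hints when the total stays under a
cutoff. The function names and the cutoff of 3 are assumptions made for the
example; the patch's actual thresholds may differ.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance with unit costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def qualified_distance(typed, candidate):
    """Sum the distances of the RTE part and the column part, equally weighted."""
    (t_rte, t_col), (c_rte, c_col) = typed, candidate
    return levenshtein(t_rte, c_rte) + levenshtein(t_col, c_col)


def best_hint(typed, candidates, max_distance=3):
    """Return the closest (rte, column) pair, or None if nothing is close enough.

    max_distance is an assumed cutoff chosen for this example only.
    """
    scored = [(qualified_distance(typed, c), c) for c in candidates]
    scored = [s for s in scored if s[0] <= max_distance]
    return min(scored)[1] if scored else None


if __name__ == "__main__":
    columns = [("alvaro", "herrera"), ("roberto", "correra")]
    print(best_hint(("alvara", "herrara"), columns))            # typo in both parts
    print(best_hint(("alvaro", "herrera"), [("roberto", "correra")]))
```

With this scoring, alvaro.herrera does not produce a hint for
roberto.correra, because the RTE names are far apart even though the column
names differ by only two substitutions.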

regards, tom lane
