From: | Josh Berkus <josh(at)agliodbs(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Peter Geoghegan <pg(at)heroku(dot)com>, Ian Barwick <ian(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Jim Nasby <jim(at)nasby(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> |
Subject: | Re: Doing better at HINTing an appropriate column within errorMissingColumn() |
Date: | 2014-06-17 21:28:53 |
Message-ID: | 53A0B315.3020205@agliodbs.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 06/17/2014 01:59 PM, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Emitting a suggestion with a large distance seems like it could be
>> rather irritating. If the user types in SELECT prodct_id FROM orders,
>> and that column does not exist, suggesting "product_id", if such a
>> column exists, will likely be well-received. Suggesting a column
>> named, say, "price", however, will likely make at least some users say
>> "no I didn't mean that you stupid @%!#" - because probably the issue
>> there is that the user selected from the completely wrong table,
>> rather than getting 6 of the 9 characters they typed incorrect.
>
> Yeah, that's my point exactly. There's no very good reason to assume that
> the intended answer is in fact among the set of column names we can see;
> and if it *is* there, the Levenshtein distance to it isn't going to be
> all that large. I think that suggesting "foobar" when the user typed
> "glorp" is not only not helpful, but makes us look like idiots.
Well, there's two different issues:
(1) offering a suggestion which is too different from what the user
typed. This is easily limited by having a max distance (most likely a
distance/length ratio, with a max of say, 0.5). The only drawback of
this would be the extra cpu cycles to calculate it, and some arguments
about what the max distance should be. But for the sake of the
children, let's not have a GUC for it.
(2) If there are multiple columns with the same levenschtien distance,
which one do you suggest? The current code picks a random one, which
I'm OK with. The other option would be to list all of the columns.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
From | Date | Subject | |
---|---|---|---|
Next Message | Josh Berkus | 2014-06-17 21:34:49 | Re: Minmax indexes |
Previous Message | Kevin Grittner | 2014-06-17 21:15:19 | Re: Doing better at HINTing an appropriate column within errorMissingColumn() |