From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Peter Geoghegan <pg(at)heroku(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>, Ian Barwick <ian(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Jim Nasby <jim(at)nasby(dot)net>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> |
Subject: | Re: Doing better at HINTing an appropriate column within errorMissingColumn() |
Date: | 2014-11-19 13:43:53 |
Message-ID: | CA+Tgmoam4EsCzo=mhK6PfgNV88BSEFR5ykueb+XEyJP7c7e_kA@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Nov 18, 2014 at 8:03 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Peter Geoghegan <pg(at)heroku(dot)com> writes:
>> On Tue, Nov 18, 2014 at 3:29 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> On Mon, Nov 17, 2014 at 3:04 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>>>> postgres=# select qty from orderlines ;
>>>> ERROR: 42703: column "qty" does not exist
>>>> HINT: Perhaps you meant to reference the column "orderlines"."quantity".
>
>>> I don't buy this example, because it would give you the same hint if
>>> you told it you wanted to access a column called ant, or uay, or tit.
>>> And that's clearly ridiculous. The reason why quantity looks like a
>>> reasonable suggestion for qty is because it's a conventional
>>> abbreviation, but an extremely high percentage of comparable cases
>>> won't be.
>
>> I maintain that omission of part of the correct spelling should be
>> weighed less.
>
> I would say that omission of the first letter should completely disqualify
> suggestions based on this heuristic; but it might make sense to weight
> omissions less after the first letter.
I think we would be well-advised not to start inventing our own
approximate matching algorithm. Peter's suggestion boils down to a
guess that the default cost parameters for Levenshtein suck, and your
suggestion boils down to a guess that we can fix the problems with
Peter's suggestion by bolting another heuristic on top of it - and
possibly running Levenshtein twice with different sets of cost
parameters. Ugh.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
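
For readers following the thread, here is a minimal sketch of the objection to the "qty" example. It is not code from the patch under discussion. The function `levenshtein` is a plain weighted edit distance written for illustration, with parameter names (`ins_cost`, `del_cost`, `sub_cost`) chosen here, and `first_letter_matches` is a hypothetical reading of the first-letter rule suggested in the quoted text. Since "qty", "ant", "uay" and "tit" can each be turned into "quantity" by insertions alone, they all come out at the same distance.

```python
def levenshtein(source, target, ins_cost=1, del_cost=1, sub_cost=1):
    """Weighted edit distance for turning `source` into `target`.

    Illustrative only: a textbook dynamic-programming version, not the
    implementation used by PostgreSQL.
    """
    # prev[j] = cost of building target[:j] from an empty source prefix
    prev = [j * ins_cost for j in range(len(target) + 1)]
    for i, s in enumerate(source, 1):
        # cur[0] = cost of deleting the first i characters of source
        cur = [i * del_cost]
        for j, t in enumerate(target, 1):
            cur.append(min(
                prev[j] + del_cost,                        # drop s
                cur[j - 1] + ins_cost,                     # insert t
                prev[j - 1] + (0 if s == t else sub_cost)  # match or substitute
            ))
        prev = cur
    return prev[-1]


def first_letter_matches(typed, candidate):
    """Hypothetical extra heuristic: reject candidates whose first letter differs."""
    return bool(typed) and bool(candidate) and typed[0] == candidate[0]


if __name__ == "__main__":
    column = "quantity"
    for typed in ("qty", "ant", "uay", "tit"):
        print(typed,
              levenshtein(typed, column),
              levenshtein(typed, column, ins_cost=1, del_cost=2, sub_cost=2),
              first_letter_matches(typed, column))
```

In this sketch all four inputs need exactly five insertions, so changing the cost parameters does not separate "qty" from the others. Only the extra first-letter check does, and that is the kind of add-on heuristic the message above argues against.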