From: | Josh Berkus <josh(at)agliodbs(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Ian Barwick <ian(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Jim Nasby <jim(at)nasby(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> |
Subject: | Re: Doing better at HINTing an appropriate column within errorMissingColumn() |
Date: | 2014-06-17 21:58:18 |
Message-ID: | 53A0B9FA.6030904@agliodbs.com |
Lists: | pgsql-hackers |
On 06/17/2014 02:53 PM, Tom Lane wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> On 06/17/2014 02:36 PM, Tom Lane wrote:
>>> Another issue is whether to print only those having exactly the minimum
>>> observed Levenshtein distance, or to print everything less than some
>>> cutoff. The former approach seems to me to be placing a great deal of
>>> faith in something that's only a heuristic.
>
>> Well, that depends on what the cutoff is. If it's high, like 0.5, that
>> could be a LOT of columns. Like, I plan to test this feature with a
>> 3-table join that has a combined 300 columns. I can completely imagine
>> coming up with a string which is within 0.5 or even 0.3 of 40 column names.
>
> I think Levenshtein distances are integers, though that's just a minor
> point.
I was giving distance/length ratios. That is, 0.5 would mean that up to
50% of the characters could be replaced/changed. 0.2 would mean that
only one character could be changed at lengths of five characters. Etc.
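
To make the arithmetic concrete, here is a small Python sketch of how a distance/length ratio translates into an allowed number of edits. The function name `max_edits` and the choice to express the ratio as an integer percentage (to avoid floating-point rounding) are assumptions for illustration, not anything from the patch.

```python
def max_edits(name_length: int, ratio_percent: int) -> int:
    """Largest edit distance allowed for a name of the given length.

    ratio_percent is the distance/length ratio expressed as a whole
    percentage (e.g. 20 for 0.2), a hypothetical convention used here
    so the result is computed with exact integer arithmetic.
    """
    return name_length * ratio_percent // 100


# How the same ratio plays out at different name lengths.
for length in (2, 4, 5, 10, 40):
    print(length, max_edits(length, 20), max_edits(length, 50))
```

At a ratio of 0.2 a five-character name tolerates one changed character, while a four-character name tolerates none.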
The problem with these ratios is that they behave differently with long
strings than short ones. I think realistically we'd need a double
threshold, i.e. ( distance <= 2 OR ratio <= 0.4 ). Otherwise the
obvious case, getting two characters wrong in a 4-character column name
(or one in a two-character name), doesn't get a HINT.
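
A minimal Python sketch of such a double threshold, reading the distance test as an upper bound (distance <= 2) so that short names with one or two wrong characters still qualify. The names `levenshtein`, `qualifies_for_hint`, `MAX_DISTANCE` and `MAX_RATIO` are hypothetical, and dividing by the length of the candidate column name (rather than the typed string) is an assumption; this is not the code from the patch under discussion.

```python
MAX_DISTANCE = 2   # assumed absolute cutoff, from the "distance" half of the rule
MAX_RATIO = 0.4    # assumed relative cutoff, from the "ratio" half of the rule


def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def qualifies_for_hint(typed: str, candidate: str) -> bool:
    """True if candidate passes either the absolute or the relative cutoff."""
    distance = levenshtein(typed, candidate)
    # Assumption: the ratio is taken against the candidate column's length.
    ratio = distance / max(len(candidate), 1)
    return distance <= MAX_DISTANCE or ratio <= MAX_RATIO


# Two wrong characters in a 4-character name: ratio is 0.5, so only the
# absolute cutoff lets this one through.
print(qualifies_for_hint("nmae", "name"))
# Four missing characters in a 21-character name: distance exceeds 2, so
# only the ratio cutoff lets this one through.
print(qualifies_for_hint("custmer_adres_lin", "customer_address_line"))
# Nothing in common: rejected by both cutoffs.
print(qualifies_for_hint("abcd", "wxyz"))
```

The short-name case passes only because of the absolute cutoff, and the long-name case passes only because of the ratio, which is the reason for combining the two.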
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com