Re: Doing better at HINTing an appropriate column within errorMissingColumn()

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Ian Barwick <ian(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Jim Nasby <jim(at)nasby(dot)net>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Subject: Re: Doing better at HINTing an appropriate column within errorMissingColumn()
Date: 2014-07-23 15:57:50
Message-ID: CA+TgmoYKiiq8MC0UJ5i5XfkTYBg1qqfN4YRCkZ60YDUnumkzzQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 17, 2014 at 9:34 AM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> Patch 1 does a couple of things:
> - fuzzystrmatch is bumped to 1.1, as Levenshtein functions are not part of
> it anymore, and moved to core.
> - Removal of the LESS_EQUAL flag that made the original submission patch
> harder to understand. All the Levenshtein functions wrap a single common
> function.
> - Documentation is moved, and regression tests for Levenshtein functions are
> added.
> - Functions with costs are renamed with a suffix with costs.
> After hacking this feature, I came up with the conclusion that it would be
> better for the user experience to move directly into backend code all the
> Levenshtein functions, instead of only moving in the common wrapper as Peter
> did in his original patches. This is done this way to avoid keeping portions
> of the same feature in two different places of the code (backend with common
> routine, fuzzystrmatch with levenshtein functions) and concentrate all the
> logic in a single place. Now, we may as well consider renaming the
> levenshtein functions into smarter names, like str_distance, and keep
> fuzzystrmatch to 1.0, having the functions levenshtein_* calling only the
> str_distance functions.

This is not cool. Anyone who is running a 9.4 or earlier database
using fuzzystrmatch and upgrades, either via dump-and-restore or
pg_upgrade, to a version with this patch applied will have a broken
database. They will still have the catalog entries for the 1.0
definitions, but those definitions won't be resolvable inside the new
cluster's .so file. The user will get a fairly unfriendly error
message that won't go away until they upgrade the extension, which may
involve dealing with dependency hell since the new definitions are in
a different place than the old definitions, and there may be
dependencies on the old definitions. One of the great advantages of
extension packaging is that this kind of problem is quite easily
avoidable, so let's avoid it.

There are several possible methods of doing that, but I think the best
one is just to leave the SQL-callable C functions in fuzzystrmatch and
move only the underlying code that supports them into core. Then, the
whole thing will be completely transparent to users. They won't need
to upgrade their fuzzystrmatch definitions at all, and everything will
just work; under the covers, the fuzzystrmatch code will now be
calling into core code rather than to code located in that same
module, but the user doesn't need to know or care about that.
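
As a rough illustration of that layering, here is a minimal standalone
sketch in C. It is not the actual patch: the names core_levenshtein()
and fuzzystrmatch_levenshtein() are hypothetical, and plain C strings
stand in for the backend's function-call interface so that the example
compiles on its own. The point is only that the entry point the
extension exports keeps its name and signature while its body
delegates to a routine living elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Stand-in for the routine that would live in core. The name
 * core_levenshtein() is hypothetical, not an actual backend symbol.
 * Returns the edit distance between s and t (unit costs), or -1 if
 * memory allocation fails.
 */
int
core_levenshtein(const char *s, const char *t)
{
    size_t      m;
    size_t      n;
    size_t      i;
    size_t      j;
    int        *prev;
    int        *curr;
    int         result;

    assert(s != NULL && t != NULL);

    m = strlen(s);
    n = strlen(t);

    prev = malloc((n + 1) * sizeof(int));
    curr = malloc((n + 1) * sizeof(int));
    if (prev == NULL || curr == NULL)
    {
        free(prev);
        free(curr);
        return -1;
    }

    /* Distance from the empty prefix of s to each prefix of t. */
    for (j = 0; j <= n; j++)
        prev[j] = (int) j;

    for (i = 1; i <= m; i++)
    {
        int        *tmp;

        curr[0] = (int) i;
        for (j = 1; j <= n; j++)
        {
            int         del = prev[j] + 1;
            int         ins = curr[j - 1] + 1;
            int         sub = prev[j - 1] + (s[i - 1] != t[j - 1]);
            int         best = (del < ins) ? del : ins;

            curr[j] = (sub < best) ? sub : best;
        }

        /* The row just computed becomes the previous row. */
        tmp = prev;
        prev = curr;
        curr = tmp;
    }

    result = prev[n];
    free(prev);
    free(curr);
    return result;
}

/*
 * Stand-in for the entry point that stays in the extension's shared
 * library (hypothetical name). Its name and signature are unchanged,
 * so anything that already refers to it keeps resolving; only the
 * body now calls into core.
 */
int
fuzzystrmatch_levenshtein(const char *s, const char *t)
{
    return core_levenshtein(s, t);
}
```

A caller that only knows about fuzzystrmatch_levenshtein() sees no
difference before and after the move, which is the property that keeps
existing 1.0 installations working across an upgrade.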

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
