Re: merging some features from plpgsql2 project

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Marko Tiikkaja <marko(at)joh(dot)to>
Subject: Re: merging some features from plpgsql2 project
Date: 2017-01-03 17:32:35
Message-ID: 47260958-4bbc-85ef-7f8b-994e5627a77d@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1/3/17 9:58 AM, Pavel Stehule wrote:
> > ** The real problem is that we have no mechanism for allowing a PL's
> > language/syntax/API to move forward without massive backwards compatibility
> > problems. **
>
> Just got back from break :-). Have some thoughts on this. Backwards
> compatibility is really a fundamental problem. There's really no
> solution to it other than to try and avoid using syntax to solve
> problems. It should be obvious to everyone that plgsql cannot
> withstand a compatibility break. Another language could be offered as

I don't think that's obvious at all. We've introduced incompatibility in
the main grammar without problem. You just need a way for people to get
the old behavior back if they need it. Eventually people will stop
relying on the old, broken behavior.

> an alternative in core (say, pl/psm or pl/v8), but pl/pgsql has to
> support old code. Some really out there features could maybe be
> redacted (in particular, using = for assignment), but not not much.
> But I guess we're stuck with the status quo.
>
> I think we ought to avoid language features that influence the
> behavior (performance is ok) of the code (and that includes throwing
> errors). That's a freight train headed towards javscript's 'strict'
> mode, which is thinly disguised language fork. #option and pragma
> type syntaxes are trying to cheat the language -- hardly anyone uses
> them and it's a tricky way to try and make the language into something
> other than it is.

Yeah, trying to bulk all these changes into one "magic setting" is not a
way to move forward. I think we're actually really well off in that
regard, because unlike most languages we have a very robust settings
system that allows controlling this behavior even at the statement level.

> C does it right -- dubious code is raised as warnings and it's up to
> the end user to determine which warnings are interesting and likely to
> be an error. So, rather than hacking the language to control throwing
> and errors and such there should be some ability validate the function
> heavily and verify suspicious use of INTO or other dubious things
> (unused variables, masked assignments, etc). The validation output
> could even be a set returning function.

While static analysis can do some good (and I think we should actually
be enabling more of that by default), it won't realistically solve
everything. Multi-row assignment is a good example: NO ONE is going to
be OK with tons of warnings for every little := or SELECT INTO (without
strict), but the reality is that most code actually won't work correctly
if you have multiple rows coming back, so there's nothing technically
wrong with `var = field FROM table WHERE table_id = plpgsql_variable` if
table_id is the PK: you'll always get 0 or 1 rows back.

> So -1 to strict mode, unless we can make a case why this can't be done
> as part of checking/validation.
>
>
> Can be plpgsq.extra_errors and plpgsql.extra_warnings solution?
>
> I am thinking so there is a space for improvement (in extra_* usage)
>
> Do you know plpgsql_check https://github.com/okbob/plpgsql_check ?

I think we should look at what parts of that we should pull into core
(as well as enabling more by default). Stuff that can be done at
compile/load time is certainly better than runtime checks.

> Other random points:
> *) Another major pain point is swapping in the input variables for
> debugging purposes. Something that emits a script based on a set of
> arguments would be wonderful.
>
> ???

Yeah, could you elaborate here?

> *) Would also like to have a FINALLY block
>
> What you can do there?

It's a block that ALWAYS executes, even if an exception occurs. Python
has this[1]. That (along with an ELSE clause for if there is no
exception) would mean you could catch an exception for a single command
instead of a bunch of commands.

Somewhat related to that, I wish you could make GUC changes that were
local only to a specific BEGIN block. AFAIK the GUC infrastructure fully
supports that, it would just need to be exposed in plpgsql.

> *) A mechanic to manually print out a stack trace for debugging
> purposes would be helpful.
>
>
> I had plan to develop a extension for this purpose - easy printing
> stack, function parameters, and local variables. But I had a motivation
> to start it. It can be usable for any PL

I assume you're thinking an SRF that spits out PG_CONTEXT? It'd be
really nice if you could also get things like function names and line
numbers broken out separately. I've thought of building this myself.

BTW, the biggest case I can think of using this for is a userspace
method of doing "private" functions, where the function throws an
exception unless it was called directly by a set of allowed functions
(or views).

> *) COPY not being able to accept arguments as variables (in particular
> the filename) is a major headache
>
>
> There is a patch "COPY as function"

Instead of just COPY, I'd like an easier way to pass identifiers into
SQL commands. format() certainly makes this easier than it was, but %3$s
gets old pretty quick. (Of course, if we had the concept of a dictionary
you could actually name the parameters and it wouldn't be quite so bad...)

> *) Upon error, we ought to print a few lines of context on either side
> of the error. Not very fun to try and figure out where the errors is
> when you are working off of 'syntax error near "FROM"' etc. This is a
> major problem throughout the platform IMO.
>
>
> Have not idea how to fix it

To do this I think you'd need to keep an array of preceding line
positions instead of just one, which I don't think would be that hard.
The bigger problem would be only spitting out the extra info on the
initial error, and not for subsequent context calls up the stack. I
don't think there's currently a way to tell if you're the level that the
ereport originated at (at least, not in an error).

Improving that would definitely be useful across the board, because
right now DEBUG becomes completely useless once you have more than 1 or
2 levels of nested calls.

> *) Some user visible mechanic other than forcing SQL through EXECUTE
> to be able to control plan caching would be useful.
>
>
> fully agree.
>
> Have you some ideas?
>
> What about plpgsql option (function scope) -- WITHOUT-PLAN-CACHE - any
> non trivial plans will not be cached - and evaluated as parametrized
> query only.

I'd also like the ability to do a "localized" PREPARE; similar to a SQL
level PREPARE statement, but ensuring that the statement got deallocated
when the block ended.

1: https://docs.python.org/3/tutorial/errors.html#defining-clean-up-actions
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2017-01-03 17:38:05 pgsql: Update copyright for 2017
Previous Message Pavel Stehule 2017-01-03 17:19:08 Re: merging some features from plpgsql2 project