Quick Links

Re: PL/pgSQL 2

From:	Merlin Moncure <mmoncure(at)gmail(dot)com>
To:	Joel Jacobson <joel(at)trustly(dot)com>
Cc:	Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: PL/pgSQL 2
Date:	2014-09-02 20:27:01
Message-ID:	CAHyXU0zSFe4Ope2x0RC-NiznZGbCpRgTuku8UnrKiryTb5T+zw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Mon, Sep 1, 2014 at 4:04 AM, Joel Jacobson <joel(at)trustly(dot)com> wrote:
> Hi,
>
> For those of you who use PL/pgSQL every day, I'm quite certain you all feel
> there are a number of things you would like to change in the language, but
> realize it cannot be achieved without possibly breaking compatibility, at
> least in theory. Even though you own code would survive the change, there
> might be code somewhere in the world which would break. This is of course
> not acceptable and that's why we have the current status quo of development,
> or at least not far away from a status quo.
>
> So instead of continue to adding optional settings to the config file, and
> instead of killing discussions around what can be done by bringing up the
> backwards-compatibility argument, let's instead fork the language and call
> it plpgsql2. Since no code is yet written in plpgsql2, we can start of from
> a clean sheet, and no good ideas need to be killed due to
> backwards-compatibility concerns.

What is the reasoning for breaking compatibiilty? Why not improve the
language that's there?

> From the top of my head, these are Things I personally would want to see in
> plpgsql2:
> + Make UPDATE/INSERT/DELETE throw error if they didnt' modify exactly 1 row,
> as that's the most common use-case, and provide alternative syntax to modify
> multiple or zero rows.

This is the exact wrong thing to do. It may be the most common use
case for *you*, but it's certainly not for me and is objectively a
step backards; set based processing is generally faster and less error
prone. Although we have to write loops sometimes (say, for complex
iteration or error handling), I tend to only do so as a last resort.
Windows functions + custom aggregate functions have removed most cases
where I would have had to so for computation reasons. If you *must*
process things row by row, why not use the facility in the language
that handle that: "WHERE CURRENT OF".

> + Make SELECT .. INTO .. throw an error if it selects more than 1 row. INTO
> STRICT only works if no rows should be an error, but there is currently no
> nice way if no rows OR exactly 1 row should be found by the query.

I see the point here, but this is not a fundamental problem with the
language IMO.

> + Change all warnings into errors

This is an ok idea, but not sure why you have to re-invent pl/pgsql to do it.

> These are small changes, probably possible with just a few hundred lines of
code in total, which also should be the ambition, as larger changes would
> never survive during time as it would require too much efforts to keep up
> with the main project. Secondly, I trust plpgsql mainly because it's being
> used by a lot of people in a lot of production systems, the same would not
> hold true for plpgsql2 for the first years of existence, so we who would use
> it in production systems must understand every single line of code changed
> and feel the risk of possible bugs and their impact are within acceptable
> boundaries.

Here are the headaches I see:

* performance: plpsql can be slow for many types of iterative
processing. everybody wants the language to run faster but rewriting
from scratch doesn't seem a good way to do that unless the current
language structure has some critical performance blocking shortcoming.

* lack of non-table data structures (like hashmap, etc). At present
pl/plgsql only has arrays to manage temporary non-table data. this is
where plpgsql is bad but many of the other languages like pl/v8 etc
are good.

* not enough function-time validation (I think we agree on this). the
situation has gotten better over the years but it'd be nice to have
stronger checks raised for runtime code (although, I prefer them as
warnings, not errors).

* can't manage transaction state -- only can create sub transactions

* can't execute until after mvcc snapshot occurs (for example, you
can't change transaction isolation and/or lock tables to prevent
serialization errors)

* weak or non-existent debugging tools

None of the above points except for the transaction management issue
seem to be solved easier in a new language vs an old one. Non sql
based languages pls (I figure plv8 is the best choice out of many)
would fix the performance issue but you'd have to give up first class
SQL statements to get that.

merlin

In response to

PL/pgSQL 2 at 2014-09-01 09:04:53 from Joel Jacobson

Responses

Re: PL/pgSQL 2 at 2014-09-02 21:21:50 from Joel Jacobson

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Marko Tiikkaja	2014-09-02 20:28:01	Re: [BUGS] BUG #10823: Better REINDEX syntax.
Previous Message	Alvaro Herrera	2014-09-02 20:24:15	Re: [BUGS] BUG #10823: Better REINDEX syntax.