PL/pgSQL - for discussion

From: jwieck(at)debis(dot)com (Jan Wieck)
To: pgsql-hackers(at)postgreSQL(dot)org (PostgreSQL HACKERS)
Subject: PL/pgSQL - for discussion
Date: 1998-03-12 15:38:32
Message-ID: m0yDA41-000BFRC@orion.SAPserv.Hamburg.dsh.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

as I proposed, I'm now starting on the PL/pgSQL loadable
procedural language. As far as I'm now I have a pl_handler
with an independent flex/bison parser that can parse a
rudimentary implementation of the language. The next step is
to start on the PL/pgSQL executor and look if the generated
instruction tree can be used (up to now the pl_handler only
dumps the instruction tree and returns a 0 Datum.

If that works I'll expand the scanner/parser to the full
PL/plSQL language including trigger procedures.

But I would like to have some discussion on language itself.
So I wrote down what I had in mind. The document is appended
below.

Please comment/suggest !

Someone gave a hint about global variables existing during a
session. What is a session than? One transaction? The
backends lifetime? And should global variables be visible by
more than one function? I vote for NO! In that case we need
something like packages of functions that share globals.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#======================================== jwieck(at)debis(dot)com (Jan Wieck) #

PL/pgSQL
A procedural language for the PostgreSQL RDBMS
1st draft

Jan Wieck <jwieck(at)debis(dot)com>

Notice

This document is for the postgresql-hackers community for
completing the syntax specification of PL/pgSQL. The
extension module described here doesn't exist up to now!

Preface

PL/pgSQL is a procedural language based on SQL designed for
the PostgreSQL database system.

The extensibility features of PostgreSQL are mostly based on
the ability to define functions for various operations.
Functions could have been written in PostgreSQL's SQL dialect
or in the C programming language. Functions written in C are
compiled into a shared object and loaded by the database
backend process on demand. Also the trigger features of
PostgreSQL are based on functions but required the use of the
C language.

Since version 6.3 PostgreSQL supports the definition of
procedural languages. In the case of a function or trigger
procedure defined in a procedural language, the database has
no builtin knowlege how to interpret the functions source
text. Instead, the function and trigger calls are passed into
a handler that knows the details of the language. The
handler itself is function compiled into a shared object and
loaded on demand.

Overview

The PL/pgSQL language is case insensitive. All keywords and
identifiers can be used in upper-/lowercase mixed.

PL/pgSQL is a block oriented language. A block is defined as

[<<label>>]
[DECLARE
-- declarations]
BEGIN
-- statements
END;

There can be any number of subblocks in the statements
section of a block. Subblocks can be used to hide variables
from outside a block of statements (see Scope and visability
below).

Comments

There are two types of comments in PL/pgSQL. A double dash
'--' starts a comment that extends to the end of the line. A
'/*' starts a block comment that extends to the next '*/'.
Block comments cannot be nested, but double dash comments can
be enclosed into a block comment.

Declarations

All variables, rows and records used in a block or it's
subblocks must be declared in the declarations section of the
block. The parameters given to the function are
automatically declared with the usual identifiers $n. The
declarations have the following syntax:

<name> [CONSTANT] <type> [NOT NULL]
[DEFAULT | := <value>];

Declares a variable of the specified type. If the
variable is declared as CONSTANT, the value cannot be
changed. If NOT NULL is specified, an assignment of a
NULL value results in a runtime error. Since the
default value of a variable is the SQL NULL value,
all variables declared as NOT NULL must also have a
default value.

The default value is evaluated at the actual function
call. So assigning 'now' to an abstime varable causes
the variable to have the time of the actual function
call, not when the function was compiled.

<name> <class>%ROWTYPE;

Declares a row with the structure of the given class.
Class must be an existing table- or viewname of the
database. The fields of the row are accessed in the
dot notation. Parameters to a procedure could be
tuple types. In that case the corresponding
identifier $n will be a rowtype. Only the user
attributes and the oid of a tuple are accessible in
the row. There must be no whitespaces between the
classname, the percent and the ROWTYPE keyword.

<name> RECORD;

Records are similar to rowtypes, but they have no
predefined structure and it's impossible to assign a
value into them. They are used in selections and FOR
loops to hold one actual database tuple from a select
operation. One and the same record can be used in
different selections (but not in nested ones).

<name> ALIAS FOR $n;

For better readability of the code it's possible to
define an alias for a positional parameter to the
function.

Datatypes

The type of a variable can be any of the existing data types
of the database. <type> above is defined as:

postgesql-basetype
or variable%TYPE
or rowtype.field%TYPE
or class.field%TYPE

As for the rowtype declaration, there must be no whitespaces
between the classname, the percent and the TYPE keyword.

Expressions

All expressions used in PL/pgSQL statements are processed
using the backends executor. Since even a constant looking
expression can have a totally different meaning for a
particular data type (as 'now' for abstime), it is impossible
for the PL/pgSQL parser to identify real constant values
other than the NULL keyword. The expressions are evaluated by
internally executing a query

SELECT <expr>

over the SPI manager. In the expression, occurences of
variable identifiers are substituted by parameters and the
actual values from the variables are passed to the executor
as query parameters. All the expressions used in a PL/pgSQL
function are only prepared and saved once.

Statements

Anything not understood by the parser as specified below will
be put into a query and sent down to the database engine to
execute. The resulting query should not return any data.

Assignment

An assignment of a value to a variable or rowtype field
is written as:

<identifier> := <expr>;

If the expressions result data type doesn't match the
variables data type, or the variables atttypmod value is
known (as for char(20)), the result value will be
implicitly casted by the PL/pgSQL executor using the
result types output- and the variables type input-
functions. Note that this could potentially result in
runtime errors generated by the types input functions.

An assignment of a complete selection into a record or
rowtype can be done as:

SELECT targetlist INTO <recname|rowname> FROM fromlist;

If a rowtype is used as target, the selected targetlist
must exactly match the structure of the rowtype or a
runtime error occurs. The fromlist can be followed by
any valid qualification, grouping, sorting etc. There is
a special condition [NOT] FOUND that can be used
immediately after a SELECT INTO to check if the data has
been found.

SELECT * INTO myrec FROM EMP WHERE empname = myname;
IF NOT FOUND THEN
ELOG ERROR 'employee %s not found' myname;
END IF;

In addition, the select statement must not return more
that one row. If multiple rows are returned, a runtime
error will be generated.

Returning from the function

RETURN <expr>;

The function terminates and the value of <expr> will be
returned to the upper executor. The return value of a
function cannot be undefined. If control reaches the end
of the toplevel block of the function without hitting a
RETURN statement, a runtime error will occur.

Aborting and messages

As indicated above there is an ELOG statement that can
throw messages into the PostgreSQL elog mechanism.

ELOG level 'format' [identifiers];

Inside the format, only %s might be used as a placeholder
for the following identifiers. The identifiers must
specify an existing variable or row/record field.

Conditionals

IF <expr> THEN
-- statements
[ELSE
-- statements]
END IF;

The expression <expr> must return a value that at least
can be casted into a boolean.

Loops

There are multiple types of loops.

[<<label>>]
LOOP
-- statements
END LOOP;

An unconditional loop that must be terminated explicitly
by an EXIT statement. The optional label can be used by
EXIT statements of nested loops to specify which level of
nesting should be terminated.

[<<label>>]
WHILE <expr> LOOP
-- statements
END LOOP;

A conditional loop that is executed as long as the
evaluation of <expr> returns true.

[<<label>>]
FOR <name> IN [REVERSE] <expr>..<expr> LOOP
-- statements
END LOOP.

A loop that iterates over a range of integer values. The
variable <name> is automatically created as type integer
and exists only inside the loop. The two expressions
giving the lower and upper bound of the range are
evaluated only when entering the loop. The iteration step
is 1.

FOR <recname|rowname> IN <select_clause> LOOP
-- statements
END LOOP;

The record or row is assigned all the rows resulting from
the select clause and the statements executed for each.
If the loop is terminated with an EXIT statement, the
last accessed row is still accessible in the record or
rowtype.

EXIT [label] [WHEN <expr>];

If no label given, the innermost loop is terminated and
the statement following END LOOP is executed next. If
label is given, it must be the label of the current or an
upper level of nested loops or blocks. Then the named
loop or block is terminated and control continues with
the statement after the loops/blocks corresponding END.

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas G. Lockhart 1998-03-12 16:21:08 Re: [HACKERS] PL/pgSQL - for discussion
Previous Message Darren King 1998-03-12 14:49:35 Re: [HACKERS] port/getrusage.c?