Re: in Pl/PgSQL, do commit every 5000 records

From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: emilu(at)encs(dot)concordia(dot)ca
Cc: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>, pgsql-general(at)postgresql(dot)org
Subject: Re: in Pl/PgSQL, do commit every 5000 records
Date: 2006-03-13 18:14:00
Message-ID: b42b73150603131014sc02f399r2a324fba9fe08770@mail.gmail.com
Lists: pgsql-general

> I will try separate my huge data computation into several pieces
> something like:
[...]

> If I understood correctly, "begin ... exception when .. then ... end"
> can work the same way as commit. In another way, if commands in the
> sub-block (such as step1) run successfully, data in this part (step1) is
> committed. Then step2, step3... stepN that are all under "begin..
> exception.. end" sub-blocks will be run and "committed" one by one.

begin...exception...end; does not break the transaction up into
smaller, independently committed pieces. It does allow graceful
handling of errors inside a function, but that is not what you are
looking for.

To put it another way, it is impossible for any part of the work done
inside the function to become visible to other backends until you
leave the function without error and the transaction that wraps it (if
there is one) is committed.
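A minimal sketch of the point (table and function names here are made
up): the exception sub-block only traps the error, it does not commit
anything:

```sql
CREATE FUNCTION load_step1() RETURNS void AS $$
BEGIN
    BEGIN
        INSERT INTO target SELECT * FROM source_part1;
    EXCEPTION WHEN OTHERS THEN
        RAISE NOTICE 'step1 failed, continuing with step2';
    END;
    -- even though the sub-block above finished cleanly, its rows are
    -- NOT visible to other backends until the transaction wrapping
    -- the whole function commits (or everything rolls back on error)
END;
$$ LANGUAGE plpgsql;
```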

> "To define a procedure, i.e. a function that returns nothing, just
> specify RETURNS VOID. "
> Copied from
> http://www.postgresql.org/docs/8.1/interactive/sql-createfunction.html
>
> So, a stored procedure is "a void function", right?

yes and no. Yes, in that a procedure differs from a function in that
it returns no value. No, in that the 'pure' stored procedures
discussed some months back on -hackers (and IIRC not implemented yet)
are not transactional entities. They are more like server-side sql
scripts that take parameters. Try running vacuum inside a
function...it doesn't work, but it would inside a non-function-based
stored procedure.
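For example (hypothetical table name), this fails at runtime because
vacuum cannot run inside a transaction block, and a function always
executes inside one:

```sql
CREATE FUNCTION try_vacuum() RETURNS void AS $$
BEGIN
    -- ERROR: VACUUM cannot be executed from a function
    VACUUM foo;
END;
$$ LANGUAGE plpgsql;

SELECT try_vacuum();
```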

I think your best bet is to first try Florian's approach of the
monolithic query and see if it works for you...if it does, great. If
not, you have basically three options:

1. in-transaction cursor:
declare your input cursor inside a transaction and fetch x rows at a
time (say, 1000), writing them back over another connection's
transaction and committing as you go. This is insensitive in that you
can't see changes as you loop through the set.
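Roughly, on the reading connection (table and cursor names are
hypothetical; the writes happen on a second connection that commits
after each batch):

```sql
-- connection 1: long-lived reading transaction
BEGIN;
DECLARE c CURSOR FOR SELECT * FROM big_table;
FETCH 1000 FROM c;   -- process batch, write results via connection 2
FETCH 1000 FROM c;   -- connection 2 commits after each batch
-- ...repeat until FETCH returns no rows...
CLOSE c;
COMMIT;
```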

2. non-transactional cursor:
using only one connection, declare your cursor 'with hold' and loop
over it, inserting over the same connection. Just beware that
postgresql must materialize 'with hold' cursors into a temporary
table. This approach is also insensitive.
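A sketch of that, all on one connection (names again hypothetical):

```sql
BEGIN;
DECLARE c CURSOR WITH HOLD FOR SELECT * FROM big_table;
COMMIT;              -- cursor survives; postgresql materializes it

FETCH 1000 FROM c;   -- process and insert on this same connection,
COMMIT;              -- committing as you go: the cursor is no longer
                     -- bound to any transaction
-- ...repeat, then...
CLOSE c;
```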

3. client-side table browsing. You can browse your tables ISAM style.
This is a fancy way of saying you use client code to loop over a
table ordered on a key. This approach can be sensitive (one
connection) or insensitive (two connections) while preserving the
ability to commit as you go.
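The per-batch query looks something like this (assuming a hypothetical
table big_table with a unique key id; :last_id is a variable your
client code carries from one batch to the next):

```sql
-- repeat, substituting the largest id seen so far for :last_id
SELECT * FROM big_table
 WHERE id > :last_id
 ORDER BY id
 LIMIT 1000;
-- process the batch, write results, COMMIT, then fetch the next batch
```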

merlin
