From: Geert Mak <pobox(at)verysmall(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: processing large amount of rows with plpgsql
Date: 2012-08-09 05:57:25
Message-ID: 156BB7A4-1F36-4023-A885-77BDB2AF2699@verysmall.org
Lists: pgsql-general
On 08.08.2012, at 22:04, Merlin Moncure wrote:
> What is the general structure of the procedure? In particular, how
> are you browsing and updating the rows?
Here it is -
BEGIN
for statistics_row in SELECT * FROM statistics ORDER BY time ASC
LOOP
...
... here some very minimal transformation is done
... and the row is written into the second table
...
END LOOP;
RETURN 1;
END;
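Filled out, the whole function is roughly this shape (the destination table
statistics_new and its columns are placeholders for illustration, not the
real schema):

CREATE OR REPLACE FUNCTION copy_statistics() RETURNS integer AS $$
DECLARE
    statistics_row statistics%ROWTYPE;
BEGIN
    FOR statistics_row IN SELECT * FROM statistics ORDER BY time ASC
    LOOP
        -- minimal transformation, then write the row into the second table
        INSERT INTO statistics_new (time, value)
        VALUES (statistics_row.time, statistics_row.value);
    END LOOP;
    RETURN 1;
END;
$$ LANGUAGE plpgsql;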
> There is (almost) no way to
> force commit inside a function --
So what you are saying is that this behavior is normal, and we should either equip ourselves with enough disk space (which I am trying now; it is a cloud server, which I am resizing to gain more disk space and see what will happen) or do it with an external (scripting) language?
> there has been some discussion about
> stored procedure and/or autonomous transaction feature in terms of
> getting there.
>
> I say 'almost' because you can emulate some aspects of autonomous
> transactions with dblink, but that may not be a very good fit for your
> particular case.
I have already come across dblink being mentioned in this context somewhere... Though if plpgsql performs well with more disk space, I'll leave it for now. It is a one-time operation anyway.
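For reference, the dblink trick works because every statement pushed through
dblink runs on its own connection, which is in autocommit mode, so it commits
independently of the calling transaction. A rough illustration (the connection
string and the INSERT are made up, not taken from the thread):

CREATE EXTENSION IF NOT EXISTS dblink;

-- the INSERT commits on the remote connection as soon as dblink_exec returns,
-- regardless of what happens later in the local transaction
SELECT dblink_exec('dbname=mydb user=myuser',
                   'INSERT INTO statistics_new SELECT * FROM statistics');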
Thank you,
Geert
From: Marc Mamin <M(dot)Mamin(at)intershop(dot)de>
To: Geert Mak <pobox(at)verysmall(dot)org>, pgsql-general(at)postgresql(dot)org
Subject: Re: processing large amount of rows with plpgsql
Date: 2012-08-09 07:11:55
Message-ID: C4DAC901169B624F933534A26ED7DF310861B65D(at)JENMAIL01(dot)ad(dot)intershop(dot)net
In-Reply-To: 156BB7A4-1F36-4023-A885-77BDB2AF2699(at)verysmall(dot)org
Lists: pgsql-general
> > There is (almost) no way to
> > force commit inside a function --
>
> So what you are saying is that this behavior is normal, and we should
> either equip ourselves with enough disk space (which I am trying now;
> it is a cloud server, which I am resizing to gain more disk space and
> see what will happen) or do it with an external (scripting) language?
>
Hello,
A relatively simple way to work around your performance/resource problem is
to slice the update.
e.g.:
create function myupdate(slice int) ...
for statistics_row in
SELECT * FROM statistics
WHERE id % 16 = slice
or:
WHERE hashtext(id::text) % 16 = slice
...
and then call your function with the values 0..15 (when using 16 slices).
Use a power of 2 for the number of slices.
It may be faster to use many slices, and this also lets you do the job in
parallel on a few threads.
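Spelled out, a sliced version could look like this (sketch only: the
destination table statistics_new, its columns and the id column are
placeholders, not the real schema):

create or replace function myupdate(slice int, nslices int) returns integer as $$
declare
    statistics_row statistics%rowtype;
begin
    for statistics_row in
        SELECT * FROM statistics
        -- nslices must be a power of 2 for the bitmask to work; with a
        -- non-negative integer id, "id % nslices = slice" works as well
        WHERE (hashtext(id::text) & (nslices - 1)) = slice
        ORDER BY time ASC
    loop
        -- minimal transformation, then insert into the second table
        INSERT INTO statistics_new (time, value)
        VALUES (statistics_row.time, statistics_row.value);
    end loop;
    return 1;
end;
$$ language plpgsql;

Each call then runs in its own transaction, e.g. from psql:

SELECT myupdate(0, 16);
SELECT myupdate(1, 16);
...
SELECT myupdate(15, 16);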
HTH,
Marc Mamin