Re: storing intermediate results from recursive plpgsql

From: wsheldah(at)lexmark(dot)com
To: Fran Fabrizio <ffabrizio(at)mmrd(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: storing intermediate results from recursive plpgsql
Date: 2001-12-13 22:37:14
Message-ID: 200112132241.RAA27523@interlock2.lexmark.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

It sounds like each batch of children gets operated on three different times:
once when you select the children of a particular id, again when you insert them
into the temporary table, and a third time when you select from the temp table.

The first and easiest optimization would be to truncate the temp table instead
of deleting from it, if you're not doing that alread. That won't solve the real
problem though.

I have basically the same design. What I'm doing is issuing the selects from
perl, and storing the results in a perl hash structure. I only have to select
each batch of id's once this way. I'm sure this makes up for whatever I lose by
not doing it in a postgres function. Seems to work well. In some cases I'm using
Storable to cache the resulting perl hash in a Postgresql bytea field so I don't
always rebuild the entire tree from scratch.

You might also google for Joe Celko and his nested set model. It's a bit
complex, but looks like it could be a win, especially if you have a very high
ratio of selects to inserts/updates. Other people have tried other variations of
this; the mail archives of this list in just the last year have other ideas if
you look for tree implementations.

Good luck,

Wes Sheldahl

Fran Fabrizio <ffabrizio%mmrd(dot)com(at)interlock(dot)lexmark(dot)com> on 12/13/2001 04:43:34
PM

To: pgsql-general%postgresql(dot)org(at)interlock(dot)lexmark(dot)com
cc: (bcc: Wesley Sheldahl/Lex/Lexmark)
Subject: [GENERAL] storing intermediate results from recursive plpgsql

Hello,

I've got a plpgsql function that is recursive. Basically, it traverses
a table that represents a tree, which in turn represents parent-child
relationships. So, I have a function, get_descendants. For each pass,
it gets the children of some id. Then it recurses and looks for the
children of all of those children, etc...So, along the way, I'm building
a list of ids that represent the whole family.

For lack of a better idea, I'm storing the id's into a table on each
pass. So, if I recurse three levels, I'm doing three inserts. When the
recursion exits, I simply select the entire table and then I delete all
rows from it. The performance hit I take is unacceptable, something
like .02 - .03 seconds per insert, and it's adding up due to the amount
of times I have to run this function. The end result is that the web
page that displays this data takes many seconds to run.

Is there some sort of data structure in plpgsql (an array) that I can
use instead of the hack of inserting into a table on each pass and
selecting back out at the end? I have to find a way to optimize this
process further.

Thanks,
Fran

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Fran Fabrizio 2001-12-13 22:51:57 Re: storing intermediate results from recursive plpgsql
Previous Message adam fisher 2001-12-13 22:31:11 Re: foreign key creation problem