plpgsql arrays

From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: plpgsql arrays
Date: 2009-04-03 13:32:54
Message-ID: alpine.DEB.2.00.0904031420470.21772@aragorn.flymine.org
Lists: pgsql-performance


I'm writing a plpgsql function that effectively does a merge join on the
results of two queries. Now, it appears that I cannot read the results of
two queries as streams in plpgsql, so I need to copy the contents of one
query into an array first, and then iterate over the second query
afterwards.

I have discovered that creating large arrays in plpgsql is rather slow. In
fact, it seems to be O(n^2). The following code fragment is incredibly
slow:

genes := '{}';
next_new := 1;
FOR loc IN SELECT location.* FROM location, gene
           WHERE location.subjectid = gene.id
           ORDER BY objectid, intermine_start, intermine_end LOOP
    genes[next_new] := loc;
    IF (next_new % 10000 = 0) THEN
        RAISE NOTICE 'Scanned % gene locations', next_new;
    END IF;
    next_new := next_new + 1;
END LOOP;
genes_size := coalesce(array_upper(genes, 1), 0);
RAISE NOTICE 'Scanned % gene locations', genes_size;

For 200,000 rows it takes 40 minutes.
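(For what it's worth, one workaround I'm considering is building the whole
array in a single statement rather than appending one element at a time,
so the backend doesn't re-copy the array on every assignment. A sketch,
assuming your server version supports arrays of composite row values:

    -- Build the array in one go with an ARRAY(subquery) constructor
    -- instead of per-element assignment in a loop.
    genes := ARRAY(
        SELECT location
        FROM location, gene
        WHERE location.subjectid = gene.id
        ORDER BY objectid, intermine_start, intermine_end
    );
    genes_size := coalesce(array_upper(genes, 1), 0);

I haven't verified whether that avoids the quadratic behaviour here.)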

So, is there a way to dump the results of a query into an array quickly in
plpgsql, or alternatively is there a way to read two results streams
simultaneously?
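The shape I have in mind for the second option is something like two
explicit cursors advanced side by side, merge-join style. A rough sketch,
where the table and column names (stream_a, stream_b, sort_key) are
illustrative assumptions, and duplicate keys on both sides would need
extra handling:

    DECLARE
        a_cur CURSOR FOR SELECT * FROM stream_a ORDER BY sort_key;
        b_cur CURSOR FOR SELECT * FROM stream_b ORDER BY sort_key;
        a RECORD;
        b RECORD;
        a_done boolean := false;
        b_done boolean := false;
    BEGIN
        OPEN a_cur;
        OPEN b_cur;
        FETCH a_cur INTO a;  a_done := NOT FOUND;
        FETCH b_cur INTO b;  b_done := NOT FOUND;
        WHILE NOT a_done AND NOT b_done LOOP
            IF a.sort_key < b.sort_key THEN
                FETCH a_cur INTO a;  a_done := NOT FOUND;
            ELSIF a.sort_key > b.sort_key THEN
                FETCH b_cur INTO b;  b_done := NOT FOUND;
            ELSE
                -- keys match: process the pair, then advance one side
                RAISE NOTICE 'match on %', a.sort_key;
                FETCH b_cur INTO b;  b_done := NOT FOUND;
            END IF;
        END LOOP;
        CLOSE a_cur;
        CLOSE b_cur;
    END;

Is that a sensible way to do it, or is there a better idiom?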

Matthew

--
I would like to think that in this day and age people would know better than
to open executables in an e-mail. I'd also like to be able to flap my arms
and fly to the moon. -- Tim Mullen
