Re: temp tables and function performance

From: Alban Hertroys <alban(at)magproductions(dot)nl>
To: Anton Melser <melser(dot)anton(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: temp tables and function performance
Date: 2006-12-18 09:51:36
Message-ID: 458664A8.4050609@magproductions.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Anton Melser wrote:
> Hi,
> I am trying to move up in the world with my sql and need to do the
> following...
> I have a subscribers table and I need to export to csv (semi-colon
> separated) certain fields - that is fine, but I also need to export a
> multi-select field from another table as one string (0 to n values
> separated by commas) per line. The problem being that to get the
> actual string I have to go via 4 other relations... and as I have
> 200k+ subscribers this takes a while.

200k isn't all that much, unless you have a lot of large columns.

> My idea (which seems to work, though I haven't tested fully as it
> takes too damn long!), was to do the following. I would REALLY
> appreciate any pointers as my sql has never been this challenged!
>
> CREATE OR REPLACE FUNCTION mytest()
> RETURNS integer AS
> $BODY$DECLARE kindy INTEGER;
> BEGIN

> create temporary table tmp_interests(
> id bigint,
> value character varying(100)
> )
> WITHOUT OIDS
> ON COMMIT DROP;
>
> insert into tmp_interests
> select distinct si.subid, rbi.value
> from restem rbi, cats cc, trm_terms tt, subrest si
> where rbi.key = cc.name
> and cc.catid = tt.modcid
> and tt.tid = si.themeid;

You can do these in one go using
CREATE TEMPORARY TABLE tmp_interests AS SELECT ...

> create temporary table tmp_subscribers(
> email character varying(200),
> format character varying(4),
> interests character varying(1000),
> )
> WITHOUT OIDS
> ON COMMIT DROP;
>
> insert into tmp_subscribers
> Select email,
> format,
> my_interests(id) as interests
> from subscriber;
>
> GET DIAGNOSTICS kindy = ROW_COUNT;
>
> copy tmp_subscribers to '/home/myname/subs.csv' WITH CSV DELIMITER AS
> ';' NULL AS '';

If you have PG8.2 and can combine your 2 select queries into one, then
you can create a view of them and copy that instead. It takes out all
the inserts and can use your already existing table statistics - it
should be faster.

Also, after inserting a bunch of records into a table, make a habit of
running ANALYSE on it. Otherwise the query-planner knows nothing about
the data in the tables and is likely to come up with a sub-optimal query
plan.

>
> GET DIAGNOSTICS kindy = ROW_COUNT;
>
> return kindy;
>
> END;$BODY$
> LANGUAGE 'plpgsql' VOLATILE;
>
> ...
>
>
> CREATE OR REPLACE FUNCTION my_interests(bigint)
> RETURNS character varying AS
> $BODY$DECLARE
> subid ALIAS FOR $1;
> interests character varying;
> myinterest RECORD;
>
> BEGIN
> interests := '';
> FOR myinterest IN execute 'select value from tmp_interests where id =
> ' || subid LOOP

No need for a dynamic query here...

> if interests = '' then
> interests := myinterest.value;
> else
> interests := interests || ',' || myinterest.value;
> end if;
> END LOOP;
>
> RETURN interests;
> END$BODY$
> LANGUAGE 'plpgsql' VOLATILE;

I'd have to look up the syntax, but I'm quite certain you can put the
results of a select into an array. After that you can call
array_to_string(...) to convert it into a comma seperated string. That'd
take away the need for this SP (which I think is actually STABLE instead
of VOLATILE).

--
Alban Hertroys
alban(at)magproductions(dot)nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede

// Integrate Your World //

In response to

Browse pgsql-general by date

  From Date Subject
Next Message riki 2006-12-18 10:49:42 Re: installing postgres on win Me...
Previous Message Alban Hertroys 2006-12-18 09:37:16 Re: Stored Procedure and Trigger they puzzle me