From: | "Dmitry Koterov" <dmitry(at)koterov(dot)ru> |
---|---|
To: | "Postgres General" <pgsql-general(at)postgresql(dot)org> |
Subject: | How to speedup intarray aggregate function? |
Date: | 2007-10-09 23:01:58 |
Message-ID: | d7df81620710091601j7ed12403o3e11523dc3f0f82f@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hello.
I created an aggregate:
CREATE AGGREGATE intarray_aggregate_push (_int4)
(
STYPE = _int4,
SFUNC = intarray_push_array,
INITCOND = '{}'
);
(or - I may use _int_union instead of intarray_push_array, its speed is
practically the same in my case).
This aggregate merges together a list of integer[] arrays resulting one big
array with all elements.
Then I want to use this aggregate:
SELECT intarray_aggregate_push(arrayfield)
FROM arraytable
The table arraytable contains a lot of rows (about 5000), each row has array
with length of 5-10 elements, so - the resulting array should contain about
50000 elements.
The query is okay, but its speed is too bad: about 1 second.
The main problem is the speed of intarray_aggregate_push function - it is
quite slow, because intarray_push_array reallocates the memory each time I
merge two arrays. I am pretty sure that the reallocaton and copying is the
bottleneck, because if I use another dummy aggreate:
CREATE AGGREGATE intarray_aggregate_dummy (_int4)
(
STYPE = _int4,
SFUNC = dummy,
INITCOND = '{}'
);
CREATE OR REPLACE FUNCTION "public"."dummy" (a integer [], b integer [])
RETURNS integer [] AS
$body$ BEGIN RETURN a; END; $body$
LANGUAGE 'plpgsql' VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;
where dummy() is the function which returns its first argument without any
modification, the speed grows dramatically - about 25 ms (instead of 1000
ms!).
The question is: how could I optimize this, and is it possible at all in
Postgres? I just want to get one large array glued from a lot of smaller
arrays...
P.S.
I have tested that
SELECT array_to_string(ARRAY(SELECT text FROM tbl), ' ')
query is many times faster than joining of all "text" fields inside one
pg/plsql stored function (I assume that it is because Postgres do not
reallocate & copy memory each time it glues a new text piece). But
unfortunately there is no way to convert integer[] to string to use this
method: I could write
select '{1,2}'::integer[]
but I couldn't use
select ARRAY[1,2]::text
From | Date | Subject | |
---|---|---|---|
Next Message | pgsql.gen | 2007-10-10 00:25:23 | Using C API |
Previous Message | Erik Jones | 2007-10-09 22:37:10 | Re: Generating subtotal reports direct from SQL |