Re: Aggregate C function accumulating a text array

From: Joe Conway <mail(at)joeconway(dot)com>
To: Joel Dudley <joel(at)nanovoid(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Aggregate C function accumulating a text array
Date: 2004-06-04 23:58:08
Message-ID: 40C10C90.1060301@joeconway.com
Lists: pgsql-general

Joel Dudley wrote:
> I am about to write a set of C functions to be used in an aggregate
> function in which the final function performs a calculation on an array
> of accumulated text data types stored in a text[] array. I need to use
> the text type because this function will be used on DNA sequences which
> can be very large. My questions are the following. What is the most
> efficient way to accumulate a text array while being efficient with
> memory? I see construct_array() used in accumulation functions but I am
> worried that I might end up making a copy of a potentially very large
> text array each time my accumulation function is called.

True, but the intermediate results should be released after each row, I
think. You might try it with some real data before assuming a
performance problem.

If it is a problem, take a look at how contrib/intagg works. It
basically just passes a pointer to its accumulated state from call to
call, instead of building a new array for every row. You could do
something similar for the text data type.
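
A rough sketch of that pointer-passing idea, in plain C rather than the
backend API. SeqAccum, seq_accum_add and seq_accum_free are hypothetical
names, not anything from contrib/intagg, and a real aggregate would
allocate with palloc/repalloc in a memory context that outlives the
per-row calls instead of using malloc/realloc:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical transition state: a growable array of string pointers. */
typedef struct SeqAccum
{
    size_t  count;      /* number of sequences stored so far */
    size_t  capacity;   /* number of slots allocated in items */
    char  **items;      /* one owned copy of each input row */
} SeqAccum;

/* Exit on allocation failure; keeps the sketch short. */
static void *check_alloc(void *p)
{
    if (p == NULL)
    {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}

/*
 * Transition step: append one sequence and hand back the same state
 * pointer. Pass NULL as state on the first call. Only the new row is
 * copied; rows stored earlier are never copied again (growing the
 * pointer table moves pointers, not sequence data).
 */
SeqAccum *seq_accum_add(SeqAccum *state, const char *seq)
{
    assert(seq != NULL);

    if (state == NULL)
        state = check_alloc(calloc(1, sizeof(*state)));

    if (state->count == state->capacity)
    {
        /* Double the pointer table so appends stay amortized O(1). */
        size_t newcap = state->capacity ? state->capacity * 2 : 8;

        state->items = check_alloc(realloc(state->items,
                                           newcap * sizeof(*state->items)));
        state->capacity = newcap;
    }

    size_t len = strlen(seq) + 1;
    char  *copy = check_alloc(malloc(len));

    memcpy(copy, seq, len);
    state->items[state->count++] = copy;
    return state;
}

/* Release the state and every stored sequence. */
void seq_accum_free(SeqAccum *state)
{
    if (state == NULL)
        return;
    for (size_t i = 0; i < state->count; i++)
        free(state->items[i]);
    free(state->items);
    free(state);
}
```

Each call returns the pointer it was given, so the per-row cost is one
copy of the new sequence rather than a copy of everything accumulated
so far.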

> The general flow is
>
> User defined aggregate function
> SELECT pb_distance_k2p(sequence) WHERE family_id = 10;
>
> uses accumulation function
>
> distance_accum(PG_FUNCTION_ARGS);
>
> and uses a final function
>
> calculate_distance_k2p(PG_FUNCTION_ARGS)
>
> which needs to deconstruct_array() to get the text array and loop
> through the array to do some pairwise comparisons of the text and return
> a multidimensional array
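
The pairwise loop in such a final function might look roughly like the
sketch below. It is plain C, not backend code: the input is an ordinary
array of C strings standing in for the result of deconstruct_array(),
and the output is a flat row-major n x n buffer that would still have
to be packed into a real array value. p_distance and pairwise_matrix are
hypothetical names, and the metric is a simple proportion of mismatched
positions used as a stand-in, not the K2P calculation itself:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Stand-in metric: fraction of positions at which two sequences differ.
 * Assumes the sequences are already aligned to the same length;
 * returns -1.0 if they are empty or their lengths differ.
 */
double p_distance(const char *a, const char *b)
{
    size_t len = strlen(a);
    size_t diff = 0;

    if (len == 0 || strlen(b) != len)
        return -1.0;

    for (size_t i = 0; i < len; i++)
        if (a[i] != b[i])
            diff++;

    return (double) diff / (double) len;
}

/*
 * Build a symmetric n x n distance matrix in row-major order.
 * The diagonal stays 0.0. Returns NULL if n is 0 or allocation fails;
 * the caller frees the result.
 */
double *pairwise_matrix(const char *const *seqs, size_t n)
{
    assert(seqs != NULL);

    if (n == 0)
        return NULL;

    double *m = calloc(n * n, sizeof(*m));

    if (m == NULL)
        return NULL;

    /* Visit each unordered pair once and mirror the value. */
    for (size_t i = 0; i < n; i++)
    {
        for (size_t j = i + 1; j < n; j++)
        {
            double d = p_distance(seqs[i], seqs[j]);

            m[i * n + j] = d;
            m[j * n + i] = d;
        }
    }
    return m;
}
```

Only n(n-1)/2 comparisons are made, since the matrix is symmetric.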

Makes sense to me. BTW, take a look at PL/R:
http://www.joeconway.com/plr/

It would allow you to write your final function in R, which has many
extensions related to bioinformatics -- see:
http://www.bioconductor.org/

HTH,

Joe
