Re: overload

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Viktor Bojović <viktor(dot)bojovic(at)gmail(dot)com>
Cc: lists-pgsql(at)useunix(dot)net, pgsql-sql(at)postgresql(dot)org
Subject: Re: overload
Date: 2011-07-08 19:18:15
Message-ID: CAFj8pRCs9-eq5xjwPEx6PA5Oj++YYhRtz=EQCrAPqDFt2r2+_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hello

using a "window" implemented via LIMIT OFFSET is not good - it is
solution on some systems where cursors are not available, but it is
bad solution on PostgreSQL. Use a cursor instead - it is significantly
more efective with less memory requests.

Regards

Pavel Stehule

2011/7/8 Viktor Bojović <viktor(dot)bojovic(at)gmail(dot)com>:
> Thanx Wayne,
> at the end i did it that way and it works.
> The code is below.
> CREATE FUNCTION pattern_counter1("patLength" integer) RETURNS character
> varying
> LANGUAGE plperl
> AS $_X$
> my $rvCnt = spi_exec_query("select count(1) as cnt from entry");
> #my $rowCountAll = $rvCnt->{processed};
> my $row = $rvCnt->{rows}[0];
> my $rowCountAll = $row->{cnt};
> my $windowSize = 500000;
> my %patterns=();
> for (my $p=0;$p<$rowCountAll;$p+=$windowSize){
> my $sql="select sequence from entry limit $windowSize offset $p";
>
> my $rv = spi_exec_query($sql);
> my $rowCount = $rv->{processed};
> my $patLen = $_[0];
> my $patt = '';
>
> foreach my $rn (0 .. $rowCount -1){
> my $row = $rv->{rows}[$rn];
> my $seq = $row->{sequence};
> for (my $x = 1;$x<=length($seq) - $patLen;$x++){
> $patt=substr($seq,$x,$patLen);
> if (! defined $patterns{$patt}) {
> $patterns{$patt}=1;
> }else{
> $patterns{$patt}++;
> }
> }
> }
> }
>
> foreach $patt (keys %patterns){
> my $sql="insert into patterns values('".$patt."',".$patterns{$patt}.")";
> spi_exec_query($sql);
> }
> return $tmp;
> $_X$;
>
>
> On Fri, Jul 8, 2011 at 8:50 PM, <lists-pgsql(at)useunix(dot)net> wrote:
>>
>> I'm have the same situation with large tables.  Take a look at using a
>> cursor to fetch several thousand rows at a time.  I presume what's
>> happening is that perl is attempting to create a massive list/array in
>> memory.  If you use a cursor the list should only contain X number of
>> rows where X in the number specified at each fetch execution.  You'll
>> need to define the cursor inside a transaction block.
>>
>> - begin transaction
>> - define the cursor
>> - fetch rows from cursor
>> - while row count from previous step > 0, execute previous step
>> - terminate transaction
>>
>> Or you could use plpgsql instead of plperl, FOR loops over result sets in
>> plpgsql implicitly use cursors... it's just a little less code.
>>
>> Hope that helps,
>> Wayne
>>
>> On Tue, Jul 05, 2011 at 10:29:03PM +0200, Viktor Bojovi?? wrote:
>> > Hi,
>> > while reading 20GB table through PL/PERL function , it constantly grows
>> > in
>> > RAM.
>> > I wanted to ask you which is the best way to read table inside that
>> > function without such memory consumption.
>> > Thanks in advance
>> >
>> > Code is here:
>> >
>> > CREATE  FUNCTION pattern_counter("patLength" integer)
>> >   RETURNS varchar AS
>> > $BODY$
>> >     my $rv = spi_exec_query("select sequence from entry");
>> >     my $rowCount = $rv->{processed};
>> >     my $patLen = $_[0];
>> >     my $patt = '';
>> >     my %patterns=();
>> >     foreach my $rn (0 .. $rowCount -1){
>> >     my $row = $rv->{rows}[$rn];
>> >     my $seq = $row->{sequence};
>> >     for (my $x = 1;$x<=length($seq) - $patLen;$x++){
>> >         $patt=substr($seq,$x,$patLen);
>> >         if (! defined $patterns{$patt}) {
>> >         $patterns{$patt}=1;
>> >         }else{
>> >         $patterns{$patt}++;
>> >         }
>> >     }
>> >     }
>> >     foreach $patt (keys %patterns){
>> >     my $sql="insert into patterns
>> > values('".$patt."',".$patterns{$patt}.")";
>> >     spi_exec_query($sql);
>> >     }
>> > return '';
>> > $BODY$
>> >   LANGUAGE plperl VOLATILE
>> >   COST 100;
>> >
>> >
>> >
>> > --
>> > ---------------------------------------
>> > Viktor Bojovi??
>> > ---------------------------------------
>> > Wherever I go, Murphy goes with me
>
>
>
> --
> ---------------------------------------
> Viktor Bojović
> ---------------------------------------
> Wherever I go, Murphy goes with me
>

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message lists-pgsql 2011-07-08 19:49:49 Re: overload
Previous Message Viktor Bojović 2011-07-08 18:55:36 Re: overload