Re: Postgres array parser

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Aleksej Trofimov <aleksej(dot)trofimov(at)ruptela(dot)lt>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres array parser
Date: 2011-12-13 14:42:58
Message-ID: CAFj8pRCOmmUdFMOcQ4wm9H7idw8Oo9oZ4gk5g93wZcjGQ2gOVw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello

2011/12/13 Aleksej Trofimov <aleksej(dot)trofimov(at)ruptela(dot)lt>:
> We have tried foreach syntax, but we have noticed performance degradation:
> Function with for: 203ms
> Function with foreach: ~250ms:
>
> there is functions code:
> CREATE OR REPLACE FUNCTION input_value_fe(in_inputs numeric[], in_input_nr
> numeric)
>  RETURNS numeric AS
> $BODY$
> declare i numeric[];
> BEGIN
>        FOREACH i SLICE 1 IN ARRAY in_inputs
>            LOOP
>                 if i[1] = in_input_nr then
>                    return i[2];
>                 end if;
>            END LOOP;
>
>    return null;
> END;
> $BODY$
>  LANGUAGE plpgsql VOLATILE
>  COST 100;
>
> CREATE OR REPLACE FUNCTION input_value(in_inputs numeric[], in_input_nr
> numeric)
>  RETURNS numeric AS
> $BODY$
> declare
>  size int;
> BEGIN
>  size = array_upper(in_inputs, 1);
>    IF size IS NOT NULL THEN
>
>        FOR i IN 1 .. size LOOP
>            if in_inputs[i][1] = in_input_nr then
>                return in_inputs[i][2];
>            end if;
>        END LOOP;
>    END IF;
>
>    return null;
> END;
> $BODY$
>  LANGUAGE plpgsql VOLATILE
>  COST 100;
>
>
> On 12/13/2011 04:02 PM, Pavel Stehule wrote:
>>
>> Hello
>>
>> do you know FOREACH IN ARRAY statement in 9.1
>>
>> this significantly accelerate iteration over array
>>
>>
>> http://www.depesz.com/index.php/2011/03/07/waiting-for-9-1-foreach-in-array/
>>
>>
>>
>> 2011/12/13 Aleksej Trofimov<aleksej(dot)trofimov(at)ruptela(dot)lt>:
>>>
>>> Hello, I wanted to ask according such a problem which we had faced with.
>>> We are widely using postgres arrays like key->value array by doing like
>>> this:
>>>
>>> {{1,5},{2,6},{3,7}}
>>>
>>> where 1,2,3 are keys, and 5,6,7 are values. In our pgSql functions we are
>>> using self written array_input(array::numeric[], key::numeric) function
>>> which makes a loop on whole array and searches for key like
>>> FOR i IN 1 .. size LOOP
>>>            if array[i][1] = key then
>>>                return array[i][2];
>>>            end if;
>>> END LOOP;
>>>
>>> But this was a good solution until our arrays and database had grown. So
>>> now
>>> FOR loop takes a lot of time to find value of an array.
>>>
>>> And my question is, how this problem of performance could be solved? We
>>> had
>>> tried pgperl for string parsing, but it takes much more time than our
>>> current solution. Also we are thinking about self-written C++ function,
>>> may
>>> be someone had implemented this algorithm before?
>>>
>> you can use indexes or you can use hstore
>>
>> Regards
>>
>> Pavel Stehule
>>
>>> --
>>> Best regards
>>>
>>> Aleksej Trofimov
>>>
>>>
>>> --
>>> Sent via pgsql-performance mailing list
>>> (pgsql-performance(at)postgresql(dot)org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-performance
>
>

It is strange - on my comp FOREACH is about 2x faster

postgres=# select input_value(array(select
generate_series(1,1000000)::numeric), 100000);
input_value
-------------

(1 row)

Time: 495.426 ms

postgres=# select input_value_fe(array(select
generate_series(1,1000000)::numeric), 100000);
input_value_fe
----------------

(1 row)

Time: 248.980 ms

Regards

Pavel

>
> --
> Best regards
>
> Aleksej Trofimov
>
> UAB "Ruptela"
>
> Phone: +370 657 80475
>
> E-Mail: aleksej(dot)trofimov(at)ruptela(dot)lt
> Web:    http://www.ruptela.lt
>
> Ruptela - the most successful IT company in Lithuania 2011
> Ruptela - sekmingiausia Lietuvos aukštųjų technologijų įmonė 2011
> http://www.prezidentas.lt/lt/spaudos_centras_392/pranesimai_spaudai/inovatyvus_verslas_-_konkurencingos_lietuvos_pagrindas.html
> http://www.ruptela.lt/news/37/121/Ruptela-sekmingiausia-jauna-aukstuju-technologiju-imone-Lietuvoje
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jon Nelson 2011-12-13 18:12:47 select distinct uses index scan vs full table scan
Previous Message Jon Nelson 2011-12-13 14:29:52 Re: copy vs. C function