Re: Is it reasonable to store double[] arrays of 30K elements

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: AlexK <alkuzo(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org >> PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Is it reasonable to store double[] arrays of 30K elements
Date: 2014-02-15 21:11:58
Message-ID: CAFj8pRDYADotVmjpRf8v6zoDNvP_TRoj=HtBXe9f-cBPnshZfQ@mail.gmail.com
Lists: pgsql-general

2014-02-15 21:52 GMT+01:00 AlexK <alkuzo(at)gmail(dot)com>:

> Hi Pavel,
>
> 1. I believe we have lots of memory. How much is needed to read one array
> of 30K float numbers?
>

it is not too much - about 240KB (30,000 float8 values * 8 bytes each, plus a
small array header)
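you can check the size yourself with pg_column_size() - here building a 30K
float8 array on the fly as a stand-in for your stored column:

postgres=# select pg_column_size(array_agg(g::float8)) from generate_series(1,30000) g;

the result should be roughly 240KB (30000 * 8 bytes plus a small header)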

> 2. What do we need to do to avoid possible repeated detoast, and what is it?
>

any access to the array triggers a detoast, so repeatedly accessing
individual fields of the array is the wrong pattern. You can read the
complete array as one block, or you can evaluate the array as a table (with
unnest) - then the detoast is processed only once.
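for example, instead of reading elements one by one in plpgsql (where each
a[i] reference can detoast the array again), unnest it once and aggregate -
table t and its array column a are hypothetical names here:

postgres=# select sum(v) from (select unnest(a) as v from t where id = 1) s;

this reads and detoasts the array a single time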

> 3. We are not going to update individual elements of the arrays. We might
> occasionally replace the whole thing. When we benchmarked, we did not
> notice slowness. Can you explain how to reproduce slowness?
>

you can see this example

postgres=# do
$$
declare a int[] := '{}';
begin
  for i in 1..1000
  loop
    a := a || i;  -- each append copies the entire array
  end loop;
end;
$$;

This code is fast only for small arrays, because every append copies the
whole array (quadratic cost overall):

10K elements  ~ 100 ms
100K elements ~ 10000 ms

postgres=# do
$$
declare a int := 0;
begin
  for i in 1..100000
  loop
    a := a + 1;
  end loop;
end;
$$;
DO
Time: 88.481 ms

the overhead of the plpgsql loop itself is about 100 ms

but you can generate an array quickly with a set operation (just not by
repeated updates):

postgres=# select array_upper(array(select generate_series(1,100000)),1);
array_upper
─────────────
100000
(1 row)
Time: 19.441 ms

Pg can manipulate large arrays relatively quickly

postgres=# select max(unnest) from (select unnest(array(select generate_series(1,100000)))) x;
max
────────
100000
(1 row)

Time: 96.644 ms

but these should be block operations
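and when you do have to change the data, replace the whole array in one
statement instead of element by element - again t and a are hypothetical
names:

postgres=# update t set a = (select array_agg(g::float8) from generate_series(1,30000) g) where id = 1;

this detoasts and rewrites the array only once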

Regards

Pavel

>
> TIA!
>
>
> On Fri, Feb 14, 2014 at 11:03 PM, Pavel Stehule wrote:
>
>> Hello
>>
>>
>> I worked with 80K float fields without any problem.
>>
>> There are possible issues:
>>
>> * detoast needs a lot of memory - that can be a problem with many parallel
>> queries
>> * there is a risk of repeated detoast - some unhappy usage in plpgsql can
>> be slow - it is solvable, but you have to identify this issue
>> * any update of a large array is slow - so these arrays are good for
>> write-once data
>>
>> Regards
>>
>> Pavel
>>
>>
>> 2014-02-14 23:07 GMT+01:00 lup:
>>
>>> Would 10K elements of float[3] make any difference in terms of read/write
>>> performance?
>>> Or 240K byte array?
>>>
>>> Or are these all functionally the same issue for the server? If so,
>>> intriguing possibilities abound. :)
>>>
>>
>>
>>
>
>
>
