From: Mathieu Dubois <mathieu(dot)dubois(at)limsi(dot)fr>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: cast bytea to double precision[]
Date: 2011-07-26 17:25:36
Message-ID: 4E2EF890.9090901@limsi.fr
Lists: pgsql-novice
On 07/26/2011 04:30 PM, Merlin Moncure wrote:
> On Tue, Jul 26, 2011 at 2:45 AM, Mathieu Dubois <mathieu(dot)dubois(at)limsi(dot)fr> wrote:
>> Hello,
>>
>>> On 25/07/2011 17:58, Mathieu Dubois wrote:
>>> On 07/25/2011 05:54 PM, Merlin Moncure wrote:
>>>> On Sun, Jul 24, 2011 at 2:03 PM, Mathieu
>>>> Dubois <mathieu(dot)dubois(at)limsi(dot)fr> wrote:
>>>>> I have found a solution by myself for the conversion:
>>>>> regexp_split_to_array(sig_vector, E',')::double precision[] (elements
>>>>> are in
>>>>> fact separated by commas).
>>>>>
>>>>> To convert my column I have used:
>>>>> ALTER TABLE my_table ALTER sig_vector TYPE double precision[] USING
>>>>> regexp_split_to_array(sig_vector, E',')::double precision[];
>>>>>
>>>>> Is that correct?
>>>>> Is it correct to pass the column name to regexp_split_to_array()?
>>>> Yeah -- you are just passing a column's data into a function as an
>>>> argument -- standard practice. This will work -- your bytea is really
>>>> a text column, so it's just a matter of breaking up the string.
>>>> regexp_* functions are great for that.
>>> Thank you very much for your reply.
>>>
>>> I will launch the conversion right now.
>>>
>> The main reason to do this was to have smaller backups.
>> The size of a compressed backup was around 1GB with bytea.
>> I have converted the columns (on a copy of the database) but the expected
>> gains are not there!
>> With double precision[] it is still around 1GB (only a few MB smaller
>> than with bytea).
>>
>> Also the size on the disk is not smaller.
>> I have listed the content of /var/lib/postgres/8.4/main/base with du and the
>> 2 versions have the same size (3.1GB).
>>
>> Does it make sense?
>> My hypothesis is that the compression algorithm is able to find regularities
>> in the data, so it finds the same regularities in bytea and in double precision[].
>>
>> Is there any advantage to use double precision[] over bytea in my case?
> probably not -- arrays can be significantly smaller than a set of
> individual tuples each holding one value because of the tuple
> overhead, but you still have to pay for the array header and a 4 byte
> length/null indicator per element.
>
> A packed string is often the smallest way to store data, although not
> necessarily the best. A double precision[] comes with a lot of syntax
> advantages.
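
(For anyone trying the same conversion later: the cast is easy to
sanity-check on a single literal before altering a real column -- the
value below is made up.)

SELECT regexp_split_to_array('1.5,2.25,3.0', E',')::double precision[];
-- returns {1.5,2.25,3}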
Thanks for your advice!
I find the result surprising because the floats are encoded with a lot
of characters (around 20 each) while a double is only 8 bytes.
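
A quick way to compare the two representations directly is
pg_column_size() on synthetic data (a sketch; the values are made up):

-- 1000 random doubles stored as comma-separated text vs. as float8[];
-- pg_column_size() reports the datum size before any TOAST compression.
SELECT pg_column_size(array_to_string(array_agg(x), ',')) AS text_bytes,
       pg_column_size(array_agg(x))                       AS array_bytes
FROM (SELECT random() AS x FROM generate_series(1, 1000)) AS t;

The array side should come out near 8 kB (8 bytes per element plus a
small header), while the text side depends on how many digits each
value prints with.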
I have tried to run VACUUM but it changed nothing...
All of my code is based on strings so I won't take time to modify it if
there is no gain.
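
For what it's worth, a plain VACUUM only marks dead row space as
reusable inside the existing files; it does not shrink them on disk.
Returning space to the OS requires rewriting the table, e.g. (my_table
as in the ALTER above):

-- VACUUM FULL rewrites the table and returns free space to the OS.
VACUUM FULL my_table;

-- Check the total on-disk footprint (table + TOAST + indexes):
SELECT pg_size_pretty(pg_total_relation_size('my_table'));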
Mathieu
> merlin
>