Re: Large rows number, and large objects

From: Jose Ildefonso Camargo Tolosa <ildefonso(dot)camargo(at)gmail(dot)com>
To: Andrzej Nakonieczny <dzemik-pgsql-performance(at)e-list(dot)pingwin(dot)eu(dot)org>
Cc: pgsql-performance(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Craig James <craig_james(at)emolecules(dot)com>
Subject: Re: Large rows number, and large objects
Date: 2011-07-22 00:46:40
Message-ID: CAETJ_S9UtECSQUezPL6a-ZgnLNBFC_o6-k7WFrYdnsuqYXgcRw@mail.gmail.com
Lists: pgsql-performance

On Wed, Jul 20, 2011 at 3:03 PM, Andrzej Nakonieczny
<dzemik-pgsql-performance(at)e-list(dot)pingwin(dot)eu(dot)org> wrote:
> On 20.07.2011 17:57, Jose Ildefonso Camargo Tolosa wrote:
>
> [...]
>
>>    Many of the advantages of partitioning have to do with maintenance
>>    tasks.  For example, if you gather data on a daily basis, it's faster
>>    to drop the partition that contains Thursday's data than it is to do a
>>    DELETE that finds the rows and deletes them one at a time.  And VACUUM
>>    can be a problem on very large tables as well, because only one VACUUM
>>    can run on a table at any given time.  If the frequency with which the
>>    table needs to be vacuumed is less than the time it takes for VACUUM
>>    to complete, then you've got a problem.
>>
>>
>> And.... doesn't the pg_largeobject table get vacuumed? I mean, isn't
>> that table just like any other table?
>
> Vacuum is a real problem on a big pg_largeobject table. I have a 1.6 TB
> database consisting mostly of large objects, and vacuuming that table on a
> fast SAN takes about 4 hours:
>
>          now         |        start        |   time   |  datname   |                current_query
> ----------------------+---------------------+----------+------------+----------------------------------------------
>  2011-07-20 20:12:03  | 2011-07-20 16:21:20 | 03:50:43 | bigdb      | autovacuum: VACUUM pg_catalog.pg_largeobject
> (1 row)
>
>
> LO generates a lot of dead tuples when objects are being added:
>
>      relname      | n_dead_tup
> ------------------+------------
>  pg_largeobject   |     246731
>
> Adding LOs is very fast when the table is vacuumed. But when there are a
> lot of dead tuples, adding LOs is very slow (50-100 times slower) and eats
> 100% of the CPU.
>
> It looks like the better way is writing objects directly as bytea in
> partitioned tables, although it's a bit slower than the LO interface on a
> vacuumed table.
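
For reference, a minimal sketch of that kind of layout (all names are made
up here, using inheritance-based partitioning) could look like the example
below; it also ties in with the maintenance point quoted above, since an
old day's partition can simply be dropped instead of DELETEd:

    -- Hypothetical sketch: documents stored as bytea, partitioned by
    -- upload date via table inheritance.
    CREATE TABLE docs (
        doc_id      bigint NOT NULL,
        uploaded_on date   NOT NULL,
        content     bytea  NOT NULL
    );

    -- One child table per day.
    CREATE TABLE docs_2011_07_21 (
        CHECK (uploaded_on = DATE '2011-07-21')
    ) INHERITS (docs);

    -- Removing that day's data is a cheap catalog operation, instead of
    -- DELETE FROM docs WHERE uploaded_on = '2011-07-21' followed by VACUUM:
    DROP TABLE docs_2011_07_21;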

Well... yes... I thought about that, but then, what happens when you
need to fetch the file from the DB? Will it be fetched completely at
once? I'm thinking about large files here, say (hypothetically
speaking) you have 1GB files stored.... if the system fetches the whole
1GB at once, it would take 1GB of RAM (or not?), and that's what I
wanted to avoid by dividing the file into 2kB chunks (bytea chunks,
actually).... I don't quite remember where I got the 2kB size from...
but I decided I wanted to avoid using TOAST too.
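
Just to make it concrete, the layout I had in mind is roughly this (a
sketch only -- table and column names are made up):

    -- Hypothetical sketch: each file split into small bytea chunks so a
    -- client can read it piece by piece instead of materializing the
    -- whole file in memory.
    CREATE TABLE file_chunks (
        file_id  bigint  NOT NULL,
        chunk_no integer NOT NULL,
        data     bytea   NOT NULL,  -- ~2kB per row
        PRIMARY KEY (file_id, chunk_no)
    );

    -- Read one file in order, ideally through a cursor so the client
    -- holds only a handful of chunks in RAM at any time:
    SELECT data
      FROM file_chunks
     WHERE file_id = 42
     ORDER BY chunk_no;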

>
>
> Regards,
> Andrzej
>
