Re: Optimizing a read-only database

From: William Dunn <dunnwjr(at)gmail(dot)com>
To: François Battail <francois(dot)battail(at)sipibox(dot)fr>
Cc: PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Optimizing a read-only database
Date: 2015-05-18 15:20:28
Message-ID: CAEva=VmDq6EnZnBMZM+ecvmX7XzVekjDkUamG-QuZP_p_-M7=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, May 18, 2015 at 10:54 AM, François Battail <
francois(dot)battail(at)sipibox(dot)fr> wrote:

> Le 18/05/2015 16:38, William Dunn a écrit :
>
> * You can also run a CLUSTER command on one of your indexes to group
>> data that is frequently accessed together into the same segment of
>> disk so you can get more of it in a single IO operation.
>>
>
> Hum... I was planning to put indexes and data on different disks (SSD) /
> controller to maximize bandwith use, am I wrong?
>

Hello François - the CLUSTER command doesn't have to do with where your
indexes are. What the CLUSTER command does is physically sort the table
data based on the index (Doc:
http://www.postgresql.org/docs/devel/static/sql-cluster.html) So for
example if you have a column called 'region_code' with an index and CLUSTER
the table by that index all the rows for 'region_code'=15 will be located
next to each other on disk and can be read in the same IO operation. The
usual disadvantage of CLUSTER is that it does not maintain itself, but
since your data is read-only that doesn't matter. And yes you can still
have the index on an SSD and the source table on slower storage.

*Will J. Dunn*
*willjdunn.com <http://willjdunn.com>*

On Mon, May 18, 2015 at 10:54 AM, François Battail <
francois(dot)battail(at)sipibox(dot)fr> wrote:

> Le 18/05/2015 16:38, William Dunn a écrit :
>
> Thank you William,
>
> * With read-only work loads you can make shared_buffers very large,
>> like 40% of RAM available to the database. Usually you would keep it
>> lower because in a write heavy workload large shared_buffers causes
>> checkpoints to have huge IO, but since you are not making changes in
>> shared_buffers this will not happen.
>>
>
> Yes, good idea.
>
> * You can also increase checkpoint_timeout to a very large value to
>> prevent checkpoints, since you don't need them. WAL level can be
>> minimal as well.
>>
>
> Already set to 5 min with 50 segments and 0.9 completion target (but used
> also for the bulk loading). But of course I will set it to 1 hour when in
> read only mode.
>
> * You can also run a CLUSTER command on one of your indexes to group
>> data that is frequently accessed together into the same segment of
>> disk so you can get more of it in a single IO operation.
>>
>
> Hum... I was planning to put indexes and data on different disks (SSD) /
> controller to maximize bandwith use, am I wrong?
>
> * You can also run the VACUUM FULL command during off-hours to get
>> your tables vacuumed and statistics up-to-date. It's usually too
>> much overhead to be worthwhile but since you are not doing updates
>> you only have to do it once then don't need to worry about
>> autovacuum being aggressive enough.
>>
>
> Vacuum is done at the end of the import and then set to off.
>
> * I don't think that removing locks will provide any benefit if your
>> queries are truly read-only since ordinary read-only transactions do
>> not require any locks
>>
>
> At least a read write lock should be needed, but you're right: better take
> a look at the source code to be sure.
>
>
> Best regards
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message John R Pierce 2015-05-18 15:24:46 Re:
Previous Message William Dunn 2015-05-18 15:00:34 Re: My index doesn't write anymore but read