Re: R: DB on mSATA SSD

From: William Dunn <dunnwjr(at)gmail(dot)com>
To: Job <Job(at)colliniconsulting(dot)it>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: R: DB on mSATA SSD
Date: 2015-04-23 17:52:53
Message-ID: CAEva=Vk7aZMtxcMjaJ4wWiGOzdeKPS_Qv5wuBuGU-OdWYfCLBw@mail.gmail.com
Lists: pgsql-general

Additional things to consider for decreasing pressure on the cheap drives:

- Another configuration parameter to look into
is effective_io_concurrency. For SSDs we typically set it to one I/O per
channel of the controller card, not counting the RAID parity drives.
Lowering this value means PostgreSQL will initiate fewer parallel I/O
operations.
http://www.postgresql.org/docs/8.4/static/runtime-config-resource.html
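
For example, it can be set in postgresql.conf (the value of 2 below is
purely illustrative; derive yours from your controller layout as
described above):

    # postgresql.conf
    # one I/O per controller channel, excluding RAID parity drives
    effective_io_concurrency = 2

It can also be changed per session for testing:

    SET effective_io_concurrency = 2;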

- A strategy Instagram used is to (manually) vertically partition tables
that have some, but not all, columns updated frequently. When PostgreSQL
updates a value in any column it writes a new copy of the entire row
and marks the original row as garbage. If you have a wide table where only
a few columns change frequently (Instagram's example was a table
with user information where "last login" was updated very frequently but
the other information about the user was rarely updated), you can split the
frequently updated columns into a separate table to reduce io. However, note
that PostgreSQL does not support vertical partitioning
directly, so to employ this technique you would need to partition manually
and update your SQL code accordingly. A rough sketch follows.
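
A rough sketch of the idea (the table and column names here are made up
for illustration):

    -- rarely updated columns stay in the main table
    CREATE TABLE users (
        user_id  integer PRIMARY KEY,
        username text NOT NULL,
        email    text
    );

    -- the frequently updated column moves to its own narrow table
    CREATE TABLE user_last_login (
        user_id    integer PRIMARY KEY REFERENCES users (user_id),
        last_login timestamptz
    );

    -- updating last_login now rewrites only the narrow row
    UPDATE user_last_login SET last_login = now() WHERE user_id = 42;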

- Also review your WAL and logging settings to make sure you aren't
writing more to them than you need to.
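
For instance (8.4-era parameter names; the values are only a starting
point, not a recommendation):

    # postgresql.conf
    checkpoint_segments = 32            # fewer, larger checkpoints
    checkpoint_timeout = 30min
    checkpoint_completion_target = 0.9  # spread checkpoint writes out
    log_statement = 'none'              # don't log every statement
    log_min_duration_statement = -1     # disable slow-statement logging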

Will J Dunn

On Thu, Apr 23, 2015 at 10:38 AM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
wrote:

> On 4/23/15 8:36 AM, Job wrote:
>
>> Hello, thank you first of all for your wonderful help!
>>
>> Tomas, regarding:
>>
>>> There are ways to make the writes less frequent, both at the database
>>> and OS level. We don't know what your PostgreSQL config is, but making
>>> the checkpoints less frequent and tuning the kernel/mount options may
>>> help a lot.
>>>
>>
>> We can make checkpoints less frequent; on the kernel side, which options
>> do you suggest tuning to minimize disk writes?
>>
>
> Certainly disable atime updates if you haven't already.
>
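For example, atime updates are usually disabled with the noatime mount
option (the device and mount point below are placeholders):

    # /etc/fstab
    /dev/sda1  /var/lib/pgsql  ext4  defaults,noatime  0  2
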
> Having a long checkpoint period is somewhat similar to running in a ram
> disk and doing periodic backups. BTW, if you want to try using backups, I
> recommend you setup actual PITR archiving to the SSD. That will write data
> sequentially and in larger chunks, which should help the SSD better deal
> with the writes. This will give you more control over how much data you
> lose during a crash. Though, if all you do is a single large update once a
> day you're probably better off just taking a backup right after the update.
> I would also look at the backup size and recovery time of pg_dump vs PITR
> or a filesystem snapshot; it could be significantly smaller. It might take
> longer to restore though.
>
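Something along these lines in postgresql.conf would archive WAL to the
SSD (the archive path is a placeholder):

    # postgresql.conf
    archive_mode = on
    archive_command = 'cp %p /mnt/ssd/wal_archive/%f'
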
> BTW, if you go the ramdisk route you should turn off fsync; there's no
> point in the extra calls to the kernel. Only do that if the ENTIRE database
> is in a ramdisk though.
>
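In postgresql.conf that is simply:

    # ONLY safe when the entire cluster lives on a ramdisk
    fsync = off
    full_page_writes = off   # also pointless without crash-safe storage
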
>> We have a table, about 500 MB, that is updated and rewritten every day.
>> When the machines update, the table is truncated and then re-populated
>> with pg_bulk.
>> But I think we write heavily when importing the new data tables.
>>
>
> That depends on how much data has actually changed. If most of the data in
> the table is changed then truncate and load will be your best bet. OTOH if
> relatively little of the data has changed you'll probably get much better
> results by loading the data into a loading table and then updating changed
> data, deleting data that shouldn't be there anymore, and inserting new data.
>
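A rough sketch of that pattern (the table and column names are
hypothetical; 8.4 has no MERGE, so it takes three statements):

    -- 1. update rows whose values changed
    UPDATE target t
       SET val = s.val
      FROM staging s
     WHERE t.id = s.id AND t.val IS DISTINCT FROM s.val;

    -- 2. delete rows that are gone from the new data
    DELETE FROM target t
     WHERE NOT EXISTS (SELECT 1 FROM staging s WHERE s.id = t.id);

    -- 3. insert rows that are new
    INSERT INTO target (id, val)
    SELECT s.id, s.val
      FROM staging s
     WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.id = s.id);
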
> You definitely want the loading table to not be on SSD, and to be
> unlogged. That means it needs to go in a tablespace on a ram disk. True
> temporary tables (CREATE TEMP TABLE) are not logged, but I don't know that
> that will work well with pg_bulk. You can use a real table with the
> unlogged option to the same effect (though, I'm not sure if unlogged is
> available in 8.4).
>
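(For what it's worth, UNLOGGED tables arrived in PostgreSQL 9.1, so on
8.4 the temp-table route is the only option.) On a newer server the
general idea looks like this (paths and names are placeholders):

    -- tablespace on a tmpfs/ramdisk mount
    CREATE TABLESPACE ram_ts LOCATION '/mnt/ramdisk/pgspace';

    -- loading table kept out of the WAL and off the SSD
    CREATE UNLOGGED TABLE staging (
        id  integer,
        val text
    ) TABLESPACE ram_ts;
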
> You also need to consider the indexes. First, make absolutely certain you
> need all of them. Get rid of every one that isn't required. Second, you
> need to test the amount of data that's written during an update with the
> indexes in place *and doing a subsequent VACUUM* compared to dropping all
> the indexes and re-creating them.
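
The drop-and-recreate variant looks roughly like this (the index and
table names are placeholders):

    DROP INDEX idx_target_val;
    -- ... bulk load here ...
    CREATE INDEX idx_target_val ON target (val);
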
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>
