From: | Gregory Stark <stark(at)enterprisedb(dot)com> |
---|---|
To: | Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Are there plans to add data compression feature to postgresql? |
Date: | 2008-11-01 13:02:24 |
Message-ID: | 87bpwzfvkv.fsf@oxford.xeocode.com |
Lists: | pgsql-general |
Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it> writes:
> But sorry I still can't get WHY compression as a whole and data
> integrity are mutually exclusive.
...
> Now on *average* the write operations should be faster so the risk
> you'll be hit by an asteroid during the time a fsync has been
> requested and the time it returns should be shorter.
> If you're not fsyncing... you've no warranty that your changes
> reached your permanent storage.
Postgres *guarantees* that, as long as everything else works correctly, it
doesn't lose data. It does not merely minimize the chances of losing data. It
is interesting to discuss hardening against unforeseen circumstances as well,
but that is of secondary importance to first guaranteeing 100% that there is
no data loss in the expected scenarios.
That means Postgres has to guarantee 100% that if the power is lost mid-write
it can recover all the data correctly. It does this by fsyncing logs of some
changes and, for others, by depending on filesystems and drives behaving in
certain ways -- namely that a partially completed write will leave each byte
with either the new or old value. Compressed filesystems might break that
assumption, making Postgres's guarantee void.
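To make the "each byte is either new or old" assumption concrete, here is a
rough Python sketch. It is only an illustration under my own assumptions, not
a description of how any real compressed filesystem or Postgres itself works:
the page layout, the helper names and the use of zlib are all hypothetical. It
tears a write halfway through, once on a plain page and once on a compressed
copy of the same page, and checks whether the data read back still satisfies
the assumption.

```python
import zlib

PAGE_SIZE = 8192  # assumed block size, matching the 8k figure used in this thread


def make_page(multiplier: int) -> bytes:
    """Build a hypothetical page of text rows, padded or cut to PAGE_SIZE."""
    rows = b"".join(
        f"id={i:05d} balance={(i * multiplier) % 1000:04d}\n".encode()
        for i in range(400)
    )
    return rows[:PAGE_SIZE].ljust(PAGE_SIZE, b"\0")


def torn_write(old: bytes, new: bytes, cut: int) -> bytes:
    """Simulate power loss after the first `cut` bytes of `new` replaced `old`."""
    return new[:cut] + old[cut:]


def each_byte_old_or_new(result: bytes, old: bytes, new: bytes) -> bool:
    """Check the assumption: every byte holds either its old or its new value."""
    return len(result) == len(old) == len(new) and all(
        r in (o, n) for r, o, n in zip(result, old, new)
    )


def plain_page_survives(old: bytes, new: bytes) -> bool:
    """Tear an uncompressed page write and test the assumption on the result."""
    torn = torn_write(old, new, PAGE_SIZE // 2)
    return each_byte_old_or_new(torn, old, new)


def compressed_page_survives(old: bytes, new: bytes) -> bool:
    """Tear the write of a compressed page, then test the decompressed result."""
    old_c, new_c = zlib.compress(old), zlib.compress(new)
    cut = min(len(old_c), len(new_c)) // 2
    torn = torn_write(old_c, new_c, cut)
    try:
        logical = zlib.decompress(torn)
    except zlib.error:
        return False  # the whole logical page is unreadable, not just part of it
    return each_byte_old_or_new(logical, old, new)


old_page = make_page(37)
new_page = make_page(41)
```

With these definitions, plain_page_survives(old_page, new_page) holds by
construction, while compressed_page_survives(old_page, new_page) is expected
to fail: on disk every byte of the compressed block is still old or new, but
the mixed stream no longer decodes to a page with that property.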
I don't know how these hypothetical compressed filesystems are implemented so
I can't say whether they work or not. When I first wrote the comment I was
picturing a traditional filesystem with each block stored compressed. That
can't guarantee anything like this.
However, later in the discussion I mentioned that ZFS with an 8k block size
could actually get this right, since it never overwrites existing data: it
always writes to a new location and then changes metadata pointers. I expect
ext3 with data=journal might also be ok. Both have to make performance
sacrifices to get there, though.
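The "write to a new location, then change the pointer" idea can be sketched at
the level of ordinary files. This is only an analogy, not ZFS's actual
mechanism: the function name cow_write is hypothetical, the directory entry
stands in for the metadata pointer, and the directory fsync assumes a POSIX
system.

```python
import os
import tempfile


def cow_write(path: str, data: bytes) -> None:
    """Replace the contents of `path` without ever overwriting them in place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new version to a fresh location in the same directory.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # new data is on disk before the switch
        # Switch the "pointer": the name now refers to the new copy.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Assumption: POSIX, where a directory can be opened and fsynced so the
    # rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A reader that opens the file after a crash sees either the complete old
version or the complete new one, never a mix. The extra copy and the extra
fsync calls are a small-scale version of the performance cost mentioned above.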
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!