Re: Beginner Question:Why it always make sure that the postgres better than common csv file storage in disaster recovery?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Wen Yi <chuxuec(at)outlook(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Beginner Question:Why it always make sure that the postgres better than common csv file storage in disaster recovery?
Date: 2022-07-04 03:37:05
Message-ID: 1449790.1656905825@sss.pgh.pa.us
Lists: pgsql-general

Wen Yi <chuxuec(at)outlook(dot)com> writes:
> Since it's all built on top of the file system,why it always make sure
> that the postgres better than common csv file storage in disaster
> recovery?

Sure, Postgres cannot be any more reliable than the filesystem it's
sitting on top of (nor the physical storage underneath that, etc etc).

However, if you're comparing to some program that just writes a
flat file in CSV format or the like, that program is probably
not even *trying* to offer reliable storage. Some things that
are likely missing:

* POSIX-compatible file systems promise nothing about the durability
  of data that hasn't been successfully fsync'd. You need to issue
  fsync's, and you need a plan about what to do if you crash between
  writing some data and getting an fsync confirmation, because maybe
  those bits are safely down on disk, or maybe they aren't, or maybe
  just some of them are.

* If you did crash partway through an update, you'd like some
  assurances that the user-visible state after recovery will be
  what it was before starting the failed update. That CSV-using
  program probably isn't even trying to do that. Getting back
  to a consistent state after a crash typically involves some
  scheme along the lines of replaying a write-ahead log.

* None of this is worth anything if you can't even tell the
  difference between good data and bad data. CSV is pretty low
  on redundancy --- not as bad as some formats, sure, but it's far
  from checkable.
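
To make the fsync point concrete, here is a minimal Python sketch of what
a flat-file writer would have to do just to replace a file safely on a
POSIX system: write a temporary copy, fsync it, rename it over the
original, then fsync the directory. The helper name durable_write is
hypothetical, and this is only an illustration of the idea, not a
description of how Postgres itself writes data.

```python
import os
import tempfile


def durable_write(path, data: bytes) -> None:
    """Hypothetical helper: replace `path` with `data` so that, after a
    crash, the file should hold either the old or the new contents."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new contents to a temporary file in the same directory,
    # so the final rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()              # push Python's buffer to the OS
            os.fsync(f.fileno())   # ask the OS to push it to storage
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # The rename is itself a directory update, so fsync the directory too.
    # (Assumes POSIX; opening a directory this way does not work on Windows.)
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A program that just calls open() and write() on its CSV file does none of
this, and even this sketch only covers whole-file replacement, not
in-place updates.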

There's more to it than that, but if there's not any attention
to crash recovery then it's not what I'd call a database. The
filesystem alone won't promise much here.
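
As a rough sketch of what "attention to crash recovery" can look like,
the Python below keeps an append-only log in which every record carries
its length and a CRC-32, and a recovery routine that replays records
until it hits a truncated or corrupt one. The record layout and the names
append_record and replay are assumptions made up for this example; the
real Postgres write-ahead log is considerably more involved.

```python
import os
import struct
import zlib

# Hypothetical record header: payload length, then CRC-32 of the payload.
HEADER = struct.Struct("<II")


def append_record(log_path, payload: bytes) -> None:
    """Append one checksummed record and fsync before returning."""
    record = HEADER.pack(len(payload), zlib.crc32(payload)) + payload
    with open(log_path, "ab") as f:
        f.write(record)
        f.flush()
        os.fsync(f.fileno())


def replay(log_path):
    """Return all intact records, stopping at the first torn or corrupt one."""
    records = []
    try:
        with open(log_path, "rb") as f:
            data = f.read()
    except FileNotFoundError:
        return records
    offset = 0
    while offset + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        payload = data[start:start + length]
        # A short payload means the write was cut off; a CRC mismatch
        # means the bytes on disk are not the bytes that were written.
        if len(payload) < length or zlib.crc32(payload) != crc:
            break
        records.append(payload)
        offset = start + length
    return records
```

The checksum is what lets recovery tell good data from bad; a plain CSV
file gives it nothing comparable to check.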

regards, tom lane
