Re: Datatypes and performance

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my>
Cc: Mattias Kregert <mattias(at)kregert(dot)se>, Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>, PostgreSQL List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Datatypes and performance
Date: 2003-07-21 02:30:56
Message-ID: 200307210230.h6L2UuQ14338@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Lincoln Yeoh wrote:
> Just because all reported commits go in doesn't mean there won't be any
> data loss.
>
> If you pull the power cord, the DB should be in a consistent state barring
> hardware and other issues. However there can be data loss, because the
> clients might have been inserting data which may not be able to be
> reproduced again, but which was not committed nor reported as committed,
> and the clients may no longer have a copy of the data. So how does the app
> or user deal with that? That can determine if data is lost holistically.
>
> For example, if you were in the process of inserting 1 million rows from a
> raw source, haven't committed and someone pulls the plug, the transaction
> will not be committed. Or a prospective customer clicks "Submit Order", and
> you pull the plug just then, customer sees the browser spin for a minute or
> so, times out, gives up goes somewhere else, and you don't get the order.

Well, the user wasn't informed the operation succeeded, so I would
assume they would try the operation again later.

> Given I've seen lots of things not work as they should, 100% guarantees
> seem like wishful thinking. I'd just believe that with Fsync on, the
> probability of commits being saved to disk is very high. Whereas with fsync

With fsync on, there is no probability --- if the drive properly reports
fsync (or guarantees it in case of a power failure) and the hardware
doesn't fail, your data is 100% safe on restart. With fsync off, if
there was any database activity before the failure (within 5 minutes),
there almost certainly was data in the kernel disk buffers that didn't
make it to disk, and the database will be consistent on restart.

On failure, I am talking about power, OS, or hardware failures.

> off, the probability is a lot lower especially if the O/S is prone to
> taking 30 seconds or longer to sync outstanding data to disk.
>
> Also, if power was lost in the middle of an fsync I'm not sure what
> actually happens. I think there are some checks, but they often assume
> atomicity of an operation at a particular level. Even if that holds true,
> the hardware could still fail on you - writing something whilst running out
> of power is not a good situation.

If failure is during fsync, we haven't yet reported the transaction as
complete.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Bruce Momjian 2003-07-21 04:21:35 Re: Rules and actions involving multiple rows
Previous Message Bruce Momjian 2003-07-21 02:10:20 Re: Datatypes and performance