Re: Single-file DBs WAS: Need concrete "Why Postgres

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Single-file DBs WAS: Need concrete "Why Postgres
Date: 2003-08-22 16:49:33
Message-ID: 3F46499D.8060209@Yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-advocacy pgsql-general pgsql-hackers

Some well known database that is very popular amongst people who care
more for their data than for license fees uses few very big files that
are statically allocated (if using files instead of raw devices).

Sure does Oracle internally maintain some sort of filesystem. But this
is more due to other reasons.

If a filesystem contains only very few big files (and nothing else) and
these files do not grow or shrink during normal operation and are really
fully allocated in the block tables, then said filesystems metadata does
not change and that means that the filesystem will never ever be corrupt
from the OS's point of view (except due to hardware failure). Plus, an
FSCK on a filesystem with very few huge files is fast, really *fast*. So
in the case of an OS crash, your system is up in no time again, no
matter how big your database is.

From there the DB itself maintains it's own metadata and has control
with it's WAL and other mechanisms over what needs to be redone, undone
and turned around to get back into a consistent state.

Jan

Josh Berkus wrote:

> Guys,
>
>> >BTW any comments on storing an entire database in single file? I don't
>> > trust any file system for performance and data integrity if I have single
>> > 100GB file. I would rather have multiple of them..
>>
>> I don't see why not. Entire file systems are stored within a single file
>> sometimes. Examples: vmware, and IIRC UserMode Linux.
>
> Several database systems use a "single file" for data storage. The problem
> with this is that it's not really a single file .... it's a proprietary file
> system on top of the host file system. This sort of design makes a couple
> assumptions:
>
> 1) That the database is better than the host filesystem/OS and storage system
> at regulating its use of, and storage of, data files;
> 2) that your data file will not exceed the maximum file size for the host OS.
>
> Both of these assumptions are, IMHO, based on antiquated data (or on Windows).
> Modern *nix filesystems and RAID are very, very efficient at file access and
> only a database with the development budget of Oracle could hope to keep up.
> Additionally, databases larger than 2GB are becoming increasingly common.
>
> Single-file databases also introduce a number of problems:
>
> 1) The database file is extremely vulnerable to corruption, and if corruption
> occurs it is usually not localized but destroys the entire database due to
> corruption of the internal file structure. Recovery of raw data out of a
> damaged single-file database inevitably requires specialized tools if it is
> possible at all.
> 2) Often DBAs are prevented from using normal file operations on the database
> files to maintain their systems. For example, try moving a partition on an
> MS SQL Server installation. Go on, I double-dog dare you.
> 3) Due to the necessity of maintaining not only data and metadata, but a file
> partitioning structure as well, maintenance on single-file databases is often
> more time-consuming but at the same time more crucial (to prevent #1).
>

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #

In response to

Responses

Browse pgsql-advocacy by date

  From Date Subject
Next Message Andrew Rawnsley 2003-08-22 19:48:45 Re: Single-file DBs WAS: Need concrete "Why Postgres
Previous Message Josh Berkus 2003-08-22 16:07:52 Re: Single-file DBs WAS: Need concrete "Why Postgres

Browse pgsql-general by date

  From Date Subject
Next Message Manfred Koizar 2003-08-22 16:54:05 Re: [HACKERS] Buglist
Previous Message Tom Lane 2003-08-22 16:37:35 Re: pg_dump and alter database

Browse pgsql-hackers by date

  From Date Subject
Next Message Manfred Koizar 2003-08-22 16:54:05 Re: [HACKERS] Buglist
Previous Message Jan Wieck 2003-08-22 16:21:50 Re: Buglist