DB logging (was: Problem with the numbers I reported yesterday)

From: jwieck(at)debis(dot)com (Jan Wieck)
To: kgor(at)inetspace(dot)com (Kent S(dot) Gordon)
Cc: ocie(at)paracel(dot)com, maillist(at)candle(dot)pha(dot)pa(dot)us, boersenspiel(at)vocalweb(dot)de, pgsql-hackers(at)postgreSQL(dot)org
Subject: DB logging (was: Problem with the numbers I reported yesterday)
Date: 1998-02-14 13:58:05
Message-ID: m0y3i6Y-000BFRC@orion.SAPserv.Hamburg.dsh.de
Lists: pgsql-hackers

Kent wrote:
>
> I do not think that pg_log is used like a normal 'log' device in other
> databases. From my quick look at the code, pg_log appears to hold only
> a list of transactions and not the actual data blocks. Notice that
> TransRecover is commented out in backend/access/transam/transam.c.
>
> Most database logs have the before images and after images of any page
> that has been modified in a transaction, followed by a commit/abort
> record. This allows for only this file to have to be synced. The
> rest of the database can float (generally checkpoints are done every
> so often to reduce recovery time). The method of recovering from a
> crash is to replay the log from the last checkpoint until the end of
> the log by applying the before/after images (as needed based on
> whether the transaction committed) to the actual database relations.
>
> I would appreciate anyone correcting any mistakes in my understanding
> of how postgres works.
>
> > Ocie Mitchell
>
> Kent S. Gordon
> Architect
> iNetSpace Co.
> voice: (972)851-3494 fax:(972)702-0384 e-mail:kgor(at)inetspace(dot)com
>
>

Totally right, PostgreSQL doesn't have a log mechanism that
collects all the information to recover a corrupted database
from a backup.
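
For illustration, the kind of recovery Kent describes could be sketched
roughly as follows. This is not PostgreSQL code: LogRecord, recover and
the page dictionary are hypothetical names, and the sketch assumes that
checkpoints are taken with no transaction active and that no transaction
overwrites another's uncommitted page.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class LogRecord:
    """Hypothetical log entry; not an actual PostgreSQL structure."""
    xid: int
    kind: str                      # "update", "commit", "abort" or "checkpoint"
    page: Optional[int] = None     # page number, for "update" records
    before: Optional[str] = None   # page contents before the change
    after: Optional[str] = None    # page contents after the change


def recover(pages: Dict[int, str], log: List[LogRecord]) -> Dict[int, str]:
    """Bring possibly stale or dirty on-disk pages to a consistent state."""
    # Only the part of the log after the last checkpoint has to be replayed.
    start = 0
    for i, rec in enumerate(log):
        if rec.kind == "checkpoint":
            start = i
    tail = log[start:]
    committed = {rec.xid for rec in tail if rec.kind == "commit"}

    # Undo pass, newest first: restore before images of work that never committed.
    for rec in reversed(tail):
        if rec.kind == "update" and rec.xid not in committed:
            pages[rec.page] = rec.before

    # Redo pass, oldest first: reapply after images of committed work.
    for rec in tail:
        if rec.kind == "update" and rec.xid in committed:
            pages[rec.page] = rec.after
    return pages
```

Because only the log has to reach disk before a commit is acknowledged,
the data pages may be in any state at crash time; the two passes do not
depend on which of them were actually written.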

I hacked around on that a little bit.

With complete after-image logging, that is, logging all the
tuples that are stored on insert/update, the tuple IDs of
deletes, and the IDs of the transactions that commit, the
regression tests produce more log data than the size of the
final regression database. The performance increase from
syncing only the log and control files (2 control files on
different devices and the log file on a different device from
the database files) and running the backends with -F is about
15-20% for the regression test.
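
As a rough sketch of what such an after-image log amounts to (this is
not the code I hacked up; AfterImage, roll_forward, log_bytes and the
dictionary standing in for a relation are hypothetical names chosen for
illustration), rolling a backup forward might look like this:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Tid = Tuple[int, int]  # (block number, offset); simplified stand-in for a tuple ID


@dataclass
class AfterImage:
    """Hypothetical after-image log record."""
    xid: int
    op: str                       # "store" (insert/update result), "delete" or "commit"
    tid: Optional[Tid] = None
    data: Optional[bytes] = None  # full tuple contents, for "store" only


def roll_forward(backup: Dict[Tid, bytes], log: List[AfterImage]) -> Dict[Tid, bytes]:
    """Apply the logged changes of committed transactions to a restored backup."""
    committed = {rec.xid for rec in log if rec.op == "commit"}
    table = dict(backup)
    for rec in log:
        if rec.xid not in committed:
            continue  # changes of transactions that never committed are skipped
        if rec.op == "store":
            table[rec.tid] = rec.data
        elif rec.op == "delete":
            table.pop(rec.tid, None)
    return table


def log_bytes(log: List[AfterImage]) -> int:
    """Tuple payload carried by the log, ignoring record headers."""
    return sum(len(rec.data) for rec in log if rec.data is not None)
```

Every stored tuple goes into the log in full, including versions that
are later deleted or belong to aborted transactions, which is one way
the log can outgrow the database it describes.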

I thought this was far too much log data, so I didn't spend
much time trying to implement recovery. But from as far as I
got, I can tell that handling the updates to system catalogs
and keeping the indices up to date will be really tricky.

Another possible log mechanism I'll try sometime after the
v6.3 release is to log the queries and the data from copy
commands, along with information about Oid and Tid allocations.

Until later, Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#======================================== jwieck(at)debis(dot)com (Jan Wieck) #
