RE: [HACKERS] mdnblocks is an amazing time sink in huge relations

From: "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp>
To: "Vadim Mikheev" <vadim(at)krs(dot)ru>
Cc: <pgsql-hackers(at)postgreSQL(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: RE: [HACKERS] mdnblocks is an amazing time sink in huge relations
Date: 1999-10-19 10:03:22
Message-ID: 000801bf1a19$2d88ae20$2801007e@cadzone.tpf.co.jp
Lists: pgsql-hackers

>
> Tom Lane wrote:
> >
> > >> a shared cache for system catalog tuples, which might be a win but I'm
> > >> not sure (I'm worried about contention for the cache, especially if it's
> > >> protected by just one or a few spinlocks). Anyway, if we did have one
>
> Commercial DBMSes have this... Isn't it a good reason? -:)
>
> > > But there would be a problem if we use shared catalog cache.
> > > System tuples being updated are visible only to the updating backend,
> > > and other backends should see committed tuples.
> > > On the other hand, an accurate block count should be visible to all
> > > backends.
> > > Which tuple of a row should we load into the catalog cache and update?
> >
> > Good point --- rolling back a transaction would cancel changes to the
> > pg_class row, but it mustn't cause the relation's file to get truncated
> > (since there could be tuples of other uncommitted transactions in the
> > newly added block(s)).
> >
> > This says that having a block count column in pg_class is the Wrong
> > Thing; we should get rid of relpages entirely. The Right Thing is a
> > separate data structure in shared memory that stores the current
> > physical block count for each active relation. The first backend to
> > touch a given relation would insert an entry, and then subsequent
> > extensions/truncations/deletions would need to update it. We already
> > obtain a special lock when extending a relation, so seems like there'd
> > be no extra locking cost to have a table like this.
>
> I supposed that each backend will still use own catalog
> cache (after reading entries from shared one) and synchronize
> shared/private caches on commit - e.g. update reltuples!
> relpages will be updated immediately after physical changes -
> what's problem with this?
>
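
To make the shared block-count table Tom describes above concrete, here is a minimal Python sketch. It is an illustration only, not PostgreSQL code: the class name, the method names and the single lock (standing in for the relation-extension lock) are all assumptions of mine.

```python
import threading


class BlockCountTable:
    """Hypothetical shared table: relation id -> current physical block count."""

    def __init__(self):
        # One lock for the whole table is a simplification; it stands in
        # for the special lock already taken when extending a relation.
        self._lock = threading.Lock()
        self._nblocks = {}

    def lookup(self, rel_id, count_blocks):
        """Return the block count; only the first caller runs count_blocks()."""
        with self._lock:
            if rel_id not in self._nblocks:
                # First backend to touch the relation inserts the entry.
                self._nblocks[rel_id] = count_blocks()
            return self._nblocks[rel_id]

    def extend(self, rel_id, added=1):
        """Record that blocks were appended; the entry must already exist."""
        with self._lock:
            self._nblocks[rel_id] += added
            return self._nblocks[rel_id]

    def truncate(self, rel_id, new_count):
        """Record a physical truncation to new_count blocks."""
        with self._lock:
            self._nblocks[rel_id] = new_count

    def drop(self, rel_id):
        """Forget a deleted relation."""
        with self._lock:
            self._nblocks.pop(rel_id, None)
```

Nothing in this table is transactional, so a rollback would leave the count alone, which is the property asked for above.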

Does this mean the following?

1. The shared cache holds committed system tuples.
2. The private cache holds uncommitted system tuples.
3. relpages in the shared cache is updated immediately on a
physical change, and the corresponding buffer pages are
marked dirty.
4. On commit, the contents of the uncommitted tuples, except
relpages, reltuples, ..., are copied to the corresponding tuples
in the shared cache, and the combined contents are
committed.

If so, catalog cache invalidation would no longer be needed.
But the synchronization in step 4 may be difficult.
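
The four steps can be sketched roughly as follows, in Python for brevity. This is only my reading of the scheme, not existing code: every name here is hypothetical, tuples are modelled as plain dicts, and only the two fields named in step 4 are excluded from the copy (the "..." is left open). The sketch does no locking at all, so it leaves the difficult part, making step 4 atomic, unanswered.

```python
# Fields maintained by physical changes, never copied at commit (step 4).
# Only the two fields named in the text are listed; this is an assumption.
EXCLUDED_ON_COMMIT = ("relpages", "reltuples")


class SharedCatalogCache:
    def __init__(self):
        self.committed = {}  # step 1: rel_id -> committed tuple (a dict)
        self.dirty = set()   # rel_ids whose buffer page is marked dirty

    def physical_update(self, rel_id, relpages):
        # Step 3: visible to every backend at once, outside any transaction.
        self.committed[rel_id]["relpages"] = relpages
        self.dirty.add(rel_id)


class Backend:
    def __init__(self, shared):
        self.shared = shared
        self.private = {}    # step 2: rel_id -> uncommitted tuple

    def update(self, rel_id, **changes):
        # Copy the committed tuple on first write, then change the copy.
        row = self.private.setdefault(rel_id, dict(self.shared.committed[rel_id]))
        row.update(changes)

    def lookup(self, rel_id):
        # Own uncommitted tuple if any, with the physical fields always
        # taken from the shared cache.
        row = dict(self.private.get(rel_id, self.shared.committed[rel_id]))
        for field in EXCLUDED_ON_COMMIT:
            row[field] = self.shared.committed[rel_id][field]
        return row

    def commit(self):
        # Step 4: merge everything except the physical fields.
        for rel_id, row in self.private.items():
            target = self.shared.committed[rel_id]
            for key, value in row.items():
                if key not in EXCLUDED_ON_COMMIT:
                    target[key] = value
            self.shared.dirty.add(rel_id)
        self.private.clear()

    def abort(self):
        # Rollback discards private tuples; shared relpages is untouched.
        self.private.clear()
```

In this model an abort cannot undo a relpages change, and no other backend ever sees an uncommitted tuple, which is why invalidation messages would seem unnecessary.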

Regards.

Hiroshi Inoue
Inoue(at)tpf(dot)co(dot)jp
