Re: [HACKERS] md.c is feeling much better now, thank you

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org, "Vadim Mikheev" <vadim(at)krs(dot)ru>
Subject: Re: [HACKERS] md.c is feeling much better now, thank you
Date: 1999-09-03 05:50:06
Message-ID: 12033.936337806@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp> writes:
>> Doesn't work though: the ALTER TABLE tests in regress/misc fail.
>> Apparently, this change causes the sinval report from update of the
>> relation's pg_class heap entry to be read while the relation has refcnt>0,
>> so RelationFlushRelation doesn't flush it, so we have an obsolete
>> relation cache entry. Ooops.

> How about inserting "RelationDecrementReferenceCount(relation);"
> between LockAcquire() and DiscardInvalid() ?

Would only help if the relation had been opened exactly once before
the lock; not if its refcnt is greater than 1. Worse, it would only
help for the particular relation being locked, but we might receive
an sinval report for a different already-locked relation.

> And isn't it preferable that LockRelation() returns the relation
> which RelationIdGetRelation(tag.relId) returns ?

No, because that would only inform the immediate caller of LockRelation
of a change. This is insufficient for both the reasons mentioned above.
For that matter, my first-cut patch is insufficient, because it
won't detect the case that a relcache entry other than the one
currently being locked has been flushed.

I think what we need to do is revise RelationFlushRelation so that
it (a) deletes the relcache entry if its refcnt is zero; otherwise
(b) leaves the relcache entry in existence, but recomputes all
its contents and subsidiary data structures, and (c) if, while
trying to recompute the contents, it finds that the relation has
actually been deleted, then it's elog(ERROR) time. In this way,
existing pointers to the relcache entry --- which might belong to
routines very far down the call stack --- remain valid, or else
we elog() if they aren't.

We might still have a few bugs with routines that copy data out of the
relcache entry before locking it, but I don't recall having seen any
code like that. Most of the code seems to do heap_open immediately
followed by LockRelation, and that should be safe.

I'd like to hear Vadim's opinion before wading in, but this seems
like it ought to be workable.

regards, tom lane

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 1999-09-03 05:59:28 Re: [HACKERS] SELECT BUG
Previous Message Hiroshi Inoue 1999-09-03 04:41:19 RE: [HACKERS] md.c is feeling much better now, thank you