Re: Heap truncation without AccessExclusiveLock (9.4)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heap truncation without AccessExclusiveLock (9.4)
Date: 2013-05-16 17:01:16
Message-ID: CA+TgmoYVmV+5N+RY+W251JDqeZCh=sszFe+xMWW4EgPPc4hf+g@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 15, 2013 at 8:24 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I've been thinking for a while that we need some other system for
>> managing other kinds of invalidations. For example, suppose we want
>> to cache relation sizes in blocks. So we allocate 4kB of shared
>> memory, interpreted as an array of 512 8-byte entries. Whenever you
>> extend a relation, you hash the relfilenode and take the low-order 9
>> bits of the hash value as an index into the array. You increment that
>> entry either under a spinlock or perhaps using fetch-and-add where
>> available.
>
> I'm not sure I believe the details of that.
>
> 1. 4 bytes is not enough to store the exact identity of the table that
> the cache entry belongs to, so how do you disambiguate?

You don't. The idea is that it's inexact. When a relation is
extended, every backend is forced to recheck the length of every
relation whose relfilenode hashes to the same array slot as the one
that was actually extended. So if you happen to be repeatedly
scanning relation A, and somebody else is repeatedly scanning relation
B, you'll *probably* not have to invalidate anything. But if A and B
happen to hash to the same slot, then you'll keep getting bogus
invalidations. Fortunately, that isn't very expensive.
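
To make the scheme concrete, here's a minimal standalone C sketch of the
shared counter array described above (illustrative names and hash
function only, not actual PostgreSQL code; the real thing would live in
shared memory and would probably hash with something like hash_any()):

#include <stdatomic.h>
#include <stdint.h>

#define REL_EXTENSION_SLOTS 512          /* 512 * 8 bytes = 4kB */

/* In PostgreSQL this array would sit in shared memory. */
static _Atomic uint64_t rel_extension_counters[REL_EXTENSION_SLOTS];

/* Toy multiplicative hash; purely illustrative. */
static inline uint32_t
hash_relfilenode(uint32_t relfilenode)
{
    uint32_t h = relfilenode * 2654435761u;
    return h ^ (h >> 16);
}

/* Called whenever a relation is physically extended. */
static inline void
note_relation_extended(uint32_t relfilenode)
{
    /* low-order 9 bits of the hash pick one of the 512 slots */
    uint32_t slot = hash_relfilenode(relfilenode) & (REL_EXTENSION_SLOTS - 1);

    /* fetch-and-add where available; a spinlock would work too */
    atomic_fetch_add_explicit(&rel_extension_counters[slot], 1,
                              memory_order_release);
}

Doubling REL_EXTENSION_SLOTS roughly halves the chance that two hot
relations share a slot, at the cost of another 4kB of shared memory.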

The fast-path locking code uses a similar trick to detect conflicting
strong locks, and it works quite well. In that case, as here, you can
reduce the collision probability as much as you like by increasing the
number of slots, at the cost of increased shared memory usage.

> 2. If you don't find an entry for your target rel in the cache, aren't
> you still going to have to do an lseek?

Don't think of it as a cache. The caching happens inside each
backend's relcache; the shared memory structure is just a tool to
force those caches to be revalidated when necessary.
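
Sketching the backend side as well (again with made-up names, reusing
rel_extension_counters and hash_relfilenode from the sketch above, and
not real relcache code): the block count lives in the backend's own
cache alongside the counter value observed when it was recorded, and
the size is re-read from the kernel only when that slot's counter has
moved:

#include <stdbool.h>

typedef struct CachedRelSize
{
    uint32_t relfilenode;
    uint64_t nblocks;        /* block count cached by this backend */
    uint64_t counter_seen;   /* slot counter when nblocks was recorded */
    bool     valid;
} CachedRelSize;

/* Hypothetical stand-in for the lseek()-based smgrnblocks() call. */
static uint64_t
get_nblocks_from_kernel(uint32_t relfilenode)
{
    (void) relfilenode;
    return 0;                /* real code would ask the kernel via lseek() */
}

static uint64_t
cached_relation_size(CachedRelSize *c)
{
    uint32_t slot = hash_relfilenode(c->relfilenode) & (REL_EXTENSION_SLOTS - 1);
    uint64_t now  = atomic_load_explicit(&rel_extension_counters[slot],
                                         memory_order_acquire);

    if (!c->valid || now != c->counter_seen)
    {
        /*
         * Some relation hashing to this slot was extended -- possibly a
         * different one (the "bogus invalidation" case), so just redo
         * the cheap size lookup and remember the counter we saw.
         */
        c->nblocks = get_nblocks_from_kernel(c->relfilenode);
        c->counter_seen = now;
        c->valid = true;
    }
    return c->nblocks;
}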

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
