Re: VACUUM FULL versus system catalog cache invalidation

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: VACUUM FULL versus system catalog cache invalidation
Date: 2011-08-12 19:25:43
Message-ID: 4E457E37.6020706@enterprisedb.com
Lists: pgsql-hackers

On 12.08.2011 21:49, Robert Haas wrote:
> On Fri, Aug 12, 2011 at 2:09 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> 2. Forget about targeting catcache invals by TID, and instead just use the
>> key hash value to determine which cache entries to drop.
>>
>> Approach #2 seems a lot less invasive and more trustworthy, but it has the
>> disadvantage that cache invals would become more likely to blow away
>> entries unnecessarily (because of chance hashvalue collisions), even
>> without any VACUUM FULL being done. If we could make approach #1 work
>> reliably, it would result in more overhead during VACUUM FULL but less at
>> other times --- or at least we could hope so. In an environment where
>> lots of sinval overflows and consequent resets happen, we might come out
>> behind due to doubling the number of catcache flushes forced by a reset
>> event.
>>
>> Right at the moment I'm leaning to approach #2. I wonder if anyone
>> sees it differently, or has an idea for a third approach?
>
> I don't think it really matters whether we occasionally blow away an
> entry unnecessarily due to a hash-value collision. IIUC, we'd only
> need to worry about hash-value collisions between rows in the same
> catalog; and the number of entries that we have cached had better be
> many orders of magnitude less than 2^32. If the cache is large enough
> that we're having hash value collisions more than once in a great
> while, we probably should have flushed some entries out of it a whole
> lot sooner and a whole lot more aggressively, because we're likely
> eating memory like crazy.
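
Rough arithmetic bears that out. A back-of-the-envelope birthday estimate (plain Python, not PostgreSQL code; the 2^32 space corresponds to the catcache's 32-bit key hash value mentioned above):

```python
import math

# Birthday-problem estimate: probability that at least two of n cached
# entries in one catalog share the same 32-bit key hash value.
def collision_probability(n: int, space: int = 2**32) -> float:
    # P(no collision) ~ exp(-n*(n-1) / (2*space)) when n is far below space
    return 1.0 - math.exp(-n * (n - 1) / (2.0 * space))

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} cached entries -> collision chance {collision_probability(n):.4%}")
```

A session would need on the order of 100,000 cached entries in a single catalog before any collision becomes likely (around 1% at 10,000 entries), which matches the point that such a cache is already eating memory like crazy.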

What would suck, though, is if you have an application that repeatedly
creates and drops a temporary table whose hash value happens to match
that of some other table in the database. Catcache invalidation would
keep flushing the entry for that other table too, and you couldn't do
anything about it short of renaming one of the tables.
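
A toy model of that failure mode (plain Python, not the actual catcache code; the names and the shared hash value are made up for illustration), assuming approach #2's drop-everything-with-a-matching-hash behavior:

```python
# Toy cache keyed only by hash value; invalidation flushes every entry
# sharing the hash, innocent bystanders included (approach #2).
cache = {}  # hash_value -> set of cached table names

def cache_add(hash_value, name):
    cache.setdefault(hash_value, set()).add(name)

def invalidate(hash_value):
    # No TID targeting: everything with this hash value gets dropped.
    cache.pop(hash_value, None)

COLLIDING_HASH = 0xDEADBEEF  # hypothetical hash shared by both tables
cache_add(COLLIDING_HASH, "other_table")   # long-lived table, warm in cache
for _ in range(3):                         # app loop: create/drop temp table
    cache_add(COLLIDING_HASH, "temp_table")
    invalidate(COLLIDING_HASH)             # each drop flushes other_table too

print(cache.get(COLLIDING_HASH))  # other_table's entry is gone: None
```

Every iteration of the loop evicts the unrelated table's entry, so that table pays a fresh catalog lookup each time, and nothing short of a rename changes its hash.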

Despite that, +1 for option #2. The risk of collision seems acceptable,
and the consequence of a collision wouldn't be too bad in most
applications anyway.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
