From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "michael(dot)paquier(at)gmail(dot)com" <michael(dot)paquier(at)gmail(dot)com>, "david(at)pgmasters(dot)net" <david(at)pgmasters(dot)net>, "craig(at)2ndquadrant(dot)com" <craig(at)2ndquadrant(dot)com>
Subject: Re: Protect syscache from bloating with negative cache entries
Date: 2019-02-05 22:32:14
Message-ID: 5897f173-6f0c-2b7e-f923-5f56e5a72a9f@2ndquadrant.com
Lists: pgsql-hackers
On 2/5/19 11:05 PM, Alvaro Herrera wrote:
> On 2019-Feb-05, Tomas Vondra wrote:
>
>> I don't think we need to remove the expired entries right away, if there
>> are only very few of them. The cleanup requires walking the hash table,
>> which means significant fixed cost. So if there are only a few expired
>> entries (say, less than 25% of the cache), we can just leave them around
>> and clean them if we happen to stumble on them (although that may not be
>> possible with dynahash, which has no concept of expiration) or before
>> enlarging the hash table.
>
> I think seqscanning the hash table is going to be too slow; Ideriha-san
> idea of having a dlist with the entries in LRU order (where each entry
> is moved to head of list when it is touched) seemed good: it allows you
> to evict older ones when the time comes, without having to scan the rest
> of the entries. Having a dlist means two more pointers on each cache
> entry AFAIR, so it's not a huge amount of memory.
>
Possibly, although my guess is it will depend on the number of entries
to remove. For a small number of entries, the dlist approach is going to
be faster, but at some point the bulk seqscan gets more efficient.
FWIW this is exactly where a bit of benchmarking would help.
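To make the trade-off concrete, here is a rough, self-contained sketch of the dlist idea (hypothetical names, not the actual catcache or ilist.h code): each entry carries two extra pointers, a hit moves the entry to the head in O(1), and eviction pops from the tail without any hash table scan.

```c
#include <stddef.h>

/* Hypothetical, simplified cache entry; real syscache entries would
 * carry a hash key and a cached tuple instead of an int payload. */
typedef struct CacheEntry
{
    int  key;
    struct CacheEntry *prev;    /* the two extra pointers the dlist costs */
    struct CacheEntry *next;
} CacheEntry;

typedef struct
{
    CacheEntry *head;           /* most recently used */
    CacheEntry *tail;           /* least recently used, evicted first */
} LruList;

static void
lru_unlink(LruList *lru, CacheEntry *e)
{
    if (e->prev) e->prev->next = e->next; else lru->head = e->next;
    if (e->next) e->next->prev = e->prev; else lru->tail = e->prev;
    e->prev = e->next = NULL;
}

static void
lru_push_head(LruList *lru, CacheEntry *e)
{
    e->prev = NULL;
    e->next = lru->head;
    if (lru->head) lru->head->prev = e;
    lru->head = e;
    if (!lru->tail) lru->tail = e;
}

/* On every cache hit, move the entry to the head in O(1). */
static void
lru_touch(LruList *lru, CacheEntry *e)
{
    if (lru->head == e) return;
    lru_unlink(lru, e);
    lru_push_head(lru, e);
}

/* Evict the least recently used entry, without scanning the table. */
static CacheEntry *
lru_evict(LruList *lru)
{
    CacheEntry *victim = lru->tail;
    if (victim) lru_unlink(lru, victim);
    return victim;
}
```

The bulk-seqscan alternative has no per-hit cost at all, which is why the crossover point between the two probably needs measuring rather than guessing.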
>> So if we want to address this case too (and we probably want), we may
>> need to discard the old cache memory context somehow (e.g. rebuild the
>> cache in a new one, and copy the non-expired entries). Which is a nice
>> opportunity to do the "full" cleanup, of course.
>
> Yeah, we probably don't want to do this super frequently though.
>
Right. I've also realized the resizing is built into dynahash and is
kinda incremental - we add (and split) buckets one by one, instead of
immediately rebuilding the whole hash table. So yes, this would need
more care and might need to interact with dynahash in some way.
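For illustration, the rebuild-and-copy idea could look roughly like the following self-contained sketch (hypothetical names; the real thing would copy entries into a fresh MemoryContext and then delete the old one, returning the freed memory rather than leaving it fragmented):

```c
#include <stdlib.h>

/* Hypothetical entry with a last-access timestamp; real code would use
 * a clock or access counter maintained by the cache lookup path. */
typedef struct
{
    int key;
    int last_access;            /* logical clock of last lookup */
} Entry;

/*
 * Copy only the entries accessed at or after "cutoff" into freshly
 * allocated storage, then free the old array (analogous to deleting
 * the old memory context). *nentries is updated to the kept count.
 */
static Entry *
rebuild_cache(Entry *old, size_t *nentries, int cutoff)
{
    Entry *fresh = malloc(*nentries * sizeof(Entry));
    size_t kept = 0;

    for (size_t i = 0; i < *nentries; i++)
        if (old[i].last_access >= cutoff)
            fresh[kept++] = old[i];

    free(old);
    *nentries = kept;
    return fresh;
}
```

With dynahash the complication mentioned above remains: its buckets are split incrementally as entries are added, so a wholesale rebuild either has to go through hash_create()/hash_destroy() or cooperate with that incremental expansion.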
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services