Re: Analysis on backend-private memory usage (and a patch)

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Analysis on backend-private memory usage (and a patch)
Date: 2013-09-05 17:29:35
Message-ID: 5228BF7F.3080806@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 05.09.2013 17:22, Tom Lane wrote:
> Heikki Linnakangas<hlinnakangas(at)vmware(dot)com> writes:
>> I ran pgbench for ten seconds, and printed the number of tuples in each
>> catcache after that:
>> [ very tiny numbers ]
>
> I find these numbers a bit suspicious. For example, we must have hit at
> least 13 different system catalogs, and more than that many indexes, in
> the course of populating the syscaches you show as initialized. How is
> it there are only 4 entries in the RELOID cache? I wonder if there were
> cache resets going on.

Relcache is loaded from the init file. The lookups of those system
catalogs and indexes never hit the syscache, because the entries are
found in relcache. When I delete the init file and launch psql, without
running any queries, I get this (caches with 0 tups left out):

LOG: cache id 45 on pg_class: 7 tups
LOG: cache id 32 on pg_index: 63 tups
LOG: cache id 21 on pg_database: 1 tups
LOG: cache id 11 on pg_authid: 1 tups
LOG: cache id 10 on pg_authid: 1 tups
LOG: cache id 2 on pg_am: 1 tups

> A larger issue is that pgbench might not be too representative. In
> a quick check, I find that cache 37 (OPERNAMENSP) starts out empty,
> and contains 1 entry after "select 2=2", which is expected since
> the operator-lookup code will start by looking for int4 = int4 and
> will get an exact match. But after "select 2=2::numeric" there are
> 61 entries, as a byproduct of having thumbed through every binary
> operator named "=" to resolve the ambiguous match. We went so far
> as to install another level of caching in front of OPERNAMENSP because
> it was getting too expensive to deal with heavily-overloaded operators
> like that one. In general, we've had to spend enough sweat on optimizing
> catcache searches to make me highly dubious of any claim that the caches
> are usually almost empty.
>
> I understand your argument that resizing is so cheap that it might not
> matter, but nonetheless reducing these caches as far as you're suggesting
> sounds to me to be penny-wise and pound-foolish. I'm okay with setting
> them on the small side rather than on the large side as they are now, but
> not with choosing sizes that are guaranteed to result in resizing cycles
> during startup of any real app.

Ok, committed the attached.

To choose the initial sizes, I put a WARNING into the rehash function,
ran the regression suite, and adjusted the sizes so that most regression
tests run without rehashing. With the attached patch, 18 regression
tests cause rehashing (see regression.diffs). The ones that do are
because they are exercising some parts of the system more than a typical
application would do: the enum regression test for example causes
rehashes of the pg_enum catalog cache, and the aggregate regression test
causes rehashing of pg_aggregate, and so on. A few regression tests do a
database-wide VACUUM or ANALYZE; those touch all relations, and cause a
rehash of pg_class and pg_index.

- Heikki

Attachment Content-Type Size
resize-syscaches-2.patch text/x-diff 15.2 KB
regression.diffs text/plain 20.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2013-09-05 17:30:34 missing SSI check in heapgettup_pagemode() ?
Previous Message ktm@rice.edu 2013-09-05 17:29:30 Re: get rid of SQL_ASCII?