Re: ARC Memory Usage analysis

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org, josh(at)agliodbs(dot)com
Subject: Re: ARC Memory Usage analysis
Date: 2004-10-22 19:35:49
Message-ID: 41796115.2020501@Yahoo.com
Lists: pgsql-hackers pgsql-patches pgsql-performance

On 10/22/2004 2:50 PM, Simon Riggs wrote:

> I've been using the ARC debug options to analyse memory usage on the
> PostgreSQL 8.0 server. This is a precursor to more complex performance
> analysis work on the OSDL test suite.
>
> I've simplified some of the ARC reporting into a single log line, which
> is enclosed here as a patch on freelist.c. This includes reporting of:
> - the total memory in use, which wasn't previously reported
> - the cache hit ratio, which was slightly incorrectly calculated
> - a useful-ish value for looking at the "B" lists in ARC
> (This is a patch against cvstip, but I'm not sure whether this has
> potential for inclusion in 8.0...)
>
> The total memory in use is useful because it allows you to tell whether
> shared_buffers is set too high. If it is set too high, then memory usage
> will continue to grow slowly up to the max, without any corresponding
> increase in cache hit ratio. If shared_buffers is too small, then memory
> usage will climb quickly and linearly to its maximum.
>
> The last one I've called "turbulence" in an attempt to ascribe some
> useful meaning to B1/B2 hits - I've tried a few other measures though
> without much success. Turbulence is the hit ratio of B1+B2 lists added
> together. By observation, this is zero when ARC gives smooth operation,
> and goes above zero otherwise. Typically, turbulence occurs when
> shared_buffers is too small for the working set of the database/workload
> combination and ARC repeatedly re-balances the lengths of T1/T2 as a
> result of "near-misses" on the B1/B2 lists. Turbulence doesn't usually
> cut in until the cache is fully utilized, so there is usually some delay
> after startup.
>
> We also recently discussed that I would add some further memory analysis
> features for 8.1, so I've been trying to figure out how.
>
> The idea that B1, B2 represent something really useful doesn't seem to
> have been borne out - though I'm open to persuasion there.
>
> I originally envisaged a "shadow list" operating in extension of the
> main ARC list. This will require some re-coding, since the variables and
> macros are all hard-coded to a single set of lists. No complaints, just
> it will take a little longer than we all thought (for me, that is...)
>
> My proposal is to alter the code to allow an array of memory linked
> lists. The actual list would be [0] - other additional lists would be
> created dynamically as required i.e. not using IFDEFs, since I want this
> to be controlled by a SIGHUP GUC to allow on-site tuning, not just lab
> work. This will then allow reporting against the additional lists, so
> that cache hit ratios can be seen with various other "prototype"
> shared_buffer settings.
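
For reference, the three figures in the proposed log line can be sketched as below. This is only an illustration of the arithmetic in the quoted patch, not the freelist.c code itself: the ArcSample and ArcReport structs and the function names are made up for the example, and the counters are assumed to be sampled elsewhere. As in the patch, each list's percentage is truncated by integer division before the sums are taken, and everything reports as zero until there has been at least one lookup.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical snapshot of the ARC counters (names are illustrative only). */
typedef struct ArcSample
{
    long num_lookup;                      /* total buffer lookups */
    long hit_b1, hit_t1, hit_t2, hit_b2;  /* hits per ARC list */
    long t1_len, t2_len;                  /* current lengths of T1 and T2 */
} ArcSample;

/* The three values reported on the single summary log line. */
typedef struct ArcReport
{
    long buf_used;        /* T1 + T2: buffers actually holding pages */
    long cache_hit_pct;   /* T1 + T2 hit percentage: real cache hits */
    long turbulence_pct;  /* B1 + B2 hit percentage: "near misses" */
} ArcReport;

/* Integer percentage, truncated the same way as in the patch. */
long arc_pct(long hits, long lookups)
{
    return hits * 100 / lookups;
}

ArcReport arc_report(const ArcSample *s)
{
    ArcReport r = {0, 0, 0};

    /* Mirrors the patch: with no lookups yet, every value stays zero. */
    if (s->num_lookup == 0)
        return r;

    r.cache_hit_pct = arc_pct(s->hit_t1, s->num_lookup) +
                      arc_pct(s->hit_t2, s->num_lookup);
    r.turbulence_pct = arc_pct(s->hit_b1, s->num_lookup) +
                       arc_pct(s->hit_b2, s->num_lookup);
    r.buf_used = s->t1_len + s->t2_len;
    return r;
}

/* Prints the report using the format string from the patch. */
void arc_report_print(const ArcReport *r)
{
    printf("shared_buffers used=%8ld cache hits=%4ld%% turbulence=%4ld%%\n",
           r->buf_used, r->cache_hit_pct, r->turbulence_pct);
}
```

Hits on B1 and B2 are kept out of the cache hit figure because those lists hold only page identities, not pages, so a "hit" there still costs a read.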

All the existing lists live in shared memory, so that dynamic approach
suffers from the fact that the memory has to be allocated during ipc_init.
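
One way around that constraint, sketched here purely as an assumption, is to reserve room for a fixed maximum number of lists when shared memory is sized, and let the runtime setting only activate slots inside that reservation. None of the names below (MAX_STRATEGY_LISTS, StrategyArea, strategy_activate_list) exist in the backend. The sketch also tracks statistics only, whereas a real shadow list would need directory entries whose number depends on the simulated buffer count.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical compile-time cap: slot 0 is the real list, the rest shadows. */
#define MAX_STRATEGY_LISTS 4

/* Per-list statistics (illustrative; not the backend's structures). */
typedef struct StrategyListStats
{
    int  simulated_buffers;   /* shared_buffers value this list models */
    long num_lookup;
    long num_hit;
} StrategyListStats;

/* Fixed-size area, so its size is known before any list is activated. */
typedef struct StrategyArea
{
    int               nactive;
    StrategyListStats lists[MAX_STRATEGY_LISTS];
} StrategyArea;

/* Amount to reserve at shared memory initialization time. */
size_t strategy_area_size(void)
{
    return sizeof(StrategyArea);
}

/* Initialize the area in already-reserved memory; slot 0 is the real list. */
StrategyArea *strategy_area_init(void *mem, int real_buffers)
{
    StrategyArea *area = (StrategyArea *) mem;

    memset(area, 0, sizeof(*area));
    area->lists[0].simulated_buffers = real_buffers;
    area->nactive = 1;
    return area;
}

/*
 * Activate one more shadow list at run time (for example on a
 * configuration reload). Returns the slot index, or -1 when the
 * preallocated slots are exhausted: nothing is allocated here.
 */
int strategy_activate_list(StrategyArea *area, int simulated_buffers)
{
    if (area->nactive >= MAX_STRATEGY_LISTS)
        return -1;
    area->lists[area->nactive].simulated_buffers = simulated_buffers;
    area->lists[area->nactive].num_lookup = 0;
    area->lists[area->nactive].num_hit = 0;
    return area->nactive++;
}
```

The trade-off is that the memory for every possible shadow list is paid for up front, whether or not the lists are ever switched on.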

What do you think about my other theory to make C actually 2x effective
cache size and NOT to keep T1 in shared buffers but to assume T1 lives
in the OS buffer cache?

Jan

>
> Any thoughts?
>
>
>
> ------------------------------------------------------------------------
>
> Index: freelist.c
> ===================================================================
> RCS file: /projects/cvsroot/pgsql/src/backend/storage/buffer/freelist.c,v
> retrieving revision 1.48
> diff -d -c -r1.48 freelist.c
> *** freelist.c 16 Sep 2004 16:58:31 -0000 1.48
> --- freelist.c 22 Oct 2004 18:15:38 -0000
> ***************
> *** 126,131 ****
> --- 126,133 ----
> if (StrategyControl->stat_report + DebugSharedBuffers < now)
> {
> long all_hit,
> + buf_used,
> + b_hit,
> b1_hit,
> t1_hit,
> t2_hit,
> ***************
> *** 155,161 ****
> }
>
> if (StrategyControl->num_lookup == 0)
> ! all_hit = b1_hit = t1_hit = t2_hit = b2_hit = 0;
> else
> {
> b1_hit = (StrategyControl->num_hit[STRAT_LIST_B1] * 100 /
> --- 157,163 ----
> }
>
> if (StrategyControl->num_lookup == 0)
> ! all_hit = buf_used = b_hit = b1_hit = t1_hit = t2_hit = b2_hit = 0;
> else
> {
> b1_hit = (StrategyControl->num_hit[STRAT_LIST_B1] * 100 /
> ***************
> *** 166,181 ****
> StrategyControl->num_lookup);
> b2_hit = (StrategyControl->num_hit[STRAT_LIST_B2] * 100 /
> StrategyControl->num_lookup);
> ! all_hit = b1_hit + t1_hit + t2_hit + b2_hit;
> }
>
> errcxtold = error_context_stack;
> error_context_stack = NULL;
> ! elog(DEBUG1, "ARC T1target=%5d B1len=%5d T1len=%5d T2len=%5d B2len=%5d",
> T1_TARGET, B1_LENGTH, T1_LENGTH, T2_LENGTH, B2_LENGTH);
> ! elog(DEBUG1, "ARC total =%4ld%% B1hit=%4ld%% T1hit=%4ld%% T2hit=%4ld%% B2hit=%4ld%%",
> all_hit, b1_hit, t1_hit, t2_hit, b2_hit);
> ! elog(DEBUG1, "ARC clean buffers at LRU T1= %5d T2= %5d",
> t1_clean, t2_clean);
> error_context_stack = errcxtold;
>
> --- 168,187 ----
> StrategyControl->num_lookup);
> b2_hit = (StrategyControl->num_hit[STRAT_LIST_B2] * 100 /
> StrategyControl->num_lookup);
> ! all_hit = t1_hit + t2_hit;
> ! b_hit = b1_hit + b2_hit;
> ! buf_used = T1_LENGTH + T2_LENGTH;
> }
>
> errcxtold = error_context_stack;
> error_context_stack = NULL;
> ! elog(DEBUG1, "shared_buffers used=%8ld cache hits=%4ld%% turbulence=%4ld%%",
> ! buf_used, all_hit, b_hit);
> ! elog(DEBUG2, "ARC T1target=%5d B1len=%5d T1len=%5d T2len=%5d B2len=%5d",
> T1_TARGET, B1_LENGTH, T1_LENGTH, T2_LENGTH, B2_LENGTH);
> ! elog(DEBUG2, "ARC total =%4ld%% B1hit=%4ld%% T1hit=%4ld%% T2hit=%4ld%% B2hit=%4ld%%",
> all_hit, b1_hit, t1_hit, t2_hit, b2_hit);
> ! elog(DEBUG2, "ARC clean buffers at LRU T1= %5d T2= %5d",
> t1_clean, t2_clean);
> error_context_stack = errcxtold;
>
>
>
> ------------------------------------------------------------------------
>
>

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #
