Re: BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Federico Campoli <federico(at)brandwatch(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica
Date: 2013-06-11 21:25:03
Message-ID: CAMkU=1y7QsLiT7zKMDR9mX5zdi8CrKdUQaF=hN52wNTvM3hUqA@mail.gmail.com
Lists: pgsql-bugs

On Fri, Jun 7, 2013 at 6:08 AM, Federico Campoli <federico(at)brandwatch(dot)com> wrote:

> On 06/06/13 21:22, Jeff Janes wrote:
>
>>
>>
>> I'd probably approach this with a combination of "strace -T -ttt -p
>> <PID>" and "lsof -p <PID>" on the PID of the start-up process, to see
>> which file descriptors it is waiting for read or writes on, and what the
>> underlying names of the files for those file descriptors are.
>>
>> Cheers,
>>
>> Jeff
>>
>
> I've generated a strace from the startup process, a new profile report and
> a new log file.
>
> It's quite big so you can download here
>
> http://d01.megashares.com/dl/BqZLh5H/HS_data_lag.tar
>
> I've monitored the startup process lsof and the only files accessed are
> the recovering wal and the relation affected by the master's activity.
>

You didn't need to strace the entire operation; a sample of about one minute
from the middle of it should be enough.
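
For example, something like this (using the strace options from above plus GNU
coreutils' timeout; <PID> is the startup process and the output file name is
just a placeholder) would capture a one-minute sample:

timeout 60 strace -T -ttt -p <PID> -o startup_sample.strace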

It looks like it is basically idle before 1370603193.846487, so I ignored
that part. Of the remaining 472 seconds covered (during which time it
replays ~19 WAL files), the sum of the system call times covers only 210
seconds, so the rest is presumably CPU time (although the error due to
rounding and instrumentation vagaries could be quite high).
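In other words, roughly 472 - 210 = 262 seconds are not accounted for by
system calls.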

Even during the active part, quite a bit of the time is still spent idle
("poll([{fd=3, events=POLLIN}], 1, 5000)"). The biggest remaining chunk is
spent on "lseek(11,", but again I don't know how much to trust that: each
individual call is at the limit of measurability, so the total could just be
accumulated instrumentation error rather than real time.

zcat pg_HS_strace.out.gz |
  perl -lne '/<([\d.]+)>/ and $_ > 1370603193.846487 and print "$1\t$_"' |
  sort -rn |
  perl -lne '$sum+=$_; print "$sum\t$_"' |
  less

zcat pg_HS_strace.out.gz |
  perl -lne '/ ([^,]+,).*<([\d.]+)>/ and $_ > 1370603193.846487 and $h{$1}+=$2;
             END {print "$h{$_}\t$_" foreach keys %h}' |
  sort -g
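
(The first lists every call after that timestamp with its elapsed time, sorted
descending and prefixed with a running cumulative total; the second sums the
elapsed time per call-plus-first-argument, e.g. all of the "lseek(11," calls
together, and sorts the totals.)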

For the warm standby, if I cut off the uninteresting parts at the beginning
and end (1370604295.966518 and 1370605296.458290), it applies ~33 WAL files in
1000 seconds, so it doesn't seem to be any faster than the hot standby.
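(That is roughly 33/1000 = 0.033 WAL files per second, versus about
19/472 = 0.040 per second for the hot standby above.)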

Are you sure the strace covered the events of interest?

Do you have output of vmstat or sar for the period of interest? It would
be interesting to see what the user time and wait time are.
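
For example (assuming procps and sysstat are installed; the output file names
are just placeholders), something like this running on the standby for the
duration of the test would do:

vmstat 5 > standby_vmstat.out
sar -u 5 > standby_sar.out

vmstat's us/wa columns and sar's %user/%iowait would show the user and wait
time directly.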

>
> Regarding the warm standby, I've repeated the test on my sandbox with the
> slave in warm standby and I've noticed a replication lag spike.
>
> This does not affect the production, same database version and architecture
> from debian package.
>

I'm not sure what you are saying here. Both HS and WS lag on your test
machine, but only HS does on the production? And the straces you posted
are from the test machine?

In any case, nothing really stands out here. Other than the idle periods,
no one thing seems to be dominating the time.

Cheers,

Jeff
