Re: Weird spikes in delay for async streaming replication on 9.1

From: John Scalia <jayknowsunix(at)gmail(dot)com>
To: "David F(dot) Skoll" <dfs(at)roaringpenguin(dot)com>, pgsql-admin(at)postgresql(dot)org
Subject: Re: Weird spikes in delay for async streaming replication on 9.1
Date: 2015-02-26 16:06:50
Message-ID: 54EF449A.9030407@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin


On 2/26/2015 11:00 AM, David F. Skoll wrote:
> On Thu, 26 Feb 2015 10:51:55 -0500
> John Scalia <jayknowsunix(at)gmail(dot)com> wrote:
>
>> OK, if it's asynchronous, is your script checking that primary isn't
>> holding up closing out and transmitting the latest WAL segment during
>> these times?
> No. The checking script is dead simple. It's a perl script that does
> more-or-less the equivalent of this. $master_dbi is a DBI handle
> pointing to the master database server and $standby_dbi is one pointing
> to the standby. Pseudocode is shown below (the real code actually measures
> the duration in much finer increments than 1s, but you get the idea...)
>
> my $time = time();
>
> # There's one row in the table with key column 'Timestamp'
> $master_dbi->do("UPDATE table SET value = ? WHERE key = 'Timestamp'", undef, $time);
>
> my $start = time();
> while(..) {
> $row = $standby_dbi->fetchrow_arrayref("SELECT * FROM table WHERE key = 'Timestamp'");
> # Check that we got back the same $time we put in and if so,
> # break out of the while loop; otherwise pause for a bit.
> }
> # See how long that took
> my $duration = time() - $start;
>
>> And if the Standby really needs to be up to date, why
>> not try synchronous replication?
> It doesn't need to be totally up-to-date. It can lag by up to 30s without
> impacting our application. And we don't use synchronous replication because
> then a network problem would stall all the write transactions on the primary
> and that would be fatal.
Possibly, but that's why WE use a pair of standby servers, not a single one, so that all transactions get committed in a timely manner. The odds of both standbys failing at the
same time are really small. Maybe your script should check which is latest WAL segment on each system first? That might show that you have a timedelay with getting the info to the
standby.

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message David F. Skoll 2015-02-26 16:11:52 Re: Weird spikes in delay for async streaming replication on 9.1
Previous Message David F. Skoll 2015-02-26 16:00:07 Re: Weird spikes in delay for async streaming replication on 9.1