From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
---|---|
To: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
Cc: | Fujii Masao <fujii(at)postgresql(dot)org>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [COMMITTERS] pgsql: Add regression tests for multiple synchronous standbys. |
Date: | 2016-04-14 04:22:38 |
Message-ID: | CAB7nPqT6GGLwPzyPOR7BYVjYUkD=S2bA2QT6XhWB1WixWy+7BA@mail.gmail.com |
Lists: | pgsql-committers pgsql-hackers |
On Thu, Apr 14, 2016 at 11:29 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Wed, Apr 13, 2016 at 4:54 PM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
>> On Fri, Apr 8, 2016 at 4:49 PM, Fujii Masao <fujii(at)postgresql(dot)org> wrote:
>>> Add regression tests for multiple synchronous standbys.
>>>
>>> Authors: Suraj Kharage, Michael Paquier, Masahiko Sawada, refactored by me
>>> Reviewed-By: Kyotaro Horiguchi
>>
>> Well, we are not quite there yet:
>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&dt=2016-04-12%2016%3A00%3A06
>>
>> # Running: pg_ctl -D
>> /home/buildfarm/data/buildroot/HEAD/pgsql.build/src/test/recovery/tmp_check/data_master_Qmuz/pgdata
>> reload
>> server signaled
>> not ok 2 - asterisk in synchronous_standby_names
>>
>> # Failed test 'asterisk in synchronous_standby_names'
>> # at t/007_sync_rep.pl line 26.
>> # got: 'standby1|1|sync
>> # standby2|1|potential
>> # standby3|0|async'
>> # expected: 'standby1|1|sync
>> # standby2|1|potential
>> # standby3|1|potential'
>
> This seems to be a timing issue.
>
> There can be a small window after SIGHUP is sent and before the walsender
> updates its priority based on the new s_s_names. If pg_stat_replication is
> checked before that update, it displays unexpected output. Probably we need
> to sleep a few seconds after pg_ctl reload before checking
> pg_stat_replication.

Yes. I'd prefer to avoid a hardcoded sleep and instead rely on lookups
of pg_stat_replication, because there is no way to be sure that a fixed
sleep will ever be stable on heavily loaded machines, like the machines
I specialize in maintaining :)

It somewhat defeats the purpose of having checks on pg_stat_replication,
since the validation tests are based on that. Still, I think we had
better base those checks on a loop that has a timeout, something like
this in a subroutine:
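
    my $test_passed = 0;
    my $timeout = 0;
    my $result;
    while ($timeout < 30)
    {
        # 'blah' is a placeholder column list; safe_psql returns stdout
        $result = $node->safe_psql('postgres',
            'SELECT blah FROM pg_stat_replication');
        if ($result eq $expected)
        {
            $test_passed = 1;
            last;
        }
        sleep 1;
        $timeout++;
    }
    ok($test_passed, $test_name);
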
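As a rough sketch of the subroutine form, the loop could be factored into a helper that takes the lookup as a code reference. The name poll_until and its parameters are hypothetical and not part of PostgresNode; the defaults of 30 attempts and a 1-second interval simply mirror the loop above.

```perl
use strict;
use warnings;

# Hypothetical helper (not an existing PostgresNode method): call $fetch
# until it returns $expected or $max_tries attempts have been used.
# Returns 1 on a match and 0 on timeout.
sub poll_until
{
	my ($fetch, $expected, $max_tries, $interval) = @_;
	$max_tries = 30 unless defined $max_tries;
	$interval  = 1  unless defined $interval;

	for my $try (1 .. $max_tries)
	{
		my $result = $fetch->();
		return 1 if defined $result && $result eq $expected;

		# No point in sleeping after the final attempt.
		sleep $interval if $try < $max_tries;
	}
	return 0;
}
```

In a test it might be called as ok(poll_until(sub { $node->safe_psql('postgres', $query) }, $expected), $test_name), where $query stands for whatever lookup on pg_stat_replication the test needs. Passing the lookup as a closure keeps the retry logic independent of the node object, and a loaded machine only costs extra iterations instead of a false failure.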
What do you think?
--
Michael