Re: master and sync-replica diverging

From: Joshua Berkus <josh(at)agliodbs(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: master and sync-replica diverging
Date: 2012-05-17 12:32:30
Message-ID: 526616069.300852.1337257950539.JavaMail.root@mail-1.01.com
Lists: pgsql-hackers

Erik,

Are you taking the counts *while* the table is loading? In sync replication, it's possible for the counts to differ for a short time due to one of three things:

* the transaction has been saved on the replica, but the confirmation message hasn't reached the master yet
* the replica has synced the transaction to its WAL but, due to wal_delay settings, hasn't yet applied it to the tables in memory
* the master is being updated with synchronous_commit = local
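
If you want to rule out the second case, one option is to wait until the replica has replayed up to the master's current WAL position before taking the count there. Below is a rough sketch of that idea, not a tested recipe. It assumes the 9.2 function names pg_current_xlog_location() and pg_last_xlog_replay_location(); lsn_to_int and wait_for_replay are helper names I made up for illustration, and the port numbers are whatever your instances use.

```shell
# Hypothetical helper: turn an xlog location such as "0/3000060" into a
# single integer so that two locations can be compared numerically.
lsn_to_int() {
    local hi=${1%%/*} lo=${1##*/}
    echo $(( (16#$hi << 32) + 16#$lo ))
}

# Hypothetical helper: return 0 once the replica on port $2 has replayed
# at least up to the master's (port $1) current WAL location; give up
# after $3 polls (default 50, 0.1 s apart) and return 1.
wait_for_replay() {
    local master_port=$1 replica_port=$2 tries=${3:-50}
    local target replayed
    target=$(psql -qtAXp "$master_port" \
        -c "select pg_current_xlog_location()") || return 1
    [[ -n $target ]] || return 1
    while (( tries-- > 0 )); do
        replayed=$(psql -qtAXp "$replica_port" \
            -c "select pg_last_xlog_replay_location()")
        if [[ -n $replayed ]] &&
           (( $(lsn_to_int "$replayed") >= $(lsn_to_int "$target") )); then
            return 0
        fi
        sleep 0.1
    done
    return 1
}
```

With your ports that would be something like "wait_for_replay 6564 6565" right before the count on 6565. If the counts still differ after that returns 0, replay lag is not the explanation.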

----- Original Message -----
> AMD FX 8120 / centos 6.2 / latest source (git head)
>
>
> It seems to be quite easy to force a 'sync' replica to not be equal
> to master by recreating+loading a table in a while loop.
>
>
> For this test I compiled+checked+installed three separate instances
> on the same machine. The replicas' application_names are
> 'wal_receiver_$copy', where $copy is 01 and 02 respectively.
>
> $ ./sync_state.sh
>   pid  | application_name |   state   | sync_state
> -------+------------------+-----------+------------
>  19520 | wal_receiver_01  | streaming | sync
>  19567 | wal_receiver_02  | streaming | async
> (2 rows)
>
>  port | synchronous_commit | synchronous_standby_names
> ------+--------------------+---------------------------
>  6564 | on                 | wal_receiver_01
> (1 row)
>
>  port | synchronous_commit | synchronous_standby_names
> ------+--------------------+---------------------------
>  6565 | off                |
> (1 row)
>
>  port | synchronous_commit | synchronous_standby_names
> ------+--------------------+---------------------------
>  6566 | off                |
> (1 row)
>
>
>
> The test consists of creating a table and loading tab-separated data
> from file with COPY and then
> taking the rowcount of that table (13 MB, almost 200k rows) in all
> three instances:
>
>
> # wget http://flybase.org/static_pages/downloads/FB2012_03/genes/fbgn_annotation_ID_fb_2012_03.tsv.gz
>
> slurp_file=fbgn_annotation_ID_fb_2012_03.tsv.gz
>
> zcat $slurp_file \
> | grep -v '^#' \
> | grep -Ev '^[[:space:]]*$' \
> | psql -c "
> drop table if exists $table cascade;
> create table $table (
> gene_symbol text
> , primary_fbgn text
> , secondary_fbgns text
> , annotation_id text
> , secondary_annotation_ids text
> );
> copy $table from stdin csv delimiter E'\t';
> ";
>
> # count on master:
> echo "select current_setting('port') port,count(*) from $table"|psql -qtXp 6564
>
> # count on wal_receiver_01 (sync replica):
> echo "select current_setting('port') port,count(*) from $table"|psql -qtXp 6565
>
> # count on wal_receiver_02 (async replica):
> echo "select current_setting('port') port,count(*) from $table"|psql -qtXp 6566
>
>
>
> I expected the rowcounts from master and sync replica to always be
> the same.
>
> Initially this seemed to be the case, but when I run the above
> sequence in a while loop for a few minutes, about 10% of the rowcounts
> from the sync replica differ from the master's.
>
> Perhaps not a likely scenario, but surely such a deviating rowcount
> on a sync replica should not be possible?
>
>
> thank you,
>
>
> Erik Rijkers
