Difference in the tablespace folders on primary and secondary nodes

From: Abhishek Bhola <abhishek(dot)bhola(at)japannext(dot)co(dot)jp>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Difference in the tablespace folders on primary and secondary nodes
Date: 2023-07-25 08:38:42
Message-ID: CAEDsCzhpM3PPc03HtQ_-aqvQ=gfC06qsS-LhAr+b+nNQX1OAwQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I recently set up a Postgres15 master-slave (Primary-secondary) cluster on
2 nodes. At the time of starting, I ensured that data files are exactly the
same on both the nodes. The size of the DB is 1.5TB.
The directory structure on both nodes looks as follows:
```
- /PROD/datadg/
|
|-> /PROD/datadg/tablespace
| |-> /PROD/datadg/tablespace/tablespace1
| |-> /PROD/datadg/tablespace/tablespace2
|
|-> /PROD/datadg/data
| |-> /PROD/datadg/data/pg_tblspc
| | | -> /PROD/datadg/data/pg_tblspc/16432 ->
/PROD/datadg/tablespace/tablespace1
| | | -> /PROD/datadg/data/pg_tblspc/16433 ->
/PROD/datadg/tablespace/tablespace2

```
Almost a week later now, I see almost a 2GB size difference in the
tablespace folders on the 2 nodes. I also see some file count difference on
both the nodes.

`autovacuum` is on on both the nodes and there aren't any `idle in
transaction` queries on the slave node. Also there hasn't been any
disruption on the streaming replication. I did not get any error like `WAL
segment already removed` or so on the slave node. `pg_stat_replication` on
the master node also doesn't show anything out of the ordinary and the
`sent_lsn`, `write_lsn` and `flush_lsn` are regularly updated. I can not
see a difference in counts of most tables either, haven't verified for all
of them.
So my **first question** is:

* Why is there a difference in the files in the tablespace folder? I can
understand the difference in the modification timestamps, but some files
are just missing on the slave node.

Now if I were to run `vacuumdb` on the master node, there are chances that
the slave node will break and give an error like this
```
PANIC,XX000,"WAL contains references to invalid pages",,,,,"WAL redo at
875E/21A70BD0 for
Heap2/VISIBLE: cutoff xid 60350476 flags 0x01; blkref #0: rel
16405/16419/533716933, fork 2, blk 26; blkref #1: rel
16405/16419/533716933, blk 853758",,,,"","startup
```

In the case when slave node breaks, these are the steps I usually do to
bring the slave node back:
1) Start `pg_backup_start('backup')` on the master node
2) rsync the files from master to slave by running the following on the
slave node:
```
rsync -av --delete master_node:/PROD/datadg/data/ /PROD/datadg/data
--exclude 'pg_log' --exclude 'pg_replslot'
```
3. Stop `pg_backup_stop()` on master node

4. Start the slave node again and it usually works, even though the
tablespace files might not still be the same.

**Second question**:

* What is the best way to bring the slave node back? Is the `rsync` between
tablespaces required? And if yes, what is the best method to do it for very
large databases, something maybe as big as 30TB or more. I don't want to
`rsync` all the files even if the timestamp on them is different. So is a
command like this safe to do? Or should an option like `--checksum` be
used?
```
rsync -av --delete master_node:/PROD/datadg/tablespace/
/PROD/datadg/tablespace --size-only
```

**Third question:**

* Is it advised to run `vacuumdb` before or after bringing the slave node
back again?

--
_This correspondence (including any attachments) is for the intended
recipient(s) only. It may contain confidential or privileged information or
both. No confidentiality or privilege is waived or lost by any
mis-transmission. If you receive this correspondence by mistake, please
contact the sender immediately, delete this correspondence (and all
attachments) and destroy any hard copies. You must not use, disclose, copy,
distribute or rely on any part of this correspondence (including any
attachments) if you are not the intended
recipient(s).本メッセージに記載および添付されている情報(以下、総称して「本情報」といいます。)は、本来の受信者による使用のみを意図しています。誤送信等により本情報を取得された場合でも、本情報に係る秘密、または法律上の秘匿特権が失われるものではありません。本電子メールを受取られた方が、本来の受信者ではない場合には、本情報及びそのコピーすべてを削除・破棄し、本電子メールが誤って届いた旨を発信者宛てにご通知下さいますようお願いします。本情報の閲覧、発信または本情報に基づくいかなる行為も明確に禁止されていることをご了承ください。_

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Marc Millas 2023-07-25 10:23:42 Re: suggestion about time based partitioning and hibernate
Previous Message jian he 2023-07-25 08:37:05 Re: [Beginner Question]Is there way to test the postgres's kernel function?