| From: | Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com> | 
|---|---|
| To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Add statistics to pg_stat_wal view for wal related parameter tuning | 
| Date: | 2020-10-20 02:31:11 | 
| Message-ID: | 35ef960128b90bfae3b3fdf60a3a860f@oss.nttdata.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Hi,
I think we need to add some statistics to pg_stat_wal view.
Although there are some parameter related WAL,
there are few statistics for tuning them.
I think it's better to provide the following statistics.
Please let me know your comments.
```
postgres=# SELECT * from pg_stat_wal;
-[ RECORD 1 ]-------+------------------------------
wal_records         | 2000224
wal_fpi             | 47
wal_bytes           | 248216337
wal_buffers_full    | 20954
wal_init_file       | 8
wal_write_backend   | 20960
wal_write_walwriter | 46
wal_write_time      | 51
wal_sync_backend    | 7
wal_sync_walwriter  | 8
wal_sync_time       | 0
stats_reset         | 2020-10-20 11:04:51.307771+09
```
1. Basic statistics of WAL activity
- wal_records: Total number of WAL records generated
- wal_fpi: Total number of WAL full page images generated
- wal_bytes: Total amount of WAL bytes generated
To understand DB's performance, first, we will check the performance
trends for the entire database instance.
For example, if the number of wal_fpi becomes higher, users may tune
"wal_compression", "checkpoint_timeout" and so on.
Although users can check the above statistics via EXPLAIN, auto_explain,
autovacuum and pg_stat_statements now,
if users want to see the performance trends  for the entire database,
they must recalculate the statistics.
I think it is useful to add the sum of the basic statistics.
2. WAL segment file creation
- wal_init_file: Total number of WAL segment files created.
To create a new WAL file may have an impact on the performance of
a write-heavy workload generating lots of WAL. If this number is 
reported high,
to reduce the number of this initialization, we can tune WAL-related 
parameters
so that more "recycled" WAL files can be held.
3. Number of when WAL is flushed
- wal_write_backend : Total number of WAL data written to the disk by 
backends
- wal_write_walwriter : Total number of WAL data written to the disk by 
walwriter
- wal_sync_backend : Total number of WAL data synced to the disk by 
backends
- wal_sync_walwriter : Total number of WAL data synced to the disk by 
walwrite
I think it's useful for tuning "synchronous_commit" and "commit_delay" 
for query executions.
If the number of WAL is flushed is high, users can know  
"synchronous_commit" is useful for the workload.
Also, it's useful for tuning "wal_writer_delay" and  
"wal_writer_flush_after" for wal writer.
If the number is high, users can change the parameter for performance.
4. Wait time when WAL is flushed
- wal_write_time : Total amount of time that has been spent in the 
portion of
                                  WAL data was written to disk by backend 
and walwriter, in milliseconds
                                 (if track-io-timing is enabled, 
otherwise zero.)
- wal_sync_time : Total amount of time that has been spent in the 
portion of
                                  WAL data was synced to disk by backend 
and walwriter, in milliseconds
                                 (if track-io-timing is enabled, 
otherwise zero.)
If the time becomes much higher, users can detect the possibility of 
disk failure.
Since users can see how much flush time occupies of the query execution 
time,
it may lead to query tuning and so on.
Best Regards,
-- 
Masahiro Ikeda
NTT DATA CORPORATION
| Attachment | Content-Type | Size | 
|---|---|---|
| 0001_add_statistics_to_pg_stat_wal_view.patch | text/x-diff | 14.4 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Peter Geoghegan | 2020-10-20 02:37:16 | Re: Deleting older versions in unique indexes to avoid page splits | 
| Previous Message | tsunakawa.takay@fujitsu.com | 2020-10-20 02:30:57 | RE: [POC] Fast COPY FROM command for the table with foreign partitions |