Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: RKN Sai Krishna <rknsaiforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, marvin_liang(at)qq(dot)com, actyzhang(at)outlook(dot)com
Subject: Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
Date: 2022-04-04 03:45:42
Message-ID: CALj2ACUkZQJ=Q_aXDMe1f14j0YefYQQe9O7w_P0YymkSELMTCw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Apr 2, 2022 at 5:05 AM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>
> On Sat, 2022-03-26 at 10:31 +0530, Bharath Rupireddy wrote:
> > Attaching v16 patch-set, only change in v16-0002-pg_walinspect.patch,
> > others remain the same.
>
> I looked more closely at this patch.

Thanks Jeff for reviewing this.

> * It seems that pg_get_wal_record() is not returning the correct raw
> data for the record. I tested with pg_logical_emit_message, and the
> message isn't there. pg_get_wal_record_info() uses XLogRecordGetData(),
> which seems closer to what I expect.
>
> * I'm a little unclear on the purpose of pg_get_wal_record(). What does
> it offer that the other functions don't?

My intention is to return the overall undecoded WAL record [5] i.e.
the data starting from XLogReadRecord's output [6] till length
XLogRecGetTotalLen(xlogreader);. Please see [7], where Andres agreed
to have this function, I also mentioned a possible use-case there.

pg_get_wal_record_info returns the main data of the WAL record
(xl_heap_delete, xl_heap_insert, xl_heap_multi_insert, xl_heap_update
and so on).

> * I don't think we need the stats at all. We can run GROUP BY queries
> on the results of pg_get_wal_records_info().

As identified in [1], SQL-version of stats function is way slower in
normal cases, hence it was agreed (by Andres, Kyotaro-san and myself)
to have a C-function for stats.

> * Include the xlinfo portion of the wal record in addition to the rmgr,
> like pg_waldump --stats=record shows. That way we can GROUP BY that as
> well.
>
> 5. For pg_get_wal_record and pg_get_wal_records, also return the xlinfo
> using rm_identify() if available.

Yes, that's already part of the description column (much like
pg_waldump does) and the users can still do it with GROUP BY and
HAVING clauses, see [4].

> * I don't think we need the changes to xlogutils.c. You calculate the
> end pointer based on the flush pointer, anyway, so we should never need
> to wait (and when I take it out, the tests still pass).
>
> 6. Remove changes to xlogutils.

As mentioned in [1], read_local_xlog_page_no_wait required because the
functions can still wait in read_local_xlog_page for WAL while finding
the first valid record after the given input LSN (the use case is
simple - just provide input LSN closer to server's current flush LSN,
may be off by 3 or 4 bytes).

Also, I tried to keep the changes minimal with the
read_local_xlog_page_guts static function. IMO, that shouldn't be a
problem.

> I think we can radically simplify it without losing functionality,
> unless I'm missing something.
>
> 1. Eliminate pg_get_wal_record(),
> pg_get_wal_records_info_till_end_of_wal(), pg_get_wal_stats(),
> pg_get_wal_stats_till_end_of_wal().
>
> 4. For pg_get_wal_records, if end_lsn is NULL, read until the end of
> WAL.

It's pretty much clear to the users with till_end_of_wal functions
instead of cooking many things into the same functions with default
values for input LSNs as NULL which also requires the functions to be
"CALLED ON NULL INPUT" types which isn't good. This was also suggested
by Andres, see [2], and I agree with it.

> 2. Rename pg_get_wal_record_info -> pg_get_wal_record
>
> 3. Rename pg_get_wal_records_info -> pg_get_wal_records

As these functions aren't returning the WAL record data, but info
about it (decoded data), I would like to retain the function names
as-is.

> 8. With only two functions in the API, it may even make sense to just
> make it a part of postgres rather than a separate module.

As said above, I would like to have till_end_of_wal versions. Firstly,
pg_walinspect functions may not be needed by everyone, the extension
provides a way for the users to install if required. Also, many
hackers have suggested new functions [3], but, right now the idea is
to get pg_walinspect onboard with simple-yet-useful functions and then
think of extending it with new functions later.

[1] https://www.postgresql.org/message-id/CALj2ACUvU2fGLw7keEpxZhGAoMQ9vrCPX-13hexQPoR%2BQRbuOw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/20220322180006.hgbsoldgbljyrcm7%40alap3.anarazel.de
[3] There are many functions we can add to pg_walinspect - functions
with wait mode for future WAL, WAL parsing, function to return all the
WAL record info/stats given a WAL file name, functions to return WAL
info/stats from historic timelines as well, function to see if the
given WAL file is valid and so on.
[4]
postgres=# select count(resource_manager), description, from
pg_get_wal_records_info('0/14E0568', '0/14F2568') group by description
having description like '%INSERT_LEAF%';
count | description
-------+---------------------
7 | INSERT_LEAF off 108
1 | INSERT_LEAF off 111
1 | INSERT_LEAF off 135
1 | INSERT_LEAF off 142
3 | INSERT_LEAF off 143
1 | INSERT_LEAF off 144
1 | INSERT_LEAF off 145
1 | INSERT_LEAF off 146
1 | INSERT_LEAF off 274
1 | INSERT_LEAF off 405
(10 rows)

[5]
/*
* The overall layout of an XLOG record is:
* Fixed-size header (XLogRecord struct)
* XLogRecordBlockHeader struct
* XLogRecordBlockHeader struct
* ...
* XLogRecordDataHeader[Short|Long] struct
* block data
* block data
* ...
* main data

[6]
XLogRecord *
XLogReadRecord(XLogReaderState *state, char **errormsg)
{
decoded = XLogNextRecord(state, errormsg);
if (decoded)
{
/*
* This function returns a pointer to the record's header, not the
* actual decoded record. The caller will access the decoded record
* through the XLogRecGetXXX() macros, which reach the decoded
* recorded as xlogreader->record.
*/
Assert(state->record == decoded);
return &decoded->header;
}

[7] https://www.postgresql.org/message-id/20220322180006.hgbsoldgbljyrcm7%40alap3.anarazel.de

Regards,
Bharath Rupireddy.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2022-04-04 04:06:40 Re: Defer selection of asynchronous subplans until the executor initialization stage
Previous Message Amit Kapila 2022-04-04 03:32:37 Re: Skipping logical replication transactions on subscriber side