From: | Julien Rouhaud <rjuju123(at)gmail(dot)com> |
---|---|
To: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
Cc: | Kirill Bychik <kirill(dot)bychik(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, amit(dot)kapila16(at)gmail(dot)com |
Subject: | Re: WAL usage calculation patch |
Date: | 2020-03-29 12:19:44 |
Message-ID: | 20200329121944.GA79261@nol |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Mar 23, 2020 at 11:24:50PM +0900, Fujii Masao wrote:
>
> > Here are the comments for 0001 patch.
> >
> > + /*
> > + * Report a full page image constructed for the WAL record
> > + */
> > + pgWalUsage.wal_fp_records++;
> >
> > Isn't it better to use "fpw" or "fpi" for the variable name rather than
> > "fp" here? In other places, "fpw" and "fpi" are used for full page
> > writes/image.
Agreed, I went with fpw.
> > ISTM that this counter could be incorrect if XLogInsertRecord() determines to
> > calculate again whether FPI is necessary or not. No? IOW, this issue could
> > happen if XLogInsert() calls XLogRecordAssemble() multiple times in
> > its do-while loop. Isn't this problematic?
Yes probably. I also see while adding support for EXPLAIN/auto_explain that
the previous approach was incrementing both records and fpw_records, while it
should be only one of those for each record. I fixed this using the approach I
previously mentionned in [1] which seems to work just fine.
> > + long wal_bytes; /* size of wal records produced */
> >
> > Isn't it safer to use uint64 (i.e., XLogRecPtr) as the type of this variable
> > rather than long?
Yes indeed. I switched to uint64, and modified everything accordingly (and
changed pgss to output numeric as there's no other way to handle unsigned int8)
> > + shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, bufusage_space);
> >
> > bufusage_space should be walusage_space here?
Good catch, fixed.
> > /*
> > * Finish parallel execution. We wait for parallel workers to finish, and
> > * accumulate their buffer usage.
> > */
> >
> > There are some comments mentioning buffer usage, in execParallel.c.
> > For example, the top comment for ExecParallelFinish(), as the above.
> > These should be updated.
I went through all the file and quickly checked in other places, and I think I
fixed all required comments.
> Here are the comments for 0002 patch.
>
> + OUT wal_write_bytes int8,
> + OUT wal_write_records int8,
> + OUT wal_write_fp_records int8
>
> Isn't "write" part in the column names confusing because it's WAL
> *generated* (not written) by the statement?
Agreed, I simply dropped the "_write" part everywhere.
> +RETURNS SETOF record
> +AS 'MODULE_PATHNAME', 'pg_stat_statements_1_4'
> +LANGUAGE C STRICT VOLATILE;
>
> PARALLEL SAFE should be specified?
Indeed, fixed.
> +/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */
>
> ISTM it's good timing to have also pg_stat_statements--1.8.sql since
> the definition of pg_stat_statements() is changed. Thought?
As mentionned in other pgss thread, I think the general agreement is to never
provide full script anymore, so I didn't changed that.
> +-- CHECKPOINT before WAL tests to ensure test stability
> +CHECKPOINT;
>
> Is this true? I thought you added this because the number of FPI
> should be larger than zero in the subsequent test. No? But there
> seems no such test. I'm not excited about adding the test checking
> the number of FPI because it looks fragile, though...
It should ensure a FPW for each new block touch, but yes that's quite fragile.
Since I fixed the record / FPW record counters, I saw that this was actually
already broken as there was a mix of FPW and non-FPW, so I dropped the
checkpoint and just tested (wal_record + wal_fpw_record) instead.
> +UPDATE pgss_test SET b = '333' WHERE a = 3 \;
> +UPDATE pgss_test SET b = '444' WHERE a = 4 ;
>
> Could you tell me why several queries need to be run to test
> the WAL usage? Isn't running a few query enough for the test purpase?
As far as I can see it's used to test multiple scenario (single command /
multiple commands in or outside explicit transaction). It shouldn't add a lot
of overhead and since some commands are issues with "\;" it's also testing
proper query string isolation when multi-command query string is provided,
which doesn't seem like a bad idea. I didn't changed that but I'm not opposed
to remove some of the updates if needed.
Also, to answer Amit Kapila's comments about WAL records and parallel query, I
added support for both EXPLAIN and auto_explain (tab completion and
documentation are also updated), and using a simple table with an index, with
forced parallelism and no leader participation and concurrent update on the
same table, I could test that WAL usage is working as expected:
rjuju=# explain (analyze, wal, verbose) select * from t1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Gather (cost=0.00..8805.05 rows=100010 width=14) (actual time=8.695..47.592 rows=100010 loops=1)
Output: id, val
Workers Planned: 2
Workers Launched: 2
WAL: records=204 bytes=86198
-> Parallel Seq Scan on public.t1 (cost=0.00..8805.05 rows=50005 width=14) (actual time=0.056..29.112 rows=50005 loops
Output: id, val
WAL: records=204 bytes=86198
Worker 0: actual time=0.060..28.995 rows=49593 loops=1
WAL: records=105 bytes=44222
Worker 1: actual time=0.052..29.230 rows=50417 loops=1
WAL: records=99 bytes=41976
Planning Time: 0.038 ms
Execution Time: 53.957 ms
(14 rows)
and the same query when nothing end up being modified:
rjuju=# explain (analyze, wal, verbose) select * from t1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Gather (cost=0.00..8805.05 rows=100010 width=14) (actual time=9.413..48.187 rows=100010 loops=1)
Output: id, val
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on public.t1 (cost=0.00..8805.05 rows=50005 width=14) (actual time=0.033..24.697 rows=50005 loops
Output: id, val
Worker 0: actual time=0.028..24.786 rows=50447 loops=1
Worker 1: actual time=0.038..24.609 rows=49563 loops=1
Planning Time: 0.282 ms
Execution Time: 55.643 ms
(10 rows)
So it seems to me that WAL usage infrastructure for parallel query is working
just fine. I added the EXPLAIN/auto_explain in a separate commit just in case.
Attachment | Content-Type | Size |
---|---|---|
v7-0001-Add-infrastructure-to-track-WAL-usage.patch | text/plain | 15.0 KB |
v7-0002-Add-option-to-report-WAL-usage-in-EXPLAIN-and-aut.patch | text/plain | 11.3 KB |
v7-0003-Keep-track-of-WAL-usage-in-pg_stat_statements.patch | text/plain | 22.0 KB |
v7-0004-Expose-WAL-usage-counters-in-verbose-auto-vacuum-.patch | text/plain | 3.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Julien Rouhaud | 2020-03-29 12:31:41 | Re: WAL usage calculation patch |
Previous Message | Masahiko Sawada | 2020-03-29 11:44:24 | Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch) |