From: | Lukas Fittl <lukas(at)fittl(dot)com> |
---|---|
To: | Jelte Fennema <me(at)jeltef(dot)nl> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: Sampling-based timing for EXPLAIN ANALYZE |
Date: | 2023-01-17 10:50:40 |
Message-ID: | CAP53Pkw97c7q-8e6BBab4aXS=1ROMtKpnHEeK0RDnPx7yiirog@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jan 6, 2023 at 1:19 AM Jelte Fennema <me(at)jeltef(dot)nl> wrote:
> Nice addition! And the code looks pretty straight forward.
>
Thanks for reviewing!
The current patch triggers warnings:
> https://cirrus-ci.com/task/6016013976731648 Looks like you need to add
> void as the argument.
>
Fixed in v2 attached. This also adds a simple regression test, as well as
fixes the parallel working handling.
Do you have some performance comparison between TIMING ON and TIMING
> SAMPLING?
>
Here are some benchmarks of auto_explain overhead on my ARM-based M1
Macbook for the following query run with pgbench on a scale factor 100 data
set:
SELECT COUNT(*) FROM pgbench_branches JOIN pgbench_accounts USING (bid)
JOIN pgbench_tellers USING (bid) WHERE bid = 42;
(the motivation is to use a query that is more complex than the standard
pgbench select-only test query)
avg latency (best of 3), -T 300, -c 4, -s 100, shared_buffers 2GB, fsync
off, max_parallel_workers_per_gather 0:
master, log_timing = off: 871 ms (878 / 877 / 871)
patch, log_timing = off: 869 ms (882 / 880 / 869)
patch, log_timing = on: 890 ms (917 / 930 / 890)
patch, log_timing = sampling, samplefreq = 1000: 869 ms (887 / 869 / 894)
Additionally, here is Andres' benchmark from [1], with the sampling option
added:
% psql -Xc 'DROP TABLE IF EXISTS t; CREATE TABLE t AS SELECT * FROM
generate_series(1, 100000) g(i);' postgres && pgbench -n -r -t 100 -f
<(echo -e "SELECT COUNT(*) FROM t;EXPLAIN (ANALYZE, TIMING OFF) SELECT
COUNT(*) FROM t;EXPLAIN (ANALYZE, TIMING SAMPLING) SELECT COUNT(*) FROM
t;EXPLAIN (ANALYZE, TIMING ON) SELECT COUNT(*) FROM t;") postgres |grep '^ '
DROP TABLE
SELECT 100000
3.507 0 SELECT COUNT(*) FROM t;
3.476 0 EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*)
FROM t;
3.576 0 EXPLAIN (ANALYZE, TIMING SAMPLING) SELECT
COUNT(*) FROM t;
5.096 0 EXPLAIN (ANALYZE, TIMING ON) SELECT COUNT(*)
FROM t;
My pg_test_timing data for reference:
% pg_test_timing
Testing timing overhead for 3 seconds.
Per loop time including overhead: 23.65 ns
Histogram of timing durations:
< us % of total count
1 97.64472 123876325
2 2.35421 2986658
4 0.00022 277
8 0.00016 202
16 0.00064 815
32 0.00005 64
In InstrStartSampling there's logic to increase/decrease the frequency of
> an already existing timer. It's not clear to me when this can occur. I'd
> expect sampling frequency to remain constant throughout an explain plan. If
> it's indeed needed, I think a code comment would be useful to explain why
> this edge case is necessary.
>
Clarified in a code comment in v2. This is needed for handling nested
statements which could have different sampling frequencies for each nesting
level, i.e. a function might want to sample it's queries at a higher
frequency than its caller.
Thanks,
Lukas
[1] https://postgr.es/m/20230116213913.4oseovlzvc2674z7%40awork3.anarazel.de
--
Lukas Fittl
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Add-TIMING-SAMPLING-option-for-EXPLAIN-ANALYZE.patch | application/octet-stream | 37.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Takamichi Osumi (Fujitsu) | 2023-01-17 11:00:11 | RE: Time delayed LR (WAS Re: logical replication restrictions) |
Previous Message | Drouvot, Bertrand | 2023-01-17 10:48:19 | Helper functions for wait_for_catchup() in Cluster.pm |