From: | Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Adding NOTICE for differences between estimated and actual rows |
Date: | 2024-09-05 16:05:06 |
Message-ID: | aaf18433-82d3-417b-a1f5-12e62fedffb8@tantorlabs.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi hackers,
It's quite common that poor query performance can be attributed to
inaccurate row estimations by the planner. To make it easier to detect
these discrepancies, rather than scrutinizing the estimates manually, it
would be helpful to output a dedicated |NOTICE| message.
In the current patch, I've introduced a new GUC parameter called
'estimated_rows_scale_factor'. If the ratio of the estimated rows to the
actual rows is less than this factor, or if the estimated rows
significantly exceed the actual rows (when the ratio is greater than
this factor), a NOTICE message will be printed. The message reads:
"Estimated rows (%.0f) less(greater) than actual rows (%.0f)."
Here is an example:
CREATE TABLE t(a int, b int);
INSERT INTO t SELECT x/10, x FROM generate_series(1,10000000) g(x);
ANALYZE;
SET estimated_rows_scale_factor = 0.9;
EXPLAIN ANALYZE SELECT * FROM t WHERE a > 10 AND b <= 200;
NOTICE: Estimated rows (1000) greater than actual rows (91).
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..107848.00 rows=1000 width=8) (actual
time=0.446..122.476 rows=91 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on t (cost=0.00..106748.00 rows=417 width=8)
(actual time=77.657..118.000 rows=30 loops=3)
Filter: ((a > 10) AND (b <= 200))
Rows Removed by Filter: 3333303
Planning Time: 0.097 ms
Execution Time: 122.502 ms
(8 rows)
EXPLAIN ANALYZE SELECT * FROM t WHERE a = 10 AND b <= 200;
NOTICE: Estimated rows (1) less than actual rows (10).
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..107748.10 rows=1 width=8) (actual
time=0.280..104.752 rows=10 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on t (cost=0.00..106748.00 rows=1 width=8)
(actual time=66.493..101.102 rows=3 loops=3)
Filter: ((b <= 200) AND (a = 10))
Rows Removed by Filter: 3333330
Planning Time: 0.129 ms
Execution Time: 104.768 ms
(8 rows)
If you have any suggestions regarding the wording of the message, its
placement, or if you'd like to see a different criterion used, I would
greatly appreciate your feedback.
Looking forward to your thoughts and suggestions.
--
Regards,
Ilia Evdokimov,
Tantor Labs LCC.
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Add-message-when-estimated-rows-differ-from-actual.patch | text/x-patch | 3.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Maxim Orlov | 2024-09-05 16:08:58 | Re: Test 041_checkpoint_at_promote.pl faild in installcheck due to missing injection_points |
Previous Message | Noah Misch | 2024-09-05 15:54:35 | Re: Use read streams in pg_visibility |