Re: Questions About TODO: Issuing NOTICEs for row count differences in EXPLAIN ANALYZE

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: KAZAR Ayoub <ma_kazar(at)esi(dot)dz>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Questions About TODO: Issuing NOTICEs for row count differences in EXPLAIN ANALYZE
Date: 2024-10-29 00:39:13
Message-ID: CAApHDvo+Rwxd6kR7bxeDG=5zTAi9q7tYdaX6VvbKy08thXFEtA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 29 Oct 2024 at 12:43, KAZAR Ayoub <ma_kazar(at)esi(dot)dz> wrote:
> I am currently looking into the following TODO item, "Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage."
> What's the current status of this TODO, and is there any prior discussion or rationale behind it ?

The status is that we don't have anything like that and I don't recall
it being mentioned that anyone is working on it. Normally these items
only get added when there has been some discussion about it, but
normally that discussion gets linked along with the todo item. Clearly
that's not been done in this case. I imagine the rationale is to make
it more clear when the estimates are off from the actual execution.

You might need to do some digging into the history of who added that
todo item and see if you can find any relevant discussion on hackers
around the time it was added.

> Specifically:
> - How can we come up with a percentage for row count differences (fixed value or dynamic)?
> - Should we consider a configurable param for users ?
> - Is there anything extra to consider ?

The biggest thing to consider is if we'd want anything like this in
core PostgreSQL. It feels more like something additional tooling such
as explain.depesz.com would concern themselves with. I could also
imagine features along those lines in some sort of statistics advisor
contrib module. My personal view is that it would feel like a very
misplaced feature if we were to add only what the todo item describes
into core PostgreSQL. In any case, adding a NOTICE for this seems
horrible. Doing it that way means the information about the row
estimate's accuracy is very disconnected from the EXPLAIN line that it
belongs to.

Additionally, there are cases where we expect the actual and estimates
to be off, even with perfect statistics. Consider the Seq Scan in the
following:

postgres=# explain analyze select * from pg_class limit 1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..0.04 rows=1 width=273) (actual time=0.035..0.036
rows=1 loops=1)
-> Seq Scan on pg_class (cost=0.00..18.15 rows=415 width=273)
(actual time=0.033..0.033 rows=1 loops=1)

I don't think we'd want false alarms for cases like that.

David

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2024-10-29 00:50:00 freespace.c modifies buffer without any locks
Previous Message David G. Johnston 2024-10-29 00:29:28 Re: EXPLAIN IndexOnlyScan shows disabled when enable_indexonlyscan=on