From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: explain analyze rows=%.0f
Date: 2022-07-07 20:21:37
Message-ID: CA+TgmoZv8DvmeQHQAX+rkn68id2N+DoUhFZB36d-_w1tyAeqeA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 7, 2022 at 1:53 PM Greg Stark <stark(at)mit(dot)edu> wrote:
> > - -> Parallel Seq Scan on tenk1 (actual rows=1960 loops=50)
> > + -> Parallel Seq Scan on tenk1 (actual rows=1960.00 loops=50)
>
> At the not inconsiderable risk of bike-shedding....
>
> I'm wondering if printing something like 0.00 will be somewhat
> deceptive when the real value is non-zero but less than 1 row per 200
> loops. I wonder if the number of decimal places should be calculated
> to produce a minimum of one non-zero digit for non-zero values.
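
A minimal sketch of that idea (plain C with hypothetical names --
not the actual explain.c code) might look like this:

#include <math.h>
#include <stdio.h>

/*
 * Hypothetical illustration of the suggestion above, not what
 * explain.c does today: widen the precision until any non-zero
 * per-loop average shows at least one non-zero digit.
 */
static void
print_rows(double ntuples, double nloops)
{
    double  avg = ntuples / nloops;
    int     prec = 2;       /* never fewer than two decimal places */

    while (avg > 0.0 && avg < 0.5 * pow(10.0, -prec) && prec < 10)
        prec++;

    printf("rows=%.*f\n", prec, avg);
}

int
main(void)
{
    print_rows(1.0, 300.0);     /* prints rows=0.003, not rows=0.00 */
    print_rows(0.0, 50.0);      /* prints rows=0.00 */
    print_rows(98000.0, 50.0);  /* prints rows=1960.00 */
    return 0;
}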

I mean, what I really want here, if I'm honest, is for the system not
to divide the number of rows by the loop count. And it sort of sounds
like maybe that's what you want, too. You want to know whether the row
count is actually zero, not whether it's close to zero when you divide
it by some number that might be gigantic.

Parallel query's treatment of this topic has come in for some
criticism, but I don't know what else it could really do: there could
be any number of loops in each backend, and it need not be the same
across all backends, so all it can do is add up the loop counts just
like it adds up the row counts and times. And if we just printed out
those totals, the result would be understandable by everyone. But we
insist on dividing them by the loop count, and then things get really
obscure. Consider this example, which does not involve parallel query:

Nested Loop (actual time=TIME FOR THIS AND ALL CHILDREN rows=THE REAL ROW COUNT loops=1)
  -> Seq Scan on something (actual time=THE TIME IT REALLY TOOK rows=THE REAL ROW COUNT loops=1)
  -> Index Scan using someidx on somethingelse (actual time=NOT REALLY HOW LONG IT TOOK rows=NOT REALLY HOW MANY ROWS WE GOT loops=HUGE NUMBER)

If I'm looking at this plan and trying to find out what's gone wrong,
I want to know how much time got spent in the nested loop, how much
time got spent in the Seq Scan, and how much time got spent in the
Index Scan. It's easy to figure out how much time got spent in the Seq
Scan, but to find out how much time got spent in the Index Scan, I
have to multiply the time by the loop count. Then, I have to add that
number to the time spent in the Seq Scan and subtract the sum from the
time reported for the Nested Loop to find the time spent on the Nested
Loop itself. This is quite a lot of computation, especially if the plan
involves a dozen or two different nested loops, and if we didn't
insist on dividing the time by the loop count, it would be MUCH EASIER
to figure out whether the time spent in the Index Scan is a
significant percentage of the total time or not.
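
To make that concrete with made-up numbers, the plan above might read:

Nested Loop (actual time=0.040..950.000 rows=10000 loops=1)
  -> Seq Scan on something (actual time=0.010..100.000 rows=10000 loops=1)
  -> Index Scan using someidx on somethingelse (actual time=0.002..0.080 rows=1 loops=10000)

Here the Index Scan really accounts for about 0.080 ms * 10000 = 800 ms,
and the Nested Loop node itself for about 950 - 100 - 800 = 50 ms.
Neither total appears anywhere in the output; you have to derive both
by hand.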

And likewise, if you're trying to understand the row count for the
nested loop, it would be a heck of a lot simpler if you could see the
*raw* row count for the index scan. It's unclear to me what value
there ever is in knowing that the number of rows per iteration was
about 0 or about 1 or about 2. The only thing I'm ever going to do
with the row count that gets printed here is multiply it by the loop
count and then try to figure out how much precision I've lost because
of limits on the number of decimal places. Right now basically all of
it is lost, because nearly every such case ends up with the index scan
showing rows=1, so even just adding two decimal places will help a lot.
But I'm
still just going to be reverse engineering what I really want to know,
which is the original number, from what the system gives me, which is
a needlessly-obfuscated version of that value.
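
For example, with invented numbers: rows=1 loops=24213 only tells me
that the true total is somewhere between about 12107 and 36319 once
you allow for rounding, whereas rows=1.00 with the same loop count
pins it down to within about half a percent.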

Grumble, grumble. It's sad that it's been 13 years and we haven't done
anything about this.

--
Robert Haas
EDB: http://www.enterprisedb.com
