From: | Matheus Alcantara <matheusssilv97(at)gmail(dot)com> |
---|---|
To: | Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com> |
Cc: | Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, vignesh C <vignesh21(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: explain analyze rows=%.0f |
Date: | 2025-02-10 20:43:22 |
Message-ID: | CAFY6G8e_o2AU6iaMd1hQjjBBcgoGOPRRarP=yNhTPZ6AX+GXkQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Em seg., 10 de fev. de 2025 às 13:38, Ilia Evdokimov
<ilya(dot)evdokimov(at)tantorlabs(dot)com> escreveu:
> > -- v7-0001-Clarify-display-of-rows-and-loops-as-decimal-fraction.patch
> >> + if (nloops > 1 && planstate->instrument->ntuples < nloops)
> >> + appendStringInfo(es->str," rows=%.2f loops=%.2f)", rows, nloops);
> >>
> > Sorry, but why do the 'loops' need to be changed? IIUC the issue is
> > only with the
> > rows field? When testing this patch I got some results similar with this:
> > Index Scan using t_a_idx on t (cost=0.14..0.28 rows=1 width=8)
> > (actual time=0.049..0.049 rows=0.50 loops=2.00)
>
>
> The only reason I added this was to make it clear to the user that loops
> > 1, along with the 'rows' value. If no one finds it useful and it only
> raises more questions, I can remove it.
>
Ok, I get it. IMHO the current behaviour of using %.0f for 'loops' would be
better, but let's see if anyone else has opinions about it.
> >> When the total number of returned tuples is less than the number of
> >> loops currently shows 'rows = 0'. This can mislead users into thinking
> >> that no rows were returned at all, even though some might have appeared
> >> occasionally.
> >>
> > I think that this can happen when the returned rows and the loops are small
> > enough to result in a 'row' value like 0.00045? I'm not sure if we have
> > "bigger" values (e.g 1074(ntuples) / 117(nloops) which would result in 9.17
> > rows) this would also be true, what do you think? If you could provide
> > an example of this case would be great!
>
>
> Based on what was discussed earlier in the thread, there are cases with
> large loops [0]. However, I believe it's better not to display average
> rows with excessively long digits or in scientific notation. And, of
> course, I agree with you regarding small values. I think we should also
> add a check to ensure that the total rows is actually greater than zero.
> When the total rows is zero, we could simply display it as an integer
> without decimals. It could help users average rows is very small but not
> zero. What do you think about this approach?
>
Yeah, I agree with you about the long digits. My question is more about why do
we need the planstate->instrument->ntuples < nloops check? I tried to remove
this check and I got a lot of EXPLAIN output that shows 'rows' values with .00,
so I'm just trying to understand the reason. From what I've understood about
this thread is that just avoiding .00 decimals of 'rows' values that could be
just integers would be enough, is that right or I'm missing something here? I'm
just worried if we could have a scenario where nloops > 1 &&
planstate->instrument->ntuples < nloops which would make the 'rows' not be
formatted correctly.
>
> >
> > - executing the index scans on <literal>tenk2</literal>.
> > + executing the index scans on <literal>tenk2</literal>. If a subplan node
> > + is executed multiple times and the average number of rows is less than one,
> > + the rows and <literal>loops</literal> values are shown as a
> > decimal fraction
> > + (with two digits after the decimal point) to indicate that some rows
> > + were actually processed rather than simply rounding down to zero.
> >
> > * I think that it would be good to mention what a 'row' value in
> > decimal means. For
> > example, if its says "0.1 rows" the user should assume that typically 0 rows
> > will be returned but sometimes it can return 1 or more.
> >
> > * There are more spaces than necessary before "If a subplan node ..."
> >
> > * Maybe wrap 'rows' with <literal> </literal>?
> >
>
> I agree with the last two points. As for the first one—maybe we could
> simply state that the average rows value can be decimal, especially for
> very small values?
>
I'm just not sure about the "small values"; the 'rows' in decimal will only
happen with small values? What would be a "small value" in this context? My main
point here is more that I think that it would be good to mention *why* the
'rows' can be decimal, not just describe that it could be decimal.
--
Matheus Alcantara
From | Date | Subject | |
---|---|---|---|
Next Message | Ilia Evdokimov | 2025-02-10 20:51:29 | Re: pull-up subquery if JOIN-ON contains refs to upper-query |
Previous Message | Sami Imseih | 2025-02-10 20:14:09 | Re: [PATCH] Optionally record Plan IDs to track plan changes for a query |