From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
Cc: | stepan rutz <stepan(dot)rutz(at)gmx(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Detoasting optionally to make Explain-Analyze less misleading |
Date: | 2023-11-02 22:24:36 |
Message-ID: | 14746b40-16a8-b53e-18a6-f2872e696e34@enterprisedb.com |
Lists: | pgsql-hackers |
On 11/2/23 22:33, Matthias van de Meent wrote:
> On Thu, 2 Nov 2023 at 22:25, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>>
>>
>> On 11/2/23 21:02, Matthias van de Meent wrote:
>>> On Thu, 2 Nov 2023 at 20:32, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>>> On 11/2/23 20:09, stepan rutz wrote:
>>>>> db1=# explain (analyze, serialize) select * from test;
>>>>> QUERY PLAN
>>>>> ---------------------------------------------------------------------------------------------------
>>>>> Seq Scan on test (cost=0.00..22.00 rows=1200 width=40) (actual
>>>>> time=0.023..0.027 rows=1 loops=1)
>>>>> Planning Time: 0.077 ms
>>>>> Execution Time: 303.281 ms
>>>>> Serialized Bytes: 78888953 Bytes. Mode Text. Bandwidth 248.068 MB/sec
>>>> [...]
>>>> BTW if you really want to print amount of memory, maybe print it in
>>>> kilobytes, like every other place in explain.c?
>>>
>>> Isn't node width in bytes, or is it an opaque value not to be
>>> interpreted by users? I've never really investigated that part of
>>> Postgres' explain output...
>>>
>>
>> Right, "width=" is always in bytes. But fields like the amount of sorted
>> data are in kB, and this seems closer to that.
>>
>>>> Also, explain generally
>>>> prints stuff in "key: value" style (in text format).
>>>
>>> That'd be key: metrickey=metricvalue for expanded values like those in
>>> plan nodes and the buffer usage, no?
>>>
>>
>> Possibly. But the proposed output does neither. Also, it starts with
>> "Serialized Bytes" but then prints info about bandwidth.
>>
>>
>>>>> Serialized Bytes: 78888953 Bytes. Mode Text. Bandwidth 248.068 MB/sec
>>>
>>> I was thinking more along the lines of something like this:
>>>
>>> [...]
>>> Execution Time: xxx ms
>>> Serialization: time=yyy.yyy (in ms) size=yyy (in KiB, or B) mode=text
>>> (or binary)
>>>
>>> This is significantly different from your output, as it doesn't hide
>>> the measured time behind a lossy calculation of bandwidth, but gives
>>> the measured data to the user; allowing them to derive their own
>>> precise bandwidth if they're so inclined.
>>>
>>
>> Might work. I'm still not convinced we need to include the mode, or that
>> the size is that interesting/useful, though.
>
> I'd say size is interesting for systems where network bandwidth is
> constrained, but CPU isn't. We currently only show estimated widths &
> the accurate number of tuples returned, but that's not an accurate
> explanation of why your 30-row 3 GB result set took 1h to transmit on a
> 10 Mbit/s line - that is only explained by the bandwidth of your
> connection and the size of the dataset. As we can measure the size of
> the returned serialized dataset here, I think it's in the interest of
> debuggability to also present it to the user. Sadly, we don't have
> good measures of bandwidth without sending that data across, so that's
> the only metric that we can't show here, but total query data size is
> definitely something that I'd be interested in here.

Yeah, I agree with that.

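To illustrate why printing the raw time and size is enough, here is a rough sketch of the arithmetic a user could do themselves. The helper names are made up for illustration and are not part of the patch. It also assumes the time to divide by is the 303.281 ms "Execution Time" from the quoted output; the patch may well measure serialization time separately.

```python
def bandwidth_mib_per_s(size_bytes: int, time_ms: float) -> float:
    """Throughput in MiB/s derived from a byte count and an elapsed time in ms."""
    seconds = time_ms / 1000.0
    return size_bytes / seconds / (1024 * 1024)


def transfer_seconds(size_bytes: int, link_bits_per_s: float) -> float:
    """Ideal transfer time over a link, ignoring protocol overhead and latency."""
    return size_bytes * 8 / link_bits_per_s


if __name__ == "__main__":
    # Numbers taken from the quoted EXPLAIN output above (assumed pairing).
    mib_s = bandwidth_mib_per_s(78888953, 303.281)
    print(f"derived bandwidth: {mib_s:.3f} MiB/s")

    # Matthias's example: a 3 GB (decimal) result set over a 10 Mbit/s line.
    secs = transfer_seconds(3_000_000_000, 10_000_000)
    print(f"ideal transfer time: {secs / 60:.0f} minutes")
```

With those inputs the first figure comes out at roughly 248 MiB/s, which suggests the "MB/sec" in the proposed output is really MiB/s. The second comes out at 40 minutes on an ideal link, so the "1h" above is in the right ballpark once real-world overhead is added.
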
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company