Re: Add memory/disk usage for WindowAgg nodes in EXPLAIN

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: dgrowleyml(at)gmail(dot)com, orlovmg(at)gmail(dot)com, jian(dot)universality(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Add memory/disk usage for WindowAgg nodes in EXPLAIN
Date: 2024-09-16 11:11:17
Message-ID: CAExHW5uwOz8aySLQHb1VRmfccf0-JMydvcS2A9_O-j=Kduqkyw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Sep 14, 2024 at 1:42 PM Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>
> >> > or the case when the last usage fit in memory but an earlier
> >> > usage spilled to disk.
> >>
> >> In my understanding once tuplestore changes the storage type to disk,
> >> it never returns to the memory storage type in terms of
> >> tuplestore_get_stats. i.e. once state->usedDisk is set to true, it
> >> never goes back to false. So the test case is not necessary.
> >> David, am I correct?
> >
> > I understand that. I am requesting a testcase to test that same logic.
>
> Maybe something like this? In the example below there are 2
> partitions. the first one has 1998 rows and the second one has 2
> rows. Assuming that work_mem is 64kB, the first one does not fit the
> memory and spills to disk. The second partition fits memory. However as
> state->usedDisk remains true, explain shows "Storage: Disk".
>
> test=# explain (analyze,costs off) select sum(n) over(partition by m) from (SELECT n < 3 as m, n from generate_series(1,2000) a(n));
> n QUERY PLAN
>
> --------------------------------------------------------------------------------
> -------------
> WindowAgg (actual time=1.958..473328.589 rows=2000 loops=1)
> Storage: Disk Maximum Storage: 65kB
> -> Sort (actual time=1.008..1.277 rows=2000 loops=1)
> Sort Key: ((a.n < 3))
> Sort Method: external merge Disk: 48kB
> -> Function Scan on generate_series a (actual time=0.300..0.633 rows=2
> 000 loops=1)
> Planning Time: 0.069 ms
> Execution Time: 474515.476 ms
> (8 rows)

Thanks. This will do. Is there a way to force the larger partition to
be computed first? That way we definitely know that the last
computation was done when all the tuples in the tuplestore were in
memory.

--
Best Wishes,
Ashutosh Bapat

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-09-16 11:24:40 Re: Introduce XID age and inactive timeout based replication slot invalidation
Previous Message Jim Jones 2024-09-16 10:38:15 Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row