Re: A performance issue with Memoize

From: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Richard Guo <guofenglinux(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: A performance issue with Memoize
Date: 2023-10-26 04:07:35
Message-ID: 1d55df6d-92cb-4bbd-9a4c-0d05bb6bc91d@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 20/10/2023 17:40, Richard Guo wrote:
> I noticed $subject with the query below.
>
> set enable_memoize to off;
>
> explain (analyze, costs off)
> select * from tenk1 t1 left join lateral
>     (select t1.two as t1two, * from tenk1 t2 offset 0) s
> on t1.two = s.two;
>                                      QUERY PLAN
> ------------------------------------------------------------------------------------
>  Nested Loop Left Join (actual time=0.050..59578.053 rows=50000000 loops=1)
>    ->  Seq Scan on tenk1 t1 (actual time=0.027..2.703 rows=10000 loops=1)
>    ->  Subquery Scan on s (actual time=0.004..4.819 rows=5000 loops=10000)
>          Filter: (t1.two = s.two)
>          Rows Removed by Filter: 5000
>          ->  Seq Scan on tenk1 t2 (actual time=0.002..3.834 rows=10000
> loops=10000)
>  Planning Time: 0.666 ms
>  Execution Time: 60937.899 ms
> (8 rows)
>
> set enable_memoize to on;
>
> explain (analyze, costs off)
> select * from tenk1 t1 left join lateral
>     (select t1.two as t1two, * from tenk1 t2 offset 0) s
> on t1.two = s.two;
>                                         QUERY PLAN
> ------------------------------------------------------------------------------------------
>  Nested Loop Left Join (actual time=0.061..122684.607 rows=50000000
> loops=1)
>    ->  Seq Scan on tenk1 t1 (actual time=0.026..3.367 rows=10000 loops=1)
>    ->  Memoize (actual time=0.011..9.821 rows=5000 loops=10000)
>          Cache Key: t1.two, t1.two
>          Cache Mode: binary
>          Hits: 0  Misses: 10000  Evictions: 9999  Overflows: 0  Memory
> Usage: 1368kB
>          ->  Subquery Scan on s (actual time=0.008..5.188 rows=5000
> loops=10000)
>                Filter: (t1.two = s.two)
>                Rows Removed by Filter: 5000
>                ->  Seq Scan on tenk1 t2 (actual time=0.004..4.081
> rows=10000 loops=10000)
>  Planning Time: 0.607 ms
>  Execution Time: 124431.388 ms
> (12 rows)
>
> The execution time (best of 3) is 124431.388 VS 60937.899 with and
> without memoize.
>
> The Memoize runtime stats 'Hits: 0  Misses: 10000  Evictions: 9999'
> seems suspicious to me, so I've looked into it a little bit, and found
> that the MemoizeState's keyparamids and its outerPlan's chgParam are
> always different, and that makes us have to purge the entire cache each
> time we rescan the memoize node.
>
> But why are they always different?  Well, for the query above, we have
> two NestLoopParam nodes, one (with paramno 1) is created when we replace
> outer-relation Vars in the scan qual 't1.two = s.two', the other one
> (with paramno 0) is added from the subquery's subplan_params, which was
> created when we replaced uplevel vars with Param nodes for the subquery.
> That is to say, the chgParam would be {0, 1}.
>
> When it comes to replace outer-relation Vars in the memoize keys, the
> two 't1.two' Vars are both replaced with the NestLoopParam with paramno
> 1, because it is the first NLP we see in root->curOuterParams that is
> equal to the Vars in memoize keys.  That is to say, the memoize node's
> keyparamids is {1}.
> ...
> Any thoughts?

Do you've thought about the case, fixed with the commit 1db5667? As I
see, that bugfix still isn't covered by regression tests. Could your
approach of a PARAM_EXEC slot reusing break that case?

--
regards,
Andrei Lepikhov
Postgres Professional

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-10-26 04:42:12 Re: Open a streamed block for transactional messages during decoding
Previous Message David Rowley 2023-10-26 04:00:29 Re: Making aggregate deserialization (and WAL receive) functions slightly faster