| From: | Richard Guo <guofenglinux(at)gmail(dot)com> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: A performance issue with Memoize | 
| Date: | 2024-01-26 02:57:58 | 
| Message-ID: | CAMbWs49uuws9xU3u7jDfVNHX9nMzWysf0hkPVsVux+KUWNcKzw@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Fri, Jan 26, 2024 at 2:32 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I'm fairly sure I thought it wouldn't matter because of the Param
> de-duplication done in paramassign.c.  However, Richard's example
> shows that's not so, because process_subquery_nestloop_params is
> picky about the param ID assigned to a particular Var while
> replace_nestloop_params is not.  So flipping the order makes sense.
> I'd change the comment though, maybe to
>
>     /*
>      * Replace any outer-relation variables with nestloop params.
>      *
>      * We must provide nestloop params for both lateral references of
>      * the subquery and outer vars in the scan_clauses.  It's better
>      * to assign the former first, because that code path requires
>      * specific param IDs, while replace_nestloop_params can adapt
>      * to the IDs assigned by process_subquery_nestloop_params.
>      * This avoids possibly duplicating nestloop params when the
>      * same Var is needed for both reasons.
>      */
+1. It's much better.
> However ... it seems like we're not out of the woods yet.  Why
> is Richard's proposed test case still showing
>
> +         ->  Memoize (actual rows=5000 loops=N)
> +               Cache Key: t1.two, t1.two
>
> Seems like there is missing de-duplication logic, or something.
When we collect the cache keys in paraminfo_get_equal_hashops() we
search param_info's ppi_clauses as well as innerrel's lateral_vars for
outer expressions.  We do not perform de-duplication on the collected
outer expressions there.  In my proposed test case, the same Var
't1.two' appears both in the param_info->ppi_clauses and in the
innerrel->lateral_vars, so we see two identical cache keys in the plan.
I noticed this before and wondered if we should do de-duplication on the
cache keys, but somehow I did not chase this to the ground.
Thanks
Richard
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Jeff Davis | 2024-01-26 03:01:52 | Re: Improve WALRead() to suck data directly from WAL buffers when possible | 
| Previous Message | vignesh C | 2024-01-26 02:55:29 | Re: gai_strerror() is not thread-safe on Windows |