Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: Zhihong Yu <zyu(at)yugabyte(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date: 2021-03-30 16:37:43
Message-ID: CALNJ-vT7fLuHfjkN5h5jusxoKHci6ch6+E_S1zKbNM1JiKJCnQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,
In paraminfo_get_equal_hashops(),

+ /* Reject if there are any volatile functions */
+ if (contain_volatile_functions(expr))
+ {

You can move the above code to just ahead of:

+ if (IsA(expr, Var))
+ var_relids = bms_make_singleton(((Var *) expr)->varno);

This way, when we return early, var_relids doesn't need to be populated.

Cheers

On Tue, Mar 30, 2021 at 4:42 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> On Mon, 29 Mar 2021 at 15:56, Zhihong Yu <zyu(at)yugabyte(dot)com> wrote:
> > For show_resultcache_info()
> >
> > + if (rcstate->shared_info != NULL)
> > + {
> >
> > The negated condition can be used with a return. This way, the loop can
> be unindented.
>
> OK. I change that.
>
> > + * ResultCache nodes are intended to sit above a parameterized node in
> the
> > + * plan tree in order to cache results from them.
> >
> > Since the parameterized node is singular, it would be nice if 'them' can
> be expanded to refer to the source of result cache.
>
> I've done a bit of rewording in that paragraph.
>
> > + rcstate->mem_used -= freed_mem;
> >
> > Should there be assertion that after the subtraction, mem_used stays
> non-negative ?
>
> I'm not sure. I ended up adding one and also adjusting the #ifdef in
> remove_cache_entry() which had some code to validate the memory
> accounting so that it compiles when USE_ASSERT_CHECKING is defined.
> I'm unsure if that's a bit too expensive to enable during debugs but I
> didn't really want to leave the code in there unless it's going to get
> some exercise on the buildfarm.
>
> > + if (found && entry->complete)
> > + {
> > + node->stats.cache_hits += 1; /* stats update */
> >
> > Once inside the if block, we would return.
>
> OK change.
>
> > + else
> > + {
> > The else block can be unindented (dropping else keyword).
>
> changed.
>
> > + * return 1 row. XXX is this worth the check?
> > + */
> > + if (unlikely(entry->complete))
> >
> > Since the check is on a flag (with minimal overhead), it seems the check
> can be kept, with the question removed.
>
> I changed the comment, but I did leave a mention that I'm still not
> sure if it should be an Assert() or an elog.
>
> The attached patch is an updated version of the Result Cache patch
> containing the changes for the things you highlighted plus a few other
> things.
>
> I pushed the change to simplehash.h and the estimate_num_groups()
> change earlier, so only 1 patch remaining.
>
> Also, I noticed the CFBof found another unstable parallel regression
> test. This was due to some code in show_resultcache_info() which
> skipped parallel workers that appeared to not help out. It looks like
> on my machine the worker never got a chance to do anything, but on one
> of the CFbot's machines, it did. I ended up changing the EXPLAIN
> output so that it shows the cache statistics regardless of if the
> worker helped or not.
>
> David
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2021-03-30 16:50:28 Re: truncating timestamps on arbitrary intervals
Previous Message Daniil Zakhlystov 2021-03-30 16:10:47 Re: libpq compression