Quick Links

Re: Improve eviction algorithm in ReorderBuffer

From:	Peter Smith <smithpb2250(at)gmail(dot)com>
To:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc:	Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Shubham Khanna <khannashubham1197(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Improve eviction algorithm in ReorderBuffer
Date:	2024-03-08 03:58:07
Message-ID:	CAHut+PvGNQnosKyxpVW24bNhAppJfgKiGSLbinVGYA-BqY1+AA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Thu, Mar 7, 2024 at 2:16 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Mar 5, 2024 at 3:28 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >

> > 4a.
> > The comment in simplehash.h says
> > * The following parameters are only relevant when SH_DEFINE is defined:
> > * - SH_KEY - ...
> > * - SH_EQUAL(table, a, b) - ...
> > * - SH_HASH_KEY(table, key) - ...
> > * - SH_STORE_HASH - ...
> > * - SH_GET_HASH(tb, a) - ...
> >
> > So maybe it is nicer to reorder the #defines in that same order?
> >
> > SUGGESTION:
> > +#define SH_PREFIX bh_nodeidx
> > +#define SH_ELEMENT_TYPE bh_nodeidx_entry
> > +#define SH_KEY_TYPE bh_node_type
> > +#define SH_SCOPE extern
> > +#ifdef FRONTEND
> > +#define SH_RAW_ALLOCATOR pg_malloc0
> > +#endif
> >
> > +#define SH_DEFINE
> > +#define SH_KEY key
> > +#define SH_EQUAL(tb, a, b) (memcmp(&a, &b, sizeof(bh_node_type)) == 0)
> > +#define SH_HASH_KEY(tb, key) \
> > + hash_bytes((const unsigned char *) &key, sizeof(bh_node_type))
> > +#include "lib/simplehash.h"
>
> I'm really not sure it helps increase readability. For instance, for
> me it's readable if SH_DEFINE and SH_DECLARE come to the last before
> #include since it's more obvious whether we want to declare, define or
> both. Other simplehash.h users also do so.
>

OK.

> > 5.
> > + *
> > + * If 'indexed' is true, we create a hash table to track of each node's
> > + * index in the heap, enabling to perform some operations such as removing
> > + * the node from the heap.
> > */
> > binaryheap *
> > -binaryheap_allocate(int capacity, binaryheap_comparator compare, void *arg)
> > +binaryheap_allocate(int capacity, binaryheap_comparator compare,
> > + bool indexed, void *arg)
> >
> > BEFORE
> > ... enabling to perform some operations such as removing the node from the heap.
> >
> > SUGGESTION
> > ... to help make operations such as removing nodes more efficient.
> >
>
> But these operations literally require the indexed binary heap as we
> have an assertion:
>
> void
> binaryheap_remove_node_ptr(binaryheap *heap, bh_node_type d)
> {
> bh_nodeidx_entry *ent;
>
> Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property);
> Assert(heap->bh_indexed);
>

I didn’t quite understand -- the operations mentioned are "operations
such as removing the node", but binaryheap_remove_node() also removes
a node from the heap. So I still felt the comment wording of the patch
is not quite correct.

Now, if the removal of a node from an indexed heap can *only* be done
using binaryheap_remove_node_ptr() then:
- the other removal functions (binaryheap_remove_*) probably need some
comments to make sure nobody is tempted to call them directly for an
indexed heap.
- maybe some refactoring and assertions are needed to ensure those
*cannot* be called directly for an indexed heap.

> > 7b.
> > IMO the parameters would be better the other way around (e.g. 'index'
> > before the 'node') because that's what the assignments look like:
> >
> >
> > heap->bh_nodes[heap->bh_size] = d;
> >
> > becomes:
> > bh_set_node(heap, heap->bh_size, d);
> >
>
> I think it assumes heap->bh_nodes is an array. But if we change it in
> the future, it will no longer make sense. I think it would make more
> sense if we define the parameters in an order like "we set the 'node'
> at 'index'". What do you think?

YMMV. The patch code is also OK by me if you prefer it.

//////////

And, here are some review comments for v8-0002.

======
1. delete_nodeidx

+/*
+ * Remove the node's index from the hash table if the heap is indexed.
+ */
+static bool
+delete_nodeidx(binaryheap *heap, bh_node_type node)
+{
+ if (!binaryheap_indexed(heap))
+ return false;
+
+ return bh_nodeidx_delete(heap->bh_nodeidx, node);
+}

1a.
In v8 this function was changed to now return bool, so, I think the
function comment should explain the meaning of that return value.

1b.
I felt the function body is better expressed positively: "If this then
do that", instead of "If not this then do nothing otherwise do that"

SUGGESTION
if (binaryheap_indexed(heap))
return bh_nodeidx_delete(heap->bh_nodeidx, node);

return false;

~~~

2.
+static void
+replace_node(binaryheap *heap, int index, bh_node_type new_node)
+{
+ bool found PG_USED_FOR_ASSERTS_ONLY;
+
+ /* Quick return if not necessary to move */
+ if (heap->bh_nodes[index] == new_node)
+ return;
+
+ /*
+ * Remove overwritten node's index. The overwritten node's position must
+ * have been tracked, if enabled.
+ */
+ found = delete_nodeidx(heap, heap->bh_nodes[index]);
+ Assert(!binaryheap_indexed(heap) || found);
+
+ /*
+ * Replace it with the given new node. This node's position must also be
+ * tracked as we assume to replace the node by the existing node.
+ */
+ found = set_node(heap, new_node, index);
+ Assert(!binaryheap_indexed(heap) || found);
+}

2a.
/Remove overwritten/Remove the overwritten/
/replace the node by the existing node/replace the node with the existing node/

2b.
It might be helpful to declare another local var...
bh_node_type cur_node = heap->bh_nodes[index];

... because I think it will be more readable to say:
+ if (cur_node == new_node)
+ return;

and

+ found = delete_nodeidx(heap, cur_node);

----------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Re: Improve eviction algorithm in ReorderBuffer at 2024-03-07 03:16:08 from Masahiko Sawada

Responses

Re: Improve eviction algorithm in ReorderBuffer at 2024-03-12 05:23:08 from Masahiko Sawada

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	jian he	2024-03-08 04:04:31	Re: remaining sql/json patches
Previous Message	Andrew Dunstan	2024-03-08 03:42:06	Re: WIP Incremental JSON Parser