| From: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> | 
|---|---|
| To: | 'Masahiko Sawada' <sawada(dot)mshk(at)gmail(dot)com> | 
| Cc: | Dilip Kumar <dilipbalaut(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> | 
| Subject: | RE: Improve eviction algorithm in ReorderBuffer | 
| Date: | 2024-02-06 05:45:07 | 
| Message-ID: | TYCPR01MB120772441516B0F7F43C90A2AF5462@TYCPR01MB12077.jpnprd01.prod.outlook.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Dear Sawada-san,
> Thank you for the review comments!
> 
> >
> > A comment for 0001:
> >
> > 01.
> > ```
> > +static void
> > +bh_enlarge_node_array(binaryheap *heap)
> > +{
> > +    if (heap->bh_size < heap->bh_space)
> > +        return;
> > +
> > +    heap->bh_space *= 2;
> > +    heap->bh_nodes = repalloc(heap->bh_nodes,
> > +                              sizeof(bh_node_type) * heap->bh_space);
> > +}
> > ```
> >
> > I'm not sure it is OK to use repalloc() for enlarging bh_nodes. This data
> > structure public one and arbitrary codes and extensions can directly refer
> > bh_nodes. But if the array is repalloc()'d, the pointer would be updated so that
> > their referring would be a dangling pointer.
> 
> Hmm I'm not sure this is the case that we need to really worry about,
> and cannot come up with a good use case where extensions refer to
> bh_nodes directly rather than binaryheap. In PostgreSQL codes, many
> Nodes already have pointers and are exposed.
Actually, me neither. I could not come up with the use-case - I just said the possibility.
If it is not a real issue, we can ignore.
> 
> >
> > 04.
> > ```
> >  extern binaryheap *binaryheap_allocate(int capacity,
> >                                         binaryheap_comparator compare,
> > -                                       void *arg);
> > +                                       bool indexed, void *arg);
> > ```
> >
> > I felt pre-existing API should not be changed. How about adding
> > binaryheap_allocate_extended() or something which can specify the `bool
> indexed`?
> > binaryheap_allocate() sets heap->bh_indexed to false.
> 
> I'm really not sure it's worth inventing a
> binaryheap_allocate_extended() function just for preserving API
> compatibility. I think it's generally a good idea to have
> xxx_extended() function to increase readability and usability, for
> example, for the case where the same (kind of default) arguments are
> passed in most cases and the function is called from many places.
> However, we have a handful binaryheap_allocate() callers, and I
> believe that it would not hurt the existing callers.
I kept (external) extensions which uses binaryheap APIs in my mind.
I thought we could avoid to raise costs for updating their codes. But I could
understand the change is small, so ... up to you.
> > 05.
> > ```
> > +extern void binaryheap_update_up(binaryheap *heap, bh_node_type d);
> > +extern void binaryheap_update_down(binaryheap *heap, bh_node_type d);
> > ```
> >
> > IIUC, callers must consider whether the node should be shift up/down and use
> > appropriate function, right? I felt it may not be user-friendly.
> 
> Right, I couldn't come up with a better interface.
> 
> Another idea I've considered was that the caller provides a callback
> function where it can compare the old and new keys. For example, in
> reorderbuffer case, we call like:
> 
> binaryheap_update(rb->txn_heap, PointerGetDatum(txn),
> ReorderBufferTXNUpdateCompare, (void *) &old_size);
> 
> Then in ReorderBufferTXNUpdateCompare(),
> ReorderBufferTXN *txn = (ReorderBufferTXN *) a;Size old_size = *(Size *) b;
> (compare txn->size to "b" ...)
> 
> However it seems complicated...
> 
I considered similar approach: accept old node as an argument of a compare function.
But it requires further memory allocation. Do someone have better idea?
> 
> >
> > 08.
> > IIUC, if more than 1024 transactions are running but they have small amount of
> > changes, the performance may be degraded, right? Do you have a result in
> sucha
> > a case?
> 
> I've run a benchmark test that I shared before[1]. Here are results of
> decoding a transaction that has 1M subtransaction each of which has 1
> INSERT:
> 
> HEAD:
> 1810.192 ms
> 
> HEAD w/ patch:
> 2001.094 ms
> 
> I set a large enough value to logical_decoding_work_mem not to evict
> any transactions. I can see about about 10% performance regression in
> this case.
Thanks for running. I think this workload is the worst and an extreme case which
would not be occurred on the real system (Such a system should be fixed), so we
can say that the regression is up to -10%. I felt it could be negligible but how
do other think?
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andres Freund | 2024-02-06 05:46:42 | Re: Make COPY format extendable: Extract COPY TO format implementations | 
| Previous Message | Peter Smith | 2024-02-06 05:15:49 | Re: Synchronizing slots from primary to standby |