Re: Shared detoast Datum proposal

From: Andy Fan <zhihuifan1213(at)163(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Michael Zhilin <m(dot)zhilin(at)postgrespro(dot)ru>, Peter Smith <smithpb2250(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Nikita Malakhov <hukutoc(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Shared detoast Datum proposal
Date: 2024-02-21 13:20:38
Message-ID: 87h6i2djf9.fsf@163.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I see your reply when I started to write a more high level
document. Thanks for the step by step help!

Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:

> On 2/20/24 19:38, Andy Fan wrote:
>>
>> ...
>>
>>> I think it should be done the other
>>> way around, i.e. the patch should introduce the main feature first
>>> (using the traditional Bitmapset), and then add Bitset on top of that.
>>> That way we could easily measure the impact and see if it's useful.
>>
>> Acutally v4 used the Bitmapset, and then both perf and pgbench's tps
>> indicate it is too expensive. and after talk with David at [2], I
>> introduced bitset and use it here. the test case I used comes from [1].
>> IRCC, there were 5% performance difference because of this.
>>
>> create table w(a int, b numeric);
>> insert into w select i, i from generate_series(1, 1000000)i;
>> select b from w where b > 0;
>>
>> To reproduce the difference, we can replace the bitset_clear() with
>>
>> bitset_free(slot->pre_detoasted_attrs);
>> slot->pre_detoasted_attrs = bitset_init(slot->tts_tupleDescriptor->natts);
>>
>> in ExecFreePreDetoastDatum. then it works same as Bitmapset.
>>
>
> I understand the bitset was not introduced until v5, after noticing the
> bitmapset is not quite efficient. But I still think the patches should
> be the other way around, i.e. the main feature first, then the bitset as
> an optimization.
>
> That allows everyone to observe the improvement on their own (without
> having to tweak the code), and it also doesn't require commit of the
> bitset part before it gets actually used by anything.

I start to think this is a better way rather than the opposite. The next
version will be:

0001: shared detoast datum feature with high level introduction.
0002: introduce bitset and use it shared-detoast-datum feature, with the
test case to show the improvement.

>> I will do more test on the memory leak stuff, since there are so many
>> operation aginst slot like ExecCopySlot etc, I don't know how to test it
>> fully. the method in my mind now is use TPCH with 10GB data size, and
>> monitor the query runtime memory usage.
>>
>
> I think this is exactly the "high level design" description that should
> be in a comment, somewhere.

Got it.

>>> Of course, expanded datums are
>>> not meant to be long-lived, while "shared detoasted values" are meant to
>>> exist (potentially) for the query duration.
>>
>> hmm, acutally the "shared detoast value" just live in the
>> TupleTableSlot->tts_values[*], rather than the whole query duration. The
>> simple case is:
>>
>> SELECT * FROM t WHERE a_text LIKE 'abc%';
>>
>> when we scan to the next tuple, the detoast value for the previous tuple
>> will be relased.
>>
>
> But if the (detoasted) value is passed to the next executor node, it'll
> be kept, right?

Yes and only one copy for all the executor nodes.

> Unrelated question - could this actually end up being too aggressive?
> That is, could it detoast attributes that we end up not needing? For
> example, what if the tuple gets eliminated for some reason (e.g. because
> of a restriction on the table, or not having a match in a join)? Won't
> we detoast the tuple only to throw it away?

The detoast datum will have the exactly same lifespan with other
tts_values[*]. If the tuple get eliminated for any reason, those detoast
datum still exist until the slot is cleared for storing the next tuple.

>> typedef struct Join
>> {
>> ..
>> /*
>> * Records of var's varattno - 1 where the Var is accessed indirectly by
>> * any expression, like a > 3. However a IS [NOT] NULL is not included
>> * since it doesn't access the tts_values[*] at all.
>> *
>> * This is a essential information to figure out which attrs should use
>> * the pre-detoast-attrs logic.
>> */
>> Bitmapset *outer_reference_attrs;
>> Bitmapset *inner_reference_attrs;
>> } Join;
>>
>
> Is it actually necessary to add new fields to these nodes? Also, the
> names are not particularly descriptive of the purpose - it'd be better
> to have "detoast" in the name, instead of generic "reference".

Because of the way of the data transformation, I think we must add the
fields to keep such inforamtion. Then these information will be used
initilize the necessary information in PlanState. maybe I am having a
fixed mindset, I can't think out a way to avoid that right now.

I used 'reference' rather than detoast is because some implementaion
issues. In the createplan.c and setref.c, I can't check the atttyplen
effectively, so even a Var with int type is still hold here which may
have nothing with detoast.

>>
>> When I worked with the UniqueKey feature, I maintained a
>> UniqueKey.README to summaried all the dicussed topics in threads, the
>> README is designed to save the effort for more reviewer, I think I
>> should apply the same logic for this feature.
>>
>
> Good idea. Either that (a separate README), or a comment in a header of
> some suitable .c/.h file (I prefer that, because that's kinda obvious
> when reading the code, I often not notice a README exists next to it).

Great, I'd try this from tomorrow.

--
Best Regards
Andy Fan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabrice Chapuis 2024-02-21 14:00:37 Re: pg_restore option --clean
Previous Message Tomas Vondra 2024-02-21 12:39:29 Re: Shared detoast Datum proposal