From: | John Naylor <john(dot)naylor(at)2ndquadrant(dot)com> |
---|---|
To: | Binguo Bao <djydewang(at)gmail(dot)com> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Владимир Лесков <vladimirlesk(at)yandex-team(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [proposal] de-TOAST'ing using a iterator |
Date: | 2019-08-14 05:00:25 |
Message-ID: | CACPNZCugpoRmH5A4GVUjTB0LVZ7T2ebTpie4vXKtBJaLh=xkLQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Aug 3, 2019 at 11:11 PM Binguo Bao <djydewang(at)gmail(dot)com> wrote:
>
> John Naylor <john(dot)naylor(at)2ndquadrant(dot)com> 于2019年8月2日周五 下午3:12写道:
>>
>> I like the use of a macro here. However, I think we can find a better
>> location for the definition. See the header comment of fmgr.h:
>> "Definitions for the Postgres function manager and function-call
>> interface." Maybe tuptoaster.h is as good a place as any?
>
> PG_DETOAST_ITERATE isn't a sample function-call interface,
> But I notice that PG_FREE_IF_COPY is also defined in fmgr.h, whose logic is
> similar to PG_DETOAST_ITERATE, make condition check first and then
> decide whether to call the function. Besides, PG_DETOAST_DATUM,
> PG_DETOAST_DATUM_COPY, PG_DETOAST_DATUM_SLICE,
> PG_DETOAST_DATUM_PACKED are all defined in fmgr.h, it is reasonable
> to put all the de-TOAST interface together.
Hmm, it's strange that those macros ended up there, but now I see why
it makes sense to add new ones there also.
>> Okay, and see these functions now return void. The orignal
>> pglz_decompress() returned a value that was check against corruption.
>> Is there a similar check we can do for the iterative version?
>
> As far as I know, we can just do such check after all compressed data is decompressed.
> If we are slicing, we can't do the check.
Okay.
>> >> 3). Speaking of pglz_decompress_iterate(), I diff'd it with
>> >> pglz_decompress(), and I have some questions on it:
>> >>
>> >> a).
>> >> + srcend = (const unsigned char *) (source->limit == source->capacity
>> >> ? source->limit : (source->limit - 4));
>> >>
>> >> What does the 4 here mean in this expression?
>> >
>> > Since we fetch chunks one by one, if we make srcend equals to the source buffer limit,
>> > In the while loop "while (sp < srcend && dp < destend)", sp may exceed the source buffer limit and read unallocated bytes.
>>
>> Why is this? That tells me the limit is incorrect. Can the setter not
>> determine the right value?
>
> There are three statments change `sp` value in the while loop `while (sp < srcend && dp < destend)`:
> `ctrl = *sp++;`
> `off = ((sp[0]) & 0xf0) << 4) | sp[1]; sp += 2;`
> `len += *sp++`
> Although we make sure `sp` is less than `srcend` when enter while loop, `sp` is likely to
> go beyond the `srcend` in the loop, and we should ensure that `sp` is always smaller than `buf->limit` to avoid
> reading unallocated data. So, `srcend` can't be initialized to `buf->limit`. Only one case is exceptional,
> we've fetched all data chunks and 'buf->limit' reaches 'buf->capacity', it's imposisble to read unallocated
> data via `sp`.
Thank you for the detailed explanation and the comment.
>> Please explain further. Was the "sp + 1" correct behavior (and why),
>> or only for debugging setting ctrl/c correctly?
>
> In patch v5, If the condition is `sp < srcend`, suppose `sp = srcend - 1` before
> entering the loop `while (sp < srcend && dp < destend)`, when entering the loop
> and read a control byte(sp equals to `srcend` now), the program can't enter the
> loop `for (; ctrlc < 8 && sp < srcend && dp < destend; ctrlc++)`, then set `iter->ctrlc` to 0,
> exit the first loop and then this iteration is over. At the next iteration,
> the control byte will be reread since `iter->ctrlc` equals to 0, but the previous control byte
> is not used. Changing the condition to `sp + 1 < srcend` avoid only one control byte is read
> then the iterator is over.
Okay, that's quite subtle. I agree the v6/7 way is more clear in this regard.
>> Also, I don't think
>> the new logic for the ctrl/c variables is an improvement:
>>
>> 1. iter->ctrlc is intialized with '8' (even in the uncompressed case,
>> which is confusing). Any time you initialize with something not 0 or
>> 1, it's a magic number, and here it's far from where the loop variable
>> is used. This is harder to read.
>
> `iter->ctrlc` is used to record the value of `ctrl` in pglz_decompress at the end of
> the last iteration(or loop). In the pglz_decompress, `ctrlc`’s valid value is 0~7,
> When `ctrlc` reaches 8, a control byte is read from the source
> buffer to `ctrl` then set `ctrlc` to 0. And a control bytes should be read from the
> source buffer to `ctrlc` on the first iteration. So `iter->ctrlc` should be intialized with '8'.
My point here is it looks strange out of context, but "0" looked
normal. Maybe a comment in init_detoast_buffer(), something like "8
means read a control byte from the source buffer on the first
iteration, see pg_lzdecompress_iterate()".
Or, possibly, we could have a macro like INVALID_CTRLC. That might
even improve the readability of the original function. This is just an
idea, and maybe others would disagree, so you don't need to change it
for now.
>> 3. At the end of the loop, iter->ctrl/c are unconditionally set. In
>> v5, there was a condition which would usually avoid this copying of
>> values through pointers.
>
> Patch v6 just records the value of `ctrlc` at the end of each iteration(or loop)
> whether it is valid (0~7) or 8, and initializes `ctrlc` on the next iteration(or loop) correctly.
> I think it is more concise in patch v6.
And, in the case mentioned above where we enter the while loop with sp
= src_end - 1 , we can read the control byte and still store the
correct value for ctrlc.
>>> [varlena.c api]
>> That's better, and I think we can take it a little bit farther.
>>
>> 1. Notice that TextPositionState is allocated on the stack in
>> text_position(), which passes both the "state" pointer and the "iter"
>> pointer to text_position_setup(), and only then sets state->iter =
>> iter. We can easily set this inside text_position(). That would get
>> rid of the need for other callers to pass NULL iter to
>> text_position_setup().
>
>
> Done.
That looks much cleaner, thanks.
I've repeated my performance test to make sure there's no additional
regression in my suffix tests:
master patch v7
comp. end 1560s 1600ms
uncomp. end 896ms 890ms
The regression from master in the compressed case is about 2.5%, which
is no different from the last patch I tested, so that's good.
At this point, there are no functional things that I think we need to
change. It's close to ready-for-committer. For the next version, I'd
like you go through the comments and edit for grammar, spelling, and
clarity as you see fit. I know you're not a native speaker of English,
so I can help you with anything that remains. Also note we use braces
on their own lines
{
like this
}
We do have a source formatting tool (pgindent), but it helps
readability for committers to have it mostly standard beforehand.
--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2019-08-14 05:00:27 | Re: BF failure: could not open relation with OID XXXX while querying pg_views |
Previous Message | Peter Eisentraut | 2019-08-14 04:53:31 | Re: clean up obsolete initdb locale handling |