From: | Claudio Freire <klaussfreire(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Small improvement to compactify_tuples |
Date: | 2017-11-03 14:37:46 |
Message-ID: | CAGTBQpYtaiZfN+=EExrm_oXvypMmKeEjue33ZkzGohDr-jbQyA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Nov 2, 2017 at 11:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru> writes:
>> [ 0001-Improve-compactify_tuples.patch, v5 or thereabouts ]
>
> I started to review this patch. I spent a fair amount of time on
> beautifying the code, because I found it rather ugly and drastically
> undercommented. Once I had it to the point where it seemed readable,
> I went to check the shellsort algorithm against Wikipedia's entry,
> and found that this appears to be an incorrect implementation of
> shellsort: where pg_shell_sort_pass has
>
> for (_i = off; _i < _n; _i += off) \
>
> it seems to me that we need to have
>
> for (_i = off; _i < _n; _i += 1) \
>
> or maybe just _i++. As-is, this isn't h-sorting the whole file,
> but just the subset of entries that have multiple-of-h indexes
> (ie, the first of the h distinct subfiles that should get sorted).
> The bug is masked by the final pass of plain insertion sort, but
> we are not getting the benefit we should get from the earlier passes.
>
> However, I'm a bit dubious that it's worth fixing that; instead
> my inclination would be to rip out the shellsort implementation
> entirely. The code is only using it for the nitems <= 48 case
> (which makes the first three offset steps certainly no-ops) and
> I am really unconvinced that it's worth expending the code space
> for a shellsort rather than plain insertion sort in that case,
> especially when we have good reason to think that the input data
> is nearly sorted.
I actually noticed that and benchmarked some variants. Neither
made any noticeable difference in performance, so I decided not
to complain about them.
I guess the same case can be made for removing the shell sort.
So I'm inclined to agree.
> BTW, the originally given test case shows no measurable improvement
> on my box.
I did manage to reproduce the original test and got a consistent improvement.
> I was eventually able to convince myself by profiling
> that the patch makes us spend less time in compactify_tuples, but
> this test case isn't a very convincing one.
>
> So, quite aside from the bug, I'm not excited about committing the
> attached as-is. I think we should remove pg_shell_sort and just
> use pg_insertion_sort. If somebody can show a test case that
> provides a measurable speed improvement from the extra code,
> I could be persuaded to reconsider.
My tests modifying the shell sort didn't produce any measurable
difference, but I didn't test removing it altogether.
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2017-11-03 14:53:30 | Re: [HACKERS] pgsql: Fix freezing of a dead HOT-updated tuple |
Previous Message | Tom Lane | 2017-11-03 14:35:20 | Re: Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM |