From: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Yeb Havinga <yebhavinga(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Fix for seg picksplit function |
Date: | 2010-11-16 08:57:40 |
Message-ID: | AANLkTimL7-iLSfv23osE6WO-s9FQQNBq3vBnCbZ7DYVS@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Nov 16, 2010 at 3:07 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> The loop that begins here:
>
> for (i = 0; i < maxoff; i++)
> {
>     /* First half of segs goes to the left datum. */
>     if (i < seed_2)
>
> ...looks like it should perhaps be broken into two separate loops.
> That might also help tweak the logic in a way that eliminates this:
>
> seg.c: In function ‘gseg_picksplit’:
> seg.c:327: warning: ‘datum_r’ may be used uninitialized in this function
> seg.c:326: warning: ‘datum_l’ may be used uninitialized in this function
>
I restored the original version of that loop.
> But on a broader note, I'm not very certain the sorting algorithm is
> sensible. For example, suppose you have 10 segments that are exactly
> '0' and 20 segments that are exactly '1'. Maybe I'm misunderstanding,
> but it seems like this will result in a 15/15 split when we almost
> certainly want a 10/20 split. I think there will be problems in more
> complex cases as well. The documentation says about the less-than and
> greater-than operators that "These operators do not make a lot of
> sense for any practical purpose but sorting."
I think almost any split algorithm has corner cases where its results don't
look very good. I think the way to understand the significance of these corner
cases for real life is to perform sufficient testing on datasets that are
close to real life. I don't feel able to propose enough test datasets or to
estimate their significance for real-life cases, and I need help in this
area.
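For concreteness, the 10-versus-20 case quoted above can be sketched as a standalone toy. This is not the patch code: it assumes the split simply sorts the entries and cuts the sorted array at the midpoint, which is how the objection reads, and the names ToySeg, fill_example, midpoint_left_count and count_left_equal are hypothetical helpers invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a seg: a degenerate interval, lower == upper. */
typedef struct
{
    float lower;
    float upper;
} ToySeg;

/* Fill segs with n_zero copies of '0' followed by n_one copies of '1'.
 * The result is already in sorted order, so no sort call is needed here. */
static void
fill_example(ToySeg *segs, size_t n_zero, size_t n_one)
{
    size_t i;

    for (i = 0; i < n_zero + n_one; i++)
    {
        float v = (i < n_zero) ? 0.0f : 1.0f;

        segs[i].lower = v;
        segs[i].upper = v;
    }
}

/* Assumed split rule: entries [0, n/2) of the sorted array go left. */
static size_t
midpoint_left_count(size_t n)
{
    return n / 2;
}

/* Count the entries assigned to the left side whose value equals `value`. */
static size_t
count_left_equal(const ToySeg *segs, size_t n, float value)
{
    size_t left = midpoint_left_count(n);
    size_t hits = 0;
    size_t i;

    assert(segs != NULL);
    for (i = 0; i < left; i++)
    {
        if (segs[i].lower == value)
            hits++;
    }
    return hits;
}
```

With fill_example(segs, 10, 20), the midpoint rule puts 15 entries on the left: all ten '0' segments plus five '1' segments. The left page then has to cover 0 .. 1 while the right page covers only 1, so the two bounding segments touch at 1. A 10/20 split would give two disjoint bounding segments instead.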
----
With best regards,
Alexander Korotkov.
Attachment | Content-Type | Size |
---|---|---|
seg_picksplit_fix-0.4.patch | text/x-patch | 6.8 KB |