From: | Volkan YAZICI <yazicivo(at)ttnet(dot)net(dot)tr> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: intarray internals |
Date: | 2006-05-07 10:21:36 |
Message-ID: | 20060507102136.GA202@alamut |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-hackers |
Hi,
I've prepared a Quick & Dirty patch serie for some missing parts in
intarray contrib module. Here they go with some explanation...
On May 06 12:38, Volkan YAZICI wrote:
> [4]
> In the inner_int_contains() function of _int_tool.c, there's a while
> loop like
>
> [Code assumes that arrays are sorted.]
>
> na = ARRNELEMS(a);
> nb = ARRNELEMS(b);
> da = ARRPTR(a);
> db = ARRPTR(b);
>
> i = j = n = 0;
> while (i < na && j < nb)
> if (da[i] < db[j])
> i++;
> else if (da[i] == db[j])
> {
> n++;
> i++;
> j++;
> }
> else
> j++;
>
> return (n == nb) ? TRUE : FALSE;
>
> AFAICS, last "j++" should be replaced with "return FALSE". Because, "n"
> cannot be equal to "nb" no more, if "j" gets incremented without
> incrementing "n" (remember "j < nb" in the "while" condition).
intarray_contains.patch.0 is for above problem.
[5]
ISTM, inner_int_union() of _int_tool.c, makes some redundant visits to
array:
...
/* union */
i = j = 0;
while (i < na && j < nb)
if (da[i] < db[j])
*dr++ = da[i++];
else
*dr++ = db[j++];
while (i < na)
*dr++ = da[i++];
while (j < nb)
*dr++ = db[j++];
}
if (ARRNELEMS(r) > 1)
r = _int_unique(r);
IMHO, uniting unique values (instead of uniting and then removing
duplicates) should work faster. intarray_union.patch.0 is for this
problem. (Patched code, handles uniting for unique values.)
[6]
There's a seperate sorting algorithm isort() in _int_tool.c. Looks like
it executes some kind of shell sort on the array and returns true if
array has any duplicates. It's used for common sorting and deciding on
executing _int_unique() on the related array if isort() says it has
duplicates.
IIRC, our inner qsort() has a smaller O(N) degree when compared to above
sorting algorithm. Also for the code's sake it'd be better to switch
using qsort() in all sorting related stuff. For these reasons,
intarray_sort.patch.0 addresses this (probable) gotcha.
All 3 patches passed regression tests for intarray contrib module. But
these are just for addressing some gotchas I found while reading code.
Your comments for these problems(?) are welcome.
Regards.
Attachment | Content-Type | Size |
---|---|---|
intarray_contains.patch.0 | text/plain | 546 bytes |
intarray_sort.patch.0 | text/plain | 2.3 KB |
intarray_union.patch.0 | text/plain | 2.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | P.Mo | 2006-05-07 18:07:39 | Winning |
Previous Message | Lincoln Yeoh | 2006-05-07 09:48:32 | Re: Can't Figure Out Where Rows Are Going |
From | Date | Subject | |
---|---|---|---|
Next Message | Gurjeet Singh | 2006-05-07 12:03:42 | Re: [pgsql-hackers-win32] Build with Visual Studio & MSVC |
Previous Message | Gurjeet Singh | 2006-05-07 10:15:23 | Fwd: [pgsql-hackers-win32] Build with Visual Studio & MSVC |