Re: machine-dependent hash_any vs the regression tests

From: Kenneth Marshall <ktm(at)rice(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: machine-dependent hash_any vs the regression tests
Date: 2008-04-06 15:45:14
Message-ID: 20080406154514.GA21544@it.is.rice.edu
Lists: pgsql-hackers

On Sat, Apr 05, 2008 at 05:57:35PM -0400, Tom Lane wrote:
> So the proposed changes in hash_any make its hash values different
> between big-endian and little-endian machines (at least for string keys;
> for keys that are really arrays of int, I think the changes will
> unify the behavior). This means that the hash_seq_search traversal
> order for an internal hash table changes, and it turns out this breaks
> at least two regression tests: portals and dblink. The portals test
> is easy to fix by adding a couple of ORDER BYs, but the problem with
> dblink is here:
>
> SELECT dblink_get_connections();
> dblink_get_connections
> ------------------------
> ! {dtest1,dtest2,dtest3}
> (1 row)
>
> SELECT dblink_is_busy('dtest1');
> --- 714,720 ----
> SELECT dblink_get_connections();
> dblink_get_connections
> ------------------------
> ! {dtest1,dtest3,dtest2}
> (1 row)
>
> SELECT dblink_is_busy('dtest1');
>
> and right offhand I can't think of a simple way to force those array
> elements into a consistent order.
>
> No doubt that can be worked around, but does anyone wish to argue that
> this whole thing is a bad path to be headed down? We're not going to
> gain a *whole* lot of speedup from the word-wide-hashing change, and
> so maybe this type of headache isn't worth the trouble.
>
> regards, tom lane
>
It may be just me, but it is a little surprising that the order of a
hash_seq_search traversal should matter. It is the same situation as row
ordering being non-deterministic when no ORDER BY is specified. As long as
all of the values are returned, it would make sense for the regression
tests not to depend on that ordering. That would make it easier to evaluate
new hash functions, since they would not break the regression tests in such
an unintuitive way -- my two cents.
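
To illustrate the idea, here is a minimal Python sketch of an
order-insensitive comparison of the two dblink_get_connections() outputs
quoted above. It is not part of the PostgreSQL regression harness, which
compares expected and actual output as text; the function names
parse_array_literal and same_members are hypothetical, and the parser
assumes a simple one-dimensional array literal with no quoting or escapes.

```python
def parse_array_literal(text):
    """Split a simple array literal such as '{a,b,c}' into its elements.

    Assumption: one dimension, no quoted elements, no embedded commas.
    """
    stripped = text.strip()
    if not (stripped.startswith("{") and stripped.endswith("}")):
        raise ValueError("not an array literal: %r" % text)
    body = stripped[1:-1]
    return body.split(",") if body else []


def same_members(actual, expected):
    """Compare two array literals while ignoring element order.

    Sorting both sides turns any traversal order into one canonical
    order, so duplicates are still counted but position is not.
    """
    return sorted(parse_array_literal(actual)) == sorted(parse_array_literal(expected))


# The two outputs from the regression diff quoted above.
expected_output = "{dtest1,dtest2,dtest3}"
actual_output = "{dtest1,dtest3,dtest2}"

print(same_members(actual_output, expected_output))
```

With a check of this kind the two results are treated as equal, because
they hold the same connection names and differ only in the order in which
the hash table was traversed.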

Regards,
Ken Marshall
