From: | Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com> |
---|---|
To: | Kenneth Marshall <ktm(at)rice(dot)edu> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Hash support for arrays |
Date: | 2010-11-03 09:24:16 |
Message-ID: | AANLkTik+zqRBEYzy1X5p_6DnGA=V9BeMS=+HCg_hoSLE@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
2010/11/2 Kenneth Marshall <ktm(at)rice(dot)edu>:
> Given that our hash implimentation mixes the input data well (It does.
> I tested it.) then a simple rotate-and-xor method is all that should
> be needed to maintain all of the needed information. The original
> hash function has done the heavy lifting in this case.
Even with the perfect hash function for the elements, certain
combinations of elements could still lead to massive collisions. E.g.,
if repeated values are typical in the input data we are talking about,
then the rotate-and-xor method would still lead to collisions between
any array of the same values of certain lengths, regardless of the
value. In Tom's implementation, as he mentioned before, those
problematical lengths would be multiples of 32 (e.g., an array of 32
1s would collide with an array of 32 2s would collide with an array of
32 3s, etc).
Nicolas
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Stark | 2010-11-03 09:34:30 | Re: pgsql: Bootstrap WAL to begin at segment logid=0 logseg=1 (000000010000 |
Previous Message | Marc Cousin | 2010-11-03 08:03:21 | Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up |