From: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
---|---|
To: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: texteq/byteaeq: avoid detoast [REVIEW] |
Date: | 2011-01-19 08:22:41 |
Message-ID: | 20110119082241.GB11804@svana.org |
Lists: | pgsql-hackers |
On Tue, Jan 18, 2011 at 10:03:01AM +0200, Heikki Linnakangas wrote:
>> That isn't ever going to happen, unless you'd like to give up hash joins
>> and hash aggregation on text values.
>
> You could canonicalize the string first in the hash function. I'm not
> sure if we have all the necessary information at hand there, but at
> least with some encoding/locale-specific support functions it'd be
> possible.
This is what strxfrm() was created for.
sign(strcoll(a,b)) == sign(strcmp(strxfrm(a),strxfrm(b)))
(writing strxfrm(x) as shorthand for the transformed copy of x)
Sure, there's a cost; the question is only how much and whether it makes
hash join unfeasible. I doubt it, since by definition it must be faster
than strcoll(). I suppose a test would be interesting.
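To make the idea concrete, here is a rough C sketch of hashing a string through its strxfrm() key, so that two strings which collate as equal would also hash equal. The helper names (collation_key, hash_bytes, collation_hash, keys_agree_with_strcoll) and the choice of FNV-1a are purely illustrative assumptions, not anything taken from the PostgreSQL source; a real patch would use the backend's own hash and memory routines.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd strxfrm() key for s under the
 * current LC_COLLATE setting, or NULL if allocation fails. */
char *collation_key(const char *s)
{
    assert(s != NULL);
    /* With n == 0, strxfrm() only reports the length it needs
     * (not counting the terminating NUL); dest may be NULL. */
    size_t need = strxfrm(NULL, s, 0) + 1;
    char *key = malloc(need);
    if (key != NULL)
        strxfrm(key, s, need);
    return key;
}

/* 32-bit FNV-1a over a NUL-terminated byte string (illustrative choice). */
uint32_t hash_bytes(const char *p)
{
    uint32_t h = 2166136261u;
    for (; *p != '\0'; p++) {
        h ^= (unsigned char) *p;
        h *= 16777619u;
    }
    return h;
}

/* Hash s via its collation key. Returns 0 on success, -1 on allocation
 * failure; the hash is stored in *out. */
int collation_hash(const char *s, uint32_t *out)
{
    char *key = collation_key(s);
    if (key == NULL)
        return -1;
    *out = hash_bytes(key);
    free(key);
    return 0;
}

/* Reduce a comparison result to -1, 0 or 1. */
int sign(int x)
{
    return (x > 0) - (x < 0);
}

/* Check the identity for one pair: 1 if strcmp() on the keys agrees in
 * sign with strcoll() on the inputs, 0 if not, -1 on allocation failure. */
int keys_agree_with_strcoll(const char *a, const char *b)
{
    char *ka = collation_key(a);
    char *kb = collation_key(b);
    int ok = -1;
    if (ka != NULL && kb != NULL)
        ok = (sign(strcoll(a, b)) == sign(strcmp(ka, kb)));
    free(ka);
    free(kb);
    return ok;
}
```

A caller would run setlocale(LC_COLLATE, "") first to pick up the environment's locale; without that the program stays in the "C" locale, where strcoll() behaves like strcmp(). Timing collation_hash() against a plain hash of the raw bytes would be one way to do the test mentioned above.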
Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
> - Charles de Gaulle