| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> |
| Cc: | NISHIYAMA Tomoaki <tomoakin(at)staff(dot)kanazawa-u(dot)ac(dot)jp>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Notes about fixing regexes and UTF-8 (yet again) |
| Date: | 2012-02-18 23:45:10 |
| Message-ID: | 7392.1329608710@sss.pgh.pa.us |
| Lists: | pgsql-hackers |
Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> Yeah, it's conceivable that we could implement something whereby
>> characters with codes above some cutoff point are handled via runtime
>> calls to iswalpha() and friends, rather than being included in the
>> statically-constructed DFA maps. The cutoff point could likely be a lot
>> less than U+FFFF, too, thereby saving storage and map build time all
>> round.
> It's been proposed to build a regexp type in PostgreSQL which would
> store the DFA directly and provide some way to run that DFA out of its
> storage without recompiling.
> Would such a mechanism be useful here?
No, this is about what goes into the DFA representation in the first
place, not about how we store it and reuse it.
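
To make the cutoff idea concrete, here is a rough sketch in C. It is only an illustration under stated assumptions, not the actual regex code: the names CLASS_CUTOFF, alpha_map, build_alpha_map() and is_alpha_hybrid() are made up for this example, the cutoff value 0x800 is arbitrary, and a real implementation would work on the regex engine's own character maps rather than a plain array of booleans. Code points below the cutoff are classified once into a static table; anything at or above it falls through to a runtime iswalpha() call.

```c
#include <stdbool.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical cutoff: only code points below this get a precomputed entry. */
#define CLASS_CUTOFF 0x800

/* Precomputed "is alphabetic" answers for code points below the cutoff. */
static bool alpha_map[CLASS_CUTOFF];
static bool alpha_map_ready = false;

/* Fill the table once, using the current locale's iswalpha(). */
static void
build_alpha_map(void)
{
    for (wint_t c = 0; c < CLASS_CUTOFF; c++)
        alpha_map[c] = (iswalpha(c) != 0);
    alpha_map_ready = true;
}

/*
 * Classify one code point: table lookup below the cutoff,
 * runtime library call at or above it.
 */
bool
is_alpha_hybrid(wint_t c)
{
    if (!alpha_map_ready)
        build_alpha_map();

    if (c < CLASS_CUTOFF)
        return alpha_map[c];

    return iswalpha(c) != 0;
}
```

The table costs CLASS_CUTOFF entries and one pass of CLASS_CUTOFF library calls to build, regardless of how large the character set is, which is where the storage and build-time savings would come from. The price is a library call per character for anything above the cutoff, and the answers depend on whatever locale is active when the table is built.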
regards, tom lane