From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Cc: | "Dan O'Hara" <danarasoftware(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
Subject: | Re: [BUGS] BUG #5021: ts_parse doesn't recognize email addresses with underscores |
Date: | 2010-03-13 00:48:24 |
Message-ID: | 201003130048.o2D0mOP16522@momjian.us |
Lists: | pgsql-bugs pgsql-hackers |
Teodor Sigaev wrote:
> > Oleg, Teodor, can you look at this? I tried to fix it in wparser_def.c,
> > but couldn't figure out how. Thanks.
> >>
> >> select distinct token as email
> >> from ts_parse('default', ' first_last@yahoo.com ' )
> >> where tokid = 4
>
> Patch in attachment, it allows underscore in the middle of the local part of an
> email address and in the host name (similarly to the '-' character).
Thanks, patch applied.
> I'm not sure about backpatching, because it could break existing search
> configuration.
Agreed. I don't think this warrants backpatching.
Here is the before behavior:
test=> select ts_parse('default', ' first_last@yahoo.com ' );
ts_parse
--------------------
(12," ")
(1,first)
(12,_)
--> (4,last@yahoo.com)
(12," ")
(5 rows)
and the after-patch, fixed behavior:
test=> select ts_parse('default', ' first_last@yahoo.com ' );
ts_parse
--------------------------
(12," ")
--> (4,first_last@yahoo.com)
(12," ")
(3 rows)
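For readers unfamiliar with the numeric token ids in the output above, the following is a minimal sketch (not part of the patch) that joins ts_parse with ts_token_type so each token is shown next to the parser's name for its type. It assumes the built-in 'default' parser and writes the example address with a literal '@' and '.'.

```sql
-- Show each token together with the alias of its token type.
-- In the outputs above, tokid 4 is the email type and tokid 12 is blank/space.
SELECT tokid, t.alias, p.token
FROM ts_parse('default', ' first_last@yahoo.com ') AS p
JOIN ts_token_type('default') AS t USING (tokid);
```

On an unpatched server this should list the address split across several rows; on a patched server it should appear as a single row with the email alias.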
I assume that, because this only expands the pattern space for email
addresses, this patch has no effect on binary upgrades. Is that
correct? Would an email address check on a binary-upgraded tsvector
index not match an email address with underscores? Do we need a warning
in the release notes about this?
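One way to probe the question, as a rough sketch rather than a definitive test: build by hand the tsvector that the pre-patch parser would have stored (the lexemes 'first' and 'last@yahoo.com', taken from the "before" output above) and match it against a query parsed by the running server. The 'simple' configuration and the column alias old_vector_matches are assumptions chosen for illustration, so that no stemming or stop-word removal is involved.

```sql
-- Hand-written tsvector mimicking what the pre-patch parser would have stored.
-- The query side is parsed by whatever parser the running server has.
SELECT $$'first':1 'last@yahoo.com':2$$::tsvector
       @@ plainto_tsquery('simple', 'first_last@yahoo.com')
       AS old_vector_matches;

-- Inspect how the running server breaks the address into query lexemes.
SELECT plainto_tsquery('simple', 'first_last@yahoo.com');
```

If the patched parser reduces the address to a single lexeme, the first query would be expected to return false, which would suggest that affected tsvector columns and indexes need to be rebuilt after an upgrade.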
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do