From: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Gregory Stark <stark(at)enterprisedb(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: tsearch filenames unlikes special symbols and numbers |
Date: | 2007-09-09 04:38:37 |
Message-ID: | Pine.LNX.4.64.0709090817530.2767@sn.sai.msu.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-docs pgsql-hackers |
On Sun, 2 Sep 2007, Tom Lane wrote:
> Gregory Stark <stark(at)enterprisedb(dot)com> writes:
>> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>>> I made it reject all but latin letters, which is the same restriction
>>> that's in place for timezone set filenames. That might be overly
>>> strong, but we definitely have to forbid "." and "/" (and "\" on
>>> Windows). Do we want to restrict it to letters, digits, underscore?
>>> Or does it need to be weaker than that?
>
>> What's the problem with "."?
>
> ../../../../etc/passwd
>
> Possibly we could allow '.' as long as we forbade /, but the other
> trouble with allowing . is that it encourages people to try to specify
> the filetype suffix (as indeed Oleg was doing). I'd prefer to keep the
> suffixes out of the SQL object definitions, with an eye to possibly
> someday migrating all the configuration data inside the database.
> There's a reasonable argument for restricting the names used for these
> things in the SQL definitions to be valid SQL identifiers, so that that
> will work nicely...
So, what's the current policy ? Still a-z, A-Z ? I think we should allow
'.' and prevent '/'. Look, how ugly is our current ispell setup, which
depends on 3 files - stop word list, .dict and .aff.
Right now, I can use something like
CREATE TEXT SEARCH DICTIONARY en_ispell (
TEMPLATE = ispell,
DictFile = englishDict,
AffFile = englishAff,
StopWords = english
);
I'd better use english.dict, english.aff, english.stop, whih is usual for
any user, without dictating user here. We already did a lot of
restrictions.
I hope we won't require special extension like .dict, .aff, since it's
unknown in advance what files will use other dictionaries.
If we allow '.' without '/', then we'd be happy.
I'd remove requirement for extension of stop words list, which looks
rather artificially to me.
Oh, my god, I see we dictate extensions !
STATEMENT: CREATE TEXT SEARCH DICTIONARY en_ispell (
TEMPLATE = ispell,
DictFile = englishDict,
AffFile = englishAff,
StopWords = englishStop
);
ERROR: could not open dictionary file "/usr/local/pgsql-dev/share/tsearch_data/englishdict.dict": No such file or directory
Folk, this is too much ! Now, we dictate extensions '.dict, .affix, .stop',
what else ?
Does it defined by ispell template only, or it's global requirements ?
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
From | Date | Subject | |
---|---|---|---|
Next Message | Oleg Bartunov | 2007-09-09 04:49:32 | Re: tsearch filenames unlikes special symbols and numbers |
Previous Message | Oleg Bartunov | 2007-09-04 16:37:47 | Re: Code examples |
From | Date | Subject | |
---|---|---|---|
Next Message | Oleg Bartunov | 2007-09-09 04:43:19 | ispell dictionary broken in CVS HEAD ? |
Previous Message | Andrew Dunstan | 2007-09-09 04:02:28 | invalidly encoded strings |