From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paesold <mpaesold(at)gmx(dot)at>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: default_text_search_config and expression indexes |
Date: | 2007-07-31 01:25:03 |
Message-ID: | 200707310125.l6V1P3M11437@momjian.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-advocacy pgsql-hackers |
Bruce Momjian wrote:
> We have to decide if we want a GUC default_text_search_config, and if so
> when can it be changed.
>
> Right now there are three ways to create a tsvector (or tsquery)
>
> ::tsvector
> to_tsvector(value)
> to_tsvector(config, value)
>
> (ignoring plainto_tsvector)
>
> Only the last one specifies the configuration. The others use the
> configuration specified by default_text_search_config. (We had an
> previous discussion on what the default value of
> default_text_search_config should be, and it was decided it should be
> set via initdb based on a flag or the locale.)
>
> Now, because most people use a single configuration, they can just set
> default_text_search_config and there is no need to specify the
> configuration name.
>
> However, expression indexes cause a problem here:
>
> http://momjian.us/expire/fulltext/HTML/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX
>
> We recommend that users create an expression index on the column they
> want to do a full text search on, e.g.
>
> CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector(body));
>
> However, the big problem is that the expressions used in expression
> indexes should not change their output based on the value of a GUC
> variable (because it would corrupt the index), but in the case above,
> default_text_search_config controls what configuration is used, and
> hence the output of to_tsvector is changed if default_text_search_config
> changes.
>
> We have a few possible options:
>
> 1) Document the problem and do nothing else.
> 2) Make default_text_search_config a postgresql.conf-only
> setting, thereby making it impossible to change by non-super
> users, or make it a super-user-only setting.
> 3) Remove default_text_search_config and require the
> configuration to be specified in each function call.
>
> If we remove default_text_search_config, it would also make ::tsvector
> casting useless as well.
OK, I just found a case that I think is going to make #3 a requirement
(remove default_text_search_config).
How is a CREATE INDEX ... to_tsvector(col) going to restore from a
pg_dump? I see no way of guaranteeing that the
default_text_search_config is correct on the restore, and in fact I
don't think we have any way of knowing the default_text_search_config
used for the index.
And if we have to require the configuration name in CREATE INDEX, it has
to be used in WHERE, so we might as well just remove the default
capability and always require the configuration name.
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
From | Date | Subject | |
---|---|---|---|
Next Message | Joshua D. Drake | 2007-07-31 02:09:36 | Re: [GENERAL] European users mailing list |
Previous Message | Josh Berkus | 2007-07-30 23:38:28 | LinuxWorldExpo Page is up |
From | Date | Subject | |
---|---|---|---|
Next Message | Joshua D. Drake | 2007-07-31 02:14:21 | Re: Machine available for community use |
Previous Message | Decibel! | 2007-07-31 01:09:46 | Re: ascii() for utf8 |