From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paesold <mpaesold(at)gmx(dot)at>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: default_text_search_config and expression indexes |
Date: | 2007-08-01 21:33:52 |
Message-ID: | 200708012133.l71LXqS03436@momjian.us |
Lists: | pgsql-advocacy pgsql-hackers |
Oleg Bartunov wrote:
> On Tue, 31 Jul 2007, Bruce Momjian wrote:
>
> > Oleg Bartunov wrote:
> >> On Tue, 31 Jul 2007, Bruce Momjian wrote:
> >>
> >>>>> And if we have to require the configuration name in CREATE INDEX, it has
> >>>>> to be used in WHERE, so we might as well just remove the default
> >>>>> capability and always require the configuration name.
> >>>>
> >>>> this is a very rare use case for text searching
> >>>> 1. expression index without configuration name
> >>>> 2. default_text_search_config can be changed by somebody
> >>>
> >>> If you are going to be using the configuration name with the create
> >>> expression index, you have to use it in the WHERE clause (or the index
> >>> doesn't work), and I assume that is 90% of the text search uses. I
> >>> don't see it as rare at all.
> >>
> >> What is the basis of your assumption? In my opinion, it's a very limited
> >> use of text search, because it doesn't support ranking. In 4-5 years
> >> of tsearch2 usage I never used it and I never saw it in the mailing lists.
> >> This is very user-oriented feature and we could probably ask
> >> -general people for their opinion.
> >
> > I doubt 'general' is going to understand the details of merging this
> > into the backend. I assume we have enough people on hackers to decide
> > this.
>
> I mean not the technical details, but the use case. Do they need an expression
> index without ranking, at the cost of the ability to use the default
> configuration in other cases too? My prediction is that people never thought
> about this possibility until we told them about it.
In a choice between expression indexes and default_text_search_config,
there is no question in my mind that expression indexes are more useful.
Lack of default_text_search_config only means you have to specify the
configuration name every time, and can't do casting to a text search
data type.
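As a rough sketch of what "specify the configuration name every time"
means in practice, assuming a hypothetical table docs with a text column
body (both names are made up for illustration) and the two-argument
to_tsvector()/to_tsquery() forms:

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE docs (id serial PRIMARY KEY, body text);

-- Expression index with the configuration named explicitly.
CREATE INDEX docs_body_fts ON docs
    USING gin (to_tsvector('english', body));

-- The WHERE clause repeats the same expression, including the
-- configuration name, so that it matches the indexed expression.
SELECT id
FROM docs
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'index & search');
```

If the query used a different configuration name than the index, the
expressions would no longer match and the index would not be considered.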
> > Are you saying the majority of users have a separate column with a
> > trigger? Does the trigger specify the configuration? I don't see that
> > as a parameter argument to tsvector_update_trigger(). If you reload a
> > pg_dump, what does it use for the configuration?
> >
>
> yes, a separate column with a custom trigger works fine. It's up to you how
> to keep your data current and it's up to you how to write the trigger.
> Our tsvector_update_trigger() is a tsvector_update_trigger_example() !
Well, that is the major problem --- that this is very error-prone,
especially considering that the tsvector_update_trigger() doesn't get it
right either.
> > Why is a separate column better than the index? Just ranking?
>
> ranking + composite documents. I already mentioned that this could be
> rather expensive. Also, having a separate column allows people various
> ways to define what a document is and even change it.
OK, I am confused why an expression index can't use those features if a
separate column can. I realize the index can't store that information,
but why can the code pick it out of a heap column but not run the
function on the heap row to get that information? I assume it is
something that is just hard to implement.
> > The reason the expression index is nice is this feature has to be easy
> > to use for people who are new to full text and even PostgreSQL. Right
> > now /contrib is fine for experts to use, but we want a larger user base
> > for this feature.
>
> I agree here. This was one of the main reasons for our work for 8.3.
> Probably, we should think in another direction - not to curtail tsearch2
> and confuse the rather large existing user base, but to add an ability to
> somehow save the configuration used for creating the *document*
> either implicitly (in expression index, or just gin(text_column)), or
> explicitly (separate column). There is no problem with the index itself !
Agreed. We need to find a way to save the configuration when the output
of a text search function is stored, either in an expression index or
via a trigger into a separate column, but only if we allow the default
configuration to be changed by non-super-users.
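For the separate-column case, one way to avoid depending on the session
default is a custom trigger that spells out the configuration. This is
only a sketch under assumptions: the table docs, the column body_tsv and
the function docs_tsv_update() are hypothetical names, and this is not
the behavior of tsvector_update_trigger() as discussed here.

```sql
-- Hypothetical table with a separate tsvector column.
CREATE TABLE docs (
    id       serial PRIMARY KEY,
    body     text,
    body_tsv tsvector
);

CREATE FUNCTION docs_tsv_update() RETURNS trigger AS $$
BEGIN
    -- The configuration is named explicitly, so the stored value does
    -- not depend on the current default_text_search_config.
    NEW.body_tsv := to_tsvector('pg_catalog.english', coalesce(NEW.body, ''));
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER docs_tsv_trigger
    BEFORE INSERT OR UPDATE ON docs
    FOR EACH ROW EXECUTE PROCEDURE docs_tsv_update();
```

With the name hard-coded in the trigger, a reload from pg_dump would
rebuild the column with the same configuration regardless of who runs
the restore.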
> >
> > Should we hold the patch for 8.4?
>
> Only if we can't agree to say in the docs that implicit usage of the text
> search configuration in the CREATE INDEX command isn't supported. Could we
> leave default_text_search_config for super-users, at least ?
>
> Anyway, let's wait to see what other people say.
The big problem is that not many people have taken the time to fully
understand how full text search works. I hoped that putting the updated
documentation online would help:
http://momjian.us/expire/fulltext/HTML/textsearch.html
but it seems it hasn't.
What we could do is make default_text_search_config super-user-only
and tell users at the start that if default_text_search_config doesn't
match the language they want to use, then they have to read a
documentation section that explains the problem of configuration
mismatches.
The problem with that is that we should be setting
default_text_search_config in the pg_dump output, like we do for
client_encoding, but because it is super-user-only, it will fail for
non-super-user restores.
So, I am back to thinking default_text_search_config isn't going to
work reliably for novice users.
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +