From: | Edwin Groothuis <postgresql(at)mavetju(dot)org> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | Euler Taveira de Oliveira <euler(at)timbira(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org> |
Subject: | Re: [BUGS] BUG #3975: tsearch2 index should not bomb out of 1Mb limit |
Date: | 2008-03-05 20:49:22 |
Message-ID: | 20080305204922.GC67445@k7.mavetju |
Lists: | pgsql-bugs pgsql-patches |
On Wed, Mar 05, 2008 at 10:53:38AM -0500, Bruce Momjian wrote:
> Euler Taveira de Oliveira wrote:
> > Edwin Groothuis wrote:
> >
> > > Ouch. But... since very long words are already not indexed (is the length
> > > configurable anywhere because I don't mind setting it to 50 characters), I
> > > don't think that it should bomb out of this but print a similar warning like
> > > "String only partly indexed".
> > >
> > This is not a bug. I would say it's a limitation. Look at
> > src/include/tsearch/ts_type.h. You could decrease len in WordEntry to 9
> > (512 characters) and increase pos to 22 (4 Mb). Don't forget to update
> > MAXSTRLEN and MAXSTRPOS accordingly.
> >
> > > I'm still trying to determine how big the message it failed on was...
> > >
> > Maybe we should change the "string is too long for tsvector" to "string
> > is too long (%ld bytes, max %ld bytes) for tsvector".
>
> Good idea. I have applied the following patch to report in the error
> message the string length and maximum, like we already do for long
> words:
>
> Old:
> test=> select repeat('a', 3000)::tsvector;
> ERROR: word is too long (3000 bytes, max 2046 bytes)
>
> New:
> test=> select repeat('a ', 3000000)::tsvector;
> ERROR: string is too long for tsvector (1048576 bytes, max 1048575 bytes)
Is it possible to make it a WARNING instead of an ERROR? Right now I get:
NOTICE: word is too long to be indexed
DETAIL: Words longer than 2047 characters are ignored.
when updating the dictionary on a table. That lets the statement continue,
but with some long messages I get:
ERROR: string is too long for tsvector
which is fatal for the whole UPDATE / INSERT statement.
Edwin
--
Edwin Groothuis | Personal website: http://www.mavetju.org
edwin(at)mavetju(dot)org | Weblog: http://www.mavetju.org/weblog/
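
For reference, the limits quoted in this thread can be reproduced with a little bit arithmetic. The sketch below is not the code from src/include/tsearch/ts_type.h. It assumes, based on Euler's suggestion above, that a WordEntry packs one flag bit, a `len` field and a `pos` field into 32 bits, and that MAXSTRLEN and MAXSTRPOS are the largest values those fields can hold. `EntryLayout`, `max_for_bits` and `print_limits` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed split of one 32-bit WordEntry: 1 flag bit + len bits + pos bits. */
typedef struct
{
    const char *label;
    unsigned    len_bits;   /* width of the word-length field */
    unsigned    pos_bits;   /* width of the string-offset field */
} EntryLayout;

/* Largest value an unsigned field of the given width can hold. */
uint32_t
max_for_bits(unsigned bits)
{
    return (UINT32_C(1) << bits) - 1;
}

/* Print the limits implied by a layout; the widths must still fill 32 bits. */
void
print_limits(const EntryLayout *layout)
{
    assert(1 + layout->len_bits + layout->pos_bits == 32);
    printf("%-8s MAXSTRLEN=%lu MAXSTRPOS=%lu\n", layout->label,
           (unsigned long) max_for_bits(layout->len_bits),
           (unsigned long) max_for_bits(layout->pos_bits));
}
```

Calling `print_limits` with widths of 11 and 20 gives 2047 and 1048575, the figures that appear in the NOTICE and ERROR messages above. With the suggested widths of 9 and 22 it gives 511 and 4194303, which is roughly the "512 characters" and "4 Mb" Euler mentions. Under this assumption, every bit added to `pos` has to be taken from `len`, so raising the string limit lowers the maximum indexable word length.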