From: | Andreas Joseph Krogh <andreak(at)officenet(dot)no> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Maximum document-size of text-search? |
Date: | 2010-07-22 13:31:30 |
Message-ID: | 4C484832.1040903@officenet.no |
Lists: | pgsql-general |
Hi.
I'm trying to index the contents of Word documents (the extracted text),
which sometimes leads to quite large documents. This results in the
following exception:
Caused by: org.postgresql.util.PSQLException: ERROR: index row requires 10376 bytes, maximum size is 8191
I have the following schema:
andreak=# \d origo_search_index
                                  Table "public.origo_search_index"
          Column          |       Type        |                            Modifiers
--------------------------+-------------------+-----------------------------------------------------------------
 id                       | integer           | not null default nextval('origo_search_index_id_seq'::regclass)
 entity_id                | integer           | not null
 entity_type              | character varying | not null
 field                    | character varying | not null
 search_value             | character varying | not null
 textsearchable_index_col | tsvector          |
Indexes:
    "origo_search_index_fts_idx" gin (textsearchable_index_col)
Triggers:
    update_search_index_tsvector_t BEFORE INSERT OR UPDATE ON origo_search_index FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger('textsearchable_index_col', 'pg_catalog.english', 'search_value')
I store all the text extracted from the documents in "search_value" and
have the built-in trigger tsvector_update_trigger update the
tsvector-column.
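For reference, the setup described above could be recreated roughly as
follows. This is only a sketch reconstructed from the \d output: it
assumes the id column was declared as serial (which would create the
origo_search_index_id_seq sequence), and it is not necessarily the DDL
that was actually used.

```sql
-- Sketch reconstructed from the \d output; "serial" is an assumption.
CREATE TABLE origo_search_index (
    id                       serial            NOT NULL,
    entity_id                integer           NOT NULL,
    entity_type              character varying NOT NULL,
    field                    character varying NOT NULL,
    search_value             character varying NOT NULL,
    textsearchable_index_col tsvector
);

-- GIN index over the tsvector column, as listed in the \d output.
CREATE INDEX origo_search_index_fts_idx
    ON origo_search_index
    USING gin (textsearchable_index_col);

-- Built-in trigger that rebuilds the tsvector from search_value
-- on every INSERT or UPDATE, using the English configuration.
CREATE TRIGGER update_search_index_tsvector_t
    BEFORE INSERT OR UPDATE ON origo_search_index
    FOR EACH ROW EXECUTE PROCEDURE
    tsvector_update_trigger('textsearchable_index_col',
                            'pg_catalog.english', 'search_value');
```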
Any hints on how to get around this issue to allow indexing large
documents? I don't see how "only index the first N bytes of the
document" would be of interest to anyone...
BTW: I'm using PG-9.0beta3
--
Andreas Joseph Krogh<andreak(at)officenet(dot)no>
Senior Software Developer / CTO
------------------------+---------------------------------------------+
OfficeNet AS | The most difficult thing in the world is to |
Rosenholmveien 25 | know how to do a thing and to watch |
1414 Trollåsen | somebody else doing it wrong, without |
NORWAY | comment. |
| |
Tlf: +47 24 15 38 90 | |
Fax: +47 24 15 38 91 | |
Mobile: +47 909 56 963 | |
------------------------+---------------------------------------------+