From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-committers(at)postgresql(dot)org |
Subject: | pgsql: Cope with more than 64K phrases in a thesaurus dictionary. |
Date: | 2014-11-07 01:53:43 |
Message-ID: | E1XmYkB-0005Fi-Iz@gemulon.postgresql.org |
Lists: | pgsql-committers |
Cope with more than 64K phrases in a thesaurus dictionary.
dict_thesaurus stored phrase IDs in uint16 fields, so it would get confused
and even crash if there were more than 64K entries in the configuration
file. It turns out to be basically free to widen the phrase IDs to uint32,
so let's just do so.
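
To illustrate the failure mode described above, here is a minimal sketch of what happens when a phrase ID is stored in a 16-bit field. The function names are hypothetical and are not taken from dict_thesaurus.c; they only show the wraparound that makes entry 65536 indistinguishable from entry 0.

```c
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only; these are not functions
 * from dict_thesaurus.c.
 */

/* Store a phrase ID in a 16-bit field, as a uint16 field would. */
uint16_t
store_id_narrow(uint32_t phrase_id)
{
    /* Conversion to uint16_t reduces the value modulo 65536. */
    return (uint16_t) phrase_id;
}

/* Store the same ID in a 32-bit field; no information is lost. */
uint32_t
store_id_wide(uint32_t phrase_id)
{
    return phrase_id;
}
```

With the narrow field, IDs 65536 and 65537 come back as 0 and 1, so lookups land on the wrong phrase; the wide field preserves them.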
This was complained of some time ago by David Boutin (in bug #7793);
he later submitted an informal patch but it was never acted on.
We now have another complaint (bug #11901 from Luc Ouellette) so it's
time to make something happen.
This is basically Boutin's patch, but for future-proofing I also added a
defense against too many words per phrase. Note that we don't need any
explicit defense against overflow of the uint32 counters, since before that
happens we'd hit array allocation sizes that repalloc rejects.
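
The two defenses mentioned above can be sketched roughly as follows. This is not the committed code: it assumes, for illustration, that the per-phrase word position lives in a 16-bit field, and it redefines the allocator limit locally (PostgreSQL's MaxAllocSize is 1 GB minus 1 byte). All function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for PostgreSQL's MaxAllocSize (1 GB - 1). */
#define SKETCH_MAX_ALLOC ((uint64_t) 0x3fffffff)

/*
 * Hypothetical: advance a 16-bit word position within a phrase,
 * refusing to wrap around. Returns false when the limit is reached,
 * at which point real code would report an error.
 */
bool
next_word_position(uint16_t *pos)
{
    if (*pos == UINT16_MAX)
        return false;
    (*pos)++;
    return true;
}

/*
 * Hypothetical: would an array of nelems entries, each elem_size bytes,
 * pass an allocator that rejects requests above SKETCH_MAX_ALLOC?
 * The product is computed in 64 bits so it cannot overflow here.
 */
bool
array_alloc_allowed(uint32_t nelems, size_t elem_size)
{
    return (uint64_t) nelems * (uint64_t) elem_size <= SKETCH_MAX_ALLOC;
}
```

Even with one-byte entries, an array approaching 2^32 elements exceeds the 1 GB limit, which is why a 32-bit counter cannot wrap before the allocation itself is refused.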
Back-patch to all supported branches because of the crash risk.
Branch
------
master
Details
-------
http://git.postgresql.org/pg/commitdiff/d6e37b35cda9a88dfd938dd61e9986dd93cc6dd3
Modified Files
--------------
src/backend/tsearch/dict_thesaurus.c | 25 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)