From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: uwe(dot)binder(at)pass-consulting(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18080: to_tsvector fails for long text input
Date: 2023-09-22 17:48:59
Message-ID: 1056584.1695404939@sss.pgh.pa.us
Lists: pgsql-bugs
I wrote:
> Yeah. My thought about blocking the error had been to limit
> prs.lenwords to MaxAllocSize/sizeof(ParsedWord) in this code.
Concretely, as attached. This allows the given test case to
complete, since it doesn't actually create very many distinct
words. In other cases we could expect to fail when the array
has to get enlarged, but that's just a normal implementation
limitation.
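
[Editor's note: a minimal sketch of the kind of bound described above. The `initial_lenwords` helper, the stand-in `ParsedWord` layout, and the divide-by-6 estimate are assumptions for illustration; the real definitions live in the PostgreSQL sources, and the actual change is in the attached patch.]

```c
#include <stdint.h>
#include <stddef.h>

/* Assumption: PostgreSQL's MaxAllocSize is one byte short of 1GB. */
#define MaxAllocSize ((size_t) 0x3fffffff)

/* Stand-in for the real ParsedWord (declared in ts_utils.h); only its
 * size matters for the clamp below. */
typedef struct
{
    uint16_t    len;
    uint16_t    nvariant;
    uint16_t   *apos;
    uint16_t    flags;
    char       *word;
    uint32_t    alen;
} ParsedWord;

/* Sketch: derive an initial lenwords estimate from the input length,
 * then clamp it so lenwords * sizeof(ParsedWord) cannot exceed
 * MaxAllocSize.  Growing the array later can still fail, but only as
 * a normal allocation-limit error, not up front. */
static size_t
initial_lenwords(size_t input_len)
{
    size_t      lenwords = input_len / 6;   /* rough guess, as in the source */

    if (lenwords < 2)
        lenwords = 2;
    else if (lenwords > MaxAllocSize / sizeof(ParsedWord))
        lenwords = MaxAllocSize / sizeof(ParsedWord);
    return lenwords;
}
```

With this bound, even a multi-gigabyte input yields a starting allocation request that palloc-style allocators would accept; the array only grows further if the text really contains that many distinct words.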
I looked for other places that might initialize lenwords
to not-sane values, and didn't find any.
BTW, the field order in ParsedWord is such that there's a fair
amount of wasted pad space on 64-bit builds. I doubt we can
get away with rearranging it in released branches; but maybe
it's worth doing something about that in HEAD, to push out
the point at which you hit the 1GB limit.
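
[Editor's note: an illustrative sketch of the padding effect described above. The two layouts below are hypothetical, not the actual ParsedWord declaration; they only demonstrate how interleaving 16-bit fields with 8-byte pointers wastes space on 64-bit targets.]

```c
#include <stdint.h>
#include <stddef.h>

/* 16-bit fields interleaved with pointer members: on a typical LP64
 * target the compiler inserts padding before each 8-byte-aligned
 * pointer. */
typedef struct
{
    uint16_t    len;
    uint16_t    nvariant;
    uint16_t   *apos;       /* padding inserted above to align pointer */
    uint16_t    flags;
    char       *word;       /* more padding inserted above */
    uint32_t    alen;
} PaddedOrder;

/* Same members, pointers first and small fields grouped at the end:
 * the interior padding disappears. */
typedef struct
{
    uint16_t   *apos;
    char       *word;
    uint32_t    alen;
    uint16_t    len;
    uint16_t    nvariant;
    uint16_t    flags;
} TightOrder;
```

On a common x86-64 ABI the first layout occupies 40 bytes and the second 32, so the same MaxAllocSize budget admits roughly 25% more array entries with the tighter ordering.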
regards, tom lane
Attachment: bound-lenwords-in-to_tsvector_byid.patch (text/x-diff, 534 bytes)