From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: james+postgres(at)carbocation(dot)com, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #14654: With high statistics targets on ts_vector, unexpectedly high memory use & OOM are triggered
Date: 2017-07-12 11:32:11
Message-ID: dc65ba89-46d1-07f2-3f94-51ba00446931@iki.fi
Lists: pgsql-bugs
On 05/14/2017 11:06 PM, james+postgres(at)carbocation(dot)com wrote:
> It seems that ANALYZE on a ts_vector column can consume 300 * (statistics
> target) * (size of data in field), which in my case ended up being well
> above 10 gigabytes. I wonder if this might be considered a bug (either in
> code, or of documentation), as this memory usage seems not to obey other
> limits, or at least wasn't documented in a way that might have helped me
> guess at the underlying problem.
Yes, I can see that happening here too. The problem seems to be that the
analyze function detoasts every row in the sample and keeps all the
detoasted copies around until the end. Tsvectors can be very large, so
those copies add up.
That's pretty easy to fix: the analyze function just needs to free the
detoasted copies as it goes. But in order to do that, it needs to store
copies of the lexemes in the hash table, instead of pointing directly into
the detoasted tsvectors.
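To make that concrete, the per-row loop in compute_tsvector_stats()
(src/backend/tsearch/ts_typanalyze.c) ends up doing roughly what the sketch
below shows. This is only an illustration of the idea, not the attached
patch: the two structs just mirror the ones in that file, null handling and
the frequency bookkeeping are simplified, and the outer variables (stats,
fetchfunc, samplerows, lexemes_tab) are the ones the real function already
has in scope.

/*
 * Simplified sketch of the per-row loop in compute_tsvector_stats().
 * LexemeHashKey and TrackItem mirror the structs in ts_typanalyze.c.
 */
typedef struct
{
	char	   *lexeme;		/* lexeme text, not null-terminated */
	int			length;
} LexemeHashKey;

typedef struct
{
	LexemeHashKey key;		/* hash key: the lexeme itself */
	int			frequency;
} TrackItem;

for (vector_no = 0; vector_no < samplerows; vector_no++)
{
	bool		isnull;
	Datum		value = fetchfunc(stats, vector_no, &isnull);
	TSVector	vector;
	WordEntry  *curentryptr;
	char	   *lexemesptr;
	int			j;

	if (isnull)
		continue;

	/* This may create a detoasted (decompressed) copy of the value. */
	vector = DatumGetTSVector(value);

	lexemesptr = STRPTR(vector);
	curentryptr = ARRPTR(vector);
	for (j = 0; j < vector->size; j++)
	{
		LexemeHashKey hash_key;
		TrackItem  *item;
		bool		found;

		hash_key.lexeme = lexemesptr + curentryptr->pos;
		hash_key.length = curentryptr->len;

		item = (TrackItem *) hash_search(lexemes_tab, &hash_key,
										 HASH_ENTER, &found);
		if (!found)
		{
			/*
			 * The key change: store a palloc'd copy of the lexeme in the
			 * hash entry, so it no longer points into the detoasted
			 * tsvector.
			 */
			item->key.lexeme = palloc(hash_key.length);
			memcpy(item->key.lexeme, hash_key.lexeme, hash_key.length);
			item->frequency = 1;
		}
		else
			item->frequency++;

		curentryptr++;
	}

	/* ... and now the detoasted copy can be freed right away. */
	if ((Pointer) vector != DatumGetPointer(value))
		pfree(vector);
}

The two important pieces are the palloc()/memcpy() when a new lexeme enters
the hash table, and the pfree() of the detoasted tsvector at the bottom of
the outer loop.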
Patch attached. I think this counts as a bug, and we should backport this.
- Heikki
Attachment: reduce-tsvector-analyze-memory-usage.patch (text/x-diff, 2.1 KB)