From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Cc: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Gunther <raj(at)gusw(dot)net>, pgsql-performance(at)lists(dot)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Subject: | Re: Out of Memory errors are frustrating as heck! |
Date: | 2019-04-21 23:07:53 |
Message-ID: | 29762.1555888073@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
> On Sun, Apr 21, 2019 at 10:36:43AM -0400, Tom Lane wrote:
>> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
>>> If a single 32 bit hash value has enough tuples by itself to not fit in
>>> work_mem, then it will keep splitting until that value is in a batch by
>>> itself before shutting off
>> Right, that's the code's intention. If that's not good enough for this
>> case, we'll need to understand the details a bit better before we can
>> design a better(?) heuristic.
> I think we only disable growing when there are no other values in the
> batch, but that seems rather easy to defeat - all you need is a single
> tuple with a hash that falls into the same batch, and it's over. Maybe
> we should make this a bit less accurate - say, if less than 5% memory
> gets freed, don't add more batches.
Yeah, something like that, but it's hard to design it without seeing some
concrete misbehaving examples.
I think though that this is somewhat independent of the problem that we're
not including the I/O buffers in our reasoning about memory consumption.
> An alternative would be spilling the extra tuples into a special
> overflow file, as I explained earlier. That would actually enforce
> work_mem I think.
Well, no, it won't. If you have umpteen gigabytes of RHS tuples with the
exact same hash code, nothing we can do here is going to prevent you from
having to process those in a single table load. (If it's a plain inner
join, maybe you could break that into subsections anyway ... but that
won't work for left or full joins where you need per-tuple match status.)
I think our ambition here should just be to not have the code go crazy
trying to keep its memory consumption under work_mem when it's ultimately
going to fail to do so anyhow.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Gunther | 2019-04-21 23:25:15 | Re: Out of Memory errors are frustrating as heck! |
Previous Message | Adrian Klaver | 2019-04-21 22:13:23 | Re: Backup and Restore (pg_dump & pg_restore) |