Re: PostgreSQL 8.0.6 crash

From: Rick Gigger <rick(at)alpinenetworking(dot)com>
To: "Mark Woodward" <pgsql(at)mohawksoft(dot)com>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PostgreSQL 8.0.6 crash
Date: 2006-02-10 00:30:17
Message-ID: 972FEDD4-E9F1-4C89-8110-60056A4CCCC7@alpinenetworking.com
Lists: pgsql-hackers


On Feb 9, 2006, at 12:49 PM, Mark Woodward wrote:

>> On Thu, Feb 09, 2006 at 02:03:41PM -0500, Mark Woodward wrote:
>>>> "Mark Woodward" <pgsql(at)mohawksoft(dot)com> writes:
>>>>> Again, regardless of OS used, hashagg will exceed "working memory" as
>>>>> defined in postgresql.conf.
>>>>
>>>> So? If you've got OOM kill enabled, it can zap a process whether it's
>>>> strictly adhered to work_mem or not. The OOM killer is entirely capable
>>>> of choosing a victim process whose memory footprint hasn't changed
>>>> materially since it started (eg, the postmaster).
>>>
>>> Sorry, I must strongly disagree here. The postgresql.conf "working mem"
>>> is a VERY IMPORTANT setting, it is intended to limit the consumption of
>>> memory by the postgresql process. Often times PostgreSQL will work along
>>
>> Actually, no, it's not designed for that at all.
>
> I guess that's a matter of opinion.
>
>>
>>> side other application servers on the same system, in fact, may be a
>>> sub-part of application servers on the same system. (This is, in fact,
>>> how it is used on one of my site servers.)
>>>
>>> Clearly, if the server will use 1000 times this number (Set for 1024K,
>>> but exceeds 1G) this is broken, and it may cause other systems to fail or
>>> perform very poorly.
>>>
>>> If it is not something that can be fixed, it should be clearly
>>> documented.
>>
>> work_mem (integer)
>>
>> Specifies the amount of memory to be used by internal sort
>> operations and hash tables before switching to temporary disk files.
>> The value is specified in kilobytes, and defaults to 1024 kilobytes
>> (1 MB). Note that for a complex query, several sort or hash
>> operations might be running in parallel; each one will be allowed to
>> use as much memory as this value specifies before it starts to put
>> data into temporary files. Also, several running sessions could be
>> doing such operations concurrently. So the total memory used could
>> be many times the value of work_mem; it is necessary to keep this
>> fact in mind when choosing the value. Sort operations are used for
>> ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash
>> joins, hash-based aggregation, and hash-based processing of IN
>> subqueries.
>>
>> So it says right there that it's very easy to exceed work_mem by a very
>> large amount. Granted, this is a very painful problem to deal with and
>> will hopefully be changed at some point, but it's pretty clear as to how
>> this works.
>
> Well, if you read that paragraph carefully, I'll admit that I was a little
> too literal in my statement applying it to the "process" and not specific
> functions within the process, but in the documentation:
>
> "each one will be allowed to use as much memory as this value specifies
> before it starts to put data into temporary files."
>
> According to the documentation the behavior of hashagg is broken. It did
> not use up to this amount and then start to use temporary files, it used
> 1000 times this limit and was killed by the OS.
>
> I think it should be documented as the behavior is unpredictable.

It seems to me that the solution for THIS INCIDENT is to run an
ANALYZE. That should fix the problem at hand. I have nothing to say
about the OOM issue except that hopefully the ANALYZE will prevent
him from running out of memory at all.

However if hashagg truly does not obey the limit that is supposed to
be imposed by work_mem then it really ought to be documented. Is
there a misunderstanding here and it really does obey it? Or is
hashagg an exception but the other work_mem associated operations
work fine? Or is it possible for them all to go out of bounds?
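
To make the question concrete, here is a toy sketch in Python of the
behavior the quoted documentation describes: use memory up to a limit,
then switch to temporary files. It is not PostgreSQL's hashagg code and
makes no claim about what 8.0.6 actually does. The function name
bounded_count and the max_groups parameter are hypothetical, and the
limit is counted in groups rather than kilobytes to keep the sketch
short. Keys are assumed to be strings without newlines.

```python
import tempfile


def bounded_count(rows, max_groups):
    """Count occurrences per key while holding at most max_groups keys in memory.

    Rows whose key does not fit in the in-memory table are written to a
    temporary file and aggregated in a later pass. Memory stays bounded;
    the cost is extra disk I/O. Returns (counts, number_of_passes).
    """
    if max_groups < 1:
        raise ValueError("max_groups must be at least 1")

    result = {}
    passes = 0
    source, source_file = iter(rows), None
    while True:
        passes += 1
        counts = {}
        spill = tempfile.TemporaryFile(mode="w+")
        spilled = 0
        for key in source:
            if key in counts:
                counts[key] += 1
            elif len(counts) < max_groups:
                counts[key] = 1
            else:
                # Table is full: defer this row to the next pass.
                spill.write(key + "\n")
                spilled += 1
        if source_file is not None:
            source_file.close()
        # Keys counted in this pass never appear in the spill file,
        # so passes never overlap and a plain update is safe.
        result.update(counts)
        if spilled == 0:
            spill.close()
            return result, passes
        spill.seek(0)
        source, source_file = (line.rstrip("\n") for line in spill), spill


counts, passes = bounded_count(["a", "b", "a", "c", "b", "d", "a"], max_groups=2)
```

With max_groups=2 the seven rows above need two passes: "a" and "b" are
counted in memory, while "c" and "d" go to the temporary file and are
counted in the second pass. An aggregate without the spill branch would
simply keep growing the table, which is the behavior being complained
about in this thread.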

Even if you've got 100 terabytes of swap space, though, it seems like if
your system is very heavy on reads then you would really want that
single backend to start using up your disk space and leave your
memory alone, so that most of your data can stay cached and largely
unaffected by the problem of one backend.

If your bottleneck is writing to the disk then it doesn't really seem
to matter. You just need to make sure that a huge out-of-control
hashagg never occurs. If your disks get saturated with writes
because of the hashagg of one backend, then all other processes that
need to write a lot of info to disk are going to come to a grinding
halt. Queries are not going to complete quickly, they will build up, and
you will have a huge mess on your hands that will essentially prevent
postgres from being able to do its job even if it doesn't actually
die. In this situation disk bandwidth is a scarce commodity, and
whether you let the OS handle it all with virtual memory or you let
postgres swap everything out to disk for that one operation, you are
still using disk to make up for a lack of RAM. At some point
you've either got to stock up on enough RAM to run your queries
properly or alter how your queries run to use less RAM. Having a
process go out of control in resource usage is going to cause big
problems one way or another.
