From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Bill Moran <wmoran(at)collaborativefusion(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Advice on selecting good values for work_mem?
Date: 2006-12-07 18:16:22
Message-ID: 20061207181621.GE24675@kenobi.snowman.net
Lists: pgsql-performance
* Bill Moran (wmoran(at)collaborativefusion(dot)com) wrote:
> What I'm fuzzy on is how to discretely know when I'm overflowing
> work_mem? Obviously, if work_mem is exhausted by a particular
> query, temp files will be created and performance will begin to suck,
I don't believe this is necessarily *always* the case. There are
instances in PostgreSQL where it will just continue to allocate memory
beyond the work_mem setting. This is usually due to poor statistics
(you changed the data in the table dramatically and haven't run analyze,
or you never ran analyze on the table at all, or the statistics
gathering values are set too low to capture enough information about
the data, etc). It would be nice if it was possible to have this detected
and logged, or similar. Additionally, work_mem isn't actually a
per-query thing, aiui, it's more like a per-node in the planner thing.
That is to say that if you have multiple sorts going on, or a sort and a
hash, then *both* of those expect to be able to use up to work_mem
amount of memory.
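As a rough illustration (the table and column names below are made up,
and the plan shapes are just what a planner *might* produce, not output
from a real system), a single query can easily involve several
work_mem-sized nodes at once:

    -- hypothetical schema: customers(id, name), orders(customer_id, total)
    EXPLAIN
    SELECT c.name, sum(o.total)
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY sum(o.total) DESC;
    -- A plan for this could contain a Hash Join (its hash table limited
    -- by work_mem) feeding a HashAggregate or Sort (each also limited by
    -- work_mem), so one backend may legitimately use several multiples
    -- of work_mem for a single query.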
Another point you might want to consider how to handle is that
work_mem has no bearing on libpq, and I don't recall there being a way to
constrain libpq's memory usage. This has been an issue for me just
today when I forgot a couple of parameters on a join, which caused a
Cartesian product result and ended up running the box out of memory.
Sure, it's my fault, and unlikely to happen in an application, but it
still sucks. :) It also managed to run quickly enough that I didn't
notice what was happening. :/ Of course, the server side didn't need
much memory at all to generate that result, but libpq stores
everything in *its* memory before passing it to the client. An example
of this being an issue is psql: you need double the memory of a given
result because the result is first completely grabbed and stored in
libpq and then sent to your pager (e.g. less), which then sucks it all
into memory again. In applications (and I guess psql, though I never
think of it, and it'd be nice to have as a configurable option if it
isn't already...) you can use cursors to limit the amount of memory
libpq uses.
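Something along these lines (the table and cursor names are made up)
keeps only one batch of rows in libpq at a time instead of buffering
the whole result:

    -- fetch a large result in 1000-row batches via a cursor
    BEGIN;
    DECLARE big_cur CURSOR FOR SELECT * FROM big_table;
    FETCH 1000 FROM big_cur;   -- repeat until no more rows come back
    CLOSE big_cur;
    COMMIT;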
As these are new things (both the temp file creation logging and the
work_mem overflow detection, I believe), this discussion is probably
more appropriate for -hackers.
> That leads to my other question. Assuming I've got lots of
> connections (which I do), how can I determine if work_mem is too
> high? Do server processes allocate it even if they don't actually
> use it? Is the only way to find out to reduce it and see when it
> starts to be a problem? If so, that leads back to my first question:
> how can I be sure whether temp files were created or not?
Yeah, look for swappiness... It'd be nice to be able to get memory
statistics on queries which have been run though...
> My goal is to set work_mem as small as is possible for the most
> common queries, then force the developers to use "set work_mem to x"
> to adjust it for big queries.
Sounds like an excellent plan. Be careful though: work_mem settings can
affect query plans, and they may discover that if it's set high enough the
planner will, for example, do a hash join which is much faster than
sorting and merge-joining, but takes a lot of memory... They may say
"hey, I like it being fast" but not consider what happens when a lot of
those queries run at once...
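As a sketch of what the developers would do, something like this raises
work_mem for just the one expensive query and then puts it back (newer
versions accept a unit suffix; older ones take the value in kilobytes,
e.g. SET work_mem = 262144):

    SET work_mem = '256MB';   -- or SET LOCAL inside a transaction so it
                              -- reverts automatically at commit/rollback
    -- ... run the big query here ...
    RESET work_mem;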
Thanks!
Stephen