From: | Fabio Pardi <f(dot)pardi(at)portavita(dot)eu> |
---|---|
To: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: Single query uses all system memory |
Date: | 2018-06-26 10:51:42 |
Message-ID: | 5f160fdf-7fa3-f840-1d5d-b827141dbd45@portavita.eu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi Ivar,
I agree with Magnus. As addition, also i would change:
effective_cache_size
and
maintenance_work_mem
Then disable OOM killer, change logging in order to log files on disk to see if your work_mem is too low and spills on disk, rethink your data structure and, overall, I think that you can have a look to this page, which I find a good starting point for a proper tuning:
https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
regards,
fabio pardi
On 26/06/18 12:13, Magnus Hagander wrote:
>
>
> On Tue, Jun 26, 2018 at 11:40 AM, Ivar Fredriksen <ivar(dot)fredriksen(at)gmail(dot)com <mailto:ivar(dot)fredriksen(at)gmail(dot)com>> wrote:
>
> A single large query is able to spend all the system memory (as reported by top), and the oom_reaper kills postgres. See bottom of email for an example query and logs.
>
>
>
> Expected behavior would be that postgres is not killed and the query is limited to the shared_buffer memory, potentially failing, but not killing the postgres process.
>
>
> Then your expectations are completely wrong. shared_buffers have nothing to do with limiting the memory of a query. shared_buffers set the size of the PostgreSQL cache, not the working set. That's controlled by work_mem (see below).
>
>
>
> Do I have some fundamental flaw in my understanding of this, or is there some sort of configuration that should be in place that is missing? The offending sample query has been
>
> rewritten to not use joins, and will no longer crash postgres. I am not asking anyone to spend a lot of time analyzing the query itself, it is just provided as an example for when the problem will occur.
>
> My question is more in a broader sense why one query is eating through all system memory, and is there anything I can do to prevent this from happening?
>
> We have set shared_buffers to 256MB on a test-system that has 1GB memory. Production machines have more resources, but the errors are happening in exactly the same way, so I assume (perhaps wrongly) that using the test
>
> system specs and logs might give me the answers I need to figure out what is happening.
>
>
>
> Technical details are provided below, a big thanks to anyone who can provide any help or insight to this.
>
>
>
> Regards,
>
> Ivar Fredriksen
>
>
>
> PostgreSQL version number:
>
> PostgreSQL 10.4 (Debian 10.4-2.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit
>
> Installed with the debian packages for postgres found at: deb http://apt.postgresql.org/pub/repos/apt/ <http://apt.postgresql.org/pub/repos/apt/>
>
>
>
> Changes made to the settings in the postgresql.conf file:
>
> name | current_setting | source
>
> ------------------------------+-----------------------------------------+----------------------
>
> application_name | psql | client
>
> checkpoint_completion_target | 0.9 | configuration file
>
> client_encoding | UTF8 | client
>
> cluster_name | 10/main | configuration file
>
> DateStyle | ISO, MDY | configuration file
>
> default_statistics_target | 100 | configuration file
>
> default_text_search_config | pg_catalog.english | configuration file
>
> dynamic_shared_memory_type | posix | configuration file
>
> effective_cache_size | 1536MB | configuration file
>
> external_pid_file | /var/run/postgresql/10-main.pid | configuration file
>
> lc_messages | C | configuration file
>
> lc_monetary | C | configuration file
>
> lc_numeric | en_US.UTF-8 | configuration file
>
> lc_time | C | configuration file
>
> listen_addresses | * | configuration file
>
> log_line_prefix | %m [%p] %q%u(at)%d | configuration file
>
> log_timezone | UTC | configuration file
>
> maintenance_work_mem | 128MB | configuration file
>
> max_connections | 100 | configuration file
>
> max_stack_depth | 2MB | environment variable
>
> max_wal_size | 2GB | configuration file
>
> min_wal_size | 1GB | configuration file
>
> port | 5432 | configuration file
>
> search_path | "$user", public, pg_catalog | configuration file
>
> shared_buffers | 256MB | configuration file
>
> ssl | on | configuration file
>
> ssl_cert_file | /etc/ssl/certs/ssl-cert-snakeoil.pem | configuration file
>
> ssl_key_file | /etc/ssl/private/ssl-cert-snakeoil.key | configuration file
>
> standard_conforming_strings | off | configuration file
>
> stats_temp_directory | /var/run/postgresql/10-main.pg_stat_tmp | configuration file
>
> tcp_keepalives_idle | 0 | configuration file
>
> TimeZone | UTC | configuration file
>
> unix_socket_directories | /var/run/postgresql | configuration file
>
> wal_buffers | 16MB | configuration file
>
> work_mem | 1310kB | configuration file
>
> (35 rows)
>
>
>
> Operating system and version:
>
> Debian GNU/Linux 9.4 (stretch)
>
> Linux ip-172-31-30-104 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1 (2018-04-29) x86_64 GNU/Linux
>
>
>
> Machine specs:
>
> CPU: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
>
> Memory: 1GB
>
> HDD: General purpose SSD from amazon ec2, only one disk
>
>
> With only 1GB of memory, your value for work_mem is probably too high for a query like yours.
>
> Even with 1MB, that's 1MB per join. You seem to have about 350 joins? That would be 350Mb right there. Plus a lot of more overhead.
>
> In general, what's up with 350 joins?! That seems to be, ahem, quite insane.
>
> For a query like that you probably have to reduce everything to bare minimums -- I'd try with shared_buffers at maybe 64MB and work_mem at 100kB and see if it passes. (it'll probably still be very slow, but that's to be expected from a query like that)
>
> But I'd sum it up as (1) don't do 350 joins. Just don't. And (2) if you have to, then a machine with just 1Gb is simply too small.
>
>
> //Magnus
>
From | Date | Subject | |
---|---|---|---|
Next Message | Saurabh Agrawal | 2018-06-26 11:49:01 | Re: We find few queries running three times simultaneously with same parameters on postgres db |
Previous Message | Andreas Kretschmer | 2018-06-26 10:32:44 | Re: We find few queries running three times simultaneously with same parameters on postgres db |