From: | "Anand Kumar, Karthik" <Karthik(dot)AnandKumar(at)classmates(dot)com> |
---|---|
To: | Venkata Balaji Nagothi <vbnpgc(at)gmail(dot)com>, "Anand Kumar, Karthik" <Karthik(dot)AnandKumar(at)classmates(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Increase in max_connections |
Date: | 2014-03-11 17:10:15 |
Message-ID: | CF448DD7.E5179%karthik.anandkumar@memorylane.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
No errors in the logs, except when we hit max_connections
No shared memory problems – no associated spike in I/O or system CPU indicating shared memory is either unused or over used. Sufficient memory in cache/buffers, zero swapping or anything indicative of a memory problem.
The box is pretty beefy – 24 core, 768G RAM :) - so yes, an effective cache of 568GB is normal, we arrived at it with months of tuning over time.
cpu_tuple_cost of 0.03 – yes, a lot of our settings are tweaked from the defaults based on performance. I don't have the output now, the the 0.03 was based on recommendations from posrgtes user groups, and via testing with setting it up and running explain analyze on queries. None of the settings have changed when this problem began.
Thanks,
Karthik
From: Venkata Balaji Nagothi <vbnpgc(at)gmail(dot)com<mailto:vbnpgc(at)gmail(dot)com>>
Date: Monday, March 10, 2014 7:35 PM
To: "Anand Kumar, Karthik" <Karthik(dot)AnandKumar(at)classmates(dot)com<mailto:Karthik(dot)AnandKumar(at)classmates(dot)com>>
Cc: "pgsql-general(at)postgresql(dot)org<mailto:pgsql-general(at)postgresql(dot)org>" <pgsql-general(at)postgresql(dot)org<mailto:pgsql-general(at)postgresql(dot)org>>
Subject: Re: [GENERAL] Increase in max_connections
On Tue, Mar 11, 2014 at 12:04 PM, Anand Kumar, Karthik <Karthik(dot)AnandKumar(at)classmates(dot)com<mailto:Karthik(dot)AnandKumar(at)classmates(dot)com>> wrote:
Hi all,
We're running postgres 9.3.2, server configuration below.
Seemingly randomly, we will see the number of active queries in postgres go up until we hit max_connections. The DB will recover after a few minutes.
We had the issue a couple of times in Feb 2014. We then upgraded the postgres server from 9.1 to 9.3.2, and the occurrence has gone up significantly – to several times a day.
The user CPU goes up as well to a 100%, no increase in I/O or system CPU.
We have slow query logging, and there is no dramatic change in the slow queries either.
There is a corresponding spike in shared locks, but that seems to be an effect not a cause – it corresponds to an increase in the number of running processes at the time.
We had a similar issue in the past – that was solved by disabling transparent_huge_pages – but the difference there was that we'd see queries slow down dramatically. Currently, we don't. Also, transparent_huge_pages is still disabled.
I do realize the issue would be caused by a spurt in incoming connections – we do not yet have conclusive evidence on whether that's happening (active queries climbs up, however no conclusive proof on whether thats because of slow down, or because of increase in traffic). Working on getting the information, will update with that information as soon as we have it.
I thought I'd send a post out to the group before then, to see if anyone has run into anything similar.
Thanks,
Karthik
site=# SELECT version();
PostgreSQL 9.3.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit
site=# SELECT name, current_setting(name), source
site-# FROM pg_settings
site-# WHERE source NOT IN ('default', 'override');
application_name|psql|client
archive_command|/usr/bin/archiver.sh %f %p|configuration file
archive_mode|on|configuration file
autovacuum_freeze_max_age|250000000|configuration file
autovacuum_max_workers|6|configuration file
bgwriter_lru_maxpages|1000|configuration file
bgwriter_lru_multiplier|4|configuration file
checkpoint_completion_target|0.8|configuration file
checkpoint_segments|250|configuration file
checkpoint_timeout|15min|configuration file
checkpoint_warning|6min|configuration file
client_encoding|UTF8|client
commit_siblings|25|configuration file
cpu_tuple_cost|0.03|configuration file
DateStyle|ISO, MDY|configuration file
default_statistics_target|300|configuration file
default_text_search_config|pg_catalog.english|configuration file
effective_cache_size|568GB|configuration file
fsync|on|configuration file
lc_messages|en_US.UTF-8|configuration file
lc_monetary|en_US.UTF-8|configuration file
lc_numeric|en_US.UTF-8|configuration file
lc_time|en_US.UTF-8|configuration file
listen_addresses|*|configuration file
log_autovacuum_min_duration|0|configuration file
log_checkpoints|on|configuration file
log_connections|on|configuration file
log_destination|syslog|configuration file
log_directory|pg_log|configuration file
log_filename|postgresql-%a.log|configuration file
log_line_prefix|user=%u,db=%d,ip=%h |configuration file
log_min_duration_statement|100ms|configuration file
log_min_messages|debug1|configuration file
log_rotation_age|1d|configuration file
log_rotation_size|0|configuration file
log_timezone|US/Pacific|configuration file
log_truncate_on_rotation|on|configuration file
logging_collector|off|configuration file
maintenance_work_mem|1GB|configuration file
max_connections|1500|configuration file
max_locks_per_transaction|1000|configuration file
max_stack_depth|2MB|environment variable
max_wal_senders|5|configuration file
port|5432|command line
random_page_cost|2|configuration file
shared_buffers|8GB|configuration file
synchronous_commit|off|configuration file
syslog_facility|local0|configuration file
syslog_ident|postgres|configuration file
TimeZone|US/Pacific|configuration file
vacuum_freeze_table_age|0|configuration file
wal_buffers|32MB|configuration file
wal_keep_segments|250|configuration file
wal_level|hot_standby|configuration file
wal_sync_method|fsync|configuration file
work_mem|130MB|configuration file
Please let us know your hardware configuration like RAM, CPU (cores) etc.
Do you see any messages indicating any processes getting terminated/killed forcibly in the Postgresql logs ?
Or do you see any shared memory related error messages ?
cpu_tuple_cost=0.03 - which is not default, any reasons for increasing this.
effective_cache_size = 568 GB - Please help us know if this is optimal for your system.
Venkata Balaji N
Sr. Database Administrator
Fujitsu Australia
From | Date | Subject | |
---|---|---|---|
Next Message | Anand Kumar, Karthik | 2014-03-11 17:20:36 | Re: Increase in max_connections |
Previous Message | Jeff Janes | 2014-03-11 16:23:02 | Re: Increase in max_connections |