From:       hubert depesz lubaczewski <depesz(at)depesz(dot)com>
To:         pgsql-bugs(at)postgresql(dot)org
Subject:    PostgreSQL (9.3 and 9.6) eats all memory when using many tables
Date:       2016-06-09 15:46:10
Message-ID: 20160609154610.GA20600@depesz.com
Lists:      pgsql-bugs
Hi,
We are running 9.3.10 in production, but I tested on 9.6 as well, and the
problem is there too, though to a much lesser extent.
In our database we have ~ 70000 tables (~ 180 tables in ~ 400 schemas).
So far we have used the application in such a way that each connection could
use only tables from a single schema.
But then we switched to a setup where a single connection (very long-lived,
as it's reused thanks to pgbouncer) can effectively query tables from
all schemas.
This caused memory usage to explode, to the point that our server
wasn't able to handle it (64 GB of memory gone).
I checked logs and other stuff, and came up with a synthetic test, using
Perl:
=======================================================
#!/usr/bin/env perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect("dbi:Pg:dbname=depesz");

# Report memory usage of the backend we're connected to.
my $backend_pid = $dbh->selectall_arrayref("select pg_backend_pid()")->[0]->[0];
system("ps nh uww -p $backend_pid");

# List of all ordinary tables in the database.
my $t = [ map { $_->[0] } @{ $dbh->selectall_arrayref("SELECT oid::regclass from pg_class where relkind = 'r'") } ];
my $len = scalar @{ $t };

my $i = 0;
while (1) {
    my $use_table = $t->[ $i % $len ];
    my $limit     = 1 + ( $i % 3 );
    # Query text differs on every iteration, cycling over all tables.
    $dbh->prepare("select ${i}::int4 as a, ?::int4 as a, * from $use_table limit $limit")->execute( $i );
    $i++;
    if ( 0 == $i % 1000 ) {
        print "$i:";
        system("ps nh uww -p $backend_pid");
    }
}
=======================================================
This effectively runs:
select <i>, <i>, * from <table> limit <1..3>
once per table, cycling over all tables, with slightly different query text
on every iteration.
On 9.3, after ~ 35000 tables, the backend process grew by 1 GB, and it wasn't
shared buffers, as smaps showed that the memory was anonymous.
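For reference, this is roughly how the "anonymous" part can be verified (a
minimal sketch, assuming Linux with /proc mounted; $backend_pid is the one
from the script above):
=======================================================
# Sum anonymous memory across all mappings of the backend (values are in kB).
open my $smaps, '<', "/proc/$backend_pid/smaps" or die "cannot open smaps: $!";
my $anon_kb = 0;
while ( my $line = <$smaps> ) {
    $anon_kb += $1 if $line =~ /^Anonymous:\s+(\d+)\s+kB/;
}
close $smaps;
print "anonymous memory: $anon_kb kB\n";
=======================================================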
In 9.6, after 35000 tables it grew by ~ 71 MB, which is much better, but
still not really optimal.
The same thing happens when I am *not* using server-side prepared statements.
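(To be explicit about what I mean by that, in DBD::Pg terms it is just
turning off server-side prepares, i.e. something like:)
=======================================================
# Disable server-side PREPARE in DBD::Pg; queries are then sent as plain SQL.
$dbh->{pg_server_prepare} = 0;
=======================================================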
Basically it looks like PostgreSQL "caches" something for queries - plans?
parsed elements? - but doesn't put any kind of limit on the size of this
cache. Which means that if our app uses LOTS of different queries, the memory
usage will grow over time.
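If it would help to see which memory context is actually growing, I think
attaching gdb to the backend and dumping memory context stats should show it
(a sketch only; needs gdb and debug symbols, and the output goes to the
backend's stderr / server log):
=======================================================
# Dump per-memory-context usage of the backend to its stderr / server log.
system("gdb -p $backend_pid -batch -ex 'call MemoryContextStats(TopMemoryContext)'");
=======================================================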
This, plus the fact that this cache is not shared between backends, means
that with a non-trivial number of tables and a non-trivial number of
backends, it will use all of the server's memory as soon as enough different
queries/tables get used.
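Back-of-envelope, using the 9.3 numbers above: ~ 1 GB per 35000 tables means
a backend that eventually touches all ~ 70000 tables grows by roughly 2 GB,
and with (say) 30 long-lived server connections from pgbouncer that is
~ 60 GB of unshared cache - which matches the 64 GB that disappeared on our
server.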
What can we do about it, aside from having fewer tables and moving to a
newer Pg?
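The only stopgap I can think of (just a sketch of the idea, not something we
have settled on) is to recycle connections every so often, so that each
backend - and its caches - is thrown away before it grows too much; on the
pooler side pgbouncer's server_lifetime would do that, and in the test script
above the equivalent would be something like:
=======================================================
# Every N statements, drop the connection so the backend (and its caches) dies.
if ( 0 == $i % 10_000 ) {
    $dbh->disconnect;
    $dbh = DBI->connect("dbi:Pg:dbname=depesz");
}
=======================================================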
Best regards,
depesz
--
The best thing about modern society is how easy it is to avoid contact with it.
http://depesz.com/