Re: Perf decreased although server is better

From: Will Platnick <wplatnick(at)gmail(dot)com>
To: "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Perf decreased although server is better
Date: 2016-11-04 13:27:14
Message-ID: CAJDzJM75MEw_FJr6uYrrTQGeesvzL_UK2U2qC9Ud3XnG-jy-gA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

My guess would be that your server upgrade wasn't the upgrade you thought
it was.

You network latency could definitely be the cause of most of this. The
problem is you're not measuring this from the server side. It's not only
going to impact connect time, but you're going to get your data a bit
slower as well. I'm assuming your pgbouncer is installed on the postgres
box. Do you have the pgbouncer addon installed? If you're looking to
isolate network transfer, pgbouncer should give the statistics like avg
query time to see if they line up with what you're seeing in your APM.

Also, you reduced your number of cores, which is a big problem because
basically you just cut your max query capacity in half. Assuming a
dedicated box, each CPU can only process up to one postgres query at a
time. Previously, you could process up to 8 queries simultaneously, whereas
now you can only do 4. Now, since most of your queries are probably in the
ms, that can still be quite a bit of queries in a second time frame and you
may never hit 4 going at the same exact time, but without seeing your
pgbouncer config, this may actually be happening based on all the idle
connections you're seeing.

If ALL your connections are coming from a single pgbouncer locally on the
postgres box, then you can use your server resources better by setting
max_connections to 6 (the number of cores + one or two more for you to
connect locally via pgsql). Then, set your default_pool_size in pgbouncer
to 4 (number of cores) and reserve_pool_size to 0, and restart. This will
keep the number of sessions limited to what your box is actually capable of
doing and will help you avoid loading postgres down with more than it's
capable of doing, which can make a bad situation worse.

My recommendation would be to go back to the old server if it's available.
If not, get a new one in the same data center as your web servers with at
least 8 cores to put you back where you were.

On Fri, Nov 4, 2016 at 7:55 AM Benjamin Toueg <btoueg(at)gmail(dot)com> wrote:

> I've noticed a network latency increase. Ping between web server and
> database : 0.6 ms avg before, 5.3 ms avg after -- it wasn't that big 4 days
> ago :(
>
> I've narrowed my investigation to one particular "Transaction" in terms of
> the NewRelic APM. It's basically the main HTTP request of my application.
>
> Looks like the ping impacts psycopg2:connect (see http://imgur.com/a/LDH1c)
> 4 ms up to 16 ms on average.
>
> That I can understand. However, I don't understand the performance
> decrease of the select queries on table1 (see
> https://i.stack.imgur.com/QaUqy.png) 80 ms up to 160 ms on average
>
> Same goes for table 2 (see http://imgur.com/a/CnETs) 4 ms up to 20 ms on
> average
>
> However, there is a commit in my request, and it performs better (see
> http://imgur.com/a/td8Dc) 12 ms down to 6 ms on average.
>
> I don't see how this can be due to network latency!
>
> I will provide a new bonnie++ benchmark when the requests per minute is at
> the lowest (remember I can only run benchmarks while the server is in use).
>
> Rick, what did you mean by kernel configuration? The OS is a standard
> Ubuntu 16.04:
>
> - Linux 4.4.0-45-generic #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016
> x86_64 x86_64 x86_64 GNU/Linux
>
> Do you think losing half the number of cores can explain my performance
> issue ? (AMD 8 cores down to Haswell 4 cores).
>
> Best Regards,
>
> Benjamin
>
> PS : I've edited the SO post
> http://serverfault.com/questions/812702/posgres-perf-decreased-although-server-is-better
>
> 2016-11-04 1:05 GMT+01:00 Kevin Grittner <kgrittn(at)gmail(dot)com>:
>
> On Thu, Nov 3, 2016 at 9:51 AM, Benjamin Toueg <btoueg(at)gmail(dot)com> wrote:
> >
> > Stream gives substantially better results with the new server
> (before/after)
>
> Yep, the new server can access RAM at about twice the speed of the old.
>
> > I've run "bonnie++ -u postgres -d /tmp/ -s 4096M -r 1096" on both
> > machines. I don't know how to read bonnie++ results (before/after)
> > but it looks quite the same, sometimes better for the new,
> > sometimes better for the old.
>
> On most metrics the new machine looks better, but there are a few
> things that look potentially problematic with the new machine: the
> new machine uses about 1.57x the CPU time of the old per block
> written sequentially ((41 / 143557) / (16 / 87991)); so if the box
> becomes CPU starved, you might notice writes getting slower than on
> the new box. Also, several of the latency numbers are worse -- in
> some cases far worse. If I'm understanding that properly, it
> suggests that while total throughput from a number of connections
> may be better on the new machine, a single connection may not run
> the same query as quickly. That probably makes the new machine
> better for handling an OLTP workload from many concurrent clients,
> but perhaps not as good at cranking out a single big report or
> running dump/restore.
>
> Yes, it is quite possible that the new machine could be faster at
> some things and slower at others.
>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Rick Otten 2016-11-04 13:29:58 Re: Perf decreased although server is better
Previous Message Benjamin Toueg 2016-11-04 11:53:49 Re: Perf decreased although server is better