From: | Mark Wong <mark(at)2ndQuadrant(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Rui DeSousa <rui(at)crazybean(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Matteo Beccati <php(at)beccati(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Torsten Zuehlsdorff <mailinglists(at)toco-domains(dot)de>, Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Marko Tiikkaja <marko(at)joh(dot)to>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com> |
Subject: | Re: [HACKERS] kqueue |
Date: | 2020-01-29 19:16:52 |
Message-ID: | 20200129191652.GA18356@2ndQuadrant.com |
Lists: | pgsql-hackers |
On Sat, Jan 25, 2020 at 11:29:11AM +1300, Thomas Munro wrote:
> On Thu, Jan 23, 2020 at 9:38 AM Rui DeSousa <rui(at)crazybean(dot)net> wrote:
> > On Jan 22, 2020, at 2:19 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> It's certainly possible that to see any benefit you need stress
> >> levels above what I can manage on the small box I've got these
> >> OSes on. Still, it'd be nice if a performance patch could show
> >> some improved performance, before we take any portability risks
> >> for it.
>
> You might need more than one CPU socket, or at least lots more cores
> so that you can create enough contention. That was needed to see the
> regression caused by commit ac1d794 on Linux[1].
>
> > Here is two charts comparing a patched and unpatched system.
> > These systems are very large and have just shy of thousand
> > connections each with averages of 20 to 30 active queries concurrently
> > running at times including hundreds if not thousand of queries hitting
> > the database in rapid succession. The effect is the unpatched system
> > generates a lot of system load just handling idle connections where as
> > the patched version is not impacted by idle sessions or sessions that
> > have already received data.
>
> Thanks. I can reproduce something like this on an Azure 72-vCPU
> system, using pgbench -S -c800 -j32. The point of those settings is
> to have many backends, but they're all alternating between work and
> sleep. That creates a stream of poll() syscalls, and system time goes
> through the roof (all CPUs pegged, but it's ~half system). Profiling
> the kernel with dtrace, I see the most common stack (by a long way) is
> in a poll-related lock, similar to a profile Rui sent me off-list from
> his production system. Patched, there is very little system time and
> the TPS number goes from 539k to 781k.
>
> [1] https://www.postgresql.org/message-id/flat/CAB-SwXZh44_2ybvS5Z67p_CDz%3DXFn4hNAD%3DCnMEF%2BQqkXwFrGg%40mail.gmail.com

Just to add some data...

I tried the kqueue v14 patch on an AWS EC2 m5a.24xlarge (96 vCPU) running
FreeBSD 12.1, driving the load from an m5.8xlarge (32 vCPU) CentOS 7 system.
I also used pgbench with a scale factor of 1000, with -S -c800 -j32.

Comparing pg 12.1 vs 13-devel (30012a04):
* TPS increased from ~93,000 to ~140,000, a ~50% increase
* system time dropped from ~78% to ~70%, a decrease of ~8 percentage points
* user time increased from ~16% to ~23%, an increase of ~7 percentage points
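
Relative change and percentage-point change are easy to conflate in a summary like this one. The following is a minimal sketch of the arithmetic, using only the rounded figures quoted in the list above; the function and variable names are illustrative, and results computed from rounded inputs may differ from percentages derived from the raw measurements.

```python
# Rounded figures quoted in the list above (pg 12.1 vs 13-devel).
tps_before, tps_after = 93_000, 140_000
sys_before, sys_after = 78.0, 70.0    # system time, % of CPU
user_before, user_after = 16.0, 23.0  # user time, % of CPU


def relative_change(before: float, after: float) -> float:
    """Percent change relative to the 'before' value."""
    return (after - before) / before * 100.0


def point_change(before: float, after: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return after - before


# TPS is a raw rate, so a relative change is the natural measure.
tps_gain = relative_change(tps_before, tps_after)
# CPU shares are already percentages, so report the difference in points.
sys_delta = point_change(sys_before, sys_after)
user_delta = point_change(user_before, user_after)

print(f"TPS: {tps_gain:+.1f}% relative")
print(f"system time: {sys_delta:+.0f} points")
print(f"user time: {user_delta:+.0f} points")
```

TPS is reported as a relative change because it is a raw rate, while the CPU shares are compared in percentage points because they are already fractions of the same total.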

I don't have any profile data, but I've attached a couple of charts showing
processor utilization over a 15-minute interval on the database
system.

Regards,
Mark
--
Mark Wong
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/
Attachment | Content-Type | Size |
---|---|---|
 | image/png | 24.4 KB |
 | image/png | 24.4 KB |