Re: Parallel Select query performance and shared buffers

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Metin Doslu <metin(at)citusdata(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Select query performance and shared buffers
Date: 2013-12-04 03:57:18
Message-ID: CAA4eK1+RaLg9dfici05QL_r=G5uZ3Zi=BGJ8C0pVCJ9o9V_s+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On Tue, Dec 3, 2013 at 7:11 PM, Metin Doslu <metin(at)citusdata(dot)com> wrote:
> We have several independent tables on a multi-core machine serving Select
> queries. These tables fit into memory; and each Select queries goes over one
> table's pages sequentially. In this experiment, there are no indexes or
> table joins.
>
> When we send concurrent Select queries to these tables, query performance
> doesn't scale out with the number of CPU cores. We find that complex Select
> queries scale out better than simpler ones. We also find that increasing the
> block size from 8 KB to 32 KB, or increasing shared_buffers to include the
> working set mitigates the problem to some extent.
>
> For our experiments, we chose an 8-core machine with 68 GB of memory from
> Amazon's EC2 service. We installed PostgreSQL 9.3.1 on the instance, and set
> shared_buffers to 4 GB.
>
> We then generated 1, 2, 4, and 8 separate tables using the data generator
> from the industry standard TPC-H benchmark. Each table we generated, called
> lineitem-1, lineitem-2, etc., had about 750 MB of data.
I think all of this data cannot fit in shared_buffers, you might
want to increase shared_buffers
to larger size (not 30GB but close to your data size) to see how it behaves

> Next, we sent 1, 2,
> 4, and 8 concurrent Select queries to these tables to observe the scale out
> behavior. Our expectation was that since this machine had 8 cores, our run
> times would stay constant all throughout. Also, we would have expected the
> machine's CPU utilization to go up to 100% at 8 concurrent queries. Neither
> of those assumptions held true.

You queries have Aggregation, ORDER/GROUP BY, so there is a chance
that I/O can happen for those operation's
if PG doesn't have sufficient memory (work_mem) to perform such operation.

> As a quick side, we also repeated the same experiment on an EC2 instance
> with 16 CPU cores, and found that the scale out behavior became worse there.
> (We also tried increasing the shared_buffers to 30 GB. This change
> completely solved the scaling out problem on this instance type, but hurt
> our performance on the hi1.4xlarge instances.)

Instead of 30GB, you can try with lesser value, but it should be close
to your data size.

> Unfortunately, increasing the block size from 8 to 32 KB has other
> implications for some of our customers. Could you help us out with the
> problem here?
>
> What can we do to identify the problem's root cause? Can we work around it?

I think without finding the real cause, it would be difficult to get
the reasonable workaround.
Can you simplify your queries (simple scan or in other words no
aggregation or other things) to see how
they behave in your env., once you are able to see simple queries
scaling as per your expectation, you
can try with complex one's.

Note - post this on pgsql-performance as well.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message KONDO Mitsumasa 2013-12-04 04:08:38 Re: Why we are going to have to go DirectIO
Previous Message Tatsuo Ishii 2013-12-04 02:28:29 Re: Why we are going to have to go DirectIO

Browse pgsql-performance by date

  From Date Subject
Next Message Claudio Freire 2013-12-04 05:10:20 Re: Parallel Select query performance and shared buffers
Previous Message Jan Michel 2013-12-03 20:18:39 Re: One query run twice in parallel results in huge performance decrease