Re: Very poor read performance, query independent

From: Charles Nadeau <charles(dot)nadeau(at)gmail(dot)com>
To: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>, "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Very poor read performance, query independent
Date: 2017-07-15 17:53:56
Message-ID: CADFyZw6S+vzHPvGnYMMQ4WcvcGkdkDX=bdo48knBmQcq_fpFFA@mail.gmail.com
Lists: pgsql-performance

Mark,

I increased the read-ahead to 16384 and it doesn't improve performance. My
RAID 0 array uses a stripe size of 256kB, the maximum supported by the
controller.
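
For reference, the read-ahead change was along these lines (a sketch only:
the array shows up as /dev/sda here, which will differ per system; the value
is in 512-byte sectors, so 16384 = 8MiB, and it resets at reboot):

sudo blockdev --getra /dev/sda       # show current read-ahead (512-byte sectors)
sudo blockdev --setra 16384 /dev/sda # set read-ahead to 16384 sectors (8MiB)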
Thanks!

Charles

On Sat, Jul 15, 2017 at 1:02 AM, Mark Kirkwood <
mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> wrote:

> Ah yes - that seems more sensible (but still slower than I would expect
> for a 5-disk RAID 0). You should be able to get something like 5 * (single
> disk speed), i.e. about 500MB/s.
>
> Might be worth increasing device read-ahead (more than you have already).
> Some of these so-called 'smart' RAID cards need to be hit over the head
> before they will perform. E.g. I believe you have it set to 128 - I'd try
> 4096 or even 16384 (in the past I've used those settings on some extremely
> stupid cards that refused to max out their disks' known speeds).
>
> Also worth investigating is RAID stripe size - for DW work it makes sense
> for it to be reasonably big (256K to 1M), which again will help speed in
> sequential scans.
>
> Cheers
>
> Mark
>
> P.S. I used to work for Greenplum, so this type of problem came up a lot
> :-) . The best cards were the LSI and Areca!
>
>
>
> On 15/07/17 02:09, Charles Nadeau wrote:
>
>> Mark,
>>
>> First I must say that I changed my disk configuration from 4 disks in
>> RAID 10 to 5 disks in RAID 0 because I almost ran out of disk space during
>> the last ingest of data.
>> Here is the result of the test you asked for. It was done with a cold cache:
>>
>> flows=# \timing
>> Timing is on.
>> flows=# explain select count(*) from flows;
>>                                             QUERY PLAN
>> -----------------------------------------------------------------------------------------------
>>  Finalize Aggregate  (cost=17214914.09..17214914.09 rows=1 width=8)
>>    ->  Gather  (cost=17214914.07..17214914.09 rows=1 width=8)
>>          Workers Planned: 1
>>          ->  Partial Aggregate  (cost=17213914.07..17213914.07 rows=1 width=8)
>>                ->  Parallel Seq Scan on flows  (cost=0.00..17019464.49 rows=388899162 width=0)
>> (5 rows)
>>
>> Time: 171.835 ms
>> flows=# select pg_relation_size('flows');
>> pg_relation_size
>> ------------------
>> 129865867264
>> (1 row)
>>
>> Time: 57.157 ms
>> flows=# select count(*) from flows;
>> LOG: duration: 625546.522 ms statement: select count(*) from flows;
>> count
>> -----------
>> 589831190
>> (1 row)
>>
>> Time: 625546.662 ms
>>
>> The throughput reported by PostgreSQL is almost 198MB/s (129865867264
>> bytes read in 625.5 seconds), and the throughput as measured by dstat
>> during the query execution was between 25 and 299MB/s. It is much better
>> than what I had before! The I/O wait was about 12% all through the query.
>> One thing I noticed is the discrepancy between the read throughput reported
>> by pg_activity and the one reported by dstat: pg_activity always reports a
>> value lower than dstat.
>>
>> Besides the change of disk configuration, here is what contributed the
>> most to the improvement in performance so far (a sketch of applying these
>> settings follows the list):
>>
>> Using huge pages
>> Increasing effective_io_concurrency to 256
>> Reducing random_page_cost from 22 to 4
>> Reducing min_parallel_relation_size to 512kB to have more workers
>> when doing a parallel sequential scan of my biggest table
>>
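>> A minimal sketch of how those settings can be applied (assumes a
>> superuser; ALTER SYSTEM writes to postgresql.auto.conf, huge_pages only
>> takes effect after a server restart, and huge pages must also be reserved
>> in the OS via vm.nr_hugepages):
>>
>> flows=# ALTER SYSTEM SET huge_pages = 'try';
>> flows=# ALTER SYSTEM SET effective_io_concurrency = 256;
>> flows=# ALTER SYSTEM SET random_page_cost = 4;
>> flows=# ALTER SYSTEM SET min_parallel_relation_size = '512kB';
>> flows=# SELECT pg_reload_conf();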
>>
>> Thanks for recommending this test, I now know what the real throughput
>> should be!
>>
>> Charles
>>
>> On Wed, Jul 12, 2017 at 4:11 AM, Mark Kirkwood <
>> mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> wrote:
>>
>> Hmm - how are you measuring that sequential scan speed of 4MB/s?
>> I'd recommend doing a very simple test, e.g. here's one on my
>> workstation - a 13 GB single table on 1 SATA drive - cold cache
>> after reboot, sequential scan using Postgres 9.6.2:
>>
>> bench=# EXPLAIN SELECT count(*) FROM pgbench_accounts;
>>                                      QUERY PLAN
>> --------------------------------------------------------------------------------------
>>  Aggregate  (cost=2889345.00..2889345.01 rows=1 width=8)
>>    ->  Seq Scan on pgbench_accounts  (cost=0.00..2639345.00 rows=100000000 width=0)
>> (2 rows)
>>
>>
>> bench=# SELECT pg_relation_size('pgbench_accounts');
>> pg_relation_size
>> ------------------
>> 13429514240
>> (1 row)
>>
>> bench=# SELECT count(*) FROM pgbench_accounts;
>> count
>> -----------
>> 100000000
>> (1 row)
>>
>> Time: 118884.277 ms
>>
>>
>> So doing the math, the sequential read speed is about 110MB/s (i.e. 13 GB
>> in 120 sec). Sure enough, while I was running the query, iostat showed:
>>
>> Device:  rrqm/s  wrqm/s     r/s    w/s   rMB/s  wMB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
>> sda        0.00    0.00  926.00   0.00  114.89   0.00   254.10     1.90   2.03    2.03    0.00   1.08 100.00
>>
>>
>> So it might be useful for us to see something like that from your
>> system - note you need to check that you really have flushed the cache
>> (e.g. see below), and that no other apps are using the db.
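>>
>> E.g. on Linux, something like this flushes the OS page cache (an
>> illustration only, needs root; restart Postgres as well if you want
>> shared_buffers cold too):
>>
>> sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'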
>>
>> regards
>>
>> Mark
>>
>>
>> On 12/07/17 00:46, Charles Nadeau wrote:
>>
>> After reducing random_page_cost to 4 and testing more, I can
>> report that the aggregate read throughput for parallel
>> sequential scan is about 90MB/s. However, the throughput for a
>> plain sequential scan is still around 4MB/s.
>>
>>
>>
>>
>>
>> --
>> Charles Nadeau Ph.D.
>> http://charlesnadeau.blogspot.com/
>>
>
>

--
Charles Nadeau Ph.D.
http://charlesnadeau.blogspot.com/
