Re: Very poor read performance, query independent

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Charles Nadeau <charles(dot)nadeau(at)gmail(dot)com>
Cc: Igor Neyman <ineyman(at)perceptron(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Very poor read performance, query independent
Date: 2017-07-17 20:56:23
Message-ID: CAGTBQpY+0eUrRJjpYB+DN5gvTBK+nARyxMkiTdLi1UpP0kX--Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, Jul 14, 2017 at 12:34 PM, Charles Nadeau
<charles(dot)nadeau(at)gmail(dot)com> wrote:
> Workers Planned: 12
> Workers Launched: 12
> Buffers: shared hit=728798037 read=82974833
> -> Hash Semi Join
> (cost=30059688.07..47951761.31 rows=220376 width=20) (actual
> time=1268845.181..2007864.725 rows=7057357 loops=13)
> Hash Cond: (flows.dstaddr =
> flows_1.dstaddr)
> Buffers: shared hit=728795193
> read=82974833
> -> Nested Loop
> (cost=0.03..17891246.86 rows=220376 width=20) (actual time=0.207..723790.283
> rows=37910370 loops=13)
> Buffers: shared hit=590692229
> read=14991777
> -> Parallel Seq Scan on flows
> (cost=0.00..16018049.14 rows=55094048 width=20) (actual
> time=0.152..566179.117 rows=45371630 loops=13)
> Buffers: shared
> hit=860990 read=14991777
> -> Index Only Scan using
> mynetworks_ipaddr_idx on mynetworks (cost=0.03..0.03 rows=1 width=8)
> (actual time=0.002..0.002 rows=1 loops=589831190)
> Index Cond: (ipaddr >>=
> (flows.srcaddr)::ip4r)
> Heap Fetches: 0
> Buffers: shared
> hit=589831203

12 workers on a parallel sequential scan on a RAID-10 volume of
rotating disks may not be a good idea.

Have you measured average request size and average wait times with iostat?

Run "iostat -x -m -d 60" while running the query and copy a few
relevant lines (or attach the whole thing). I suspect 12 parallel
sequential scans are degrading your array's performance to random I/O
performance, and that explains the 10MB/s very well (a rotating disk
will give you about 3-4MB/s at random I/O, and you've got 2 mirrors on
that array).

You could try setting the max_parallel_workers_per_gather to 2, which
should be the optimum allocation for your I/O layout.

You might also want to test switching to the deadline scheduler. While
the controller may get more aggregate thoughput rearranging your I/O
requests, high I/O latency will severly reduce postgres' ability to
saturate the I/O system itself, and deadlines tends to minimize
latency. I've had good results in the past using deadline, but take
this suggestion with a grain of salt, YMMV.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Charles Nadeau 2017-07-18 09:20:35 Re: Very poor read performance, query independent
Previous Message Charles Nadeau 2017-07-17 11:22:47 Re: Very poor read performance, query independent