Re: Parallel Seq Scan

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, John Gorman <johngorman2(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-01-30 01:22:00
Message-ID: 54CADCB8.20702@BlueTreble.com
Lists: pgsql-hackers

On 1/28/15 7:27 PM, Stephen Frost wrote:
> * Jim Nasby (Jim(dot)Nasby(at)BlueTreble(dot)com) wrote:
>> On 1/28/15 9:56 AM, Stephen Frost wrote:
>>> Such i/o systems do exist, but a single RAID5 group over spinning rust
>>> with a simple filter isn't going to cut it with a modern CPU- we're just
>>> too darn efficient to end up i/o bound in that case. A more complex
>>> filter might be able to change it over to being more CPU bound than i/o
>>> bound and produce the performance improvements you're looking for.
>>
>> Except we're nowhere near being IO efficient. The vast difference between Postgres IO rates and dd shows this. I suspect that's because we're not giving the OS a list of IO to perform while we're doing our thing, but that's just a guess.
> Uh, huh? The dd was ~321000 and the slowest uncached PG run from
> Robert's latest tests was 337312.554, based on my inbox history at
> least. I don't consider ~4-5% difference to be vast.

Sorry, I was speaking more generally than this specific test. In the past I've definitely seen SeqScan performance that was an order of magnitude slower than what dd would do. This was an older version of Postgres and an older version of Linux, running on an iSCSI SAN. My suspicion is that the added IO latency imposed by iSCSI is what was causing this, but that's just conjecture.

I think Robert was saying that he hasn't been able to see this effect on their test server... that makes me think the OS there is doing read-ahead for us. But I suspect it's pretty touch and go to rely on that; I'd prefer we have some way to explicitly get that behavior where we want it.
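
To make "explicitly get that behavior" a bit more concrete, here is a rough sketch of what I mean by handing the OS a list of upcoming IO. It is not how Postgres reads relations and it is untested as a benchmark; it only illustrates the idea using the POSIX posix_fadvise() call with POSIX_FADV_WILLNEED, via Python's os module on a Unix-like system. The function name scan_with_prefetch, the prefetch_blocks parameter and the 8192-byte block size are assumptions made up for the example.

```python
import os

BLOCK_SIZE = 8192  # assumed block size for this sketch


def scan_with_prefetch(path, prefetch_blocks=32, block_size=BLOCK_SIZE):
    """Read a file sequentially while hinting upcoming blocks to the kernel.

    Returns the total number of bytes read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        # posix_fadvise is not available on every platform; skip hints if absent.
        can_advise = hasattr(os, "posix_fadvise")
        hinted_upto = 0  # byte offset up to which hints have been issued
        offset = 0
        while offset < size:
            # Keep the hinted window prefetch_blocks ahead of the current block.
            want = min(size, offset + (prefetch_blocks + 1) * block_size)
            if can_advise and want > hinted_upto:
                os.posix_fadvise(fd, hinted_upto, want - hinted_upto,
                                 os.POSIX_FADV_WILLNEED)
                hinted_upto = want
            chunk = os.pread(fd, block_size, offset)
            if not chunk:
                break
            total += len(chunk)  # stand-in for the per-block filter work
            offset += len(chunk)
    finally:
        os.close(fd)
    return total
```

The hint is advisory only, so the kernel is free to ignore it; whether it helps at all on something like a high-latency iSCSI setup would need to be measured rather than assumed.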
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
