From: | Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, John Gorman <johngorman2(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel Seq Scan |
Date: | 2015-01-26 21:48:39 |
Message-ID: | 54C6B637.9050408@BlueTreble.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 1/23/15 10:16 PM, Amit Kapila wrote:
> Further, if we want to just get the benefit of parallel I/O, then
> I think we can get that by parallelising partition scan where different
> table partitions reside on different disk partitions, however that is
> a matter of separate patch.
I don't think we even have to go that far.
My experience with Postgres is that it is *very* sensitive to IO latency (not bandwidth). I believe this is the case because complex queries tend to interleave CPU intensive code in-between IO requests. So we see this pattern:
Wait 5ms on IO
Compute for a few ms
Wait 5ms on IO
Compute for a few ms
...
We blindly assume that the kernel will magically do read-ahead for us, but I've never seen that work so great. It certainly falls apart on something like an index scan.
If we could instead do this:
Wait for first IO, issue second IO request
Compute
Already have second IO request, issue third
...
We'd be a lot less sensitive to IO latency.
I wonder what kind of gains we would see if every SeqScan in a query spawned a worker just to read tuples and shove them in a queue (or shove a pointer to a buffer in the queue). Similarly, have IndexScans have one worker reading the index and another worker taking index tuples and reading heap tuples...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2015-01-26 21:59:14 | Re: proposal: plpgsql - Assert statement |
Previous Message | Andres Freund | 2015-01-26 21:35:55 | Re: New CF app deployment |