From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel Seq Scan vs kernel read ahead |
Date: | 2020-06-19 02:10:17 |
Message-ID: | CAApHDvq+mXCDE61qEWHLBCOVxHQMaF1S_Z8vhU_KsvhAowg+5w@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, 19 Jun 2020 at 11:34, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Fri, 19 Jun 2020 at 03:26, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > On Thu, Jun 18, 2020 at 6:15 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > > With a 32TB relation, the code will make the chunk size 16GB. Perhaps
> > > I should change the code to cap that at 1GB.
> >
> > It seems pretty hard to believe there's any significant advantage to a
> > chunk size >1GB, so I would be in favor of that change.
>
> I could certainly make that change. With the standard page size, 1GB
> is 131072 pages and a power of 2. That would change for non-standard
> page sizes, so we'd need to decide if we want to keep the chunk size a
> power of 2, or just cap it exactly at whatever number of pages 1GB is.
>
> I'm not sure how much of a difference it'll make, but I also just want
> to note that synchronized scans can mean we'll start the scan anywhere
> within the table, so capping to 1GB does not mean we read an entire
> extent. It's more likely to span 2 extents.
Here's a patch which caps the maximum chunk size at 131072 blocks. If
someone doubles the page size then that'll be 2GB instead of 1GB. I'm
not personally worried about that.
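The arithmetic behind these figures can be checked with a small sketch. This is not the code from the attached patch: the function names are hypothetical, and the divisor of 2048 chunks is an assumption inferred from the "32TB relation gives a 16GB chunk" figure quoted above. Only the 131072-block cap and the page sizes come from the message itself.

```python
BLCKSZ = 8192              # standard PostgreSQL page size, in bytes
MAX_CHUNK_BLOCKS = 131072  # cap described in the message above
TARGET_CHUNKS = 2048       # assumption: inferred from "32TB -> 16GB chunk"


def next_power_of_2(n):
    """Smallest power of 2 that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


def chunk_size_blocks(nblocks, cap=MAX_CHUNK_BLOCKS):
    """Chunk size in blocks: a power of 2, optionally capped.

    Pass cap=None to see the uncapped value.
    """
    chunk = next_power_of_2(max(nblocks // TARGET_CHUNKS, 1))
    return chunk if cap is None else min(chunk, cap)


if __name__ == "__main__":
    gib = 2 ** 30
    nblocks_32tb = 32 * 2 ** 40 // BLCKSZ  # blocks in a 32TB relation

    uncapped = chunk_size_blocks(nblocks_32tb, cap=None)
    capped = chunk_size_blocks(nblocks_32tb)

    print("uncapped chunk:", uncapped * BLCKSZ // gib, "GB")
    print("capped chunk at 8kB pages:", capped * BLCKSZ // gib, "GB")
    print("capped chunk at 16kB pages:", capped * 2 * BLCKSZ // gib, "GB")
```

Because the cap is expressed in blocks rather than bytes, the chunk stays a power of 2 for any page size, at the cost of the byte size scaling with the page size.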
I tested the performance on a Windows 10 laptop using the test case from [1].
Master:
workers=0: Time: 141175.935 ms (02:21.176)
workers=1: Time: 316854.538 ms (05:16.855)
workers=2: Time: 323471.791 ms (05:23.472)
workers=3: Time: 321637.945 ms (05:21.638)
workers=4: Time: 308689.599 ms (05:08.690)
workers=5: Time: 289014.709 ms (04:49.015)
workers=6: Time: 267785.270 ms (04:27.785)
workers=7: Time: 248735.817 ms (04:08.736)
Patched:
workers=0: Time: 155985.204 ms (02:35.985)
workers=1: Time: 112238.741 ms (01:52.239)
workers=2: Time: 105861.813 ms (01:45.862)
workers=3: Time: 91874.311 ms (01:31.874)
workers=4: Time: 92538.646 ms (01:32.539)
workers=5: Time: 93012.902 ms (01:33.013)
workers=6: Time: 94269.076 ms (01:34.269)
workers=7: Time: 90858.458 ms (01:30.858)
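For anyone who wants the two runs side by side, here is a small sketch that turns the timings above into master/patched ratios. The timings are copied from the lists above; the names `MASTER_MS`, `PATCHED_MS` and `speedups` are just illustrative.

```python
# Timings in milliseconds, copied from the results listed above.
MASTER_MS = {
    0: 141175.935, 1: 316854.538, 2: 323471.791, 3: 321637.945,
    4: 308689.599, 5: 289014.709, 6: 267785.270, 7: 248735.817,
}
PATCHED_MS = {
    0: 155985.204, 1: 112238.741, 2: 105861.813, 3: 91874.311,
    4: 92538.646, 5: 93012.902, 6: 94269.076, 7: 90858.458,
}


def speedups(before, after):
    """Return before/after per worker count.

    A ratio above 1.0 means the 'after' run was faster.
    """
    return {workers: before[workers] / after[workers]
            for workers in sorted(before)}


if __name__ == "__main__":
    for workers, ratio in speedups(MASTER_MS, PATCHED_MS).items():
        print(f"workers={workers}: {ratio:.2f}x")
```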
David
Attachment | Content-Type | Size |
---|---|---|
bigger_io_chunks_for_parallel_seqscan_v2.patch | application/x-patch | 10.2 KB |