Re: Parallel Seq Scan vs kernel read ahead

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan vs kernel read ahead
Date: 2020-06-22 22:50:16
Message-ID: CAApHDvpyh1Ab4OkkzNpXAyCCeCzXjSN6y7v7Kr36mx7AvmGS3A@mail.gmail.com
Lists: pgsql-hackers

On Tue, 23 Jun 2020 at 09:52, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Fri, Jun 19, 2020 at 2:10 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > Here's a patch which caps the maximum chunk size to 131072. If
> > someone doubles the page size then that'll be 2GB instead of 1GB. I'm
> > not personally worried about that.
>
> I wonder how this interacts with the sync scan feature. It has a
> conflicting goal...

Of course, syncscan relies on subsequent scanners finding buffers
cached, either in (ideally) shared buffers or the kernel cache. The
scans need to be roughly synchronised for that to work. If we go and
make the chunk size too big, then that'll reduce the chances of useful
buffers being found by subsequent scans. It sounds like a good reason
to try to find the smallest chunk size that allows readahead to work
well. The benchmarks I did on Windows certainly show that there are
diminishing returns when the chunk size gets larger, so capping it at
some number of megabytes would probably be a good idea. It would just
take a bit of work to figure out how many megabytes that should be.
Likely it's going to depend on the size of shared buffers and how much
memory the machine has got, but also what other work is going on that
might be evicting buffers at the same time. Perhaps something in the
range of 2-16MB would be ok. I can do some tests with that and see if
I can get the same performance as with the larger chunks.
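
For reference, the arithmetic behind these figures can be sketched as
below. This is only an illustration, not code from the patch: it assumes
the default 8kB block size, and the helper names chunk_bytes and
chunk_blocks are made up for the example.

```python
# Illustrative only: assumes PostgreSQL's default block size of 8192 bytes.
BLCKSZ = 8192
MB = 1024 * 1024


def chunk_bytes(nblocks: int, block_size: int = BLCKSZ) -> int:
    """Bytes covered by a chunk of nblocks blocks (hypothetical helper)."""
    return nblocks * block_size


def chunk_blocks(cap_bytes: int, block_size: int = BLCKSZ) -> int:
    """Whole blocks that fit under a byte cap, never fewer than one
    (hypothetical helper)."""
    return max(1, cap_bytes // block_size)


if __name__ == "__main__":
    # The cap in the quoted patch: 131072 blocks.
    print("131072 blocks at 8kB :", chunk_bytes(131072) // MB, "MB")
    print("131072 blocks at 16kB:", chunk_bytes(131072, 16384) // MB, "MB")

    # The range floated above, expressed in 8kB blocks.
    for cap_mb in (2, 4, 8, 16):
        print(f"{cap_mb:>2} MB cap -> {chunk_blocks(cap_mb * MB)} blocks")
```

With 8kB blocks, a 2-16MB cap works out to 256-2048 blocks per chunk,
compared with the 131072 blocks (1GB) of the current cap.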

David
