From: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
---|---|
To: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Subject: | Re: Streaming read-ready sequential scan code |
Date: | 2024-03-08 21:56:47 |
Message-ID: | 20240308215647.6afddqsimrrbuwwk@liskov |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Mar 02, 2024 at 06:07:48PM -0500, Melanie Plageman wrote:
> On Wed, Feb 28, 2024 at 12:30 PM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> >
> > On Mon, Feb 26, 2024 at 03:56:57PM -0500, Melanie Plageman wrote:
> > > On Mon, Feb 19, 2024 at 6:05 PM Melanie Plageman
> > > <melanieplageman(at)gmail(dot)com> wrote:
> > > >
> > > > On Mon, Jan 29, 2024 at 4:17 PM Melanie Plageman
> > > > <melanieplageman(at)gmail(dot)com> wrote:
> > > > >
> > > > > There is an outstanding question about where to allocate the
> > > > > PgStreamingRead object for sequential scans
> > > >
> > > > I've written three alternative implementations of the actual streaming
> > > > read user for sequential scan which handle the question of where to
> > > > allocate the streaming read object and how to handle changing scan
> > > > direction in different ways.
> > > >
> > > > Option A) https://github.com/melanieplageman/postgres/tree/seqscan_pgsr_initscan_allocation
> > > > - Allocates the streaming read object in initscan(). Since we do not
> > > > know the scan direction at this time, if the scan ends up not being a
> > > > forwards scan, the streaming read object must later be freed -- so
> > > > this will sometimes allocate a streaming read object it never uses.
> > > > - Only supports ForwardScanDirection and once the scan direction
> > > > changes, streaming is never supported again -- even if we return to
> > > > ForwardScanDirection
> > > > - Must maintain a "fallback" codepath that does not use the streaming read API
> > >
> > > Attached is a version of this patch which implements a "reset"
> > > function for the streaming read API which should be cheaper than the
> > > full pg_streaming_read_free() on rescan. This can easily be ported to
> > > work on any of my proposed implementations (A/B/C). I implemented it
> > > on A as an example.
> >
> > Attached is the latest version of this patchset -- rebased in light of
> > Thomas' updatees to the streaming read API [1]. I have chosen the
> > approach I think we should go with. It is a hybrid of my previously
> > proposed approaches.
>
> While investigating some performance concerns, Andres pointed out that
> the members I add to HeapScanDescData in this patch push rs_cindex and
> rs_ntuples to another cacheline and introduce a 4-byte hole. Attached
> v4's HeapScanDescData is as well-packed as on master and its members
> are reordered so that rs_cindex and rs_ntuples are back on the second
> cacheline of the struct's data.
I did some additional profiling and realized that dropping the
unlikely() from the places we check rs_inited frequently was negatively
impacting performance. v5 adds those back and also makes a few other
very minor cleanups.
Note that this patch set has a not yet released version of Thomas
Munro's Streaming Read API with a new ramp-up logic which seems to fix a
performance issue I saw with my test case when all of the sequential
scan's blocks are in shared buffers. Once he sends the official new
version, I will rebase this and point to his explanation in that thread.
- Melanie
Attachment | Content-Type | Size |
---|---|---|
v5-0001-Split-heapgetpage-into-two-parts.patch | text/x-diff | 7.9 KB |
v5-0002-Replace-blocks-with-buffers-in-heapgettup-control.patch | text/x-diff | 7.8 KB |
v5-0003-fixup-Streaming-Read-API.patch | text/x-diff | 56.5 KB |
v5-0004-Add-pg_streaming_read_reset.patch | text/x-diff | 1.9 KB |
v5-0005-Sequential-scans-and-TID-range-scans-stream-reads.patch | text/x-diff | 7.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Nathan Bossart | 2024-03-08 22:03:22 | Re: vacuumdb/clusterdb/reindexdb: allow specifying objects to process in all databases |
Previous Message | Jeff Davis | 2024-03-08 21:22:30 | Re: Proposal to add page headers to SLRU pages |