| From: | Andres Freund <andres(at)anarazel(dot)de> | 
|---|---|
| To: | Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers(at)postgresql(dot)org | 
| Subject: | Re: pgsql: Use pre-fetching for ANALYZE | 
| Date: | 2022-06-03 02:35:11 | 
| Message-ID: | 20220603023511.ug6nkj2oyho7lovc@alap3.anarazel.de | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-committers pgsql-hackers | 
Hi,
On 2022-06-02 19:30:16 -0700, Andres Freund wrote:
> On 2021-03-16 18:48:08 +0000, Stephen Frost wrote:
> > Use pre-fetching for ANALYZE
> > 
> > When we have posix_fadvise() available, we can improve the performance
> > of an ANALYZE by quite a bit by using it to inform the kernel of the
> > blocks that we're going to be asking for.  Similar to bitmap index
> > scans, the number of buffers pre-fetched is based off of the
> > maintenance_io_concurrency setting (for the particular tablespace or,
> > if not set, globally, via get_tablespace_maintenance_io_concurrency()).
> 
> I just looked at this as part of debugging a crash / hang in the AIO patch.
> 
> The code does:
> 
> 		block_accepted = table_scan_analyze_next_block(scan, targblock, vac_strategy);
> 
> #ifdef USE_PREFETCH
> 
> 		/*
> 		 * When pre-fetching, after we get a block, tell the kernel about the
> 		 * next one we will want, if there's any left.
> 		 *
> 		 * We want to do this even if the table_scan_analyze_next_block() call
> 		 * above decides against analyzing the block it picked.
> 		 */
> 		if (prefetch_maximum && prefetch_targblock != InvalidBlockNumber)
> 			PrefetchBuffer(scan->rs_rd, MAIN_FORKNUM, prefetch_targblock);
> #endif
> 
> I.e. we lock a buffer and *then* we prefetch another buffer. That seems like a
> quite bad idea to me. Why are we doing IO while holding a content lock, if we
> can avoid it?
It also seems decidedly not great from a layering POV to do the IO in
analyze.c. There's no guarantee that the tableam maps blocks in a way that's
compatible with PrefetchBuffer().  Yes, the bitmap heap scan code does
something similar, but a) that is opt in by the AM, b) there's a comment
saying it's quite crufty and should be fixed.
Greetings,
Andres Freund
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Amit Kapila | 2022-06-03 02:51:56 | pgsql: Ignore heap rewrites for materialized views in logical replicati | 
| Previous Message | Andres Freund | 2022-06-03 02:30:16 | Re: pgsql: Use pre-fetching for ANALYZE | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | David Rowley | 2022-06-03 03:13:43 | Re: PG15 beta1 sort performance regression due to Generation context change | 
| Previous Message | Andres Freund | 2022-06-03 02:30:16 | Re: pgsql: Use pre-fetching for ANALYZE |