From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: BTScanOpaqueData size slows down tests
Date: 2025-04-02 15:20:58
Message-ID: kgz63a4hp6s22egd47mlgngkjsz44t6wgojzlzi67zgrx2mzl3@dntq6nrahdgr
Lists: pgsql-hackers
Hi,
I was a bit annoyed at test times just now, so I ran a profile of the entire
regression test run in a cassert -Og build.
Unsurprisingly most of the time is spent in AllocSetCheck(). I was mildly
surprised to see how expensive the new compact attribute checks are.
What I was more surprised to realize is how much of the time is spent in
freeing and allocating BTScanOpaqueData.
+ 6.94% postgres postgres [.] AllocSetCheck
- 4.96% postgres libc.so.6 [.] __memset_evex_unaligned_erms
- 1.94% memset(at)plt
- 1.12% _int_malloc
- 1.11% malloc
- 0.90% AllocSetAllocLarge
- AllocSetAlloc
- 0.77% palloc
- 0.63% btbeginscan
- index_beginscan_internal
- 0.63% index_beginscan
- 0.61% systable_beginscan
+ 0.22% SearchCatCacheMiss
+ 0.07% ScanPgRelation
+ 0.05% RelationBuildTupleDesc
+ 0.04% findDependentObjects
0.03% GetNewOidWithIndex
+ 0.02% deleteOneObject
+ 0.02% shdepDropDependency
+ 0.02% DeleteComments
+ 0.02% SearchCatCacheList
+ 0.02% DeleteSecurityLabel
+ 0.02% DeleteInitPrivs
+ 0.04% text_to_cstring
+ 0.02% cstring_to_text_with_len
+ 0.02% datumCopy
+ 0.02% tuplesort_begin_batch
+ 0.11% palloc_extended
+ 0.01% AllocSetRealloc
+ 0.20% AllocSetAllocFromNewBlock
+ 0.82% _int_free_merge_chunk
- 1.90% __memset_evex_unaligned_erms
- 1.82% wipe_mem
- 1.33% AllocSetFree
- 1.33% pfree
+ 0.73% btendscan
+ 0.22% freedfa
+ 0.06% ExecAggCopyTransValue
+ 0.04% freenfa
+ 0.03% enlarge_list
+ 0.03% ExecDropSingleTupleTableSlot
+ 0.02% xmlconcat
+ 0.01% RemoveLocalLock
+ 0.01% errcontext_msg
+ 0.01% IndexScanEnd
0.01% heap_free_minimal_tuple
+ 0.49% AllocSetReset
0.02% palloc0
0.01% PageInit
+ 0.01% wipe_mem
+ 0.59% alloc_perturb
+ 0.46% asm_exc_page_fault
+ 0.03% asm_sysvec_apic_timer_interrupt
+ 0.02% wipe_mem
Looking at the size of BTScanOpaqueData I am less surprised:
/* --- cacheline 1 boundary (64 bytes) --- */
char * currTuples; /* 64 8 */
char * markTuples; /* 72 8 */
int markItemIndex; /* 80 4 */
/* XXX 4 bytes hole, try to pack */
BTScanPosData currPos __attribute__((__aligned__(8))); /* 88 13632 */
/* --- cacheline 214 boundary (13696 bytes) was 24 bytes ago --- */
BTScanPosData markPos __attribute__((__aligned__(8))); /* 13720 13632 */
/* size: 27352, cachelines: 428, members: 17 */
/* sum members: 27340, holes: 4, sum holes: 12 */
/* forced alignments: 2, forced holes: 1, sum forced holes: 4 */
/* last cacheline: 24 bytes */
} __attribute__((__aligned__(8)));
Allocating, zeroing and freeing 28kB of memory for every syscache miss, yeah,
that's gonna hurt.
The reason BTScanPosData is that large is that it stores an array of
MaxTIDsPerBTreePage * sizeof(BTScanPosItem) bytes:
BTScanPosItem items[1358] __attribute__((__aligned__(2))); /* 48 13580 */
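For reference, those numbers are consistent with a 10-byte item: ItemPointerData (6 bytes) plus OffsetNumber and LocationIndex (2 bytes each), and 1358 * 10 = 13580. A quick mock of that layout (assumed field sizes, not the actual nbtree.h definitions):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mock of BTScanPosItem, mirroring the assumed member sizes:
 * ItemPointerData (6 bytes) + OffsetNumber (2) + LocationIndex (2).
 * All members are 2-byte aligned, so there is no padding.
 */
typedef struct BTScanPosItemMock
{
	uint16_t	heapTid[3];		/* ItemPointerData: 3 x 2 bytes */
	uint16_t	indexOffset;	/* OffsetNumber */
	uint16_t	tupleOffset;	/* LocationIndex */
} BTScanPosItemMock;
```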
Could we perhaps allocate BTScanPosData->items dynamically if more than a
handful of items are needed?
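A minimal sketch of that idea (toy types and hypothetical names, not actual nbtree code): keep a small inline buffer for the common case and switch to a heap allocation only when a page actually has many items.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TIDS_PER_PAGE 1358	/* stands in for MaxTIDsPerBTreePage */
#define INLINE_ITEMS 16			/* small inline buffer; tuning knob */

typedef struct ScanPosItem
{
	unsigned short offset;
} ScanPosItem;

/* Hypothetical layout: a few items inline, the full array only on demand. */
typedef struct ScanPos
{
	int			nitems;
	ScanPosItem *items;			/* points at inline_items or a heap array */
	ScanPosItem inline_items[INLINE_ITEMS];
} ScanPos;

static void
scanpos_init(ScanPos *pos)
{
	pos->nitems = 0;
	pos->items = pos->inline_items;	/* common case: no allocation at all */
}

/* Grow to the full per-page capacity only when a page actually needs it. */
static void
scanpos_ensure_capacity(ScanPos *pos, int needed)
{
	if (needed > INLINE_ITEMS && pos->items == pos->inline_items)
	{
		pos->items = malloc(sizeof(ScanPosItem) * MAX_TIDS_PER_PAGE);
		memcpy(pos->items, pos->inline_items, sizeof(pos->inline_items));
	}
}

static void
scanpos_release(ScanPos *pos)
{
	if (pos->items != pos->inline_items)
		free(pos->items);
	pos->items = pos->inline_items;
	pos->nitems = 0;
}
```

With something like this, beginscan only zeroes the small inline buffer, and the 13kB array is paid for only by scans that return many TIDs per page.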
And/or perhaps we could allocate BTScanOpaqueData.markPos as a whole
only when mark/restore is actually used?
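The mark/restore idea could look roughly like this (again a toy sketch with mock types and hypothetical names): keep currPos embedded, but turn markPos into a pointer that is allocated on the first mark.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in; the real BTScanPosData is ~13.6kB. */
typedef struct ScanPosData
{
	int			itemIndex;
	char		items[13580];
} ScanPosData;

/* Hypothetical variant of BTScanOpaqueData with markPos out of line. */
typedef struct ScanOpaque
{
	ScanPosData currPos;		/* always needed, stays embedded */
	ScanPosData *markPos;		/* NULL until the scan is actually marked */
} ScanOpaque;

static void
scan_begin(ScanOpaque *so)
{
	memset(&so->currPos, 0, sizeof(so->currPos));
	so->markPos = NULL;			/* beginscan no longer pays for mark state */
}

static void
scan_mark(ScanOpaque *so)
{
	if (so->markPos == NULL)
		so->markPos = malloc(sizeof(ScanPosData));	/* first mark pays once */
	*so->markPos = so->currPos;
}

static void
scan_end(ScanOpaque *so)
{
	free(so->markPos);			/* free(NULL) is a no-op */
	so->markPos = NULL;
}
```

Since most scans never call markpos, that would halve the struct for the common case at the cost of one extra allocation for scans that do mark.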
I'd be rather unsurprised if this is an issue not just for tests, but also for
a few real workloads.
Greetings,
Andres Freund