| From: | Bruce Momjian <bruce(at)momjian(dot)us> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, suchithjn22(at)gmail(dot)com, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: The documentation for storage type 'plain' actually allows single byte header | 
| Date: | 2023-10-22 01:56:13 | 
| Message-ID: | ZTSBPankJQ9_kfXF@momjian.us | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-docs pgsql-hackers | 
On Fri, Oct 20, 2023 at 09:48:05PM -0400, Bruce Momjian wrote:
> Here is the original thread from pgsql-docs:
> 
> 	https://www.postgresql.org/message-id/flat/167336599095.2667301.15497893107226841625%40wrigleys.postgresql.org
> 
> The report is about single-byte headers being used for varlena values
> with PLAIN storage.
> 
> Here is the reproducible report:
> 
> 	CREATE EXTENSION pageinspect;
> 	CREATE TABLE test(a VARCHAR(10000) STORAGE PLAIN);
> 	INSERT INTO test VALUES (repeat('A',10));
> 	
> 	Now peek into the page with pageinspect functions
> 	
> 	SELECT left(encode(t_data, 'hex'), 40) FROM
> 	heap_page_items(get_raw_page('test', 0));
> 	
> 	This returned value of "1741414141414141414141".
> 	Here the first byte 0x17 = 0001 0111 in binary.
> 	Length + 1 is stored in the length bits (1-7). So Len = 0001011-1 = (11-1)
> 	[base-10] = 10 [base-10]
> 	which exactly matches the expected length. Further the data "41" repeated 10
> 	times also indicates character A (65 or 0x41 in ASCII) repeated 10 times.
> 
> I researched this and thought it would be a case where we were lacking a
> check before creating a single-byte header, but I couldn't find anything
> missing.  I think the problem is that the _source_ tupleDesc attstorage
> attribute is being used to decide if we should use a short header, while
> it is really the storage type of the destination that we should be
> checking.  Unfortunately, I don't think the destination is accessible at
> the location were we are deciding about a short header.
> 
> I am confused how to proceed.  I feel we need to fully understand why
> this happening before we adjust anything.  Here is a backtrace --- the
> short header is being created in fill_val() and the attstorage value
> there is 'x'/EXTENDED.
I did some more research.  It turns out that the source slot/planSlot is
populating its pg_attribute information via makeTargetEntry() and it
has no concept of a storage type.
Digging further, I found that we cannot get rid of the the use of
att->attstorage != TYPSTORAGE_PLAIN in macros ATT_IS_PACKABLE and
VARLENA_ATT_IS_PACKABLE macros in src/backend/access/common/heaptuple.c
because there are internal uses of fill_val() that can't handle packed
varlena headers.
I ended up with a doc patch that adds a C comment about this odd
behavior and removes doc text about PLAIN storage not using packed
headers.
-- 
  Bruce Momjian  <bruce(at)momjian(dot)us>        https://momjian.us
  EDB                                      https://enterprisedb.com
Only you can decide what is important to you.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Bruce Momjian | 2023-10-22 01:59:04 | Re: The documentation for storage type 'plain' actually allows single byte header | 
| Previous Message | Michael Paquier | 2023-10-21 07:11:50 | Re: Minipatch concerning tags | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Bruce Momjian | 2023-10-22 01:59:04 | Re: The documentation for storage type 'plain' actually allows single byte header | 
| Previous Message | Thomas Munro | 2023-10-22 01:44:31 | Re: LLVM 16 (opaque pointers) |