From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Alexander Lakhin <exclusion(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm() |
Date: | 2023-08-17 14:27:36 |
Message-ID: | 34ff6307-d5be-d008-56ed-6249f3cd0749@dunslane.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On 2023-07-15 Sa 03:00, Alexander Lakhin wrote:
> Hello,
>
> 28.06.2023 21:49, Tom Lane wrote:
>> The bad news is that while investigating this I came across
>> another hazard that seems much messier to fix. To wit, if
>> you have a tuple with "missing" pass-by-ref columns, then
>> some of the columns output by heap_deform_tuple() will
>> contain pointers into the tupdesc (cf. getmissingattr()).
>> That's not sufficient lifespan in the scenario we're dealing
>> with here: if the tupdesc gets trashed anytime before the
>> eventual ExecEvalFieldStoreForm(), we'll have garbage in the
>> result tuple.
>
> Please excuse me, as I'm definitely late to the party, but may I ask some
> questions to understand the issue discussed? I still can't see a
> practical
> way to get garbage data in a tuple with missing columns, so maybe I'm
> missing something (except for columns).
>
> If we talk about ExecEvalFieldStoreDeForm() -> heap_deform_tuple() ->
> getmissingattr(), then I can't find a way to add a non-null default for a
> record type to exploit this path. The rowtypes test contains:
> -- at the moment this will not work due to ALTER TABLE inadequacy:
> alter table fullname add column suffix text default '';
> ERROR: cannot alter table "fullname" because column "people.fn" uses
> its row type
>
> So it seems to me, that ExecEvalFieldStoreDeForm() can't be called for a
> record/tuple having missing non-null attributes. And even if ALTER
> TABLE was
> "adequate", does it mean that ExecEvalFieldStoreDeForm() would return
> default
> values from the underlying table type?
>
>> Worse, I fear there may be many other places with latent bugs of the
>> same ilk. Before the attmissingval code was added, one could assume
>> that the result of heap_deform_tuple would hold good as long as you
>> had a pin on the source tuple. But now it depends on metadata as
>> well, and I don't think we have a coherent story about that.
>
> But is it true, that having a pin on tupdesc is enough to make sure that
> the result of heap_deform_tuple() holds good?
>
> If so, then probably we should find places, where tupdesc gets unpinned
> before that result is saved somewhere.
> I see the following callers of the heap_deform_tuple():
> src/backend/replication/logical/reorderbuffer.c
> ReorderBufferToastReplace(): heap_deform_tuple() followed by
> heap_form_tuple()
> src/backend/executor/execTuples.c
> ExecForceStoreHeapTuple(), ExecForceStoreMinimalTuple(),
> ExecStoreHeapTupleDatum(): use long-living and hopefully pinned
> slot->tts_tupleDescriptor
> src/backend/executor/execExprInterp.c
> ExecEvalFieldStoreDeForm(): the above question applies here
> src/backend/executor/spi.c
> SPI_modifytuple(): heap_deform_tuple() followed by heap_form_tuple()
> src/backend/utils/adt/rowtypes.c
> record_out(), hash_record(), hash_record_extended(): tupdesc
> released after extracting/processing data
> record_send(): tupdesc released after sending data
> record_cmp(), record_eq(), record_image_cmp(), record_image_eq():
> tupdesc1. tupdesc2 released after comparing data
> src/backend/utils/adt/jsonfuncs.c
> populate_record(): heap_deform_tuple() followed by heap_form_tuple()
> src/backend/utils/adt/expandedrecord.c
> deconstruct_expanded_record() uses long-living (pinned)
> erh->er_tupdesc
> src/backend/utils/adt/ri_triggers.c
> RI_Initial_Check(), RI_PartitionRemove_Check(): use long-living?
> SPI_tuptable->tupdesc
> src/backend/access/heap/heapam_handler.c
> reform_and_rewrite_tuple(): uses RelationGetDescr(OldHeap)
> (pinned, I suppose)
> src/backend/access/heap/heaptoast.c
> heap_toast_delete(), heap_toast_insert_or_update() use long-living
> rel->rd_att;
> toast_flatten_tuple(): heap_deform_tuple(..., tupleDesc, ...)
> followed by heap_form_tuple(..., tupleDesc, ...)
> toast_flatten_tuple_to_datum(): heap_deform_tuple followed by
> detoasting attrs
> src/backend/access/heap/heapam.c
> ExtractReplicaIdentity(): heap_deform_tuple(..., desc, ...)
> followed by heap_form_tuple(..., desc, ...)
> src/backend/access/common/heaptuple.c
> heap_modify_tuple(): heap_deform_tuple(..., tupleDesc, ...)
> followed by heap_form_tuple(..., tupleDesc, ...)
> heap_modify_tuple_by_cols(): heap_deform_tuple(..., tupleDesc,
> ...) followed by heap_form_tuple(..., tupleDesc, ...)
> src/backend/access/common/tupconvert.c
> execute_attr_map_tuple(): heap_deform_tuple() followed by
> heap_form_tuple()
> src/test/regress/regress.c
> make_tuple_indirect(): heap_deform_tuple() followed by
> heap_form_tuple()
> src/pl/plpgsql/src/pl_exec.c
> exec_move_row() called with tupdesc = NULL, SPI_tuptable->tupdesc,
> or tupdesc released after the call
> contrib/hstore/hstore_io.c
> hstore_from_record(), hstore_populate_record(): tupdesc released
> after extracting data
>
> And even if I missed some possibly problematic calls, maybe it's worth to
> consider fixing exactly that places...
>
> Also, besides heap_deform_tuple(), getmissingattr() is called from
> heap_getattr(). And heap_getattr() is used in many places, but most of
> them
> are protected too due to tupdesc pinned.
> Two suspicious places that I found are GetAttributeByName() and
> GetAttributeByNum(), so when using these functions you can really get the
> missing attribute value with the tupdesc released. But what
> circumstances do
> we need to end up with an invalid data?
> 1) Write a function that will call GetAttributeByName() and hold it's
> result inside for some time.
> 2) Add a passed-by-ref column with a default value to a table.
> 3) Pass to that function a tuple without the "missing" column (just
> "SELECT func(table) FROM table" won't do, but func(OLD) in a trigger
> should).
> 4) Trigger a relcache invalidation.
> 5) Perform a database access inside that function to process the
> invalidation.
> 6) Access the result of GetAttributeByName() after that.
>
> Maybe we could construct such test case, e.g. add to pg_regress one more
> SQL-accessible C function, bu9ccvhhihvcnb·t I wonder, is this the way
> to go?
>
> On the other hand, probably there are extensions, that use
> GetAttributeByName() in the non-safe way (as the function documentation
> doesn't warn about such issues), but maybe just perform datumCopy inside
> that function (and similar one(s))?
> Though, perhaps I just don't notice the elephant in the core...
I was waiting to see if others had a reaction to this, but ...
I think the last point (possible unsafe use by extensions) is reason
enough to proceed with the proposed mitigation, which is pretty simple
and non-invasive..
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | PG Bug reporting form | 2023-08-17 14:35:23 | BUG #18059: Unexpected error 25001 in stored procedure |
Previous Message | Masahiko Sawada | 2023-08-17 14:03:30 | Re: BUG #18055: logical decoding core on AllocateSnapshotBuilder() |