Re: confusing valgrind report about tuplestore+wrapper_handler (?) on 32-bit arm

From: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: confusing valgrind report about tuplestore+wrapper_handler (?) on 32-bit arm
Date: 2024-06-20 12:14:17
Message-ID: CAEudQArPg7RswtsoRkWz2n-chLPeWdrx+eVoTDWsfCEFFGqBtA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 20, 2024 at 08:54, Tomas Vondra <
tomas(dot)vondra(at)enterprisedb(dot)com> wrote:

>
>
> On 6/20/24 13:32, Ranier Vilela wrote:
> > On Thu, Jun 20, 2024 at 07:28, Tomas Vondra <
> > tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >
> >> Hi,
> >>
> >> While running valgrind on 32-bit ARM (rpi5 with debian), I got this
> >> really strange report:
> >>
> >>
> >> ==25520== Use of uninitialised value of size 4
> >> ==25520== at 0x94A550: wrapper_handler (pqsignal.c:108)
> >> ==25520== by 0x4D7826F: ??? (sigrestorer.S:64)
> >> ==25520== Uninitialised value was created by a heap allocation
> >> ==25520== at 0x8FB780: palloc (mcxt.c:1340)
> >> ==25520== by 0x913067: tuplestore_begin_common (tuplestore.c:289)
> >> ==25520== by 0x91310B: tuplestore_begin_heap (tuplestore.c:331)
> >> ==25520== by 0x3EA717: ExecMaterial (nodeMaterial.c:64)
> >> ==25520== by 0x3B2FF7: ExecProcNodeFirst (execProcnode.c:464)
> >> ==25520== by 0x3EF73F: ExecProcNode (executor.h:274)
> >> ==25520== by 0x3F0637: ExecMergeJoin (nodeMergejoin.c:703)
> >> ==25520== by 0x3B2FF7: ExecProcNodeFirst (execProcnode.c:464)
> >> ==25520== by 0x3C47DB: ExecProcNode (executor.h:274)
> >> ==25520== by 0x3C4D4F: fetch_input_tuple (nodeAgg.c:561)
> >> ==25520== by 0x3C8233: agg_retrieve_direct (nodeAgg.c:2364)
> >> ==25520== by 0x3C7E07: ExecAgg (nodeAgg.c:2179)
> >> ==25520== by 0x3B2FF7: ExecProcNodeFirst (execProcnode.c:464)
> >> ==25520== by 0x3A5EC3: ExecProcNode (executor.h:274)
> >> ==25520== by 0x3A8FBF: ExecutePlan (execMain.c:1646)
> >> ==25520== by 0x3A6677: standard_ExecutorRun (execMain.c:363)
> >> ==25520== by 0x3A644B: ExecutorRun (execMain.c:304)
> >> ==25520== by 0x6976D3: PortalRunSelect (pquery.c:924)
> >> ==25520== by 0x6972F7: PortalRun (pquery.c:768)
> >> ==25520== by 0x68FA1F: exec_simple_query (postgres.c:1274)
> >> ==25520==
> >> {
> >> <insert_a_suppression_name_here>
> >> Memcheck:Value4
> >> fun:wrapper_handler
> >> obj:/usr/lib/arm-linux-gnueabihf/libc.so.6
> >> }
> >> **25520** Valgrind detected 1 error(s) during execution of "select
> >> count(*) from
> >> **25520** (select * from tenk1 x order by x.thousand, x.twothousand,
> >> x.fivethous) x
> >> **25520** left join
> >> **25520** (select * from tenk1 y order by y.unique2) y
> >> **25520** on x.thousand = y.unique2 and x.twothousand = y.hundred and
> >> x.fivethous = y.unique2;"
> >>
> >>
> >> I'm mostly used to weird valgrind stuff on this platform, but it's
> >> usually about libarmmem and (possibly) thinking it might access
> >> undefined stuff when calculating checksums etc.
> >>
> >> This seems somewhat different, so I wonder if it's something real?
> >
> > It seems like a false positive to me.
> >
> > According to valgrind's documentation:
> > https://valgrind.org/docs/manual/mc-manual.html#mc-manual.value
> >
> > " This can lead to false positive errors, as the shared memory can be
> > initialised via a first mapping, and accessed via another mapping. The
> > access via this other mapping will have its own V bits, which have not been
> > changed when the memory was initialised via the first mapping. The bypass
> > for these false positives is to use Memcheck's client requests
> > VALGRIND_MAKE_MEM_DEFINED and VALGRIND_MAKE_MEM_UNDEFINED to inform
> > Memcheck about what your program does (or what another process does) to
> > these shared memory mappings. "
> >
>
> But that's about shared memory, and the report has nothing to do with
> shared memory AFAICS.
>
You could try one run with --expensive-definedness-checks=yes. From the
valgrind documentation:

" Selecting --expensive-definedness-checks=yes causes Memcheck to use the
most accurate analysis possible. This minimises false error rates but can
cause up to 30% performance degradation. "

I did a search through my reports and none refer to this particular source.
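
Besides the flag above, the Memcheck client requests mentioned in the manual
can help narrow down where the undefined bytes come from. What follows is a
minimal sketch and not code from the PostgreSQL tree: DemoState,
region_is_defined and demo_check are hypothetical names, and malloc stands in
for palloc. It assumes valgrind/memcheck.h is installed and falls back to a
no-op otherwise. It would not explain the report by itself; it only shows how
VALGRIND_CHECK_MEM_IS_DEFINED can be dropped in near a suspect allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Use the real client request when the Valgrind headers are installed;
 * otherwise fall back to a no-op stand-in so the sketch still compiles. */
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK_H 1
#  endif
#endif
#ifndef HAVE_MEMCHECK_H
#  define VALGRIND_CHECK_MEM_IS_DEFINED(addr, len) \
        ((void) (addr), (void) (len), 0)
#endif

/* Hypothetical struct, standing in for a heap-allocated state object. */
typedef struct DemoState
{
    int  status;
    int  flags;
    char buf[16];
} DemoState;

/*
 * Returns 1 if Memcheck considers all len bytes at addr defined.
 * Outside Valgrind the client request evaluates to 0, so this
 * always returns 1 there.
 */
static int
region_is_defined(const void *addr, size_t len)
{
    return VALGRIND_CHECK_MEM_IS_DEFINED(addr, len) == 0;
}

/* Returns 1 if a fully initialised DemoState passes the check. */
static int
demo_check(void)
{
    DemoState *state = malloc(sizeof(DemoState));
    int        ok;

    assert(state != NULL);

    /* Only one field is set so far. Under Valgrind this check emits a
     * report at this call site for the first byte still undefined. */
    state->status = 0;
    (void) region_is_defined(state, sizeof(DemoState));

    /* After zeroing the whole struct, the check should pass. */
    memset(state, 0, sizeof(DemoState));
    ok = region_is_defined(state, sizeof(DemoState));

    free(state);
    return ok;
}
```

Calling demo_check() from a test program and running it under valgrind should
produce one report for the partially initialised struct and none after the
memset. If a region is confirmed to be a false positive, the manual suggests
VALGRIND_MAKE_MEM_DEFINED for marking it as defined.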

best regards,
Ranier Vilela
