Re: pgsql: Generational memory allocator

From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-committers <pgsql-committers(at)postgresql(dot)org>
Subject: Re: pgsql: Generational memory allocator
Date: 2017-11-25 04:54:18
Message-ID: bf84d940-90d4-de91-19dd-612e011007f4@fuzzy.cz
Lists: pgsql-committers

Hi,

On 11/25/2017 02:25 AM, Tom Lane wrote:
> I wrote:
>> For me, this patch fixes the valgrind failures inside generation.c
>> itself, but I still see one more in the test_decoding run: ...
>> Not sure what to make of this: the stack traces make it look unrelated
>> to the GenerationContext changes, but if it's not related, how come
>> skink was passing before that patch went in?
>
> I've pushed fixes for everything that I could find wrong in generation.c
> (and there was a lot :-(). But I'm still seeing the "invalid read in
> SnapBuildProcessNewCid" failure when I run test_decoding under valgrind.
> Somebody who has more familiarity with the logical decoding stuff than
> I do needs to look into that.
>
> I tried to narrow down exactly which fetch in SnapBuildProcessNewCid was
> triggering the failure, with the attached patch. Weirdly, *it does not
> fail* with this. I have no explanation for that.
>

I have no explanation for that either. FWIW I don't think this is
related to the new memory contexts. I can reproduce it on 3bae43c (i.e.
before the Generation memory context was introduced), and with Slab
removed from ReorderBuffer.

I wonder if this might be a valgrind issue. I'm not sure which version
skink is using, but I'm running with valgrind-3.12.0-9.el7_4.x86_64.

BTW I also see these failures in hstore:

==15168== Source and destination overlap in memcpy(0x5d0fed0, 0x5d0fed0, 40)
==15168==    at 0x4C2E00C: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1018)
==15168==    by 0x15419A06: hstoreUniquePairs (hstore_io.c:343)
==15168==    by 0x15419EE4: hstore_in (hstore_io.c:416)
==15168==    by 0x9ED11A: InputFunctionCall (fmgr.c:1635)
==15168==    by 0x9ED3C2: OidInputFunctionCall (fmgr.c:1738)
==15168==    by 0x6014A2: stringTypeDatum (parse_type.c:641)
==15168==    by 0x5E1ADC: coerce_type (parse_coerce.c:304)
==15168==    by 0x5E17A9: coerce_to_target_type (parse_coerce.c:103)
==15168==    by 0x5EDD6D: transformTypeCast (parse_expr.c:2724)
==15168==    by 0x5E8860: transformExprRecurse (parse_expr.c:203)
==15168==    by 0x5E8601: transformExpr (parse_expr.c:156)
==15168==    by 0x5FCF95: transformTargetEntry (parse_target.c:103)
==15168==    by 0x5FD15D: transformTargetList (parse_target.c:191)
==15168==    by 0x5A5EEC: transformSelectStmt (analyze.c:1214)
==15168==    by 0x5A4453: transformStmt (analyze.c:297)
==15168==    by 0x5A4381: transformOptionalSelectInto (analyze.c:242)
==15168==    by 0x5A423F: transformTopLevelStmt (analyze.c:192)
==15168==    by 0x5A4097: parse_analyze (analyze.c:112)
==15168==    by 0x87E0AF: pg_analyze_and_rewrite (postgres.c:664)
==15168==    by 0x87E6EE: exec_simple_query (postgres.c:1045)

It seems hstoreUniquePairs may call memcpy with identical source and
destination pointers in some cases (note the two equal addresses in the
first line of the report), which looks a bit dubious. But the code is
ancient, so it's strange that it didn't fail before.
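
To illustrate the kind of pattern that can produce such a report, here is a
simplified sketch of an in-place de-duplication loop. It is not the actual
hstore_io.c code: DemoPair and demo_unique_pairs are made-up names, and the
"res != ptr" guard is just one possible way to avoid the self-copy (memmove
would be another), not a proposed patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pair type, for illustration only (not hstore's struct). */
typedef struct
{
	int			key;
	int			val;
} DemoPair;

/*
 * Compact a key-sorted array in place, keeping the first entry of each
 * run of equal keys.  Returns the number of entries kept.
 */
static size_t
demo_unique_pairs(DemoPair *a, size_t n)
{
	DemoPair   *res = a;		/* last entry kept so far */
	DemoPair   *ptr;

	assert(a != NULL || n == 0);

	if (n <= 1)
		return n;

	for (ptr = a + 1; ptr < a + n; ptr++)
	{
		if (ptr->key == res->key)
			continue;			/* duplicate key: drop this entry */

		res++;

		/*
		 * Until the first duplicate has been dropped, res == ptr here.
		 * Calling memcpy(res, ptr, ...) in that case is the "source and
		 * destination overlap" situation, so skip the copy.
		 */
		if (res != ptr)
			memcpy(res, ptr, sizeof(DemoPair));
	}

	return (size_t) (res - a) + 1;
}
```

Without the guard, any input whose leading entries have distinct keys would
copy an entry onto itself. With it, every memcpy that does run has res < ptr,
so the two regions are at least one element apart.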

regards
Tomas
