Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Andres Freund <andres(at)anarazel(dot)de>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
Date: 2018-02-02 22:52:02
Message-ID: CAGTBQpaiNQSNJC8y4w82UBTaPsvSqRRg++yEi5wre1MFE2iD8Q@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 25, 2018 at 6:21 PM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> On Fri, Jan 26, 2018 at 9:38 AM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>> I had the tests running in a loop all day long, and I cannot reproduce
>> that variance.
>>
>> Can you share your steps to reproduce it, including configure flags?
>
> Here are two build logs where it failed:
>
> https://travis-ci.org/postgresql-cfbot/postgresql/builds/332968819
> https://travis-ci.org/postgresql-cfbot/postgresql/builds/332592511
>
> Here's one where it succeeded:
>
> https://travis-ci.org/postgresql-cfbot/postgresql/builds/333139855
>
> The full build script used is:
>
> ./configure --enable-debug --enable-cassert --enable-coverage
> --enable-tap-tests --with-tcl --with-python --with-perl --with-ldap
> --with-icu && make -j4 all contrib docs && make -Otarget -j3
> check-world
>
> This is a virtualised 4 core system. I wonder if "make -Otarget -j3
> check-world" creates enough load on it to produce some weird timing
> effect that you don't see on your development system.

I can't reproduce it, not even with the same build script.

It's starting to look like a timing effect indeed.

I get a similar effect if there's an active snapshot in another
session while vacuum runs. I don't know how the test suite ends up in
that situation, but it seems to be the case.
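
To illustrate why an open snapshot elsewhere would have that effect, here is a toy model in Python. It is not PostgreSQL code: HeapTuple, oldest_xmin and vacuum are made-up names, and the rule it encodes (a deleted tuple is removable only if its deleting xid precedes the oldest xmin held by any snapshot) is a simplification of the real visibility check.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class HeapTuple:
    """Toy stand-in for a heap tuple; xmax is the deleting xid, None if live."""
    tid: int
    xmax: Optional[int] = None


def oldest_xmin(snapshot_xmins: Iterable[int], next_xid: int) -> int:
    """Horizon below which deleted tuples are invisible to every session.

    With no open snapshots, the horizon is the next xid to be assigned.
    """
    return min(snapshot_xmins, default=next_xid)


def vacuum(heap: List[HeapTuple], horizon: int) -> List[HeapTuple]:
    """Keep live tuples and dead tuples that some snapshot might still see."""
    return [t for t in heap if t.xmax is None or t.xmax >= horizon]


# Ten tuples; xid 100 deleted the first eight of them and committed.
heap = [HeapTuple(tid=i, xmax=100 if i < 8 else None) for i in range(10)]

# No other session holds a snapshot: all eight dead tuples can be removed.
alone = vacuum(heap, oldest_xmin([], next_xid=101))

# Another session took its snapshot while xid 100 was still running
# (xmin = 100): nothing is removable, so a space check would fail.
contended = vacuum(heap, oldest_xmin([100], next_xid=101))

print(len(alone), len(contended))
```

In this model the same vacuum call leaves 2 tuples when run alone and all 10 when a concurrent snapshot holds the horizon back, which is the kind of run-to-run difference a space-reclaiming test would trip over.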

How do you suggest we go about fixing this? The test in question is
important: I've caught real bugs in the implementation with it,
because it checks that vacuum actually frees up space.

I'm thinking this vacuum test could perhaps be put in its own parallel
group. Since I can't reproduce the failure, I can't know whether that
will fix it, but it seems sensible.
