Re: pgbench logging broken by time logic changes

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Gregory Smith <gregsmithpgsql(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, david(dot)christensen(at)crunchydata(dot)com
Subject: Re: pgbench logging broken by time logic changes
Date: 2021-06-16 19:13:30
Message-ID: e05abaab-7bd2-93ed-eeca-3f3ceb542506@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 6/16/21 2:59 PM, Fabien COELHO wrote:
>
> Hello Greg,
>
>> I have a lot of community oriented work backed up behind this right
>> now, so
>> I'm gonna be really honest.  This time rework commit in its current form
>> makes me uncomfortable at this point in the release schedule.  The
>> commit
>> has already fought through two rounds of platform specific bug
>> fixes.  But
>> since the buildfarm doesn't test the logging feature, that whole
>> process is
>> suspect.
>
> Logging is/should going to be fixed.
>
>> My take on the PostgreSQL way to proceed:  this bug exposes that pgbench
>> logging is a feature we finally need to design testing for.
>
> Sure.
>
> The key feedback for me is the usual one: what is not tested does not
> work. Wow:-)

Agreed.

>
> I'm unhappy because I already added tap tests for time-sensitive
> features (-T and others, maybe logging aggregates, cannot remember),
> which have been removed because they could fail under some
> circonstances (eg very very very very slow hosts), or required some
> special handling (a few lines of code) in pgbench, and the net result
> of this is there is not a single test in place for some features:-(

I'm not familiar with exactly what happened in this case, but tests need
to be resilient over a wide range of performance characteristics. One
way around this issue might be to have a way of detecting that it's on a
slow platform and if so either skipping tests (Test::More provides
plenty of support for this) or expecting different results.

>
> There is no problem with proposing tests, the problem is that they are
> accepted, or if they are accepted then that they are not removed at
> the first small issue but rather fixed, or their limitations accepted,
> because testing time-sensitive features is not as simple as testing
> functional features.
>
> Note that currently there is not a single test of psql with autocommit
> off or with "on error rollback". Last time a submitted tap tests to
> raise psql test coverage from 50% to 90%, it was rejected. I'll admit
> that I'm tired arguing that more testing is required, and I'm very
> happy if someone else is ready to try again. Good luck! :-)

:-(

>
>> We need a new buildfarm test and then a march through a full release
>> phase to see how it goes.
>
>> Only then should we start messing with the time logic.  Even if we
>> fixed the source today on both my test platforms, I'd still be
>> nervous that beta 2 could ship and more performance testing could
>> fall over from this modification.  And that's cutting things a little
>> close.
>
> Well, the point beta is to discover bugs not caught by reviews and dev
> tests, fix them, and possibly add tests which would have caught them.
>
> If you revert all features on the first issue in a corner case and put
> it back to the queue, then I do not see why the review and dev tests
> will be much better on the next round, so it does not really help
> moving things forward.
>
> IMHO, the pragmatic approach is to look at fixing first, and maybe
> revert if the problems are deep. I'm not sure this is obviously the
> case here.

It does look like the submitted fix basically reverts the changes w.r.t.
this timestamp logging.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2021-06-16 19:18:20 Re: snapshot too old issues, first around wraparound and then more.
Previous Message Andres Freund 2021-06-16 19:12:23 Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic