From: | Cyril Champier <cyril(dot)champier(at)doctolib(dot)com> |
---|---|
To: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> |
Cc: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: Default ordering option |
Date: | 2019-07-26 07:52:52 |
Message-ID: | CAJaA8Vcemtt4yhJTk-Wtvj94ZpPLziMKDfzv4S_Ma5dGnAfhPw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Adrian:
Are you really looking for a pseudo-random name?
No, the code I pasted was an existing production bug: the last_name should
have been unique, so the selected patient would always be the same.
This should have been detected in tests, but since the order was "almost
always the same", our test was green 99% of the time, so we discarded it as
flaky.
Fuzzy testing could be an option, but this would go too far, as for Peter
extension suggestion.
We have huge existing codebase with more than 10K tests, and I do not want
to modify our whole testing strategy.
Meanwhile, I went for an ORM patch (ActiveRecord) and forbid usages that
can workaround it.
If you are interested, here is a gist:
https://gist.github.com/cyrilchampier/fdb945e8a09f93d50c7e89305c2f53f0
I wish there was a simple flag to activate in PostgreSQL to do that!
Thanks for your ideas!
On Thu, Jul 25, 2019 at 4:55 PM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
wrote:
> On 7/25/19 12:43 AM, Cyril Champier wrote:
> > *Adrian*:
> >
> > If order is not an issue in the production code why test for it in
> the
> > test code?
> >
> >
> > In many cases, it would not be a problem in tests if we had an unordered
> > array comparison helper.
> > But in other cases, it is a production issue.
> > In ruby ActiveRecord for exemple, you can do `Patient.find_by(last_name:
> > 'champier')`,
> > which translates to `SELECT "patients".* FROM "patients" WHERE
> > "patients"."last_name" = 'champier' LIMIT 1`.
> > If last_name is not unique, the returned record will be random.
>
> Are you really looking for a pseudo-random name?
>
> If so would not(warning not a Ruby developer, so below is tentative):
>
> Patient.where(["last_name = :last_name", {last_name:
> "champier"}]).order('RANDOM()').first
>
> work better?
>
> If not why not use something that returns all possible matches?
>
> >
> > So yes, everything as to be randomized, because the sources are
> > multiples and the consequences can vary to a dramatic production bug, a
> > failed CI 1% of the time, or to a useless test assertion.
>
> One way I can think of doing this is write a script that walks through
> your tables in the test db and does an UPDATE across the rows. It is
> going to add time to your tests, but then I believe that is going to be
> the case for anything you do. Or you could look at something I have
> never tried, fuzzy testing. As a starting point:
>
> https://www.guru99.com/fuzz-testing.html
>
> Maybe other folks have suggestions on tools you could use for fuzzy
> testing.
>
>
> >
> >
> > *Peter*:
> >
> > It might be an interesting exercise to implement this as a
> post-parsing
> > hook.
> >
> > I known nothing about that, but that sounds interesting, do you have any
> > documentation pointer to help me implement that?
> >
> >
> >
> > On Wed, Jul 24, 2019 at 10:36 PM Peter Eisentraut
> > <peter(dot)eisentraut(at)2ndquadrant(dot)com
> > <mailto:peter(dot)eisentraut(at)2ndquadrant(dot)com>> wrote:
> >
> > On 2019-07-23 17:43, Cyril Champier wrote:
> > > In this documentation
> > > <https://www.postgresql.org/docs/9.1/queries-order.html>, it is
> said:
> > >
> > > If sorting is not chosen, the rows will be returned in an
> > > unspecified order. The actual order in that case will depend
> > on the
> > > scan and join plan types and the order on disk, but it must
> > not be
> > > relied on.
> > >
> > >
> > > I would like to know if there is any way to change that to have a
> > "real"
> > > random behaviour.
> >
> > It might be an interesting exercise to implement this as a
> post-parsing
> > hook.
> >
> > --
> > Peter Eisentraut http://www.2ndQuadrant.com/
> > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> >
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>
From | Date | Subject | |
---|---|---|---|
Next Message | Luca Ferrari | 2019-07-26 08:06:18 | Re: Too slow to create new schema and their tables, functions, triggers. |
Previous Message | Adrian Klaver | 2019-07-26 03:09:47 | Re: Too slow to create new schema and their tables, functions, triggers. |