Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query

From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: andy(at)prestigedigital(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query
Date: 2018-08-13 16:43:27
Message-ID: CAL9smLBPZJOUWjHKgZOQaDO5FK+AkkX61BT06mpSfiNz4wdtgw@mail.gmail.com
Lists: pgsql-bugs

On Mon, Aug 13, 2018 at 7:35 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:

> On 2018-08-13 16:14:03 +0000, PG Bug reporting form wrote:
> > Execute this query (multiple times!)
> >
> > select * from events where account in (select account from events where
> > data->>'page' = 'success.html' limit 3);
>
> Well, the subselect with the limit is going to return different results from
> run to run. Unless you add an ORDER BY there's no guaranteed order in
> which tuples are returned. So I don't think it's surprising that you're
> getting results that differ between runs.
>

While this is true, that's missing the point. This output, for example:

 account |     page
---------+--------------
      14 | a.html
      14 | success.html
      65 | b.html
      65 | success.html
      80 | b.html
      80 | success.html
   24084 | a.html
   24084 | success.html
   24085 | c.html
   24085 | success.html
   24095 | a.html
   24095 | success.html
(12 rows)

contains data from six different accounts, which should surely be
impossible regardless of which three accounts the subquery returns.

The one in repro1 is also problematic, because it shows that 304873, 304875
and 304885 were all selected, but not all rows for those accounts were
returned.
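
To spell out the two expectations, here is a rough sketch in Python (not part of the original report) of the invariants a correct result should satisfy whichever three accounts the LIMIT happens to pick. The helper names accounts_in, too_many_accounts and missing_rows are made up for illustration, and the small table in the usage note is hypothetical; only the observed rows are taken from the output above.

```python
# Rows copied from the 12-row output quoted in this message: (account, page).
observed = [
    (14, "a.html"), (14, "success.html"),
    (65, "b.html"), (65, "success.html"),
    (80, "b.html"), (80, "success.html"),
    (24084, "a.html"), (24084, "success.html"),
    (24085, "c.html"), (24085, "success.html"),
    (24095, "a.html"), (24095, "success.html"),
]


def accounts_in(rows):
    """Return the set of distinct accounts appearing in rows."""
    return {account for account, _page in rows}


def too_many_accounts(result, limit):
    """True if the result mentions more accounts than the subquery's LIMIT allows."""
    return len(accounts_in(result)) > limit


def missing_rows(result, table):
    """Rows of table that belong to an account present in result but are absent from it.

    Simplification: rows are compared as a set, so exact duplicates are ignored.
    """
    selected = accounts_in(result)
    got = set(result)
    return [row for row in table if row[0] in selected and row not in got]


if __name__ == "__main__":
    print(sorted(accounts_in(observed)))
    print(too_many_accounts(observed, limit=3))
```

The first check flags the output above, since it mentions more than three accounts. The second is the repro1 kind of failure: given a hypothetical table [(1, "a.html"), (1, "success.html")] and a result of only [(1, "a.html")], missing_rows reports the row that should have come back but did not.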

.m
