Re: Obvious data mismatch in View2 which basically SELECT * from View1

From: Ben <bentenzha(at)outlook(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Obvious data mismatch in View2 which basically SELECT * from View1
Date: 2020-09-16 11:47:57
Message-ID: MWHPR06MB2400B96FF80186B940BE98F0B9210@MWHPR06MB2400.namprd06.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Magnus,

Thanks for the heads up.

For the data posted in this email, the listed result is the full result
set for that query.
I have checked the data in view2 and view1 with various conditions.
Both view has many difference in column wspan::float8.
The condition in the shown query is intended to show just a small set of
them.
But you are right, I should be more cautious. Thanks for the headsup.

Regards,
Ben

On 9/16/20 3:35 PM, Magnus Hagander wrote:
>
>
> On Wed, Sep 16, 2020 at 9:26 AM Ben <bentenzha(at)outlook(dot)com
> <mailto:bentenzha(at)outlook(dot)com>> wrote:
>
> Dear list,
>
> Recently I am getting feedback, data in my analytic report is not
> repeatable. From time to time they get different data for the same
> time
> span.
> (but IIRC previously it was OK). Therefore I started debuging the
> View
> chain for that report, during which I bumped into this
> issue/phenomenon.
>
> In a over -simplified version:
>
> CREATE VIEW2 AS SELECT * FROM VIEW1;
> SELECT  col1 FROM VIEW2 WHERE cond1=True;
> SELECT  col1 FROM VIEW1 WHERE cond1=True;
>
> Now col1 from both views looks different. I don't know where to
> start to
> solve this problem.
>
> The actual situation is a bit more than that, the following is the
> actual query:
>
>
>      -- trying to audit utlog weighed stat
>      with t as (
>      select '2020-07-01 00:00:00'::timestamp t0, '2020--07-02
> 0:0:0'::timestamp t1
>      )
>      --select * from t;
>      select *
>      -- from utlog.cache_stats_per_shift_per_reason_weighed_stats
>      -- from utlog.stats_per_shift_filtered_per_reason
>      from utlog.stats_per_shift_filtered (let's call
> it #View2 for short)
>      -- from utlog.stats_per_shift_filtered_b0206 (let's call it
> #View1 for short)
>      -- from utlog.stats_per_shift
>      cross join t
>      where wline = 'F02'  and wts >= t.t0 and wts < t.t1 and wsft ='D'
>      limit 100
>      ;
>
>
>
> Not sure if it might be something lost in your simplification here,
> but you have a LIMIT with no ORDER BY there. That basically means
> "give me 100 random rows" (but not with a very good random level). It
> does not return rows in a consistent/predictable order. So as long as
> that query is part of what you're doing, you should not be surprised
> if you get the rows in an inconsistent/unpredictable order, with
> whatever follow-on effects that might have. (And it can lead to weird
> follow-on effects like the ones you're talking about when used in
> larger query structures)
>
> --
>  Magnus Hagander
>  Me: https://www.hagander.net/ <http://www.hagander.net/>
>  Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Yessica Brinkmann 2020-09-16 13:03:20 Re: Problems with MemoryContextSwitchTo ()
Previous Message Pavel Stehule 2020-09-16 11:46:35 Re: Problems with MemoryContextSwitchTo ()