Re: Obvious data mismatch in View2 which basically SELECT * from View1

From: Jerry Sievers <gsievers19(at)comcast(dot)net>
To: Ben <bentenzha(at)outlook(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Obvious data mismatch in View2 which basically SELECT * from View1
Date: 2020-09-17 14:41:48
Message-ID: 87zh5opis3.fsf@jsievers.enova.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Ben <bentenzha(at)outlook(dot)com> writes:

> Dear List,
>
> Some further investigation.
>
> Creating a fresh View3 on View1 gives exactly the same result as
> View1.
>
> The View1 View2 are both years old in a production database, in use
> for quite some time. (The database is production duty but not hosted
> in server room with UPS. It's like a edge PC in industry monitoring.
> Now am more concerned with its data integrity)
>
> The problem with the final report is reported recently. I am not sure
> what's broken in the database.
>
> I haven't replaced the broken View2 yet. Hope someone can point me to
> some further investigation.

Already mentioned downthread, but have a look at the view definitions
by...

select pg_get_viewdef('$your-view');

Or...

pg_dump --table $your-view

Expect to see something different if you repeat the above for the old
view giving undesired results and the new correct version.

HTH

> My concern is that if there are other views inside that database
> having similar integrity issue, how can I find them all (if any).
>
> It's beyond my regular SQL ability. I guess I really need help from
> people with maintenance experience.
>
> Any help will be appreciated, thanks in advance.
>
> Ben
>
>
>
> On September 16, 2020 3:40:34 AM UTC, Ben <bentenzha(at)outlook(dot)com>
> wrote:
>
> Dear list,
>
> Recently I am getting feedback, data in my analytic report is not
> repeatable. From time to time they get different data for the same time
> span.
> (but IIRC previously it was OK). Therefore I started debuging the View
> chain for that report, during which I bumped into this issue/phenomenon.
>
> In a over -simplified version:
>
> CREATE VIEW2 AS SELECT * FROM VIEW1;
> SELECT col1 FROM VIEW2 WHERE cond1=True;
> SELECT col1 FROM VIEW1 WHERE cond1=True;
>
> Now col1 from both views looks different. I don't know where to start to
> solve this problem.
>
> The actual situation is a bit more than that, the following is the
> actual query:
>
>
> -- trying to audit utlog weighed stat
> with t as (
> select '2020-07-01 00:00:00'::timestamp t0, '2020--07-02
> 0:0:0'::timestamp t1
> )
> --select * from t;
> select *
> -- from utlog.cache_stats_per_shift_per_reason_weighed_stats
> -- from utlog.stats_per_shift_filtered_per_reason
> from utlog.stats_per_shift_filtered (let's call
> it #View2 for short)
> -- from utlog.stats_per_shift_filtered_b0206 (let's call it
> #View1 for short)
> -- from utlog.stats_per_shift
> cross join t
> where wline = 'F02' and wts >= t.t0 and wts < t.t1 and wsft ='D'
> limit 100
> ;
>
> The Result for #View2
>
> wts | wsft | wspan | wstate | wline | rcodes
> --------------------+------+--------+--------+-------+-------
> 2020-07-01 08:00:00 | D | 0 | S00 | F02 | {PDCB}
> 2020-07-01 09:50:01 | D | 12.533 | S00 | F02 | {PDCB}
> 2020-07-01 11:35:46 | D | 12.217 | S00 | F02 | {CDSO}
> 2020-07-01 13:22:58 | D | 5.15 | S00 | F02 | {PDCB}
> 2020-07-01 14:57:38 | D | 6.8 | S00 | F02 | {PDCB}
>
> INDEX | COLUMN_NAME | DATA_TYPE
> ------+-------------+------------
> 1 | wts | timestamptz
> 3 | wsft | varchar
> 4 | wspan | float8
> 5 | wstate | varchar
> 6 | wline | varchar
> 7 | rcodes | text[]
>
>
> Same query, the Result for #View1
>
> wts | wsft | wspan | wstate | wline | rcodes
> --------------------+------+-------+--------+-------+-------
> 2020-07-01 08:00:00 | D | 5 | S00 | F02 | {PDCB}
> 2020-07-01 09:50:01 | D | 13 | S00 | F02 | {PDCB}
> 2020-07-01 11:35:46 | D | 12 | S00 | F02 | {CDSO}
> 2020-07-01 13:22:58 | D | 5 | S00 | F02 | {PDCB}
> 2020-07-01 14:57:38 | D | 7 | S00 | F02 | {PDCB}
>
> INDEX | COLUMN_NAME | DATA_TYPE
> ------+-------------+------------
> 1 | wts | timestamptz
> 3 | wsft | varchar
> 4 | wspan | float8
> 5 | wstate | varchar
> 6 | wline | varchar
> 7 | rcodes | varchar[]
>
> Reuslts in `wspan` column is inaccurate while both type are float8. Most
> weird thing is the 5 to 0 change. for Row 1.
>
> The `_b0206`(#View1) is just a version of
> `stats_per_shift_filtered`(#View2) from past revisions.
> I am sure the original CREATE statement for (#View2) is `CREATE VIEW ...
> AS SELECT * FROM ...._b0206`
>
> Definition of View2 in SQLWorkbench/J generated schema:
>
>
> CREATE OR REPLACE VIEW utlog.stats_per_shift_filtered (#View2)
> (
> wts,
> wdate,
> wsft,
> wspan,
> wstate,
> wline,
> rcodes
> )
> AS
> SELECT stats_per_shift_filtered_u0206.wts,
> stats_per_shift_filtered_u0206.wsft::character varying AS wsft,
> stats_per_shift_filtered_u0206.wspan,
> stats_per_shift_filtered_u0206.wstate,
> stats_per_shift_filtered_u0206.wline,
> stats_per_shift_filtered_u0206.rcodes
> FROM utlog.stats_per_shift_filtered_u0206; (as #View1 in this post)
>
>
> It feels like the utlog.stats_per_shift_filtered_u0206 in
> utlog.stats_per_shift_filtered definition is a different object from
> utlog.stats_per_shift_filtered_u0206?
>
> I am totally out of clues. Any help would be appreciated. Thanks.
>
>
> Regards,
>
> Ben
>
>
>
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>
>

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres(dot)consulting(at)comcast(dot)net

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Peter J. Holzer 2020-09-17 17:26:59 Re: PostgreSQL processes use large amount of private memory on Windows
Previous Message Tom Lane 2020-09-17 14:06:07 Re: PostgreSQL processes use large amount of private memory on Windows