Re: Find inconsistencies in data with date range

From: Jason Aleski <jason(dot)aleski(at)gmail(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: Find inconsistencies in data with date range
Date: 2015-03-12 04:10:08
Message-ID: 550111A0.2070105@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

In case anyone else needs similar code, I was able to get this working.
Below is the code that pulls the missing dates using a cursor and
returns the information into a table. I'm sure there may be a way to
make the code more efficient, but considering this will only get ran
maybe once a quarter (for quarterly reports), it works for me. With
700+ stores, it takes about 30 minutes to fully run from a reporting
server. I have a JAVA program that queries the function "SELECT * FROM
eod_missing_dates();" Then sends all the missing dates to a RabbitMQ
server to with a worker program to try to rebuild the missing eod
summaries and if not, it will send a message to the store managers.
Hopefully this code will help someone else!

CREATE OR REPLACE FUNCTION eod_missing_dates()
RETURNS TABLE(store_id uuid, location character varying, missing_ts
timestamp with time zone) AS
$BODY$
DECLARE
location_cursor CURSOR FOR SELECT * FROM locations ORDER BY store_id;
store_rec location%ROWTYPE;
BEGIN

CREATE TEMP TABLE dr_temptable(dr_ts, dr_dow) AS (SELECT t1.GenDate
as gendate, extract(dow from GenDate) AS dayofweek
FROM (SELECT date as GenDate
FROM
generate_series('1950-01-01'::date,CURRENT_TIMESTAMP::date,'1
day'::interval) date
) AS t1
WHERE extract(dow from GenDate) NOT IN (0,6));

OPEN location_cursor;
LOOP
FETCH location_cursor INTO store_rec;
EXIT WHEN store_rec IS NULL;

IF NOT FOUND THEN
EXIT;
END IF;

RAISE INFO '%', 'Checking data/dates for ' || store_rec.location;
RETURN QUERY SELECT store_rec.row_id as store_id,
store_rec.location AS location, dr_ts AS missing_ts FROM dr_temptable AS t1
WHERE dr_ts NOT IN (SELECT t1.eod_ts FROM
daily_salessummary AS t1
JOIN location AS t2 ON t1.store_id = t2.row_id
WHERE t2.location = store_rec.location)
AND dr_ts > (SELECT start_date FROM locations
WHERE location=store_rec.location);
END LOOP;
CLOSE location_cursor;
DROP TABLE dr_temptable;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
;

Jason Aleski / IT Specialist

>
> On 6 March 2015 at 22:38, Jason Aleski <jason(dot)aleski(at)gmail(dot)com
> <mailto:jason(dot)aleski(at)gmail(dot)com>> wrote:
>
> I know I can do this Java, but I'd rather have this running as a
> Stored Procedure. What I am wanting to do is identify and
> potentially correct the summary data for date inconsistencies. We
> have policies/red flag reports in place to keep this from
> happening, but we are now cleaning up history. The query below
> works on a per store basis, but I'd like to be able to run this
> for all stores in the location table.
>
> I've looked at some procedure codes regarding looping, but
> everything I try to create seems to give me problems. THe code
> I'm trying is also below. Does anyone have any suggestions on how
> to accomplish this?
>
>
>
> _Working Tables_
> locations - table contains store information, startup date,
> address, etc
> daily_salessummary - table holds daily sales summary by store
> (summary should be updated nightly). eod_ts is End of Day Timestamp.
>
> _Query_
> WITH datelist AS(
> SELECT t1.GenDate as gendate, extract(dow from GenDate) AS dayofweek
> FROM (SELECT date as GenDate
> FROM
> generate_series('1985-01-01'::date,CURRENT_TIMESTAMP::date,'1
> day'::interval) date
> ) AS t1
> )
> SELECT gendate FROM datelist AS t1
> WHERE gendate NOT IN (SELECT t1.eod_ts FROM daily_salessummary AS t1
> JOIN locations AS t2 ON t1.location_id =
> t2.row_id
> WHERE t2.locationCode = 'US_FL_TAMPA_141')
>
> AND gendate > (SELECT start_date FROM locations WHERE
> locationCode = 'US_FL_TAMPA_141')
>
>
> _Desired Output_ - could output to an exceptions table
> StoreCode 'US_FL_TAMA_141' missing daily summary for 1998-01-01
> StoreCode 'MX_OAXACA_SALINA_8344' missing daily summary for 2011-06-05
>
>
> _ProcedureSQL_ (contains unknown errors)
> DECLARE
> CURSOR location_table IS
> SELECT locationCode FROM locations;
> BEGIN
> FOR thisSymbol IN ticker_tables LOOP
> EXECUTE IMMEDIATE 'WITH datelist AS(
> SELECT t1.GenDate as
> gendate, extract(dow from GenDate) AS dayofweek
> FROM (SELECT date
> as GenDate
> FROM
> generate_series('1985-01-01'::date,CURRENT_TIMESTAMP::date,'1
> day'::interval) date
> ) AS t1
> )
> SELECT gendate FROM
> datelist AS t1
> WHERE gendate NOT IN
> (SELECT t1.eod_ts FROM daily_salessummary AS t1
> JOIN locations AS t2 ON t1.location_id =
> t2.row_id
> WHERE t2.locationCode = '' ||
> location_table.locationCode || '')
> AND gendate > (SELECT
> start_date FROM locations WHERE locationCode = '' ||
> location_table.locationCode || '')';
> END LOOP;
> END;
>
>
>
> --
> Jason Aleski / IT Specialist
>
>

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message bricklen 2015-03-14 14:20:54 Re: Strange Query - Reg
Previous Message Adrian Klaver 2015-03-10 15:53:15 Re: Strange Query - Reg