Re: Refactored queries needing validation of syntactic equivalence

From: Mike Adams <madams55075(at)comcast(dot)net>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-sql <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Refactored queries needing validation of syntactic equivalence
Date: 2007-10-13 17:44:52
Message-ID: 47110414.6060907@comcast.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Richard Huxton wrote:
(quoted OP lines edited for brevity...)
> Mike Adams wrote:
...
>> I've reworked two
>> queries (as text attachment as they are wide lines) to enhance the
>> planner's chance of speeding up the queries (Oracle8i's).
>
> Well, I can't say it's standard procedure to look at Oracle queries, but
> if you don't tell anyone I won't :-)
No prob, my lips are sealed... ;-)

It's basically generic sql sanity checking that's needed anyhow.

>
>> I'm looking for someone to eyeball them and let me know if I've folded
>> the sub-selects up correctly
...
>>
>> Also unfortunately, there currently aren't any issues in the database
>> that these queries are designed to find. All I can say for sure is
>> (as you can see below each query) my refactored queries *at the least*
>> return *no* data faster than the legacy queries...
>
> Test data time then. No alternative to testing these things.
>
I do plan to run the old and the new until I'm sure the new queries
aren't borked and return the same set of info.

>> Thank you in advance
...
> OK, you've substituted and EXISTS check against a sub-query with a
> self-join. The key question of course is whether your join can return
> more than one match from "rcv_mo m" for each row in "rcv_mo o". I can't
> say without knowing more about your schema, and even then I'd want to
> test it.
>
Thanks for the response!
The schema is basically:

table rcv_mo(
CO_CD VCHR(3), --COMPANY CODE.
VE_CD NOT NULL VCHR(4), --VENDOR CODE.
IVC_CD VCHR(20), --INVOICE CODE.
PO_CD NOT NULL VCHR(13), --PURCHASE ORDER CODE.
ITM_CD NOT NULL VCHR(9), --ITEM CODE.
QTY NUM(13,2), --QUANTITY.
UNIT_CST NUM(13,2), --UNIT COST.
RCV_DT DATE, --RECEIVED DATE.
ORIGIN_CD NOT NULL VCHR(5), --CODE REPRESENTING THE PROGRAM WHICH
--CREATED THE RCV_MO RECORD.
STORE_CD VCHR(2), --RECEIVING STORE CODE.
WAYBILL_NUM VCHR(20), --WAYBILL NUMBER FROM RECEIVING BOL
ASSIGNED_DT DATE, --ASSIGNED DATE IS THE SYSTEM DATE WHEN
--THE INVOICE AND THE RECEIVINGS ARE
--LINKED TOGETHER.
TMP_ADJ_ROWID VCHR(40), --THIS FIELD WAS CREATED TO BE USED FOR
--SPECIAL PROCESSING DONE IN MPOI. UPON
--COMMITTING, THE TMP_ADJ_ROWID WILL
--ALWAYS BE NULL.
RCVR_ID VCHR(15), --ID OF THE PERSON RECEIVING THE ORDER.
EMP_CD VCHR(15), --ID OF THE LAST PERSON TO POST A

--CHANGE TO RCV_MO.
);

indexes:
NONUNIQE (CO_CD, VE_CD, IVC_CD, PO_CD, ITM_CD);
NONUNIQE (VE_CD, PO_CD);

Notice the date columns aren't indexed! If they were, even the original
queries would be *much* faster! Unfortunately I cannot get indexes
implemented (not w/o more aggravation than the wait for the original
queries provides).

Here's the "process": inventory mgmt system inserts tuples when/as
product arrives. If more than one of an item (itm_cd) is in the same
batch, it may, or may not, (it's very POM dependent ;) ) be split into
multiple tuples.

Then the accounting dpt enters the product invoice into the "Match Off
Management" system and assigns received product to the appropriate
vendor invoice.

Occasionally, the receiving dpt may post oh say 48 of an item in one
table entry, however, the acctng dpt has 2 invoices for 24 items each.
In MOM the acctng dpt /manually/ splits the entry, thus inserting 2
records who's origin is 'MOM', each for 24 items, and assigns them to
the invoice(s) (or maybe just assigns one since they've not yet rec'd an
invoice for the other 24). So, we can have *many* 'MOM' records.

They are /supposed/ to let me know so I can immediately assign the
original record to a "fake" invoice called "SPLIT IN MOM" and it drops
off the radar. So of course, I'm rarely notified. This table is used to
accrue for received but unpaid merchandise: "orphaned" entries inflate
the accrual and inflate the value of inventory (not good).

So.....
The first query should pull all 'MOM' records that have one or more
corresponding, and possibly orphaned, unassigned receiving records
belonging to the same po_cd and item_cd.

The second query should pull all unassigned, and possibly orphaned
receiving records that have one or more corresponding 'MOM' records once
again matching on po_cd and item_cd.

Using the results of both queries to double check each other, I can
figure out which (if any) open records are, in fact, orphans and do an
"after the fact" assignment to the "SPLIT IN MOM" invoice to reduce our
accrual.

Of course, our ERMS should take care of this automagically; but,
tragically, it seems "real" inventory cost flow was attached to the
system using duct tape, hot glue, and a couple of thumb tacks.

So, given all the administriva above, have I actually refactored them
correctly?

Thanks much!

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Richard Huxton 2007-10-16 08:27:20 Re: Refactored queries needing validation of syntactic equivalence
Previous Message Erik Jones 2007-10-12 16:17:51 Re: Accessing field of OLD in trigger