From: | PFC <lists(at)peufeu(dot)com> |
---|---|
To: | Justin <justin(at)emproshunts(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: need to speed up query |
Date: | 2008-05-06 16:35:16 |
Message-ID: | op.uaqq02fzcigqcu@apollo13.peufeu.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
>> What is a "period" ? Is it a month, or something more "custom" ?
>> Can periods overlap ?
> No periods can never overlap. If the periods did you would be in
> violation of many tax laws around the world. Plus it you would not know
> how much money you are making or losing.
I was wondering if you'd be using the same query to compute how much was
gained every month and every week, which would have complicated things.
But now it's clear.
> To make this really funky you can have a Fiscal Calendar year start
> June 15 2008 and end on June 14 2009
Don't you just love those guys ? Always trying new tricks to make your
life more interesting.
>> Note that here you are scanning the entire table multiple times,
>> the complexity of this is basically (rows in gltrans)^2 which is
>> something you'd like to avoid.
>>
> For accounting purposes you need to know the Beginning Balances,
> Debits, Credits, Difference between Debits to Credits and the Ending
> Balance for each account. We have 133 accounts with presently 12
> periods defined so we end up 1596 rows returned for this query.
Alright, I propose a solution which only works when periods don't overlap.
It will scan the entire table, but only once, not many times as your
current query does.
> So period 1 should have for the most part have Zero for Beginning
> Balances for most types of Accounts. Period 2 is Beginning Balance is
> Period 1 Ending Balance, Period 3 is Period 2 ending balance so and so
> on forever.
Precisely. So, it is not necessary to recompute everything for each
period.
Use the previous period's ending balance as the current period's starting
balance...
There are several ways to do this.
First, you could use your current query, but only compute the sum of what
happened during a period, for each period, and store that in a temporary
table.
Then, you use a plpgsql function, or you do that in your client, you take
the rows in chronological order, you sum them as they come, and you get
your balances. Use a NUMERIC type, not a FLOAT, to avoid rounding errors.
The other solution does the same thing but optimizes the first step like
this :
INSERT INTO temp_table SELECT period, sum(...) GROUP BY period
To do this you must be able to compute the period from the date and not
the other way around. You could store a period_id in your table, or use a
function.
Another much more efficient solution would be to have a summary table
which keeps the summary data for each period, with beginning balance and
end balance. This table will only need to be updated when someone finds an
old receipt in their pocket or something.
> This falls under the stupid question and i'm just curious what other
> people think what makes a query complex?
I have some rather complex queries which postgres burns in a few
milliseconds.
You could define complexity as the amount of brain sweat that went into
writing that query.
You could also define complexity as O(n) or O(n^2) etc, for instance your
query (as written) is O(n^2) which is something you don't want, I've seen
stuff that was O(2^n) or worse, O(n!) in software written by drunk
students, in this case getting rid of it is an emergency...
From | Date | Subject | |
---|---|---|---|
Next Message | Scott Marlowe | 2008-05-06 16:41:40 | Re: What constitutes a complex query |
Previous Message | Dennis Muhlestein | 2008-05-06 16:33:13 | Possible Redundancy/Performance Solution |