From: | Jan Nielsen <jan(dot)sture(dot)nielsen(at)gmail(dot)com> |
---|---|
To: | Venki Ramachandran <venki_ramachandran(at)yahoo(dot)com> |
Cc: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Parallel Scaling of a pgplsql problem |
Date: | 2012-04-26 03:41:13 |
Message-ID: | CANxH4hFe=VDS4WzFAnOFArhVUUfarG4Vubku2CBxjbgz1TWmCQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Wed, Apr 25, 2012 at 12:52 PM, Venki Ramachandran <
venki_ramachandran(at)yahoo(dot)com> wrote:
> Hi all:
> Can someone please guide me as to how to solve this problem? If this is
> the wrong forum, please let me know which one to post this one in. I am new
> to Postgres (about 3 months into it)
>
> I have PostGres 9.0 database in a AWS server (x-large) and a pgplsql
> program that does some computation. It takes in a date range and for one
> pair of personnel (two employees in a company) it calculates some values
> over the time period. It takes about 40ms (milli seconds) to complete and
> give me the answer. All good so far.
>
> Now I have to run the same pgplsql on all possible combinations of
> employees and with 542 employees that is about say 300,000 unique pairs.
>
> So (300000 * 40)/(1000 * 60 * 60) = 3.33 hours and I have to rank them and
> show it on a screen. No user wants to wait for 3 hours, they can probably
> wait for 10 minutes (even that is too much for a UI application). How do I
> solve this scaling problem? Can I have multiple parellel sessions and each
> session have multiple/processes that do a pair each at 40 ms and then
> collate the results. Does PostGres or pgplsql have any parallel computing
> capability.
>
Setting aside the database concurrency question, have you considered
application-level solutions?
How often does a user expect their rank to change? If a daily rank change
is fine, trigger the (lengthy) ranking calculation nightly and cache the
results in a materialized view for all users; you could continuously
rebuild the view to improve freshness to within 4 hours. To go faster with
an application-level solution, you will have to reduce your calculation to
*what's* likely to be most important to the individual which, again, you
can cache; or, if you can predict *who's* most likely to request a ranking,
calculate these first; or, both.
These are likely good things to consider regardless of any improvements you
make to the back-end ranking calculation, though at you will hit a point of
diminishing returns if your ranking calculation drops below some
"tolerable" wait. In the web world "tolerable" is about 3 seconds for the
general public and about 30 seconds for a captured audience, e.g.,
employees. YMMV.
Cheers,
Jan
From | Date | Subject | |
---|---|---|---|
Next Message | Yeb Havinga | 2012-04-26 06:49:12 | Re: Parallel Scaling of a pgplsql problem |
Previous Message | Venki Ramachandran | 2012-04-26 02:40:18 | Re: Parallel Scaling of a pgplsql problem |