From: | Carl Sopchak <carl(dot)sopchak(at)cegis123(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)commandprompt(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: Newbie questions relating to transactions |
Date: | 2009-03-08 17:17:12 |
Message-ID: | 200903081317.13158.carl.sopchak@cegis123.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Sunday, March 08, 2009, Alvaro Herrera wrote:
> Carl Sopchak wrote:
> > Well, the upgrade to 8.3 seemed to rid me of the command limit, but now
> > I'm running out of memory. I have 2Gb physical and 8Gb swap (after
> > adding 4Gb).
>
> Do you have AFTER triggers on the involved tables? They are recorded on
> memory and we have no mechanism to spill to disk, so it's frequent that
> those cause out-of-memory. If that's the explanation, your workaround
> would be to get rid of them.
No triggers have been defined on any tables in the database...
>
> > Is there a way for me to run this outside of one huge transaction? This
> > really shouldn't be using more than a few hundred megs of RAM (assuming
> > cursor records are all stored in memory)...
>
> Hmm, maybe you're holding too many cursors open and not closing them
> timely? Did you post your function for review?
I am only using one cursor, which is opened and closed repeatedly. It pulls
the base data from the database for the calculations.
I have not posted the function for review yet because its function is
proprietary. However, I have stripped out the proprietary stuff, and include
the code below. I marked everything stripped out by placing a brief
description enclosed within {{ and }}. I left all of the places that the
database is accessed in the code. I changed some of the line wrapping to fit
a reasonable width (which I mention in case you see syntax type errors).
There is a few lines that save the calculation details based on a flag in the
trial_header table. This flag is set to N for the run that I am having
issues with, so these records are not being created. I left that code in
below for completeness...
I realize this isn't probably the cleanest code out there (I'm sure using
prepared statements would help speed), but it was really meant to be a "quick
and dirty" way to calculate the data I need. Any comments or suggestions on
improving the code is welcome.
create or replace function Run_Trial (tid integer) returns void as $proc$
declare
{{ declarations }}
begin
-- Set start time...
program_version := '1.16';
update trial_header set start_timestamp = clock_timestamp(),
run_version = program_version, end_timestamp = null
where trial_id = tid;
-- get rid of prior run, if any:
delete from trial_results where trial_id = tid;
delete from trial_calc_detail where trial_id = tid;
-- Get the trial parameters:
select * into trial_hdr from trial_header where trial_id = tid;
{{ Do some calculations }}
-- Create temp table of data. This simplifies the coding below A LOT.
{{ conditional calc }}
-- (I can't figure out how to do this with a dynamic select and
-- "insert into trial_data select" and I get an error
-- if I make the dynamic SQL a "select into temp table trial_data"...)
-- Do it by brute force, I guess...
drop table if exists trial_data;
create temp table trial_data(
{{ fields }}
);
for row in execute
'select {{ select statement }}' loop
execute 'insert into trial_data values(' || {{ fields }} || ')';
end loop;
create index trial_data_idx on trial_data (data_yyyymm, data_date);
create index trial_data_idx2 on trial_data (data_date);
-- Get date range for the data set we're using
for row in execute 'select min(data_date) as min_date,
max(data_date) as max_date from trial_data' loop
low_data_date := row.min_date;
high_data_date := row.max_date;
end loop;
-- Calculate maximum number of years that data covers
max_years = floor((high_data_date - low_data_date) / 365.25);
-- Loop through all possible "x year" periods
for cur_years in 1 .. max_years loop
-- start from the first period on file:
next_iteration_start := low_data_date;
num_periods := trial_hdr.periods_per_year * cur_years
+ trial_hdr.{{ field }};
for row in execute 'select count(*) as cnt from (
select data_date from trial_data where data_date >= ' ||
quote_literal(next_iteration_start) ||
' Limit ' || to_char(num_periods, '9999999999') || ') a' loop
data_periods := row.cnt;
end loop;
-- Do each "x year" period in data
while data_periods = num_periods loop
-- Initialize calculation
-- used to set sucessive values for next_iteration_start:
iteration_counter := 0;
{{ some calculations }}
for row in execute 'select max(data_date) as max_date' ||
' from ( select data_date from trial_data' ||
' where data_date >= ' ||
quote_literal(next_iteration_start) ||
' order by data_date' ||
' Limit ' || to_char(num_periods, '999999999') || ') a' loop
per_end_date := row.max_date;
end loop;
-- Get data for calculation
open data_cursor for execute 'select * from trial_data' ||
' where data_date >= ' || quote_literal(next_iteration_start) ||
' order by data_date ' ||
' Limit ' || to_char(num_periods, '999999999');
loop -- through periods for calculation
fetch data_cursor into data;
if not found then
exit;
end if;
-- determine next iteration start date:
iteration_counter := iteration_counter + 1;
if iteration_counter = 1 then
{{ calculations }}
end if;
if iteration_counter = 2 then
next_iteration_start := data.data_date;
end if;
{{ calculations based on row data }}
-- save details if requested:
if upper(trial_hdr.save_calc_details) = 'Y' then
insert into trial_calc_detail values( {{ fields }} );
end if;
{{ Calculation }}
end loop; -- through periods for calculation
-- Final calculations:
{{ calculations }}
-- save results:
insert into trial_results values( {{ fields }} );
close data_cursor;
for row in execute 'select count(*) as cnt from (
select data_date from trial_data where data_date >= ' ||
quote_literal(next_iteration_start) ||
' Limit ' || to_char(num_periods, '9999999999') || ') a'
loop
data_periods := row.cnt;
end loop;
end loop; -- Do each "x year" period in data
end loop; -- loop through all possible "x year" periods
-- mark finish time on trial:
update trial_header set end_timestamp = clock_timestamp()
where trial_id = tid;
-- clean up:
drop table trial_data;
end;
$proc$ language plpgsql;
From | Date | Subject | |
---|---|---|---|
Next Message | Christian Schröder | 2009-03-08 18:37:32 | Re: Performance of subselects |
Previous Message | Thomas Kellerer | 2009-03-08 17:13:12 | Re: Newbie questions relating to transactions |