From: | Kirk Wythers <kirk(dot)wythers(at)gmail(dot)com> |
---|---|
To: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | big un stacking query - help save me from myself |
Date: | 2013-03-13 21:45:34 |
Message-ID: | 5CCBF822-1E66-4B8B-A259-958DD384A0BA@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
I hace a fairly large table with two columns that I need to "de-normalize" (235 million rows) There has got to be a better (i.e. faster) approach than what I am doing. I am using a MAX CASE on each of the 24 variables (column names variable and value) that I want to unstack. Any suggestions would be most appreciated. Here is the ANALYZE
GroupAggregate (cost=107156950.16..174988710.12 rows=23511876 width=286)
-> Sort (cost=107156950.16..107744747.04 rows=235118752 width=286)
Sort Key: site, canopy, block, plot, measurement_interval, warming_treatment, treatment_code, treatment_abbr, water_treatment, variable_name, variable_channel, variable_id, rowid, time2, "timestamp", block_name, table_name, batt_volt, program, flag1, flag2, flag3, amb_a_avg, amb_closed_avg, airtc_avg, airtemp_avg, airtemp_max, airtemp_min, all_avgt, am25tref1, ptemp_avg, rh, s_all_avgt_avg, vp_avg, tabove_sdupper, tabove_sdlower, tabove_meantrim, tabove_mean_dc, tsoil_sdupper, tsoil_sdlower, tsoil_meantrim, tsoil_mean_dc
-> Seq Scan on derived_15min_joined (cost=0.00..11782048.52 rows=235118752 width=286)
(4 rows)
And here is the table:
b4warmed3=# \d derived_15min_joined
Table "public.derived_15min_joined"
Column | Type | Modifiers
----------------------+-----------------------------+-----------
site | character varying(6) |
canopy | character varying(24) |
block | character(2) |
plot | character(6) |
measurement_interval | interval |
warming_treatment | character varying(24) |
treatment_code | character varying(24) |
treatment_abbr | character varying(24) |
water_treatment | character varying(24) |
variable_name | character varying(24) |
variable_channel | character varying(24) |
variable_id | character varying(24) |
rowid | character varying(48) |
time2 | timestamp without time zone |
timestamp | timestamp without time zone |
block_name | character varying(8) |
table_name | character varying(10) |
batt_volt | real |
program | character varying(48) |
flag1 | integer |
flag2 | integer |
flag3 | integer |
amb_a_avg | real |
amb_closed_avg | real |
airtc_avg | real |
airtemp_avg | real |
airtemp_max | real |
airtemp_min | real |
all_avgt | real |
am25tref1 | real |
ptemp_avg | real |
rh | real |
s_all_avgt_avg | real |
vp_avg | real |
tabove_sdupper | real |
tabove_sdlower | real |
tabove_meantrim | real |
tabove_mean_dc | real |
tsoil_sdupper | real |
tsoil_sdlower | real |
tsoil_meantrim | real |
tsoil_mean_dc | real |
variable | text |
value | real |
Indexes:
"derived_15min_joined_lower_idx" btree (lower(variable_name::text))
"derived_15min_joined_time2_idx" btree (time2)
b4warmed3=#
And here is my query.
SELECT
site,
canopy,
block,
plot,
measurement_interval,
warming_treatment,
treatment_code,
treatment_abbr,
water_treatment,
variable_name,
variable_channel,
variable_id,
rowid,
time2,
timestamp,
block_name,
table_name,
batt_volt,
program,
flag1,
flag2,
flag3,
amb_a_avg,
amb_closed_avg,
airtc_avg,
airtemp_avg,
airtemp_max,
airtemp_min,
all_avgt,
am25tref1,
ptemp_avg,
rh,
s_all_avgt_avg,
vp_avg,
tabove_sdupper,
tabove_sdlower,
tabove_meantrim,
tabove_mean_dc,
tsoil_sdupper,
tsoil_sdlower,
tsoil_meantrim,
tsoil_mean_dc,
MAX (
CASE
WHEN lower(variable_name) = 'tabove' THEN
value
END
) AS tabove,
MAX (
CASE
WHEN lower(variable_name) = 'tabove_sc' THEN
value
END
) AS tabove_sc,
MAX (
CASE
WHEN lower(variable_name) = 'tabove_delta' THEN
value
END
) AS tabove_delta,
MAX (
CASE
WHEN lower(variable_name) = 'tsoil' THEN
value
END
) AS tsoil,
MAX (
CASE
WHEN lower(variable_name) = 'tsoil_sc' THEN
value
END
) AS tsoil_sc,
MAX (
CASE
WHEN lower(variable_name) = 'tsoil_delta' THEN
value
END
) AS tsoil_delta,
MAX (
CASE
WHEN lower(variable_name) = 's_sdm_out' THEN
value
END
) AS s_sdm_out,
MAX (
CASE
WHEN lower(variable_name) = 'sbtemp' THEN
value
END
) AS sbtemp,
MAX (
CASE
WHEN lower(variable_name) = 'heat_a_avg' THEN
value
END
) AS heat_a_avg,
MAX (
CASE
WHEN lower(variable_name) = 'b_dc_avg' THEN
value
END
) AS b_dc_avg,
MAX (
CASE
WHEN lower(variable_name) = 'targettemp' THEN
value
END
) AS targettemp,
MAX (
CASE
WHEN lower(variable_name) = 's_scldout' THEN
value
END
) AS s_scldout,
MAX (
CASE
WHEN lower(variable_name) = 'tmv' THEN
value
END
) AS tmv,
MAX (
CASE
WHEN lower(variable_name) = 'a_dc' THEN
value
END
) AS a_dc,
MAX (
CASE
WHEN lower(variable_name) = 'a_targettemp' THEN
value
END
) AS a_targettemp,
MAX (
CASE
WHEN lower(variable_name) = 'scldout' THEN
value
END
) AS scldout,
MAX (
CASE
WHEN lower(variable_name) = 'pid_lmt' THEN
value
END
) AS pid_lmt,
MAX (
CASE
WHEN lower(variable_name) = 'targettemp_adj' THEN
value
END
) AS targettemp_adj,
MAX (
CASE
WHEN lower(variable_name) = 'sdm_out' THEN
value
END
) AS sdm_out,
MAX (
CASE
WHEN lower(variable_name) = 's_pid_lmt' THEN
value
END
) AS s_pid_lmt,
MAX (
CASE
WHEN lower(variable_name) = 'tsoilr' THEN
value
END
) AS tsoilr,
MAX (
CASE
WHEN lower(variable_name) = 's_pid_out' THEN
value
END
) AS s_pid_out,
MAX (
CASE
WHEN lower(variable_name) = 'sctemp' THEN
value
END
) AS sctemp,
MAX (
CASE
WHEN lower(variable_name) = 'amb_avg' THEN
value
END
) AS amb_avg
FROM
derived_15min_joined
GROUP BY
site,
canopy,
block,
plot,
measurement_interval,
warming_treatment,
treatment_code,
treatment_abbr,
water_treatment,
variable_name,
variable_channel,
variable_id,
rowid,
time2,
timestamp,
block_name,
table_name,
batt_volt,
program,
flag1,
flag2,
flag3,
amb_a_avg,
amb_closed_avg,
airtc_avg,
airtemp_avg,
airtemp_max,
airtemp_min,
all_avgt,
am25tref1,
ptemp_avg,
rh,
s_all_avgt_avg,
vp_avg,
tabove_sdupper,
tabove_sdlower,
tabove_meantrim,
tabove_mean_dc,
tsoil_sdupper,
tsoil_sdlower,
tsoil_meantrim,
tsoil_mean_dc
;
From | Date | Subject | |
---|---|---|---|
Next Message | Kirk Wythers | 2013-03-13 22:23:43 | big un stacking query - save me from myself |
Previous Message | Joshua Berkus | 2013-03-13 21:04:58 | Re: PostgreSQL 9.2.3 performance problem caused Exclusive locks |