Re: Long running INSERT+SELECT query

From: Steven Lembark <lembark(at)wrkhors(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Cc: lembark(at)wrkhors(dot)com
Subject: Re: Long running INSERT+SELECT query
Date: 2018-04-27 14:29:30
Message-ID: 20180427092930.1ee8aa43@wrkhors.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, 26 Apr 2018 21:08:01 +0300
Vitaliy Garnashevich <vgarnashevich(at)gmail(dot)com> wrote:

>     INSERT INTO cmdb_sp_usage_history
>       (created_by, updated_by, created_on, updated_on, mod_count,
>       summary_on, quarter, product, used_from, "user",
>       keystrokes, minutes_in_use, times_started, avg_keystrokes,
> max_keystrokes, spkg_operational)
>     SELECT
>        2, 2, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP, 1,
>        CURRENT_TIMESTAMP, quarter.id, "spv"."product",
> "usage"."used_from", "usage"."user",
>        coalesce(sum("usage"."keystrokes"), 0),
>        coalesce(sum("usage"."minutes_in_use"), 0),
>        coalesce(sum("usage"."times_started"), 0),
>        coalesce(avg("usage"."keystrokes"), 0),
>        coalesce(max("usage"."keystrokes"), 0),
>        bool_or("cmdb_ci"."operational")
>     FROM
>       "cmdb_program_daily_usage" "usage"
>     LEFT OUTER JOIN
>       "cmdb_program_instance" "p" ON "p"."id" =
> "usage"."program_instance" LEFT OUTER JOIN
>       "cmdb_ci_spkg" "s" ON "s"."id" = "p"."spkg"
>     LEFT OUTER JOIN
>       "cmdb_ci" "cmdb_ci" ON "s"."id" = "cmdb_ci"."id"
>     LEFT OUTER JOIN
>       "cmdb_software_product_version" "spv" ON "spv"."id" =
> "s"."software" WHERE ("usage"."minutes_in_use" > 0)
>       AND ((NOT ("s"."software" IS NULL))
>            AND ((NOT ("s"."os" = TRUE))
>                 OR ("s"."os" IS NULL)))
>       AND ("usage"."usage_date" >= quarter.start_date)
>       AND ("usage"."usage_date" < quarter.end_date)
>     GROUP BY "spv"."product", "usage"."used_from", "usage"."user"
>     HAVING (coalesce(sum("usage"."keystrokes"), 0) > 0) OR
> (coalesce(sum("usage"."minutes_in_use"), 0) > 0) OR
> (coalesce(sum("usage"."times_started"), 0) > 0)
>     ORDER BY "spv"."product", "usage"."used_from", "usage"."user";

create temporary table
foobar

select
<unaggregatedl, un-coalesced data>
from
<join from hell, above, sans group by>
;

This isolates the lock time to performing the bare select, after
which you can coalesce and sum to your heart's content without
locking any of it.

The point is performing the absolute minimum of processing to
generate the temp table so as to release any locks quickly and
avoid "group by" in the main join.

Yes, this might end up creating a large-ish temp table :-)

One other approach would be selecting only incremental data
(e.g., daily) which locks a much smaller subset of the rows
and aggregating the daily totals into quarterly, whatever.
Call it daily usage, select where usage_date = today's or
timestamp && a tstzrange of ( 0000, 2400, [) ). That might
also simplify your query logic: all the coalesce op's end up
in your daily/weekly/monthly/whatever summary, the quarterly
values in the reporting are just sum X group by.

--
Steven Lembark 1505 National Ave
Workhorse Computing Rockford, IL 61103
lembark(at)wrkhors(dot)com +1 888 359 3508

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Steven Lembark 2018-04-27 14:30:27 Re: Long running INSERT+SELECT query
Previous Message a 2018-04-27 14:25:10 Re: Parameter passing in trigger function write in C