Re: Checkpoint_segments optimal value

From: Prabhjot Sheena <prabhjot(dot)sheena(at)rivalwatch(dot)com>
To: Potentialtech <wmoran(at)potentialtech(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Forums postgresql <pgsql-general(at)postgresql(dot)org>
Subject: Re: Checkpoint_segments optimal value
Date: 2014-07-17 23:26:20
Message-ID: CAOf_bQbFEM6oT24sLrpw+-LKHGXvR=VomFQH7GSGV0jVtrCLOg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-general

Here is the explain plan. There query time went backup up to 2 to 3 minutes
from 3 second in just 2 hrs. Can anyone suggest something on how to fix
this or why this is happening

explain SELECT account.id, account.organization_id, run.application_id,
work_unit.script, work_unit.id, work_unit.start_time, run.id,
work_unit.priority FROM work_unit, run, account WHERE work_unit.status =
3 AND work_unit.run_id = run.id AND work_unit.type != 1 AND run.status = 1
AND run.account_id = account.id;
QUERY PLAN
-----------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..24.84 rows=1 width=64)
Join Filter: (run.id = work_unit.run_id)
-> Nested Loop (cost=0.00..16.55 rows=1 width=16)
-> Index Scan using run_status_1 on run (cost=0.00..8.27 rows=1
width=12)
Index Cond: (status = 1)
-> Index Scan using account_pkey on account (cost=0.00..8.27
rows=1 width=8)
Index Cond: (account.id = run.account_id)
-> Index Scan using idx_work_unit_1_status_3_new on work_unit
(cost=0.00..8.28 rows=1 width=52)
Index Cond: (work_unit.status = 3)

Table Structure

\d work_unit
Table "public.work_unit"
Column | Type | Modifiers
------------+-----------------------------+-----------
id | integer | not null
run_id | integer |
status | integer |
script | character varying |
parameters | character varying |
input | character varying |
start_time | timestamp without time zone |
stop_time | timestamp without time zone |
priority | integer |
type | integer |
lineitems | integer |
outputs | integer |
sub_type | integer |
site_code | text |
Indexes:
"work_unit_pkey" PRIMARY KEY, btree (id)
"idx_work_unit_1_partial" btree (run_id, start_time) WHERE status >= 3
"idx_work_unit_1_run_id" btree (run_id)
"idx_work_unit_1_script" btree (script)
"idx_work_unit_1_site_code" btree (site_code)
"idx_work_unit_1_starttime" btree (start_time)
"idx_work_unit_1_status_3_new" btree (status, type) WHERE status = 3
AND type <> 1
"idx_work_unit_1_status_5" btree (status) WHERE status >= 4
"idx_work_unit_1_status_part_new" btree (status) WHERE status < 4
"idx_work_unit_1_stop_time" btree (stop_time)

\d run
Table "public.run"
Column | Type | Modifiers
---------------------+--------------------------+---------------
id | integer | not null
job_id | integer |
start_time | timestamp with time zone | not null
status | integer | not null
job_name | character varying |
account_id | integer |
application_id | integer |
postproc_script | character varying |
upload_url | character varying |
complete_time | timestamp with time zone |
email | character varying |
size | integer |
errors | integer |
raw_count | integer |
munge | integer |
job_details | character varying |
user_settings | character varying |
run_type | integer |
stop_after | timestamp with time zone |
total_work_time | integer |
flags | integer |
cluster_info_id | integer |
skiplineitems_count | integer |
last_update_ts | timestamp with time zone |
result_exists | boolean | default false
abort_type | integer |
Indexes:
"run_pkey" PRIMARY KEY, btree (id)
"idx_run_acc_stat_comp" btree (account_id, status, complete_time)
"idx_run_app" btree (application_id)
"idx_run_complete_time_2" btree (complete_time)
"idx_run_job_name" btree (job_name)
"run_application_account_idx" btree (application_id, account_id, status)
"run_job" btree (job_id)
"run_result_exists" btree (result_exists)
"run_start_time" btree (start_time)
"run_status" btree (status)
"run_status_1" btree (status) WHERE status = 1
"run_status_part_idx" btree (status) WHERE status < 3
Triggers:
run_upd_ts BEFORE INSERT OR UPDATE ON run FOR EACH ROW EXECUTE
PROCEDURE set_last_update_ts()
tr_proc_update_job_summary AFTER UPDATE ON run FOR EACH ROW EXECUTE
PROCEDURE tr_proc_update_job_summary()

On Thu, Jul 17, 2014 at 12:00 PM, Potentialtech <wmoran(at)potentialtech(dot)com>
wrote:

> On Thu, 17 Jul 2014 11:28:04 -0700
> Prabhjot Sheena <prabhjot(dot)sheena(at)rivalwatch(dot)com> wrote:
>
> > Thanks i have changed that to 64 and reloaded it.
> >
> > When i had load issue today there was this exact same query that hits the
> > db like 50 to 60 times from different machines in 3 to 4 minutes and was
> > taking long time to execute and was holding up the database. i did
> recreate
> > an index and it started performing better. My question is why it is not
> > fetching the result from the memory since its the same query that runs
> > again and again.
> >
> > This is the actual query i m taking about:
> >
> > SELECT account.id, account.organization_id, run.application_id,
> > work_unit.script, work_unit.id, work_unit.start_time, run.id,
> > work_unit.priority FROM work_unit, run, account WHERE
> work_unit.status =
> > 3 AND work_unit.run_id = run.id AND work_unit.type != 1 AND run.status
> = 1
> > AND run.account_id = account.id
> >
> > Pls suggest if i can do something to fix this
>
> Provide a lot more information if you want anyone on the list to be able
> to help: such as explain output while the problem is happening, and some
> information about the makeup of the tables (column types/indexes/# rows).
>
> Guessing, based on the little information you've provided, it's likely
> that you have something else going on at the same time that you're not
> aware of, and this particular query is only a symptom. I'm saying that
> because SELECTs don't generally create any WAL traffic, so there were
> probably some INSERT/UPDATE/DELETE running at the same time that both
> pushed those 3 tables out of memory and/or saturated disk activity to
> the point that accessing everything becomes slow for a short while, and
> it's just those queries that you noticed.
>
> Are you making the mistake where you set log_min_duration to 1s and only
> worry about queries that take longer than 1s? Because I've seen (on
> multiple occasions) where many 1000s of queries, each running less than
> 1s, are the actual cause of the problem. pgBadger is particularly helpful
> in tracking down situations like that.
>
> > On Thu, Jul 17, 2014 at 11:06 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > > Potentialtech <wmoran(at)potentialtech(dot)com> writes:
> > > > If the warning isn't happening too often, I would try increasing it
> only
> > > a
> > > > little and see if it helps. If it's not enough you can then
> increase it
> > > some
> > > > more. Various sources around the Internet suggest that you don't
> want
> > > to go
> > > > much larger than 256 for this (if only because it's uncommon to do so
> > > and is
> > > > probably indicative of other tuning that you need to do).
> > > Unfortunatley, you
> > > > need to restart PG for the change to take effect, so you have to
> balance
> > > > experimenting with your tuning against how often you can get away
> with a
> > > server
> > > > restart.
> > >
> > > Huh? You don't need a restart, just a reload (SIGHUP) to change that.
> > >
> > > regards, tom lane
> > >
>
>
> --
> Potentialtech <wmoran(at)potentialtech(dot)com>
>

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Prabhjot Sheena 2014-07-17 23:39:40 Re: Checkpoint_segments optimal value
Previous Message Nathan Paul 2014-07-17 20:36:34 Wal E S3 questions

Browse pgsql-general by date

  From Date Subject
Next Message Prabhjot Sheena 2014-07-17 23:39:40 Re: Checkpoint_segments optimal value
Previous Message AI Rumman 2014-07-17 23:08:48 Re: Whats is lock type transactionid?