Help with slow table update

From: Pawel Veselov <pawel(dot)veselov(at)gmail(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Help with slow table update
Date: 2015-04-09 23:18:55
Message-ID: CAMnJ+BfoHRxvu4LJ66jzZCcL+5D+YvQrtOSYwNgPMDyuYiLvzQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi.

I have a plpgsql procedure that updates a few similar tables.
for some reason, updates on one of the tables take a lot longer the updates
on the other ones. The difference is, say, 7 seconds vs. 80 milliseconds.

the procedure uses cursors and record variables to do the updates. For
example:

update r_agrio_total set
unserved = unserved + (agrow->>'unserved')::numeric(38),
r_brkconn = mush_brk_conn(r_brkconn, q_item.r_brkconn),
where
tagid = _tagid and
unitid = (akey->>'unitid')::numeric and
placement = (akey->>'placement')::numeric and
device_type = (akey->>'device_type')::numeric;

There is another table (xq_agr) that is read record by record, and for each
of those records, such update is executed.

I was trying to select analyze the updates to see where the time could be
spent.
There are only 24 row in the "bad" table, and 3,400 rows in "good" table.
So, for the "bad" table, most of the updates will be on the same rows. The
times were measured on processing 100 original records.

When I'm analyzing pure update statements, I don't see anything strange.

"bad" table: explain analyze update r_agrio_total set unconfirmed =
unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0 and
device_type = 100;

RESULT:
Update on r_agrio_total (cost=0.42..4.46 rows=1 width=321) (actual
time=0.253..0.253 rows=0 loops=1)
-> Index Scan using tag_r_agrio_total on r_agrio_total
(cost=0.42..4.46 rows=1 width=321) (actual time=0.037..0.041 rows=1
loops=1)
Index Cond: (tagid = 1000::numeric)
Filter: ((unitid = 1000::numeric) AND (placement = 0::numeric) AND
(device_type = 100::numeric))
Rows Removed by Filter: 7
Total runtime: 0.282 ms

"good" table: explain analyze update r_agrio_hourly set unconfirmed =
unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0 and
device_type = 100 and rowdate = '2015-02-23T13';

RESULT:
Update on r_agrio_hourly (cost=0.42..17.36 rows=6 width=329) (actual
time=0.102..0.102 rows=0 loops=1)
-> Index Scan using u_r_agrio_hourly on r_agrio_hourly
(cost=0.42..17.36 rows=6 width=329) (actual time=0.047..0.048 rows=1
loops=1)
Index Cond: ((tagid = 1000::numeric) AND (unitid = 1000::numeric)
AND ((rowdate)::text = '2015-02-23T13'::text) AND (device_type =
100::numeric) AND (placement = 0::numeric))
Total runtime: 0.135 ms

When I try doing it with WITH statement (really, to apply the actual data
that the plpgsql function uses), there is something strange in the "bad"
table.

explain analyze
with SRC as (select * from xq_agr where id = 914830)
update r_agrio_total set
unconfirmed = unconfirmed +
(SRC.r_agrio->>'unconfirmed')::numeric(38)
from SRC
where
tagid = (SRC.r_agrio->'key'->>'tagid')::numeric and
unitid = (SRC.r_agrio->'key'->>'unit')::numeric and
placement = (SRC.r_agrio->'key'->>'placement')::numeric and
device_type = (SRC.r_agrio->'key'->>'device_type')::numeric;

RESULT:
Update on r_agrio_total (cost=8.91..32777.51 rows=19331 width=409)
(actual time=0.107..0.107 rows=0 loops=1)
CTE src
-> Index Scan using xq_agr_pkey on xq_agr (cost=0.42..8.44 rows=1
width=379) (actual time=0.026..0.027 rows=1 loops=1)
Index Cond: (id = 914830)
-> Nested Loop (cost=0.46..32769.07 rows=19331 width=409) (actual
time=0.107..0.107 rows=0 loops=1)
-> CTE Scan on src (cost=0.00..0.02 rows=1 width=88) (actual
time=0.032..0.033 rows=1 loops=1)
-> Index Scan using u_r_agrio_total on r_agrio_total
(*cost=0.46..32285.78
rows=19331* width=321) (actual time=0.001..0.001 rows=0 loops=1)
Index Cond: ((tagid = (((src.r_agrio -> 'key'::text) ->>
'tagid'::text))::numeric) AND (unitid = (((src.r_agrio -> 'key'::text) ->>
'unit'::text))::numeric) AND (device_type = (((src.r_agrio -> 'key'::text)
->> 'device_type'::text))::numeric) AND (placement = (((src.r_agrio ->
'key'::text) ->> 'placement'::text))::numeric))
Total runtime: 0.155 ms

explain analyze
with SRC as (select * from xq_agr where id = 914830)
update r_agrio_hourly set
unconfirmed = unconfirmed +
(SRC.r_agrio->>'unconfirmed')::numeric(38)
from SRC
where
tagid = (SRC.r_agrio->'key'->>'tagid')::numeric and
unitid = (SRC.r_agrio->'key'->>'unit')::numeric and
placement = (SRC.r_agrio->'key'->>'placement')::numeric and
device_type = (SRC.r_agrio->'key'->>'device_type')::numeric and
rowdate = (SRC.r_agrio->'key'->>'rowdate');

RESULT:
Update on r_agrio_hourly (cost=8.91..52.91 rows=20 width=417) (actual
time=0.123..0.123 rows=0 loops=1)
CTE src
-> Index Scan using xq_agr_pkey on xq_agr (cost=0.42..8.44 rows=1
width=379) (actual time=0.023..0.024 rows=1 loops=1)
Index Cond: (id = 914830)
-> Nested Loop (cost=0.47..44.47 rows=20 width=417) (actual
time=0.121..0.121 rows=0 loops=1)
-> CTE Scan on src (cost=0.00..0.02 rows=1 width=88) (actual
time=0.030..0.031 rows=1 loops=1)
-> Index Scan using u_r_agrio_hourly on r_agrio_hourly
(*cost=0.47..43.95
rows=20* width=329) (actual time=0.000..0.000 rows=0 loops=1)
Index Cond: ((tagid = (((src.r_agrio -> 'key'::text) ->>
'tagid'::text))::numeric) AND (unitid = (((src.r_agrio -> 'key'::text) ->>
'unit'::text))::numeric) AND ((rowdate)::text = ((src.r_agrio ->
'key'::text) ->> 'rowdate'::text)) AND (device_type = (((src.r_agrio ->
'key'::text) ->> 'device_type'::text))::numeric) AND (placement =
(((src.r_agrio -> 'key'::text) ->> 'placement'::text))::numeric))
Total runtime: 0.176 ms

I've tried doing vacuum full analyze, rebuilding indexes. Vacuum did help
somewhat, but the time differential is still huge (that 7sec vs. 80ms)

The table structure (some fields thrown out just to save output length):

=> \d r_agrio_total
Column | Type | Modifiers

-------------------+---------------+------------------------------------------------------------
id | bigint | not null default
nextval('r_agrio_total_id_seq'::regclass)
tagid | numeric(38,0) | not null
unitid | numeric(38,0) | not null
device_type | numeric(38,0) | not null
placement | numeric(38,0) | not null default 0
unserved | numeric(38,0) | not null default 0
r_brkconn | json |
Indexes:
"r_agrio_total_pkey" PRIMARY KEY, btree (id)
"u_r_agrio_total" UNIQUE, btree (tagid, unitid, device_type, placement)
"unit_r_agrio_total" btree (unitid)
"tag_r_agrio_total" btree (tagid)

=> \d r_agrio_hourly
Column | Type |
Modifiers
-------------------+-----------------------+-------------------------------------------------------------
id | bigint | not null default
nextval('r_agrio_hourly_id_seq'::regclass)
tagid | numeric(38,0) | not null
unitid | numeric(38,0) | not null
rowdate | character varying(15) | not null
device_type | numeric(38,0) | not null
placement | numeric(38,0) | not null default 0
unserved | numeric(38,0) | not null default 0
unconfirmed | numeric(38,0) | not null default 0
r_brkconn | json |
Indexes:
"r_agrio_hourly_pkey" PRIMARY KEY, btree (id)
"u_r_agrio_hourly" UNIQUE, btree (tagid, unitid, rowdate, device_type,
placement)
"unit_r_agrio_hourly" btree (unitid)
"rowdate_r_agrio_hourly" btree (rowdate)
"tag_r_agrio_hourly" btree (tagid)

Would appreciate any help on this, thank you!

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Melvin Davidson 2015-04-10 03:09:23 Re: no pg_hba.conf entry for replication connection from host
Previous Message Adrian Klaver 2015-04-09 22:43:55 Re: no pg_hba.conf entry for replication connection from host