Re: Help with slow table update

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Pawel Veselov <pawel(dot)veselov(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Help with slow table update
Date: 2015-04-13 00:40:12
Message-ID: 552B106C.9030409@BlueTreble.com
Lists: pgsql-general

On 4/9/15 6:18 PM, Pawel Veselov wrote:
> Hi.
>
> I have a plpgsql procedure that updates a few similar tables.
> For some reason, updates on one of the tables take a lot longer than
> updates on the other ones. The difference is, say, 7 seconds vs. 80
> milliseconds.
>
> The procedure uses cursors and record variables to do the updates. For
> example:
>
> update r_agrio_total set
> unserved = unserved + (agrow->>'unserved')::numeric(38),
> r_brkconn = mush_brk_conn(r_brkconn, q_item.r_brkconn)
> where
> tagid = _tagid and
> unitid = (akey->>'unitid')::numeric and
> placement = (akey->>'placement')::numeric and
> device_type = (akey->>'device_type')::numeric;
>
> There is another table (xq_agr) that is read record by record, and for
> each of those records, such update is executed.
>
> I was trying to EXPLAIN ANALYZE the updates to see where the time could
> be spent.
> There are only 24 rows in the "bad" table, and 3,400 rows in the "good"
> table. So, for the "bad" table, most of the updates will be on the same
> rows. The times were measured on processing 100 original records.
>
> When I'm analyzing pure update statements, I don't see anything strange.
>
> "bad" table: explain analyze update r_agrio_total set unconfirmed =
> unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0
> and device_type = 100;
>
> RESULT:
> Update on r_agrio_total (cost=0.42..4.46 rows=1 width=321) (actual
> time=0.253..0.253 rows=0 loops=1)
> -> Index Scan using tag_r_agrio_total on r_agrio_total
> (cost=0.42..4.46 rows=1 width=321) (actual time=0.037..0.041 rows=1
> loops=1)
> Index Cond: (tagid = 1000::numeric)
> Filter: ((unitid = 1000::numeric) AND (placement = 0::numeric)
> AND (device_type = 100::numeric))
> Rows Removed by Filter: 7
> Total runtime: 0.282 ms
>
> "good" table: explain analyze update r_agrio_hourly set unconfirmed =
> unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0
> and device_type = 100 and rowdate = '2015-02-23T13';
>
> RESULT:
> Update on r_agrio_hourly (cost=0.42..17.36 rows=6 width=329) (actual
> time=0.102..0.102 rows=0 loops=1)
> -> Index Scan using u_r_agrio_hourly on r_agrio_hourly
> (cost=0.42..17.36 rows=6 width=329) (actual time=0.047..0.048 rows=1
> loops=1)
> Index Cond: ((tagid = 1000::numeric) AND (unitid =
> 1000::numeric) AND ((rowdate)::text = '2015-02-23T13'::text) AND
> (device_type = 100::numeric) AND (placement = 0::numeric))
> Total runtime: 0.135 ms
>
> When I try doing it with a WITH statement (really, to apply the actual
> data that the plpgsql function uses), there is something strange in the
> "bad" table.
>
> explain analyze
> with SRC as (select * from xq_agr where id = 914830)
> update r_agrio_total set
> unconfirmed = unconfirmed +
> (SRC.r_agrio->>'unconfirmed')::numeric(38)
> from SRC
> where
> tagid = (SRC.r_agrio->'key'->>'tagid')::numeric and
> unitid = (SRC.r_agrio->'key'->>'unit')::numeric and
> placement = (SRC.r_agrio->'key'->>'placement')::numeric and
> device_type = (SRC.r_agrio->'key'->>'device_type')::numeric;
>
> RESULT:
> Update on r_agrio_total (cost=8.91..32777.51 rows=19331 width=409)
> (actual time=0.107..0.107 rows=0 loops=1)
> CTE src
> -> Index Scan using xq_agr_pkey on xq_agr (cost=0.42..8.44
> rows=1 width=379) (actual time=0.026..0.027 rows=1 loops=1)
> Index Cond: (id = 914830)
> -> Nested Loop (cost=0.46..32769.07 rows=19331 width=409) (actual
> time=0.107..0.107 rows=0 loops=1)
> -> CTE Scan on src (cost=0.00..0.02 rows=1 width=88) (actual
> time=0.032..0.033 rows=1 loops=1)
> -> Index Scan using u_r_agrio_total on r_agrio_total
> (*cost=0.46..32285.78 rows=19331* width=321) (actual time=0.001..0.001
> rows=0 loops=1)
> Index Cond: ((tagid = (((src.r_agrio -> 'key'::text) ->>
> 'tagid'::text))::numeric) AND (unitid = (((src.r_agrio -> 'key'::text)
> ->> 'unit'::text))::numeric) AND (device_type = (((src.r_agrio ->
> 'key'::text) ->> 'device_type'::text))::numeric) AND (placement =
> (((src.r_agrio -> 'key'::text) ->> 'placement'::text))::numeric))
> Total runtime: 0.155 ms
>
> explain analyze
> with SRC as (select * from xq_agr where id = 914830)
> update r_agrio_hourly set
> unconfirmed = unconfirmed +
> (SRC.r_agrio->>'unconfirmed')::numeric(38)
> from SRC
> where
> tagid = (SRC.r_agrio->'key'->>'tagid')::numeric and
> unitid = (SRC.r_agrio->'key'->>'unit')::numeric and
> placement = (SRC.r_agrio->'key'->>'placement')::numeric and
> device_type = (SRC.r_agrio->'key'->>'device_type')::numeric and
> rowdate = (SRC.r_agrio->'key'->>'rowdate');
>
> RESULT:
> Update on r_agrio_hourly (cost=8.91..52.91 rows=20 width=417) (actual
> time=0.123..0.123 rows=0 loops=1)
> CTE src
> -> Index Scan using xq_agr_pkey on xq_agr (cost=0.42..8.44
> rows=1 width=379) (actual time=0.023..0.024 rows=1 loops=1)
> Index Cond: (id = 914830)
> -> Nested Loop (cost=0.47..44.47 rows=20 width=417) (actual
> time=0.121..0.121 rows=0 loops=1)
> -> CTE Scan on src (cost=0.00..0.02 rows=1 width=88) (actual
> time=0.030..0.031 rows=1 loops=1)
> -> Index Scan using u_r_agrio_hourly on r_agrio_hourly
> (*cost=0.47..43.95 rows=20* width=329) (actual time=0.000..0.000
> rows=0 loops=1)
> Index Cond: ((tagid = (((src.r_agrio -> 'key'::text) ->>
> 'tagid'::text))::numeric) AND (unitid = (((src.r_agrio -> 'key'::text)
> ->> 'unit'::text))::numeric) AND ((rowdate)::text = ((src.r_agrio ->
> 'key'::text) ->> 'rowdate'::text)) AND (device_type = (((src.r_agrio ->
> 'key'::text) ->> 'device_type'::text))::numeric) AND (placement =
> (((src.r_agrio -> 'key'::text) ->> 'placement'::text))::numeric))
> Total runtime: 0.176 ms
>
> I've tried doing VACUUM FULL ANALYZE and rebuilding indexes. Vacuum did
> help somewhat, but the time differential is still huge (that 7 sec vs. 80 ms).
>
> The table structure (some fields thrown out just to save output length):
>
> => \d r_agrio_total
> Column | Type | Modifiers
> -------------------+---------------+------------------------------------------------------------
> id | bigint | not null default
> nextval('r_agrio_total_id_seq'::regclass)
> tagid | numeric(38,0) | not null
> unitid | numeric(38,0) | not null
> device_type | numeric(38,0) | not null
> placement | numeric(38,0) | not null default 0
> unserved | numeric(38,0) | not null default 0
> r_brkconn | json |
> Indexes:
> "r_agrio_total_pkey" PRIMARY KEY, btree (id)
> "u_r_agrio_total" UNIQUE, btree (tagid, unitid, device_type, placement)
> "unit_r_agrio_total" btree (unitid)
> "tag_r_agrio_total" btree (tagid)
>
> => \d r_agrio_hourly
> Column | Type |
> Modifiers
> -------------------+-----------------------+-------------------------------------------------------------
> id | bigint | not null default
> nextval('r_agrio_hourly_id_seq'::regclass)
> tagid | numeric(38,0) | not null
> unitid | numeric(38,0) | not null
> rowdate | character varying(15) | not null
> device_type | numeric(38,0) | not null
> placement | numeric(38,0) | not null default 0
> unserved | numeric(38,0) | not null default 0
> unconfirmed | numeric(38,0) | not null default 0
> r_brkconn | json |
> Indexes:
> "r_agrio_hourly_pkey" PRIMARY KEY, btree (id)
> "u_r_agrio_hourly" UNIQUE, btree (tagid, unitid, rowdate,
> device_type, placement)
> "unit_r_agrio_hourly" btree (unitid)
> "rowdate_r_agrio_hourly" btree (rowdate)
> "tag_r_agrio_hourly" btree (tagid)
>
> Would appreciate any help on this, thank you!

What is the ->> operator? I'm not familiar with it.

numeric is FAR slower than int/bigint. Unless you *really* need 38
digits, use one of the other types.
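For instance, the key columns in the \d output above all look like integer identifiers, so something along these lines would switch them over (a sketch only, not tested against your schema; ALTER TYPE rewrites the table and its indexes, so try it on a copy first):

```sql
-- Hypothetical migration for r_agrio_total; column names are taken from
-- the \d output in the original post.
ALTER TABLE r_agrio_total
    ALTER COLUMN tagid       TYPE bigint,
    ALTER COLUMN unitid      TYPE bigint,
    ALTER COLUMN device_type TYPE bigint,
    ALTER COLUMN placement   TYPE bigint;
```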

Cursors tend to make things slow. Avoid them if you can.
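As a sketch of what a set-based version of your per-record loop might look like (untested, and the json paths are assumed from the WITH example in your post):

```sql
-- One statement instead of a cursor loop over xq_agr.  Note that when
-- several xq_agr rows match the same target row, UPDATE ... FROM applies
-- only one of them, so you may need to aggregate the source rows first.
UPDATE r_agrio_total t
   SET unserved = t.unserved + (x.r_agrio->>'unserved')::numeric
  FROM xq_agr x
 WHERE t.tagid       = (x.r_agrio->'key'->>'tagid')::numeric
   AND t.unitid      = (x.r_agrio->'key'->>'unit')::numeric
   AND t.placement   = (x.r_agrio->'key'->>'placement')::numeric
   AND t.device_type = (x.r_agrio->'key'->>'device_type')::numeric;
```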

As for your specific question, I suggest you modify the plpgsql function
so that it's doing an EXPLAIN ANALYZE on the slow table. EXPLAIN ANALYZE
actually returns a recordset the same way a SELECT would, with a single
column of type text. So you just need to do something with that output.
The easiest thing would be to replace this in your function:

UPDATE slow_table SET ...

to this (untested):

RETURN QUERY EXPLAIN ANALYZE UPDATE slow_table SET ...

and change the function so it returns SETOF text instead of whatever it
returns now.
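Roughly like this (untested; `my_func` and `slow_table` are placeholders, and depending on your PostgreSQL version the EXPLAIN may need to go through EXECUTE rather than a bare RETURN QUERY):

```sql
CREATE OR REPLACE FUNCTION my_func() RETURNS SETOF text AS $$
BEGIN
    -- Each row returned is one line of the EXPLAIN ANALYZE output.
    RETURN QUERY EXECUTE
        'EXPLAIN ANALYZE UPDATE slow_table SET ...';  -- substitute the real statement
END;
$$ LANGUAGE plpgsql;
```

Then `SELECT * FROM my_func();` gives you the plan lines just like a normal EXPLAIN.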
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
