From: | eudald_v <reaven(dot)galaeindael(at)gmail(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Sudden connection and load average spikes with postgresql 9.3 |
Date: | 2015-07-02 15:41:20 |
Message-ID: | 1435851680008-5856298.post@n5.nabble.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Dear Josh,
I'm sorry I didn't write before, but we have been very busy with this issue
and, you know, when something goes wrong, the apocalypse comes with it.
I've been working on everything you suggested.
I used your tables and script and I can give you a sample of it on
locked_query_start
2015-07-02 14:49:45.972129+02 | 15314 | | 4001 |
| "TABLE_Z" | tuple | ExclusiveLock | 24018:24 | relation
| ShareUpdateExclusiveLock | | | YYY.YYY.YYY.YYY/32
| 2015-07-02 14:49:26.635599+02 | 2015-07-02 14:49:26.635599+02 | 2015-07-02
14:49:26.635601+02 | INSERT INTO "TABLE_X" ("field1", "field2", "field3",
"field4", "field5", "field6", "field7") VALUES (22359509, 92, 5, 88713,
'XXX.XXX.XXX.XXX', 199, 10) | | | 2015-07-02
14:11:45.368709+02 | 2015-07-02 14:11:45.368709+02 | active |
2015-07-02 14:11:45.36871+02 | autovacuum: VACUUM ANALYZE public.TABLE_Z
2015-07-02 14:49:45.972129+02 | 15857 | | 4001 |
| "TABLE_Z" | tuple | ExclusiveLock | 24018:24 | relation
| ShareUpdateExclusiveLock | | | YYY.YYY.YYY.YYY/32
| 2015-07-02 14:49:22.79166+02 | 2015-07-02 14:49:22.79166+02 | 2015-07-02
14:49:22.791665+02 | INSERT INTO "TABLE_X" ("field1", "field2", "field3",
"field4", "field5", "field6", "field7") VALUES (14515978, 92, 5, 88713,
'XXX.XXX.XXX.XXX', 199, 10) | | | 2015-07-02
14:11:45.368709+02 | 2015-07-02 14:11:45.368709+02 | active |
2015-07-02 14:11:45.36871+02 | autovacuum: VACUUM ANALYZE public.TABLE_Z
2015-07-02 14:49:45.972129+02 | 15314 | | 14712 |
| "TABLE_Z" | tuple | ExclusiveLock | 24018:24 | relation
| AccessShareLock | | |
1YYY.YYY.YYY.YYY/32 | 2015-07-02 14:49:26.635599+02 | 2015-07-02
14:49:26.635599+02 | 2015-07-02 14:49:26.635601+02 | INSERT INTO "TABLE_X"
("field1", "field2", "field3", "field4", "field5", "field6", "field") VALUES
(22359509, 92, 5, 88713, 'XXX.XXX.XXX.XXX', 199, 10) | |
185.10.253.72/32 | 2015-07-02 14:48:48.841375+02 | 2015-07-02
14:48:48.841375+02 | active | 2015-07-02 14:48:48.841384+02 | INSERT
INTO "TABLE_Y" ("email_id", "sendout_id", "feed_id", "isp_id") VALUES
(46015879, 75471, 419, 0)
All that was recorded during a spike. From this log I have to point
something:
Tables TABLE_X and TABLE_Y have both a TRIGGER that does an INSERT to
TABLE_Z
As you can see, TABLE_Z was being VACUUM ANALYZED. I wonder if TRIGGERS and
VACUUM work well together, just to check another perspective.
We also have carefully looked at our scripts and we have performed some code
optimitzations (like close db connections earlier), but the spikes continue
to happen.
FS is ext4 and I don't know how can I check the transaction log
configuration
This is how IO lookslike before and after any problem happens:
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 357,00 7468,00 8840,00 7468 8840
avg-cpu: %user %nice %system %iowait %steal %idle
5,02 0,00 2,44 0,06 0,00 92,47
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 471,00 7032,00 13760,00 7032 13760
avg-cpu: %user %nice %system %iowait %steal %idle
5,14 0,00 2,92 0,03 0,00 91,92
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 376,00 7192,00 8048,00 7192 8048
avg-cpu: %user %nice %system %iowait %steal %idle
4,77 0,00 2,57 0,03 0,00 92,63
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 304,00 7280,00 8252,00 7280 8252
And this is how it looks like when the spike happens:
http://pastebin.com/2hAYuDZ5
Hope it can help into determining what's happening.
Thanks for all your efforts and collaboration!
Eudald
--
View this message in context: http://postgresql.nabble.com/Sudden-connection-and-load-average-spikes-with-postgresql-9-3-tp5855895p5856298.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
From | Date | Subject | |
---|---|---|---|
Next Message | Graeme B. Bell | 2015-07-02 16:15:52 | Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this? |
Previous Message | Merlin Moncure | 2015-07-02 14:25:12 | Re: New server: SSD/RAID recommendations? |