Re: Postgres becoming slow, only full vacuum fixes it

From: Kiriakos Tsourapas <ktsour(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres becoming slow, only full vacuum fixes it
Date: 2012-09-24 11:55:01
Message-ID: 71DC95CE-16EF-4D53-9971-C573DCD30622@gmail.com
Lists: pgsql-performance

Sorry, I forgot to mention the most obvious and important piece of information :
my Postgres version is 8.4.2

On Sep 24, 2012, at 13:33, Kiriakos Tsourapas wrote:

> Hi,
>
> The problem : Postgres becomes slower day after day, and only a full vacuum fixes it.
>
> Information you may need to evaluate :
>
> The problem affects all tables and queries, as far as I can tell, but we can focus on a single table for easier comprehension.
>
> The queries I am running to test the speed are :
> INSERT INTO "AWAITINGSTATUSSMPP" VALUES('143428', '1111', 1, '2012-06-16 13:39:19', '111');
> DELETE FROM "AWAITINGSTATUSSMPP" WHERE "SMSCMSGID" = '1111' AND "CONNECTIONID" = 1;
> SELECT * FROM "AWAITINGSTATUSSMPP" WHERE "SMSCMSGID" = '1111' AND "CONNECTIONID" = 1;
>
> After a full vacuum, they run in about 100ms.
> Today, before the full vacuum, they were taking around 500ms.
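>
> For reference, a minimal way to reproduce these timings is plain psql with \timing; this is only a sketch, not the actual application code :
>
> \timing
> INSERT INTO "AWAITINGSTATUSSMPP" VALUES ('143428', '1111', 1, '2012-06-16 13:39:19', '111');
> DELETE FROM "AWAITINGSTATUSSMPP" WHERE "SMSCMSGID" = '1111' AND "CONNECTIONID" = 1;
> -- psql prints "Time: ... ms" after each statement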
>
> Below is an EXPLAIN ANALYZE of the commands AFTER a full vacuum. I did not run it beforehand, so I cannot post the corresponding output from before the vacuum. So, after the full vacuum :
>
> explain analyze INSERT INTO "AWAITINGSTATUSSMPP" VALUES('143428', '1111', 1, '2012-06-16 13:39:19', '111');
> "Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)"
> "Trigger for constraint FK_AWAITINGSTATUSSMPP_MESSAGES: time=0.131 calls=1"
> "Trigger bucardo_add_delta: time=0.454 calls=1"
> "Trigger bucardo_triggerkick_MassSMs: time=0.032 calls=1"
> "Total runtime: 0.818 ms"
>
> explain analyze DELETE FROM "AWAITINGSTATUSSMPP" WHERE "SMSCMSGID" = '1111' AND "CONNECTIONID" = 1;
> "Seq Scan on "AWAITINGSTATUSSMPP" (cost=0.00..2.29 rows=1 width=6) (actual time=0.035..0.035 rows=0 loops=1)"
> " Filter: ((("SMSCMSGID")::text = '1111'::text) AND ("CONNECTIONID" = 1))"
> "Trigger bucardo_triggerkick_MassSMs: time=0.066 calls=1"
> "Total runtime: 0.146 ms"
>
> explain analyze SELECT * FROM "AWAITINGSTATUSSMPP" WHERE "SMSCMSGID" = '1111' AND "CONNECTIONID" = 1;
> "Seq Scan on "AWAITINGSTATUSSMPP" (cost=0.00..2.29 rows=1 width=557) (actual time=0.028..0.028 rows=0 loops=1)"
> " Filter: ((("SMSCMSGID")::text = '1111'::text) AND ("CONNECTIONID" = 1))"
> "Total runtime: 0.053 ms"
>
> Below are the metadata of the table :
> =====================================
> CREATE TABLE "AWAITINGSTATUSSMPP"
> (
> "MESSAGEID" bigint NOT NULL,
> "SMSCMSGID" character varying(50) NOT NULL,
> "CONNECTIONID" smallint NOT NULL,
> "EXPIRE_TIME" timestamp without time zone NOT NULL,
> "RECIPIENT" character varying(20) NOT NULL,
> "CLIENT_MSG_ID" character varying(255),
> CONSTRAINT "PK_AWAITINGSTATUSSMPP" PRIMARY KEY ("SMSCMSGID", "CONNECTIONID"),
> CONSTRAINT "FK_AWAITINGSTATUSSMPP_MESSAGES" FOREIGN KEY ("MESSAGEID")
> REFERENCES "MESSAGES" ("ID") MATCH SIMPLE
> ON UPDATE NO ACTION ON DELETE CASCADE
> )
> WITH (
> OIDS=FALSE
> );
> ALTER TABLE "AWAITINGSTATUSSMPP" OWNER TO postgres;
> GRANT ALL ON TABLE "AWAITINGSTATUSSMPP" TO "MassSMsUsers";
>
> CREATE INDEX "IX_AWAITINGSTATUSSMPP_MSGID_RCP"
> ON "AWAITINGSTATUSSMPP"
> USING btree
> ("MESSAGEID", "RECIPIENT");
>
> CREATE TRIGGER bucardo_add_delta
> AFTER INSERT OR UPDATE OR DELETE
> ON "AWAITINGSTATUSSMPP"
> FOR EACH ROW
> EXECUTE PROCEDURE bucardo."bucardo_add_delta_SMSCMSGID|CONNECTIONID"();
>
> CREATE TRIGGER "bucardo_triggerkick_MassSMs"
> AFTER INSERT OR UPDATE OR DELETE OR TRUNCATE
> ON "AWAITINGSTATUSSMPP"
> FOR EACH STATEMENT
> EXECUTE PROCEDURE bucardo."bucardo_triggerkick_MassSMs"();
> =====================================
>
> The table only has about 200 records because it is being used as temporary storage; records are constantly inserted and deleted.
> BUT please don't get hung up on this fact because, as I already said, the speed problem is not restricted to this table. The same problem appears with the following query :
> UPDATE "MESSAGES" SET "SENT" = "SENT" + 1 WHERE "ID" = 143447;
> and the MESSAGES table has mostly inserts and few deletes...
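>
> When the slowdown comes back I can also check whether these tables are physically growing despite the small row counts; a generic size check (sketch only) :
>
> SELECT pg_size_pretty(pg_relation_size('"AWAITINGSTATUSSMPP"')) AS heap_size,
>        pg_size_pretty(pg_total_relation_size('"AWAITINGSTATUSSMPP"')) AS total_with_indexes;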
>
> My postgresql.conf file :
> ======================
> port = 5433 # (change requires restart)
> max_connections = 100 # (change requires restart)
> shared_buffers = 256MB # min 128kB. DoubleIP - Default was 32MB
> synchronous_commit = off # immediate fsync at commit. DoubleIP - Default was on
> effective_cache_size = 512MB # DoubleIP - Default was 128MB
> log_destination = 'stderr' # Valid values are combinations of
> logging_collector = on # Enable capturing of stderr and csvlog
> silent_mode = on # Run server silently.
> log_line_prefix = '%t %d %u ' # special values:
> log_autovacuum_min_duration = 0 # -1 disables, 0 logs all actions and
> autovacuum_naptime = 28800 # time between autovacuum runs. DoubleIP - default was 1min
> autovacuum_vacuum_threshold = 100 # min number of row updates before
> autovacuum_vacuum_scale_factor = 0.0 # fraction of table size before vacuum. DoubleIP - default was 0.2
> datestyle = 'iso, mdy'
> lc_messages = 'en_US.UTF-8' # locale for system error message
> lc_monetary = 'en_US.UTF-8' # locale for monetary formatting
> lc_numeric = 'en_US.UTF-8' # locale for number formatting
> lc_time = 'en_US.UTF-8' # locale for time formatting
> default_text_search_config = 'pg_catalog.english'
> =======================
>
> As you will see, I have altered the shared_buffers and synchronous_commit values.
> shared_buffers had the default value of 32MB. When I changed it to 256MB the problem still appears, but it takes longer to show up (3-4 days). With 32MB it appeared sooner, probably within 24 hours.
> I have also changed the autovacuum daemon to run every 8 hours, and adjusted its thresholds to make sure it vacuums pretty much every table (any table in which at least 100 rows have changed).
> Please note, though, that the problem existed even before I played around with autovacuum; that is why I tried changing its values in the first place.
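>
> For completeness, the autovacuum settings that are actually in effect can be double-checked from inside the database with a generic query like this :
>
> SELECT name, setting, unit
> FROM pg_settings
> WHERE name LIKE 'autovacuum%'
> ORDER BY name;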
>
> The server is synchronized with another server using Bucardo; the Bucardo process runs on the other server.
> The same problem appears on the second server too... after 3-4 days, Postgres is running slower and slower.
>
> Our server configuration :
> DELL PowerEdge T610 Tower Chassis for Up to 8x 3.5" HDDs
> 2x Intel Xeon E5520 Processor (2.26GHz, 8M Cache, 5.86 GT/s QPI, Turbo, HT), 1066MHz Max Memory
> 8GB Memory,1333MHz
> 2 x 146GB SAS 15k 3.5" HD Hot Plug
> 6 x 1TB SATA 7.2k 3.5" Additional HD Hot Plug
> PERC 6/i RAID Controller Card 256MB PCIe, 2x4 Connectors
> SUSE Linux Enterprise Server 10, SP2
>
> The 2 HDs are set up with RAID-1
> The 6 HDs are set up with RAID-5
>
> Linux is running on the RAID-1 configuration
> Postgres is running on the RAID-5 configuration
>
>
> Finally, a top snapshot before and after the full vacuum :
> top - 11:27:44 up 72 days, 13:27, 37 users, load average: 1.05, 1.31, 1.45
> Tasks: 279 total, 3 running, 276 sleeping, 0 stopped, 0 zombie
> Cpu(s): 3.6%us, 0.8%sy, 0.0%ni, 95.5%id, 0.0%wa, 0.0%hi, 0.1%si, 0.0%st
> Mem: 8166432k total, 7963116k used, 203316k free, 115344k buffers
> Swap: 2097144k total, 2097020k used, 124k free, 2337636k cached
>
> top - 11:30:58 up 72 days, 13:31, 38 users, load average: 1.53, 1.59, 1.53
> Tasks: 267 total, 2 running, 265 sleeping, 0 stopped, 0 zombie
> Cpu(s): 1.3%us, 0.4%sy, 0.0%ni, 98.0%id, 0.3%wa, 0.0%hi, 0.1%si, 0.0%st
> Mem: 8166432k total, 6016268k used, 2150164k free, 61092k buffers
> Swap: 2097144k total, 2010204k used, 86940k free, 2262896k cached
>
>
> I hope I have provided enough info and that someone can point me in the right direction.
>
>
> Thank you very much for even reading this far!
>
> Best regards,
> Kiriakos
