Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.

From: David Gould <daveg(at)sonic(dot)net>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Bugs <pgsql-bugs(at)postgresql(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: BUG #13750: Autovacuum slows down with large numbers of tables. More workers makes it slower.
Date: 2016-03-02 11:45:37
Message-ID: 20160302034537.0b1c2da7@engels
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, 29 Feb 2016 18:33:50 -0300
Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

> Hi David, did you ever post an updated version of this patch?

No. Let me fix that now. I've attached my current revision of the patch
based on master. This version is significantly better than the original
version and resolves multiple issues:

- autovacuum workers no longer race each other
- autovacuum workers do not revacuum each others tables
- autovacuums workers no longer thrash the stats collector which saves a
lot of IO when the stats is large.

Hopefully the earlier discussion and the comments in the patch are
sufficient, but please ask questions if it is not clear.

The result is that on a freshly created 40,000 table database with tiny
tables that all need an initial analyze the unpatched postgres saturates
an SSD updating the stats and manages to process less than tables per
minute. With the attached patch it processes several thousand tables per
minute.

The following is a summary of strace output for the autovacuum workers
and the stats collector while the 40k table test is running. The counts and
times are the cost per table.

postgresql 9.5: 85 tables per minute.

Operations per Table
calls msec system call [ 4 autovacuum workers ]
------ ------ -------------------
19.46 196.09 select(0, <<< Waiting for stats snapshot
3.26 1040.46 semop(43188238, <<< Waiting for AutovacuumScheduleLock
2.05 0.83 sendto(8, <<< Asking for stats snapshot

calls msec system call [ stats collector ]
------ ------ -------------------
3.02 0.05 recvfrom(8, <<< Request for snapshot refresh
1.55 248.64 rename("pg_stat_tmp/db_16385.tmp", <<< Snapshot refresh

+ autovacuum contention patch: 5225 tables per minute

Operations per Table
calls msec system call [ 4 autovacuum workers ]
------ ------ -------------------
0.63 6.34 select(0, <<< Waiting for stats snapshot
0.21 0.01 sendto(8, <<< Asking for stats snapshot
0.07 0.00 semop(43712518, <<< Waiting for AutovacuumLock

calls msec system call [ stats collector ]
------ ------ -------------------
0.40 0.01 recvfrom(8, <<< Request for snapshot refresh
0.04 6.75 rename("pg_stat_tmp/db_16385.tmp", <<< Snapshot refresh

Regards,

-dg

--
David Gould daveg(at)sonic(dot)net
If simplicity worked, the world would be overrun with insects.

Attachment Content-Type Size
autovacuum_contention_0301.patch text/x-patch 10.1 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message David Gould 2016-03-02 12:19:54 Re: Prepared statements
Previous Message John R Pierce 2016-03-02 08:39:28 Re: could not migrate 8.0.13 database with large object data to 9.5.1