PostgreSQL databases require periodic
   maintenance known as vacuuming, and require
   periodic updates to the statistics used by the
   PostgreSQL query planner.  These
   maintenance tasks are performed by the VACUUM and ANALYZE commands
   respectively.  For most installations, it is sufficient to let the
   autovacuum daemon determine when to perform
   these maintenance tasks (which is partly determined by configurable
   table-level thresholds; see Section 25.1).
  
The autovacuum daemon has to process each table on a regular basis for several reasons:
VACUUM
     operation avoid needlessly scanning pages that are already
     frozen
   Maintenance work within the scope of items 1, 2, 3, and 4 is
   performed by the VACUUM command internally.
   Item 5 (maintenance of planner statistics) is handled by the
   ANALYZE command internally.  Although this
   section presents information about autovacuum, there is no
   difference between manually-issued VACUUM and
   ANALYZE commands and those run by the autovacuum
   daemon (though there are autovacuum-specific variants of a small
   number of settings that control VACUUM).
  
Autovacuum creates a substantial amount of I/O traffic, which can cause poor performance for other active sessions. There are configuration parameters that can be adjusted to reduce the performance impact of background vacuuming. See the autovacuum-specific cost delay settings described in Section 20.10, and additional cost delay settings described in Section 20.4.4.
   Some database administrators will want to supplement the daemon's
   activities with manually-managed VACUUM
   commands, which typically are executed according to a schedule by
   cron or Task
    Scheduler scripts.  It can be useful to perform
   off-hours VACUUM commands during periods where
   reduced load is expected.
  
    In PostgreSQL, an
    UPDATE or DELETE of a row does not
    immediately remove the old version of the row.
    This approach is necessary to gain the benefits of multiversion
    concurrency control (MVCC, see Chapter 13): the row version
    must not be deleted while it is still potentially visible to other
    transactions.  A deleted row version (whether from an
    UPDATE or DELETE) will
    usually cease to be of interest to any still running transaction
    shortly after the original deleting transaction commits.
   
    The space dead tuples occupy must eventually be reclaimed for
    reuse by new rows, to avoid unbounded growth of disk space
    requirements.  Reclaiming space from dead rows is
    VACUUM's main responsibility.
   
    The XID cutoff point that VACUUM uses to
    determine whether or not a deleted tuple is safe to physically
    remove is reported under removable cutoff in
    the server log when autovacuum logging (controlled by log_autovacuum_min_duration) reports on a
    VACUUM operation executed by autovacuum.
    Tuples that are not yet safe to remove are counted as
    dead but not yet removable tuples in the log
    report.  VACUUM establishes its
    removable cutoff once, at the start of the
    operation.  Any older snapshot (or transaction that allocates an
    XID) that's still running when the cutoff is established may hold
    it back.
   
     It's important that no long running transactions ever be allowed
     to hold back every VACUUM operation's cutoff
     for an extended period.  You may wish to monitor this.
    
     Tuples inserted by aborted transactions can be removed by
     VACUUM immediately
    
    VACUUM will not return space to the operating
    system, except in the special case where a group of contiguous
    pages at the end of a table become entirely free and an exclusive
    table lock can be easily obtained.  This relation truncation
    behavior can be disabled in tables where the exclusive lock is
    disruptive by setting the table's vacuum_truncate
    storage parameter to off.
   
     If you have a table whose entire contents are deleted on a
     periodic basis, consider doing it with TRUNCATE rather
     than relying on VACUUM.
     TRUNCATE removes the entire contents of the
     table immediately, avoiding the need to set
     xmax to the deleting transaction's XID.
     One disadvantage is that strict MVCC semantics are violated.
    
     VACUUM FULL or CLUSTER can
     be useful when dealing with extreme amounts of dead tuples.  It
     can reclaim more disk space, but runs much more slowly.  It
     rewrites an entire new copy of the table and rebuilds all indexes.
     This typically has much higher overhead than
     VACUUM.  Generally, therefore, administrators
     should avoid using VACUUM FULL except in the
     most extreme cases.
    
     Although VACUUM FULL is technically an option
     of the VACUUM command, VACUUM
      FULL uses a completely different implementation.
     VACUUM FULL is essentially a variant of
     CLUSTER.  (The name VACUUM
      FULL is historical; the original implementation was
     somewhat closer to standard VACUUM.)
    
     TRUNCATE, VACUUM FULL, and
     CLUSTER all require an ACCESS
      EXCLUSIVE lock, which can be highly disruptive
     (SELECT, INSERT,
     UPDATE, and DELETE commands
     will all be blocked).
    
     VACUUM FULL (and CLUSTER)
     temporarily uses extra disk space approximately equal to the size
     of the table, since the old copies of the table and indexes can't
     be released until the new ones are complete.
    
    VACUUM often marks certain pages
    frozen, indicating that all eligible rows on
    the page were inserted by a transaction that committed
    sufficiently far in the past that the effects of the inserting
    transaction are certain to be visible to all current and future
    transactions.  The specific transaction ID number
     (XID) stored in a frozen heap
    row's xmin field is no longer needed to
    determine anything about the row's visibility.
    Furthermore, when a row undergoing freezing happens to have an XID
    set in its xmax field (possibly an XID
    left behind by an earlier SELECT FOR UPDATE row
    locker), the xmax field's XID is
    typically also removed (actually, xmax
    is set to the special XID value 0, also known
    as InvalidTransactionId).  See Section 73.6.1.1 for further background
    information.
   
    Once frozen, heap pages are “self-contained”.  All of
    the page's rows can be read by every transaction, without any
    transaction ever needing to consult externally stored transaction
    status metadata (most notably, transaction commit/abort status
    information from pg_xact won't ever be
    required).
   
    It can be useful for VACUUM to put off some of
    the work of freezing, but freezing cannot be put off indefinitely.
    Since on-disk storage of transaction IDs in heap row headers uses
    a truncated 32-bit representation to save space (rather than the
    full 64-bit representation), freezing plays a crucial role in
    enabling management of the XID
     address space by VACUUM.  If freezing
    by VACUUM is somehow impeded (in a database
    that continues to allocate new transaction IDs), the system will
    eventually refuse to allocate new
     transaction IDs.  This generally only happens in extreme
    cases where the system has been misconfigured.
   
    vacuum_freeze_min_age can be used to control
    when freezing takes place.  When VACUUM scans a
    heap page containing even one XID that already attained an age
    exceeding this value, the page is frozen.
   
    Multixact IDs are used to support row
    locking by multiple transactions.  Since there is only limited
    space in a tuple header to store lock information, that
    information is encoded as a “multiple transaction
     ID”, or multixact ID for short, whenever there is more
    than one transaction concurrently locking a row.  Information
    about which transaction IDs are included in any particular
    multixact ID is stored separately, and only the multixact ID
    appears in the xmax field in the tuple
    header.  Like transaction IDs, multixact IDs are implemented as a
    32-bit counter and corresponding storage.  Since MultiXact IDs are
    stored in the xmax field of heap rows
    (and have an analogous dependency on external transaction status
    information), they may also need to be removed during freezing.
   
    vacuum_multixact_freeze_min_age also
    controls when freezing takes place.   It is analogous to
    vacuum_freeze_min_age, but “age”
    is expressed in units of Multixact ID (not in units of XID).
    vacuum_multixact_freeze_min_age typically has
    only a minimal impact on how many pages are frozen, partly because
    VACUUM usually prefers to remove MultiXact IDs
    proactively based on low-level considerations around the cost of
    freezing.  vacuum_multixact_freeze_min_age
    forces VACUUM to process
    MultiXact IDs in certain rare cases where the implementation would
    not ordinarily do so.
   
    Managing the added WAL volume from freezing
    over time is an important consideration for
    VACUUM.  This is why VACUUM
    doesn't just freeze every eligible tuple at the earliest
    opportunity: the WAL written to freeze a page's
    tuples “goes to waste” in cases where the resulting
    frozen tuples are soon deleted or updated anyway.  It's also why
    VACUUM will freeze all
    eligible tuples from a heap page once the decision to freeze at
    least one tuple is taken: at that point the added cost to freeze
    all eligible tuples eagerly (measured in “extra bytes of
     WAL written”) is far lower than the
    probable cost of deferring freezing until a future
    VACUUM operation against the same table.
    Furthermore, once the page is frozen it can be marked all-frozen
    in the visibility map right away.
   
     In PostgreSQL versions before 16,
     freezing was triggered at the level of individual
     xmin and
     xmax fields.  Freezing only affected
     the exact XIDs that had already attained an age at or exceeding
     vacuum_freeze_min_age, regardless of costs.
    
    VACUUM also triggers freezing of pages in cases
    where it already proved necessary to write out an
    FPI (full page image) alongside a
    WAL record generated while removing dead tuples
    (see Section 30.1 for background information
    about how FPIs provide torn page protection).
    This “freeze on an FPI write”
    mechanism is designed to lower the absolute volume of
    WAL written over time by
    VACUUM, across multiple
    VACUUM operations against the same table.  The
    mechanism often prevents future VACUUM
    operations from having to write a second FPI
    for the same page much later on.  In effect,
    VACUUM writes slightly more
    WAL in the short term with the aim of
    ultimately needing to write much less WAL in
    the long term.
   
    VACUUM may not be able to freeze every tuple's
    xmin in relatively rare cases.  The
    criteria that determines basic eligibility for freezing is exactly
    the same as the one that determines if a deleted tuple should be
    considered removable or merely dead
     but not yet removable (namely, the XID-based
    removable cutoff).  In extreme cases a
    long-running transaction can hold back every
    VACUUM's removable cutoff for so long that the
    system is eventually forced to activate xidStopLimit mode
     protections.
   
VACUUM's aggressive strategy #
     As already noted briefly in the introductory section, freezing
     doesn't just allow queries to avoid lookups of subsidiary
     transaction status information in structures such as
     pg_xact.  Freezing also plays a crucial role
     in enabling management of the XID address space by
     VACUUM.
    
     VACUUM maintains information about the oldest
     unfrozen XID that remains in the table when it uses its
     aggressive strategy.  This information is
     stored in the pg_class system table at
     the end of each aggressive VACUUM: the table
     processed by aggressive VACUUM has its
     pg_class.relfrozenxid
     updated (relfrozenxid
     “advances” by a certain number of XIDs).  Similarly,
     the datfrozenxid column of a
     database's pg_database row is a lower
     bound on the unfrozen XIDs appearing in that database — it
     is just the minimum of the per-table
     relfrozenxid values within the
     database.  The system also maintains
     pg_class.relminmxid and
     pg_database.datminmxid
     fields to track the oldest MultiXact ID, while following
     analogous rules.
    
      When the VACUUM command's
      VERBOSE parameter is specified,
      VACUUM prints various statistics about the
      table.  This includes information about how
      relfrozenxid and
      relminmxid advanced, and the number
      of newly frozen pages.  The same details appear in the server
      log when autovacuum logging (controlled by log_autovacuum_min_duration) reports on a
      VACUUM operation executed by autovacuum.
     
     This process is intended to reliably prevent the entire database
     from ever having a transaction ID that is excessively far in the
     past.  The maximum “distance” that the system can
     tolerate between the oldest unfrozen transaction ID and the next
     (unallocated) transaction ID is about 2.1 billion transaction
     IDs.  That is an upper limit; the greatest
     age(relfrozenxid)/age(datfrozenxid)
     in the system should ideally never exceed a fraction of that
     upper limit.  If that upper limit is ever reached, then the
     system will activate xidStopLimit mode
      protections.  These protections  will remain in force
     until VACUUM (typically autovacuum) manages to
     advance the oldest datfrozenxid in the cluster
     (by advancing that database's oldest relfrozenxid via an
     aggressive VACUUM).
    
     The 2.1 billion XIDs “distance” invariant is a
     consequence of the fact that on-disk storage of transaction IDs
     in heap row headers uses a truncated 32-bit representation to
     save space (rather than the full 64-bit representation).  Since
     all unfrozen transaction IDs from heap tuple headers
     must be from the same transaction ID epoch
     (which is what the invariant actually assures), there isn't any
     need to store a separate epoch field in each tuple header.  The
     downside is that the system depends on freezing (and
     relfrozenxid advancement during
     aggressive VACUUMs) to make sure that the
     “available supply” of transaction IDs never exceeds
     the “demand”.
    
      In practice most tables require periodic aggressive vacuuming.
      However, some individual non-aggressive
      VACUUM operations may be able to advance
      relfrozenxid and/or
      relminmxid.  This is most common in
      small, frequently modified tables, where
      VACUUM happens to scan all pages (or at least
      all pages not marked all-frozen in the visibility map) in the
      course of removing dead tuples.
     
     VACUUM/autovacuum also use vacuum_multixact_freeze_table_age and autovacuum_multixact_freeze_max_age settings as
     independent Multixact ID orientated controls for aggressive mode
     VACUUM and anti-wraparound autovacuum.
     These work analogously to the XID-based
     vacuum_freeze_table_age and
     autovacuum_freeze_max_age, respectively.
     Note, however, that if the multixacts members storage
      area exceeds 2GB, then the effective value of
     autovacuum_multixact_freeze_max_age will be
     lower, resulting in more frequent aggressive mode VACUUMs.
    
     There is only one major runtime behavioral differences between
     aggressive mode VACUUM and non-aggressive
     (standard) VACUUM.  Both kinds of
     VACUUM use the visibility map to determine which
     pages of a table must be scanned, and which can be skipped.
     However, only non-aggressive VACUUM will skip
     pages that don't have any dead row versions even if those pages
     might still have row versions with old XID values; aggressive
     VACUUMs are limited to skipping pages already
     marked all-frozen (and all-visible).
    
     As a consequence of all this, non-aggressive
     VACUUMs usually won't freeze
     every page with an old row version in the
     table.  Most individual tables will eventually need an aggressive
     VACUUM, which will reliably freeze all pages
     with XID and MXID values older than
     vacuum_freeze_min_age, including those from
     all-visible but not all-frozen pages (and then update
     pg_class).  vacuum_freeze_table_age controls when
     VACUUM must use its aggressive strategy.
     Since the setting is applied against
     age(relfrozenxid), settings like
     vacuum_freeze_min_age may influence the exact
     cadence of aggressive vacuuming.  Setting
     vacuum_freeze_table_age to 0 forces
     VACUUM to always use its aggressive strategy.
    
      Aggressive VACUUMs apply the same rules for
      freezing as non-aggressive VACUUMs.  You may
      nevertheless notice that aggressive VACUUMs
      perform a disproportionately large amount of the total required
      freezing in larger tables.
     
      This is an indirect consequence of the fact that non-aggressive
      VACUUMs won't scan pages that are marked
      all-visible but not also marked all-frozen in the visibility
      map.  VACUUM can only consider freezing those
      pages that it actually gets to scan.
     
      Note in particular that vacuum_freeze_min_age
      isn't very likely to trigger freezing in non-aggressive
      VACUUMs, at least with default settings.  The
      “freeze on an FPI write”
      mechanism is somewhat more likely to trigger in non-aggressive
      VACUUMs in practice, though.  Much depends on
      workload characteristics.
     
     To ensure that every table has its
     relfrozenxid advanced at somewhat
     regular intervals, including totally static tables, autovacuum is
     invoked on any table that might contain unfrozen rows with XIDs
     older than the age specified by the configuration parameter autovacuum_freeze_max_age.  This will happen
     even if autovacuum is disabled.
    
     
     In practice all anti-wraparound autovacuums will use
     VACUUM's aggressive strategy.  This is assured
     because the effective value of
     vacuum_freeze_table_age is
     “clamped” to a value no greater than 95% of the
     current value of autovacuum_freeze_max_age.
     As a rule of thumb, vacuum_freeze_table_age
     should be set to a value somewhat below
     autovacuum_freeze_max_age, leaving enough gap
     so that a regularly scheduled VACUUM or an
     autovacuum triggered by inserts, updates and deletes is run in
     that window.  Anti-wraparound autovacuums can be avoided
     altogether in tables that reliably receive
     some VACUUMs that use the
     aggressive strategy.
    
     A convenient way to examine information about
     relfrozenxid and
     relminmxid is to execute queries such as:
SELECT c.oid::regclass as table_name,
greatest(age(c.relfrozenxid),
         age(t.relfrozenxid)) as xid_age,
mxid_age(c.relminmxid)
FROM pg_class c
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
WHERE c.relkind IN ('r', 'm');
SELECT datname,
age(datfrozenxid) as xid_age,
mxid_age(datminmxid)
FROM pg_database;
     The age column measures the number of
     transactions from the cutoff XID to the next unallocated
     transactions ID.  The mxid_age column
     measures the number of MultiXactIds from the cutoff MultiXactId
     to the next unallocated multixact ID.
    
xidStopLimit mode #
     If for some reason autovacuum utterly fails to advance any
     table's relfrozenxid or
     relminmxid for an extended period, and
     if XIDs and/or MultiXactIds continue to be allocated, the system
     will begin to emit warning messages like this when the database's
     oldest XIDs reach forty million transactions from the wraparound
     point:
WARNING: database "mydb" must be vacuumed within 39985967 transactions HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
     (A manual VACUUM should fix the problem, as suggested by the
     hint; but note that the VACUUM must be performed by a
     superuser, else it will fail to process system catalogs and thus not
     be able to advance the database's datfrozenxid.)
     If these warnings are ignored, the system will eventually refuse
     to start any new transactions.  This happens at the point that
     there are fewer than three million transactions left:
ERROR: database is not accepting commands to avoid wraparound data loss in database "mydb" HINT: Stop the postmaster and vacuum that database in single-user mode.
     The three-million-transaction safety margin exists to let the
     administrator recover without data loss, by manually executing the
     required VACUUM commands.  However, since the system will not
     execute commands once it has gone into the safety shutdown mode,
     the only way to do this is to stop the server and start the server in single-user
     mode to execute VACUUM.  The shutdown mode is not enforced
     in single-user mode.  See the postgres reference
     page for details about using single-user mode.
    
     Anything that influences when and how
     relfrozenxid and
     relminmxid advance will also directly
     affect the high watermark storage overhead from storing a great
     deal of historical transaction status information.  The
     additional space
      overhead is usually of fairly minimal concern.  It is
     noted as an additional downside of allowing the system to get
     close to xidStopLimit for the sake of
     completeness.
    
The term “wraparound” is inaccurate. Also, there is no “data loss” here — the message is simply wrong.
XXX: We really need to fix the situation with single user mode to put things on a good footing.
     In emergencies, VACUUM will take extraordinary
     measures to avoid xidStopLimit mode.  A
     failsafe mechanism is triggered when thresholds controlled by
     vacuum_failsafe_age and vacuum_multixact_failsafe_age are reached.  The
     failsafe prioritizes advancing
     relfrozenxid and/or
     relminmxid as quickly as possible.
     Once the failsafe triggers, VACUUM bypasses
     all remaining non-essential maintenance tasks, and stops applying
     any cost-based delay that was in effect.  Any Buffer Access
     Strategy in use will also be disabled.
    
    Vacuum maintains a visibility
     map for each table to keep track of which pages contain
    only tuples that are known to be visible to all active
    transactions (and all future transactions, until the page is again
    modified).  This has two purposes.  First, vacuum itself can skip
    such pages on the next run, since there is nothing to clean up.
    Even VACUUMs that use the aggressive strategy can skip
    pages that are both all-visible and all-frozen (the visibility map
    keeps track of which pages are all-frozen separately).
   
Second, it allows PostgreSQL to answer some queries using only the index, without reference to the underlying table. Since PostgreSQL indexes don't contain tuple visibility information, a normal index scan fetches the heap tuple for each matching index entry, to check whether it should be seen by the current transaction. An index-only scan, on the other hand, checks the visibility map first. If it's known that all tuples on the page are visible, the heap fetch can be skipped. This is most useful on large data sets where the visibility map can prevent disk accesses. The visibility map is vastly smaller than the heap, so it can easily be cached even when the heap is very large.
    As noted in Section 25.2.2.2, anything that
    influences when and how relfrozenxid and
    relminmxid advance will also directly
    affect the high watermark storage overhead needed to store
    historical transaction status information.  For example,
    increasing autovacuum_freeze_max_age (and
    vacuum_freeze_table_age along with it) will
    make the pg_xact and
    pg_commit_ts subdirectories of the database
    cluster take more space, because they store the commit status and
    (if track_commit_timestamp is enabled)
    timestamp of all transactions back to the
    datfrozenxid horizon (the earliest
    datfrozenxid in the entire cluster).
   
    The commit status uses two bits per transaction.  The default
    autovacuum_freeze_max_age setting of 200
    million transactions translates to about 50MB of
    pg_xact storage.  When
    track_commit_timestamp is enabled, about 2GB of
    pg_commit_ts storage will also be required.
   
    MultiXactId status information is implemented as two separate
    SLRU storage areas:
    pg_multixact/members, and
    pg_multixact/offsets.  There is no simple
    formula to determine the storage overhead per MultiXactId, since
    MultiXactIds have a variable number of member XIDs.
   
    Truncating of transaction status information is only possible at
    the end of VACUUMs that advance
    relfrozenxid (in the case of
    pg_xact and
    pg_commit_ts) or
    relminmxid (in the case of
    (pg_multixact/members and
    pg_multixact/offsets) of whatever table
    happened to have the oldest value in the cluster when the
    VACUUM began.  This typically happens very
    infrequently, often during aggressive strategy
    VACUUMs of one of the database's largest
    tables.
   
    The PostgreSQL query planner relies on
    statistical information about the contents of tables in order to
    generate good plans for queries.  These statistics are gathered by
    the ANALYZE command,
    which can be invoked by itself or
    as an optional step in VACUUM.  It is important to have
    reasonably accurate statistics, otherwise poor choices of plans might
    degrade database performance.
   
    The autovacuum daemon, if enabled, will automatically issue
    ANALYZE commands whenever the content of a table has
    changed sufficiently.  However, administrators might prefer to rely
    on manually-scheduled ANALYZE operations, particularly
    if it is known that update activity on a table will not affect the
    statistics of “interesting” columns.  The daemon schedules
    ANALYZE strictly as a function of the number of rows
    inserted or updated; it has no knowledge of whether that will lead
    to meaningful statistical changes.
   
    Tuples changed in partitions and inheritance children do not trigger
    analyze on the parent table.  If the parent table is empty or rarely
    changed, it may never be processed by autovacuum, and the statistics for
    the inheritance tree as a whole won't be collected. It is necessary to
    run ANALYZE on the parent table manually in order to
    keep the statistics up to date.
   
    As with vacuuming for space recovery, frequent updates of statistics
    are more useful for heavily-updated tables than for seldom-updated
    ones. But even for a heavily-updated table, there might be no need for
    statistics updates if the statistical distribution of the data is
    not changing much. A simple rule of thumb is to think about how much
    the minimum and maximum values of the columns in the table change.
    For example, a timestamp column that contains the time
    of row update will have a constantly-increasing maximum value as
    rows are added and updated; such a column will probably need more
    frequent statistics updates than, say, a column containing URLs for
    pages accessed on a website. The URL column might receive changes just
    as often, but the statistical distribution of its values probably
    changes relatively slowly.
   
    It is possible to run ANALYZE on specific tables and even
    just specific columns of a table, so the flexibility exists to update some
    statistics more frequently than others if your application requires it.
    In practice, however, it is usually best to just analyze the entire
    database, because it is a fast operation.  ANALYZE uses a
    statistically random sampling of the rows of a table rather than reading
    every single row.
   
     Although per-column tweaking of ANALYZE frequency might not be
     very productive, you might find it worthwhile to do per-column
     adjustment of the level of detail of the statistics collected by
     ANALYZE.  Columns that are heavily used in WHERE
     clauses and have highly irregular data distributions might require a
     finer-grain data histogram than other columns.  See ALTER TABLE
      SET STATISTICS, or change the database-wide default using the default_statistics_target configuration parameter.
    
Also, by default there is limited information available about the selectivity of functions. However, if you create a statistics object or an expression index that uses a function call, useful statistics will be gathered about the function, which can greatly improve query plans that use the expression index.
     The autovacuum daemon does not issue ANALYZE commands for
     foreign tables, since it has no means of determining how often that
     might be useful.  If your queries require statistics on foreign tables
     for proper planning, it's a good idea to run manually-managed
     ANALYZE commands on those tables on a suitable schedule.
    
     The autovacuum daemon does not issue ANALYZE commands
     for partitioned tables.  Inheritance parents will only be analyzed if the
     parent itself is changed - changes to child tables do not trigger
     autoanalyze on the parent table.  If your queries require statistics on
     parent tables for proper planning, it is necessary to periodically run
     a manual ANALYZE on those tables to keep the statistics
     up to date.