Brian Fehrle <brianf(at)consistentstate(dot)com> wrote:
> I have a very active database that generates between 200 and 400
> WAL files that are archived for backups. Yesterday, the average
> number of WAL files being generated skyrocketed.
>
> We took a look at some WAL files using the strings command, and
> notice quite a bit of data from a specific table with timestamps
> of last year. This table does some basic tracking, where a row is
> inserted, then it's updated, and after that it's never modified
> again. So we're having data appear in our WAL files from rows that
> haven't (or should not have been) modified at all in over a year.
>
> We figured that the rows were in fact being updated, so we turned
> log_min_duration_statement to 0 for a period of time and monitored
> all queries going through, however we never found a single query
> that could possibly be updating these pieces of data.
> Can server processes such as autovacuum processes cause this
> behavior?
Yes, to prevent transaction ID wrap-around it must FREEZE each tuple
at some point, which is a WAL-logged operation. If many rows were
loaded at about the same time, and they were not subsequently
updated or deleted, you can see quite a burst of activity when
things hit this point. If you're using rsync for your base backups,
expect a lot more data to need to be written, too, because tuples
which have been stable for the last year have now changed.
-Kevin