Re: Why are data files getting modified?

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Abhishek Bhola <abhishek(dot)bhola(at)japannext(dot)co(dot)jp>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Why are data files getting modified?
Date: 2022-02-09 15:14:05
Message-ID: a256d775a579976d7895248fed6f5ca101a4d4a4.camel@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, 2022-02-09 at 17:29 +0900, Abhishek Bhola wrote:
> I have a table with monthly partitions, starting from 2011.
> Each partition has about 2-3 billion rows and is about 40TB of data in total.
> I am running a select query to \copy the data into csv files for each month.
>
> I see there is an AUTOVACUUM job started on all the partitions that have been
> read (on which select has run). I am assuming the reason why Postgres is
> autovacuuming these large partitions is this:
>
> db=# SELECT c.relname as table_name, c.relkind as type, age(c.relfrozenxid) as age,
> c.relfrozenxid FROM pg_class AS c WHERE age(c.relfrozenxid) <> 2147483647 ORDER BY 3 DESC LIMIT 20;
>         table_name         | type |    age    | relfrozenxid
> ---------------------------+------+-----------+--------------
>  hist_omx_orderaudit201506 | r    | 201463508 |       319891
>  hist_omx_orderaudit201509 | r    | 201418497 |       364902
>  hist_omx_orderaudit201205 | r    | 201142042 |       641357
>
> autovacuum_freeze_max_age
> ---------------------------
>  200000000
> (1 row)
>
> Am I wrong in my understanding why this autovacuum is getting triggered?
>
> However, my main question is that why do the data files in the tablespace, corresponding
> to these partition tables (that have been read or selected recently), or the ones
> autovacuumed recently have a latest timestamp?
>
> There was no insert or update on these partitions, so why do I see an updated timestamp?
> Does autovacuum change something on the datafiles?

Yes, autovacuum will freeze old rows in the table, which modifies the data files.

To keep this to a minimum, you can run VACUUM (FREEZE) on a partition as soon
as you know that its data won't get modified any more.
Still, I wouldn't be sure that the file modification timestamp won't change
any more after that. But that should be pretty irrelevant; at any rate, no
substantial data modifications will occur after that.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Abhishek Bhola 2022-02-09 15:29:38 Re: Why are data files getting modified?
Previous Message Tom Lane 2022-02-09 14:38:43 Re: Compile 14.1 in EL5.8