Re: Modification of data in base folder and very large tables

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Ogden Brash <info(at)litika(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Modification of data in base folder and very large tables
Date: 2019-10-09 18:42:31
Message-ID: CAMkU=1xLS4nMcddSSLZYF8xEA-0AasZTcXYBn+315KrQTP6Rbw@mail.gmail.com
Lists: pgsql-performance

On Wed, Oct 9, 2019 at 4:33 AM Ogden Brash <info(at)litika(dot)com> wrote:

> # lsof -p 6600 | wc -l;
> 840
>
> # lsof -p 6601 | wc -l;
> 906
>
> Is that normal? That there be so many open file pointers? ~900 open file
> pointers for each of the processes?
>

I don't think PostgreSQL makes any effort to conserve file handles until
it starts reaching the max. So any file that has ever been opened will
remain open unless it was somehow invalidated (e.g. the file needs to be
deleted). If those processes were previously loading smaller tables before
they got bogged down in the huge ones, a large number of handles would not
be unexpected.
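
If you want to confirm which relation a given file handle belongs to, you
can map a filenode back through the catalog (27083 is just a filenode from
your listing below; "yourdb" and "your_big_table" are placeholders for your
own names):

# psql -d yourdb -c "SELECT relname FROM pg_class WHERE relfilenode = 27083;"
# psql -d yourdb -c "SELECT pg_relation_filepath('your_big_table');"

The second query goes the other direction, from a table name to its file
path under base/.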

> The next I did was go to see the actual data files, to see how many there
> are. In my case they are in postgresql/11/main/base/24576 and there are
> 2076 files there. That made sense. However, I found that when I list them
> by modification date I see something interesting:
>
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.7
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.8
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.9
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.10
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.11
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.12
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.13
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.14
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.16
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.15
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.17
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.18
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.19
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.21
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.22
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.23
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.24
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.25
> -rw------- 1 postgres postgres 1073741824 Oct 8 13:05 27083.26
> -rw------- 1 postgres postgres 19062784 Oct 8 13:05 27082_fsm
> -rw------- 1 postgres postgres 544489472 Oct 8 13:05 27077.34
> -rw------- 1 postgres postgres 169705472 Oct 8 13:05 27082.72
> -rw------- 1 postgres postgres 978321408 Oct 8 13:05 27083.27
> -rw------- 1 postgres postgres 342925312 Oct 8 13:05 27076.88
>
> If you notice, the file size is capped at 1 GB and as the giant table has
> grown it has added more files in this directory. However, the mysterious
> thing to me is that it keeps modifying those files constantly - even the
> ones that are completely full. So for the two large tables it has been
> restoring all week, the modification time for the ever growing list of
> files is being updating constantly.
>

The bgwriter, the checkpointer, and autovacuum, plus any backends that
decide they need a clean page from the buffer cache, can all touch those
files. They might touch them in ways that are not IO intensive, but still
cause the modification time to get updated. In my hands, once all dirty
buffers for a given file have been flushed and all contents of the file
have been vacuumed, its mtime stops changing just due to the COPY IN, which
is directed at the later files.
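
If you want a rough idea of how much of the write traffic is coming from
the checkpointer and bgwriter versus the backends themselves,
pg_stat_bgwriter breaks that down (again, substitute your own database
name):

# psql -d yourdb -c "SELECT checkpoints_timed, checkpoints_req, buffers_checkpoint, buffers_clean, buffers_backend FROM pg_stat_bgwriter;"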

It is also squishy what it even means to modify a file. I think
filesystems have heuristics to avoid "gratuitous" updates to mtime, which
make it hard to reckon with.
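
You can see the timestamps the filesystem actually tracks with stat; the
path here assumes the Debian-style layout from your message:

# stat /var/lib/postgresql/11/main/base/24576/27083.7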

>
> Could it be that thats why I am seeing a slowdown over the course of the
> week - that for some reason as the number of files for the table has grown,
> the system is spending more and more time seeking around the disk to touch
> all those files for some reason?
>

I don't think lsof or mtime are effective ways to research this. How about
running strace -ttt -T -y on those processes?
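
For example, attached to one of the restore backends (6600 is the PID from
your earlier output; the -e filter is optional and just cuts the noise down
to file IO):

# strace -ttt -T -y -p 6600 -e trace=read,write,lseek,open,openat 2>&1 | head -100

-ttt gives microsecond timestamps, -T shows the time spent inside each
syscall, and -y resolves file descriptors to path names, so you can see
which segment files are actually being hit and how long each call takes.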

Cheers,

Jeff
