Re: Disk filled-up issue after a lot of inserts and drop schema

From: Rick Otten <rottenwindfish(at)gmail(dot)com>
To: Pietro Pugni <pietro(dot)pugni(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Disk filled-up issue after a lot of inserts and drop schema
Date: 2016-09-14 14:17:41
Message-ID: CAMAYy4K6nwkqqdRcWR4cNeYvCXAH7h4WAo6vq1AkX3+MYOezsw@mail.gmail.com
Lists: pgsql-performance

In Unix/Linux with many of the common file system types, if you delete a
file while a process still has it open, the file will continue to "own" its
disk space until that process closes the file descriptor or dies. If you run
"ls" or other file system commands, you won't see the file there, yet it is
still there and still holds on to its portion of the disk. The file has been
"unlinked", but its data blocks remain reserved.

Like 'ls', the 'du' command only looks at files that still exist and adds
up the disk space for those files. It knows nothing about files that have
been unlinked but still reserve a large portion of the disk.
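
For what it's worth, the effect is easy to reproduce with a throwaway file
(a rough sketch; the path and size here are just placeholders):

$ dd if=/dev/zero of=/tmp/bigfile bs=1M count=1024   # create a 1GB file
$ tail -f /tmp/bigfile > /dev/null &                 # keep a descriptor open
$ rm /tmp/bigfile                                    # unlink it
$ df -h /tmp      # the 1GB still shows up as used space
$ du -sh /tmp     # du no longer counts the file
$ kill %1         # close the descriptor and df drops back down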

I don't know why some process still has an open file descriptor on files
you believe have been removed, but at least that explains the discrepancy
you are seeing between "du" and the space actually available on the disk.
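
If it happens again before you restart anything, 'lsof' can usually point
at the culprit. Something along these lines should list files that have
been unlinked but are still held open, along with the process holding them
(a sketch, not tested on your system):

$ lsof +L1                                          # open files with link count 0
$ ls -l /proc/*/fd 2>/dev/null | grep '(deleted)'   # same idea via /proc

That should tell you which process is still sitting on the space.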

On Wed, Sep 14, 2016 at 9:53 AM, Pietro Pugni <pietro(dot)pugni(at)gmail(dot)com>
wrote:

> I’m running PostgreSQL 9.5.4 on a virtual machine for production purposes.
> It runs Ubuntu 16.04.1 LTS 64bit, 32GB RAM, 461GB disk space and 4 x
> logical CPUs.
>
> Postgres executes the following activities:
> - many INSERTS for ETL
> - a lot of read and write operations for the main OLTP application
>
> The ETL job is still under development, so I’m launching several
> sequential “tries” in order to get the whole thing working. The ETL
> procedure consists of a lot of inserts packed inside transactions. At the
> moment each transaction consists of 100k inserts, so for a 90mln-row table
> I get 90mln inserts packed into 900 transactions. I know it’s not the best,
> but the JDBC driver combined with Pentaho doesn’t seem to pack more inserts
> into one statement, so I get a lot of overhead. I can see INSERT, BIND and
> PARSE called for each insert... I think it’s Pentaho which embeds each
> INSERT in a parametric query... I hate Pentaho... anyway...
>
> The ETL procedure does the following:
> 1) DROP SCHEMA IF EXISTS data_schema CASCADE;
> 2) creates the “data_schema” schema and populates it with tables and rows
> using INSERTs as described before;
> 3) if an error occurs, drops the schema again
>
> I’m repeating the previous steps many times because of some Pentaho errors
> which the team is still working through. This stresses the WAL, because
> interrupting the process aborts the current transaction and is followed by
> a DROP SCHEMA .. CASCADE.
>
> *A few days after we began debugging the ETL process, the disk filled up
> and the last ETL job was automatically aborted*. Note that the DB data
> directory is located on the same root disk at
> /var/lib/postgresql/9.5/main
>
> What shocked me was that the *data directory of Postgres was just 815MB*
> in size ($ du -h /var/lib/postgresql/9.5/main ) and pg_xlog was 705MB, *but
> the entire disk was full* ("df -h" returned a disk usage of 100%).
>
> I looked for any postgres activity and only noticed a checkpoint writer
> process that was writing at low speeds (IO usage was about 5%).
> Also, "SELECT * FROM pg_stat_activity" returned nothing and the most
> shocking part was that the "du -h /“ command returned 56GB as the total
> size of files stored on the whole disk!!! The same was for “du -ha /“,
> which returns the apparent size.
>
> The total disk size is 461GB, *so how is it possible that “df -h”
> reported 461GB occupied while “du -h /” returned just 56GB?*
>
> After executing:
> $ service postgresql stop
> $ service postgresql start
>
> *the disk was freed and “df -h” returned a usage of just 16%!*
>
> The other questions are:
> - *how can I prevent the disk from filling up? I’m using the default
> configuration for the WAL (1GB max size).*
> - *how can I tune Postgres to speed up the INSERTs?*
>
> The *actual configuration* is the following:
> listen_addresses = 'localhost'
> max_connections = 32
> shared_buffers = 16GB
> work_mem = 128MB
> maintenance_work_mem = 512MB
> effective_io_concurrency = 10
> checkpoint_completion_target = 0.9
> cpu_tuple_cost = 0.02
> cpu_index_tuple_cost = 0.01
> cpu_operator_cost = 0.005
> effective_cache_size = 24GB
> default_statistics_target = 1000
>
> *Could it be that some of these parameters cause this strange behavior?
> checkpoint_completion_target?*
>
> Thanks to everyone for the support.
>
> Best regards,
> Pietro Pugni
>
>
>
>
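
As an aside, on the per-row INSERT/BIND/PARSE overhead you mention: if you
can get Pentaho to hand the server multi-row statements (or use COPY), most
of that overhead goes away. A multi-row insert looks like this (the table
and column names here are just placeholders):

INSERT INTO data_schema.my_table (col1, col2)  -- placeholder names
VALUES (1, 'a'), (2, 'b'), (3, 'c');

Recent versions of the pgjdbc driver can also rewrite batched inserts into
that form on their own if you add reWriteBatchedInserts=true to the JDBC
URL, though I don't know whether Pentaho exposes that connection option.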
