Re: Unexplained disk usage in AWS Aurora Postgres

From: Chris Borckholder <chris(dot)borckholder(at)bitpanda(dot)com>
To: Srinivasa T N <seenutn(at)gmail(dot)com>
Cc: PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Unexplained disk usage in AWS Aurora Postgres
Date: 2020-08-07 12:34:06
Message-ID: CADPUTkRYyUie6tAUXvoPhVF01NDKdHRTR5S8jQuQjpgLWm41Dg@mail.gmail.com
Lists: pgsql-general

Thank you for your insight Seenu!

That is a good point. Unfortunately, we do not have access to the
server or file system, as the database is a managed service.
File system access from within postgres, e.g. via pg_ls_dir, is also blocked.

Are you aware of another, creative way to infer the WAL file size from
within postgres?
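
One possible approach, sketched here as an assumption rather than a confirmed
answer: the amount of WAL a replication slot may still hold back can be
estimated from two LSNs, without touching the file system. An LSN such as
2A/10000000 is two hexadecimal numbers, the high and low 32 bits of a byte
position, so subtracting two LSNs gives a byte count. The inputs would be
pg_current_wal_lsn() and pg_replication_slots.restart_lsn; the function names
below are hypothetical helpers, and the server-side function pg_wal_lsn_diff
performs the same subtraction. Whether Aurora's storage metric counts retained
WAL in this way is not known from this thread.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN like '2A/10000000' to an absolute byte position."""
    high, low = lsn.strip().split("/")
    # The part before the slash is the high 32 bits, the part after the low 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


def wal_bytes_between(newer_lsn: str, older_lsn: str) -> int:
    """Bytes of WAL written between two LSNs (negative if the order is swapped)."""
    return lsn_to_int(newer_lsn) - lsn_to_int(older_lsn)


if __name__ == "__main__":
    # Illustrative values only, not taken from the instance discussed here.
    current_lsn = "2A/10000000"   # e.g. result of pg_current_wal_lsn()
    restart_lsn = "29/F0000000"   # e.g. pg_replication_slots.restart_lsn
    retained = wal_bytes_between(current_lsn, restart_lsn)
    print(f"retained WAL estimate: {retained / 1024 ** 2:.0f} MiB")
```

If the estimate stays small while the reported disk usage keeps growing, the
extra space is probably not WAL held back by the slot.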

Best Regards
Chris

On Tue, Aug 4, 2020 at 11:39 AM Srinivasa T N <seenutn(at)gmail(dot)com> wrote:

> There may be a lot of WAL files, or the log files in pg_log might be
> huge. Running "du -sh *" in the data directory holding the database might help.
>
> Regards,
> Seenu.
>
>
> On Tue, Aug 4, 2020 at 2:09 PM Chris Borckholder <
> chris(dot)borckholder(at)bitpanda(dot)com> wrote:
>
>> Hi!
>>
>> We are experiencing a strange situation with an AWS Aurora postgres
>> instance.
>> The database steadily grows in size, which is expected and normal.
>> After enabling logical replication, the disk usage reported by AWS
>> metrics increases much faster than the database size (as seen by \l+ in
>> psql). The current state is that the database size is ~290GB, while AWS
>> reports >640GB disk usage.
>> We reached out to AWS support of course, who are ultimately responsible.
>> Unfortunately they have not been able to diagnose this so far.
>>
>> I checked with the queries from the wiki
>> https://wiki.postgresql.org/wiki/Disk_Usage , which give essentially the
>> same result.
>> I tried to check the WAL segment file size, but we have no permission to
>> execute select pg_ls_waldir().
>> The replication slot is active and it also progresses
>> (pg_replication_slots.confirmed_flush_lsn increases and is close to
>> pg_current_wal_flush_lsn).
>>
>> Can you imagine other things that I could check from within postgres with
>> limited permissions to diagnose this?
>>
>> Best Regards
>> Chris
>>
>>
>>
