Re: is it safe to drop 25 tb schema with cascade option?

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Julie Nishimura <juliezain(at)hotmail(dot)com>, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: is it safe to drop 25 tb schema with cascade option?
Date: 2019-09-19 21:31:22
Message-ID: d11cac74-fc57-6949-dfc5-cc50d0c0810d@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 9/19/19 2:24 PM, Julie Nishimura wrote:
> Adrian,
> We do run vacuum w/o FULL every day:
> SYSTABLES="' pg_catalog.' || relname || ';' from pg_class a,
> pg_namespace b where a.relnamespace=b.oid and b.nspname='pg_catalog' an
> d a.relkind='r'"
>
> But it does not look like it frees up the space...

It won't return space to the OS it just marks it as available for reuse
by Postgres.

>
> Or you meant we need to run vacuum on 'my_db_name' without parameters,
> that it runs for every table? I am just not sure how long it will take
> to run for 39 tb...:(

Not sure. The bottom line is you are running out of transaction ids and
if the txid counter wraps things get ugly. You could try vacuuming
individual non-system tables that have a lot of churn(UPDATES/DELETES)
and see if that buys you some ids.

>
> Thanks
>
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
> *Sent:* Thursday, September 19, 2019 2:06 PM
> *To:* Julie Nishimura <juliezain(at)hotmail(dot)com>
> *Subject:* Re: is it safe to drop 25 tb schema with cascade option?
> On 9/19/19 1:30 PM, Julie Nishimura wrote:
>> Adrian, thanks for your reply. We do run VACUUM on pg_catalog every day
>> (while the system is online). Should I try to run VACUUM FULL on
>> pg_catalog? is it the same as you referring system catalogs?
>
> I would avoid VACUUM FULL as it acquires an exclusive lock on the table
> and rewrites the table.:
>
> https://www.postgresql.org/docs/8.2/sql-vacuum.html
>
> A VACUUM w/o FULL will make space available for new tuples which is what
> you want.
>
>>
>> Thank you!
>>
>> ------------------------------------------------------------------------
>> *From:* Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
>> *Sent:* Thursday, September 19, 2019 12:38 PM
>> *To:* Julie Nishimura <juliezain(at)hotmail(dot)com>; t
>> pgsql-general(at)lists(dot)postgresql(dot)org <pgsql-general(at)lists(dot)postgresql(dot)org>;
>> pgsql-general <pgsql-general(at)postgresql(dot)org>
>> *Subject:* Re: is it safe to drop 25 tb schema with cascade option?
>> On 9/19/19 12:06 PM, Julie Nishimura wrote:
>>> Hello, we've recently inherited large Greenplum system (master with
>>> standby and 8 segment nodes), which is running old version of GP:
>>>
>>> 20190919:15:22:01:122002 gpstate:hitw-mck-gp1-mdw-1:gpadmin-[INFO]:-
>>> Greenplum initsystem version   = 4.3.4.0 build 1
>>> 20190919:15:22:01:122002 gpstate:hitw-mck-gp1-mdw-1:gpadmin-[INFO]:-
>>> Greenplum current version      = PostgreSQL 8.2.15 (Greenplum
>>>   Database 4.3.8.1 build 1) on x86_64-unknown-linux-gnu, compiled by GCC
>>> gcc (GCC) 4.4.2 compiled on Apr 20 2016 08:08:56
>>> 20190919:15:22:01:122002 gpstate:hitw-mck-gp1-mdw-1:gpadmin-[INFO]:-
>>> Postgres version               = 8.2.15
>>>
>>> If I scan logs, for the last 6 months I see the following warning after
>>> every transaction:
>>> 04 UTC,0,con9059926,cmd1,seg-1,,,,sx1,"WARNING","01000","database
>>> ""my_db_name"" must be vacuumed within 1607900488 transactions",,"To
>>> avoid a database shutdown, execute a full-database VACUUM in
>>> ""my_db_name"".",,,,"set client_encoding to 'SQL_ASCII'",0,,"varsup.c",109,
>>>
>>> The database "my_db_name" is 32 TB. According to the crontab logs, we
>>> run VACUUM on pg_catalog every day (while the system is online). Should
>>> I try to run VACUUM FULL on pg_catalog first, or I need to run VACUUM on
>>> the entire "my_db_name"? I am not sure what I should try first.
>>
>> The vacuum warning is about transaction id wrap around:
>>
>> https://www.postgresql.org/docs/8.2/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND
>>
>> You will need to vacuum more then just pg_catalog. You will need to do
>> what the message says, vacuum the entire database.
>>
>>>
>>> For the full picture: the largest schema on "my_db_name" was "temp", it
>>> was consuming about 25 tb. So what we did - we renamed this schema to
>>> "temp_orig", and created brand new schema "temp" (to make drop objects
>>> from temp_orig easier and isolated). However, I was hesitating to drop
>>> the entire schema that big in one transaction, and started dropping
>>> tables from "temp_orig", however, there are millions of objects in that
>>> schema, and as a result, number of "drop table" transactions are very
>>> high. How safe is it to run "DROPSCHEMAtemp_orig CASCADE" if the schema
>>> is almost 25 tb?
>>
>> Not sure.
>>
>>>
>>> We are running out of space very quickly. we have only 5% left on a device
>>>
>>> Last time when we dropped millions of objects from that old schema, we
>>> were able to free up some space, but this time around even though I am
>>> running a lot of "drop tables", the space temporarily goes down
>>> (according to df -h), then it goes back again, even faster than I am
>>> freeing it up. Which makes me believe the system catalog is bloated now.
>>
>> Probably due to all the other operations hitting the database.
>>
>> Have you tried vacuuming the system catalogs?
>>
>>>
>>> Any advice is appreciated.
>>>
>>> Thanks a lot!
>>>
>>
>>
>> --
>> Adrian Klaver
>> adrian(dot)klaver(at)aklaver(dot)com
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2019-09-19 21:34:27 Re: is it safe to drop 25 tb schema with cascade option?
Previous Message Julie Nishimura 2019-09-19 21:24:57 Re: is it safe to drop 25 tb schema with cascade option?