From: Ron <ronljohnsonjr(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Recommendation to run vacuum FULL in parallel
Date: 2019-04-11 07:01:50
Message-ID: b35d52ac-6911-53b3-1aa0-f3fc32d6d02b@gmail.com
Lists: pgsql-general
Look also at pg_stat_all_tables.n_dead_tup for tables which are candidates
for vacuuming.
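
For example (a sketch; the 10,000-row threshold is arbitrary, and it assumes
psql is on your PATH and can reach the database):

```shell
# List the tables with the most dead tuples; the threshold is illustrative.
SQL="SELECT schemaname, relname, n_dead_tup
       FROM pg_stat_all_tables
      WHERE n_dead_tup > 10000
      ORDER BY n_dead_tup DESC"
command -v psql >/dev/null && psql -X -t -c "$SQL" || echo "psql not available"
```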
On 4/10/19 11:49 PM, Perumal Raj wrote:
> Thanks Kevin for the inputs,
>
> In my case there are 500+ tables, the biggest being a 30GB chunk (table
> only) plus its indexes.
> So I created 6 batches and executed them in parallel. All my scripts
> completed in 2 hours, and my DB size came down from 500GB to 300GB.
>
> Yes, I do see a CPU spike, but I did the whole activity during full
> application downtime.
>
> Going forward I am going to run vacuum on a daily basis to maintain the DB size.
>
> Also, the table/DB age came down drastically.
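
The age he mentions can be checked per database; a minimal sketch, assuming
psql is available and the server is reachable:

```shell
# Show transaction-ID age per database (higher = closer to wraparound).
AGE_SQL="SELECT datname, age(datfrozenxid) FROM pg_database ORDER BY 2 DESC"
command -v psql >/dev/null && psql -X -t -c "$AGE_SQL" || echo "psql not available"
```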
>
> Thanks
> Raj
>
> On Thu, Apr 4, 2019 at 12:53 PM Kevin Brannen <KBrannen(at)efji(dot)com> wrote:
>
> *From:* Perumal Raj <perucinci(at)gmail(dot)com>
>
>
> So to conclude the requirement here: the only way to achieve parallelism
> is multiple scripts, and there is no need to run REINDEX separately.
>
> Question: do we need to consider table dependencies while preparing the
> script, in order to avoid table locks during vacuum full?
>
> We have a small bash script (see below) that gets the list of tables
> sorted by size, smallest to largest, and does “vacuum full” one
> at a time, because (as someone else pointed out) this is very I/O
> intensive. That order also helps ensure we finish, because some of
> our installs are at the edge of running out of space (an issue we’re
> dealing with). I probably wouldn’t have a problem doing 2 at a time,
> but we do this in the middle of the night when activity is lowest and
> it only takes 1-2 hours, so we’re good with it. It sounds like you
> have a lot more data though.
>
> You might also consider putting the data into different tablespaces
> which are spread over multiple disks to help I/O. If you can, use SSD
> drives, they help with speed quite a bit. 😊
>
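
Creating such a tablespace looks roughly like this (a sketch; `fast_ssd`,
the mount path, and `big_table` are all made-up names):

```shell
# Hypothetical: carve out a tablespace on an SSD mount and move a table onto it.
DDL="CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd1/pgdata';
ALTER TABLE big_table SET TABLESPACE fast_ssd;"
command -v psql >/dev/null && psql -X -c "$DDL" || echo "psql not available"
```

Note that ALTER TABLE ... SET TABLESPACE physically rewrites the table, so
it is I/O-heavy and worth scheduling in the same maintenance window as a
vacuum full.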
> Don’t worry about table dependencies. This is a physical operation,
> not a data operation.
>
> HTH,
>
> Kevin
>
> $PGPATH/psql -t -c "
>     WITH s AS (SELECT nspname || '.' || relname AS table_name,
>                       pg_total_relation_size(c.oid) AS total_bytes
>                  FROM pg_class c
>                  LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
>                 WHERE relkind = 'r'
>                   AND nspname NOT IN ('pg_catalog', 'information_schema')
>                 ORDER BY 2)
>     SELECT table_name FROM s
> " |
> while read t ; do
>     echo ""
>     echo "$t"
>     $PGPATH/vacuumdb -w -z -f -t "$t"
> done
>
> ###
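
Raj's 6-batch approach could also be expressed with xargs, letting it cap the
number of concurrent jobs. A sketch: the table names below are placeholders,
and in practice the list would come from the psql query in Kevin's script.

```shell
# Run at most 2 "vacuum full" jobs at a time; table names are illustrative.
TABLES="public.small_tbl
public.mid_tbl
public.big_tbl"
echo "$TABLES" | xargs -P 2 -I {} sh -c '
  echo "vacuuming {}"
  command -v vacuumdb >/dev/null && vacuumdb -w -z -f -t "{}" || true'
```

With -I, xargs runs one command per input line, and -P 2 keeps two of them in
flight at once, which matches Kevin's "doing 2 at a time" comfort level.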
>
--
Angular momentum makes the world go 'round.
Next message: rihad, 2019-04-11 07:19:43, "When do vacuumed pages/tuples become available for reuse?"
Previous message: Perumal Raj, 2019-04-11 04:49:42, "Re: Recommendation to run vacuum FULL in parallel"