From: | Perumal Raj <perucinci(at)gmail(dot)com> |
---|---|
To: | Kevin Brannen <KBrannen(at)efji(dot)com> |
Cc: | pgsql-general General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Recommendation to run vacuum FULL in parallel |
Date: | 2019-04-11 04:49:42 |
Message-ID: | CALvqh4rATReNKQLmrU0xHe5tpr_EFO5h2-n-w-_QFwYmdGVbbg@mail.gmail.com |
Lists: | pgsql-general |
Thanks, Kevin, for the inputs.

In my case there are 500+ tables, and the biggest chunk is about 30 GB (table
only) plus its indexes. So I created 6 batches and executed them in parallel.
All my scripts completed in 2 hours, and the DB size came down from 500 GB to
300 GB.

Yes, I do see a CPU spike, but I did the whole activity during a full
application downtime window.

Going forward I am going to run vacuum on a daily basis to maintain the DB
size. The table/DB age also came down drastically.
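For reference, the batching looked roughly like the sketch below (the batch
count of 6, the file paths, and the use of GNU split are illustrative
assumptions, not my exact scripts):

# Dump the table list, smallest to largest, one table per line.
$PGPATH/psql -At -c "
    SELECT n.nspname || '.' || c.relname
      FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
     WHERE c.relkind = 'r'
       AND n.nspname NOT IN ('pg_catalog', 'information_schema')
     ORDER BY pg_total_relation_size(c.oid)
" > /tmp/all_tables.txt

# Round-robin the list into 6 batch files (requires GNU split).
split -n r/6 /tmp/all_tables.txt /tmp/vacuum_batch_

# Run one vacuumdb loop per batch file, all in parallel, then wait for all.
for f in /tmp/vacuum_batch_*; do
    (
        while read t; do
            [ -n "$t" ] && $PGPATH/vacuumdb -w -z -f -t "$t"
        done < "$f"
    ) &
done
wait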
Thanks
Raj
On Thu, Apr 4, 2019 at 12:53 PM Kevin Brannen <KBrannen(at)efji(dot)com> wrote:
> *From:* Perumal Raj <perucinci(at)gmail(dot)com>
>
> So, to conclude the requirement here: the only way to achieve parallelism is
> multiple scripts, and there is no need to run REINDEX separately.
>
> Question: do we need to consider table dependencies while preparing the
> scripts, in order to avoid table locks during vacuum full?
>
>
>
> We have a small bash script (see below) that gets the list of tables and
> their sizes, sorted smallest to largest, and does “vacuum full” one table at
> a time because (as someone else pointed out) this is very I/O intensive.
> That order also helps ensure we finish, because some of our installs are at
> the edge of running out of space (an issue we’re dealing with). I probably
> wouldn’t have a problem doing 2 at a time, but we do this in the middle of
> the night when activity is lowest and it only takes 1-2 hours, so we’re
> good with it. It sounds like you have a lot more data, though.
>
>
>
> You might also consider putting the data into different tablespaces that
> are spread over multiple disks to help I/O. If you can, use SSD drives;
> they help with speed quite a bit. 😊
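>
> For example, something along these lines moves one big table onto an
> SSD-backed tablespace (the table and tablespace names are only placeholders):
>
> $PGPATH/psql -c "ALTER TABLE public.big_table SET TABLESPACE ssd_space;"
>
> (The tablespace has to exist first, e.g.
> CREATE TABLESPACE ssd_space LOCATION '/ssd/pgdata';)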
>
>
>
> Don’t worry about table dependencies. This is a physical operation, not a
> data operation.
>
>
>
> HTH,
>
> Kevin
>
>
>
> $PGPATH/psql -t -c "
>     WITH s AS (SELECT nspname || '.' || relname AS TABLE_NAME,
>                       pg_total_relation_size(c.oid) AS total_bytes
>                  FROM pg_class c
>                  LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
>                 WHERE relkind = 'r'
>                   AND nspname NOT IN ( 'pg_catalog', 'information_schema' )
>                 ORDER BY 2 )
>     SELECT table_name FROM s
> " |
> while read t ; do
>     echo ""
>     echo $t
>     $PGPATH/vacuumdb -w -z -f -t $t
> done
>