Re: Dropping column from big table

From: Ron Johnson <ronljohnsonjr(at)gmail(dot)com>
To: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Dropping column from big table
Date: 2024-07-11 12:36:32
Message-ID: CANzqJaCA3Um1JFgjRNg0YafFwc4VeRmWhsH7fGXeHZ_yOOcTRA@mail.gmail.com
Lists: pgsql-general

On Thu, Jul 11, 2024 at 3:41 AM sud <suds1434(at)gmail(dot)com> wrote:

>
>
> On Thu, 11 Jul, 2024, 12:46 pm Ron Johnson, <ronljohnsonjr(at)gmail(dot)com>
> wrote:
>
>> On Wed, Jul 10, 2024 at 11:28 PM sud <suds1434(at)gmail(dot)com> wrote:
>>
>>>
>>>
>>>
>>> Thank you so much. When you said *"you can execute one of the forms of
>>> ALTER TABLE that performs a rewrite*
>>> *of the whole table."* Does it mean that post "alter table drop column"
>>> the vacuum is going to run longer as it will try to clean up all the rows
>>> and recreate the new rows? But then how can this be avoidable or made
>>> better without impacting the system performance
>>>
>>
>> "Impact" is a non-specific word. "How much impact" depends on how many
>> autovacuum workers you've set it to use, and how many threads you set in
>> vacuumdb.
>>
>>
>>> and blocking others?
>>>
>>
>> VACUUM never blocks.
>>
>> Anyway, DROP is the easy part; it's ADD COLUMN that can take a lot of
>> time (depending on whether or not you populate the column with a default
>> value).
>>
>> I'd detach all the partitions from the parent table, and then add the new
>> column to the not-children in multiple threads, add the column to the
>> parent and then reattach all of the children. That's the fastest method,
>> though takes some time to set up.
>>
>
>
> Thank you so much.
>
> Dropping will take its own time for the post-drop vacuum; however, as you
> rightly said, it won't be blocking, which should be fine.
>
> Regarding ADD COLUMN: detaching all partitions, then adding the column to
> the individual partitions in multiple sessions, and then reattaching looks
> to be a really awesome idea to make it faster.
>

Do both the DROP and ADD in the same "set". Possibly in the same statement
(which would be fastest if it works), and alternatively on the same command
line. Examples:
psql --host=foo.example.com somedb -c "ALTER TABLE bar_p85 DROP COLUMN splat, ADD COLUMN barf BIGINT;"
psql --host=foo.example.com somedb -c "ALTER TABLE bar_p85 DROP splat;" -c "ALTER TABLE bar_p85 ADD COLUMN barf BIGINT;"

My syntax is probably wrong, but you get the idea.
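
To repeat the combined statement across many detached partitions, something like the following might do as a starting point. It is only a dry run that prints one psql command line per partition. bar_p86 and bar_p87 are made-up partition names, and the host, database, and column names are just the placeholders from the examples above.

```shell
#!/bin/sh
# Dry run: prints one psql command per partition instead of executing it.
# bar_p86 and bar_p87 are hypothetical partition names; host, database and
# column names are the placeholders used in the examples above.

# Build the combined DROP + ADD statement for one partition.
alter_sql() {
    printf 'ALTER TABLE %s DROP COLUMN splat, ADD COLUMN barf BIGINT;' "$1"
}

# Build the psql command line for one partition.
psql_cmd() {
    printf 'psql --host=foo.example.com somedb -c "%s"\n' "$(alter_sql "$1")"
}

for part in bar_p85 bar_p86 bar_p87; do
    psql_cmd "$part"
done
```

To actually run several sessions at once, feeding the partition names to xargs with -P 4 and -I{} (GNU or BSD xargs), with {} in place of the table name in the -c string, should keep up to four psql sessions busy. Try it on a copy first.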

> However, one doubt: will it create an issue if a foreign key already exists
> on this partitioned table, or, say, it's the parent of other child
> partitioned/non-partitioned tables?
>

(Note that detached children have FK constraints.)

It'll certainly create an "issue" if the column you're dropping is part of
the foreign key. 😀

It'll also cause a problem if the table you're dropping from or adding to
is the "target" of the FK, since the referencing ("source") table can't
check the table being altered during the ALTER TABLE statement.
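
Before touching anything, it may be worth listing which foreign keys involve the table in either direction. A rough sketch, which only prints a query against the pg_constraint catalog; "bar" as the parent table's name is an assumption on my part (only bar_p85 appears above):

```shell
#!/bin/sh
# Dry run: prints a catalog query listing foreign keys that touch a table,
# either as the referencing ("source") or the referenced ("target") side.
# "bar" as the parent table's name is an assumption; the name is pasted into
# the SQL unescaped, so pass a plain identifier only.

fk_query() {
    cat <<EOF
SELECT conname,
       conrelid::regclass  AS referencing_table,
       confrelid::regclass AS referenced_table
FROM pg_constraint
WHERE contype = 'f'
  AND (conrelid = '$1'::regclass OR confrelid = '$1'::regclass);
EOF
}

fk_query bar
```

The printed query can be handed to psql, e.g. psql --host=foo.example.com somedb -c "$(fk_query bar)".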

Bottom line: you can optimize for:
1. minimized wall time by doing it in multiple transactions (which *might*
bodge your application; we don't know it, so can't say for sure), OR
2. assured consistency (one transaction where you just ALTER the parent,
and have it ripple down to the children); it'll take much longer, though.

One other issue: *if* adding the new column requires a rewrite, "ALTER
parent" *might* (but I've never tried it) temporarily use an extra 2TB of
disk space in that single transaction. Doing the ALTERs child by child
minimizes that, since each child's ALTER is its own transaction.
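
The child-by-child order of operations could be sketched roughly as below. It is a dry run that only prints SQL. The parent name "bar", range partitioning, and the bounds are all assumptions for illustration; the real bounds have to be taken from the parent's definition (e.g. \d+ in psql) before detaching anything.

```shell
#!/bin/sh
# Dry run: prints the SQL for the detach / alter / reattach sequence.
# Assumptions: parent table "bar", range partitioning, and the bounds passed
# in below are all hypothetical placeholders.

detach_sql() {  # $1 = parent, $2 = child
    printf 'ALTER TABLE %s DETACH PARTITION %s;\n' "$1" "$2"
}

alter_sql() {   # $1 = table to alter
    printf 'ALTER TABLE %s DROP COLUMN splat, ADD COLUMN barf BIGINT;\n' "$1"
}

attach_sql() {  # $1 = parent, $2 = child, $3 = lower bound, $4 = upper bound
    printf "ALTER TABLE %s ATTACH PARTITION %s FOR VALUES FROM ('%s') TO ('%s');\n" \
        "$1" "$2" "$3" "$4"
}

detach_sql bar bar_p85
alter_sql bar_p85      # sent on its own via psql -c, this is one transaction
alter_sql bar          # parent has no children attached at this point
attach_sql bar bar_p85 2024-07-01 2024-08-01
```

Keep in mind that ATTACH PARTITION scans the child to validate the partition bounds unless a matching CHECK constraint already exists on it, so the reattach step is not free either.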

Whatever you do... test test test.

