Re: Dropping column from big table

From: sud <suds1434(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, "Peter J(dot) Holzer" <hjp-pgsql(at)hjp(dot)at>
Subject: Re: Dropping column from big table
Date: 2024-07-15 20:30:27
Message-ID: CAD=mzVW9d0cPfccGVHLvCkcq4Mc-Wg5La3+XAHHEY1AMeOQ_Bw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, Jul 15, 2024 at 7:58 PM Peter J. Holzer <hjp-pgsql(at)hjp(dot)at> wrote:

>
> > Hm, true.
> >
> > You can always do
> >
> > UPDATE tab SET id = id;
> >
> > followed by
> >
> > VACUUM (FULL) tab;
>
> Yes, that should work. It needs about twice the size of the table in
> temporary space, though.
>
> Since the OP wrote that the table is "daily ... and 90 partitions"
> (which understand that there is one partition per day and partitions are
> kept for 90 days) it might be better to just wait. After 90 days all the
> partitions with the obsolete column will be gone.
>
>
>
Thank You very much.

As I understand, after dropping a column, it will still internally hold the
"NOT NULL" values in that column for all the rows, even though it's not
visible outside.

So, after the DROP column, it will force update any of the columns as
below, Then it will force create another copy of each of the rows even if
the column is updated to the same value. The new copy will have the dropped
column with values as NULL. And the post "VACUUM FULL '' will clean all the
rows with "NOT NULL '' values of that dropped column and thus reclaim the
space.

But the only issue would be "VACUUM FULL" will take a table lock and also
it may take longer to run this vacuum on the full table considering the
size of the table in TB's. Thus, is it fine to just leave it post execution
of the "update" statement , so that the normal vacuum operation (which will
be online operation) and that will take care of the removal of old rows ?

UPDATE tab SET id = id;
VACUUM (FULL) tab;

And also, As you mentioned we may also leave it as is and wait for the
partition to be dropped, so that the dropped column with "not null" values
which are still lying under the hood and are occupying space will be
removed automatically. But even then, is that dropped column still lying in
the rows with null values in it throughout its lifecycle, till the table
exists in the database?

Seems there is no other option exist to drop the column with space
reclaimed from the table in immediate effect, other than above discussed.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Dan Kortschak 2024-07-15 20:31:13 re-novice coming back to pgsql: porting an SQLite update statement to postgres
Previous Message Ron Johnson 2024-07-15 19:55:19 Re: PostgreSQL Active-Active Clustering