Re: Alter the column data type of the large data volume table.

From: Michael Lewis <mlewis(at)entrata(dot)com>
To: Kevin Brannen <KBrannen(at)efji(dot)com>
Cc: PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Alter the column data type of the large data volume table.
Date: 2020-12-04 16:22:19
Message-ID: CAHOFxGqbtX9B7wX0EUurtY92OnA98j2+4rUn53MiCu2d1mtbtg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, Dec 4, 2020 at 9:04 AM Kevin Brannen <KBrannen(at)efji(dot)com> wrote:

> *>From:* Olivier Gautherot <ogautherot(at)gautherot(dot)net>
>
> >>5) If you're brave enough, convert your current table as a partition
> (rename it to something like table_hist), duplicate the table model under
> the same name as now (adjusting the primary key type) and set the INHERITS
> on the primary key range. The inheritance should take care of the type
> conversion (haven't tried it but it's worth a try). If it works, you will
> reach your goal without downtime or significant overhead.
>
> >Sorry, just tried this one and it failed: type mismatch.
>
>
>
> Seems like a sound idea in general. I’d probably rename the tables, let’s
> call them “big_hist” for the old big table and “big_split” for the new
> partitioned table that being used go forward – assuming the original table
> was called “big”. Then create a View that will look at both of those but
> call it the same as the old table, and let the view do a type cast on the
> old key like big_hist.id::bigint so it matches the new type, because the
> view will probably be a union and the type need to match. That way your
> application only has to pause long enough to do a few meta-commands then it
> all can resume, and like Olivier pointed you, you can fix the data by
> moving it from big_hist to big_split in the background as you have time.
>
>
>
> I’d probably put it all in a transaction too:
>
>
>
> Create table … -- all the commands to create your patitioned table
> big_split here
>
> Begin;
>
> Alter table big rename to big_hist;
>
> Create view big select * from big_split union select id::bigint, /* other
> cols */ from big_hist;
>
> Commit;
>
>
>
> Try it on a dev system and if it works you’re off and running. I’d expect
> the view to slow things down a little, but probably not too much if you
> have good indexes. But at least you could transition without major downtime
> and then rename “big_split” back to “big” and drop “big_hist” when you’ve
> finished the transition. I might even be tempted to add a trigger so that
> all new inserts into “big” really go into “big_split” so “big_hist” doesn’t
> grow any more. Your imagination is probably the limit. 😊
>

Yes, "instead of" triggers for insert/update/delete will make the change
transparent to the application side, other than the potential for slowness
while in the process of moving the data and still using the view.

Also, I'd advocate for UNION ALL to avoid the "distinct" work that is
required for plain UNION. I wish ALL were default behavior and "UNION
DISTINCT" was how get that behavior.

If partitioning is going to happen anyway in the near future, now is a
great time to make that happen. Unfortunately, PG13 is required to use
logical replication from a non-partitioned table to a partitioned table so
moving the data still requires some creative work.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Olivier Gautherot 2020-12-04 17:42:29 Re: Alter the column data type of the large data volume table.
Previous Message Tom Lane 2020-12-04 16:18:39 Re: Set COLLATE on a session level