Re: How to do faster DML

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: veem v <veema0000(at)gmail(dot)com>, Ron Johnson <ronljohnsonjr(at)gmail(dot)com>, "Peter J(dot) Holzer" <hjp-pgsql(at)hjp(dot)at>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: How to do faster DML
Date: 2024-02-14 19:13:15
Message-ID: bcb0516c-f2ba-4693-90e1-60063fe96601@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


On 2/14/24 10:11 AM, veem v wrote:
>
> On Wed, 14 Feb, 2024, 10:30 am Ron Johnson, <ronljohnsonjr(at)gmail(dot)com>
> wrote:
>
> On Tue, Feb 13, 2024 at 4:17 PM veem v <veema0000(at)gmail(dot)com> wrote:
> [sni[]
>
> One question here, if we have defined one column as a
> fixed length data type "integer" and slowly we noticed the
> length of data keeps increasing (in case of a sequence
> generated PK column which will keep increasing), and we want
> to alter the column to "bigint" now. In such scenario, will it
> append/pad the additional spaces to all the existing values
> which were already stored  with integer type initially in the
> table. And that would be then an increase to the existing
> table storage. Please correct me if I'm wrong.
>
>
> ALTER TABLE foo ALTER COLUMN bar BIGINT; rewrites the whole
> table.  Do yourself a favor, and start with BIGINT.
>
>
>
> Not yet tried, but that's interesting. I was initially thinking as it
> will be increasing the length, so it would be just a metadata change
> and finish within seconds.

It depends:

https://www.postgresql.org/docs/current/sql-altertable.html

"Adding a column with a volatile |DEFAULT| or changing the type of an
existing column will require the entire table and its indexes to be
rewritten. As an exception, when changing the type of an existing
column, if the |USING| clause does not change the column contents and
the old type is either binary coercible to the new type or an
unconstrained domain over the new type, a table rewrite is not needed.
However, indexes must always be rebuilt unless the system can verify
that the new index would be logically equivalent to the existing one.
For example, if the collation for a column has been changed, an index
rebuild is always required because the new sort order might be
different. However, in the absence of a collation change, a column can
be changed from |text| to |varchar| (or vice versa) without rebuilding
the indexes because these data types sort identically. Table and/or
index rebuilds may take a significant amount of time for a large table;
and will temporarily require as much as double the disk space.

"

create table int_test(int_fld integer);

insert into int_test select * from generate_series(1, 10000, 1);

select ctid, int_fld from int_test ;

ctid   | int_fld
----------+---------
 (0,1)    |       1
 (0,2)    |       2
 (0,3)    |       3
 (0,4)    |       4
 (0,5)    |       5
 (0,6)    |       6
 (0,7)    |       7
 (0,8)    |       8
 (0,9)    |       9
 (0,10)   |      10

alter table int_test alter column int_fld set data type bigint;

select ctid, int_fld from int_test ;

  ctid   | int_fld
----------+---------
 (0,1)    |       1
 (0,2)    |       2
 (0,3)    |       3
 (0,4)    |       4
 (0,5)    |       5
 (0,6)    |       6
 (0,7)    |       7
 (0,8)    |       8
 (0,9)    |       9
 (0,10)   |      10

update int_test set  int_fld = int_fld;

select ctid, int_fld from int_test  order by int_fld;

(63,1)    |       1
 (63,2)    |       2
 (63,3)    |       3
 (63,4)    |       4
 (63,5)    |       5
 (63,6)    |       6
 (63,7)    |       7
 (63,8)    |       8
 (63,9)    |       9
 (63,10)   |      10

Where ctid is:

https://www.postgresql.org/docs/current/ddl-system-columns.html

"

|ctid|

The physical location of the row version within its table. Note that
although the |ctid| can be used to locate the row version very
quickly, a row's |ctid| will change if it is updated or moved by
|VACUUM FULL|. Therefore |ctid| is useless as a long-term row
identifier. A primary key should be used to identify logical rows.

"

>
>  But as you mentioned, it seems to be the effect of "fixed length data
> type" which is why it's going to rewrite whole table even we just
> increases the column length. Hope it won't be the case in variable
> length data type.
>
--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ron Johnson 2024-02-14 19:20:45 Re: PITR
Previous Message Yongye Serkfem 2024-02-14 18:38:54 PITR