Re: How to do faster DML

From: veem v <veema0000(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org, "Peter J(dot) Holzer" <hjp-pgsql(at)hjp(dot)at>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Subject: Re: How to do faster DML
Date: 2024-02-12 06:50:11
Message-ID: CAB+=1TVjov+y-8=d3ey-B7-Wj-23KPAKQQAejkjXcS50_KZ=pg@mail.gmail.com
Lists: pgsql-general

Thank you.

On Mon, 12 Feb 2024 at 03:52, Peter J. Holzer <hjp-pgsql(at)hjp(dot)at> wrote:

> On 2024-02-11 12:08:47 -0500, Ron Johnson wrote:
> > On Sun, Feb 11, 2024 at 11:54 AM veem v <veema0000(at)gmail(dot)com> wrote:
> > > When you said "you would normally prefer those over numeric" I was
> > > thinking the opposite. As you mentioned, integer is a fixed-length
> > > data type and will occupy 4 bytes whether you store 15 or 99999999.
> > > But in the case of a variable-length type like Number or numeric, it
> > > will resize itself based on the actual data. So is there any downside
> > > of going with a variable-length data type like numeric?
> >
> > Consider a table with a bunch of NUMERIC fields. One of those records
> > has small values (aka three bytes). It fits neatly in 2KiB.
> >
> > And then you update all those NUMERIC fields to big numbers that take
> > 15 bytes. Suddenly (or eventually, if you update them at different
> > times), the record does not fit in 2KiB, and so must be moved to its
> > own page. That causes extra IO.
>
> I think that's not much of a concern with PostgreSQL because you can't
> update a row in-place anyway because of MVCC. So in any case you're
> writing a new row. If you're lucky there is enough free space in the
> same page and you can do a HOT update, but that's quite independent of
> whether the row changes size.
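
As a quick check of the size difference discussed above, pg_column_size()
shows it directly; the literal values below are just examples, and the
exact byte counts may vary:

SELECT pg_column_size(99999999::integer)           AS int_bytes,      -- always 4
       pg_column_size(15::numeric)                 AS small_numeric,  -- roughly 5 here
       pg_column_size(99999999.123456789::numeric) AS big_numeric;    -- grows with the digits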
Good to know. So it means that here in Postgres there is no such concern
as "row chaining", "row migration" etc., which we normally have in a
database that updates rows in place (like Oracle, say). And there it's
not advisable to have more than ~255 columns in a table, even though it's
technically possible; if such a requirement arises, we normally break the
table into two different tables with some columns in common to join them.

https://jonathanlewis.wordpress.com/2015/02/19/255-columns/
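
For what it's worth, the HOT behaviour you describe seems observable from
pg_stat_user_tables. A rough sketch (the table name and fillfactor choice
are just illustrative; the lowered fillfactor leaves free space on each
page so same-page updates are possible):

CREATE TABLE hot_test (id integer PRIMARY KEY, val numeric)
WITH (fillfactor = 70);
INSERT INTO hot_test SELECT g, 1 FROM generate_series(1, 1000) g;
-- widen every row; val is not indexed, so HOT updates are possible
UPDATE hot_test SET val = 99999999.123456789;

-- n_tup_hot_upd vs n_tup_upd shows how many updates stayed on the same page
-- (the counters may lag slightly behind the statements)
SELECT n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'hot_test';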

So we were thinking that adding many columns to a table should be fine in
Postgres (here we have a use case in which the total number of columns may
go to ~500+). But accessing columns towards the end of a row is going to
take more time than accessing columns at the start of the row; as you
mentioned, accessing the 100th column may cost 4 to 5 times as much as
accessing the first column. So, is it advisable here to take a similar
approach of breaking the table into two, if the total number of columns
for a table reaches a certain threshold?
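
In case it helps clarify the question, I mean something along these lines
(purely hypothetical names, with the frequently accessed columns kept in
the narrow table):

CREATE TABLE txn_main (
    txn_id bigint PRIMARY KEY,
    status text,
    amount numeric
    -- ... the handful of frequently accessed columns
);

CREATE TABLE txn_extra (
    txn_id bigint PRIMARY KEY REFERENCES txn_main (txn_id),
    attr_101 text,
    attr_102 text
    -- ... the rarely accessed trailing columns
);

-- queries that need everything join on the shared key
SELECT m.*, e.*
FROM txn_main m
JOIN txn_extra e USING (txn_id);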

Regards
Veem
