Re: How to do faster DML

From: veem v <veema0000(at)gmail(dot)com>
To: Lok P <loknath(dot)73(at)gmail(dot)com>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>
Cc: dwhite(at)seawardmoon(dot)com, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>, Francisco Olarte <folarte(at)peoplecall(dot)com>, haramrae(at)gmail(dot)com, hjp-pgsql(at)hjp(dot)at
Subject: Re: How to do faster DML
Date: 2024-02-06 05:55:05
Message-ID: CAB+=1TUfA7dra5DRqAoVvKvjTv5WndyNDiuvFw+bBR0qi6_tWQ@mail.gmail.com
Lists: pgsql-general

On Tue, 6 Feb 2024 at 10:45, Lok P <loknath(dot)73(at)gmail(dot)com> wrote:

> Thank you Greg.
>
> *"and keeping all your active stuff in cache. Since you have 127 columns,
> only pull back the columns you absolutely need for each query."*
>
> Understood the point. As Postgres is a "row store" database, keeping
> the row size small by keeping the number of columns to a minimum should
> definitely help: more rows can then be packed into one page/block (which
> I believe is 8KB by default), so more of the data can be cached while
> reading and the cache hit ratio improves.
>
> As you rightly pointed out, another thing I noticed is that the
> shared_buffers parameter is set to 2029684 in this instance, which comes
> to ~21MB and seems very small for a database operating at a large scale.
> The instance has ~256GB of RAM, so I am thinking of bumping it up to
> ~40-50GB, and hope that will help to some extent. I am also not sure if
> there are methods to manually cache some objects (tables/indexes) that
> are used frequently by the read queries.
>
>
> On Tue, Feb 6, 2024 at 10:27 AM Greg Sabino Mullane <htamfids(at)gmail(dot)com>
> wrote:
>
>>
>>> So when you said *"In that case, and based on the numbers you provided,
>>> daily partitioning seems a decent solution."*, does it mean that the
>>> size of each partition (which will be ~100-150GB per partition here) is
>>> not an issue for Postgres when serving read-latency-sensitive queries?
>>>
>>
>> Yes, that should be fine. At the end of the day, what's going to be more
>> important is making sure you are using good indexing, and keeping all your
>> active stuff in cache. Since you have 127 columns, only pull back the
>> columns you absolutely need for each query.
>>
>> Cheers,
>> Greg
>>
>>
>
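
Side note on the manual-caching question quoted above: the pg_prewarm
extension (shipped in contrib) can load a table or index into the buffer
cache on demand. A minimal sketch, assuming the extension is available on
the instance and using hypothetical object names from the partitioning
discussion:

    -- check the configured buffer cache size
    SHOW shared_buffers;

    -- load a hot daily partition and its index into shared_buffers
    CREATE EXTENSION IF NOT EXISTS pg_prewarm;
    SELECT pg_prewarm('transactions_2024_02_06');       -- table
    SELECT pg_prewarm('transactions_2024_02_06_pkey');  -- its index

Prewarming only loads the data once, though; whether it stays cached still
depends on shared_buffers being large enough for the active working set.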
Good discussion!! I don't want to divert the thread, and apologies for
comparing with another RDBMS, but I am curious to know the difference in
behaviour.

With respect to the storage part: in Oracle we were supposed to put the
most frequently accessed columns first and the less frequently accessed
columns towards the end of the table when writing the CREATE TABLE DDL.
As it is a row-store database, this used to help query performance when
accessing those columns. Do any such performance implications exist in
Postgres? Also, in Oracle, data types like varchar used to occupy only
the space that the real data/value contains.
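
From what I have read, one column-order effect that does exist in Postgres
is alignment padding, which is about space rather than access frequency. A
minimal sketch with two hypothetical tables holding the same columns in a
different order:

    -- a boolean (1 byte) before a bigint (8-byte aligned) forces 7 padding bytes
    CREATE TABLE t_padded (a boolean, b bigint, c boolean, d bigint);
    -- wide, aligned columns first pack the row more tightly
    CREATE TABLE t_packed (b bigint, d bigint, a boolean, c boolean);

    INSERT INTO t_padded SELECT true, i, false, i FROM generate_series(1, 100000) i;
    INSERT INTO t_packed SELECT i, i, true, false FROM generate_series(1, 100000) i;

    -- t_padded should come out noticeably larger on disk
    SELECT pg_relation_size('t_padded') AS padded,
           pg_relation_size('t_packed') AS packed;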

On the data types, though, Postgres seems a bit different to me: there
appears to be significant storage overhead, as these look like fixed-length
data types that occupy their full declared space no matter what the actual
data holds. So I am curious to know whether there will be any performance
impact when accessing such a column, if a frequently accessed column has
been put towards the end of the table to save storage space?
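
Maybe one way to verify this is to measure the stored size of a value with
pg_column_size(). A minimal sketch (as I understand it, the extra 4 bytes
are the varlena length header, and short values can even use a 1-byte
header on disk):

    -- a 3-character value in a varchar(100) does not occupy 100 bytes
    SELECT pg_column_size('abc'::varchar(100));  -- 7: 4-byte header + 3 data bytes
    SELECT pg_column_size('abc'::text);          -- same; text/varchar are variable-length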
