Re: Huge Tables

From: André Rodrigues <db(dot)andre(at)gmail(dot)com>
To: Rick Otten <rottenwindfish(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Huge Tables
Date: 2023-03-13 12:51:43
Message-ID: CAPaZWvYNQ-5VpuN=EKG2o9oJoYTjwtA3-5Whxb8RJ5y=R6ZaFw@mail.gmail.com
Lists: pgsql-performance

Thanks a Million for your tips !!!!!
Very very good !

On Mon, Mar 13, 2023 at 12:45, Rick Otten <rottenwindfish(at)gmail(dot)com>
wrote:

> 300M rows isn't "huge", but it is starting to get to be real data.
>
> Some notes/very general rules of thumb since you asked a very general
> question:
> 1. Consider raising the statistics target on the table from the default of
> 100 to something larger - especially if you have a wide variety of data.
> (Either set it per column on the table, or set it globally on your database
> with the `default_statistics_target` parameter.)
> 2. Consider the `create statistics` command to see if there are any
> additional hints you can give the planner to help it figure out whether
> columns are related.
> 3. If you partition:
> a. Your queries could be _slower_ if they don't include the partition
> criteria. So partition on something you are likely to almost always want
> to filter on anyhow. That way you can take advantage of "partition
> pruning".
> b. One of the main advantages of partitioning is to be able to
> archive old data easily - either by moving it to other tables, dropping it,
> or doing other things with it. Think about whether you ever intend to roll
> off old data and figure out ways partitions might make that easier.
> 4. Consider tweaking `max_parallel_workers` to enable more concurrency if
> you are running a lot of big queries on your larger table.
> a. There are a number of other `*parallel*` parameters you can study
> and tune as well.
> 5. Consider bumping `work_mem` if you are running queries that are doing
> a lot of sorting and other intermediary work on the larger data sets.
> 6. For a table with only 300M rows, btree is going to be fine for most
> use cases. If you have a monotonically increasing/decreasing column you
> may be able to use a BRIN index on it to save a little space and make for
> slightly more efficient queries.
> 7. You may want to tweak the vacuum parameters to be able to use a little
> more memory and more parallel processing. Since autovacuums are triggered
> by a percentage of change in the table, you may want to lower the
> percentage of rows that trigger the vacuums.
>
> You'll need to get a lot more specific about the issues you are running
> into for us to be able to provide more specific recommendations.
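
To make the tips above concrete, here are a few small sketches. All object
names (`big_table`, `mydb`, `created_at`, `customer_id`, `status`) and the
values are placeholders, not anything from this thread. For item 1, the
statistics target can be raised per column or database-wide:

    -- Per-column: raise the target above the default of 100, then
    -- re-analyze so the planner sees the richer statistics.
    ALTER TABLE big_table ALTER COLUMN status SET STATISTICS 500;
    ANALYZE big_table;

    -- Or database-wide, for new sessions:
    ALTER DATABASE mydb SET default_statistics_target = 500;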
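
For item 2, one possible shape of a `create statistics` call, assuming the
(hypothetical) `customer_id` and `status` columns are correlated:

    -- Extended statistics so the planner stops treating the two
    -- columns as statistically independent.
    CREATE STATISTICS big_table_cust_status (dependencies, ndistinct)
        ON customer_id, status FROM big_table;
    ANALYZE big_table;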
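
Item 3 might look roughly like this if the table were range-partitioned by a
timestamp that nearly every query filters on (columns and ranges invented for
illustration):

    CREATE TABLE big_table (
        id          bigint      NOT NULL,
        created_at  timestamptz NOT NULL,
        customer_id bigint,
        status      text,
        PRIMARY KEY (id, created_at)  -- the partition key must be in the PK
    ) PARTITION BY RANGE (created_at);

    CREATE TABLE big_table_2023_q1 PARTITION OF big_table
        FOR VALUES FROM ('2023-01-01') TO ('2023-04-01');

    -- Queries filtering on created_at can be pruned to one partition, and
    -- archiving old data later becomes a cheap metadata operation:
    ALTER TABLE big_table DETACH PARTITION big_table_2023_q1;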
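
For item 4, the parallel-query knobs can be raised system-wide; the numbers
below are placeholders to size against your core count:

    ALTER SYSTEM SET max_parallel_workers = 8;
    ALTER SYSTEM SET max_parallel_workers_per_gather = 4;
    SELECT pg_reload_conf();  -- takes effect without a restart, but both are
                              -- capped by max_worker_processes (restart only)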
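
Item 5 (`work_mem`) applies per sort or hash node per query, so rather than
raising it globally it is often bumped just for a heavy session:

    SET work_mem = '256MB';  -- session-level, e.g. for a big reporting query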
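
For item 6, a BRIN index on a monotonically increasing column (here the
hypothetical `created_at`) is tiny compared to a btree:

    CREATE INDEX big_table_created_brin ON big_table USING brin (created_at);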
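
And for item 7, autovacuum thresholds can be lowered per table; the fractions
below are only examples (the defaults are 0.2 and 0.1):

    ALTER TABLE big_table SET (
        autovacuum_vacuum_scale_factor  = 0.02,  -- vacuum after ~2% of rows change
        autovacuum_analyze_scale_factor = 0.01   -- analyze after ~1% change
    );

    -- More memory for autovacuum workers, if the host can spare it
    -- (applied on reload):
    ALTER SYSTEM SET autovacuum_work_mem = '1GB';
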
>
>
> On Sat, Mar 11, 2023 at 6:48 AM André Rodrigues <db(dot)andre(at)gmail(dot)com>
> wrote:
>
>> Hello Guys
>>
>> Regarding a particular performance + architecture situation with Postgres
>> 12: I have a table with 300 million rows. Which basic approach would you
>> suggest - *parameters in postgresql.conf*, a suitable index type, a
>> partitioning scheme - knowing that our queries use bind parameters with a
>> range on the id (primary key) plus 1 or 2 other columns?
>>
>>
>> Best regards
>> Andre
>>
>>
>>

--
Best regards,
*André Rodrigues *
