Re: Table copy with SERIALIZABLE is incredibly slow

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: peter plachta <pplachta(at)gmail(dot)com>, "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Table copy with SERIALIZABLE is incredibly slow
Date: 2023-07-31 06:30:39
Message-ID: b8e6ebbdf1f89cefe06577f9e864e132ff00a99c.camel@cybertec.at
Lists: pgsql-performance

On Sun, 2023-07-30 at 23:00 -0600, peter plachta wrote:
> Background: we're trying pg_repack-like functionality to compact a 500Gb table (with
> two 145Gb indexes) from which we deleted 80% of the rows. Offline is not an option.
> The table has a moderate (let's say 100QPS) insert/delete workload running.
>
> The typical procedure for this type of thing is basically CDC:
>
> 1. create 'log' table/create trigger
> 2. under SERIALIZABLE: INSERT INTO new_table SELECT * FROM current_table
>
> What we're finding is that for the first 30 minutes the rate is 10Gb/s, then it drops to
> 1Mb/s and stays there.... 22 hours later the copy is still going, and by now the log
> table is so huge that we know the replay will also take a very long time.
>
> ===
>
> Q: what are some ways in which we could optimize the copy?
>
> Btw this is Postgres 9.6
>
> (we tried unlogged table (that did nothing), we tried creating indexes after
> (that helped), we're experimenting with RRI)
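
[Editor's note: the two-step CDC procedure described above could be sketched roughly as
follows. All table, trigger, and function names are hypothetical illustrations, not the
poster's actual code.]

```sql
-- 1. Log table and trigger to capture concurrent inserts/deletes (CDC).
CREATE TABLE big_table_log (
    op      char(1)   NOT NULL,   -- 'I' = insert, 'D' = delete
    row_data big_table            -- whole-row copy (composite type)
);

CREATE FUNCTION capture_changes() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO big_table_log VALUES ('I', NEW);
    ELSIF TG_OP = 'DELETE' THEN
        INSERT INTO big_table_log VALUES ('D', OLD);
    END IF;
    RETURN NULL;  -- return value of an AFTER trigger is ignored
END;
$$;

CREATE TRIGGER big_table_cdc
AFTER INSERT OR DELETE ON big_table
FOR EACH ROW EXECUTE PROCEDURE capture_changes();  -- 9.6 spelling

-- 2. Bulk copy under SERIALIZABLE; the log table is replayed afterwards.
BEGIN ISOLATION LEVEL SERIALIZABLE;
INSERT INTO new_table SELECT * FROM big_table;
COMMIT;
```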

Why are you doing this the hard way, when pg_squeeze or pg_repack could do it?

You definitely should not be using PostgreSQL 9.6 at this time.

Yours,
Laurenz Albe
