Re: Moving data from huge table slow, min() query on indexed column taking 38s

From: Dimitrios Apostolou <jimis(at)gmx(dot)net>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Moving data from huge table slow, min() query on indexed column taking 38s
Date: 2023-07-21 07:20:37
Message-ID: 68257d84-96e5-9dca-ff31-c9f45e356543@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, 6 Jul 2023, Dimitrios Apostolou wrote:
> + Then I broke the command above in many small chunks
>
> WITH rows AS (
> DELETE FROM tbl_legacy AS t
> WHERE (partition_key_column < $1)
> RETURNING t.*
> ) INSERT INTO tbl SELECT * FROM rows;
> COMMIT;
>
> I increase the parameter $1 and keep going in a loop. At first this
> goes OK, after one day though I notice that it has slowed down
> significantly. My investigation shows that the simple query
>
> SELECT min(partition_key_column) from tbl_legacy;
>
> takes 38s, despite having an index on the column! A VACUUM fixes that,
> so I guess the index has a ton of dead tuples. I guess autovacuum does
> not do its job because the table is constantly busy.
>
> Unfortunately VACUUM takes long (several hours) on this huge table, so I
> can't add in the loop after the DELETE command.
>
> Is there a better way to avoid the bloat in the loop above? Why can't
> the DELETE command update the index by pointing the minimum element
> beyond the dead tuples?

Any comments on this one? It annoys me that a simple loop deteriorated so
much and kept filling the table with bloat. What is that VACUUM does that
DELETE can't do, to keep the index fresh?

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Joseph Koshakow 2023-07-21 14:30:09 aclitem binary encoding
Previous Message Les 2023-07-21 05:42:54 Re: invalid value for parameter "default_text_search_config": "public.pg"