Deleting takes days, should I add some index?

From: Alexander Farber <alexander(dot)farber(at)gmail(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Deleting takes days, should I add some index?
Date: 2020-11-27 14:37:44
Message-ID: CAADeyWjd5wsVznHXEvsVhLjDdo3=-WoLmDQk9n-1ZQgCPGceUg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello,

I am using PostgreSQL 10.15 on CentOS 7 with 64 GB RAM, Intel i7 6700 and I
have the following 2 tables there:

words_ru=> \d words_games
Table "public.words_games"
Column | Type | Collation | Nullable |
Default
----------+--------------------------+-----------+----------+------------------------------------------
gid | integer | | not null |
nextval('words_games_gid_seq'::regclass)
created | timestamp with time zone | | not null |
finished | timestamp with time zone | | |
player1 | integer | | not null |
player2 | integer | | |
played1 | timestamp with time zone | | |
played2 | timestamp with time zone | | |
state1 | text | | |
state2 | text | | |
reason | text | | |
hint1 | text | | |
hint2 | text | | |
score1 | integer | | not null |
score2 | integer | | not null |
chat1 | integer | | not null |
chat2 | integer | | not null |
hand1 | character(1)[] | | not null |
hand2 | character(1)[] | | not null |
pile | character(1)[] | | not null |
letters | character(1)[] | | not null |
values | integer[] | | not null |
bid | integer | | not null |
diff1 | integer | | |
diff2 | integer | | |
Indexes:
"words_games_pkey" PRIMARY KEY, btree (gid)
"words_games_player1_coalesce_idx" btree (player1, COALESCE(finished,
'infinity'::timestamp with time zone))
"words_games_player2_coalesce_idx" btree (player2, COALESCE(finished,
'infinity'::timestamp with time zone))
Check constraints:
"words_games_chat1_check" CHECK (chat1 >= 0)
"words_games_chat2_check" CHECK (chat2 >= 0)
"words_games_check" CHECK (player1 <> player2)
"words_games_score1_check" CHECK (score1 >= 0)
"words_games_score2_check" CHECK (score2 >= 0)
Foreign-key constraints:
"words_games_bid_fkey" FOREIGN KEY (bid) REFERENCES words_boards(bid)
ON DELETE CASCADE
"words_games_player1_fkey" FOREIGN KEY (player1) REFERENCES
words_users(uid) ON DELETE CASCADE
"words_games_player2_fkey" FOREIGN KEY (player2) REFERENCES
words_users(uid) ON DELETE CASCADE
Referenced by:
TABLE "words_chat" CONSTRAINT "words_chat_gid_fkey" FOREIGN KEY (gid)
REFERENCES words_games(gid) ON DELETE CASCADE
TABLE "words_moves" CONSTRAINT "words_moves_gid_fkey" FOREIGN KEY (gid)
REFERENCES words_games(gid) ON DELETE CASCADE
TABLE "words_scores" CONSTRAINT "words_scores_gid_fkey" FOREIGN KEY
(gid) REFERENCES words_games(gid) ON DELETE CASCADE

words_ru=> \d words_moves
Table "public.words_moves"
Column | Type | Collation | Nullable |
Default
---------+--------------------------+-----------+----------+------------------------------------------
mid | bigint | | not null |
nextval('words_moves_mid_seq'::regclass)
action | text | | not null |
gid | integer | | not null |
uid | integer | | not null |
played | timestamp with time zone | | not null |
tiles | jsonb | | |
score | integer | | |
str | text | | |
hand | text | | |
puzzle | boolean | | not null | false
letters | character(1)[] | | |
values | integer[] | | |
Indexes:
"words_moves_pkey" PRIMARY KEY, btree (mid)
"words_moves_gid_played_idx" btree (gid, played DESC)
"words_moves_puzzle_idx" btree (puzzle)
"words_moves_uid_action_played_idx" btree (uid, action, played)
"words_moves_uid_idx" btree (uid)
Check constraints:
"words_moves_score_check" CHECK (score >= 0)
Foreign-key constraints:
"words_moves_gid_fkey" FOREIGN KEY (gid) REFERENCES words_games(gid) ON
DELETE CASCADE
"words_moves_uid_fkey" FOREIGN KEY (uid) REFERENCES words_users(uid) ON
DELETE CASCADE
Referenced by:
TABLE "words_scores" CONSTRAINT "words_scores_mid_fkey" FOREIGN KEY
(mid) REFERENCES words_moves(mid) ON DELETE CASCADE

My word game is published since beginning of 2018 and I have that many
entries there:

words_ru=> select count(*) from words_games;
count
--------
155585
(1 row)

words_ru=> select count(*) from words_moves;
count
---------
5429162
(1 row)

However I was not saving some important data in the 1st months of 2018, so
I would like to delete those old games:

words_ru=> select count(*) from words_games where finished < '2018-06-01';
count
-------
6223
(1 row)

words_ru=> select count(*) from words_moves where played < '2018-06-01';
count
--------
196319
(1 row)

My problem is - it takes days (I run my command using "screen").

So I ctrl-c (surprisingly not a single record was deleted; I was expecting
at least some to be gone) and then do it one by one month ( delete from
words_games where finished < '2018-01-01' and so on).

And it still takes days :-)

Since I gradually get more users in my game and I will probably have to run
similar tasks in future, I would like to learn if there is some trick for
faster deletion here?

Should I add some index maybe?

words_ru=> EXPLAIN delete from words_games where finished < '2018-06-01';
QUERY PLAN
---------------------------------------------------------------------------------
Delete on words_games (cost=0.00..39991.29 rows=7375 width=6)
-> Seq Scan on words_games (cost=0.00..39991.29 rows=7375 width=6)
Filter: (finished < '2018-06-01 00:00:00+02'::timestamp with time
zone)
(3 rows)

Thank you
Alex

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Gustavsson Mikael 2020-11-27 14:48:28 SV: Problem with pg_notify / listen
Previous Message Zwettler Markus (OIZ) 2020-11-27 14:36:28 AW: How to debug authentication issues in Postgres