Not sure which part of the query needs optimization

From: Alexander Farber <alexander(dot)farber(at)gmail(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Not sure which part of the query needs optimization
Date: 2019-01-07 13:50:23
Message-ID: CAADeyWgwq2ybD-j81wqwrkjdHL6C8QaWMa+FNuqCAQbUNESqRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Good afternoon,

for each visitor of my website I generate a JSON list of 30 top players (
https://slova.de/words/top.php ), who played in the past week, with their
average scores and average time between moves.

With 5 seconds this query is taking quite a bit of time:
https://explain.depesz.com/s/wMMV

I have noticed the

Seq Scan on words_moves (cost=0.00..81,448.79 rows=1,747 width=4)
(actual time=0.443..161.673 rows=15,009 loops=30)
Filter: (uid = u.uid)
Rows Removed by Filter: 1494728

and added a:

CREATE INDEX ON words_moves(uid);

which has improved the query time to 800ms:
https://explain.depesz.com/s/xgv1

However now I am not sure anymore, what else could be improved in my query:

WITH last_week_moves AS (
SELECT
m.gid,
m.uid,
m.played - LAG(m.played) OVER(PARTITION BY
m.gid ORDER BY played) AS diff
FROM words_moves m
JOIN words_games g ON (m.gid = g.gid AND m.uid in
(g.player1, g.player2))
-- only show players who where active in the last
week
WHERE m.played > CURRENT_TIMESTAMP - interval '1
week'
)
SELECT
u.uid,
u.elo,
(SELECT TO_CHAR(AVG(diff), 'HH24:MI') FROM
last_week_moves WHERE uid = u.uid) AS avg_time,
(SELECT ROUND(AVG(score), 1) FROM words_moves WHERE
uid = u.uid) AS avg_score,
s.given,
s.photo
FROM words_users u
JOIN words_social s USING (uid)
WHERE u.elo > 1500
-- take the most recent record from words_social
AND NOT EXISTS (SELECT 1
FROM words_social x
WHERE s.uid = x.uid
AND x.stamp > s.stamp)
-- only show players who where active in the last week
AND EXISTS (SELECT 1
FROM last_week_moves m
WHERE u.uid = m.uid
AND m.diff IS NOT NULL)
ORDER BY u.elo DESC
LIMIT 30;

I have tried changing to LEFT JOIN LATERAL as in -

WITH last_week_moves AS (
SELECT
m.gid,
m.uid,
m.played - LAG(m.played) OVER(PARTITION BY
m.gid ORDER BY played) AS diff
FROM words_moves m
JOIN words_games g ON (m.gid = g.gid AND m.uid in
(g.player1, g.player2))
-- only show players who where active in the last
week
WHERE m.played > CURRENT_TIMESTAMP - interval '1
week'
)
SELECT
u.uid,
u.elo,
(SELECT TO_CHAR(AVG(diff), 'HH24:MI') FROM
last_week_moves WHERE uid = u.uid) AS avg_time,
(SELECT ROUND(AVG(score), 1) FROM words_moves WHERE
uid = u.uid) AS avg_score,
s.given,
s.photo
FROM words_users u
-- take the most recent record from words_social
LEFT JOIN LATERAL (SELECT * FROM words_social AS s WHERE
s.uid = u.uid ORDER BY s.stamp DESC LIMIT 1) AS s ON TRUE
WHERE u.elo > 1500
-- only show players who where active in the last week
AND EXISTS (SELECT 1
FROM last_week_moves m
WHERE u.uid = m.uid
AND m.diff IS NOT NULL)
ORDER BY u.elo DESC
LIMIT 30;

But this has not helped much (please see https://explain.depesz.com/s/PrF
or the same output below) :

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=89502.34..272686.22 rows=30 width=160) (actual
time=215.796..842.446 rows=30 loops=1)
CTE last_week_moves
-> WindowAgg (cost=87308.08..87316.34 rows=367 width=32) (actual
time=183.075..193.613 rows=33221 loops=1)
-> Sort (cost=87308.08..87309.00 rows=367 width=16) (actual
time=183.069..184.359 rows=33221 loops=1)
Sort Key: m_1.gid, m_1.played
Sort Method: quicksort Memory: 4132kB
-> Gather (cost=13632.94..87292.45 rows=367 width=16)
(actual time=37.204..172.827 rows=33221 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Join (cost=12632.94..86255.75 rows=153
width=16) (actual time=36.183..168.676 rows=11074 loops=3)
Hash Cond: (m_1.gid = g.gid)
Join Filter: ((m_1.uid = g.player1) OR
(m_1.uid = g.player2))
-> Parallel Seq Scan on words_moves m_1
(cost=0.00..73600.05 rows=8666 width=16) (actual time=0.761..130.844
rows=11074 loops=3)
Filter: (played > (CURRENT_TIMESTAMP -
'7 days'::interval))
Rows Removed by Filter: 492241
-> Hash (cost=12007.42..12007.42 rows=50042
width=12) (actual time=35.236..35.236 rows=50044 loops=3)
Buckets: 65536 Batches: 1 Memory
Usage: 2663kB
-> Seq Scan on words_games g
(cost=0.00..12007.42 rows=50042 width=12) (actual time=0.007..28.981
rows=50044 loops=3)
-> Result (cost=2186.01..1278367.01 rows=209 width=160) (actual
time=215.795..842.425 rows=30 loops=1)
-> Sort (cost=2186.01..2186.53 rows=209 width=96) (actual
time=205.069..205.087 rows=30 loops=1)
Sort Key: u.elo DESC
Sort Method: top-N heapsort Memory: 32kB
-> Nested Loop Left Join (cost=12.19..2179.83 rows=209
width=96) (actual time=203.009..205.040 rows=110 loops=1)
-> Hash Semi Join (cost=11.90..440.43 rows=209
width=8) (actual time=202.984..204.451 rows=110 loops=1)
Hash Cond: (u.uid = m.uid)
-> Seq Scan on words_users u
(cost=0.00..415.96 rows=418 width=8) (actual time=0.005..1.422 rows=418
loops=1)
Filter: (elo > 1500)
Rows Removed by Filter: 10139
-> Hash (cost=7.34..7.34 rows=365 width=4)
(actual time=202.975..202.975 rows=31549 loops=1)
Buckets: 32768 (originally 1024) Batches:
1 (originally 1) Memory Usage: 1366kB
-> CTE Scan on last_week_moves m
(cost=0.00..7.34 rows=365 width=4) (actual time=183.080..200.197
rows=31549 loops=1)
Filter: (diff IS NOT NULL)
Rows Removed by Filter: 1672
-> Limit (cost=0.29..8.30 rows=1 width=180) (actual
time=0.005..0.005 rows=1 loops=110)
-> Index Scan using words_social_uid_stamp_idx
on words_social s (cost=0.29..8.30 rows=1 width=180) (actual
time=0.005..0.005 rows=1 loops=110)
Index Cond: (uid = u.uid)
SubPlan 2
-> Aggregate (cost=8.27..8.28 rows=1 width=32) (actual
time=1.838..1.838 rows=1 loops=30)
-> CTE Scan on last_week_moves (cost=0.00..8.26 rows=2
width=16) (actual time=0.176..1.775 rows=403 loops=30)
Filter: (uid = u.uid)
Rows Removed by Filter: 32818
SubPlan 3
-> Aggregate (cost=6097.83..6097.84 rows=1 width=32) (actual
time=19.401..19.401 rows=1 loops=30)
-> Bitmap Heap Scan on words_moves (cost=33.97..6093.45
rows=1748 width=4) (actual time=1.680..18.153 rows=15011 loops=30)
Recheck Cond: (uid = u.uid)
Heap Blocks: exact=216312
-> Bitmap Index Scan on words_moves_uid_idx
(cost=0.00..33.54 rows=1748 width=0) (actual time=0.979..0.979 rows=15011
loops=30)
Index Cond: (uid = u.uid)
Planning time: 0.508 ms
Execution time: 843.322 ms
(50 rows)

Does anybody please have a suggestion, what else could be improved?

Below are the tables from the above query and I use PostgreSQL 10.6 -

words=> \d words_moves
Table "public.words_moves"
Column | Type | Collation | Nullable |
Default
---------+--------------------------+-----------+----------+------------------------------------------
mid | bigint | | not null |
nextval('words_moves_mid_seq'::regclass)
action | text | | not null |
gid | integer | | not null |
uid | integer | | not null |
played | timestamp with time zone | | not null |
tiles | jsonb | | |
score | integer | | |
letters | text | | |
hand | text | | |
puzzle | boolean | | not null | false
Indexes:
"words_moves_pkey" PRIMARY KEY, btree (mid)
"words_moves_gid_played_idx" btree (gid, played DESC)
"words_moves_uid_idx" btree (uid)
Check constraints:
"words_moves_score_check" CHECK (score >= 0)
Foreign-key constraints:
"words_moves_gid_fkey" FOREIGN KEY (gid) REFERENCES words_games(gid) ON
DELETE CASCADE
"words_moves_uid_fkey" FOREIGN KEY (uid) REFERENCES words_users(uid) ON
DELETE CASCADE
Referenced by:
TABLE "words_scores" CONSTRAINT "words_scores_mid_fkey" FOREIGN KEY
(mid) REFERENCES words_moves(mid) ON DELETE CASCADE

words=> \d words_users
Table "public.words_users"
Column | Type | Collation | Nullable |
Default
---------------+--------------------------+-----------+----------+------------------------------------------
uid | integer | | not null |
nextval('words_users_uid_seq'::regclass)
created | timestamp with time zone | | not null |
visited | timestamp with time zone | | not null |
ip | inet | | not null |
fcm | text | | |
apns | text | | |
adm | text | | |
motto | text | | |
vip_until | timestamp with time zone | | |
grand_until | timestamp with time zone | | |
banned_until | timestamp with time zone | | |
banned_reason | text | | |
elo | integer | | not null |
medals | integer | | not null |
coins | integer | | not null |
Indexes:
"words_users_pkey" PRIMARY KEY, btree (uid)
Check constraints:
"words_users_banned_reason_check" CHECK (length(banned_reason) > 0)
"words_users_elo_check" CHECK (elo >= 0)
"words_users_medals_check" CHECK (medals >= 0)
Referenced by:
TABLE "words_chat" CONSTRAINT "words_chat_uid_fkey" FOREIGN KEY (uid)
REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_games" CONSTRAINT "words_games_player1_fkey" FOREIGN KEY
(player1) REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_games" CONSTRAINT "words_games_player2_fkey" FOREIGN KEY
(player2) REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_moves" CONSTRAINT "words_moves_uid_fkey" FOREIGN KEY (uid)
REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_reviews" CONSTRAINT "words_reviews_author_fkey" FOREIGN
KEY (author) REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_reviews" CONSTRAINT "words_reviews_uid_fkey" FOREIGN KEY
(uid) REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_scores" CONSTRAINT "words_scores_uid_fkey" FOREIGN KEY
(uid) REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_social" CONSTRAINT "words_social_uid_fkey" FOREIGN KEY
(uid) REFERENCES words_users(uid) ON DELETE CASCADE
TABLE "words_stats" CONSTRAINT "words_stats_uid_fkey" FOREIGN KEY (uid)
REFERENCES words_users(uid) ON DELETE CASCADE

words=> \d words_games
Table "public.words_games"
Column | Type | Collation | Nullable |
Default
----------+--------------------------+-----------+----------+------------------------------------------
gid | integer | | not null |
nextval('words_games_gid_seq'::regclass)
created | timestamp with time zone | | not null |
finished | timestamp with time zone | | |
player1 | integer | | not null |
player2 | integer | | |
played1 | timestamp with time zone | | |
played2 | timestamp with time zone | | |
state1 | text | | |
state2 | text | | |
reason | text | | |
hint1 | text | | |
hint2 | text | | |
score1 | integer | | not null |
score2 | integer | | not null |
chat1 | integer | | not null |
chat2 | integer | | not null |
hand1 | character(1)[] | | not null |
hand2 | character(1)[] | | not null |
pile | character(1)[] | | not null |
letters | character(1)[] | | not null |
values | integer[] | | not null |
bid | integer | | not null |
friendly | boolean | | |
Indexes:
"words_games_pkey" PRIMARY KEY, btree (gid)
"words_games_player1_coalesce_idx" btree (player1, COALESCE(finished,
'infinity'::timestamp with time zone))
"words_games_player2_coalesce_idx" btree (player2, COALESCE(finished,
'infinity'::timestamp with time zone))
Check constraints:
"words_games_chat1_check" CHECK (chat1 >= 0)
"words_games_chat2_check" CHECK (chat2 >= 0)
"words_games_check" CHECK (player1 <> player2)
"words_games_score1_check" CHECK (score1 >= 0)
"words_games_score2_check" CHECK (score2 >= 0)
Foreign-key constraints:
"words_games_bid_fkey" FOREIGN KEY (bid) REFERENCES words_boards(bid)
ON DELETE CASCADE
"words_games_player1_fkey" FOREIGN KEY (player1) REFERENCES
words_users(uid) ON DELETE CASCADE
"words_games_player2_fkey" FOREIGN KEY (player2) REFERENCES
words_users(uid) ON DELETE CASCADE
Referenced by:
TABLE "words_chat" CONSTRAINT "words_chat_gid_fkey" FOREIGN KEY (gid)
REFERENCES words_games(gid) ON DELETE CASCADE
TABLE "words_moves" CONSTRAINT "words_moves_gid_fkey" FOREIGN KEY (gid)
REFERENCES words_games(gid) ON DELETE CASCADE
TABLE "words_scores" CONSTRAINT "words_scores_gid_fkey" FOREIGN KEY
(gid) REFERENCES words_games(gid) ON DELETE CASCADE

Thank you
Alex

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alexander Farber 2019-01-07 13:56:47 Re: Adding LEFT JOIN to a query has increased execution time 10 times
Previous Message Ron 2019-01-07 13:49:16 Re: Which queries have run query trace form .exe.