From: | Roman Kushnir <roman(dot)kushnir(at)ad2games(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Slow join |
Date: | 2018-06-25 15:55:49 |
Message-ID: | 0B4D190F-CC48-4763-918D-4D4A8DAE5B0B@ad2games.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hello,
The following basic inner join is taking too much time for me. (I’m using count(videos.id <http://videos.id/>) instead of count(*) because my actual query looks different, but I simplified it here to the essence).
I’ve tried following random people's suggestions and adjusting the random_page_cost(decreasing it from 4 to 1.1) without a stable improvement. Any hints on what is wrong here? Thank you.
The query
SELECT COUNT(videos.id) FROM videos JOIN accounts ON accounts.channel_id = videos.channel_id;
The accounts table has 744 rows, videos table has 2.2M rows, the join produces 135k rows.
Running on Amazon RDS, with default 10.1 parameters
version
---------------------------------------------------------------------------------------------------------
PostgreSQL 10.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16), 64-bit
Execution plan https://explain.depesz.com/s/gf7 <https://explain.depesz.com/s/gf7>
Structure and statistics of the tables involved
=> \d videos
Table "public.videos"
Column | Type | Collation | Nullable | Default
------------------------+-----------------------------+-----------+----------+---------------------------------------------------
id | bigint | | not null | nextval('videos_id_seq'::regclass)
vendor_id | character varying | | not null |
channel_id | bigint | | |
published_at | timestamp without time zone | | |
title | text | | |
description | text | | |
thumbnails | jsonb | | |
tags | character varying[] | | |
category_id | character varying | | |
default_language | character varying | | |
default_audio_language | character varying | | |
duration | integer | | |
stereoscopic | boolean | | |
hd | boolean | | |
captioned | boolean | | |
licensed | boolean | | |
projection | character varying | | |
privacy_status | character varying | | |
license | character varying | | |
embeddable | boolean | | |
terminated_at | timestamp without time zone | | |
created_at | timestamp without time zone | | not null |
updated_at | timestamp without time zone | | not null |
featured_game_id | bigint | | |
Indexes:
"videos_pkey" PRIMARY KEY, btree (id)
"index_videos_on_vendor_id" UNIQUE, btree (vendor_id)
"index_videos_on_channel_id" btree (channel_id)
"index_videos_on_featured_game_id" btree (featured_game_id)
Foreign-key constraints:
"fk_rails_257f68ae55" FOREIGN KEY (channel_id) REFERENCES channels(id)
"fk_rails_ce1b3e10b0" FOREIGN KEY (featured_game_id) REFERENCES games(id)
Referenced by:
TABLE "video_fetch_statuses" CONSTRAINT "fk_rails_3bfdf013b8" FOREIGN KEY (video_id) REFERENCES videos(id)
TABLE "video_daily_facts" CONSTRAINT "fk_rails_dc0eca9ebb" FOREIGN KEY (video_id) REFERENCES videos(id)
=> SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE relname='videos’;
relname | relpages | reltuples | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size
-----------------------+----------+-------------+---------------+---------+----------+----------------+------------+---------------
videos | 471495 | 2.25694e+06 | 471389 | r | 24 | f | | 4447764480
=> SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist FROM pg_stats WHERE attname='channel_id' AND tablename='videos' ORDER BY 1 DESC;
frac_mcv | tablename | attname | n_distinct | n_mcv | n_hist
----------+-----------------------+----------------+------------+-------+--------
0.1704 | videos | channel_id | 1915 | 100 | 101
=> \d accounts
Table "public.accounts"
Column | Type | Collation | Nullable | Default
----------------+-----------------------------+-----------+----------+--------------------------------------------------
id | bigint | | not null | nextval('accounts_id_seq'::regclass)
channel_id | bigint | | not null |
refresh_token | character varying | | not null |
created_at | timestamp without time zone | | not null |
updated_at | timestamp without time zone | | not null |
Indexes:
"accounts_pkey" PRIMARY KEY, btree (id)
"index_accounts_on_channel_id" UNIQUE, btree (channel_id)
"index_accounts_on_refresh_token" UNIQUE, btree (refresh_token)
Foreign-key constraints:
"fk_rails_11d6d9bea2" FOREIGN KEY (channel_id) REFERENCES channels(id)
=> SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE relname='accounts’;
relname | relpages | reltuples | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size
----------------------+----------+-----------+---------------+---------+----------+----------------+------------+---------------
accounts | 23 | 744 | 23 | r | 5 | f | | 229376
=> SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist FROM pg_stats WHERE attname='channel_id' AND tablename='accounts' ORDER BY 1 DESC;
frac_mcv | tablename | attname | n_distinct | n_mcv | n_hist
----------+----------------------+----------------+------------+-------+--------
| accounts | channel_id | -1 | | 101
From | Date | Subject | |
---|---|---|---|
Next Message | Justin Pryzby | 2018-06-25 16:45:22 | Re: Slow join |
Previous Message | Amit Kapila | 2018-06-22 09:23:36 | Re: add default parallel query to v10 release notes? (Re: [PERFORM] performance drop after upgrade (9.6 > 10)) |