Join between 2 tables always executes a sequential scan on the larger table

From: Dieter Rehbein <dieter(dot)rehbein(at)skiline(dot)cc>
To: pgsql-performance(at)postgresql(dot)org
Subject: Join between 2 tables always executes a sequential scan on the larger table
Date: 2013-04-02 08:52:01
Message-ID: D2BFB0D2-5EB3-4921-8B0B-C44402C9C03B@skiline.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi everybody,

in a project I have a performance problem, which I (and my colleagues) don't understand. It's a simple join between 2 of 3 tables:

table-1: user (id, user_name, ...). This table has about 1 million rows (999673 rows)
table-2: competition (57 rows)
table-3: user_2_competition. A relation between user and competition. This table has about 100.000 rows

The query is a join between table user_2_competition and user and looks like this:

select u.id, u.user_name
from user_2_competition uc
left join "user" u on u.id = uc.user_id
where uc.competition_id = '3cc1cb9b3ac132ad013ad01316040001'

The query returns the ID and user_name of all users participating in a competition.

What I don't understand: This query executes a sequential scan on user!

The tables have the following indexes:

user_2_competition: there is an index on user_id and an index on competition_id (competition_id is a VARCHAR(32) containing UUIDs)
user: id is the primary key and has therefore a unique index (the ID is a VARCHAR(32), which contains UUIDs).

The database has just been restored from a backup, I've executed ANALYZE for both tables.

The output of explain analyze (Postgres 9.2.3):

Hash Left Join (cost=111357.64..126222.29 rows=41396 width=42) (actual time=1982.543..2737.331 rows=41333 loops=1)
Hash Cond: ((uc.user_id)::text = (u.id)::text)
-> Seq Scan on user_2_competition uc (cost=0.00..4705.21 rows=41396 width=33) (actual time=0.019..89.691 rows=41333 loops=1)
Filter: ((competition_id)::text = '3cc1cb9b3ac132ad013ad01316040001'::text)
Rows Removed by Filter: 80684
-> Hash (cost=90074.73..90074.73 rows=999673 width=42) (actual time=1977.604..1977.604 rows=999673 loops=1)
Buckets: 2048 Batches: 128 Memory Usage: 589kB
-> Seq Scan on "user" u (cost=0.00..90074.73 rows=999673 width=42) (actual time=0.004..1178.827 rows=999673 loops=1)
Total runtime: 2740.723 ms

I expected to see an index-scan on user_2_competition with a hash join to user, not a sequential scan on user. I've tried this with Postgres 9.1 and 9.2.3).

Any ideas, what's going on here?

With EXPLAIN ANALYZE I can see, which query plan Postgres is using. Is there any way to find out, WHY postgres uses this query plan?

best regards
Dieter

----------------------------------------------------

The full table schema:

CREATE TABLE user_2_competition
(
id varchar(32) NOT NULL,
version int4 NOT NULL DEFAULT 0,
conditions_confirm_ip varchar(30),
created_date timestamp NOT NULL DEFAULT now(),
deleted bool NOT NULL DEFAULT false,
last_visit timestamp,
resort_id int4,
role varchar(255),
caid int4 NOT NULL,
ponr int4 NOT NULL,
ktka int4 NOT NULL,
lfnr int4 NOT NULL,
total_visits int8 NOT NULL DEFAULT 0,
verified bool NOT NULL,
competition_id varchar(32),
user_id varchar(32),
competition_terms int4 NOT NULL DEFAULT (-1),
disqualified bool NOT NULL DEFAULT false,
registration_key_id int4,

PRIMARY KEY(id)
);

-- Indexes ------------------------------------------------------------
CREATE INDEX IDX_USER_ID ON user_2_competition USING btree (user_id);
CREATE INDEX idx_user_2_competition_competition ON user_2_competition USING btree (competition_id);
CREATE UNIQUE INDEX user_2_competition_user_id_competition_id_key ON user_2_competition USING btree (user_id, competition_id);

-- Foreign key constraints -------------------------------------------
ALTER TABLE user_2_competition
ADD CONSTRAINT fk_user_competition_competition_group
FOREIGN KEY (competition_id) REFERENCES competition (id) ON DELETE CASCADE;
ALTER TABLE user_2_competition
ADD CONSTRAINT fk_user_2_competition_registration_key
FOREIGN KEY (registration_key_id) REFERENCES competition_registration_key (id);
ALTER TABLE user_2_competition
ADD CONSTRAINT fk_user_competition_terms
FOREIGN KEY (competition_terms) REFERENCES competition_terms (id);
ALTER TABLE user_2_competition
ADD CONSTRAINT fk_user_competition_user
FOREIGN KEY (user_id) REFERENCES user (id) ON DELETE CASCADE;

-----------------

CREATE TABLE competition
(
id varchar(32) NOT NULL,
version int4 NOT NULL DEFAULT 0,
created_by varchar(255),
created_date timestamp,
modified_by varchar(255),
modified_date timestamp,
deleted bool NOT NULL DEFAULT false,
active bool NOT NULL DEFAULT false,
average_score float8,
start_time timestamp NOT NULL,
end_time timestamp NOT NULL,
info_layout varchar(200),
list_layout varchar(200),
lead_action varchar(100),
ranking_layout varchar(200),
external_url varchar(255),
forum_enabled bool NOT NULL DEFAULT false,
has_ski_movies bool NOT NULL DEFAULT false,
link_name varchar(50) NOT NULL,
participation_type varchar(255) NOT NULL,
sponsor varchar(100),
custom_style bool NOT NULL DEFAULT true,
bg_color varchar(7),
tab_style varchar(20),
background_image_preview_upload_date timestamp,
background_image_upload_date timestamp,
sponsor_logo_upload_date timestamp,
name int4 NOT NULL,
short_name int4 NOT NULL,
description int4 NOT NULL,
teaser int4 NOT NULL,
tags varchar(1000),
logo_resort_id int4,
visible bool NOT NULL DEFAULT true,
time_zone_id varchar(32) NOT NULL DEFAULT 'Europe/Vienna'::character varying,
css_styles varchar(2000),
teaser_popup int4 NOT NULL DEFAULT (-1),
winner_tab int4 NOT NULL DEFAULT (-1),
reminder_email int4 NOT NULL DEFAULT (-1),
reminder_email_subject int4 NOT NULL DEFAULT (-1),
priority int4 NOT NULL DEFAULT 5,
instance_selector_class_name varchar(200),
external_sponsor_logo_upload_date timestamp,
customer_id varchar(10),
restricted_registration bool NOT NULL DEFAULT false,

PRIMARY KEY(id)
);

-- Indexes ------------------------------------------------------------
CREATE UNIQUE INDEX idx_competition_link_name ON competition USING btree (link_name);

-- Foreign key constraints -------------------------------------------
ALTER TABLE competition
ADD CONSTRAINT fk_competition_description
FOREIGN KEY (description) REFERENCES localized_text (id);
ALTER TABLE competition
ADD CONSTRAINT fk_competition_name
FOREIGN KEY (name) REFERENCES localized_text (id);
ALTER TABLE competition
ADD CONSTRAINT fk_competition_reminder_email
FOREIGN KEY (reminder_email) REFERENCES localized_text (id);
ALTER TABLE competition
ADD CONSTRAINT fk_competition_reminder_subject
FOREIGN KEY (reminder_email_subject) REFERENCES localized_text (id);
ALTER TABLE competition
ADD CONSTRAINT fk_competition_short_name
FOREIGN KEY (short_name) REFERENCES localized_text (id);
ALTER TABLE competition
ADD CONSTRAINT fk_competition_teaser
FOREIGN KEY (teaser) REFERENCES localized_text (id);
ALTER TABLE competition
ADD CONSTRAINT fk_competition_teaser_popup
FOREIGN KEY (teaser_popup) REFERENCES localized_text (id);
ALTER TABLE competition
ADD CONSTRAINT fk_competition_winner_tab
FOREIGN KEY (winner_tab) REFERENCES localized_text (id);

CREATE TABLE user
(
id varchar(32) NOT NULL,
version int4 NOT NULL DEFAULT 0,
deleted bool NOT NULL DEFAULT false,
about_me varchar(8000),
birth_date date,
communicated_to_ticket_corner timestamp,
conditions_confirm_date timestamp,
email varchar(125) NOT NULL,
fname varchar(50) NOT NULL,
gender varchar(10),
lname varchar(50) NOT NULL,
old_skiline_id int4,
photo_upload_date timestamp,
news_letter bool NOT NULL DEFAULT true,
newsfeed_notification varchar(20),
preferred_language varchar(16),
privacy_address varchar(10) NOT NULL DEFAULT 'FRIENDS'::character varying,
privacy_basic_data varchar(10) NOT NULL DEFAULT 'FRIENDS'::character varying,
privacy_community_accounts varchar(10) NOT NULL DEFAULT 'FRIENDS'::character varying,
privacy_email varchar(10) NOT NULL DEFAULT 'FRIENDS'::character varying,
privacy_fitness_profile varchar(10) NOT NULL DEFAULT 'FRIENDS'::character varying,
privacy_phone_numbers varchar(10) NOT NULL DEFAULT 'FRIENDS'::character varying,
privacy_race_profile varchar(10) NOT NULL DEFAULT 'FRIENDS'::character varying,
privacy_rankings_user_name varchar(10) DEFAULT 'NO_DISPLAY'::character varying,
search_email varchar(125),
search_name varchar(110),
status_points int4 NOT NULL DEFAULT 0,
ticket_corner_id int4,
user_name varchar(50) NOT NULL,
user_name_deleted varchar(50),
address varchar(32) NOT NULL,
current_fitness_profile varchar(32),
race_profile varchar(32) NOT NULL,
custom1 varchar(255),
custom2 varchar(255),
custom3 varchar(255),
magento_customer_id int4,
created_by varchar(255),
created_date timestamp,
modified_by varchar(255),
modified_date timestamp,
newsfeed varchar(32),
birth_day int4,
estimated_gender varchar(10),
current_season_statistics int4 NOT NULL DEFAULT (-1),
statistic_competition_count int4 NOT NULL DEFAULT 0,
statistic_friend_count int4 NOT NULL DEFAULT 0,
statistic_group_count int4 NOT NULL DEFAULT 0,
statistic_skimovie_count_friends int4 NOT NULL DEFAULT 0,
statistic_skimovie_count_public int4 NOT NULL DEFAULT 0,
statistic_skimovie_count_all int4 NOT NULL DEFAULT 0,
statistic_photo_count_public int4 NOT NULL DEFAULT 0,
statistic_photo_count_friends int4 NOT NULL DEFAULT 0,
statistic_photo_count_all int4 NOT NULL DEFAULT 0,
privacy_calendar varchar(10) DEFAULT 'FRIENDS'::character varying,
security_info_id varchar(32),
statistic_skiing_days int4 NOT NULL DEFAULT 0,
statistic_vertical_meters int4 NOT NULL DEFAULT 0,
conditions_confirm_ip varchar(30),
doi_click_ip varchar(30),
staff bool,
origin varchar(32),
disqualified bool,
statistic_badge_count int4 NOT NULL DEFAULT 0,
time_zone_id varchar(32),
old_email varchar(125),
handicap float4,
previous_handicap float4,
handicap_calculation_time timestamp,
last_skiing_day date,
admin_disqualification bool,
admin_disqualification_top100 bool,

PRIMARY KEY(id)
);

-- Indexes ------------------------------------------------------------
CREATE INDEX idx_user_birthdate ON user USING btree (birth_day);
CREATE INDEX idx_user_created_date ON user USING btree (created_date);
CREATE INDEX idx_user_email ON user USING btree (email);
CREATE INDEX idx_user_magento_customer_id ON user USING btree (magento_customer_id);
CREATE INDEX idx_usr_modified_date ON user USING btree (modified_date);
CREATE UNIQUE INDEX user_address_key ON user USING btree (address);
CREATE UNIQUE INDEX user_race_profile_key ON user USING btree (race_profile);
CREATE UNIQUE INDEX user_ticket_corner_id_key ON user USING btree (ticket_corner_id);
CREATE UNIQUE INDEX user_user_name_key ON user USING btree (user_name);

-- Foreign key constraints -------------------------------------------
ALTER TABLE user
ADD CONSTRAINT fk_user_adress
FOREIGN KEY (address) REFERENCES address (id) ON DELETE CASCADE;
ALTER TABLE user
ADD CONSTRAINT fk36ebcbd93f2254
FOREIGN KEY (current_fitness_profile) REFERENCES fitness_profile (id);
ALTER TABLE user
ADD CONSTRAINT fk_user_newsfeed
FOREIGN KEY (newsfeed) REFERENCES newsfeed (id);
ALTER TABLE user
ADD CONSTRAINT fk36ebcbd70f10c
FOREIGN KEY (race_profile) REFERENCES race_profile (id);
ALTER TABLE user
ADD CONSTRAINT fk_user_sec_info
FOREIGN KEY (security_info_id) REFERENCES security_info (id) ON DELETE CASCADE;
ALTER TABLE user
ADD CONSTRAINT fk_user_statistics_current_season
FOREIGN KEY (current_season_statistics) REFERENCES user_season_statistics (id);

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Armand du Plessis 2013-04-02 09:55:14 Re: Problems with pg_locks explosion
Previous Message Mark Kirkwood 2013-04-02 08:47:21 Re: Postgres upgrade, security release, where?