Postgres using the wrong index index

From: Matt Dupree <matt(dot)dupree(at)heap(dot)io>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Postgres using the wrong index index
Date: 2021-08-10 16:47:20
Message-ID: CAMOk8kqSPr2sTihQ56Va07TeFRA8E1GsEUfwe4t4LSmj5pjd4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I've created a partial index that I expect the query planner to use in
executing a query, but it's using another index instead. Using this other
partial index results in a slower query. I'd really appreciate some help
understanding why this is occurring. Thanks in advance!

*Postgres Version*

PostgreSQL 12.7 (Ubuntu 12.7-1.pgdg20.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit

*Problem Description*

Here's the index I expect the planner to use:

CREATE INDEX other_events_1004175222_pim_evdef_67951aef14bc_idx ON
public.other_events_1004175222 USING btree ("time", user_id) WHERE (
(user_id <= '(1080212440,9007199254740991)'::app_user_id) AND
(user_id >= '(1080212440,0)'::app_user_id) AND
(
(
(type = 'click'::text) AND (library = 'web'::text) AND
(strpos(hierarchy, '#close_onborading;'::text)
<> 0) AND (object IS NULL)
) OR
(
(type = 'click'::text) AND (library = 'web'::text) AND
(strpos(hierarchy,
'#proceedOnboarding;'::text) <> 0) AND (object IS NULL)
)
)
);

Here's the query:

EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT user_id,
"time",
0 AS event,
session_id
FROM test_yasp_events_exp_1004175222
WHERE ((test_yasp_events_exp_1004175222.user_id >=
'(1080212440,0)'::app_user_id) AND
(test_yasp_events_exp_1004175222.user_id <=
'(1080212440,9007199254740991)'::app_user_id) AND
("time" >=
'1624777200000'::bigint) AND
("time" <=
'1627369200000'::bigint) AND (
(
(type = 'click'::text) AND
(library = 'web'::text) AND
(strpos(hierarchy, '#close_onborading;'::text) <>
0) AND
(object IS NULL)) OR
(
(type = 'click'::text) AND
(library = 'web'::text) AND
(strpos(hierarchy,
'#proceedOnboarding;'::text) <>
0) AND (object IS NULL))))

Here's the plan: https://explain.depesz.com/s/uNGg

Note that the index being used is
other_events_1004175222_pim_core_custom_2_8e65d072fbdd_idx, which is
defined this way:

CREATE INDEX other_events_1004175222_pim_core_custom_2_8e65d072fbdd_idx
ON public.other_events_1004175222 USING btree (type, "time", user_id)
WHERE (
(type IS NOT NULL) AND (object IS NULL) AND
((user_id >= '(1080212440,0)'::app_user_id) AND (user_id <=
'(1080212440,9007199254740991)'::app_user_id)))

You can view the definition of test_yasp_events_exp_1004175222 here
<https://pastebin.com/3wYiiTMn>. Note the child tables,
other_events_1004175222, pageviews_1004175222, and sessions_1004175222
which have the following constraints:

other_events_1004175222: CHECK (object IS NULL)
pageviews_1004175222: CHECK (object IS NOT NULL AND object = 'pageview'::text)
sessions_1004175222: CHECK (object IS NOT NULL AND object = 'session'::text)

Also note that these child tables have 100s of partial indexes. You
can find history on why we have things set up this way here
<https://heap.io/blog/running-10-million-postgresql-indexes-in-production>.

Here's the table metadata for other_events_1004175222:

SELECT relname,
relpages,
reltuples,
relallvisible,
relkind,
relnatts,
relhassubclass,
reloptions,
pg_table_size(oid)
FROM pg_class
WHERE relname = 'other_events_1004175222';

Results:

[image: image.png]

--

K. Matt Dupree

Data Science Engineer
321.754.0526 | matt(dot)dupree(at)heap(dot)io

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Justin Pryzby 2021-08-11 12:38:56 Re: Postgres using the wrong index index
Previous Message Alex 2021-08-08 09:43:40 Re: Slow query because lexeme index not used