Wrong index selection

From: Piotr Gasidło <quaker(at)barbara(dot)eu(dot)org>
To: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Wrong index selection
Date: 2014-01-17 22:57:54
Message-ID: CAF8akQtr9ENiroROUmyRe_gpW5aDf-5WNOaewBSYRizvx6CPtA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

For Postgresql:

> select version();
version

-------------------------------------------------------------------------------------------------------------
PostgreSQL 9.3.2 on amd64-portbld-freebsd9.2, compiled by cc (GCC) 4.2.1
20070831 patched [FreeBSD], 64-bit

For table:

> \d core.cookies2tags
Tabela "core.cookies2tags"
Kolumna | Typ | Modyfikatory

-----------------------+-----------------------------+--------------------------
co2ta_co_id | integer | niepusty
co2ta_cl_id | integer | niepusty
co2ta_ta_id | integer | niepusty
co2ta_ta_ukey_id | text |
co2ta_ta_ukey_hash | character(40) |
co2ta_fpr_id | integer |
co2ta_date_first | timestamp without time zone | niepusty domyślnie
now()
co2ta_date_last | timestamp without time zone | niepusty domyślnie
now()
co2ta_count_all | integer | niepusty domyślnie 1
co2ta_count_1 | integer | niepusty domyślnie 1
co2ta_date_1 | date | niepusty
co2ta_datelist_date | date |
co2ta_datelist_counts | integer[] |
co2ta_ta_params | hstore |
co2ta_fca_id | integer |
co2ta_mco_id | integer |
Indeksy:
"cookies2tags_ukey1" UNIQUE, btree (co2ta_co_id, co2ta_cl_id,
co2ta_ta_id, co2ta_ta_ukey_hash) WHERE co2ta_ta_ukey_hash IS NOT NULL
"cookies2tags_ukey2" UNIQUE, btree (co2ta_co_id, co2ta_cl_id,
co2ta_ta_id) WHERE co2ta_ta_ukey_hash IS NULL
"cookies2tags_co_id_key" btree (co2ta_co_id)
"cookies2tags_co_id_key2" btree (co2ta_co_id, co2ta_cl_id)
"cookies2tags_key1" btree (co2ta_cl_id, co2ta_ta_id, co2ta_ta_ukey_hash)
"cookies2tags_key2" btree (co2ta_cl_id, co2ta_ta_ukey_hash) WHERE
co2ta_fpr_id IS NULL AND (co2ta_ta_id = ANY (ARRAY[1, 2, 3, 4]))
"cookies2tags_key3" btree (co2ta_cl_id, co2ta_ta_id, co2ta_date_1)
"cookies2tags_key4" btree (co2ta_mco_id)
"idx_co_id_date_last" btree (co2ta_co_id, co2ta_date_last)

Table is rather big (about 150M rows).

For this query:

WITH s AS (
SELECT
co2ta_co_id AS co_id,
co2ta_ta_id AS ta_id,
MIN(co2ta_date_last) AS co2ta_date_last_min,
MAX(co2ta_date_last) AS co2ta_date_last_max,
COUNT(DISTINCT(co2ta_ta_ukey_hash)) AS co2ta_ta_ukey_count,
1
FROM
core.cookies2tags co2ta
WHERE
co2ta.co2ta_co_id =
ANY('{"1","123567429","123872617","123929118","123930244","123935996","123937156","123944495","123944999","123945469"}'::int[])
AND
co2ta.co2ta_cl_id = 97 AND
co2ta.co2ta_ta_id = ANY('{"142"}'::int[])
GROUP BY
ta_id,
co_id
)
SELECT
*
FROM
s
UNION ALL
SELECT
s.co_id,
NULL,
MIN(s.co2ta_date_last_min),
MAX(s.co2ta_date_last_min),
NULL,
1
FROM
s
GROUP BY
s.co_id

i get following plan:

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=49.38..49.44 rows=2 width=36) (actual time=39.009..39.034
rows=16 loops=1)
CTE s
-> GroupAggregate (cost=49.35..49.38 rows=1 width=57) (actual
time=39.006..39.016 rows=8 loops=1)
-> Sort (cost=49.35..49.35 rows=1 width=57) (actual
time=38.993..38.993 rows=8 loops=1)
Sort Key: co2ta.co2ta_ta_id, co2ta.co2ta_co_id
Sort Method: quicksort Memory: 25kB
-> Index Scan using cookies2tags_key3 on cookies2tags
co2ta (cost=0.57..49.34 rows=1 width=57) (actual time=38.339..38.982
rows=8 loops=1)
Index Cond: ((co2ta_cl_id = 97) AND (co2ta_ta_id =
ANY ('{142}'::integer[])))
Filter: (co2ta_co_id = ANY
('{1,123567429,123872617,123929118,123930244,123935996,123937156,123944495,123944999,123945469}'::integer[]))
Rows Removed by Filter: 32120
-> CTE Scan on s (cost=0.00..0.02 rows=1 width=36) (actual
time=39.008..39.021 rows=8 loops=1)
-> HashAggregate (cost=0.03..0.04 rows=1 width=12) (actual
time=0.009..0.010 rows=8 loops=1)
-> CTE Scan on s s_1 (cost=0.00..0.02 rows=1 width=12) (actual
time=0.000..0.001 rows=8 loops=1)
Total runtime: 39.079 ms

But if i remove one of co2ta_co_id in query (eq. "1") i get:

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=45.28..45.35 rows=2 width=36) (actual time=0.233..0.255
rows=16 loops=1)
CTE s
-> GroupAggregate (cost=45.25..45.28 rows=1 width=57) (actual
time=0.230..0.241 rows=8 loops=1)
-> Sort (cost=45.25..45.26 rows=1 width=57) (actual
time=0.224..0.225 rows=8 loops=1)
Sort Key: co2ta.co2ta_ta_id, co2ta.co2ta_co_id
Sort Method: quicksort Memory: 25kB
-> Index Scan using cookies2tags_co_id_key2 on
cookies2tags co2ta (cost=0.58..45.24 rows=1 width=57) (actual
time=0.031..0.215 rows=8 loops=1)
Index Cond: ((co2ta_co_id = ANY
('{123567429,123872617,123929118,123930244,123935996,123937156,123944495,123944999,123945469}'::integer[]))
AND (co2ta_cl_id = 97))
Filter: (co2ta_ta_id = ANY ('{142}'::integer[]))
Rows Removed by Filter: 187
-> CTE Scan on s (cost=0.00..0.02 rows=1 width=36) (actual
time=0.232..0.244 rows=8 loops=1)
-> HashAggregate (cost=0.03..0.04 rows=1 width=12) (actual
time=0.007..0.009 rows=8 loops=1)
-> CTE Scan on s s_1 (cost=0.00..0.02 rows=1 width=12) (actual
time=0.001..0.001 rows=8 loops=1)
Total runtime: 0.321 ms

This plan is much faster. I notice that if I put more co2ta_co_id values in
query than some threshold PostgreSQL creates unoptimal plan.

I wonder what should I tune, to get PostgreSQL use other index for queries
with more co2ta_co_id values in query?
Currently as hotfix I split input values, to execute more queries with less
co2ta_co_id values.

--
Piotr Gasidło

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2014-01-18 00:33:44 Re: Wrong index selection
Previous Message Igor Neyman 2014-01-16 15:41:49 Re: Issue with query scanning through all data even with indexes