GIN index not used if created in the same transaction as query

From: Adam Brusselback <adambrusselback(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: GIN index not used if created in the same transaction as query
Date: 2017-05-19 15:14:05
Message-ID: CAMjNa7c_=5pZL8T-kBhiRb0=an=DbpGg5jsYZ9guDs8i75jwcg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hey all, first off, i'm running: PostgreSQL 9.6.3 on x86_64-pc-linux-gnu,
compiled by gcc (Debian 4.9.2-10) 4.9.2, 64-bit

At the high level, I am having an issue with a query not using an index,
and in a very hard to reproduce way.

I have a function which builds two temp tables, fills each with data (in
multiple steps), creates a gin index on one of the tables, analyzes each
table, then runs a query joining the two.

My issue is, I am getting inconsistent results for if the query will use
the index or not (with the exact same data each time, and no differences in
the stats stored on the table between using the index or not).

If I just run the function, it will never use the index and the query will
not finish.

If I pull the queries out of the function and run them manually, it will
often use the index, but sometimes it won't, I can't make any sense of
why/when it will use it vs not using it.

I tried to create a test case I could attach to this email by just saving
the results of the temp tables, pg_dumping them, and creating a script to
re-create the temp tables with that data and continue on with the index
creation / analyze / query... but when I try that it runs perfectly (using
the index) every time.

I've attached the test case, because it contains all the schema and query,
regardless of if I can't make it reproducible.

Run the query_help_dump.sql first to populate regular tables, then within
the query_help_test_case.sql I was attempting to replicate the same (very
simplified) workflow that happens in my function, to no avail.

I cannot run an explain analyze on the query when it doesn't use the index,
because it will not finish in a reasonable amount of time (let it run for
12 hours so far).

query without index:
GroupAggregate (cost=23622602.94..23622603.80 rows=43 width=48)
Group Key: r.row_id
-> Sort (cost=23622602.94..23622603.04 rows=43 width=20)
Sort Key: r.row_id
-> Nested Loop (cost=0.00..23622601.77 rows=43 width=20)
Join Filter: ((r.delivery_date <@ con.date_range) AND
(r.contractee_company_ids && con.contractee_company_id) AND
((r.distributor_company_ids && con.distributor_company_id) OR
(con.distributor_company_id IS NULL)) AND (r.product_ids && con.product_id))
-> Seq Scan on _import_invoice_product_contract_match r
(cost=0.00..3525.52 rows=86752 width=145)
-> Materialize (cost=0.00..874.50 rows=12100 width=542)
-> Seq Scan on _contract_claim_match con
(cost=0.00..814.00 rows=12100 width=542)

query with index:
GroupAggregate (cost=137639.13..137639.99 rows=43 width=48) (actual
time=3944.309..4093.798 rows=57966 loops=1)
Group Key: r.row_id
-> Sort (cost=137639.13..137639.24 rows=43 width=20) (actual
time=3944.280..3992.348 rows=145312 loops=1)
Sort Key: r.row_id
Sort Method: external merge Disk: 4256kB
-> Nested Loop (cost=0.02..137637.97 rows=43 width=20) (actual
time=0.091..3701.039 rows=145312 loops=1)
-> Seq Scan on _import_invoice_product_contract_match r
(cost=0.00..3525.52 rows=86752 width=145) (actual time=0.011..46.663
rows=86752 loops=1)
-> Bitmap Heap Scan on _contract_claim_match con
(cost=0.02..1.54 rows=1 width=542) (actual time=0.033..0.040 rows=2
loops=86752)
Recheck Cond: ((r.contractee_company_ids &&
contractee_company_id) AND (r.product_ids && product_id))
Filter: ((r.delivery_date <@ date_range) AND
((r.distributor_company_ids && distributor_company_id) OR
(distributor_company_id IS NULL)))
Rows Removed by Filter: 8
Heap Blocks: exact=793072
-> Bitmap Index Scan on idx_tmp_contract_claim_match
(cost=0.00..0.02 rows=1 width=0) (actual time=0.023..0.023 rows=10
loops=86752)
Index Cond: ((r.contractee_company_ids &&
contractee_company_id) AND (r.product_ids && product_id))
Planning time: 0.804 ms
Execution time: 4106.043 ms


query_help_dump.sql
<https://drive.google.com/file/d/0BzxeqZ1lbi6RazBDbjdBbUNMbFk/view?usp=drive_web>
​​
query_help_test_case.sql
<https://drive.google.com/file/d/0BzxeqZ1lbi6RT3hMbXd1WVpqTjg/view?usp=drive_web>

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2017-05-19 15:33:21 Re: GIN index not used if created in the same transaction as query
Previous Message Jeff Janes 2017-05-16 16:54:13 Re: postgres_fdw and column casting shippability