From: | Craig James <cjames(at)emolecules(dot)com> |
---|---|
To: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Can't get two index scans |
Date: | 2016-06-22 16:03:35 |
Message-ID: | CAFwQ8reztx+Kw7pGoGTMyD2xt25_Z5Vrt5r4qB3AQ=ybG43QHA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
I'm working with a third-party plugin that does chemistry. It's very fast.
However, I'm trying to do a sampling query, such as the first 1% of the
database, and I just can't get the planner to create a good plan. Here is
the full query (the |>| operator does a subgraph match of a molecular
substructure, in this case benzene, to find all molecules that have a
benzene ring in the database):
explain analyze select * from version where smiles |>| 'c1ccccc1';
...
Index Scan using i_version_smiles on version (cost=3445.75..147094.03
rows=180283 width=36) (actual time=336.493..10015.753
rows=180973 loops=1)
Index Cond: (smiles |>| 'c1ccccc1'::molecule)
Planning time: 1.228 ms
Execution time: 10371.903 ms
Ten seconds over 263,000 molecules, which is actually good. Now let's limit
it to the first 1% of the rows:
explain analyze select * from version where smiles |>| 'c1ccccc1' and
version_id < 897630;
...
Index Scan using pk_version on version (cost=0.42..131940.05 rows=1643
width=36) (actual time=6.122..2816.298 rows=2039 loops=1)
Index Cond: (version_id < 897630)
Filter: (smiles |>| 'c1ccccc1'::molecule)
Rows Removed by Filter: 590
Planning time: 1.217 ms
Execution time: 2822.117 ms
Notice that it doesn't use the i_version_smiles index at all, but instead
applies the very expensive filter |>| to all 1% of the database. So instead
of getting a 100x speedup, we only get a 3x speedup, about 30x worse that
what is theoretically possible.
The production database is about 50x larger than this test database.
Maybe I misunderstand what's possible with indexes, but it seems to me that
it could first do the pk_version index scan, and then use the results of
that to do a limited index-scan search using the i_version_smiles index. Is
that not possible? Is each index scan "self contained", that is, it doesn't
take into account the results of another index scan?
Thanks,
Craig
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2016-06-22 18:36:31 | Re: Can't get two index scans |
Previous Message | jonescam | 2016-06-20 14:00:15 | Looking for more Beta Users! |