bitmap scan much slower than index scan, hash_search_with_hash_value

From: Sergey Koposov <koposov(at)ast(dot)cam(dot)ac(dot)uk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: bitmap scan much slower than index scan, hash_search_with_hash_value
Date: 2012-09-02 05:21:13
Message-ID: alpine.LRH.2.02.1209020540470.25232@calx046.ast.cam.ac.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I'm experiencing the case when bitmap scan is ~ 70 times slower than
index scan which seems to be caused by 1) very big table 2) some hash
search logic (hash_search_with_hash_value )

Here is the explain analyze of the query with bitmap scans allowed:

wsdb=> explain analyze select * from test as t, crts.data as d1
where d1.objid=t.objid and d1.mjd=t.mjd limit 10000;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=11514.04..115493165.44 rows=10000 width=68) (actual time=27.512..66620.231 rows=10000 loops=1)
-> Nested Loop (cost=11514.04..1799585184.18 rows=155832 width=68) (actual time=27.511..66616.807 rows=10000 loops=1)
-> Seq Scan on test t (cost=0.00..2678.40 rows=156240 width=28) (actual time=0.010..4.685 rows=11456 loops=1)
-> Bitmap Heap Scan on data d1 (cost=11514.04..11518.05 rows=1 width=40) (actual time=5.807..5.807 rows=1 loops=11456)
Recheck Cond: ((mjd = t.mjd) AND (objid = t.objid))
-> BitmapAnd (cost=11514.04..11514.04 rows=1 width=0) (actual time=5.777..5.777 rows=0 loops=11456)
-> Bitmap Index Scan on data_mjd_idx (cost=0.00..2501.40 rows=42872 width=0) (actual time=3.920..3.920 rows=22241 loops=11456)
Index Cond: (mjd = t.mjd)
-> Bitmap Index Scan on data_objid_idx (cost=0.00..8897.90 rows=415080 width=0) (actual time=0.025..0.025 rows=248 loops=11456)
Index Cond: (objid = t.objid)
Total runtime: 66622.026 ms
(11 rows)

Here is the output when bitmap scans are disabled: QUERY PLAN
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..329631941.65 rows=10000 width=68) (actual time=0.082..906.876 rows=10000 loops=1)
-> Nested Loop (cost=0.00..4979486036.95 rows=151062 width=68) (actual time=0.081..905.683 rows=10000 loops=1)
Join Filter: (t.mjd = d1.mjd)
-> Seq Scan on test t (cost=0.00..2632.77 rows=151677 width=28) (actual time=0.009..1.679 rows=11456 loops=1)
-> Index Scan using data_objid_idx on data d1 (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050 rows=248 loops=11456)
Index Cond: (objid = t.objid)
Total runtime: 907.462 ms

When the bitmap scans are enabled the "prof" of postgres shows
47.10% postmaster postgres [.] hash_search_with_hash_value
|
--- hash_search_with_hash_value

11.06% postmaster postgres [.] hash_seq_search
|
--- hash_seq_search

6.95% postmaster postgres [.] hash_any
|
--- hash_any

5.17% postmaster postgres [.] _bt_checkkeys
|
--- _bt_checkkeys

4.07% postmaster postgres [.] tbm_add_tuples
|
--- tbm_add_tuples

3.41% postmaster postgres [.] hash_search
|
--- hash_search

And the last note is that the crts.data table which is being bitmap scanned is
a 1.1Tb table with ~ 20e9 rows. My feeling is that the bitmap index scan code
is somehow unprepared to combine two bitmaps for such a big table, and this
leads to the terrible performance.

Regards,
Sergey

PS Here are the schemas of the tables, just in case:
wsdb=> \d test
Table "koposov.test"
Column | Type | Modifiers
---------+------------------+-----------
mjd | double precision |
fieldid | bigint |
intmag | integer |
objid | bigint |

wsdb=> \d crts.data
Table "crts.data"
Column | Type | Modifiers
--------+------------------+-----------
objid | bigint |
mjd | double precision |
mag | real |
emag | real |
ra | double precision |
dec | double precision |
Indexes:
"data_mjd_idx" btree (mjd) WITH (fillfactor=100)
"data_objid_idx" btree (objid) WITH (fillfactor=100)
"data_q3c_ang2ipix_idx" btree (q3c_ang2ipix(ra, "dec")) WITH (fillfactor=100)

PPS shared_buffers=10GB, work_mem=1GB
All the test shown here were don in fully cached regime.

PPS I can believe that what I'm seeing is a feature, not a bug of bitmap scans,
and I can live with disabling them, but I still thought it's worth reporting.

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2012-09-02 07:23:49 Re: bitmap scan much slower than index scan, hash_search_with_hash_value
Previous Message Pavel Stehule 2012-09-02 05:13:19 Fwd: PATCH: psql boolean display