Re: bitmap scan much slower than index scan, hash_search_with_hash_value

From: Sergey Koposov <koposov(at)ast(dot)cam(dot)ac(dot)uk>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: bitmap scan much slower than index scan, hash_search_with_hash_value
Date: 2012-09-02 11:56:40
Message-ID: alpine.LRH.2.02.1209021242510.25232@calx046.ast.cam.ac.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 2 Sep 2012, Pavel Stehule wrote:

>
> statistics on data_objid_idx table are absolutly out - so planner
> cannot find optimal plan

That doesn't have anything to do with the problem, AFAIU.
First, the data table is static and was analysed.
Second, the query in question is the join, and afaik the estimation of the
number of rows is known to be incorrect, in the case of column
correlation.
Third, according at least to my understanding in the fully cached regime
bitmap scan should not take two orders of magnitude more CPU time than
index scan.

Sergey
>
> Regard
>
> Pavel Stehule
>
>> Index Cond: (objid = t.objid)
>> Total runtime: 66622.026 ms
>> (11 rows)
>>
>> Here is the output when bitmap scans are disabled:
>> QUERY PLAN
>> QUERY
>> PLAN
>> ----------------------------------------------------------------------------------------------------------------------------------------------------
>> Limit (cost=0.00..329631941.65 rows=10000 width=68) (actual
>> time=0.082..906.876 rows=10000 loops=1)
>> -> Nested Loop (cost=0.00..4979486036.95 rows=151062 width=68) (actual
>> time=0.081..905.683 rows=10000 loops=1)
>> Join Filter: (t.mjd = d1.mjd)
>> -> Seq Scan on test t (cost=0.00..2632.77 rows=151677 width=28)
>> (actual time=0.009..1.679 rows=11456 loops=1)
>> -> Index Scan using data_objid_idx on data d1
>> (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050
>> rows=248 loops=11456)
>> Index Cond: (objid = t.objid)
>> Total runtime: 907.462 ms
>>
>> When the bitmap scans are enabled the "prof" of postgres shows
>> 47.10% postmaster postgres [.] hash_search_with_hash_value
>> |
>> --- hash_search_with_hash_value
>>
>> 11.06% postmaster postgres [.] hash_seq_search
>> |
>> --- hash_seq_search
>>
>> 6.95% postmaster postgres [.] hash_any
>> |
>> --- hash_any
>>
>> 5.17% postmaster postgres [.] _bt_checkkeys
>> |
>> --- _bt_checkkeys
>>
>> 4.07% postmaster postgres [.] tbm_add_tuples
>> |
>> --- tbm_add_tuples
>>
>> 3.41% postmaster postgres [.] hash_search
>> |
>> --- hash_search
>>
>>
>> And the last note is that the crts.data table which is being bitmap scanned
>> is a 1.1Tb table with ~ 20e9 rows. My feeling is that the bitmap index scan
>> code
>> is somehow unprepared to combine two bitmaps for such a big table, and this
>> leads to the terrible performance.
>>
>> Regards,
>> Sergey
>>
>> PS Here are the schemas of the tables, just in case:
>> wsdb=> \d test
>> Table "koposov.test"
>> Column | Type | Modifiers
>> ---------+------------------+-----------
>> mjd | double precision |
>> fieldid | bigint |
>> intmag | integer |
>> objid | bigint |
>>
>> wsdb=> \d crts.data
>> Table "crts.data"
>> Column | Type | Modifiers
>> --------+------------------+-----------
>> objid | bigint |
>> mjd | double precision |
>> mag | real |
>> emag | real |
>> ra | double precision |
>> dec | double precision |
>> Indexes:
>> "data_mjd_idx" btree (mjd) WITH (fillfactor=100)
>> "data_objid_idx" btree (objid) WITH (fillfactor=100)
>> "data_q3c_ang2ipix_idx" btree (q3c_ang2ipix(ra, "dec")) WITH
>> (fillfactor=100)
>>
>> PPS shared_buffers=10GB, work_mem=1GB
>> All the test shown here were don in fully cached regime.
>>
>> PPS I can believe that what I'm seeing is a feature, not a bug of bitmap
>> scans,
>> and I can live with disabling them, but I still thought it's worth
>> reporting.
>>
>>
>> *****************************************************
>> Sergey E. Koposov, PhD, Research Associate
>> Institute of Astronomy, University of Cambridge
>> Madingley road, CB3 0HA, Cambridge, UK
>> Tel: +44-1223-337-551
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2012-09-02 12:12:33 Re: [GENERAL] Multiple Slave Failover with PITR
Previous Message Bruce Momjian 2012-09-02 11:34:12 Re: Yet another failure mode in pg_upgrade