Re: bitmap scan much slower than index scan, hash_search_with_hash_value

From: Sergey Koposov <koposov(at)ast(dot)cam(dot)ac(dot)uk>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: bitmap scan much slower than index scan, hash_search_with_hash_value
Date: 2012-09-02 12:39:21
Message-ID: alpine.LRH.2.02.1209021256450.25232@calx046.ast.cam.ac.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for your comments.
On Sun, 2 Sep 2012, Peter Geoghegan wrote:
> On 2 September 2012 06:21, Sergey Koposov <koposov(at)ast(dot)cam(dot)ac(dot)uk> wrote:
>
> I think that this kind of question is better suited to the
> pgsql-performance list. Granted, it was presented as a bug report
> (though they're generally sent to -bugs rather than -hackers), but I
> don't think that this is a valid bug.

The reason is that was inclined to think that it is a bug is that I
encountered a similar bug before with bitmap scans and very big
tables
http://archives.postgresql.org/pgsql-hackers/2011-08/msg00958.php
Furthermore 2 orders of magnitudes more of CPU time for bitmap scans
comparing to the index scan didn't sound right to me (although obviously
I'm not in the position to claim that it's 100% bug).

> One obvious red-flag from your query plans is that there is a
> misestimation of the row return count of a few orders of magnitude in
> the Bitmap Index Scan node. Did you trying performing an ANALYZE to
> see if that helped? It may also be helpful to show pg_stats entries
> for both the data.mjd and test.mjd columns. You may find, prior to
> doing an ANALYZE, that there is no entries for one of those tables.

The main large table is static and was analyzed. The test table was as
well. But as mentioned in another recent email, the query is the join, so
column correlation is a problem.

> Turning off the enable_* planner options in production is generally
> discouraged. Certainly, you'd be crazy to do that on a server-wide
> basis.

I'm using PG for data mining, data analysis purposes with very few clients
connected and very large tables, so enable_* is used quite often to fix
incorrect plans due to incorrect selectivities.

Regards,
Sergey

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-09-02 14:44:29 Re: bitmap scan much slower than index scan, hash_search_with_hash_value
Previous Message Bruce Momjian 2012-09-02 12:12:33 Re: [GENERAL] Multiple Slave Failover with PITR