Re: Bitmap table scan cost per page formula

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Haisheng Yuan <hyuan(at)pivotal(dot)io>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bitmap table scan cost per page formula
Date: 2017-12-20 21:20:28
Message-ID: CAMkU=1xv=8mpG7weV7pTC=q9kX_Lb-tvcKEsYb-YgVepxXEEHw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 20, 2017 at 12:29 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Tue, Dec 19, 2017 at 2:55 PM, Haisheng Yuan <hyuan(at)pivotal(dot)io> wrote:
>>
>> Below is the graph (credit to Heikki) that plots the total estimated cost
>> of a bitmap heap scan, where table size is 10000 pages, and
>> random_page_cost=10 and seq_page_cost=1. X axis is the number of pages
>> fetche. I.e. on the left, no pages are fetched, and on the right end, at
>> 10000, all pages are fetched. The original formula is in black, the new
>> formula in red, and the horizontal line, in blue, shows the cost of a Seq
>> Scan.
>> [image: Inline image 3]
>>
>>
>> Thoughts? Any better ideas?
>>
>
> The parabola-shape curve we're getting at present is clearly wrong;
> approaching a horizontal line as an asymptote seems much better. However,
> shouldn't the red line level off at some level *above* the blue line rather
> than *at* the blue line? Reading the index pages isn't free, so a
> sequential scan should be preferred when we're going to read the whole
> table anyway.
>

It is not obvious to me that the parabola is wrong. I've certainly seen
cases where reading every 2nd or 3rd block (either stochastically, or
modulus) actually does take longer than reading every block, because it
defeats read-ahead. But it depends on a lot on your kernel version and
your kernel settings and your file system and probably other things as well.

Cheers,

Jeff

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2017-12-20 21:25:03 Re: [HACKERS] Proposal: Local indexes for partitioned table
Previous Message Robert Haas 2017-12-20 21:09:45 Re: [HACKERS] parallel.c oblivion of worker-startup failures