From: | "Tels" <nospam-pg-abuse(at)bloodgate(dot)com> |
---|---|
To: | "Jeff Janes" <jeff(dot)janes(at)gmail(dot)com> |
Cc: | "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Haisheng Yuan" <hyuan(at)pivotal(dot)io>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Bitmap table scan cost per page formula |
Date: | 2017-12-21 18:17:45 |
Message-ID: | 9af1aa1ef2ead874ac6e83983eb3f576.squirrel@sm.webmail.pair.com |
Lists: | pgsql-hackers |
Moin,
On Wed, December 20, 2017 11:51 pm, Jeff Janes wrote:
> On Wed, Dec 20, 2017 at 2:18 PM, Robert Haas <robertmhaas(at)gmail(dot)com>
> wrote:
>
>> On Wed, Dec 20, 2017 at 4:20 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
>> wrote:
>>>
>>> It is not obvious to me that the parabola is wrong. I've certainly seen
>>> cases where reading every 2nd or 3rd block (either stochastically, or
>>> modulus) actually does take longer than reading every block, because it
>>> defeats read-ahead. But it depends a lot on your kernel version and
>>> your kernel settings and your file system and probably other things as
>>> well.
>>>
>>
>> Well, that's an interesting point, too. Maybe we need another graph that
>> also shows the actual runtime of a bitmap scan and a sequential scan.
>>
>
> I've done some low-level I/O benchmarking, and reading every 3rd block is
> actually 13 times slower than reading every block, using CentOS 6.9 with
> ext4 and the setting:
> blockdev --setra 8192 /dev/sdb1
> This is on some virtualized storage whose details I don't know, but it
> behaves as if it were RAID/JBOD with around 6 independent spindles.
I repeated this here on my desktop, running linux-image-4.10.0-42 with a
Samsung SSD 850 EVO 500 GB, on an encrypted / EXT4 partition:
$ dd if=/dev/zero of=zero.dat count=1300000 bs=8192
1300000+0 records in
1300000+0 records out
10649600000 bytes (11 GB, 9,9 GiB) copied, 22,1993 s, 480 MB/s
All blocks:
$ sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
$ time perl -le 'open my $fh, "rand" or die; foreach (1..1300000)
{$x="";next if $_%3>5; sysseek $fh,$_*8*1024,0 or die $!; sysread $fh,
$x,8*1024; print length $x} ' | uniq -c
1299999 8192
real 0m20,841s
user 0m0,960s
sys 0m2,516s
Every 3rd block:
$ sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
$ time perl -le 'open my $fh, "rand" or die; foreach (1..1300000) {$x="";
next if $_%3>0; sysseek $fh,$_*8*1024,0 or die $!; sysread $fh,
$x,8*1024; print length $x} '|uniq -c
433333 8192
real 0m50,504s
user 0m0,532s
sys 0m2,972s
Every 3rd block random:
$ sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
$ time perl -le 'open my $fh, "rand" or die; foreach (1..1300000) {$x="";
next if rand()> 0.3333; sysseek $fh,$_*8*1024,0 or die $!; sysread $fh,
$x,8*1024; print length $x} ' | uniq -c
432810 8192
real 0m26,575s
user 0m0,540s
sys 0m2,200s
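For anyone who wants to repeat this without the Perl one-liners, the three access patterns can be sketched in Python roughly as follows. This is only a sketch under assumptions: `read_blocks`, `PATTERNS` and the small temporary demo file are names and choices made up for illustration, and the script does not drop the page cache, so on its own it will not reproduce the timings above.

```python
import os
import random
import tempfile
import time

BLOCK_SIZE = 8 * 1024  # same 8 kB block size as the one-liners above


def read_blocks(path, n_blocks, keep):
    """Read block i (1-based, as in the Perl loop) whenever keep(i) is true.

    Returns (number of full blocks read, elapsed seconds).
    """
    full = 0
    start = time.perf_counter()
    fd = os.open(path, os.O_RDONLY)
    try:
        for i in range(1, n_blocks + 1):
            if not keep(i):
                continue
            # os.pread reads at an explicit offset (sysseek + sysread above).
            data = os.pread(fd, BLOCK_SIZE, i * BLOCK_SIZE)
            if len(data) == BLOCK_SIZE:
                full += 1
    finally:
        os.close(fd)
    return full, time.perf_counter() - start


# The three access patterns from the runs above (hypothetical names).
PATTERNS = {
    "all blocks": lambda i: True,
    "every 3rd block": lambda i: i % 3 == 0,
    "random third": lambda i: random.random() <= 0.3333,
}

if __name__ == "__main__":
    # Assumption: a small throwaway file instead of the ~10 GB test file,
    # and no drop_caches between runs, so this mostly measures cached reads.
    n_blocks = 3000
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * (n_blocks * BLOCK_SIZE))
        path = tmp.name
    try:
        for name, keep in PATTERNS.items():
            full, secs = read_blocks(path, n_blocks, keep)
            print(f"{name}: {full} full blocks in {secs:.3f}s")
    finally:
        os.unlink(path)
```

To get meaningful numbers one would point `path` at a file much larger than RAM-cached data and run `echo 3 > /proc/sys/vm/drop_caches` as root before each pattern, as in the transcripts above.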
So it does get slower, but only by a factor of about 2.5 when reading every
3rd block, and by about 30% when reading a random third of the blocks.
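The figures can be checked from the `real` times and the `uniq -c` counts above. The short script below is just that arithmetic, with the numbers copied from the three runs; the dictionary layout and the `summarize` helper are made up for illustration. It also prints blocks read per second, since the partial runs touch only about a third of the blocks.

```python
# "real" wall times and counts of 8192-byte reads, copied from the runs above.
runs = {
    "all blocks": (1299999, 20.841),
    "every 3rd block": (433333, 50.504),
    "every 3rd block random": (432810, 26.575),
}


def summarize(runs, baseline="all blocks"):
    """Return wall-time ratio vs. the baseline run and blocks read per second."""
    _, base_secs = runs[baseline]
    rows = {}
    for name, (blocks, secs) in runs.items():
        rows[name] = {
            "wall_ratio": secs / base_secs,
            "blocks_per_sec": blocks / secs,
        }
    return rows


for name, row in summarize(runs).items():
    print(f"{name}: {row['wall_ratio']:.2f}x wall time, "
          f"{row['blocks_per_sec']:.0f} blocks/s")
```

In wall time that comes out at roughly 2.4x and 1.28x the full read, which is where the "about 2.5 times" and "about 30%" come from. Per block actually read, the gap is larger, because those runs fetch only a third of the data.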
Hope this helps,
Tels