From: | jaba the mobzy <makaronaforna(at)yahoo(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Bitmap Heap Scan anomaly |
Date: | 2007-05-04 19:28:51 |
Message-ID: | 847742.30232.qm@web63710.mail.re1.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Tom,
Did you restart Postgres and drop file system caches?
What I am suspecting is that some sort of prefetching is happening.
I know that Postgres does not do prefetching.
I also understand very little about OS/FileSystem level prefetching.
----- Original Message ----
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: jaba the mobzy <makaronaforna(at)yahoo(dot)com>; pgsql-hackers(at)postgresql(dot)org
Sent: Thursday, May 3, 2007 11:42:32 PM
Subject: Re: [HACKERS] Bitmap Heap Scan anomaly
Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> On Thu, 2007-05-03 at 14:33 -0700, jaba the mobzy wrote:
>> mycorr_100 took 11.4 s to run although it had to fetch 100000 row from
>> the base table.
>> mycorr_10 took 24.4 s to run although it had to fetch 10563 row from
>> the base table.
> This is because the physical distribution of data is different. The
> mycorr_10 table has tuples in which a and b are > 15.9M spread all
> throughout. mycorr_100 has them all collected together at the end of the
> physical file. Less disk seeking.
If the OP had generated the data randomly, as claimed, the rows
shouldn't be particularly more clumped in one table than the other.
But I sure agree that it sounds like a nonrandom distribution in the
mycorr_100 table. FWIW I tried to duplicate the behavior, and could
not, using tables made up like this:
create table src as
select int4(16*1024*1024*random()) as key,
int4(16*1024*1024*random()) as a,
int4(16*1024*1024*random()) as b
from generate_series(1,16*1024*1024);
create table mycorr_10 as
select key, a,
case when random() < 0.1 then a else b end as b
from src;
create table mycorr_100 as
select key, a, a as b
from src;
create index mycorr_10i on mycorr_10(a,b);
create index mycorr_100i on mycorr_100(a,b);
vacuum analyze mycorr_10;
vacuum analyze mycorr_100;
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at
http://www.postgresql.org/about/donate
____________________________________________________________________________________
Sucker-punch spam with award-winning protection.
Try the free Yahoo! Mail Beta.
http://advision.webevents.yahoo.com/mailbeta/features_spam.html
From | Date | Subject | |
---|---|---|---|
Next Message | Stephen Frost | 2007-05-04 19:41:45 | Re: Removing pg_auth_members.grantor (was Grantor name gets lost when grantor role dropped) |
Previous Message | Alvaro Herrera | 2007-05-04 18:57:27 | Removing pg_auth_members.grantor (was Grantor name gets lost when grantor role dropped) |