From: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Cc: | Josh Berkus <josh(at)agliodbs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ants Aasma <ants(at)cybertec(dot)at>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Why is indexonlyscan so darned slow? |
Date: | 2012-05-21 21:29:32 |
Message-ID: | CAHyXU0xT=Se_zMdjEaC9W6uGuuP7LEOLKpJ7CRwrH8z+k4SQ-A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, May 21, 2012 at 4:17 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Mon, May 21, 2012 at 1:42 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>
>>> Earlier you said that this should be an ideal setup for IOS. But it
>>> isn't really--the ideal set up is one in which the alternative to an
>>> IOS is a regular index scan which makes many uncached scattered reads
>>> into the heap. I don't think that that situation can't really be
>>> engineered with a where-less query.
>>
>> Can you give me some suggested comparisons which *would* be ideal, then?
>
> Are you looking for vaguely real-life examples, or highly contrived
> examples used to dissect the server?
>
> For vaguely real life, take your example of pgbench -i -s200 -F 50,
> and I have 2Gig RAM, which seems to be the same as you do.
>
> With select only work load (pgbench -S -M prepared -T 30), I get
>
> tps = 193
>
> But now enable index-only scans:
>
> psql -c "create index on pgbench_accounts(aid, abalance);"
>
> and it goes up to.
>
> tps = 10137
Right -- the main driver here is that your index fits neatly in ram
and the heap does not -- so you're effectively measuring the
difference between a buffered and non-buffered access. That's highly
contrived as you noted and unlikely to come up all *that* often in the
real world.
Generally though the real world wins (although the gains will be
generally less spectacular) are heavily i/o bound queries where the
indexed subset of data you want is nicely packed and the (non
clustered) heap records are all over the place. By skipping the semi
random heap lookups you can see enormous speedups. I figure 50-90%
improvement would be the norm there, but this is against queries that
are taking forever, being i/o bound.
See here: http://www.devheads.net/database/postgresql/performance/index-all-necessary-columns-postgres-vs-mssql.htm
for a 'in the wild' gripe about about not having index scans.
merlin
From | Date | Subject | |
---|---|---|---|
Next Message | Florian Pflug | 2012-05-21 22:08:51 | Re: read() returns ERANGE in Mac OS X |
Previous Message | Jeff Janes | 2012-05-21 21:17:26 | Re: Why is indexonlyscan so darned slow? |