Re: Use order by clause, got index scan involved

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: 高健 <luckyjackgao(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Use order by clause, got index scan involved
Date: 2012-11-08 17:10:20
Message-ID: CAMkU=1w2sGXVr-aMn84f7v2RViZtWS5B3jSp+CmqN0uGkd1PnQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Nov 7, 2012 at 11:41 PM, 高健 <luckyjackgao(at)gmail(dot)com> wrote:

> Hi all:
>
>
>
> What confused me is that: When I select data using order by clause, I
> got the following execution plan:
>
>
>
> postgres=# set session
> enable_indexscan=true;
>
>
> SET
>
>
> postgres=# explain SELECT * FROM pg_proc ORDER BY
> oid;
>
>
> QUERY
> PLAN
>
>
>
> ----------------------------------------------------------------------------------------
>
>
> Index Scan using pg_proc_oid_index on pg_proc (cost=0.00..321.60
> rows=2490 width=552)
>

You should probably use sample cases much larger than this when trying to
understand the planner. With queries this small, it almost doesn't matter
what plan is chosen.

>
>
>
>
> (1
> row)
>
>
>
>
>
> postgres=#
>
>
>
> My Question is :
>
> If I want to find record using the where clause which hold the id
> column, the index scan might be used.
>
> But I just want to get all the records on sorted output format, Why
> index scan can be used here?
>
>
>
> I can’t imagine that:
>
> Step 1 Index is read into memory, then for each tuple in it,
>
> Step 2 Then we got the address of related data block, and then access
> the data block .
>
>
>
> Step 2 will be repeated for many times. I think it is not efficient.
>

But step 2 will repeatedly find the block it is visiting to already be in
memory, so it is efficient.

>
>
> Maybe the database system is clever enough to accumulate data access for
> same physical page, and reduce the times of physical page acess ?
>

There is a bitmap scan which does that, but such a scan can't be used to
fulfill a sort, because it doesn't return the rows in index order. What
reduces the cost here is the various levels of caching implemented by the
file system, the memory system, and the CPU. PG uses
"effective_cache_size" to try to account for these effects, although I
admit I don't quite understand what exactly it is doing in this case. I
thought that setting effective_cache_size to absurdly low values would make
the index scan cost estimate go up a lot, but it only made it go up a
little.

Cheers,

Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tianyin Xu 2012-11-08 18:47:18 Re: Does PostgreSQL have complete functional test cases?
Previous Message Tom Lane 2012-11-08 16:45:37 Re: How is execution plan cost calculated for index scan