From: | desmodemone <desmodemone(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: So, is COUNT(*) fast now? |
Date: | 2011-10-22 10:43:58 |
Message-ID: | CAEs9oF=hAGBkGaJwpxsCGHEJUMMSBErL=OYsqdHtfzW5eQTQCg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
2011/10/22 Andres Freund <andres(at)anarazel(dot)de>
> On Friday, October 21, 2011 08:14:12 PM Robert Haas wrote:
> > On Fri, Oct 21, 2011 at 2:08 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > >> On Fri, Oct 21, 2011 at 1:18 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > >>> I don't know why you'd imagine that touching an index is free, or
> even
> > >>> cheap, CPU-wise. The whole point of the index-only optimization is
> to
> > >>> avoid I/O. When you try it on a case where there's no I/O to be
> saved,
> > >>> and no shared-buffers contention to be avoided, there's no way it's
> > >>> going to be a win.
> > >>
> > >> Well, call me naive, but I would have thought touching six times less
> > >> data would make the operation run faster, not slower.
> > >
> > > It's not "touching six times less data". It's touching the exact same
> > > number of tuples either way, just index tuples in one case and heap
> > > tuples in the other.
> >
> > Yeah, but it works out to fewer pages.
> But access to those is not sequential. I guess if you measure cache hit
> ratios
> the index scan will come out significantly worse.
>
>
> Andres
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
"But access to those is not sequential" yes, I am agree.
In my opinion the problem is that. If the query needs to scan all the b-tree
index without to
access the table rows, the better way to read the index is like sequential
one,
in fact , query like count(*) or other not need the data are in "order" so I
think we
could read all blocks (better, "only" the leaf blocks) without to touching
too much the branch blocks.
For example query like this :
select column_a from table ;
is better to read the data from indexes like sequential
For query like this :
select column_a from table order by column_a ;
is better to read the data from indexes in range scan from root block to
first branch blocks and their leaf blocks, so we could "save"
the sorting.
Mat
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2011-10-22 12:24:53 | Re: Synchronized snapshots versus multiple databases |
Previous Message | Andres Freund | 2011-10-22 09:49:36 | Re: So, is COUNT(*) fast now? |