Re: [HACKERS] Bug#48582: psql spends hours computing results it already knows (fwd)

From: Brian Ristuccia <brianr(at)osiris(dot)978(dot)org>
To: "Ross J(dot) Reedstrom" <reedstrm(at)wallace(dot)ece(dot)rice(dot)edu>
Cc: Oliver Elphick <olly(at)lfix(dot)co(dot)uk>, pgsql-hackers(at)postgreSQL(dot)org, 48582(at)bugs(dot)debian(dot)org
Subject: Re: [HACKERS] Bug#48582: psql spends hours computing results it already knows (fwd)
Date: 1999-10-28 20:35:53
Message-ID: 19991028163553.B3255@osiris.978.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 28, 1999 at 03:32:17PM -0500, Ross J. Reedstrom wrote:
> Oliver (& Brian) -
> Hmm, that happens to not be the case. The rows=XXXX number is drawn
> from the statistics for the table, which are only updated on VACUUM
> ANALYZE of that table. Easily tested: just INSERT a couple rows and do
> the EXPLAIN again. The rows=XXX won't change. Ah, here's an example:
> a table I've never vacuumed, until now (it's in a test copy of a db)

Aah.. Is there any other more efficient way of determining the number of
rows in a table? It seems a sequential scan takes forever, but the database
must already have some idea (somewhere) of how many records are in the table
otherwise how would it know where to start/stop the sequential scan?

>
> test=> select count(*) from "Personnel";
> count
> -----
> 177
> (1 row)
>
> test=> explain select count(*) from "Personnel";
> NOTICE: QUERY PLAN:
>
> Aggregate (cost=43.00 rows=1000 width=4)
> -> Seq Scan on Personnel (cost=43.00 rows=1000 width=4)
>
> EXPLAIN
> test=> vacuum analyze "Personnel";
> VACUUM
> test=> explain select count(*) from "Personnel";
> NOTICE: QUERY PLAN:
>
> Aggregate (cost=7.84 rows=177 width=4)
> -> Seq Scan on Personnel (cost=7.84 rows=177 width=4)
>
> EXPLAIN
> test=>
>
> Ross
>
> On Thu, Oct 28, 1999 at 08:53:31PM +0100, Oliver Elphick wrote:
> > This is a Debian bug report, which needs upstream attention.
> >
> > ------- Forwarded Message
> >
> > Date: Thu, 28 Oct 1999 13:45:18 -0400
> > From: Brian Ristuccia <brianr(at)osiris(dot)978(dot)org>
> > To: submit(at)bugs(dot)debian(dot)org
> > Subject: Bug#48582: psql spends hours computing results it already knows
> >
> > Package: postgresql
> > Version: 6.5.2-3
> >
> > massive_db=> explain select count(*) from huge_table;
> > NOTICE: QUERY PLAN:
> >
> > Aggregate (cost=511.46 rows=9923 width=12)
> > -> Seq Scan on huge_table (cost=511.46 rows=9923 width=12)
> >
> > EXPLAIN
> >
> > If huge_table really is huge -- like 9,000,000 rows instead of 9923, after
> > postgresql already knows the number of rows (that's how it determines the
> > cost), it proceeds to do a very long and CPU/IO intensive seq scan to
> > determine the count().
> >
> > - --
> > Brian Ristuccia
> > brianr(at)osiris(dot)978(dot)org
> > bristucc(at)nortelnetworks(dot)com
> > bristucc(at)cs(dot)uml(dot)edu
> >

--
Brian Ristuccia
brianr(at)osiris(dot)978(dot)org
bristucc(at)nortelnetworks(dot)com
bristucc(at)cs(dot)uml(dot)edu

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 1999-10-28 22:58:12 Re: [HACKERS] psql Week 4.142857
Previous Message Ross J. Reedstrom 1999-10-28 20:32:17 Re: [HACKERS] Bug#48582: psql spends hours computing results it already knows (fwd)