Quick Links

Re: Indexes?

From:	Martijn van Oosterhout <kleptog(at)svana(dot)org>
To:	Mike Leahy <mgleahy(at)fes(dot)uwaterloo(dot)ca>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Indexes?
Date:	2003-10-15 02:28:13
Message-ID:	20031015022813.GA22182@svana.org
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

In general, the <> operator can't be indexed since it will match most rows,
in which case you're better off scanning the table anyway.

If you really run that query often, perhaps the answer is a partial index.

On Tue, Oct 14, 2003 at 09:57:05PM -0400, Mike Leahy wrote:
> Hello all,
>
> I have a question related to the use of indexes. One of my tables is part of
> a census, with over 2.5 million records in it. Clearly, just about any query
> takes a rather long time to complete. I was hoping to reduce the amount of
> time by using indexes. However, no matter what I do, PostgreSQL never seems
> to use them. It would seem to make sense that if I calculate something
> grouped by a set of variables that have an index created for them, that the
> index should be used rather than a sequential search, but it doesn't. I have
> a table with an index such as the one created below:
>
> CREATE INDEX tbl_censo_poblacion_1993_manzana_idx ON tbl_censo_poblacion_1993
> ( dubicacion, zona, manzana );
>
> Then I try a query such as:
>
> select count(*) as POBLACION, (dubicacion || zona || manzana) as COD_MANZANA
> from tbl_censo_poblacion_1993 group by dubicacion, zona, manzana;
>
> The results from explain indicate that a sequential scan is used (as far as I
> can tell). I tried adding where statement:
>
> select count(*) as POBLACION, (dubicacion || zona || manzana) as COD_MANZANA
> from tbl_censo_poblacion_1993 where dubicacion <> '' and zona <> '' and
> manzana <> '' group by dubicacion, zona, manzana;
>
> The EXPLAIN analysis results appear as follows:
>
> Aggregate (cost=847587.90..879024.28 rows=251491 width=27) (actual
> time=272782.00..279458.00 rows=21459 loops=1)
> -> Group (cost=847587.90..872737.01 rows=2514911 width=27) (actual
> time=272782.00..278546.00 rows=2553015 loops=1)
> -> Sort (cost=847587.90..853875.18 rows=2514911 width=27) (actual
> time=272782.00..274533.00 rows=2553015 loop
> s=1)
> Sort Key: dubicacion, zona, manzana
> -> Seq Scan on tbl_censo_poblacion_1993 (cost=0.00..328346.76
> rows=2514911 width=27) (actual time=0.00. .189570.00 rows=2553015 loops=1)
> Filter: ((dubicacion <> ''::character varying) AND (zona
> <> ''::character varying) AND (manzana <> ''::character varying)) Total
> runtime: 279494.00 msec (7 rows)
>
>
> So...can anyone suggest to me what I could do? I'm using PostgreSQL 7.3.4-2
> in Cygwin on a WinXP platform.
>
> As an aside, is there any way to increase the amount of memory allocated to
> the postmaster.exe process? It seems to me that if I could add more than 4MB
> that it has by default, then maybe that could increase the performance.
>
> Any help is appreciated.
>
> Kind regards,
> Mike
>
>
>
>
> ----------------------------------------
> This mail sent through www.mywaterloo.ca
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster

--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> "All that is needed for the forces of evil to triumph is for enough good
> men to do nothing." - Edmond Burke
> "The penalty good people pay for not being interested in politics is to be
> governed by people worse than themselves." - Plato

In response to

Indexes? at 2003-10-15 01:57:05 from Mike Leahy

Browse pgsql-general by date

	From	Date	Subject
Next Message	Greg Stark	2003-10-15 03:08:39	Re: Question
Previous Message	Mike Leahy	2003-10-15 01:57:05	Indexes?