Re: How to query and index for customer with lastname

From: Joost Kraaijeveld <J(dot)Kraaijeveld(at)Askesis(dot)nl>
To: hubert depesz lubaczewski <depesz(at)gmail(dot)com>
Cc: Pgsql-Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: How to query and index for customer with lastname
Date: 2006-03-04 14:18:23
Message-ID: 1141481903.10078.14.camel@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Hubert,

On Sat, 2006-03-04 at 14:49 +0100, hubert depesz lubaczewski wrote:
> > Sort (cost=54710.68..54954.39 rows=97484 width=111) (actual
> > time=7398.971..7680.405 rows=96041 loops=1)
> > Sort Key: btrim(upper(customers.lastname)), btrim(upper(addresses.city))
> > -> Hash Join (cost=14341.12..46632.73 rows=97484 width=111) (actual time=1068.862..5472.788 rows=96041 loops=1)
> > Hash Cond: ("outer".contactaddress = "inner".objectid)
> > -> Seq Scan on customers (cost=0.00..24094.01 rows=227197 width=116) (actual time=0.018..1902.646 rows=223990 loops=1)
> > Filter: (btrim(upper(lastname)) >= 'JANSEN'::text)
> > -> Hash (cost=13944.94..13944.94 rows=158473 width=75) (actual time=1068.467..1068.467 rows=158003 loops=1)
> > -> Bitmap Heap Scan on addresses (cost=1189.66..13944.94 rows=158473 width=75) (actual time=71.259..530.986 rows=158003 loops=1)
> > Recheck Cond: (btrim(upper(city)) >= 'NIJMEGEN'::text)
> > -> Bitmap Index Scan on prototype_addresses_trim_upper_city (cost=0.00..1189.66 rows=158473 width=0) (actual time=68.290..68.290 rows=158003 loops=1)
> > Index Cond: (btrim(upper(city)) >=> 'NIJMEGEN'::text)
> > Total runtime: 7941.095 ms
>
> explain clearly shows, that index is used for addresses scan, but it
Yes, but I do not understand why I have both a "Bitmap Index Scan" and
a "Bitmap Heap Scan" on (btrim(upper(city)) >=> 'NIJMEGEN'::text)?

> is not so for users.
> explain estimates that 227197 customers match the lastname criteria -
> which looks awfuly high.
> how many record do you have in the customers table?
368915 of which 222465 actually meet the condition.

>From what I understand from the mailing list, PostgreSQL prefers a table
scan whenever it expects that the number of records in the resultset
will be ~ > 10 % of the total number of records in the table. Which
explains the table scan for customers, but than again, it does not
explain why it uses the index on addresses: it has 369337 addresses of
which 158003 meet the condition

> i would try to create index test on customers(contactAddress,
> trim(uppercase(lastname)));
> or with other ordring of fields.
>
> try this - create the index, make analyze of customers table, and
> recheck explain.
> then try the second index in the same manner.
Makes no difference.

--
Groeten,

Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
e-mail: J(dot)Kraaijeveld(at)Askesis(dot)nl
web: www.askesis.nl

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message hubert depesz lubaczewski 2006-03-04 14:23:08 Re: How to query and index for customer with lastname and city
Previous Message hubert depesz lubaczewski 2006-03-04 13:49:44 Re: How to query and index for customer with lastname and city