Why is this functional index not used?

From: "Klaus P(dot) Pieper" <kpi6288(at)gmail(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Why is this functional index not used?
Date: 2017-03-20 20:25:34
Message-ID: 007701d2a1b8$2190ae30$64b20a90$@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Given a country table and an order table:

CREATE TABLE g.country

(

-- inherited from table g.standard: oid uuid NOT NULL,

-- … some more columns inherited…

lisocode integer NOT NULL, -- Numeric ISO 3166 code

szisoalpha2 character varying(2), -- The 2 letter country code

szisoalpha3 character varying(3), -- The 3 letter country code

szisonum3 character varying(3), -- The NUMERIC country code with leading
zeros

-- a few more columns here

CONSTRAINT country_pkey PRIMARY KEY (oid)

) INHERITS (g.standard)

WITH (

OIDS=FALSE

);

This table is filled with ISO 3166 country codes.

The order table:

CREATE TABLE t."order"

(

-- inherited from table g.standard: oid uuid NOT NULL,

-- … some more columns inherited…

szxmladdress text,

-- many more columns in this table

CONSTRAINT country_pkey PRIMARY KEY (oid)

) INHERITS (g.standard)

WITH (

OIDS=FALSE

);

A typical entry in t."order".szxmladdress looks like

<address entityType="Person" country="DE">

<field name="Address.oid">ae0eb84f-9b8b-4fef-b87a-d6757bdfeaf9</field>

<field
name="Address.uaddrtyperef">0bbdb48c-21c7-429e-944e-59a4d9ace9d5</field>

<field name="Address.szstreet">Hauptstraße</field>

<!—some more xml tags here -->

</address>

No XML field in the order table exceeds 2kB.

Getting the 2 letter country code from the xml address by this function:

CREATE OR REPLACE FUNCTION g.getxmladdresscountryid(xaddr text)

RETURNS uuid AS

$BODY$BEGIN

RETURN (SELECT oID FROM G.Country WHERE szIsoAlpha2 =
array_to_string(xpath('/address/@country', xAddr::xml), '') ORDER BY
lIsoCode LIMIT 1);

END$BODY$

LANGUAGE plpgsql IMMUTABLE;

I know that this function is not really IMMUTABLE but the country table is
changing only every several years.

Created a functional index on the order table:

CREATE INDEX order_getxmladdresscountryid_fidx

ON t."order"

USING btree

(g.getxmladdresscountryid(szxmladdress));

Joining order and country table limiting to 10 rows uses the index:

explain analyse

SELECT

GCountry.szISOAlpha2,

GCountry.szISOAlpha3,

GCountry.szISONum3

from

t.order torder

left join G.Country GCountry ON
G.GetXmlAddressCountryID(TOrder.szXmlAddress) = GCountry.oID

limit 10

Gives:

Limit (cost=0.56..8.45 rows=10 width=11) (actual time=0.644..4.764 rows=10
loops=1)

-> Merge Right Join (cost=0.56..10670.45 rows=13517 width=11) (actual
time=0.642..4.754 rows=10 loops=1)

Merge Cond: (gcountry.oid =
g.getxmladdresscountryid(torder.szxmladdress))

-> Index Scan using country_pkey on country gcountry
(cost=0.27..38.05 rows=258 width=27) (actual time=0.025..0.067 rows=32
loops=1)

-> Index Scan using order_getxmladdresscountryid_fidx on "order"
torder (cost=0.29..7019.04 rows=13517 width=366) (actual time=0.020..0.058
rows=10 loops=1)

Planning time: 0.603 ms

Execution time: 4.898 ms

But when I remove the “limit 10”, the index is no longer used:

Hash Left Join (cost=17.81..5397.46 rows=13517 width=11) (actual
time=0.941..4721.372 rows=13510 loops=1)

Hash Cond: (g.getxmladdresscountryid(torder.szxmladdress) = gcountry.oid)

-> Seq Scan on "order" torder (cost=0.00..3504.17 rows=13517 width=366)
(actual time=0.011..27.542 rows=13510 loops=1)

-> Hash (cost=14.58..14.58 rows=258 width=27) (actual time=0.427..0.427
rows=258 loops=1)

Buckets: 1024 Batches: 1 Memory Usage: 23kB

-> Seq Scan on country gcountry (cost=0.00..14.58 rows=258
width=27) (actual time=0.008..0.226 rows=258 loops=1)

Planning time: 0.580 ms

Execution time: 4728.602 ms

4.8 seconds - I would expect a much faster query using a functional index.

Even a “limit 100” does not use the index any more.

Just a side note: without “limit 10” the query needs 4728 ms - almost a
linear increase from 10 rows to 13500 rows.

I played around with COST of the function between 1 and 20000 and with
several options on postgresql.conf without luck.

A sequential scan on the order table alone is carried out in 15 ms.

Thanks for any idea.

Klaus

Responses

Browse pgsql-general by date

  From Date Subject
Next Message lin 2017-03-21 03:08:48 Re: how can I use "pg_basebackup" and not include pg_log of master ?
Previous Message Merlin Moncure 2017-03-20 20:08:14 Re: Advice about software engineering inside Postgres?