Efficient Query For Mapping IP Addresses To Country Code.

From: "Rajesh Kumar Mallah(dot)" <mallah(at)trade-india(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Efficient Query For Mapping IP Addresses To Country Code.
Date: 2002-06-26 15:00:23
Message-ID: 200206262030.23395.mallah@trade-india.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hi folks,

the problem is to update one table by querying another.

i have a table where i store apache access logs where one of the fields is the host ip address.
i need to find corresponding country for all the ip addrresses.

for this i have another table that contains apnic,arin and ripe databases
in the form of:

Table "ip_country_map"
Column | Type | Modifiers
----------+--------------+-----------
start_ip | inet |
end_ip | inet |
country | character(2) |
Indexes: end_ip_idx,
start_ip_idx

I need to update the accees log's country field by
searching the ip in ip_country_map for country.
i have follwing three alternatives , all seems to be slow.

1 st. (based on implicit join)
-------------
explain UPDATE access_log_2002_06_25 set country=ip_country_map.country where host_ip between
ip_country_map.start_ip and ip_country_map.end_ip;

Nested Loop (cost=0.00..1711037.55 rows=5428333 width=563)
-> Seq Scan on ip_country_map (cost=0.00..1112.55 rows=48855 width=70)
-> Seq Scan on access_log_2002_06_25 (cost=0.00..20.00 rows=1000 width=493)
---------------

2nd (based on subselect)
---------------
explain UPDATE access_log_2002_06_25 set country=(select country from ip_country_map where access_log_2002_06_25.host_ip
between start_ip and end_ip);
NOTICE: QUERY PLAN:
Seq Scan on access_log_2002_06_25 (cost=0.00..20.00 rows=1000 width=493)
SubPlan
-> Seq Scan on ip_country_map (cost=0.00..1356.83 rows=5428 width=6)

EXPLAIN
----------------

3 rd (do not update country field at all just join both the table)
--------------------------------------------------------------------
explain SELECT host_ip,ip_country_map.country from access_log_2002_06_25 join ip_country_map on
( host_ip between start_ip and end_ip) ;
NOTICE: QUERY PLAN:

Nested Loop (cost=0.00..1711037.55 rows=5428333 width=102)
-> Seq Scan on ip_country_map (cost=0.00..1112.55 rows=48855 width=70)
-> Seq Scan on access_log_2002_06_25 (cost=0.00..20.00 rows=1000 width=32)

EXPLAIN
--------------------------------------------------------------------

Yet Another option
----------------------------------------------------------------------
while loading access_log from file into database i do a select on ip_country_map.

also even a simple query like do not use indexes.

access_log=# explain SELECT country from ip_country_map where start_ip <= '203.196.129.1' and end_ip >= '203.196.129.1';
NOTICE: QUERY PLAN:

Seq Scan on ip_country_map (cost=0.00..1356.83 rows=5428 width=6)

EXPLAIN
access_log=# explain SELECT country from ip_country_map where '203.196.129.1' between start_ip and end_ip;
NOTICE: QUERY PLAN:

Seq Scan on ip_country_map (cost=0.00..1356.83 rows=5428 width=6)

EXPLAIN
access_log=#

IS THERE ANYTHING woring with my database schema?
how shud i be storing the the data of ipranges and
country for efficient utilization in this problem.

regds

Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L) ,9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.

Browse pgsql-sql by date

  From Date Subject
Next Message Rajesh Kumar Mallah. 2002-06-26 15:18:05 Re: sequence chages after firing update
Previous Message Stephan Szabo 2002-06-26 14:46:01 Re: sequence chages after firing update