Re: Geographic data sources, queries and questions

From: "John D(dot) Burger" <john(at)mitre(dot)org>
To: pgsql-list(at)nullmx(dot)com
Cc: PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Geographic data sources, queries and questions
Date: 2007-05-28 15:27:55
Message-ID: AE774BBE-4502-4BAD-B82F-8E34C21FC190@mitre.org
Lists: pgsql-general

Chuck D. wrote:

> I decided to put together the USGS stuff, the maxmind free stuff and the
> GeoNames project files and in the end I had countries with no states, states
> with no cities and cities with no states. Some data sources said a country
> had 40 states, another said it had 50. It was difficult to try and figure
> out because I don't know geo stuff enough to verify it.

Yeah, all of our source data had varying degrees of noise. There
were even locations mis-typed as =countries= in the official NGA
downloads - you'd think their validation would at least catch
spurious countries :). We developed a set of heuristics for deciding
when two locations (usually, but not always, from two different
sources) were in fact the same entity. That area still needed more
work when the project ended, however. In addition, different
sources had made different ontological decisions about what was
what. For instance, does the US have 50 states, or do the US
Virgin Islands and similar territories count as well?
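
To give a rough idea of what a same-entity heuristic of this kind might
look like, here is a minimal sketch in Python. It is purely illustrative
and is not the set of rules the project actually used: the record fields
(name, country, lat, lon), the function names, and the 10 km threshold
are all assumptions made up for the example. It treats two records as the
same place when the country codes agree, the names match after accent and
case folding, and the coordinates lie within a small distance of each
other.

```python
import math
import unicodedata


def normalize_name(name):
    """Fold case, strip accents, and collapse whitespace (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.casefold().split())


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def probably_same_place(a, b, max_km=10.0):
    """Assumed rule: same country, same normalized name, and close together."""
    if a["country"] != b["country"]:
        return False
    if normalize_name(a["name"]) != normalize_name(b["name"]):
        return False
    return haversine_km(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_km
```

A rule this strict would miss alternate spellings and transliterations, so
a real system would likely need fuzzier name matching and per-source
tuning of the distance threshold.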

This was a few years ago - if we were to start up again, I suspect we
would investigate working with whoever is behind geonames.org, as
they seem to have the same kind of goals we did. Anyway, I will send
our schema under separate cover, and I will investigate sending you
the data as well.

- John D. Burger
MITRE
