Re: Complex database for testing, U.S. Census Tiger/UA

From: Dustin Sallings <dustin(at)spy(dot)net>
To: cbbrowne(at)cbbrowne(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Complex database for testing, U.S. Census Tiger/UA
Date: 2003-04-08 16:35:10
Message-ID: Pine.SGI.4.50.0304080931430.26785-100000@bleu.west.spy.net
Lists: pgsql-hackers

Around 11:24 on Apr 8, 2003, cbbrowne(at)cbbrowne(dot)com said:

I think the first application I ever wrote in Python was one that parsed
the zip files containing this data and shoved it into a Postgres system.
I had multiple clients on four or five computers running nonstop for about
two weeks to get it all populated.
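
For illustration only (this is not the original program, which is not shown
in this message): a minimal sketch of the "walk the zip files and hand the
lines to a loader" step, using only the Python standard library. The names
iter_records and batched, the latin-1 decoding, and the batch size are all
assumptions made for the example; parsing of the actual TIGER record layout
and the database insert itself are deliberately left out.

```python
import io
import zipfile
from itertools import islice


def iter_records(zip_source):
    """Yield (member_name, line) for each non-empty line in each archive member.

    zip_source may be a path or a file-like object (hypothetical helper).
    """
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            with archive.open(name) as member:
                for raw in member:
                    # latin-1 is an assumption; it never fails to decode bytes
                    line = raw.decode("latin-1").rstrip("\r\n")
                    if line:
                        yield name, line


def batched(records, size):
    """Group an iterator of records into lists of at most `size` items."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # Build a tiny archive in memory so the sketch runs without real data.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("sample.txt", "first line\nsecond line\n")
    buf.seek(0)
    for chunk in batched(iter_records(buf), 1000):
        # A real loader would insert each chunk in one transaction here.
        print(len(chunk), "records ready to load")
```

Handing rows to the database in batches rather than one at a time is the
usual way to keep a load like this from being dominated by per-statement
overhead, though how much it would have helped here is a guess.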

By the time I was done and had created my first index, I was starting to
run out of disk space. I think I only had about 70GB to work with on the
RAID array.

# Jan Wieck wrote:
# > mlw wrote:
# > >
# > > The U.S. Census provides a database of street polygons and other data
# > > about landmarks, elevation, etc. This was discussed in a separate thread.
# > >
# > > The main URL is here:
# > > http://www.census.gov/geo/www/tiger/index.html
# >
# > While yes, the tiger database (or better its content) is interesting, I
# > don't think that it can be counted as a "complex database". Just that
# > something is big doesn't mean that.
#
# Just so.
#
# There are doubtless interesting cases that may be tested by virtue of
# having a data set that is large, and perhaps "deeply interlinked."
#
# But that only covers cases that have to do with "largeness." It doesn't
# help ensure that PostgreSQL plays well when it gets hit by nested sets
# of updates where the challenges involve ensuring the system performs OK
# and does not deadlock when hit by complex sets of transactions.
#
# So that an "interesting" database might involve not only a database, but
# also a set of transactions that hit multiple tables that are to update
# that database. In effect, something like the "readers/writers" that get
# used to test locking semantics.
#
# This is something that would not be able to solely consist of a set of
# tables; it would have to include streams of updates. Something like one
# of the TPC benchmarks...
# --
# output = reverse("moc.enworbbc@" "enworbbc")
# http://www3.sympatico.ca/cbbrowne/rdbms.html
# "If I could find a way to get [Saddam Hussein] out of there, even
# putting a contract out on him, if the CIA still did that sort of a
# thing, assuming it ever did, I would be for it." -- Richard M. Nixon

--
SPY My girlfriend asked me which one I like better.
pub 1024/3CAE01D5 1994/11/03 Dustin Sallings <dustin(at)spy(dot)net>
| Key fingerprint = 87 02 57 08 02 D0 DA D6 C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________
