web archiving

From: Matt Price <matt(dot)price(at)utoronto(dot)ca>
To: pgsql-novice(at)postgresql(dot)org
Subject: web archiving
Date: 2002-07-10 20:59:00
Message-ID: 1026334740.699.82.camel@anarres
Lists: pgsql-general pgsql-novice

Hi there,

I've just moved from non-free OSes to Debian Linux and installed
PostgreSQL, hoping to get started on some projects I've been
thinking about. Several of these projects involve web archives. The
idea is that a URL is entered along with a bunch of bibliographic-type
data in other fields (keywords, author, date, etc.). The HTML (and,
hopefully, the accompanying images, CSS files, etc.) is then grabbed
using curl and archived in a PostgreSQL database. A web or other GUI
then provides fully searchable access to the archive for later use.
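
To make the idea concrete, here is a rough sketch of the fetch, store, and search steps. It is only an illustration under several assumptions: it uses Python rather than PHP, the standard-library urllib in place of curl, and an in-memory SQLite database as a stand-in for PostgreSQL so that it runs without a server. The table name `archive`, its columns, and the function names are all made up for this example.

```python
import sqlite3
import urllib.request


def fetch_html(url):
    """Download a page body as text (urllib here stands in for curl)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def open_archive(path=":memory:"):
    """Open the archive database (SQLite as a stand-in for PostgreSQL)."""
    conn = sqlite3.connect(path)
    # Hypothetical schema: one row per archived page plus its metadata.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS archive (
               id        INTEGER PRIMARY KEY,
               url       TEXT NOT NULL,
               author    TEXT,
               keywords  TEXT,
               page_date TEXT,
               html      TEXT NOT NULL
           )"""
    )
    return conn


def archive_page(conn, url, author, keywords, page_date, fetch=fetch_html):
    """Fetch a URL and store its HTML alongside the bibliographic fields."""
    html = fetch(url)
    conn.execute(
        "INSERT INTO archive (url, author, keywords, page_date, html) "
        "VALUES (?, ?, ?, ?, ?)",
        (url, author, keywords, page_date, html),
    )
    conn.commit()


def search(conn, term):
    """Return URLs whose HTML, keywords, or author contain the term."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT url FROM archive "
        "WHERE html LIKE ? OR keywords LIKE ? OR author LIKE ? "
        "ORDER BY id",
        (pattern, pattern, pattern),
    )
    return [row[0] for row in rows]
```

A call such as `archive_page(conn, url, "Some Author", "history", "2002-07-10")` followed by `search(conn, "history")` covers the basic round trip. The substring match with LIKE is the simplest possible search; a real archive would probably want proper full-text indexing, and images and stylesheets would need their own table.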

So my question: does anyone know of a similar tool that already
exists? I'm a complete novice at database programming (and at PHP, too,
which is what I figured I'd use as the scripting language, though I'd
consider learning Perl or Java if folks think that's a much better
idea), and I'd rather work with some pre-existing code than start from
the ground up. Any suggestions? Is this the right list to be asking
this question on?

Thanks loads,
Matt
