From: | Philip Hallstrom <philip(at)adhesivemedia(dot)com> |
---|---|
To: | Matt Price <matt(dot)price(at)utoronto(dot)ca> |
Cc: | pgsql-novice(at)postgresql(dot)org |
Subject: | Re: web archiving |
Date: | 2002-07-10 22:21:55 |
Message-ID: | 20020710152041.G672-100000@cypress.adhesivemedia.com |
Lists: | pgsql-general pgsql-novice |
Not to discourage you from using postgresql or writing it yourself, but
you might want to take a look at wget (for downloading the web pages) and
mnoGoSearch or htdig for searching them.
mnoGoSearch supports postgresql and has a PHP interface so you can have fun
with that...
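
To make the wget suggestion a little more concrete, here is a rough sketch in Python of how a script might assemble a wget call that grabs one page together with its images and stylesheets. The function name build_wget_command, the example URL, and the "archive" directory are made up for illustration; only the wget options themselves (--page-requisites, --convert-links, --directory-prefix) are real. Treat it as a starting point, not a finished archiver.

```python
import shlex


def build_wget_command(url, dest_dir):
    """Return the argument list for fetching one page plus its requisites.

    Hypothetical helper: the name and the destination layout are assumptions.
    """
    return [
        "wget",
        "--page-requisites",             # also fetch images, CSS, and similar files
        "--convert-links",               # rewrite links so the saved copy works offline
        "--directory-prefix", dest_dir,  # save everything under dest_dir
        url,
    ]


if __name__ == "__main__":
    # Example URL and directory are placeholders.
    cmd = build_wget_command("http://example.com/article.html", "archive")
    # Print the command as it would be typed in a shell.
    print(shlex.join(cmd))
```

The list can be handed to subprocess.run(cmd, check=True) to actually run the download. Building it as a list rather than a single string keeps a user-supplied URL from being interpreted by the shell.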
On 10 Jul 2002, Matt Price wrote:
> Hi there,
>
> I've just moved up from non-free os's to debian linux, and installed
> postgresql, with the hope of getting started on some projects I've been
> thinking about. Several of these projects involve web archives. The
> idea is, a url is entered with a bunch of bibliographic-type data in
> other fields (keywords, author, date, etc). The html (and hopefully,
> accompanying images/css's/etc) are then grabbed using curl, and archived
> in a postgresql database. A web or other gui interface then provides
> fully-searchable access to the archive for later use.
>
> So my question: does anyone know of a similar tool which already
> exists? I'm a complete novice at database programming (and at php, too,
> which is what I figured I'd use as the scripting language, though I'd
> consider learning perl or java if folks think that's a much better
> idea), and I'd rather work with some pre-existing code than start from
> the ground up. Any suggestions? Is this the right list to be asking
> this question on?
>
> Thanks loads,
> Matt