Re: Caching Websites

From:	Adam Kessel <adam(at)bostoncoop(dot)net>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Caching Websites
Date:	2003-05-12 13:52:08
Message-ID:	20030512135208.GA24827@bostoncoop.net
Lists:	pgsql-general

Someone else suggested using a 'large object', which I didn't know about:

http://www.postgresql.org/docs/view.php?version=7.3&idoc=1&file=largeobjects.html

It sounds like a large object is almost the same as storing files and
paths to those files, with stricter data integrity.

I don't really plan on doing any database operations with the contents of
these large objects--all manipulations will be done in Python once the
data is retrieved. But it still seems cleaner to not have to maintain
two parallel storage systems (database and filesystem) and make sure they
don't get out of sync.
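
As a rough illustration of the "one storage system" idea, here is a minimal sketch in which the pickled site lives in the same row as its URL, so there is no separate file that could drift out of sync. The table name site_cache and the helpers store_site and load_site are hypothetical, not anything from this thread. Python's built-in sqlite3 module is used purely as a stand-in so the sketch runs without a server; it is not PostgreSQL, and the PostgreSQL large object interface is a separate API that is not shown here.

```python
import pickle
import sqlite3

# Assumption: sqlite3 stands in for the real database so this runs anywhere.
# The table and function names are hypothetical, chosen for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE site_cache (url TEXT PRIMARY KEY, body BLOB NOT NULL)"
)


def store_site(conn, url, site):
    """Pickle the site and insert or replace its row in one transaction."""
    blob = pickle.dumps(site)
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT OR REPLACE INTO site_cache (url, body) VALUES (?, ?)",
            (url, blob),
        )


def load_site(conn, url):
    """Return the unpickled site, or None if the URL is not cached."""
    row = conn.execute(
        "SELECT body FROM site_cache WHERE url = ?", (url,)
    ).fetchone()
    return None if row is None else pickle.loads(row[0])


# Example: cache one (made-up) page, to be read back later by URL.
store_site(conn, "http://example.org/", {"html": "<html>hello</html>", "status": 200})
```

Because the URL and the pickled data are written in a single transaction, a failed write leaves neither behind; with a URL-to-filename table, the row and the file would have to be kept consistent by hand.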

Based on the documentation, I don't immediately see any disadvantage to
using these large objects--does anyone else see why I might not want to
store archived websites in large objects?

--Adam Kessel

On Mon, May 12, 2003 at 09:39:19AM +0100, Richard Huxton wrote:
> On Friday 09 May 2003 9:48 pm, Adam Kessel wrote:
> > I am wondering whether it would be better to store each website in a
> > record in a table, or instead have a table which links URLs to filenames
> > (the file would contain the pickled website). The sites will of course
> > vary greatly in size, but typically be between 1k and 200k (I probably
> > won't store anything bigger than that).
> >
> > This seems like a simple question, and I suspect there's an obvious
> > answer for which data storage method makes more sense, I just don't know
> > how to go about researching that. What would be the considerations for
> > using one method of data storage vs. the other?
> >
> > Any suggestions for me?
> Not that simple a question - look back through the archives for plenty of
> discussions (usually regarding images).
>
> My personal approach is to ask myself whether I'm going to access/process the
> data in any way. Basically if I want to do any of:
> 1. query the large data
> 2. summarise it
> 3. have transaction-based update control
> then I'll store it in the database. If not, I'll store a path to the file.
