Quick Links

Again, sorry, caching.

From:	mlw <markw(at)mohawksoft(dot)com>
To:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Again, sorry, caching.
Date:	2002-03-16 14:01:28
Message-ID:	3C935038.BE59E93B@mohawksoft.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

I traded a couple emails with a guy using one of my open source projects. To
make a long story short, he is going to the new version of MySQL for his
website because of the new caching feature. He is convinced that it will speed
up his web site, and he is probably right.

On a web site, a few specific queries get executed, unchanged, repeatedly.
Think about an ecommerce site, most of the time it is just a handful of basic
queries. These basic queries are usually against pretty large product tables. A
caching mechanism would make these queries pretty light weight.

The arguments against caching:

"It is an application issue"
This is completely wrong. Caching can not be done against a database without
knowledge of the database, i.e. when the data changes.

"If it is mostly static data, why not just make it a static page?"
Because a static page is a maintenance nightmare. One uses a database in a web
site to allow content to be changed and upgraded dynamically and with a minimum
of work.

"It isn't very useful"
I disagree completely. A cache of most frequently used queries, or specific
ones, could make for REALLY good performance in some very specific, but very
common, applications. Any system that has a hierarchical "drill down" interface
to a data set, ecommerce, libraries, document management systems, etc. will
greatly benefit from a query cache.

I was thinking that it could be implemented as a keyword or comment in a query.
Such as:

select * from table where column = 'foo' cacheable
or
select * from table where column = 'bar' /* cacheable */

Either way, it would speed up a lot of common application types. It would even
be very cool if you could just cache the results of sub queries, such as:

select * from (select * from table where col1 = 'foo' cacheable) as subset
where subset.col2 = 'bar' ;

Which would mean that the subquery gets cached, but the greater select need not
be. The cache could be like a global temp table. Perhaps the user could even
name the cache entry:

select * from table where column = 'foo' cache on foo

Where one could also do:

select * from cache_foo

Using a keyword is probably a better idea, it can be picked up by the parser
and instruct PostgreSQL to use the cache, otherwise there will be no additional
overhead.

Having caching within PostgreSQL will be good for data integrity. Application
caches can't tell when an update/delete/insert happens, they often have to use
a time-out mechanism.

OK, let me have it, tell me how terrible an idea this is. tell me how wrong I
am.

Responses

Re: Again, sorry, caching. at 2002-03-16 14:28:33 from Greg Copeland
Re: Again, sorry, caching. at 2002-03-16 14:39:31 from Greg Copeland
Re: Again, sorry, caching. at 2002-03-16 17:26:43 from Stephan Szabo
Re: Again, sorry, caching. at 2002-03-18 06:01:09 from mlw
Re: Again, sorry, caching. at 2002-03-18 10:01:37 from Karel Zak
Re: Again, sorry, caching. at 2002-03-19 02:35:51 from Neil Conway

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Greg Copeland	2002-03-16 14:28:33	Re: Again, sorry, caching.
Previous Message	Vadim Mikheev	2002-03-16 06:04:09	Re: [BUGS] Bug #613: Sequence values fall back to previously chec