Distributively cache large, read-only lists of data with cache updates

From: Patrick Bakker <patrick(at)vanbelle(dot)com>
To: "PostgreSQL General (E-mail)" <pgsql-general(at)postgresql(dot)org>
Subject: Distributively cache large, read-only lists of data with cache updates
Date: 2002-08-16 18:07:00
Message-ID: 000301c2454f$b80e8f10$3a0a0a0a@20PVBL3
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello.
I'm wondering if anybody has suggestions for how to distributively cache
large (ie. in the thousands of records) read-only lists of data on each
client and update these caches after inserts, updates and deletes.
Obviously, I don't want to extend the scope of insert/update/delete
transactions to include updating every cache of the data but I would like
the updates to happen as soon as they can without adversely impacting
performance too much. Basically I'm looking for a non-blocking, low priority
update mechanism with reasonable update times.
In addition, the solution should fit cleanly within a Java, JBoss 3.0.x and
PostgreSQL 7.2 (probably 7.3 by the time it comes out) environment.

I'd like to use this for updating client-side selection lists which will be
a few columns (ie. code & name) from every row in certain tables (ie. active
inventory items or open orders or active customers). These selection lists
will always be visible on-screen so that the current selection can be
changed very quickly. Each client may have a different table currently
showing in the selection list, so not all clients need to have the same
updates sent to them. In addition each client may have several different
table selection lists open in different views.

Well I haven't implemented the selection lists yet, I imagine that since
they will be always visible it is possible that the list could be used
without a refresh for an entire day if there is no cache update mechanism.

I've briefly considered the following cache update mechanisms:

(-) Simple
Manually refresh the lists when indicated by the user or after some time has
elapsed automatically refresh the selection list.

(-) Messaging mechanism
- On each insert/delete on tables which can appear in a selection list send
out a non-blocking message.
- On each update to a field which can appear in a selection list send out a
non-blocking message.
o The updates could be PostgreSQL notify, or trigger or JBoss JMS message.
- The listener (ie. within PostgreSQL or a message-driven bean within JBoss)
could then publish update notifications to clients.
- The clients would incrementally refresh their lists (or if some threshold
of updates has passed refresh the entire list).

(-) P2P updating
- Each client broadcasts and publishes its inserts/updates/deletes to other
clients which also pass on the message, etc. so that eventually all updates
are passed around.

Anybody else have comments on how they have handled something similar? I
suppose I'm looking a little for which type of solution people have had good
experience with so that I can investigate the possibilities further. For
instance, the P2P description given above is really vague on details but if
people have had good success with that type of update mechanism I'll explore
a precise mechanism of how to do it more fully.

Another related question is how multiple selections can best be handled.
Since I plan on having the selection list always visible and allowing the
user to select multiple items from the list at any time, I need to be able
to retrieve data based on a list of primary keys. I'm not sure on how I
would do this yet within the J2EE EJB environment but I imagine that it
would naturally fall to using an IN statement to list the primary keys. I
have two concerns with that. First, I have noticed that the Visual Foxpro
ADO/ODBC driver on Windows only allows a limited number of items in the IN
statement (approx. 15 I think) and I wonder does PostgreSQL have a similar
limitation? Secondly, I think I remember reading on this list that IN is
slow on PostgreSQL. Is this correct? The other option I am considering is to
persist the user's selection list in the database and then use it in a join
to retrieve the appropriate rows.

Patrick

Browse pgsql-general by date

  From Date Subject
Next Message Ian Harding 2002-08-16 18:31:45 RHDB Explain
Previous Message Bryan White 2002-08-16 17:22:08 query planner oddity for data constant vs current_date