Re: shared data for different applications

From: Alban Hertroys <dalroi(at)solfertje(dot)student(dot)utwente(dot)nl>
To: Kent Tong <kent(at)cpttm(dot)org(dot)mo>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: shared data for different applications
Date: 2010-11-21 13:51:11
Message-ID: 87824376-9537-47E9-ABEF-1BDA6C0CEEB7@solfertje.student.utwente.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 21 Nov 2010, at 5:08, Kent Tong wrote:

> Hi,
>
> Let's say that there is some data that should be logically shared by
> many applications in the company, such as some core information about
> its customers (name, address, contact info). In principle, such data
> should be stored in a DB for sharing.
>
> But then if a certain application needs to access such shared data, it
> will need to access its own DB plus this shared DB, which is
> troublesome and requires distributed transactions.

I think most companies have ended up at that point just by the progress of time. They have several different databases (often from different vendors even) that they need to aggregate their information from.

Usually the different databases contain different kinds of contents, so the need for distributed transactions and such is quite minimal. Where there is a need to keep relational integrity, the related data is usually in one big central database. You'd be amazed how much a database like that can handle!

That said, separating things out would be an improvement. For example, for generating reports (big aggregations of many data sources at once), you tend to generate heavy-weight queries that are likely to impact other queries (unless your database server is capable of limiting I/O and such per process, that helps some).

> An alternative is to store everything into a single DB and use, say,
> schemas to separate the data. The shared data can be put into one
> schema and that application can have its own schema. This eliminates
> the need for distributed transactions but then you'll have a single
> DB for the whole company! This creates problems in:
> 1) availability: if the DB is down, everything is down.
> 2) maintenance: if we need to deploy a new application, we'll need to
> create a new schema in that single DB, potentially impacting other
> apps.
> 3) performance: all apps are access that single DB server.
>
> I was wondering in practice, how people address this problem?

You limit access to who can touch what and you replicate.

For example, the people in your sales department will need to be able to add and modify customer information, but the guys in the IT department don't need to. So the first group gets access to a database server where the customer database is a master, while the others get access to a read-only slave.

I wouldn't go so far as to create a separate database for every business unit though, the distinction is more a role-based one than an organisational one - there will be overlap between who has access to what.

That said, unless you're in a very large company, a central database will probably do for almost anything. The exceptions are more likely to be among the lines of reporting and frequent short but specialised queries for, for example, machine performance statistics.

At the company where I work we have a central DB2/400 database for production, shipping and sales. That database gets replicated for reporting. We also have a few separate MS SQL databases where for example machines on our production facility send their performance statistics, which they get back in an aggregated form every 5 minutes or so.

It isn't ideal, but that 20-year old DB2/400 database (although the hardware's been upgraded to i5-series or so) can handle its load just fine.

I suspect that Postgres would actually perform better, but you can't just switch a big 24/7 company from one database to another (provided I had anything to say about it at all, which I don't). That's an expensive, time-consuming and risky process.
Just saying, I don't know from experience how well Postgres would fare there, as it's not what we're using. I have no reason to suspect it to perform less well though.

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.

!DSPAM:737,4ce923f910421136214443!

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Magnus Hagander 2010-11-21 14:15:44 Re: Fwd: Postgres forums ... take 2
Previous Message Alban Hertroys 2010-11-21 13:04:53 Re: [pgsql-www] Forums at postgresql.com.au