Re: Design advice requested

From: Julian <tempura(at)internode(dot)on(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Design advice requested
Date: 2013-05-08 14:17:59
Message-ID: 518A5E97.8070208@internode.on.net
Lists: pgsql-general

On 08/05/13 21:21, Johann Spies wrote:

> Basically my request is for advice on how to make this database as
> fast as possible with as few instances of duplicated data while
> providing both for the updates on level 0 and value added editing on
> level 1.
>
> Regards
> Johann

Hi Johann.

Firstly, don't worry too much about speed in the design phase. There may
be differences of opinion here, but mine is that even in database
design the first fundamental layer is the relationship model. That is,
regardless of how the raw data is presented to you (CSV, raw text, other
relationship models or ideas), the same back-to-basics problem must be
solved: what is the most effective and efficient way of storing this
data that will allow for database flexibility and scalability (future
adaptation to new relationship models)?

Secondly, assuming the CSV and other raw data is in a flat (fat) table
format (containing columns of duplicate data), it's your job to determine
how to break it down into separate sections (tables) of data and how
they relate to each other (normalization): one to many, many to many,
etc. There are also other things to consider (e.g. data history,
revisions), but those are the basics.
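
As a minimal sketch of that breakdown (not your actual data, which I
haven't seen): the rows, table names and columns below are all
hypothetical, and Python's built-in sqlite3 module stands in for
PostgreSQL so the example runs on its own. A flat table that repeats the
customer on every line is split into a "customer" table and a "purchase"
table in a one-to-many relationship.

```python
import sqlite3

# Hypothetical flat ("fat") rows, as they might arrive from a CSV export:
# the customer's name and city are repeated on every purchase line.
flat_rows = [
    ("Alice", "Perth", "2013-05-01", "widget"),
    ("Alice", "Perth", "2013-05-03", "gadget"),
    ("Bob", "Cape Town", "2013-05-02", "widget"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per distinct customer.
    CREATE TABLE customer (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        city TEXT NOT NULL,
        UNIQUE (name, city)
    );
    -- Many purchases point at one customer (one to many).
    CREATE TABLE purchase (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer (id),
        bought_on   TEXT NOT NULL,
        item        TEXT NOT NULL
    );
""")

for name, city, bought_on, item in flat_rows:
    # Insert the customer only if this (name, city) pair is new.
    conn.execute(
        "INSERT OR IGNORE INTO customer (name, city) VALUES (?, ?)",
        (name, city),
    )
    (customer_id,) = conn.execute(
        "SELECT id FROM customer WHERE name = ? AND city = ?",
        (name, city),
    ).fetchone()
    conn.execute(
        "INSERT INTO purchase (customer_id, bought_on, item) VALUES (?, ?, ?)",
        (customer_id, bought_on, item),
    )

customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
purchase_count = conn.execute("SELECT COUNT(*) FROM purchase").fetchone()[0]

# A join rebuilds the original flat presentation from the two tables.
rejoined = conn.execute("""
    SELECT c.name, c.city, p.bought_on, p.item
    FROM purchase AS p
    JOIN customer AS c ON c.id = p.customer_id
    ORDER BY p.id
""").fetchall()
```

Each customer is now stored once, and the join at the end shows that
nothing from the flat format is lost.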

Thirdly, it's the queries and the relationships they reveal (joins)
between sections of data (tables) that assist in making the data
presentable. You can always add caches later for blocks of data, and
these can live in the database itself (temp tables, materialized views,
etc.).
TIP: whether it's temp tables, views, or materialized views, it's a good
idea to be consistent with the name, e.g. "some_view". This provides a
level of abstraction and is handy in the design phase.
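
To illustrate the naming tip, here is a rough sketch, again using
Python's built-in sqlite3 module as a stand-in for PostgreSQL. The
"reading" table, the "sensor_avg_view" name and the sensor_averages()
helper are made up for the example. The point is only that the calling
code keeps using one name while the object behind it changes from a
plain view to a cached table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical base table and a plain view over it.
    CREATE TABLE reading (sensor TEXT NOT NULL, value REAL NOT NULL);
    INSERT INTO reading VALUES ('a', 1.0), ('a', 3.0), ('b', 10.0);
    CREATE VIEW sensor_avg_view AS
        SELECT sensor, AVG(value) AS avg_value
        FROM reading
        GROUP BY sensor;
""")


def sensor_averages(connection):
    """Application-side query; it only knows the name sensor_avg_view."""
    return connection.execute(
        "SELECT sensor, avg_value FROM sensor_avg_view ORDER BY sensor"
    ).fetchall()


from_view = sensor_averages(conn)

# Later, swap the view for a precomputed table under the same name.
# The stored rows are a snapshot and would need refreshing when the
# base table changes.
conn.executescript("""
    DROP VIEW sensor_avg_view;
    CREATE TABLE sensor_avg_view AS
        SELECT sensor, AVG(value) AS avg_value
        FROM reading
        GROUP BY sensor;
""")

from_cache = sensor_averages(conn)
```

The query in sensor_averages() never changes, which is the abstraction
the consistent name buys you while the design is still moving.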

The same approach applies even if you are dealing with petabytes of data.

That's all I can suggest without actually looking at a sample of the data
(problem) you are dealing with. It's a matter of breaking it down into
logical steps and having some fun.

Regards,
Julian.
