DB design advice

From: Toni Alfirević <toni(dot)alfirevic(at)gmail(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: DB design advice
Date: 2013-06-06 09:29:36
Message-ID: CAKHBNE1jGSnq=rw=uzjdvapub+=L+o0FeqQCTt6sAzd4n5nFbw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Hi,

i'm in need of an advice and here is my situation.
I've started putting stuff on paper, drawing my application and everything
seems simple enough. My application will store large amount of string type
entries (let's call them web server logs for the sake of this mail)
and by "large amount" I'm thinking in terms of couple of gigabytes a day
(could be more but at first it will be several hundredth megabytes a day).

Relevant data structure looks something like this:

Logs
-----------------
Id - bigserial
Value - text
Timestamp - timestamp
UserId - FK_Users_Id

Tags
-----------------
Id - bigserial
Value - text
Timestamp - timestamp
LogId - FK_Logs_Id

So, pretty simple stuff there. Log entry that can have multiple tags
related and those logs are related to user, nothing fancy or even complex.

And app should allow user to search trough his logs filtered by one or more
tags and time interval. Here is one more constraint, since those messages
are not really web server logs all messages have the same relevance,
and by same relevace I mean that messages created 2 years ago are not less
likely to be searched for. This would translate to following situation...
If this system should survive for, let's say, 3 years and DB size were to
be 1TB,
system should be able to process search request within couple of seconds
(arbitrary number that should be as low as possible and not high enough for
users to percieve application as slow of sluggish).

App will be used by, let's say, hundreth users, could be more.
Read / search speeds are rather important and writes should be reliable, it
would not be preferable to lose "Log or two every now and then".

Now that you gave general idea on what I plan to do here is the actual
advice that I need.

I know it will seem like a pretty vague question but what advice would you
give me at this point, before i start coding stuff, prototyping my app. But
let me be more specific... I'm interested in stuff like index types on
columns,
maybe modifying table relations. Maybe even experiences using search
servers like sorl, sphynx. Or even stuff like pgStrom (
http://wiki.postgresql.org/wiki/PGStrom) I'm opened to all suggestions,
even to something like "Toni you're stupid,
you should be using cassandraDB for this, not postgresql". And keep in mind
that my general concern is to design DB and app layer so it can handle at
some point TBs of data so please try to give me advice that would result in
performance increase.

Thank you,

--
Toni Alfirevic

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Sergey Konoplev 2013-06-07 05:39:24 Re: DB design advice
Previous Message Erik Ande 2013-06-05 15:24:23 Get the last sql error