Re: Need suggestion

From: Carl von Clausewitz <clausewitz45(at)gmail(dot)com>
To: Ognjen Blagojevic <ognjen(dot)d(dot)blagojevic(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Need suggestion
Date: 2011-06-02 10:22:15
Message-ID: BANLkTimqLS2h74pVKJc=ChFessQ7xW22xQ@mail.gmail.com
Lists: pgsql-general

Dear Ognjen,

thank you - that was my idea too, but I have never seen a workload like
this. The documents (which hopefully will not be processed by any OCR) and
the pictures are not indexed, of course; only some metadata related to
each document or picture is. For example:

productions_docu1:
-sent date
-received date
-type
-owner
-case_id
-etc

image_001:
-picture taken date
-case_id
-image_type

Only this metadata needs to be searched. My question about the structure
was this: do you recommend storing the images and the documents in the
same table (CREATE TABLE docu_img_store (id BIGSERIAL, case_id
BIGINT, content_type INTEGER, content bytea)), or in two different
tables? Are there any special settings at table creation that I have to
set for optimal performance (indexes, storage parameters, TOAST, etc.)?
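
To make the question concrete, here is a rough sketch of how the
single-store option might look if the searchable metadata is kept apart
from the binary content. The table docu_img_meta, its column names, the
content_type coding and the index names are only assumptions for
illustration, not a tested design:

```sql
-- Hypothetical layout: searchable metadata in one table, binary payload in another.
CREATE TABLE docu_img_meta (
    id            BIGSERIAL PRIMARY KEY,
    case_id       BIGINT  NOT NULL,
    content_type  INTEGER NOT NULL,  -- assumed coding: 1 = scanned document, 2 = picture
    sent_date     DATE,              -- documents only
    received_date DATE,              -- documents only
    taken_date    DATE,              -- pictures only
    owner         TEXT
);

CREATE TABLE docu_img_store (
    id      BIGINT PRIMARY KEY REFERENCES docu_img_meta (id),
    content BYTEA NOT NULL
);

-- Scans and photos are usually already compressed, so EXTERNAL storage
-- keeps the value out of line in TOAST without attempting compression.
ALTER TABLE docu_img_store ALTER COLUMN content SET STORAGE EXTERNAL;

-- Indexes only on the metadata that is actually searched.
CREATE INDEX docu_img_meta_case_idx ON docu_img_meta (case_id);
CREATE INDEX docu_img_meta_type_date_idx ON docu_img_meta (content_type, sent_date);
```

With a layout like this, searches would touch only docu_img_meta, and the
content row would be fetched by id when a file is actually requested.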

(:-) I know that this project could be high-value revenue for any DB
consultancy company, but this is a small country with small project
fees, and I'm an employee, not a contractor, at my company :-)

Thank you in advance,
Regards,
Carl

2011/6/2 Ognjen Blagojevic <ognjen(dot)d(dot)blagojevic(at)gmail(dot)com>

> Carl,
>
> I don't have experience with databases that big, but I have implemented
> both solutions, and here are the pros of each:
>
> 1. Files stored on the filesystem:
> - Small database footprint
> - Faster backup, export and import
>
> 2. Files stored in the database
> - RDBMS takes care of transactions and ref. int.
> - Slower backup, export and import but all done in one step
> - Easier continuous archiving
>
> I slightly prefer option no. 2, since transaction handling, rollback and
> ref. integrity are not so easy to implement when you have two different
> storage systems (FS and RDB).
>
> As for indexes and tables, it is not clear from your message whether you
> need just a regular search (field LIKE 'something'), full text search of
> metadata, or full text search of scanned documents (in case they are OCRed).
>
> Regards,
> Ognjen
>
>
>
> On 1.6.2011 10:08, Carl von Clausewitz wrote:
>
>> Hello Everyone,
>>
>> I got a new project with 100 users in Europe. In this case, I need to
>> handle production and sales processes and their documentation in
>> PostgreSQL with PHP. The load of the sales process is negligible, but
>> every user produces 2 transactions in the production process, with 10-30
>> scanned documents (each 400-800 kB) and 30-50 high-resolution
>> pictures (each 3-8 MB), and they want to upload them to 'somewhere'.
>> 'Somewhere' could be the server file system, with a link in
>> the PostgreSQL database to the location of the files (with some
>> metadata), or it could be the PostgreSQL database.
>>
>> My question is: what is your opinion on storing the scanned
>> documentation and the pictures in the database? This is a huge amount of
>> data (between 188 MB and 800 MB daily, about 1 TB in an average
>> year), but it must be searchable, and any document must be retrievable
>> within 1 hour. All documentation must be stored for up to 5 years...
>> It means the database could be about 6-7 TB large after 5 years, and
>> then we can start to archive documents. Any other data size is negligible.
>>
>> If you suggest storing all of the data in PostgreSQL, what is your
>> recommendation on table and index structure, clustering, and archiving?
>>
>> Thank you in advance!
>> Regards,
>> Carl
>>
