Re: How to store text files in the postgresql?

From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: DimitryASuplatov <genesup(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: How to store text files in the postgresql?
Date: 2009-06-08 16:11:26
Message-ID: 4A2D382E.5060407@pinpointresearch.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

DimitryASuplatov wrote:
> My task is to store a lot (10^5) of small ( <10 MB) text files in the
> database with the ability to restore them back to the hard drive on
> demand.
>
> That means that I need two functions. First - grab file from the
> directory, store it in the database and delete from the disk; second -
> recreate in back to the disk....
>
> 2/ Could you give me some quick tips on how to manage it from the start
> so that I knew what to look for in the manual?
>
Others have pointed out some of the scripting methods to use. Some other
things to consider:

1. Give some thought to character-sets. It's likely that UTF8 on the
backend will be fine but there exists the possibility that you will have
to set the appropriate encoding on the client-side depending on the
nature of the files you will be reading.

2. Similarly, think about language. I don't know if you need to do
searches based on the contents of your text-fields but you should be
aware that PostgreSQL has some very nice text-search functions. Take a
moment to scan those docs to see if there is anything that may be of
use: http://www.postgresql.org/docs/8.3/interactive/textsearch.html

3. Tell us more about the app. How does data flow in/out of the system?
Are records updated or static? Are they retained indefinitely or purged?
If purged, on what basis? Lets say, for example, that you are putting
the data in for archival purposes and the records should be purged
monthly after 5 years. In that case, you should consider including some
sort of time indicator in your data and partitioning your table into
month-sized tables. Partitioning uses PostgreSQL's inheritance feature
allowing you to have a single parent table with no data and multiple
child tables that contain chunks of your data so every month you could
create a new child table to take in the new files for the month and drop
the table containing the data you want to purge. Or you could have the
table partitioned by users, projects or whatever is appropriate for your
app. Read up on inheritance and partitioning at
http://www.postgresql.org/docs/8.3/interactive/ddl-inherit.html and
http://www.postgresql.org/docs/8.3/interactive/ddl-partitioning.html. If
you are able to partition into static chunks and changing chunks, you
may be able to create a much more efficient backup configuration wherein
you only regularly backup the changing piece of data.

Cheers,
Steve

In response to

Browse pgsql-general by date

  From Date Subject
Next Message David Wilson 2009-06-08 16:23:30 Re: Entering time into postgresql
Previous Message Keaton Adams 2009-06-08 15:58:25 Any way to bring up a PG instance with corrupted data in it?