Re: Query questions

From: Poul Jensen <flyvholm(at)gfy(dot)ku(dot)dk>
To: Chris Travers <chris(at)verkiel(dot)metatrontech(dot)com>
Cc: Douglas McNaught <doug(at)mcnaught(dot)org>, Ragnar Hafstað <gnari(at)simnet(dot)is>, pgsql-general(at)postgresql(dot)org
Subject: Re: Query questions
Date: 2005-09-06 13:13:36
Message-ID: 431D9600.5060603@gfy.ku.dk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


> Look into inheritance. It makes this easier. However, I don't care
> which RDBMS you use, management of 1000 identical tables is going to
> be a real pain and I think that everyone here will probably suggest
> that it is not exactly a sane thing to do.

Thank you, Chris. I have omitted two important points. One is:
The database needs no maintenance. Once created nothing in it will be
modified - there'll only
be SELECT queries to look up data satisfying certain criteria.

To summarize the task:
I have ~500,000 data files containing ~1,000 records each. The database
should then contain:
1) A detailed table (~1,000 rows, ~15 columns) for each file
2) A small table with file summary (1 row, ~30 columns) for each file
The typical query should then check file summaries to identify which
detailed tables/subtables
to check for further conditions.

The other important point I forgot about (much worse):
The detailed table for each file is created with a number of columns
which for most files have
the same value for all records. I had planned to delete columns
containing only a single
value and push it to the file summary instead. Hence, the 500,000
detailed tables wouldn't
all be identical. The initial check on the file summary would determine
whether the detailed
table for a given file should be searched and, if so, what columns are
found in it.

I guess I could either:
1) Add a lot of redundant data to the database so tables can be
combined, allowing single
query searches. Judging from the size of my test database I would
end up with ~200 GB
without redundant data, so I do consider this a problem.
2) Write code (e.g. in C) with a loop to do separate queries for every
file - I imagine this
would be terribly inefficient compared to the single query option.

Question:
If I create a table with all the columns needed to combine the 500,000
tables and I simply
omit the redundant columns when entering the subtable for a given file,
will PostgreSQL
fill in values, not allowing me to save any space? Would I have to make
an inheritance
hierarchy to save the space? If so, all columns not in the parent table
can only be
accessed via separate, child-table specific queries which is
troublesome... :-|

Help!

Poul

In response to

Browse pgsql-general by date

  From Date Subject
Next Message A. Kretschmer 2005-09-06 14:00:32 Re: Installation problem
Previous Message Ben Grimm 2005-09-06 11:34:44 Re: fix pg_autovacuum