From: | Jasen Betts <jasen(at)xnet(dot)co(dot)nz> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: DB design advice: lots of small tables? |
Date: | 2013-03-16 06:30:13 |
Message-ID: | ki13hl$c05$1@gonzo.reversiblemaps.ath.cx |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 2013-03-15, lender <crlender(at)gmail(dot)com> wrote:
> Hello.
>
> We are currently redesigning a medium/large office management web
> application. There are 75 tables in our existing PostgreSQL database,
> but that number is artificially low, due to some unfortunate design choices.
>
> The main culprits are two tables named "catalog" and "catalog_entries".
> They contain all those data sets that the previous designer deemed too
> small for a separate table, so now they are all stored together. The
> values in catalog_entries are typically used to populate dropdown select
> fields.
> So, my first main question would be: is it "normal" or desirable to have
> that many tiny tables? And is it a problem that many of the tables have
> the same (or a similar) column definitions?
Dunno about "normal", but certainly "Normal" (as in "-form").
No problem.
> The second point is that we have redundant unique identifiers in
> catalog_entries (id and code). The code value is used by the application
> whenever we need to find to one of the values. For example, for a query
> like "show all open invoices", we would either -
>
> 1) select the id from catalog_entries where catalog_id refers to the
> "invoice_status" catalog and the code is "open"
> 2) use that id to filter select * from invoices
>
> - or do the same in one query using joins. This pattern occurs hundreds
> of times in the application code. From a programming viewpoint, having
> all-text ids would make things a lot simpler and cleaner (i.e., keep
> only the "code" column).
>
> The "id" column was used (AFAIK) to reduce the storage size. Most of the
> data tables have less than 100k records, so the overhead wouldn't be too
> dramatic, but a few tables (~10) have more; one of them has 1.2m
> records. These tables can also refer to the old catalog_entries table
> from more than one column. Changing all these references from INT to
> VARCHAR would increase the DB size, and probably make scans less
> performant. I'm not sure know how indexes on these columns would be
> affected.
>
> To summarize, the second question is whether we should ditch the
> artificial numeric IDs and just use the "code" column as primary key in
> the new tiny tables.
I if they aren't hurting you keep them.
> Thanks in advance for your advice.
If you're worried about clutter It may make sense to put all the small tables
in a separate schema.
--
⚂⚃ 100% natural
From | Date | Subject | |
---|---|---|---|
Next Message | Oleg Alexeev | 2013-03-16 08:33:12 | Re: Addled index |
Previous Message | Jasen Betts | 2013-03-16 06:16:33 | Re: C++Builder table exist |