From: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> |
---|---|
To: | Rémi Cura <remi(dot)cura(at)gmail(dot)com>, PostgreSQL General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: automated 'discovery' of a table : potential primary key, columns functional dependencies ... |
Date: | 2019-11-22 22:48:50 |
Message-ID: | 80d60035-6c4a-4eac-df16-956fc49901e8@aklaver.com |
Lists: | pgsql-general |
On 11/22/19 2:05 PM, Rémi Cura wrote:
> Hello dear List,
> I'm currently wondering about how to streamline the normalization of a
> new table.
>
> I often have to import messy CSV files into the database, and making a
> clean normalized version of these takes me a lot of time (think dozens
> of columns and millions of rows).
To me, messy means the information needed to do the below is not available.
Personally, I think your best bet is to get the data into tables and then
use visualization tools to help you determine the below. My guess is
there will be a lot of data cleaning going on before you can get to a
well-ordered table layout.
>
> I wrote some code to automatically import a CSV file and infer the type
> of each column.
> Now I'd like to quickly get an idea of
> - what would be the most likely primary key
> - what are the functional dependencies between the columns
>
> The goal is **not** to automate the modelling process,
> but rather to automate the tedious phase of information collection
> that is necessary for the DBA to make a good model.
>
> If this goes well, I'd like to automate further tedious stuff (like
> splitting a table into several ones with appropriate foreign keys /
> constraints)
>
> I'd be glad to have some feedback / pointers to tools in plpgsql or even
> plpython.
>
> Thank you very much
> Remi
>
>
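For what it is worth, the two checks asked about above (likely primary key, functional dependencies between columns) can be sketched in plain Python on a sample of the imported rows. This is only a rough illustration and not an existing tool: the function names `candidate_keys` and `functional_dependencies`, the `max_width` parameter, and the sample data are all made up for the example. It assumes every value was read as a string and that an empty string means a missing value. It holds everything in memory, so for millions of rows you would probably run it on a sample or do the equivalent counting in SQL.

```python
import csv
import io
from itertools import combinations


def candidate_keys(rows, columns, max_width=2):
    """Return minimal column combinations that are unique and non-empty in every row."""
    found = []
    for width in range(1, max_width + 1):
        for combo in combinations(columns, width):
            # A superset of a key already found is unique too, but not minimal.
            if any(set(key) <= set(combo) for key in found):
                continue
            values = [tuple(row[c] for c in combo) for row in rows]
            # Assumption: an empty string stands for a missing value (NULL).
            if any("" in v for v in values):
                continue
            if len(set(values)) == len(values):
                found.append(combo)
    return found


def functional_dependencies(rows, columns):
    """Return (a, b) pairs where each value of column a maps to exactly one value of b."""
    deps = []
    for a in columns:
        for b in columns:
            if a == b:
                continue
            seen = {}
            # setdefault records the first b seen for each a; any later mismatch breaks a -> b.
            if all(seen.setdefault(row[a], row[b]) == row[b] for row in rows):
                deps.append((a, b))
    return deps


# Hypothetical sample standing in for a messy CSV import.
SAMPLE = """order_id,customer_id,customer_name,city
1,10,Alice,Paris
2,10,Alice,Paris
3,11,Bob,Lyon
4,12,Carol,Paris
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
rows = list(reader)
columns = reader.fieldnames

keys = candidate_keys(rows, columns)
deps = functional_dependencies(rows, columns)
print("candidate keys:", keys)
print("functional dependencies:", deps)
```

Note that any candidate key trivially determines every other column, so the interesting dependencies are the ones between non-key columns (here `customer_id` and `customer_name`), which hint at a table that could be split out. A dependency that holds on a sample is only a hypothesis; the data cleaning mentioned above still has to come first.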
--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com