Re: automatic scan a table, report on data formats in columns

From: Steve Midgley <science(at)misuse(dot)org>
To: Shaozhong SHI <shishaozhong(at)gmail(dot)com>
Cc: pgsql-sql <pgsql-sql(at)lists(dot)postgresql(dot)org>
Subject: Re: automatic scan a table, report on data formats in columns
Date: 2022-02-21 16:59:04
Message-ID: CAJexoS+qvZ2Rut3n7D9tUnTz8zfwpGadXUG4sBwzusEqNnASmw@mail.gmail.com
Lists: pgsql-sql

On Mon, Feb 21, 2022, 2:06 AM Shaozhong SHI <shishaozhong(at)gmail(dot)com> wrote:

> Is it possible to do the following?
>
> automatically scan a table of all text columns
> produce a report on data formats in columns as indicated in the following:
>
> Column A                  Column B                     Column C
> alphabetic words/phrases  digits like xxxxx.xx         alphanumeric identifiers
> City of London            5 digits followed by a       iso12345
>                           decimal point and 2 digits
>                           indicating precision
>
>
> It is a bit like detecting regular expression patterns automatically.
>
> Is automatically detecting something like regular expression patterns
> possible?
>
> Regards,
>
> David
>

Depending on your definition of automatic, I think this is very doable.

First, find the names of the tables you're interested in (from the system
catalog or from hard-coded values, depending). Then use the columns view
(https://www.postgresql.org/docs/current/infoschema-columns.html) to
enumerate the fields in each table and pick out the ones with data types you
want to analyze. From there you can query the values in each column, using
regex or similar, to classify each column by its contents.
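
As a rough illustration of those steps, here is a minimal Python sketch. It
is only one way to do it, under some assumptions: the format classes and
their regular expressions are hypothetical choices of mine, the %s
placeholders assume a psycopg-style driver, and the function names
(classify_value, classify_column, report) are made up for the example. The
classification part runs on plain Python lists, so the values could come
from any query result.

```python
import re
from collections import Counter

# Step one: list the text-like columns of one table via the
# information_schema.columns view. The %s placeholders assume a
# psycopg-style driver; adjust for whatever client you use.
TEXT_COLUMNS_SQL = """
    SELECT column_name
    FROM information_schema.columns
    WHERE table_schema = %s
      AND table_name = %s
      AND data_type IN ('text', 'character varying', 'character')
    ORDER BY ordinal_position
"""

# Hypothetical format classes, checked in order; the first full match wins.
PATTERNS = [
    ("empty", re.compile(r"\s*")),
    ("integer", re.compile(r"[+-]?\d+")),
    ("decimal", re.compile(r"[+-]?\d+\.\d+")),
    ("alphabetic words", re.compile(r"[A-Za-z]+(?: [A-Za-z]+)*")),
    # At least one letter and one digit, nothing but letters and digits.
    ("alphanumeric identifier",
     re.compile(r"(?=.*[A-Za-z])(?=.*\d)[A-Za-z0-9]+")),
]


def classify_value(value):
    """Return the label of the first pattern that matches the whole value."""
    if value is None:
        return "null"
    for label, pattern in PATTERNS:
        if pattern.fullmatch(value):
            return label
    return "other"


def classify_column(values):
    """Count how many values of one column fall into each format class."""
    return Counter(classify_value(v) for v in values)


def report(table):
    """Summarize a table given as {column name: list of values}.

    Each column maps to a list of (label, count) pairs, most common first.
    """
    return {col: classify_column(vals).most_common()
            for col, vals in table.items()}
```

To use it against a real database you would run TEXT_COLUMNS_SQL for each
table, select the values of each column it returns, and pass them to
classify_column. A column where one label dominates can be reported with
that label; a column with a mix of labels probably needs a closer look or a
finer set of patterns.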

Of course you have to write all that code so it's not automatic as in
built-in. But it's automatic in the sense that once written it would work
against any set of tables and columns and can be run without any human
intervention or analysis in the moment.

Steve
