Re: automatic scan a table, report on data formats in columns

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Shaozhong SHI <shishaozhong(at)gmail(dot)com>
Cc: pgsql-sql <pgsql-sql(at)lists(dot)postgresql(dot)org>
Subject: Re: automatic scan a table, report on data formats in columns
Date: 2022-02-21 16:27:12
Message-ID: CAKFQuwapUTP=aaSV=-TWtxaXp=ajh9eEZjC1MX=vxBRCL6=+xw@mail.gmail.com
Lists: pgsql-sql

On Mon, Feb 21, 2022 at 3:06 AM Shaozhong SHI <shishaozhong(at)gmail(dot)com>
wrote:

> Is it possible to do the following?
>
> automatically scan a table of all text columns
> produce a report on data formats in columns as indicated in the following:
>
> Column A                 | Column B                       | Column C
> alphabetic words/phrases | digits like xxxxx.xx           | alphanumeric identifiers
> City of London           | 5 digits followed by a decimal | iso12345
>                          | point and 2 digits indicating  |
>                          | precision                      |
>
>
> It is a bit like detecting regular expression patterns automatically.
>
> Is automatically detecting something like regular expression patterns
> possible?
>
>
Yep, and the answer for any text column you give me is:

^.*$

If you want a classification system in which you supply a set of more
complex (already known) regular expressions and choose the best fit for
each column, that is also possible, and probably much more useful.
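
A minimal sketch of that kind of best-fit classification, done client-side
in Python rather than in SQL. Everything named here is an assumption made
up for illustration: the pattern catalogue, the functions best_fit and
report, and the threshold parameter. The catalogue is ordered from most to
least specific, and the first pattern that matches enough of the values
wins; the catch-all at the end plays the role of ^.*$.

```python
import re

# Hypothetical catalogue of already-known patterns, most specific first.
# Names and regexes are illustrative assumptions, not from the thread.
KNOWN_PATTERNS = [
    ("decimal 5.2", re.compile(r"\d{5}\.\d{2}")),
    ("alphanumeric identifier", re.compile(r"[A-Za-z]+\d+")),
    ("alphabetic words/phrases", re.compile(r"[A-Za-z]+(?: [A-Za-z]+)*")),
    ("anything", re.compile(r".*", re.DOTALL)),  # catch-all fallback
]


def best_fit(values, threshold=1.0):
    """Return the name of the first pattern (in catalogue order) that
    fully matches at least `threshold` of the non-null values, or None
    if there are no non-null values to inspect."""
    values = [v for v in values if v is not None]
    if not values:
        return None
    for name, pattern in KNOWN_PATTERNS:
        hits = sum(1 for v in values if pattern.fullmatch(v))
        if hits / len(values) >= threshold:
            return name
    return None


def report(columns):
    """Map each column name to its best-fit pattern name.
    `columns` is a dict of column name -> list of text values."""
    return {col: best_fit(vals) for col, vals in columns.items()}


if __name__ == "__main__":
    sample = {
        "a": ["City of London", "Leeds"],
        "b": ["12345.67", "00001.00"],
        "c": ["iso12345", "ab1"],
    }
    for col, name in report(sample).items():
        print(col, "->", name)
```

Lowering threshold (say to 0.95) lets a column with a few dirty values
still be reported under its dominant format instead of falling through to
the catch-all. The result is only as good as the catalogue you supply.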

For anything else you need a better problem specification. And I'd
probably tend toward wanting to run some kind of AI system on the data -
i.e., not something I'd perform in-database.

David J.
