From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
---|---|
To: | Shaozhong SHI <shishaozhong(at)gmail(dot)com> |
Cc: | pgsql-sql <pgsql-sql(at)lists(dot)postgresql(dot)org> |
Subject: | Re: automatic scan a table, report on data formats in columns |
Date: | 2022-02-21 16:27:12 |
Message-ID: | CAKFQuwapUTP=aaSV=-TWtxaXp=ajh9eEZjC1MX=vxBRCL6=+xw@mail.gmail.com |
Lists: | pgsql-sql |
On Mon, Feb 21, 2022 at 3:06 AM Shaozhong SHI <shishaozhong(at)gmail(dot)com>
wrote:
> Is it possible to do the following?
>
> automatically scan a table of all text columns
> produce a report on data formats in columns as indicated in the following:
>
> Column A                 | Column B                       | Column C
> alphabetic words/phrases | digits like xxxxx.xx           | alphanumeric identifiers
> City of London           | 5 digits followed by a decimal | iso12345
>                          | point and 2 digits indicating  |
>                          | precision                      |
>
>
> It is a bit like detecting regular expression patterns automatically.
>
> Is automatically detecting something like regular expression patterns
> possible?
>
>
Yep, and the answer for any text column you give me is:
^.*$
If what you want is a classification system, where you already have a set of
known, more complex regular expressions and want to choose the best fit for
each column, that is also possible, and probably much more useful.
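As a rough illustration of the best-fit idea, here is a minimal sketch in Python. Everything in it is an assumption made for the example, not something from the original question: the pattern list `KNOWN_PATTERNS`, the labels, the function name `classify_column`, and the 90% threshold are all hypothetical. It scores each already-known pattern by the fraction of a column's values it matches in full and reports the best one.

```python
import re

# Hypothetical catalogue of already-known formats, listed from most to
# least specific. These three mirror the examples in the question.
KNOWN_PATTERNS = [
    ("decimal xxxxx.xx", re.compile(r"\d{5}\.\d{2}")),
    ("alphanumeric identifier",
     re.compile(r"(?=.*[A-Za-z])(?=.*\d)[A-Za-z0-9]+")),
    ("alphabetic words/phrases", re.compile(r"[A-Za-z]+(?: [A-Za-z]+)*")),
]


def classify_column(values, min_fraction=0.9):
    """Return (label, fraction_matched) for the best-fitting known pattern.

    NULLs (None) are ignored. If no pattern matches at least
    `min_fraction` of the values, the label is "unclassified".
    """
    values = [v for v in values if v is not None]
    if not values:
        return ("empty", 0.0)

    best_label, best_score = None, -1.0
    for label, pattern in KNOWN_PATTERNS:
        hits = sum(1 for v in values if pattern.fullmatch(v))
        score = hits / len(values)
        # Strict '>' means the earlier (more specific) pattern wins ties.
        if score > best_score:
            best_label, best_score = label, score

    if best_score < min_fraction:
        return ("unclassified", best_score)
    return (best_label, best_score)


if __name__ == "__main__":
    print(classify_column(["City of London", "Leeds"]))
    print(classify_column(["12345.67", "00001.00"]))
    print(classify_column(["iso12345", "abc9"]))
```

The same scoring could presumably be done in SQL with one aggregate per pattern over the column; the point is only that the set of candidate patterns has to be supplied up front.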
For anything else you need a better problem specification. And I'd
probably lean toward running some kind of AI system on the data,
i.e., not something I'd perform in-database.
David J.