Re: Ideas for building a system that parses medical research publications/articles

From: Vijaykumar Jain <vijaykumarjain(dot)github(at)gmail(dot)com>
To: Laura Smith <n5d9xq3ti233xiyif2vp(at)protonmail(dot)ch>
Cc: Achilleas Mantzios <achill(at)matrix(dot)gatewaynet(dot)com>, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Ideas for building a system that parses medical research publications/articles
Date: 2021-06-05 13:45:55
Message-ID: CAM+6J97cP8vp7GoVdjNVd2ogow0ADPw+F=FSyda4uPZzjXc1ng@mail.gmail.com
Lists: pgsql-general

http://tika.apache.org/

To get started with collecting doc metadata, it looks like this tool can
help you.
Postgres does support fuzzy text search, so I think dumping the metadata
and abstracts into PostgreSQL and then using extensions such as pg_trgm,
along with full-text search (tsearch), should work well for a POC.
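
To give a feel for what trigram matching does, here is a rough sketch in plain Python. This is not pg_trgm itself, only an approximation of its documented behaviour (lowercase, split into words, pad each word with two leading spaces and one trailing space, compare the sets of three-character pieces). The function names and the sample titles are made up for illustration; the 0.3 threshold mirrors the pg_trgm default.

```python
import re


def trigrams(text):
    """Return the set of trigrams for text, roughly the way pg_trgm builds them."""
    grams = set()
    # Words are runs of letters/digits; everything else is a separator.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def fuzzy_search(query, titles, threshold=0.3):
    """Return titles at or above the threshold, best match first."""
    scored = [(similarity(query, title), title) for title in titles]
    return [title for score, title in sorted(scored, reverse=True)
            if score >= threshold]


if __name__ == "__main__":
    # Hypothetical titles; note the typo in the query.
    papers = ["Gene therapy in children", "Dietary study"]
    print(fuzzy_search("gene therpy", papers))
```

In the database itself you would let the extension do this work and back it with an index, rather than scoring rows in application code.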
This being a pg mailing list :) what are your expectations for the type of
data, the growth of the data, and the kind of queries you would run?
If you store data to support papers in multiple languages, will PostgreSQL
be able to handle it?
Ideally the docs would be stored somewhere in object storage, and a link to
each one would be stored in the db, to be followed when someone requests
the whole paper.
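
A minimal sketch of that layout, with the searchable metadata in a table and only a link to the stored file. To keep it runnable without a server it uses Python's built-in sqlite3 as a stand-in for PostgreSQL; the table name, column names and the s3:// URL are all assumptions made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for a PostgreSQL connection in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE paper (
        id          INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        abstract    TEXT,
        language    TEXT,          -- e.g. 'en', 'de'
        storage_url TEXT NOT NULL  -- where the full document lives
    )
    """
)


def add_paper(title, abstract, language, storage_url):
    """Store the metadata plus the object-storage link; return the new id."""
    cur = conn.execute(
        "INSERT INTO paper (title, abstract, language, storage_url) "
        "VALUES (?, ?, ?, ?)",
        (title, abstract, language, storage_url),
    )
    conn.commit()
    return cur.lastrowid


def link_for(paper_id):
    """Return the storage link for a paper, or None if the id is unknown."""
    row = conn.execute(
        "SELECT storage_url FROM paper WHERE id = ?", (paper_id,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    # Hypothetical record and bucket path.
    pid = add_paper("Example title", "Example abstract", "en",
                    "s3://example-bucket/papers/0001.pdf")
    print(link_for(pid))
```

The database only ever holds the text you search on; the PDF itself is fetched from storage when a reader asks for it.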
A long time ago I read this:
https://www.citusdata.com/blog/2017/04/20/analyzing-postgresql-email-archives/

So if that could work with PostgreSQL, your POC should too :)

On Sat, 5 Jun 2021 at 5:14 PM Laura Smith <
n5d9xq3ti233xiyif2vp(at)protonmail(dot)ch> wrote:

>
>
>
> Sent with ProtonMail Secure Email.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Saturday, 5 June 2021 12:14, Achilleas Mantzios <
> achill(at)matrix(dot)gatewaynet(dot)com> wrote:
>
>
> >
> > I know it's a huge amount of work, but you are missing a point. Nobody
> > wishes to compete with anyone. This is about a project, a parent-advocacy
> > non-profit that ONLY aims to save the sick children (or maybe also
> > very young adults) of a certain spectrum. So the goal is to make the
> > right tools for researchers, clinicians and parents. This market is too
> > small to even consider making any money out of it, but the research is
> > still very expensive and the progress slower than optimum.
>
>
> Unfortunately I'm not "missing a point", your final paragraph summarises
> your position.
>
> You have been taken in by the very charitable goal of saving sick children.
>
> Unfortunately your head has been disconnected from your heart.
>
> If we put the charitable purpose to one side and take a purely objective
> view at what you want to do, my original statement still stands, i.e. the
> certainty that you are grossly underestimating the technical and practical
> complexities of what you want to achieve.
>
>
> --
Thanks,
Vijay
Mumbai, India
