Re: What's your experience with using Postgres in IoT-contexts?

From: Jonathan Strong <jonathanrstrong(at)gmail(dot)com>
To: "Peter J(dot) Holzer" <hjp-pgsql(at)hjp(dot)at>
Cc: "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: What's your experience with using Postgres in IoT-contexts?
Date: 2020-10-14 15:28:45
Message-ID: CAK8Y=HXqp_G+FCFDLomO51n9TSns8EW2f8D8q0fp-Gs1W6Nuew@mail.gmail.com
Lists: pgsql-general

On Wed, Oct 14, 2020 at 8:49 AM Peter J. Holzer <hjp-pgsql(at)hjp(dot)at> wrote:

> On 2020-10-13 06:55:52 +0200, chlor wrote:
> > > I want to have long term storage and access to individual telegrams
> >
> > An IOT is not designed for that. It is used for control or delivery of
> > data to a server.
>
> That's a rather dogmatic and narrow-minded point of view. "IOT" means
> "Internet of things". There are many things which could benefit
> from network connectivity and don't necessarily need a central server or
> may even act as servers for other "things".
>
> It all depends on the application and the "thing".
>
> > Long term storage also means backup and recovery and I don't think you
> > have that planned for your IOT.
>
> That depends on how valuable those data are.
>
> hp
>
> --
> _ | Peter J. Holzer |
>
>
Indeed. IoT architecture also raises the question of "when" detailed
historical data may be needed, and how Edge Computing can factor into the
overall solution model. Detailed transactions may live "at the edge" while
the aggregate / extract info that's needed is communicated to a central
server to support real-time response. Those left-behind detailed
transactions may (or may not) follow later via a lower-priority,
non-real-time path if they prove relevant and eventually valuable. Some
examples I've had to work with:

When calculating real-time Equity / Security Index values, you might
capture tick-by-tick data for each security in an index valuation formula.
Just one security in an index (e.g., MSFT) can easily generate more than
100,000 ticks per day. One of the Large Cap indices currently has about
3,500 stocks in it. They might not all trade as frequently as MSFT, but you
might see anywhere from 10 million to 100 million data points in a day.
While this differs from IoT in that the data sources aren't as physically
separated or as numerous as individual IoT devices, the challenge is
similar: a good real-time architecture uses the data it needs at each
stage of the process flow (and data network flow) and defers functions
that can wait - including committing the full details of every
transaction to centralized long-term storage - as long as the computed
Index value can be published in real time.
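A minimal sketch of that deferred-commit pattern (in Python, with
hypothetical names and a toy weighted-sum formula - real index math also
involves divisors, corporate actions, etc.): the hot path only updates the
latest price per symbol so the index can be published immediately, while
full ticks accumulate in a queue for later low-priority batch persistence.

```python
from collections import deque

class IndexCalculator:
    """Publish a real-time index value from latest prices while
    deferring full tick detail to a low-priority queue."""

    def __init__(self, weights):
        self.weights = weights      # symbol -> index weight (assumed given)
        self.latest = {}            # symbol -> last trade price
        self.deferred = deque()     # full ticks awaiting batch commit

    def on_tick(self, symbol, price, ts):
        # Hot path: tiny in-memory update, nothing blocks on storage.
        self.latest[symbol] = price
        # Cold path: remember the full tick for later bulk load.
        self.deferred.append((symbol, price, ts))

    def index_value(self):
        # Toy formula: weighted sum of the latest prices seen so far.
        return sum(w * self.latest.get(s, 0.0)
                   for s, w in self.weights.items())

    def drain_batch(self, n=10_000):
        # Called off the hot path, e.g. to bulk-COPY into long-term storage.
        count = min(n, len(self.deferred))
        return [self.deferred.popleft() for _ in range(count)]
```

The point of the split is that `on_tick` never touches the database;
`drain_batch` can run on whatever schedule (and network path) the deferred
detail deserves.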

Years ago we developed an online gaming platform supporting hundreds of
thousands of concurrent users who came in from numerous countries around
the world. Real-time scoring and chat communications posed challenges
similar to the Equity Index solution above. We needed to be able to accept
play data from thousands of concurrent players and have a game (or chat
room) respond in near real time, but the full detailed data could be
queued up and gradually transmitted, processed, assimilated and committed
to long-term storage.

In health care data collection we see similar challenges: real-time IoT
biosensors may capture blood oximetry, glucose, lactate levels, heart
rate, etc. Some of this may be critical for real-time monitoring and
processing. Some gets processed "at the Edge" - aggregated, filtered,
interpreted, etc. - before reaching central / long-term storage.
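One way to picture that edge triage (a sketch with made-up thresholds, not
any real clinical logic): out-of-range readings are flagged for immediate
transmission, while in-range samples are collapsed into windowed averages
for a later low-priority upload.

```python
def triage_readings(readings, low, high, window=5):
    """Split sensor samples into urgent values (sent immediately)
    and windowed averages (uploaded later at low priority).

    readings: numeric samples in arrival order.
    low/high: assumed normal range; anything outside is urgent.
    """
    urgent = [r for r in readings if r < low or r > high]
    normal = [r for r in readings if low <= r <= high]
    # Aggregate the in-range samples into fixed-size window means.
    summaries = [
        sum(normal[i:i + window]) / len(normal[i:i + window])
        for i in range(0, len(normal), window)
    ]
    return urgent, summaries
```

The urgent list is what justifies the real-time path; the summaries are
what actually need to travel to central storage most of the time.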

Deciding what level of detail actually has to reach centralized long-term
storage - and when - is typically a non-trivial exercise. When you look at
examples like monitoring jet engines or gas turbines, or an air
conditioner manufacturer and service company (one of my past clients)
monitoring hundreds of thousands of HVAC units distributed around the
country, data volumes go past terabytes to petabytes, exabytes and more.

While you need to figure out how to trim the raw data to amounts that can
reasonably be stored and managed, I've seen too many cases where teams
were overly aggressive in discarding data thought to be superfluous;
thoughtful analysis is critical here.
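One middle ground between keeping everything and discarding it is
downsampling: collapse older telemetry into bucket averages so trend
information survives even when raw resolution doesn't. A minimal sketch
(hypothetical helper, time-ordered input assumed):

```python
def downsample(samples, factor):
    """Reduce telemetry to bucket averages instead of dropping it.

    samples: list of (timestamp, value) pairs, assumed time-ordered.
    factor:  how many raw samples collapse into one retained point.
    """
    out = []
    for i in range(0, len(samples), factor):
        bucket = samples[i:i + factor]
        ts = bucket[0][0]                       # keep first timestamp of bucket
        avg = sum(v for _, v in bucket) / len(bucket)
        out.append((ts, avg))
    return out
```

Whether an average, a min/max envelope, or full retention is appropriate
is exactly the kind of per-signal analysis the paragraph above argues for.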

- Jon
