Re: unorthodox use of PG for a customer

From: Gary M <garym(at)oedata(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: unorthodox use of PG for a customer
Date: 2018-08-25 18:17:59
Message-ID: CAGwOJnzCPyO1_an1NEhOcW4NZ8Di5eW+nOEgPq_3nZTxX8jPfw@mail.gmail.com
Lists: pgsql-general

Hmmm, I usually don't participate in forums, and I accidentally sent my reply
to David Gauthier directly. This issue sounds interesting, so I'll give it
another try and send the reply to the group.

From my simple, engineering-hat perspective, this question is difficult to
answer without hard numbers, which include the following (a sketch for
gathering some of them follows the list):

1) The amount of data stored in the DB, including indexes
2) Expected peak number and type of DB operations per second
3) Maximum latency the operations can tolerate
4) Number and type of CPUs on each workstation
5) Available RAM on each workstation
6) Data storage utilization (MB/s, IOPS)
7) Data storage capacity and type (SSD/spinning/network)
8) Type of network (number of networks and bandwidth)
9) Network bandwidth utilization
10) Swap space utilization during operation
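
As a rough sketch, here is one way some of these numbers might be gathered
from a live instance; the database name 'mydb' is a placeholder:

  -- 1) On-disk size of one database, indexes included
  SELECT pg_size_pretty(pg_database_size('mydb'));

  -- 2) Commit/rollback and block I/O counters; sample twice and divide
  --    the deltas by the elapsed seconds to get per-second rates
  SELECT xact_commit, xact_rollback, blks_read, blks_hit
    FROM pg_stat_database
   WHERE datname = 'mydb';

  -- 1) broken down further: the ten largest tables and indexes
  SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) AS total
    FROM pg_class
   WHERE relkind IN ('r', 'i')
   ORDER BY pg_total_relation_size(oid) DESC
   LIMIT 10;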

From these numbers, the approach and architecture can be determined.

For example, if the available RAM has sufficient capacity, the file system
cache and PostgreSQL's own cache (shared_buffers) can be tuned to keep all
the data in RAM, which is likely to significantly improve write and query
performance. Another option is a dedicated SSD for data and/or swap space.
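
As a minimal sketch of that tuning, assuming PostgreSQL 9.4+ and, say, 32 GB
of RAM (the values below are placeholders to be sized against the numbers
above; the usual starting heuristic is shared_buffers at roughly 25% of RAM
and effective_cache_size at 50-75%):

  -- Keep more of the working set in PostgreSQL's own buffer cache
  -- (takes effect only after a server restart)
  ALTER SYSTEM SET shared_buffers = '8GB';

  -- Tell the planner how much combined OS cache + shared_buffers it can
  -- assume is available; reloadable without a restart
  ALTER SYSTEM SET effective_cache_size = '24GB';

  SELECT pg_reload_conf();  -- applies the reloadable setting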

I prefer these analytical "rules of 10" as a starting point to guide
alternatives.

b/r
gary

On Fri, Aug 24, 2018 at 12:18 PM, David Gauthier <davegauthierpg(at)gmail(dot)com>
wrote:

> Hi Everyone:
>
> I'm going to throw this internal customer request out for ideas, even
> though I think it's a bit crazy. I'm on the brink of telling him it's
> impractical and/or inadvisable. But maybe someone has a solution.
>
> He's writing a script/program that runs on a workstation and needs to
> write data to a DB. This process also sends work to a batch system on a
> server farm external to the workstation, which will create multiple,
> parallel jobs/processes that also have to write to the DB. The
> workstation may have many of these jobs running at the same time. And
> there are 58 workstations, which all have/use locally mounted disks for
> this work.
>
> At first blush, this is easy. Just create a DB on a server and have all
> those clients work with it. But he's also adamant about having the DB on
> the same server(s) that run the script AND on the locally mounted disk.
> He said he doesn't want the overhead, dependencies, and worries of
> anything like an external DB with a DBA, etc. He also wants this to be
> fast.
>
> My first thought was SQLite. Apparently, they now have some sort of
> multiple, concurrent write ability. But there's no way those batch jobs
> on remote machines are going to be able to get at the locally mounted
> disk on the workstation. So I dismissed that idea. Then I thought about
> having 58 PG installs, one per workstation, each serving all the jobs
> pertaining to that workstation. That could work. But 58 DB instances?
> If he didn't like the idea of one DBA, 58 can't be good. Still, the DB
> would be on the workstation, which seems to be what he wants.
>
> I can't think of anything better. Does anyone have any ideas?
>
> Thanks in Advance !
>
>
