From: Moreno Andreo <moreno(dot)andreo(at)evolu-s(dot)it>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: [SPAM] Re: Best practices to manage custom statistics
Date: 2016-11-23 13:24:06
Message-ID: 5eb18d29-cb38-1591-3ddf-aafbffa5a044@evolu-s.it
Lists: pgsql-general

On 11/15/2016 at 18:19, Adrian Klaver wrote:
> On 11/15/2016 07:39 AM, Moreno Andreo wrote:
>> Sorry for the late reply... I'm having some quite rough days...
>>
>> On 11/08/2016 at 21:28, Adrian Klaver wrote:
>>> On 11/08/2016 12:13 PM, Moreno Andreo wrote:
>>>> [...]
>>>>
>>>> In your experience, would this approach help me lower server load?
>>>> Are there any other approach I can try?
>>>
>>> Instead of pushing why not pull.
>> Excuse me Adrian, but I can't quite get what you mean by pulling
>> instead of pushing.
>> Right now the clients pull from the database for just about everything
>> they need... what I'd like to do is either
>> - the database pushes updates when needed,
>
> Pushes updates of what, the statistics you talking about or other data?
The statistics, modified by "something internal". I just need a number
(okay, it's not just one; say about six, for now)
>
>> or
>> - the client pulls data from database, but querying a reduced dataset
>> (just a row values with all values for that user and not thousands of
>> rows)
>
> This confuses me given from your OP:
>
> " 350 databases, 350 users, every user connects to his own database and
> his teammates' (max 10 in total) so each user can connect to max 10
> databases at a time"
>
> "This is obtained, at the moment, with a select count(*) from ..... (that
> involves 4 joins on 4 tables) to be run run every 20 secs from each
> client connected to the cluster (ATM about 650 clients configured, about
> 200 concurrent) to each database it has rights to connect."
>
> So does the user need only their data or do they need the other users
> data also?
They should also be able to download their group mates' data (stats,
numbers), if they want. The numbers can be confusing because every user
can have more than one workplace.
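
To be concrete, the "reduced dataset" I have in mind would look something like this (a sketch only; table and column names are made up, not our real schema):

```sql
-- Hypothetical per-user stats table: one row per user, kept current by
-- the backend, instead of thousands of rows aggregated on every poll.
CREATE TABLE user_stats (
    user_id    integer PRIMARY KEY,
    pending    integer NOT NULL DEFAULT 0,
    completed  integer NOT NULL DEFAULT 0,
    updated_at timestamptz NOT NULL DEFAULT now()
);

-- Each client would then read a single indexed row every 60 seconds:
SELECT pending, completed, updated_at
FROM user_stats
WHERE user_id = 42;  -- plus the (max 10) teammates' ids, if requested
```

That single-row read should be far cheaper than the current 4-join count(*), even across 350 databases.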
>
>>> In other words do the users really check/need the statistics every 20
>>> secs?
>> Ideally, I'd read the real value whenever it changes. In practice I'm
>> happy with a reasonably consistent value (if it should be 800 and it
>> reads 799, it's not an earthquake), at least on a regular basis. This
>> means that when there's no activity we poll the database uselessly,
>> which is why I thought about "pushing" data from the backend to the
>> client; that would be the closest thing to an ideal solution.
>>> Given that you say exact is not important over the course of day, why
>>> not create a mechanism for the user to poll the database when they
>>> need the information.
>>
>> This is what we did in the past. The result was that users _did not_
>> update the values (by clicking an "Update" button) and caused disasters
>> working with "old" data (they forgot to do it, they didn't want to do it because
>
> How can they be working with 'old' data? The queries you are running
> are compiling stats on data that exist at the time they are run and at
> any point in time between stats runs the user is working with current
> data regardless of what the last stats say.
Since we are on ADO.NET (with Npgsql) and we don't keep connections open
(we query, fill a structure, and disconnect), in the time slice between
two updates the clients have both data and stats that are not "real" at
that moment.
Let me give an example: you and I are teammates, each working at his own
place (several miles apart). You read data from my database and get both
stats and work data. Suddenly I make a change. This change won't be
visible to you until you refresh the data (that used to mean pressing
the "Update" button; now it's automated every 60 seconds).
Hope it's clearer now...

>> "it's another click, I waste my time", and many other, even silly,
>> excuses... but they're the customers, they pay, and as we say here,
>> "the customer is always right")
>
> Except when they are wrong:) Still been there.
Don't tell me.... :-)
>
>>
>> So we changed: now we poll for values and for data every 60 seconds,
>> not 20 (I just checked the actual figure). I need something that's
>> lighter for the DB backend, at least for the values procedure. If we
>> had only one database, I think the queries and datasets would stay in
>> cache, so response times would be faster. With more than 350 databases
>> that's not possible (or we'd have to grow RAM to very large sizes...)
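
One idea I'm toying with, sketched here with made-up names: have an AFTER trigger keep the counters current on every write, so each poll becomes a single-row read instead of the 4-join count(*). A rough sketch, not tested against our schema:

```sql
-- Hypothetical trigger maintaining a small per-user counters table
-- (user_stats) as rows come and go in the worked table (work_items).
CREATE OR REPLACE FUNCTION bump_user_stats() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE user_stats
           SET pending = pending + 1, updated_at = now()
         WHERE user_id = NEW.user_id;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE user_stats
           SET pending = pending - 1, updated_at = now()
         WHERE user_id = OLD.user_id;
    END IF;
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER work_items_stats
AFTER INSERT OR DELETE ON work_items
FOR EACH ROW EXECUTE PROCEDURE bump_user_stats();
```

The write path pays a tiny extra cost, but the 200 concurrent pollers stop hammering the joins.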
>>
>> I've also thought about using LISTEN/NOTIFY to send value updates to
>> the client only when needed, but with Npgsql I read that we need to
>> keep an open connection, and AFAIK that's not a good idea.
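
For reference, the server side of the LISTEN/NOTIFY route would look roughly like this (names are made up; the client would need one long-lived connection per user to receive the notifications):

```sql
-- Hypothetical trigger announcing changes; the payload carries the
-- user id so the client knows whose stats to re-read.
CREATE OR REPLACE FUNCTION notify_stats_change() RETURNS trigger AS $$
BEGIN
    PERFORM pg_notify('stats_update', NEW.user_id::text);
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER work_items_notify
AFTER INSERT OR UPDATE ON work_items
FOR EACH ROW EXECUTE PROCEDURE notify_stats_change();

-- On the client's (persistent) connection:
LISTEN stats_update;
```

Whether keeping ~650 idle connections open is acceptable is exactly the open question.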
>>
>> Thanks
>> Moreno
>>
>>>
>>>>
>>>> If more details are needed, just ask.
>>>>
>>>> Thanks in advance and sorry for the long message (but I had to explain
>>>> such a complex thing)
>>>> Moreno.-
