From: George Neuner <gneuner2(at)comcast(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Watching for view changes
Date: 2018-12-23 09:00:19
Message-ID: l1gu1e9pova7q543j5bcs0o0i1njm5io09@4ax.com
Lists: pgsql-general
On Fri, 21 Dec 2018 23:41:16 -0800, Mitar <mmitar(at)gmail(dot)com> wrote:
>Hi!
>
>On Fri, Dec 21, 2018 at 11:10 PM George Neuner <gneuner2(at)comcast(dot)net> wrote:
>> A materialized view IS exactly such a deliberate cache of results from
>> applying a view. It is a real table that can be monitored for changes
>> using INSERT, UPDATE and/or DELETE triggers.
>
>Caching is needed if you want to compute a difference between the previous
>version and the new one. But if you just want to know the new value, then I
>could imagine that (a simple implementation would), on every change to any
>underlying table, check whether the change matches the selectors of the
>query and, if it does, apply its operations/projections and produce the new
>value.
Yes, that could be done. But it isn't. In effect you are asking the
DBMS also to be a spreadsheet: i.e. change this cell and everything
that depends on it gets recomputed.
A spreadsheet is an order of magnitude simpler to implement than a
DBMS, but the combination would be an order of magnitude (or more)
harder. Even object graph databases don't offer spreadsheet
functionality, and it would be a lot easier to do there than in a
table relational system.
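What you *can* do by hand today is a narrow version of that idea: put a
trigger on the underlying table that recomputes just the affected slice of
the view and pushes it to listeners with NOTIFY. A rough sketch - the
orders/daily-totals schema here is invented purely for illustration:

CREATE OR REPLACE FUNCTION notify_daily_total() RETURNS trigger AS $$
DECLARE
    affected_day date;
BEGIN
    -- Work out which day's total is affected by this row change.
    IF TG_OP = 'DELETE' THEN
        affected_day := OLD.order_day;
    ELSE
        affected_day := NEW.order_day;
    END IF;

    -- Recompute only that slice of the "view" and send it as JSON.
    PERFORM pg_notify(
        'daily_totals_changed',
        COALESCE(
            (SELECT row_to_json(t)::text
               FROM (SELECT order_day, sum(amount) AS total
                       FROM orders
                      WHERE order_day = affected_day
                      GROUP BY order_day) t),
            '{}'));

    RETURN NULL;   -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_watch
    AFTER INSERT OR UPDATE OR DELETE ON orders
    FOR EACH ROW EXECUTE PROCEDURE notify_daily_total();

A client that has done LISTEN daily_totals_changed; then gets the new value
pushed to it on every change, without polling. But notice that this is
exactly the per-query bookkeeping you would be asking the DBMS to do
automatically for every view.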
>So yes, you need caching if you want to decrease CPU use, but you could
>also see it as new values being computed again and again through the query.
>Whether the caching you are mentioning would really improve performance,
>I do not know, so it might be premature optimization?
It may take only 3 cycles to multiply two numbers, but it can take
thousands of cycles [or millions if the data is on disk] to get those
two numbers into the multiplier.
There always are exceptions, but the general rule is that whenever the
result requires:
- significant computation,
- significant resources, or
- significant time
then you should cache the result instead of recomputing it.
Joins and sorts can take a whole lot of memory (and spill onto disk if
they overflow the work buffer). A fetch of a table or index not in
memory is simple but takes a lot of time - as well as maybe pushing
something else out (increasing the complexity of a competing query).
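As a concrete example, an expensive join+aggregate can be cached in a
materialized view and refreshed only when it makes sense, instead of being
recomputed on every read. Table and column names below are invented for
illustration:

CREATE MATERIALIZED VIEW customer_totals AS
SELECT c.customer_id,
       c.name,
       sum(o.amount) AS total_spent
  FROM customers c
  JOIN orders o USING (customer_id)
 GROUP BY c.customer_id, c.name;

-- A unique index lets REFRESH ... CONCURRENTLY avoid locking out readers.
CREATE UNIQUE INDEX ON customer_totals (customer_id);

-- Pay the join/sort cost here, once, on whatever schedule you choose.
REFRESH MATERIALIZED VIEW CONCURRENTLY customer_totals;

Reads against customer_totals are then simple index/heap fetches; only the
refresh pays the heavy cost.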
>If we do go down the cache path, then I agree, materialized views seem
>nice, but I would prefer temporary materialized views: they should be
>cleaned up at the end of the session. Moreover, they should ideally be
>just in memory, not really on disk. Materialized views are currently
>stored to disk, no?
In PG, all *named* tables are backed on disk - even temporary tables.
Only anonymous tables of query results can exist entirely in memory
[and even they can spill onto disk when necessary].
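As far as I know there is no TEMPORARY flavor of materialized view; the
closest approximation is a temporary table filled from the query - still a
named table and therefore disk-backed, but private to your session and
dropped automatically when the session ends. Names here are again invented:

CREATE TEMPORARY TABLE customer_totals_tmp AS
SELECT customer_id, sum(amount) AS total_spent
  FROM orders
 GROUP BY customer_id;

-- To "refresh" it: TRUNCATE and INSERT ... SELECT again, or simply
-- DROP and re-CREATE it.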
With enough memory you can cache all your tables in shared buffers and
have enough extra that you never run out of work buffers and never
overflow a work buffer. But that is the best you can achieve with PG.
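The knobs involved are shared_buffers (the page cache) and work_mem (the
per-sort / per-hash work buffer). The values below are purely illustrative -
the right numbers depend entirely on your machine and workload:

ALTER SYSTEM SET shared_buffers = '8GB';  -- cache for table and index pages;
                                          -- takes effect only after a restart
ALTER SYSTEM SET work_mem = '64MB';       -- per sort/hash operation, per backend

SELECT pg_reload_conf();  -- picks up work_mem; shared_buffers needs a restart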
George