Re: A Replication Idea

From: Medi Montaseri <medi(at)cybershell(dot)com>
To: "Command Prompt, Inc(dot)" <pgsql-general(at)commandprompt(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: A Replication Idea
Date: 2002-02-22 01:31:44
Message-ID: 3C759F80.9CEA2CB6@cybershell.com
Lists: pgsql-general

I don't think we can figure out what the actual plans are based on the very
high-level SQL language. The proxy should hook into a deeper layer, after
the plan has been written and before execution kicks in.

In other words, you take a PG engine, peel off the front-end part (parser,
planner), and then slip in a layer before the execution.

See your installation docs, "Chap 2, Section 2.1 The Path of a Query"

The path is

Connection, Parser Stage, Rewrite System, Planner/Optimizer, Executor.
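
To make the idea concrete, here is a rough Python sketch of slipping a layer
in between the planner and the executor. It is only an illustration under my
own assumptions: parse, rewrite, plan, Node, make_vdb_layer and run_query are
hypothetical stand-ins, not PostgreSQL internals, and the first node is
arbitrarily the one whose result goes back to the client.

```python
# Hypothetical sketch: none of these names exist in PostgreSQL itself.

def parse(sql):
    # Stand-in for the parser stage: wrap the raw text in a "parse tree".
    return {"tree": sql.strip()}

def rewrite(tree):
    # Stand-in for the rewrite system (rules, views); a no-op here.
    return tree

def plan(tree):
    # Stand-in for the planner/optimizer: turn the tree into a "plan".
    return {"plan": "execute " + tree["tree"]}

class Node:
    """Simulated back-end executor; a real one would be a database node."""
    def __init__(self, name):
        self.name = name
        self.executed = []  # plans this node has run, in order

    def execute(self, query_plan):
        self.executed.append(query_plan["plan"])
        return f"{self.name}: {query_plan['plan']}"

def make_vdb_layer(nodes):
    # The layer slipped in before execution: it receives the finished plan
    # and hands it to every node's executor.
    def vdb_execute(query_plan):
        results = [node.execute(query_plan) for node in nodes]
        return results[0]  # assumption: the first node answers the client
    return vdb_execute

def run_query(sql, executor):
    # Parser -> rewrite system -> planner, then whatever executor is plugged in.
    return executor(plan(rewrite(parse(sql))))
```

With two nodes, run_query("SELECT 1", make_vdb_layer([Node("a"), Node("b")]))
runs the same plan on both and returns the answer from node "a".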

In fact, the name is already there: "Planner/Optimizer". What we want is
optimization. I know people usually mean something different by that, but
why not? HA is optimization as well...

By the way, I got this idea from the Solaris Virtual File System (VFS); I
call this VDB (Virtual DataBase).

"Command Prompt, Inc." wrote:

> >How would it handle functions, which could potentially modify data, even
> >from a select statement?
>
> It seems that you'd have two options, if you wanted the proxy to be truly
> transparent to the client:
>
> 1. Send ALL SQL statements down the wire to each node, including SELECT
> statements, since selected functions may modify data.
>
> 2. Write a small, fast, reliable parser that checks for criteria which
> would make the statement potentially data-modifying (e.g., the
> existence of a function), and send only data-modifying SELECTs along
> with your standard UPDATEs, DELETEs, etc.
>
> However, it probably just occurred to you all as it just occurred to me
> that this is pretty moot, because functions aren't the only concern: you
> could have a trigger on a table that would wipe out idea #2. ;)
>
> Really, there are too many transparent ways data can be modified by
> seemingly innocuous statements, so parsing a statement for distribution
> is right out; it seems as though each node is going to have to require a
> copy of EACH statement that the proxy runs into in order to maintain 100%
> integrity.
>
> However, that doesn't mean your proxy needs to get an answer back from all of
> the nodes in terms of result sets. Something as simple as a systemic
> packet indicating that the downstream-execution was successful would be
> enough data for the proxy to know what's going on, provided it knows it
> should get its answer soon from another node (e.g., the node with the
> lowest load).
>
> Result sets could still be cached based on a statement, within some
> specified degree of accuracy (e.g., how much time elapses before a cached
> resultset expires); you'd just need to make sure that even though you're
> returning a cached result set, you still send the request to each back-end
> to get processed in its own time.
>
> Seems like some *really* careful threading might be called for; one thread
> to listen to incoming traffic, from which downstream events are queued up,
> another thread sending off those events to the back-end in the order they
> were received, and another thread listening for answers from nodes, and
> queueing up responses to be sent back to the appropriate client's socket.
>
> Regards,
> Jw.
> --
> jlx(at)commandprompt(dot)com, by way of pgsql-general(at)commandprompt(dot)com
> http://www.postgresql.info/
> http://www.commandprompt.com/
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
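
For what it's worth, the queueing scheme described above could be sketched
roughly like this in Python. Everything here is an assumption made for
illustration: Backend, dispatcher, responder and run_proxy are hypothetical
names, the back ends are simulated in memory, the caller plays the part of
the listening thread, and the first node stands in for "the node with the
lowest load".

```python
import queue
import threading

class Backend:
    """Simulated node; a real one would hold a database connection."""
    def __init__(self, name):
        self.name = name
        self.log = []  # statements in the order this node ran them

    def run(self, statement):
        self.log.append(statement)
        return f"{self.name} result for: {statement}"

def dispatcher(incoming, responses, backends):
    # Sends each queued statement to every back end, in arrival order.
    while True:
        item = incoming.get()
        if item is None:          # shutdown sentinel
            responses.put(None)   # pass the shutdown on to the responder
            return
        client_id, statement = item
        answer = None
        for i, node in enumerate(backends):
            result = node.run(statement)
            if i == 0:
                answer = result   # only this node's result set is kept
            # For the other nodes, run() returning without an exception
            # stands in for the "execution was successful" packet.
        responses.put((client_id, answer))

def responder(responses, outboxes):
    # Queues each answer for the client that asked for it.
    while True:
        item = responses.get()
        if item is None:
            return
        client_id, answer = item
        outboxes[client_id].append(answer)

def run_proxy(requests, backends):
    incoming, responses = queue.Queue(), queue.Queue()
    outboxes = {client_id: [] for client_id, _ in requests}
    threads = [
        threading.Thread(target=dispatcher, args=(incoming, responses, backends)),
        threading.Thread(target=responder, args=(responses, outboxes)),
    ]
    for t in threads:
        t.start()
    for request in requests:      # the caller acts as the listener thread
        incoming.put(request)
    incoming.put(None)
    for t in threads:
        t.join()
    return outboxes
```

Because there is a single dispatcher reading from a FIFO queue, every node
sees the statements in the same order, which is the property the whole
scheme depends on. A real proxy would also have to handle node failures and
timeouts, which this sketch ignores.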

--
-------------------------------------------------------------------------
Medi Montaseri medi(at)CyberShell(dot)com
Unix Distributed Systems Engineer HTTP://www.CyberShell.com
CyberShell Engineering
-------------------------------------------------------------------------
