Making it stick.: Re: Question about message passing paradigm

Tuesday, July 01, 2008

Re: Question about message passing paradigm

Responding to a thread on the erlang-questions list...

The problem we are discussing is processes B, C, D hold information X, Y, Z respectively; process A wants a coherent snapshot of X, Y, Z.
There are actually two slightly different cases depending on whether A needs "X Y Z as of *now*" (A, B, C, and D must all synchronise), or A needs "X Y Z as of *some* time in the recent past" (B, C, and D must all synchronise but can then send the information to A without waiting for A to receive it).
I like this problem because it is simple yet subtle. One way that it is subtle is that in "multithread" programming most people STILL think in terms of a single absolute time shared by all threads. (After all, there _is_ such a thing, the system clock. And yes, it's not exactly true, but it's close enough to make people _think_ it's true.) But when you start thinking about Erlang and especially *distributed* Erlang, you start realising that "now" is a pretty fuzzy concept.

Yes, the problem seems simple yet subtle. The downside is that any specific instance of it carries many unwritten constraints, and those can push the choice of solution one way or another. Unless you really want to dig into them, any cost/benefit comparison of the alternatives could be well off.

For example, why not coordinate through an in-memory database? That could be reasonable, or not. We don't know enough.
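
To make the in-memory option a little more concrete, here is a minimal sketch, written in Python rather than Erlang, with threads standing in for processes. `SnapshotStore` and its methods are names I made up for illustration; they are not from the thread or from any library. The assumption is that every source writes into one shared store and the reader copies it under a lock, so the copy is coherent as of the moment it was taken.

```python
import threading


class SnapshotStore:
    """Hypothetical shared in-memory store (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def put(self, key, value):
        # A single source updates its own key.
        with self._lock:
            self._values[key] = value

    def put_many(self, updates):
        # Several keys change together, so a reader never sees half an update.
        with self._lock:
            self._values.update(updates)

    def snapshot(self):
        # Copy under the lock: coherent as of the moment of the copy.
        with self._lock:
            return dict(self._values)


store = SnapshotStore()


def source(key, values):
    # Stand-in for process B, C, or D publishing its latest value.
    for value in values:
        store.put(key, value)


threads = [
    threading.Thread(target=source, args=("X", range(1, 101))),
    threading.Thread(target=source, args=("Y", range(101, 201))),
    threading.Thread(target=source, args=("Z", range(201, 301))),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

final = store.snapshot()
```

The lock is the whole trick here, and it is also the cost: every writer and every reader queues up behind it. Whether that is acceptable depends on exactly the constraints we haven't been told.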

Why not have the source processes send a message on a periodic or scheduled basis? That could be reasonable, or not, and it could cut down the message traffic, which seemed to be a concern.
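
A rough sketch of the scheduled approach, again in Python with threads and a queue standing in for Erlang processes and a mailbox. The names `source` and `drain_latest`, the period, and the sample readings are all assumptions for illustration. The collector never sends a request; it just keeps the newest value it has heard from each source.

```python
import queue
import threading
import time


def source(name, readings, period, mailbox):
    """Push each reading to the collector's mailbox, one per period."""
    for value in readings:
        mailbox.put((name, value))
        time.sleep(period)


def drain_latest(mailbox):
    """Empty the mailbox, keeping only the newest value seen per source."""
    latest = {}
    while True:
        try:
            name, value = mailbox.get_nowait()
        except queue.Empty:
            return latest
        latest[name] = value


mailbox = queue.Queue()
feeds = {"B": [1, 2, 3], "C": [10, 20, 30], "D": [100, 200, 300]}
workers = [
    threading.Thread(target=source, args=(name, readings, 0.001, mailbox))
    for name, readings in feeds.items()
]
for w in workers:
    w.start()
for w in workers:
    w.join()

latest = drain_latest(mailbox)
```

Note what this does and does not give you: the values are each "recent", but nothing forces them to belong to the same instant. Whether that counts as a coherent snapshot is one of the unwritten constraints.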

Why is sending fewer than N messages a concern? Why does one process have to collect the information? How much information? How tight is the deadline? Is "now" an actual timestamp or just some unknown point in time at which a request was received? How close to "now" do the other "nows" have to be with respect to each other? Can you widen that window if it would decrease the effort to build?
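
The "how close do the nows have to be" question can at least be stated in code. The sketch below is Python, not Erlang, and every name in it (`holder`, `collect`, `spread`, `coherent`) is hypothetical. Each holder stamps its reply with the time it read its value, and the collector measures how wide a window those stamps cover. It leans on one big assumption: all the threads share a single clock (`time.monotonic`), which is exactly the assumption that stops holding in distributed Erlang.

```python
import queue
import threading
import time


def holder(name, value, inbox):
    """Answer one snapshot request with (name, value, time the value was read)."""
    reply_to = inbox.get()
    reply_to.put((name, value, time.monotonic()))


def collect(inboxes, timeout=1.0):
    """Ask every holder for its value and gather one reply per holder."""
    replies = queue.Queue()
    for inbox in inboxes:
        inbox.put(replies)
    return [replies.get(timeout=timeout) for _ in inboxes]


def spread(replies):
    """Width, in seconds, of the window covering all the holders' 'nows'."""
    stamps = [stamp for _, _, stamp in replies]
    return max(stamps) - min(stamps)


def coherent(replies, window):
    """Accept the snapshot only if every 'now' falls inside the window."""
    return spread(replies) <= window


inboxes = [queue.Queue() for _ in range(3)]
holders = [
    threading.Thread(target=holder, args=(name, value, inbox))
    for (name, value), inbox in zip([("B", "x"), ("C", "y"), ("D", "z")], inboxes)
]
for h in holders:
    h.start()

replies = collect(inboxes)
for h in holders:
    h.join()

accepted = coherent(replies, window=0.5)  # 0.5 s is an arbitrary tolerance
```

Widening `window` is the cheap knob: the wider it is, the less coordination the holders need, and the less the word "snapshot" means.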

If synchronization across the processes is needed, is an "eventually consistent" approach reasonable if it lowers the effort to build?

Interesting stuff, but challenging to talk about when the details are too abstract.