Making it stick.: TSS: Using Javaspaces

Sunday, February 04, 2007

TSS: Using Javaspaces

There is a so-so article on JavaSpaces over at TSS, but it has spawned a long, interesting thread covering multiple topics. After I went through a substantial signup process and clicked on the URL in their email, TSS still refuses to let me participate. That sucks, but I'll just put my various responses here. Sites that create barriers to participation irk me.

Going down the comments, pulling out what catches my eye. Some of my responses are clarifications, some are educated guesses. Too bad they cannot be in-line over there for others to correct.

"public fields"

The top level object written to a space implements Entry. Only the public fields are marshalled to/from the space. This strikes people as funny at first. Ken Arnold has a rationale. This is another of those things that should be in the core documentation.

First of all, the objects those public fields refer to have all their data serialized. (They implement Serializable.) It is only the top-level Entry that considers just public fields.

The analogy Ken gives is that an Entry's public fields are like the parameters to an asynchronous procedure call. Read his explanation. It works for me. This choice also ties into keeping things simple for the application developer and the space implementor. More elaborate choices bring more complexity.
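
To make the "only the public fields" rule concrete, here is a minimal sketch in plain Java. The `Entry` interface below is a hypothetical stand-in for `net.jini.core.entry.Entry` so the example compiles without Jini, and `TaskEntry` and `marshalledFieldNames` are made-up names. It only lists which fields would be considered (public, non-static, non-transient, non-final, as I understand the entry rules); it does not do any real marshalling.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for net.jini.core.entry.Entry (assumption: lets the sketch compile without Jini).
interface Entry extends java.io.Serializable {}

// Hypothetical entry type: only its public fields take part.
class TaskEntry implements Entry {
    public String name;                     // considered
    public Integer priority;                // considered
    private String scratch = "local only";  // ignored at the top level

    public TaskEntry() {}                   // entries are expected to have a public no-arg constructor
}

class PublicFieldsSketch {
    // Names of the fields a space would look at for the given entry type.
    static List<String> marshalledFieldNames(Class<? extends Entry> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getFields()) {  // getFields() returns public fields only
            int m = f.getModifiers();
            if (!Modifier.isStatic(m) && !Modifier.isTransient(m) && !Modifier.isFinal(m)) {
                names.add(f.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Lists name and priority (order not guaranteed), never scratch.
        System.out.println(marshalledFieldNames(TaskEntry.class));
    }
}
```

Seen this way, the public fields really do look like the parameter list of a call: they are the part of the object the other side gets to see.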

"[spaces] works best when the problem is... 'data-centric'... as opposed to 'process-centric'"

I am not sure what this means. Spaces are good for distributed processes as well as data. An Entry and its referenced data are marshalled along with their codebase so that systems reading and taking them can use their code without it being on that system's classpath a priori. That is very powerful.

"Javaspaces is a poor model for building large-scale... non-holistic data-intensive compute work-loads"

Maybe. I don't know. What the heck does this even mean?

"I'm assuming that when you are altering something in a space, there's some sort of locking on it"

Nothing can be altered while in a space. Those things are not in a JVM per se. (Although a JVM may be used in the implementation.) Each public field is marshalled independently on a write, and then unmarshalled back into a JVM on a read or take. Object identity is not preserved. So if JVM #1 does a write and then a take of some Entry and its data, there is now the original Entry and data, plus a new deep copy of the Entry and data.
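
The "deep copy, no shared identity" point can be seen with ordinary Java serialization. This is a sketch under assumptions: `roundTrip` is a made-up helper that stands in for a write followed by a take, and a real space marshals each public field separately rather than serializing one object graph like this.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

class CopySketch {
    // Serializes a value to bytes and reads it back, standing in for write-then-take.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>();
        original.add("payload");
        ArrayList<String> copy = roundTrip(original);

        System.out.println(copy.equals(original)); // true: same content
        System.out.println(copy == original);      // false: a different object

        original.add("changed later");
        System.out.println(copy.size());           // 1: the copy does not follow the original
    }
}
```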

And so you can see clearly that a JavaSpace is *not* a cache, and it is *not* an OODB for Java objects. It is something else altogether that can serve many purposes. The way to exclusively modify something is to take it, update it, and write (a copy of it) back to the space.
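
Here is a rough sketch of that take, update, write cycle. `ToySpace` and `CounterEntry` are hypothetical, in-memory stand-ins I made up for illustration; this is not the JavaSpace interface, whose real `take` also accepts a template, a transaction and a timeout.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical entry type for the sketch.
class CounterEntry {
    public String name;
    public Integer value;

    CounterEntry(String name, Integer value) {
        this.name = name;
        this.value = value;
    }

    CounterEntry copy() {
        return new CounterEntry(name, value);
    }
}

// Toy in-memory stand-in for a space (assumption: not the JavaSpace API).
class ToySpace {
    private final Deque<CounterEntry> entries = new ArrayDeque<>();

    // Stores a copy, so later changes to the caller's object never reach the space.
    synchronized void write(CounterEntry e) {
        entries.addLast(e.copy());
    }

    // Removes and returns one entry, or null if the space is empty.
    synchronized CounterEntry take() {
        return entries.pollFirst();
    }

    synchronized int size() {
        return entries.size();
    }
}

class TakeUpdateWriteSketch {
    // Exclusive update: while the entry is taken, no other participant can find it.
    static void increment(ToySpace space) {
        CounterEntry taken = space.take();
        if (taken == null) {
            return;                      // nothing to update
        }
        taken.value = taken.value + 1;   // modify the local copy only
        space.write(taken);              // what goes back is a new entry
    }

    public static void main(String[] args) {
        ToySpace space = new ToySpace();
        space.write(new CounterEntry("hits", 0));
        increment(space);
        System.out.println(space.take().value); // 1
    }
}
```

There is no lock held on anything "inside" the space here. The exclusivity comes from the entry simply not being there while someone has taken it.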

"what, if any, value JavaSpaces has vs. messaging"

I think if you want widespread, anonymous publish/subscribe of data across many disparate business processes, then something like JMS or AMQP is a good choice.

JavaSpaces can be used to implement pub/sub-like behavior but that is not its only, or even core, strength. Likewise a space can be used to implement queue-like structures with topics (i.e. the public fields of an Entry acting like topic information as well as payloads).

The big message is that JavaSpaces can be used to more quickly create a wider variety of coordination conventions, like sparse, distributed arrays of objects, hierarchies of objects, and so on. Moreover, those objects are marshalled with their *codebase* URLs for other participants to load. JMS has no such capability to my knowledge.

There is less setup, and more options, e.g. a JavaSpace can look more like a simple database with a JVM doing the writing and taking of its own Entry instances.

"[JavaSpaces] guarantees persistent storage"

More accurately, I hope, a JavaSpace *leases* storage. Leases can expire, not be renewed, etc. So there is no guarantee of persistent storage forever, although in some cases this could be provided.
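
A toy model of what "leased" storage means, as opposed to "persistent" storage. `LeasedStore` and its methods are hypothetical names of mine, not the Jini lease API, and time is passed in explicitly so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of leased storage (assumption: hypothetical names, not the Jini Lease API).
class LeasedStore {
    // Entry name -> time in milliseconds at which its lease runs out.
    private final Map<String, Long> expirations = new HashMap<>();

    // Stores an entry for durationMs, starting at time now.
    void write(String name, long now, long durationMs) {
        expirations.put(name, now + durationMs);
    }

    // Extends the lease if it is still running; returns false if it already expired.
    boolean renew(String name, long now, long durationMs) {
        Long expiry = expirations.get(name);
        if (expiry == null || expiry <= now) {
            expirations.remove(name);
            return false;
        }
        expirations.put(name, now + durationMs);
        return true;
    }

    // An entry is visible only while its lease is still running.
    boolean contains(String name, long now) {
        Long expiry = expirations.get(name);
        return expiry != null && expiry > now;
    }

    public static void main(String[] args) {
        LeasedStore store = new LeasedStore();
        store.write("job-1", 0, 1000);
        System.out.println(store.contains("job-1", 500));   // true: lease still running
        System.out.println(store.renew("job-1", 900, 1000)); // true: now good until 1900
        System.out.println(store.contains("job-1", 2000));  // false: nobody renewed in time
    }
}
```

Whoever wrote the entry has to keep caring about it. If the writer goes away and stops renewing, the storage is reclaimed.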

"How does Coherence relate to this"

I don't know that much about it, but it seems to me Coherence implements various forms of distributed, shared java.util.Map implementations. Depending on the choice, more or less of the Map exists in the application JVM, and locks, etc. are used to update entries more or less atomically.

A JavaSpace exists outside of any of the participating JVMs. Locks are used to get an Entry to/from a space, but there is no update in place at all. Neither are there key/value pairs. Just Entry instances with public fields.

"a Java only solution"

Yes, unless you go with GigaSpaces, which has support for C/C++ and .NET.

Or if you can take a "Java in the middle" position then the Jini parts of Jini/Javaspaces allow integration with other languages and protocols. But Java does have to show up in the middle of everything, and everything else is essentially second-class.

"why are relational databases so dominant?"

Query languages and long-lived support for static data that survives multiple generations of programming languages, etc. Not great, but they've been around for the better part of 30 years.

"no booleans, ints, or doubles"

True, the public fields of an Entry have to be Objects. An Entry is used to "query by example" in a very simple way, and null means "don't care" for that field.

This is not such a big deal, especially with Java 5 which has better support for mixing primitive types and their Object equivalents.
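
A small sketch of the "query by example" idea, where a null field in the template means "don't care". `OrderEntry` and `matches` are made up for illustration. This version compares fields with `equals`, which is a simplification on my part; a real space matches on the marshalled form of each field.

```java
import java.lang.reflect.Field;

// Hypothetical entry type. The quantity is an Integer, not an int, so that null can mean "don't care".
class OrderEntry {
    public String customer;
    public Integer quantity;

    OrderEntry(String customer, Integer quantity) {
        this.customer = customer;
        this.quantity = quantity;
    }
}

class TemplateMatchSketch {
    // True if every non-null public field of the template equals the candidate's field.
    static boolean matches(Object template, Object candidate) {
        if (!template.getClass().isInstance(candidate)) {
            return false;                       // candidate must be of the template's type
        }
        try {
            for (Field f : template.getClass().getFields()) {
                Object wanted = f.get(template);
                if (wanted != null && !wanted.equals(f.get(candidate))) {
                    return false;               // a specified field differs
                }
            }
            return true;                        // null fields are wildcards
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        OrderEntry stored = new OrderEntry("acme", 3);  // 3 is autoboxed to an Integer

        System.out.println(matches(new OrderEntry(null, null), stored));   // true: matches anything
        System.out.println(matches(new OrderEntry("acme", null), stored)); // true: any quantity
        System.out.println(matches(new OrderEntry("acme", 4), stored));    // false: quantity differs
    }
}
```

With an `int` field there would be no way to say "any quantity", since 0 is a perfectly good value to match on. Autoboxing keeps the calling code about as short as it would be with primitives.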

"What Gelernter envisioned..."

I think it is only fair to say that Tuple Spaces was an *influence* on JavaSpaces. I do not believe that the intention was to implement the strict definition of Linda.

"Croquet"

My understanding of Croquet and TeaTime is that the intention is to keep shared objects in sync while allowing concurrent modifications among all the participants. A space is different in that participants may come and go, only one participant can update an object, and moreover the update only occurs on the *copy* that was read or taken from the space. When the update is written back, it will be a new thing, not an update.

"why is open source risk free"

I don't think it is risk free. It reduces some risks, e.g. the risk of having to update on a vendor's schedule. It also reduces the risk of not deploying as many instances as desired, when desired, i.e. the potential negative results of combining licensing structures (e.g. per CPU) with IT budgets (e.g. not wanting to license a lot of development and test environments, or more than a minimum number of production machines).

It is not just about source code. I've worked for vendors and have been a customer of vendors, where source code was part of escrow agreements... if something goes wrong, the code is in escrow and should become available to the customer.

Many commercial products, such as Confluence (wiki) and Cincom Smalltalk, provide access to, and even modification of, their source code within limits.

6 comments:

Joseph Ottinger said...: Patrick, we seem to have a problem in the signup code, where new users aren't able to log in after signup. It's been sent to the proper people, but they're, well, taking their time. I'll escalate. (Again.); 2/05/2007 4:34 AM
Anonymous said...: Very nice summary. Since I've been doing a little homework recently, I'll add a little to the question "what, if any, value JavaSpaces has vs. messaging"

It's a link to Dan's blog answering the same question. http://www.dancres.org/blitzblog/2006/02/20/javaspaces-and-jms/

In reference to your troubles commenting on TSS, you might not know that your own blog requires a Google or Blogger account in order to comment.; 2/05/2007 6:20 AM
Patrick Logan said...: "you might not know that your own blog requires a Google or Blogger account in order to comment"

I know. ;-/

I was hoping no one would point that out.

At a minimum I hope the process works and is not as cumbersome as the one at TSS.

Where is that identity thing again?; 2/05/2007 9:20 AM
PetrolHead said...: "I think it is only fair to say that Tuple Spaces was an *influence* on JavaSpaces. I do not believe that the intention was to implement the strict definition of Linda."

Your belief is correct - as the original Linda model was really all about a concurrent programming model for use in shared-memory machines be they SMP, NUMA or whatever.

JavaSpaces supports a similar programming model (although I use some other models with them as well) but is directed towards use across multiple independent machines with no shared memory.

Dan Creswell
http://www.dancres.org/; 2/05/2007 1:54 PM
Paul Beckford said...: Hi,

Good summary. The discussion lost me too! About Croquet, your description of TeaTime is accurate, but as I posted on TSS - the similarity with Tuple Spaces actually exists in serverside Croquet. Something they call a "Worldbase".

The thing is though, to know what a worldbase is would mean reading the white paper supplied, something people are obviously reluctant to do (to read primary sources) - hence the misrepresentation of Tuple Spaces, Linda etc.; 2/06/2007 2:15 AM
Anonymous said...: Thanks for your nice post!; 9/21/2007 4:21 PM
