Making it stick.: 12/18/05

Saturday, December 24, 2005

Logically Out of the Blue

I did not see this coming. I was in a Barnes and Noble book store the other day and a new book jumped off the shelf into my hands. (I was shopping for gifts, honestly. I don't know what drew me over to the computer section.)

The relationship between Lisp and logic programming goes back to the 1960s and was a precursor to Scheme being invented in the first place. Sussman and Steele created Scheme as a vehicle to understand Carl Hewitt's actor language, which was derived from his Planner language. Hewitt developed Planner at about the same time as Alain Colmerauer was developing Prolog. Planner implemented a backtracking capability (for planning, get it?) similar to Prolog's. Scheme has since been used to implement many kinds of programming languages, including several kinds of Planner-like and Prolog-like languages.

And so The Reasoned Schemer brings it all back home to the student of programming and programming languages. It follows the Q&A-with-food-themes style of books also written by Dan Friedman with various authors for learning Lisp and Scheme themselves. (They even have a Java book in this style.) Some people like the style, others do not.

Programmers interested in Lisp, Scheme, functional programming, and/or logic programming will be interested in this book.

Friday, December 23, 2005

Representing RDF

Bill de hÓra explores the upsides and downsides of using an RDBMS for a persistent representation of RDF. This reminds me of several times I've encountered a generic attribute/value pair relational data model. As Bill points out regarding the "disenabling" this does to Rails and Django (as they are currently designed), the generic model moves the domain vocabulary out of the relational schema and into the data itself.

I understand one of the keys to RDF is that this is a conceptual model for the semantic web and not necessarily a recommended persistent physical model. That doesn't mean it couldn't be, but I wonder what other models might make sense.

An advantage Bill points out to this approach is the RDBMS schema itself would change little if any over time. Everything becomes data stored using one simple, generic model. This tells me we need a database, independent of RDF or any other conceptual model, that can recognize more about the domain-specific data and relationships while supporting "agile data refactoring". Some databases are pretty good about this for star schema models. Star schemas kind of aggregate related RDF triples into another conceptual model that is only somewhat more complex than the RDF concept itself.
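To make the generic attribute/value model concrete, here is a minimal sketch using Python's standard sqlite3 module. The `triple` table, the function names, and the sample data are hypothetical and chosen only for illustration; they are not taken from Bill's post. The point to notice is that domain words such as "manager" and "name" appear only as row values, so the schema never changes when the vocabulary grows.

```python
import sqlite3

def build_triple_store():
    """Create an in-memory generic subject/predicate/object table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triple (subject TEXT, predicate TEXT, object TEXT)")
    conn.executemany(
        "INSERT INTO triple VALUES (?, ?, ?)",
        [
            ("emp:1", "name", "Alice"),
            ("emp:2", "name", "Bob"),
            ("emp:2", "manager", "emp:1"),
        ],
    )
    return conn

def add_fact(conn, subject, predicate, obj):
    """A new kind of attribute is just another row; no schema change is needed."""
    conn.execute("INSERT INTO triple VALUES (?, ?, ?)", (subject, predicate, obj))

def objects_of(conn, subject, predicate):
    """Return every object stored for the given subject and predicate."""
    rows = conn.execute(
        "SELECT object FROM triple WHERE subject = ? AND predicate = ? ORDER BY object",
        (subject, predicate),
    )
    return [row[0] for row in rows]

def manager_name(conn, subject):
    """Each hop through the domain costs one self-join on the generic table."""
    row = conn.execute(
        "SELECT n.object FROM triple m JOIN triple n ON n.subject = m.object "
        "WHERE m.subject = ? AND m.predicate = 'manager' AND n.predicate = 'name'",
        (subject,),
    ).fetchone()
    return row[0] if row else None
```

Calling `manager_name(build_triple_store(), "emp:2")` follows the "manager" triple and then the "name" triple. The self-join is the price of keeping the vocabulary in the data: the database itself knows nothing about employees or managers.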

A star schema relates MxM things to each other through a common "fact". The simplest fact is a statement that such a relationship exists. Say a relationship is established at some point in time, such as an employee being assigned a manager. That relationship is recorded as a fact (e.g. a row in the "management" table) relating the employee (e.g. a row in the "employee" table) to the manager (e.g. another row in the "employee" table). Other related "triples" vector through the same fact row: the "calendar" table for the start date, the "department" table for the organization being managed, and perhaps the "role" table for the role the employee is playing.

Other triples, even more tightly associated with each other, make up the "dimensions" themselves. So the relationship of the employee to their name, address, pay scale, and other vital statistics may simply be represented as columns in the same employee row. The result is a normalized fact table bringing together an array of denormalized dimension tables.
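The employee/manager example can be sketched as a tiny star schema, again with Python's standard sqlite3 module. The table names follow the ones mentioned above; the column names and sample rows are assumptions made for illustration. The "management" fact row holds only keys into the dimensions, while each dimension row carries its own descriptive columns.

```python
import sqlite3

def build_star_schema():
    """Create an in-memory star schema: one fact table, three dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimensions: denormalized, descriptive columns live in the row itself.
        CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, pay_scale TEXT);
        CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE calendar (id INTEGER PRIMARY KEY, day TEXT);

        -- Fact: normalized, nothing but keys into the dimensions.
        CREATE TABLE management (
            employee_id INTEGER REFERENCES employee(id),
            manager_id INTEGER REFERENCES employee(id),
            department_id INTEGER REFERENCES department(id),
            start_day_id INTEGER REFERENCES calendar(id)
        );

        INSERT INTO employee VALUES (1, 'Alice', 'G7');
        INSERT INTO employee VALUES (2, 'Bob', 'G5');
        INSERT INTO department VALUES (10, 'Engineering');
        INSERT INTO calendar VALUES (100, '2005-12-01');
        INSERT INTO management VALUES (2, 1, 10, 100);
    """)
    return conn

def management_facts(conn):
    """Resolve each fact row into (employee, manager, department, start day)."""
    return conn.execute("""
        SELECT e.name, m.name, d.name, c.day
        FROM management f
        JOIN employee e ON e.id = f.employee_id
        JOIN employee m ON m.id = f.manager_id
        JOIN department d ON d.id = f.department_id
        JOIN calendar c ON c.id = f.start_day_id
    """).fetchall()
```

Here one fact row stands in for several triples at once (employee has manager, employee manages within department, assignment starts on day), and the domain vocabulary is back in the schema where a tool can see it.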

The kinds of changes made to this model are simpler than those made to a "fully" normalized model. Columns are typically added to a dimension but are rarely removed. Columns infrequently change data type, but could. Dimensions are added to the model, as are new facts and new relationships between facts and dimensions.

Databases not based on the 25-year-old approaches of the typical RDBMS are pretty efficient at representing and refactoring this model. Sybase IQ is one, e.g. in the Sybase Dynamic Operational Data Store. Efficient in this case means the space used in the database is typically *less* than the space used to represent the same data in a flat file. Very few databases have this capability and at the same time *reduce* the maintenance effort.

This model would be useful for RDF data sets that need to do a good bit of computation, for example when the "facts" of the relationships include numerical measurements of some kind: payments, temperatures, velocities, or occurrences (e.g. attendance, page hits).

When the model is not so computation intensive I would suspect the queries would be more like searches and path navigation. The structure would not benefit from being in a more general attribute/value exploded RDF-like schema. Some kind of graph model would be better to support path navigation around large networks.
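As a rough illustration of path navigation, the sketch below loads triples into an in-memory adjacency map and uses breadth-first search to find a shortest chain of triples between two nodes. The function names and sample triples are hypothetical, and a real graph store would do far more than this; the sketch only shows why following edges directly is a natural fit for this kind of query.

```python
from collections import defaultdict, deque

def build_graph(triples):
    """Index (subject, predicate, object) triples by subject for edge following."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph

def find_path(graph, start, goal):
    """Breadth-first search. Returns a list of triples from start to goal,
    an empty list if start is the goal, or None if no path exists."""
    if start == goal:
        return []
    queue = deque([start])
    came_from = {start: None}  # node -> (previous node, predicate used)
    while queue:
        node = queue.popleft()
        for predicate, neighbor in graph.get(node, ()):
            if neighbor in came_from:
                continue  # already reached by a path at least as short
            came_from[neighbor] = (node, predicate)
            if neighbor == goal:
                # Walk the back-pointers to rebuild the chain of triples.
                path = []
                current = goal
                while came_from[current] is not None:
                    previous, used = came_from[current]
                    path.append((previous, used, current))
                    current = previous
                path.reverse()
                return path
            queue.append(neighbor)
    return None
```

For example, with triples saying bob's manager is alice, alice's manager is carol, and carol is a member of the board, `find_path(graph, "bob", "board")` returns those three triples in order. In the generic relational table each of those hops would be another self-join.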

The three activities I think would come up frequently are more or less free-form search (text and other indexing independent of conceptual model), path navigation (across a large network of related RDF triples), and numeric computation (on measurements of some kind related to the things of the triple).

Wednesday, December 21, 2005

Jini/JavaSpaces/Rio... and Jython?

This one slides out of my sphere of awareness and then it's not too long before something pops it back inside. Are JavaSpaces still the best thing going for Java and yet still the least adopted capability?

I'm thinking about code replacement, etc. to keep a distributed system evolving. Erlang does it explicitly. Smalltalk does it well, too.

Java per se has class loaders, etc. But the still-too-secret-weapon that Java really has going for it is JavaSpaces (and the surrounding capabilities like Jini, including Rio).

JavaSpaces provides a simple distributed-shared-memory model for Java objects. While class loaders, etc. support dynamic loading, the mechanisms inside the language and VM per se do not support evolving interfaces very well. JavaSpaces is that "dynamically typed" (if you will forgive the phrase) surface in which "objects" [1] can actually change entire interface definitions over time. Not much more care is needed for this than for, say, a Lisp or Smalltalk system.
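The idea can be sketched with a toy, in-process tuple space in Python. This is not the JavaSpaces API: the `TupleSpace` class, its dict-based entries, and the "pricer" example are all hypothetical, and unlike a real space the `read` and `take` here return immediately instead of blocking. What it borrows from JavaSpaces is the write/read/take vocabulary and template matching in which an unset field acts as a wildcard.

```python
import threading

class TupleSpace:
    """Toy in-process space whose entries are plain dicts (illustration only)."""

    def __init__(self):
        self._entries = []
        self._lock = threading.Lock()

    @staticmethod
    def _matches(template, entry):
        # A template value of None is a wildcard, but the field must exist.
        return all(
            key in entry and (value is None or entry[key] == value)
            for key, value in template.items()
        )

    def write(self, entry):
        """Store a copy of the entry in the space."""
        with self._lock:
            self._entries.append(dict(entry))

    def read(self, template):
        """Return a copy of the first matching entry, or None. Non-blocking."""
        with self._lock:
            for entry in self._entries:
                if self._matches(template, entry):
                    return dict(entry)
        return None

    def take(self, template):
        """Remove and return the first matching entry, or None. Non-blocking."""
        with self._lock:
            for index, entry in enumerate(self._entries):
                if self._matches(template, entry):
                    return self._entries.pop(index)
        return None
```

A producer can `take` the entry playing the "pricer" role at version 1 and `write` a version 2 entry with a completely different set of fields. Consumers that match only on `{"role": "pricer"}` keep working, while consumers that still ask for a field the new entry lacks simply stop matching. The role survives even though no single entry ever changes shape.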

I'd rather be using Smalltalk, but JavaSpaces deserve much more attention. I wonder how well JavaSpaces and Jython fit together?

[1] I use "object" in quotes because I do not mean to imply that a POJO will change its entire interface. Rather the POJO that is playing the role of some conceptual object will go away over time, to be replaced by some other POJO that is playing the same or some evolved role. This is like the observation that over time the cells of our bodies go away, being replaced by new cells. After some time and degree of replacement, do we have the "same" body that we had earlier? This is the core essence of a survivable complex system.

A Bag on the Side of the Eclipse

I was recently reminded of a favorite book: Tracy Kidder's "The Soul of a New Machine". I went to Data General developing CAD software and met some of these guys a couple years after the book was published.

A couple of phrases stick out... Tom West saying, "Not everything worth doing is worth doing well," and building exactly what Ed deCastro said he did not want: "a bag on the side of the Eclipse".

This is a bag on the side of my blog. I have not been around a computer much outside of work lately. Hopefully over the holidays I will get back to finding out what this AJAX thingy is.

I am currently engrossed in another book though, "Wicked: The Life and Times of the Wicked Witch of the West". As John Updike is quoted on the cover, it is an "amazing novel". It reminds me of Marion Zimmer Bradley's "The Mists of Avalon" in its imaginative telling of a new story about an old story. Both books creatively incorporate real issues about society, culture, religion, and politics.
