"I have a mind like a steel... uh... thingy." Patrick Logan's weblog.


Saturday, February 21, 2004

Note the Subtext

Sean McGrath echoes Don Box, but isn't the subtext obvious?

Whence the network is the computer?

Ironically(?), the PC is still an API battleground. APIs are not as good as the wire for interop, but never doubt their importance to the industry.

Wednesday, February 18, 2004

Just great

One of the security firms doing business in Iraq is named CusterBattles.

Nope. It is named for the two principals, Scott Custer and Mike Battles.

It's Morning in America: So Let's Wake Up!

There are just a few basic points the coming presidential campaign should be based on. Every other issue should be brushed aside with just enough rhetoric to play to the center and get the conversation back on track.

It's called the center for good reasons. Most people are not far right neo-cons (even most true conservatives are not neo-cons). Likewise most people at best don't care about gay marriage. Sorry to those of you with strong opinions on either side of that one, which is simply a messy intersection of civil laws and socio-religious customs. The Democrats have to stick to the simple message this year.

No, the point of this election year is to get back to the center, or something resembling it. Let's call it the "moral majority".

The moral consequences of destroying countries and not rebuilding them are bad enough. But the practical consequences are worse. The Bush administration's foreign policy is undermining US security.

Monday, February 16, 2004

What makes a secure system?

As people debate the risks of the Windows code being stolen, and people debate the merits of open source for making code more secure, I am wondering about a few things:

  • Writing code in a language that allows buffer overruns is not safe.
  • Writing office applications for a LAN that assume all participants are associates and then putting them on the Internet is not safe.
  • Writing operating systems that run every application with full user rights (or worse) is not safe.

Open source that is poorly designed is not safe. Closed source that is poorly designed and then released in the open is even less safe.

Sunday, February 15, 2004

More Thoughts on the Future of Persistence

Jay Han picks up the discussion on the future of persistence and poses some questions...

What are the experiences from orthogonally persistent OSes? Did they make, say, IPC any easier?

I can't answer from the OS perspective, but I can answer from the Gemstone Smalltalk perspective (since Smalltalk has been accused of being an OS, and indeed has demonstrated itself to be a pretty complete one).

Did Gemstone/S make IPC any easier? Of course the answer is yes and no.

Yes, because it implements a transactional (ACID) shared memory, so all applications are coordinated in the loosely coupled manner of a database. And yes, because the persistence between one app and the database is transparent, i.e. that relationship is nearly identical to a Smalltalk app and its image file.

No, because it is specific to Smalltalk. Integration with other languages is no better, i.e. you need to use COM or CORBA or C or SOAP or... Next question...

Amoeba had more RAM than disk and it didn't get virtual memory until late. Obviously there must have been some problems making data persistent there. Was there ever another system that had larger primary memory than secondary?

All I can say here is that Gemstone/S applications perform best when most of the persistent pages of objects fit in a Gemstone shared page cache and most of the applications run on the node that has the cache.

Palm PDAs and cell phones have transparent persistence. What are other examples of a set of applications utilizing transparent persistence today?

Gemstone applications in insurance, banking, transportation, manufacturing, etc. are all based on transparent persistence. This is not entirely true because by the time a large multi-user application gets into production there is a good bit of code managing transactions, conflicts, etc. so the persistence is no longer entirely transparent. But there is no O/R mapping and there is no concern about the objects not fitting in RAM. Instead of persistence per se, the emphasis is on coordination.
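To give a rough feel of what "no O/R mapping" looks like to application code, here is a small Python analogue using the standard library's shelve module. This is not Gemstone, and it is neither transactional nor multi-user; it only illustrates storing and retrieving ordinary objects without any mapping layer in between.

    # A rough analogue of transparent-ish persistence using the standard
    # library's shelve module (not Gemstone): ordinary objects go in and
    # come back out with no schema and no O/R mapping code.
    import shelve

    class Account:
        def __init__(self, owner, balance):
            self.owner = owner
            self.balance = balance

    with shelve.open("accounts") as db:          # a persistent dictionary on disk
        db["patrick"] = Account("patrick", 100)  # stored as-is, no mapping layer

    with shelve.open("accounts") as db:
        acct = db["patrick"]                     # comes back as an Account object
        print(acct.owner, acct.balance)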

Now I would argue strenuously that Gemstone/S does not have the right coordination model (i.e. a shared transactional object space). In this model *everything* in RAM is transactional (at least everything strongly connected from a persistent root). And everything in an external database is also transactional, so every connection between the transactional RAM and the transactional relational database requires a two-phase commit transaction, or some confidence that the two-phase rules can be relaxed.
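To make that cost concrete, the two-phase commit shape implied here looks roughly like the sketch below. The participant objects and their prepare/commit/rollback methods are hypothetical, purely for illustration; real 2PC also has to handle failures and recovery, which this ignores.

    # Bare-bones two-phase commit sketch (hypothetical participant API).
    def two_phase_commit(participants):
        # Phase 1: ask every resource (object space, relational DB, ...) to prepare.
        if all(p.prepare() for p in participants):
            # Phase 2: everyone voted yes, so tell all of them to commit.
            for p in participants:
                p.commit()
            return True
        # Any "no" vote rolls everybody back.
        for p in participants:
            p.rollback()
        return False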

A better model is not to make everything in RAM transactional; rather, the application should use specific coordination mechanisms, in particular one of:

  • tuple space
  • versioned tree
  • star schema dimensional facts

These mechanisms are kinds of "databases" that could be implemented even more simply in MRAM. An application that may itself be in MRAM should still coordinate its activities with these explicit coordination mechanisms that may or may not have multiple, distributed client processes.
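As one example of such an explicit coordination mechanism, a tuple space can be sketched in a few lines of Python. This is a toy, single-process version with an API of my own choosing (Linda, JavaSpaces, and the like are the real articles); the point is only that coordination happens through put/take on the space rather than through making every object in RAM transactional.

    # A minimal in-process tuple space sketch (hypothetical API).
    import threading

    class TupleSpace:
        def __init__(self):
            self._tuples = []
            self._cond = threading.Condition()

        def put(self, tup):
            with self._cond:
                self._tuples.append(tup)
                self._cond.notify_all()

        def _match(self, template, tup):
            # None in the template acts as a wildcard.
            return (len(template) == len(tup) and
                    all(t is None or t == v for t, v in zip(template, tup)))

        def take(self, template):
            # Block until a matching tuple appears, then remove and return it.
            with self._cond:
                while True:
                    for tup in self._tuples:
                        if self._match(template, tup):
                            self._tuples.remove(tup)
                            return tup
                    self._cond.wait()

    space = TupleSpace()
    space.put(("order", 42, "new"))
    print(space.take(("order", 42, None)))   # -> ("order", 42, "new")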

Compiling Efficient Python

[Update: Michael Salib's PyCon 2004 session will be addressing exactly this topic --- "Faster than C: Static Type Inference with Starkiller"...

This dynamism makes programming in Python a joy, but generating optimal code a nightmare. Yet while the presence of such abundant dynamism makes traditional static optimization impossible, in most programs, there is surprisingly little dynamism. For example, in most Python programs:

  • all class and function objects are created exactly once
  • class inheritance relationships do not change at run time
  • methods are not added after a class object has been created
  • most expressions have exactly one type; the vast majority of those that have more than one type have only a few types

The flip side is that what little dynamism a particular program makes use of is often absolutely vital.

I have developed a type inference algorithm for Python that is able to resolve most dispatches statically. This algorithm is based on Ole Agesen's Cartesian Product Algorithm for type inference of Self programs. I have built a type inferencer for Python based on this algorithm called Starkiller. ]
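For readers unfamiliar with Agesen's algorithm, the central move can be shown in a few lines. This is my own toy sketch, not Starkiller: at a call site, take the set of possible types inferred for each argument, form their cartesian product, analyze the callee once per fully concrete combination (each one is monomorphic), and union the results.

    # Toy illustration of the Cartesian Product Algorithm's core move.
    from itertools import product

    def cpa_call_site(arg_type_sets, analyze_callee):
        """arg_type_sets: one set of possible types per argument position.
        analyze_callee: returns the result type for one concrete combination."""
        result_types = set()
        for combo in product(*arg_type_sets):
            # Each combo is monomorphic, so the callee can be analyzed
            # precisely for it; results are unioned at the call site.
            result_types.add(analyze_callee(combo))
        return result_types

    # Hypothetical example: an `add` whose result type follows numeric promotion.
    def analyze_add(combo):
        return float if float in combo else int

    print(cpa_call_site([{int, float}, {int}], analyze_add))  # {int, float}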

As folks are discussing, CPython's simple implementation has a cost/benefit. Compiling efficiently may require some kind of "optimistic with fallback" approach. Assume the internal representations are not messed with, and so compile the code to be optimistically efficient. If the internals do get messed with, then flip the bit and fall back to the simple implementation for that object.
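One way to picture the "flip the bit" part, as a toy sketch of my own and nothing like CPython's real internals: code compiled against the original class keeps taking the fast path while the class is untouched; patching the class flips a flag, and later calls drop to the fully dynamic path.

    # Toy sketch: an optimistic fast path guarded by a "pristine" bit that is
    # flipped as soon as someone messes with the class.
    class Guarded(type):
        def __setattr__(cls, name, value):
            # Any write to the class after creation invalidates the optimism.
            type.__setattr__(cls, name, value)
            type.__setattr__(cls, "_pristine", False)

    class Point(metaclass=Guarded):
        _pristine = True
        def __init__(self, x, y):
            self.x, self.y = x, y

    def norm_sq(p):
        if type(p) is Point and Point._pristine:
            return p.x * p.x + p.y * p.y                    # optimistic fast path
        return getattr(p, "x") ** 2 + getattr(p, "y") ** 2  # generic fallback

    print(norm_sq(Point(3, 4)))   # 25, via the fast path
    Point.scale = 2.0             # "mess with" the class...
    print(norm_sq(Point(3, 4)))   # 25, now via the fallback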

This is similar to optimistically compiling for efficient data representations. For example, when the code is about to do some math, compile an in-line test for the data type and branch to multiple paths of code based on the type: a branch for in-line integer math, a branch for in-line double math, and a branch for the fully boxed math.
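In Python syntax the shape of that generated code might look like the sketch below. In real compiled output the first two branches would use unboxed machine arithmetic, which a Python sketch obviously cannot show; only the dispatch structure is illustrated, and the names are mine.

    # Sketch of the branching structure only.
    def add(a, b):
        ta, tb = type(a), type(b)
        if ta is int and tb is int:
            return a + b          # in-line integer math branch
        if ta is float and tb is float:
            return a + b          # in-line double math branch
        return boxed_add(a, b)    # fully boxed, fully general fallback

    def boxed_add(a, b):
        # The slow path: full dynamic dispatch through __add__/__radd__, etc.
        return a + b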

Another approach that should not be overlooked is whole/partial program analysis. When moving from development to production, whole program analysis could be employed to optimize the specific application or groups of packages. This approach has been employed for Scheme with a good deal of promise...

Stalin has been tested on a suite of benchmarks whose length ranges up to a thousand lines. Stalin outperforms Scheme->C, Gambit-C, Chez, SML/NJ, and even handwritten Ada, Pascal, Fortran, and C code, on most of these benchmarks.


About Me

Portland, Oregon, United States
I'm usually writing from my favorite location on the planet, the pacific northwest of the u.s. I write for myself only and unless otherwise specified my posts here should not be taken as representing an official position of my employer. Contact me at my gee mail account, username patrickdlogan.