Making it stick.: 9/2/07

Saturday, September 08, 2007

Things from Phil Windley

Things from Phil Windley... Longtails and Software... User Centric Identity.

The portland erlang group is considering some of the ruby quizzes to dive into erlang. If you're not used to programming recursively, you'll probably want to start with simpler problems like implementing the member function, factorial, or count to determine the number of occurrences of some element in a list. When those are comfortable, then the ruby quizzes might be a good challenge.

Here's a sequential, recursive solution in erlang to ruby quiz #121: morse code. It's not *tail* recursive so it could maybe blow out some stack given a really long morse code word. (I assume morse code is translated one word at a time, with spaces between words.)

A tail recursive implementation... hmm... would be a bit more complicated. The two "stacks" are the dots and dashes yet to be decoded and the codes to be considered with each undecoded substring. So one way would be to manage the stacks explicitly in one recursive "loop". Another way would be to use continuation passing style.

Here's (I think) a fairly straight-forward recursive style, morse.erl...


-module(morse).
-export([codes/0, decode/1]).

-import(lists, [prefix/2, reverse/1]).
-import(string, [len/1, substr/2]).

%% http://www.rubyquiz.com/quiz121.html
%%
%% decode/1 when given a string of dots and dashes returns a list of
%% all possible decodings using morse code.
%%
%% Examples:
%%
%% morse:decode(".-").
%% ["ET","A"]
%%
%% lists:member("SOFIA", morse:decode("...---..-....-")).
%% true
%%
%% lists:member("SOPHIA", morse:decode("...---..-....-")).
%% false
%%
%% lists:member("EUGENIA", morse:decode("...---..-....-")).
%% true
%%
%% length(morse:decode("...---..-....-")).
%% 5104

codes() -> 
  [{$A, ".-"},   {$B, "-..."}, {$C, "-.-."}, {$D, "-.."},  {$E, "."}, 
   {$F, "..-."}, {$G, "--."},  {$H, "...."}, {$I, ".."},   {$J, ".---"}, 
   {$K, "-.-"},  {$L, ".-.."}, {$M, "--"},   {$N, "-."},   {$O, "---"}, 
   {$P, ".--."}, {$Q, "--.-"}, {$R, ".-."},  {$S, "..."},  {$T, "-"}, 
   {$U, "..-"},  {$V, "...-"}, {$W, ".--"},  {$X, "-..-"}, {$Y, "-.--"}, 
   {$Z, "--.."}].

decode("") ->
  [];
decode(String) when is_list(String) ->
  decode(String, "", [], codes()).

decode("", "", Results, _Codes) ->
  Results;
decode("", PartialDecoding, Results, _Codes) ->
  [reverse(PartialDecoding) | Results];
decode(_String, _PartialDecoding, Results, []) ->
  Results;
decode(String, PartialDecoding, Results, [{Char, Code} | Rest]) ->
  MoreResults =
    case prefix(Code, String) of
      true ->
        decode(substr(String, 1 + len(Code)), [Char | PartialDecoding], Results, codes());
      false ->
        Results
    end,
  decode(String, PartialDecoding, MoreResults, Rest).

Friday, September 07, 2007

Secure Enterprise-grade Persistent Group Chat

Also from Todd Bishop... Microsoft buys MindAlign...

MindAlign is a secure enterprise-grade persistent group chat solution.

Oh, dear. Is there any reason a large organization would not want to base its future messaging capabilities on XMPP?

But I had similar thoughts about email, and yet there's Exchange. One difference may be, though, with email and Exchange, Microsoft got in before email could become much of a platform for messaging applications. I would imagine over the last 10 years there's been little venture money for email-based platforms.

The same is probably not true for "instant messaging". What do you think? Am I just being ignorant in my recollection and/or prognostications?

Just a Router

Via Todd Bishop on Microsoft's Seattle-area buses running Linux-based wi-fi...

"Obviously this could be a sensitive issue for an operating system company like Microsoft," Polson said of the Linux issue, "but it's just a router, and that happens to be the operating system."

Just a router. Problem is for an operating system company like Microsoft, the operating system is disappearing as quick as it can. No one cares, nor should they.

Before long no one will care about Word, SQL-Server, or SharePoint. Excel may last a bit longer, but most people will realize they don't need 3/4 of what's in that, and there are better ways of getting at the 1/4 they do care about.

Microsoft has shown few signs of being a successful "design shop" in an age of increasingly boutique long tails running on increasingly simple, open, common foundations that have little to do with any specific piece of hardware sitting right in front of you.

I know, I know. They've got a lot of money, past success, patents, and Silverspoon. There's hope for them yet.

And they are a sponsor of the Open Router Platform project.

Parallel Jars of Formaldehyde

Phil Windley observing Parallels in action...

One of the cool features of Parallels is something they call “Smart Select.” With Smart Select you can specify which file types are handled by which application and in which OS. So for example, you can specify that Word docs are always opened in Office 2007 in Windows, regardless of which OS you click on the document. Or that clicking on a URL, regardless of which OS you’re using always opens the page in Safari on the Mac.

Little jars of formaldehyde to preserve the remains of yesterday's mammoth operating systems.

(Re)New Web

"Air will set the World on Fire."

The current browser is a terrible platform for the web. Whatever you think about Adobe's Air, per se, it will ultimately change "the browser" for the betterment of the web generally.

Wednesday, September 05, 2007

Naming and Synchronization in a Decentralized Computer System

This is David Reed's dissertation from 1978. Croquet, developed in Squeak Smalltalk, is based on this mechanism.

It's Simply Different

Sam Ruby gushes...

With many frameworks and languages, I get the feeling that I’m dealing with a metal cabinet covered by layers of marine paint; one where scratches tend to reveal sharp edges. With Erlang, I get the feeling of a Victorian mahogany armoire; one where scratches in the wood simply reveal more rich wood.

Reading erlang can be fun. Here's Sam's atom2json.erl.

When combining this kind of concurrency with an imperative language, there's typically a bit of code explosion implementing loops and iterators. The tendency also is to write long stretches of code that update variables and structures in place, even when concurrency and first class functions exist.

Erlang, being a Lisp-like language (really), instead traverses lists recursively.

Erlang, being influence by Prolog (somewhat), combines recursion with (a simpler form of) pattern matching.

List recursion, pattern matching, and simple concurrency mechanisms interact into a code implosion. Not weird at all to this Lisp programmer.

It's Already On

To Bob Warfield's point on the multicore crisis upon us already: yes, even on the desktop.

Here's an exercise -- think of your favorite or your most frustrating desktop applications. Maybe a browser, a mail reader, presentation or drawing apps. No matter which specific application comes to mind, that application almost certainly consists of long stretches of sequential code. Almost certainly a good bit of that code could be concurrent not in the sense of a parallel algorithm, but in the sense of there being no logical reason for C to follow B to follow A other than the languages used reinforce that style.

Anytime time you see an hour glass, you see an opportunity for concurrency.

I've been trying out Yahoo's beta email application. It's fine, but really cries out to be developed in a truly concurrent system. Moving more applications into an environment worse than modern desktops, let alone far from securely concurrent, is just a shame. Modern browsers are horrible platforms. We need a new browser model that is concurrent and secure. We need a new desktop that is concurrent and connected.

Both the desktop and the browser are poorly suited for upcoming many-core laptops and desktops. Not to mention five or ten years from now when we may well be deluged in so much more cheap iron looking to do something useful for us other than running bloated sequential code.

Programming shared memory threads in C# with transactional memory will not get us any closer to where we need to be. We need to think much differently with languages that support that thinking. It's only too bad we're still in the 1970s.

Tony Hoare developed monitors in the early 1970s then replaced them with concurrent message passing in the late 1970s. Then Java brought us monitors again in the mid-1990s.

Over a decade later we're still twiddling bits in critical sections stuck in a stretch of sequential code. It's ludicrous to think that's the natural human problem solving model. The tools have shaped our thinking. And I am rambling.

My Agile Failed - pause - NOT

OK, here's the deal: "agile" *cannot* fail!

Preposterous?

Here's why "agile" cannot fail: it is a set of tools that can be adapted to your needs. You may do better or worse with them. In that sense you may fail to benefit from them, or you may simply prefer not to use them. Or you may benefit from them. In either case it is you suffering or benefiting, not "agile".

If you think "agile" can fail, there is a different problem to talk about.

But "agile" cannot fail in the same way "hammer" cannot fail. (Thanks, Ed, for the analogy.)

Delaying The Inevitable

Dan Creswell on a few of the ways we developers paint ourselves into a corner...

Every time we assume we can keep all our data in a single memory or database (even if it’s a cluster) we’re embedding assumptions into our software that will be broken come the day we must partition across multiple memories or databases.
Each time we choose an algorithm that doesn’t easily partition or assumes a single memory/database we’re storing up trouble in our data and computational models.
In big monolithic systems it’s possible to create (by force) a never-fails environment which allows developers to ignore various edge cases.

Tuesday, September 04, 2007

PDX.erl

Merlyn Albery-Speyer has started a yahoo group for portlanders (and beyond, I presume) interested in erlang.

pdx.erl

The Race Is On, Or Is That Off?

Or: "I just dropped in to see what condition my condition was in."

Tim Bray ponders more cores, hardware (and software -- cannot forget the software) transactional memory as well as erlang, or some sort of erlang transmorgrigication into java or something less weird.

Ralph Johnson addressed that idea, probably accurately.

These are fascinatingly intertwined topics. Dan Creswell sent a link to me the other day: this Sun research paper on Hybrid Transactional Memory (pdf). I hope he blogs about it. He's got a good handle on the situation.

Unlike apparently many smart people at Sun, Microsoft, Intel, and elsewhere, I'm still unconvinced that transactional memory makes the overall problem any easier or better.

I do know that focusing on transactional memory is a short-term solution at best. Erlang addresses the true problem: a simple programming model that can scale out beyond SMP and yet scale down to a single core just as well.

Tim suggests transactional memory will remain well out of sight for application programmers. But these programmers need better tools, no matter how HTM affects them in the small (and eight, even 16, cores should be considered small over the next decade). The results of system programmers using transactional memory in low level SMP code is a drop in the bucket compared to today's and tomorrow's application development problems. These have little to do with a single piece of silicon and have everything to do with networks and complex concurrent collaboration among disparate subsystems.

Not so many years from now we will be awash in more, cheaper, hardware than our current application development languages and tools can reasonably accommodate. We should have a simple model for addressing all that hardware with less developer effort. We need simple languages and tools for concurrency and *distribution* so that we can waste cheap hardware in exchange for leveraging valuable developer ergs.

Today we are wasting hardware running garbage collectors in order to save developer ergs. Increasingly we need to be wasting hardware running large numbers of small processes.

Transactional memory is not even close to the support we need. I am not sure why so many people find it shiny. Maybe I'll be surprised.

Update: Some urls in comments made easier here:

Bob Warfield: You may have already encountered a multicore crisis and just not known it
Bob Warfield: 70% of the Software You Build is Wasted
Brit Butler: The state of state

Brit's got a conversation going with Tim Sweeney on transactional memory vs. message passing.

Tim's example for TM is coordinating a lot of objects quickly in a shared memory for some game scenarios. Fair enough - I am unable to compare the complexity of transactional memory vs. traditional "critical section" mechanisms for this. Off the top of my head I would agree that a shared-nothing message passing mechanism does not really address this problem, but I would imagine it still useful for other aspects of that kind of a game system. My bigger point is this: there are relatively few people with that kind of a problem. Most of us have the kinds of problems that are far better addressed by shared nothing message passing.

So what frightens me as much as the transactional memory hardware and/or software itself is the *attention* it is receiving as any sort of general solution to developing software. Is the cost worth the benefit?

Monday, September 03, 2007

Phil Windley on the Microwulf

Phil Windley on driving down the cost of Beowulf clusters...

The world's cheapest supercomputer, built by a Calvin College CS professor Joel Adams and student Tim Brom is very interesting. They built an 8 core Beowulf cluster using four motherboards and a gigabit network switch for less than $2500. The resulting machine has a price/performance ratio of $100/Gigaflop. That's just plain fun.
I think there ought to be a yearly competition of this sort for students. Who can build the fastest supercomputer for $2500?

Rob Pike on Concurrency and Message passing in Newsqueak

Rob Pike and Luca Cardelli created a simple, concurrent language called Squeak (not to be confused with Squeak Smalltalk) for simplifying programming user interfaces. Pike then turned that into Newsqueak, a more complete programming language. (See "Squinting at Power Series", (pdf)).

Newsqueak has processes like Erlang. But instead of an implicit inbox and outbox per process, Newsqueak has channels as first-class types. All channels are half-duplex, synchronous communicating, well, "channels", as in C.A.R. Hoare's Communicating Sequential Processes. Newsqueak also has only immutable data values, although it does have true variables rather than single-assignment. (Not sure yet how that interacts with concurrent processes running closures over variables lexically known to each process.)

And so Newsqueak can be used to do Erlang-like programming and vice versa, using processes/channels/functions or processes/pids/functions to implement something like the other's mechanisms.

Although Newsqueak does not have the distribution mechanisms or the failure mechanisms of Erlang. And rather than using pattern matching over all the messages in a process inbox, in Newsqueak a process might use something like one channel per pattern or encode all the possible patterns into a structured type passed over a single channel.

Russ Cox has an overview of CSP in the context of Bell Labs where Newsqueak was developed and then led to other interesting things.

Pike is now at Google and not too long ago recorded an interesting and well-done Tech Talk on Newsqueak and concurrent, message passing programming generally...

Define components as interfaces with all data flow and sharing done as communication over channels.

The interface is a type; implementations of that interface just honor the protocol.

Composition is linear in complexity of design but superlinear in expressibility. (The opposite of composition of state machines.) Interleaving is free. Compose interfaces, not state machines.

Parallelism is not the point, but falls out for free. Networking and remote execution are not the point, but also can fall out (although not quite for free).

...Concurrent processes with message passing can be a powerful model for programming, whether the problem is intrinsically parallel (server design) or not (power series)...
The expressiveness - notation - is important.

Update: Ehud linked here from Lambda the Ultimate, and that thread has several comments of its own.

Making it stick.

Search This Blog