"I have a mind like a steel... uh... thingy." Patrick Logan's weblog.

Saturday, September 29, 2007

Happy 80th

John McCarthy turned 80 recently. Happy birthday!

In 1961, he was the first to publicly suggest (in a speech given to celebrate MIT's centennial) that computer time-sharing technology might lead to a future in which computing power and even specific applications could be sold through the utility business model (like water or electricity).

Enterprise URIs

I saw the page in our wiki where fuzzy was working out some URIs. That *is* cool...

I'd rather bet on the URI than any ESB vendor...

With a URI naming scheme that doesn't change, you get a very different, very simple view of your systems: think of a stable http://example.com/customers/1234 (a hypothetical name, of course) that outlives whatever system happens to serve it. Sure, there may be madness today behind those URIs, but over time that madness will hopefully start to go away. Your URIs, though, will stay the same.

Rather than selling proprietary ESB middleware, "integration" vendors should be developing and selling "information architecture" expertise, along with experience in planning and developing simple tools (ones that don't hurt the web) around the awful messes every organization has and would like to move away from over time.

Update: Ross Mason comments...

I agree that URI design is very important, but I'm not sure I understand what that has to do with ESB vendors. In Mule we use URIs to express service endpoints, but these aren't limited to HTTP; they also cover JMS queues, data repositories, even IM and EJBs. However, in all these cases we, the vendor, don't control how the URIs are defined beyond ensuring that they are valid. Can you elaborate a little on your comments about the ESB vendor involvement?
Good question. Over the last few years, in a couple of different IT shops, I have been directly involved in "SOA" efforts. At the same time I have been a fairly close observer of other similar efforts. In each of these efforts a lot of time has been devoted to products in the ESB category.

My primary concern over the last 25 years of software development has been the inability of software developers to make a change they would otherwise choose to make, simply because making that change is inordinately difficult. The difficulty in almost all circumstances was due to implementation details of one component being expressed as dependencies by another component. In order to change A, you also have to change B, and usually it's worse than that.

As a result, the *apparently* most cost-effective choice *in the moment* is usually "don't change anything". And so these problems compound over time until the entire data center is in a development deadlock. Nothing significant can be improved because everything depends on everything else. I've been involved in a couple of situations where the big changes *were* funded, and they are nightmares.

To avoid, or to climb out of, such massive technical debt, an enterprise must invest in separation of concerns: abstractions that hide details as much as possible, "wedges", if you will, between the components that should be able to change independently. The WS-* proponents envisioned WSDL-based interfaces as such wedges. The message bus proponents (and I was one of those for a long time) envisioned "messages" as such wedges. The distributed object proponents (and I was one of those for a long time too) envisioned "objects" as such wedges.

I was never a WSDL/WS-* fan, but I still believe reasonable solutions could be implemented with messages and/or objects in many data center situations. But taking a step further back, I can now see how identifying the significant "resources" of an organization, and how to name (and "locate") them, can be even more abstract than messages or objects. Atom, for example, can be used to identify, locate, manipulate, and announce changes to resources over time. Nothing at that level must be said about the lower-level techniques (e.g. messages and/or objects) used to do so.
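For a flavor of what that looks like, here is a minimal Atom entry (all names and URIs are hypothetical): the id identifies the resource, the link locates it, and updated supports announcing changes over time.

<entry xmlns="http://www.w3.org/2005/Atom">
  <id>urn:example:orders:1234</id>
  <title>Order 1234</title>
  <updated>2007-09-29T12:00:00Z</updated>
  <link rel="edit" href="http://example.com/orders/1234"/>
</entry>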

I would like to say as little as possible at the enterprise level about how I implemented the enterprise. An ESB product may or may not fit in somewhere behind the scenes (that's another discussion). I would like products and technologies to come and go as desired without unduly affecting the rest of the enterprise.

An ESB, or any kind of "bus" or object system, is not an architecture per se. It may help to realize an architecture. Right now it looks like "the web" is a pretty good architecture for the enterprise.

Over time, third parties such as ESB vendors could have more real success with enterprises by helping them implement web-like architectures rather than helping them install and configure ESBs. Mule may play a role here or there behind the scenes, but enterprises need to learn how to build more lasting structures ("information architectures" may be a decent name for them) and focus less on the plumbing that they've been sold by the ESB vendors.

Thursday, September 27, 2007

When it rains, it pours i/o all over Erlang

klacke adds to the Erlang i/o discussion, as he did with the regexp discussion, with a faster library...

Originally at Bluetail, we had some serious problems with high performance file I/O, especially line-oriented I/O.

I then wrote a portable (yes, win32 too) linked-in driver for fast FILE I/O. It's based on an old and hacked version of the BSD FILE* interface. It's called bfile and we've been using it in pretty much all projects during the past 8 years. I've prepared a tarball of it at

http://yaws.hyber.org/download/bfile-1.0.tgz

2> bfile:load_driver().
ok
4> {ok, Fd} = bfile:fopen("Makefile", "r").
{ok,{bfile,#Port<0.98>}}
5> bfile:fgets(Fd).
{line,<<10>>}
6> bfile:fgets(Fd).
{line,<<10>>}
7> bfile:fgets(Fd).
{line,<<97,108,108,58,32,10>>}
14> bfile:fread(Fd, 10000).
{ok,<<10,10,105,110,115,116,97,108,108,58,32,97,108,108,10,9,40,99,100,32,99,95,115,114,99,59,32,...>>}
15> bfile:fread(Fd, 10000).
eof
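
Based on the session above, a minimal line-counting loop might look like this sketch of mine. fgets/1 returning {line, Bin} or eof is shown in the session; the bfile:fclose/1 call is my assumption, not something from klacke's post.

count_lines(Path) ->
    ok = bfile:load_driver(),
    {ok, Fd} = bfile:fopen(Path, "r"),
    N = count_lines(Fd, 0),
    bfile:fclose(Fd),    %% fclose/1 assumed; not shown in the session
    N.

%% Tail-recursive: one fgets per line until eof.
count_lines(Fd, N) ->
    case bfile:fgets(Fd) of
        {line, _Bin} -> count_lines(Fd, N + 1);
        eof -> N
    end.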

More on Erlang's i/o Rationale

From Ulf Wiger of Ericsson on the erlang-questions email list, regarding the performance of some of the i/o functions...

One reason is that file IO in Erlang has traditionally been tuned in order to be as unobtrusive as possible, in massively concurrent systems. For example, Mnesia's log dumps usually run in the background at low priority in such systems, and the more important IO is the signaling to/from the network. In these systems, writes to disk are uncommon, and reading large volumes of data from disk only occurs at restarts (which are - hopefully - exceedingly uncommon).

While we've noticed for a long time that Erlang's IO generally sucks in benchmarks that test raw sequential speed on one large file or one socket, it hasn't been clear that this adversely affects the key products using Erlang.

I'm sure that we can find ways to speed up such IO without adversely affecting the characteristics of massively concurrent IO. As Erlang is spreading more into other application areas, this is bound to be a major issue.

Solid State

(via Steve Dekorte)

Fusion io's flash storage card. Neat.

...the cards will start at 80 GB and will scale to 320 and 640 GB next year. By the end of 2008, Fusion io also hopes to roll out a 1.2 TB card...

...the card has 160 parallel pipelines that can read data at 800 megabytes per second and write at 600 MB/sec. He even proved it by running a Linux drive I/O benchmark. But for large corporations running busy databases, operations per second is a much more important number than bandwidth.

Flynn set the benchmark for the worst case scenario by using small 4K blocks and then streaming eight simultaneous 1 GB reads and writes. In that test, the ioDrive clocked in at 100,000 operations per second. “That would have just thrashed a regular hard drive,” said Flynn.
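(Worth noting: 100,000 operations per second at 4K each works out to roughly 400 MB/sec of small-block throughput. That's my arithmetic, not Flynn's, but it squares with the sequential numbers above.)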

Five years from now will be fun, running not-so-little data centers in a pizza box.

How are you going to justify running your operations on a mainframe in 2012?

Moore's Law is changing the hardware landscape orders of magnitude more quickly than the software community can track. We have not even grasped the difference between today and tomorrow because we're still way back before yesterday in the way we think about software.

How long before someone gets rid of that artificial disk driver sitting between the processor/caches/memory and the "disk"?

Tuesday, September 25, 2007

Apparently Fast Erlang File Read and Regexp

Claes Wikstrom sent the erlang-questions list a link to his faster regexp library. No report yet on speed...

...the only fast way today to process a large file line-by-line is to
  1. file:open(Filename, [read, raw])
  2. In a loop {ok, Bin} = file:read(Fd, BufSize)
  3. Use a binary regex matcher such as...
http://yaws.hyber.org/download/posregex-1.0.tgz

(I don't know the state of the regex lib in OTP today, last time I looked it sucked bigtime though)

/klacke

Here's an example of its use (a sketch combining klacke's three steps follows the session below). Note that it uses Erlang's binary representation of strings (a sequential hunk of memory) instead of its list-of-characters representation...
Erl Interface to posix regular expressions by klacke@emailaddress.xyz
LICENSE: BSD style, free, use,  molest and rewrite

To build, make and sudo make install

To use:

1. Compile your regexp.

4>  {ok, RE} = posregex:compile(<<"abc.*foo">>, [extended]).
{ok,#Port<0.101>}

Try to match something 

7> posregex:match(RE, <<"abc mre text here foo">>, []).
ok

If it doesn't match 

9> posregex:match(RE, <<"abdc mre text here foo">>, []).
{error,nomatch}

Try to match and find out where the match occurred

10> posregex:exec(RE, <<"abc mre text here foo">>, []).  
{ok,[{0,21}]}

Free memory occupied by the compilation (or exit process since
RE is an erlang port)

11> posregex:free(RE).
ok 
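
Putting klacke's three steps together, a chunked scan over a large file might look something like this sketch of mine. The buffer size is arbitrary, and matches that straddle a chunk boundary are missed; a real version would carry a tail over between reads.

grep_file(Filename, RE) ->
    {ok, Fd} = file:open(Filename, [read, raw, binary]),
    loop(Fd, RE, 0).

%% Count the chunks containing at least one match.
loop(Fd, RE, Count) ->
    case file:read(Fd, 65536) of
        {ok, Bin} ->
            case posregex:match(RE, Bin, []) of
                ok               -> loop(Fd, RE, Count + 1);
                {error, nomatch} -> loop(Fd, RE, Count)
            end;
        eof ->
            file:close(Fd),
            Count
    end.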

Monday, September 24, 2007

Beer Riot

Oh and BeerRiot.com is also done in Erlang and ErlyWeb. Gulp.

I gotta try that Sam Adams Imperial Pilsner. (Sam Adams? Imperial? The irony.)

Vimagi

Vimagi was built with Erlang and the ErlyWeb framework. (And Flash apparently.)

Check out an example.

Pier Port for Cincom Smalltalk

Via James Robertson, Pier has been ported to Cincom Smalltalk. What does that mean?

Pier may be the best open source application built on the open source Seaside framework. (The best commercial application being DabbleDB of course.)

Pier may be the best extension of Ward Cunningham's "wiki" concept. And it is built on Magritte, which may be the best self-describing meta application system on... well, on earth.

And Cincom Smalltalk may be the best OO dynamic language system, probably the best such commercial system, and has all the openness that Smalltalk systems have had going back to the early 1980s. In fact CST's lineage goes all the way back. (Why would you use Ruby when you could use Smalltalk???)

Cincom has or will soon have support for Seaside. Try out Pier. It's cool.

Regular Expression Matching Can Be Simple And Fast

Speaking of regexps: here's an interesting analysis of approaches to regexp design. Rob Pike shows up again in this. Those Unix guys had something going back then.

This is a tale of two approaches to regular expression matching. One of them is in widespread use in the standard interpreters for many languages, including Perl. The other is used only in a few places, notably most implementations of awk and grep. The two approaches have wildly different performance characteristics...

Notice that Perl requires over sixty seconds to match a 29-character string. The other approach, labeled Thompson NFA for reasons that will be explained later, requires twenty microseconds to match the string. That's not a typo. The Perl graph plots time in seconds, while the Thompson NFA graph plots time in microseconds: the Thompson NFA implementation is a million times faster than Perl when running on a miniscule 29-character string. The trends shown in the graph continue: the Thompson NFA handles a 100-character string in under 200 microseconds, while Perl would require over 10^15 years. (Perl is only the most conspicuous example of a large number of popular programs that use the same algorithm; the above graph could have been Python, or PHP, or Ruby, or many other languages. A more detailed graph later in this article presents data for other implementations.)

It may be hard to believe the graphs: perhaps you've used Perl, and it never seemed like regular expression matching was particularly slow. Most of the time, in fact, regular expression matching in Perl is fast enough...

Today, regular expressions have also become a shining example of how ignoring good theory leads to bad programs. The regular expression implementations used by today's popular tools are significantly slower than the ones used in many of those thirty-year-old Unix tools.

This article reviews the good theory: regular expressions, finite automata, and a regular expression search algorithm invented by Ken Thompson in the mid-1960s. It also puts the theory into practice, describing a simple implementation of Thompson's algorithm. That implementation, less than 400 lines of C, is the one that went head to head with Perl above. It outperforms the more complex real-world implementations used by Perl, Python, PCRE, and others. The article concludes with a discussion of how theory might yet be converted into practice in the real-world implementations...

While writing the text editor sam in the early 1980s, Rob Pike wrote a new regular expression implementation, which Dave Presotto extracted into a library that appeared in the Eighth Edition. Pike's implementation incorporated submatch tracking into an efficient NFA simulation but, like the rest of the Eighth Edition source, was not widely distributed. Pike himself did not realize that his technique was anything new. Henry Spencer reimplemented the Eighth Edition library interface from scratch, but using backtracking, and released his implementation into the public domain. It became very widely used, eventually serving as the basis for the slow regular expression implementations mentioned earlier: Perl, PCRE, Python, and so on. (In his defense, Spencer knew the routines could be slow, and he didn't know that a more efficient algorithm existed. He even warned in the documentation, “Many users have found the speed perfectly adequate, although replacing the insides of egrep with this code would be a mistake.”) Pike's regular expression implementation, extended to support Unicode, was made freely available with sam in late 1992, but the particularly efficient regular expression search algorithm went unnoticed. The code is now available in many forms: as part of sam, as Plan 9's regular expression library, or packaged separately for Unix. Ville Laurikari independently discovered Pike's algorithm in 1999, developing a theoretical foundation as well.

Finally, any discussion of regular expressions would be incomplete without mentioning Jeffrey Friedl's book Mastering Regular Expressions, perhaps the most popular reference among today's programmers. Friedl's book teaches programmers how best to use today's regular expression implementations, but not how best to implement them. What little text it devotes to implementation issues perpetuates the widespread belief that recursive backtracking is the only way to simulate an NFA. Friedl makes it clear that he neither understands nor respects the underlying theory.

Regular expression matching can be simple and fast, using finite automata-based techniques that have been known for decades. In contrast, Perl, PCRE, Python, Ruby, Java, and many other languages have regular expression implementations based on recursive backtracking that are simple but can be excruciatingly slow. With the exception of backreferences, the features provided by the slow backtracking implementations can be provided by the automata-based implementations at dramatically faster, more consistent speeds.

Companion articles, not yet written, will cover NFA-based submatch extraction and fast DFA implementations in more detail.

Just goes to show that various benchmarks are relative, and there's likely a good bit of low-hanging fruit in Erlang's implementation.
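
If you want to see the blowup firsthand, here's a quick experiment of mine (not from the article): time the pathological pattern a?a?...a? followed by N literal a's against a string of N a's. With a backtracking engine the time explodes as N grows; a Thompson-style NFA stays roughly linear.

%% Paste into the Erlang shell; uses the old OTP regexp module.
%% Try N = 10, 15, 20, ... and watch the time.
N = 20,
Pattern = lists:flatten([lists:duplicate(N, "a?"), lists:duplicate(N, "a")]),
Subject = lists:duplicate(N, $a),
{Micros, _Result} = timer:tc(regexp, match, [Subject, Pattern]),
io:format("N=~p: ~p microseconds~n", [N, Micros]).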

Steve Vinoski on Tim Bray and Erlang

Not only is Steve Vinoski blogging again, he's blogging about Erlang. Not only is he blogging about Erlang, he's written some code like Tim Bray's but parallelized it on two-core and eight-core machines, with ease, as a relative newbie to Erlang...

Reading between the lines, it seems that Tim was hoping to take advantage of Erlang’s concurrency to put his multicore machines to work analyzing his logs...

I decided to take a crack at it myself...

The way this solution works is that it uses multiple Erlang processes to convert chunks of the input file to lists of strings and process them for matches...

The best I got on my MacBook Pro after numerous runs was 0.301 seconds with 2400 processes, but the average best seems to be about 0.318 seconds. The performance of this approach comes pretty close to other solutions that rely on external non-Erlang assistance, at least for Tim’s sample dataset on this machine.

I also tried it on an 8-core (2 Intel Xeon E5345 CPUs) 64-bit Dell box running Linux, and it clocked in at 0.126 seconds with 2400 processes; I saw 0.124 seconds with 1200 processes. I believe this utilization of multiple cores was exactly what Tim was looking for.

If you’re a Java or C++ programmer, note the ease with which we can spawn Erlang processes and have them communicate, and note how quickly we can launch thousands of processes. This is what Tim was after, I believe, so hopefully my example provides food for thought in that area. BTW, I’m no Erlang expert, so if anyone wants to suggest improvements to what I’ve written, please feel free to comment here.

Very cool. There are still benefits to be gained from improving Erlang's I/O and regexp libraries for the sequential aspects of Tim's work. But this shows the real value of Erlang (and of Erlang-like capabilities, if they show up in other language systems) for the increasingly multi-core, multi-node world.
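
The shape of such a solution is simple in Erlang. Here's a sketch of mine (not Steve's actual code) that fans pre-split chunks of lines out to one process per chunk and sums the counts they send back; string:str/2 stands in for whatever real matching the job needs.

-module(parscan).
-export([count/2]).

%% Chunks is a list of lists of line strings.
count(Chunks, Pattern) ->
    Parent = self(),
    Pids = [spawn(fun() -> Parent ! {self(), scan(C, Pattern)} end)
            || C <- Chunks],
    lists:sum([receive {Pid, N} -> N end || Pid <- Pids]).

%% How many lines in this chunk contain the pattern?
scan(Lines, Pattern) ->
    length([L || L <- Lines, string:str(L, Pattern) > 0]).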

Sunday, September 23, 2007

Intel C/C++ STM Compiler

James Reinders, who lives just down the road a bit, announces a prototype C/C++ compiler with Software Transactional Memory...

We have a lot to learn before we can decide whether STM offers some relief from locks (they are NOT going away) and offers help for programming, or for tools which compose programs automatically. We think that the existence of a C/C++ compiler supporting Software Transactional Memory (STM) would be a great help. So... Today, we released a prototype version of the Intel C/C++ Compiler with support for STM. It is available from Whatif.intel.com. The Intel STM Compiler supports Linux and Windows producing 32 bit code for x86 (Intel and AMD) processors. We hope that the availability of such a prototype compiler allows unprecedented exploration by C / C++ software developers of a promising technique to make programming for multi-core easier.
That's a healthy attitude. Have fun with it.

If you *are* interested in STM (well, I'm not), then you might consider how a system like Gambit Scheme, which compiles to C, could use this new C compiler. (You'd also have to consider how the Gambit Scheme interpreter does the same.)

Postmodern I/O

Update: As it turns out, get_line is intended primarily for interactive, character-by-character tty applications. On the erlang-questions list, I think, someone on the implementation team announced they're updating the performance guides, and now, I assume, the i/o and regexp implementations. End.

Tim Bray's note to Erlang...

I like you. Really, I do. But until you can read lines of text out of a file and do basic pattern-matching against them acceptably fast (which most people would say is faster than Ruby), you’re stuck in a niche; you’re a thought experiment and a consciousness-raiser and an engineering showpiece, but you’re not a general-purpose tool. Sorry.
I've never had to do a lot of really fast I/O in Erlang, so Tim's exercise (and Steve Loughran's a while back) have been useful for me.

Fortunately for Erlang, making improvements to the I/O and perhaps the regexp libraries should be a fair bit easier than making concurrency and distributed system improvements in other languages.

If I had to do a lot of really fast I/O and Erlang did not pan out for that, I would probably turn to Gambit Scheme. In fact Gambit can do really nice Erlang-like concurrency as well as really fast I/O. It just doesn't have all the OTP libraries that Erlang has.

Maybe Gambit will get there some day. Maybe someone will port Ruby to Gambit, so Ruby can run really fast and have really fast I/O too. And have really nice concurrency. All without building a new virtual machine from scratch or building on top of the JVM.

(If you're interested in that, let me know. I just don't care enough for the cruft in Ruby to write an implementation of it for my own use. There're a helluva lot of good things that fall out of this approach, like secure multiple application spaces per OS process, multiple languages with message-passing integration per OS process, "engines" that can be metered, pet-named secure access to resources, etc. OK -- that's its own blog post.)

If I had to do a lot of really fast I/O in the context of a reliable, scalable distributed system, I would probably do the really fast I/O in Gambit Scheme connected to an Erlang external port. Or this could be Python or Ruby if you wanted something from those systems.

For example, instead of implementing a rule engine in Erlang, I'd integrate from Erlang to PyClips (a really nice integration of Python and the Clips rules engine) like this. That seems like the way to develop postmodern systems: use good tools for appropriate situations, especially if they are built to be integrated easily. Programming today of any size probably leads to multiple languages, by its very nature.
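Erlang's external ports make that kind of integration straightforward. Here's a minimal sketch; the "rules_engine" executable and the term protocol are hypothetical, and the program on the other end would need to speak 4-byte length-prefixed packets and decode Erlang's external term format (via erl_interface or similar).

%% Send one fact to a hypothetical external rules engine and wait
%% for its reply, with a timeout.
Fact = {temperature, 98},
Port = open_port({spawn, "rules_engine"}, [binary, {packet, 4}]),
Port ! {self(), {command, term_to_binary({assert, Fact})}},
receive
    {Port, {data, Reply}} -> binary_to_term(Reply)
after 5000 ->
    timeout
end.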

Meanwhile Tim's I/O results have been taken to the erlang-questions list. With those numbers there seems to be some low-hanging fruit that may or may not require some underlying systems coding. Already solutions are pouring in, just using a different approach from the apparently slow get_line.
