"I have a mind like a steel... uh... thingy." Patrick Logan's weblog.

Wednesday, May 26, 2004

What a concept!

programs will be stored as XML documents, so that programmers can represent and process data and meta-data uniformly

7 comments:

Anonymous said...

The XML thing is really a red herring; any serialisation format can be used: S-expressions, pickled live objects, even a VBA-style script that generates the program structure directly.

The _real_ Big Idea is that programs will be created by directly manipulating the Abstract Syntax Tree, rather than by writing dumb plaintext source code that's transformed into executable bytecode/machinecode by some mystical process largely impenetrable to, and uncontrollable by, the common developer. Comparable to the popular shift from CLI to GUI, except without losing any advantages of the former.
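As a rough illustration of what "directly manipulating the Abstract Syntax Tree" can mean, here is a minimal sketch using Python's standard `ast` module. The helper names `build_sum` and `run` are made up for this example, and nothing here is taken from the commenter's own tool; it only shows that a program can be built and edited as a tree with no source text involved.

```python
import ast

def build_sum(left, right):
    """Build the tree for `left + right` directly, with no source text."""
    expr = ast.BinOp(
        left=ast.Constant(value=left),
        op=ast.Add(),
        right=ast.Constant(value=right),
    )
    return ast.Expression(body=expr)

def run(tree):
    """Compile and evaluate an expression tree."""
    # Nodes built by hand carry no line/column info; fill it in first.
    ast.fix_missing_locations(tree)
    return eval(compile(tree, filename="<ast>", mode="eval"))

tree = build_sum(2, 2)
result_before = run(tree)   # 4

# "Edit the program" by replacing a node in the tree, not by retyping text.
tree.body.right = ast.Constant(value=5)
result_after = run(tree)    # 7
```

A real AST editor would wrap edits like the last one in a user interface; the point of the sketch is only that the tree, not the character stream, is the thing being changed.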

Scads of benefits to this approach; for example: scriptable, extensible ASTs will give you all the benefits of Lisp macros, except in realtime and without actually having to use Lisp; "smart" programming tools will eliminate syntax/compiler errors and provide all kinds of ways to dynamically edit, view and analyze code; code refactoring will become highly automated and seamlessly integrated with the writing process; easier access to the weird-n-wonderful world of programming for the great unwashed masses; and no more vi-vs-emacs flamewars, ever.:)

Not a new idea, but hopefully one that's finally coming of age (see James Gosling's Jackpot project, for example).

Patrick Logan said...

"without actually having to use Lisp"

Not sure this is a benefit.

In general, any tree has to be displayed somehow. Text has its advantages. I'm not searching for another format, but if a better one comes along, I'm open to it.

I won't be holding my breath in the meantime.

Anonymous said...

"In general, any tree has to be displayed somehow. Text has its advantages."

For input and representation, yes. Text is undoubtedly much more time- and space-efficient than point-n-click-driven icon displays, so I would expect AST editors to provide professional programmers with a keyboard-driven, text-based interface that looks, at least on the surface, similar to the tools they currently use.

The big difference is that these tools _won't_ be limited to crudely manipulating character streams. They'll have complete understanding of the language syntax, semantics and structural rules, and will be able to leverage that considerable knowledge to help the developer write programs much more quickly and reliably than they currently can.

A nice touch will be the ability to swap a text-oriented programming UI on-the-fly for [e.g.] a child-friendly graphical one, or one tailored specifically to disabled users, just by plugging a different View layer into the editor. Another cool trick will be the ability to embed live GUI widgets - buttons, checkboxes, input/display text fields, etc. - directly within a program's 'source'.

More practical benefits will be stuff like 'live' variable name/scope checking and editing. For example, the editor could put a squiggly underline under every undeclared variable in a program, much like Word puts one under every misspelt one, so no more waiting for the compiler/runtime to spew 'variable x not found' errors whenever you make a typo. Selecting a variable at any location within the program will cause the editor to highlight _every_other_ reference to that variable. An extra keystroke or right-click will let you globally or selectively rename that variable safely and automatically. A type inferencer component could be plugged into the editor to help write code and look for variable typing errors. Careless semantic errors such as typing '=' in situations where you meant '==' will often be flagged for attention immediately, as the editor will have good knowledge of which keywords, operators, etc. are legal in a given context and which aren't.
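Both the "squiggly underline" check and the safe rename described above fall out fairly naturally once the tool works on a tree rather than on characters. The following is a minimal sketch using Python's standard `ast` module; the function names `undeclared_names` and `rename_variable` are hypothetical, scoping is deliberately ignored (a name bound anywhere counts as declared everywhere), and `ast.unparse` needs Python 3.9 or later.

```python
import ast
import builtins

def undeclared_names(source):
    """Return sorted (name, line) pairs for names read but never bound.

    Simplification: scopes are ignored, so a binding anywhere in the
    module counts as a declaration for the whole module.
    """
    tree = ast.parse(source)
    bound = set(dir(builtins))
    used = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                bound.add(node.id)
            else:
                used.append((node.id, node.lineno))
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.alias):
            bound.add((node.asname or node.name).split(".")[0])
    return sorted({(name, line) for name, line in used if name not in bound})

class RenameVariable(ast.NodeTransformer):
    """Rename every reference to one variable, leaving strings untouched."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Name(self, node):
        if node.id == self.old:
            node.id = self.new
        return node

def rename_variable(source, old, new):
    tree = RenameVariable(old, new).visit(ast.parse(source))
    return ast.unparse(tree)

typos = undeclared_names("total = 1\nprint(totl + total)\n")
renamed = rename_variable("x = 'x'", "x", "z")
```

Here `typos` flags `totl` on line 2, and `renamed` changes the variable `x` but not the string literal `'x'`, which is the kind of distinction a plain text search-and-replace cannot make.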


BTW, another interesting language-cum-environment to look at is Berkeley's experimental "next-generation" Logo-like Boxer language (http://www.soe.berkeley.edu/boxer.html/). Doesn't seem to be in active development any more, but it shares some ideas with AST-based programming, particularly in its use of live, interactive, graphical boxes to denote block structures, which is basically just a less granular version of the AST-based editor UI. I do recommend giving it a look if you're interested in all this stuff: there's some interesting reading material and a downloadable classic-compatible Mac executable available.

Patrick Logan said...

I'm not sure simply presenting a syntax tree differently will make programming easier for the newer developer, but if it does I am all for it.

In any case the following sounds kind of like the Smalltalk Refactoring Browser, which has been around for some years now...

"They'll have complete understanding of the language syntax, semantics and structural rules, and will be able to leverage that considerable knowledge to help the developer write programs much more quickly and reliably than they currently can."

Anonymous said...

Note from the author: I don't actually say that XML is the right/best/most appropriate format; I say that its use is (now) inevitable. Most other document-based systems are migrating to XML for storage; tools are already ubiquitous (who _doesn't_ have SAX, DOM, or XSLT these days?); and programmers are meeting it earlier and earlier in their careers. Sure, it could all have been done with name-of-alternative-goes-here; my point is, it wasn't, and now it's too late for anything else to gain sufficient traction.

Anonymous said...

"Note from the author: I don't actually say that XML is the right/best/most appropriate format; I say that its use is (now) inevitable."

I'd agree only inasmuch as living, breathing and sweating XML is these days an essential requirement for full buzzword compliancy; about the only way to get the great unwashed masses to pay any attention to anything new and unfamiliar. (As long as you can "talk the talk", most couldn't tell you from the Ministry of Silly Walks. Nice serialisation format; damn shame about all the High Priests and Black Magic Conjurers it's sucked in like a magnet. Humbug.)

Unfortunately, I think any mention of XML, even as a trivial adjunct as it is in this case, distracts the average eye from the main argument, resulting in the whole lot being swiftly and ignorantly pigeonholed as Yet Another Sparkly Fun XML Technology by the more casual reader. Which is a huge disservice to do to a technology that will do for programmers what the CLI-to-GUI revolution did for end users.

Incidentally, you might be interested in my own solution: serialising AST-based programs as Python scripts. Thus a simple program such as 'display 2 + 2' is serialised as the following Python script:

Command('display', Command('add', Literal(TypeNumber(2)), Literal(TypeNumber(2))))

Executing that script within the context of the AST editor will rebuild the original AST. Not the most aesthetic-looking program ever; but then it has no need to be, and will do its job just fine with the absolute minimum effort required.
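To make the round trip concrete, here is a minimal sketch of how such a script could rebuild a tree when executed. The class names `Command`, `Literal` and `TypeNumber` come from the script above, but their bodies, the `HANDLERS` table and the `load` helper are assumptions made for illustration, not the commenter's actual implementation.

```python
class TypeNumber:
    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return f"TypeNumber({self.value!r})"

class Literal:
    def __init__(self, value):
        self.value = value

    def evaluate(self):
        return self.value.value

    def __repr__(self):
        return f"Literal({self.value!r})"

def _display(value):
    print(value)
    return value

# Hypothetical command table; a real editor would have a far richer one.
HANDLERS = {"add": lambda a, b: a + b, "display": _display}

class Command:
    def __init__(self, name, *args):
        self.name = name
        self.args = args

    def evaluate(self):
        values = [arg.evaluate() for arg in self.args]
        return HANDLERS[self.name](*values)

    def __repr__(self):
        parts = [repr(self.name)] + [repr(arg) for arg in self.args]
        return f"Command({', '.join(parts)})"

SCRIPT = ("Command('display', Command('add', "
          "Literal(TypeNumber(2)), Literal(TypeNumber(2))))")

def load(script):
    """Rebuild the tree by executing its serialised form.

    eval() is only acceptable here because the script is trusted.
    """
    names = {"Command": Command, "Literal": Literal, "TypeNumber": TypeNumber}
    return eval(script, {"__builtins__": {}, **names})

tree = load(SCRIPT)
```

With these definitions `repr(tree)` gives back the original script text, so saving and loading are just `repr` and `eval`, and `tree.evaluate()` displays and returns 4.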

BTW, I was going to use XML originally, then changed my mind to using a full-blown language syntax, then switched to S-expressions, and finally decided that all this stuff was a complete load of nonsense and total waste of time since my AST is supposed to be completely open and scriptable anyway. (I began my programming life as a casual Mac application scripter, pushing icons around my desktop with recorded scripts. Plus ça change, eh?:)

The only technical value I can see in using XML as the native serialization format is if the AST editor wants to lock its users out of manipulating the program object model directly. Which would fall somewhere between utterly clueless and completely self-defeating; the AST already contains all the domain knowledge needed to expertly manipulate the program structure. If users need to resort to manipulating the serialized form directly then the AST editor is not doing its job. (Though it would give the XML weenies a shiny new toy to despoil for a week, until they get bored and move onto screwing up something else...:p)

Do keep pumping the core concept though. It's a true Killer Technology in-waiting. You just have to break a few vi/emacs lovers' fingers to get there. ;)

Anonymous said...

"I'm not sure simply presenting a syntax tree differently will make programming easier for the newer developer, but if it does I am all for it."

Here's a really simple example: doing string substitutions. In Python, one writes stuff like:

'Hello, %s. The time is %s.' % (name, strftime('%H:%M:%S', t))

In an AST editor (with apologies for the lousy text-only mockup; it doesn't do it justice) the equivalent would be represented on-screen as something like:

[Hello, [name]. The time is [t as [h:mm:ss]].]

Superficially you may think "oh-ho, that's just like Perl string substitution", but it's not. First, try to visualise the [...] as being drawn on-screen as a series of subtle, gently coloured boxes nested within one another, with additional styling applied to the text to indicate what each 'word' means. Now, each box represents an object within the editor application's object model; so in the above you're creating first a 'literal string' object containing a series of characters, and into that you're embedding a variable object and a command object; or any other arbitrary object that can evaluate to text.

The full AST representation would be rather more complex, of course, but the editor's graphical View would fine-tune the presentation as above to make it more palatable to the human eye (nothing as crude as Lisp parens here;).
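The nested boxes can be sketched as plain objects that each know how to evaluate to text. This is only an illustration under assumed names (`Text`, `Variable`, `FormatTime`, `render` and the sample values are all hypothetical); the editor's real object model is not described in enough detail here to reproduce.

```python
from time import strftime, struct_time

class Variable:
    """A box holding a reference to a named variable."""

    def __init__(self, name):
        self.name = name

    def render(self, env):
        return str(env[self.name])

class FormatTime:
    """A command box: format the time held in a variable."""

    def __init__(self, variable, pattern):
        self.variable = variable
        self.pattern = pattern

    def render(self, env):
        return strftime(self.pattern, env[self.variable.name])

class Text:
    """A literal-string box whose parts are characters or other boxes."""

    def __init__(self, *parts):
        self.parts = parts

    def render(self, env):
        return "".join(
            part if isinstance(part, str) else part.render(env)
            for part in self.parts
        )

# [Hello, [name]. The time is [t as [h:mm:ss]].]
greeting = Text(
    "Hello, ", Variable("name"),
    ". The time is ", FormatTime(Variable("t"), "%H:%M:%S"), ".",
)

# Sample values, invented for the example.
env = {
    "name": "Ada",
    "t": struct_time((2004, 5, 26, 9, 30, 0, 2, 147, 0)),
}
message = greeting.render(env)  # 'Hello, Ada. The time is 09:30:00.'
```

Any object with a `render` method can be dropped into a `Text` box, which is the "any other arbitrary object that can evaluate to text" part of the idea.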


Input constraint and context-sensitive help are a couple of other things I'm thinking of. My first 'programming' experience was on a ZX81, and there's something uniquely wonderful in having the entire language vocabulary printed beneath your fingertips where you can see it at all times, and just pressing the desired button being all it took to put a legal keyword into the right place. An advantage that's been lost somewhere in the transition from home micros to home PCs, and one I'd very much like to resurrect. The idea of having a floating window that displays a brief summary of the currently selected language structure is also highly appealing. Not only will it help new users get up to speed with language syntax and semantics; it can also be hyperlinked into deeper documentation on programming theory and how to use such structures more effectively, and into macro scripts that can physically show them how to use them by generating template code and examples before their eyes.


And these are probably quite trivial examples of what could be possible in such a system; I've been dabbling with ideas for the last few months and still don't feel like I've more than scratched the surface. (Probably a good indicator I should stop faffing about daydreaming and get on with actually implementing it...;p)

Not that you can't provide such services in other ways, of course, but I can't think of any other way that could do it with such simplicity, efficiency and grace.

...

"In any case the following sounds kind of like the Smalltalk Refactoring Browser, which has been around for some years now..."

Kind of. Except that refactoring browsers really get the cart before the horse; rather than create a general solution that can be customised to support all kinds of different - and often highly interactive - uses, they provide a highly specialised solution that solves only a single problem somewhat after the main event.

FWIW, I did think of drawing the comparison, but I didn't want to end up creating an "oh, another way to build a refactoring browser" misconception by accident. (Almost as bad as creating an "oh, another whizzy XML technology" one, as the original paper's author seems to have done if the blog links I've seen to it are any indication of readers' [mis]perceptions.[grin])

But if it helps to think of it as taking the refactoring browser concept and completely generalising it, and then removing the original character stream-based source code from the equation and leaving the user to do everything by manipulating the browser DOM, then by all means go for it. I'm still struggling to find the perfect pitch myself (as you can probably tell;p), so if you can come up with one then by all means let me know. [grin]

...

Anyway, going to shut up now before I give away all me competitive h-h-h-advantages... but best of British to anyone fancies stealing the ideas given here, and I hope they've provided a useful little glimpse into what [I hope!] will be the future. ;)

About Me

Portland, Oregon, United States
I'm usually writing from my favorite location on the planet, the pacific northwest of the u.s. I write for myself only and unless otherwise specified my posts here should not be taken as representing an official position of my employer. Contact me at my gee mail account, username patrickdlogan.