"I have a mind like a steel... uh... thingy." Patrick Logan's weblog.

Saturday, October 20, 2007

Programming with Streams

Michael Lucas-Smith illustrates programming with streams and a helpful "functional" protocol added by the ComputingStreams package from Cincom Smalltalk's public repository. Michael makes the case that more code should be based on streams instead of collections, and should use a common protocol like ComputingStreams instead of custom, one-off methods that won't play well with others. Streams are a better abstraction than collections because they inherently "delay" computation until an element is requested, so they scale up to, and including, infinitely long streams.
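
As a rough illustration of the idea (in Python rather than Smalltalk, and not the ComputingStreams API itself), here is a minimal sketch of a lazy pipeline over an infinite stream. The names naturals, select, collect, and take are hypothetical, chosen only to echo Smalltalk's collection vocabulary. Nothing is computed until a consumer asks for a value.

```python
from itertools import islice

def naturals(start=0):
    """An infinite stream of integers: start, start + 1, ..."""
    n = start
    while True:
        yield n
        n += 1

def select(predicate, stream):
    """Lazily keep only the elements that satisfy predicate."""
    return (x for x in stream if predicate(x))

def collect(function, stream):
    """Lazily apply function to each element."""
    return (function(x) for x in stream)

def take(count, stream):
    """Realize just the first count elements as a list."""
    return list(islice(stream, count))

# Squares of the even naturals. The source is infinite, yet only as many
# elements are computed as take() requests.
even_squares = collect(lambda x: x * x,
                       select(lambda x: x % 2 == 0, naturals()))
first_five = take(5, even_squares)
print(first_five)
```

Because every stage speaks the same iterator protocol, the stages compose without any one of them knowing how long the underlying stream is.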

Streams are interesting creatures, fundamental to programming if you've got them available. See Structure and Interpretation of Computer Programs: "Streams Are Delayed Lists" and Richard Waters' Series package for Common Lisp for more examples.
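
To make "delayed lists" concrete, here is a small sketch of the idea in Python: a stream is a head value paired with a thunk that produces the rest only when forced. The class and function names (Stream, integers_from, stream_map, stream_take) are my own, not from any of the works above, and the sketch handles only infinite streams, so there is no empty-stream case.

```python
class Stream:
    """A head value plus a delayed, memoized tail."""

    def __init__(self, head, tail_thunk):
        self.head = head
        self._tail_thunk = tail_thunk  # zero-argument function: the "delay"
        self._tail = None
        self._forced = False

    @property
    def tail(self):
        # "force": evaluate the thunk once, then reuse the result
        if not self._forced:
            self._tail = self._tail_thunk()
            self._tail_thunk = None
            self._forced = True
        return self._tail

def integers_from(n):
    """The infinite stream n, n + 1, n + 2, ..."""
    return Stream(n, lambda: integers_from(n + 1))

def stream_map(function, stream):
    """Map lazily: only the head is computed now, the rest is delayed."""
    return Stream(function(stream.head),
                  lambda: stream_map(function, stream.tail))

def stream_take(stream, count):
    """Collect the first count heads, forcing one tail at a time."""
    items = []
    while count > 0:
        items.append(stream.head)
        count -= 1
        if count > 0:
            stream = stream.tail
    return items

squares = stream_map(lambda x: x * x, integers_from(1))
first_squares = stream_take(squares, 5)
print(first_squares)
```

Memoizing the forced tail means walking the same stream twice does not repeat the work, which is the usual reason for pairing delay with a cached force.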

This Smalltalk package is also similar to Avi Bryant's Relational Object Expressions (ROE). ROE's objects respond to the standard Smalltalk collections protocol but delay "realizing" complete, in-memory collections, so the underlying collections can be enormous numbers of rows in tables on disk. What ROE's objects delay is the eventual composition and execution of SQL, including the conditional expressions that narrow the size of the selection.
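
The same delaying trick can be sketched in Python. This is not ROE's design or API: ROE works from Smalltalk blocks, whereas this hypothetical LazyRelation class simply accumulates SQL condition fragments, and it assumes the table name is a trusted identifier. The point is only that select composes a query and nothing touches the database until the relation is iterated. It uses the standard sqlite3 module with an in-memory table as stand-in data.

```python
import sqlite3

class LazyRelation:
    """Hypothetical ROE-like relation: composes SQL now, executes it late."""

    def __init__(self, connection, table, conditions=(), parameters=()):
        self.connection = connection
        self.table = table  # assumed to be a trusted identifier
        self.conditions = tuple(conditions)
        self.parameters = tuple(parameters)

    def select(self, condition, *parameters):
        """Return a new relation with one more condition; no SQL runs here."""
        return LazyRelation(self.connection, self.table,
                            self.conditions + (condition,),
                            self.parameters + parameters)

    def sql(self):
        """Compose the SELECT statement from the accumulated conditions."""
        query = "SELECT * FROM " + self.table
        if self.conditions:
            query += " WHERE " + " AND ".join(
                "(" + condition + ")" for condition in self.conditions)
        return query

    def __iter__(self):
        # Only now is the query executed; the cursor hands back rows
        # as they are fetched.
        return iter(self.connection.execute(self.sql(), self.parameters))

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE people (name TEXT, age INTEGER)")
connection.executemany("INSERT INTO people VALUES (?, ?)",
                       [("Ann", 31), ("Bob", 17), ("Cy", 45)])

adults = LazyRelation(connection, "people").select("age >= ?", 18)
older_adults = adults.select("age < ?", 40)  # still no query executed
print(older_adults.sql())
rows = list(older_adults)                    # the query runs here
print(rows)
```

Each select returns a new relation rather than mutating the old one, so adults remains usable on its own after older_adults is derived from it.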

2 comments:

Steve Jenson said...

'Scala by Example' has a short section on using Streams in Scala. It's pretty much a straight port of the section in SICP. Fun!

Anonymous said...

It would appear that "streams" as discussed in Lisp and Smalltalk are approximately equivalent to Python's generators. The nice thing about generators is that you iterate through them using the same techniques - no new reserved names and such - as you would any iterable.

# A standard list comprehension builds the entire list in memory before
# the loop body runs even once, so it is going to consume all your RAM
# straight away before any useful work is done:
for item in ['over and' for _ in range(100000000000)]:
    # do something with item
    print(item, end=' ')

# A generator expression, or a callable which returns values using the
# yield keyword, produces one value at a time and will chug along merrily
# on even the most limited of machines:
for item in ('over and' for _ in range(100000000000)):
    # do something with item
    print(item, end=' ')

# Creating a generator using yield. In Python 2, use xrange here; in
# Python 3 xrange disappears and the standard range is itself lazy
# (strictly a lazy sequence type rather than a generator).
def foo(x):
    for i in range(x):
        yield 'over and '

for item in foo(100000000000):
    # do something with item...
    print(item, end=' ')

And so on. Generators and generator expressions have become commonplace in Python code these days.

About Me

Portland, Oregon, United States
I'm usually writing from my favorite location on the planet, the Pacific Northwest of the U.S. I write for myself only and, unless otherwise specified, my posts here should not be taken as representing an official position of my employer. Contact me at my gee mail account, username patrickdlogan.