"I have a mind like a steel... uh... thingy." Patrick Logan's weblog.

Search This Blog

Thursday, February 08, 2007

On Complexity

The question is asked over on LtU about my transactional memory rant...

why is the under-the-covers complexity of STM bad, when the under-the-covers complexity of garbage collection is good?
There is a glaringly obvious answer to this...

A garbage collector eliminates tons of complexity from the application developer's burden, allowing the app developer to focus more on the true problem domain.

Transactional memory does no such thing. Application developers have to think about shared memory, potential conflicts, how to express them as transactions, and as mental points out: ultimately how to *recover* from a transaction failure.

On comments from Mental...

I don't know that you're really forced to deal with conflict resolution -- most of the popular STM implementations deal with conflict for you by terminating the conflicted transactions and automatically retrying (perhaps with backoff). Conflict resolution is taken care of (or, if you want to be pessimistic, taken out of your hands).

Common optimistic concurrency control techniques make matters a little sticky in that regard: you can easily end up in a situation where a long-running transaction is starved by a steady stream of shorter transactions whose successful commits conflict with it.

It's a slippery slope. New kinds of conflict resolution will be thought up and implemented. I've seen this with transactional systems like Gemstone for Smalltalk and Java. The "reduced conflict" classes take various strategies during the commit process to turn bitwise conflicts into logical successes. There is an arbitrary number of retries before the transaction service just gives up and fails the transaction. Then it is back in the app developers hands to figure out what to do.

This is not unlike the "fallacies of distributed programming" where some system attempts to hide failures that can arise through distribution. No, that illusion can always be broken somewhere and cannot be fully ignored.

Without the ability to reserve objects to the longer-running transaction (i.e. locking), it's hard to resolve such a one-sided conflict.
And that's another problem. Once this thing is implemented in software (and hardware), as Nat Pryce pointed out earlier, there is still the whole coordination thing.
All that said, STM still has composition properties that other shared-state techniques do not, and there are always going to be situations where shared state is most appropriate. I think it can be a useful technique where shared state is required, and transaction sizes would be modest and relatively consistent.
And at the end of the day I am not arguing against the mechanism per se so much as I am arguing against its widespread use. Although I do believe the number of appropriate uses is so small that 90 some percent of us should not even have to pay attention to it. My fear is this will: (1) draw attention away from getting the majority of us running on simple shared-nothing message passing systems, and (2) end up in the average programmer's toolkit as a shiny fob to get out and play with all the time.
Something that may help limit STM to those situations are its practical restrictions on non-transactional effects -- even if you're not in a language where you're forced into an STM monad, any non-transactional effects need to be idempotent because the transaction may be retried arbitrarily many times. (On the other hand, cue novice programmers wondering why the output from their giant transaction occasionally gets duplicated several times...)
Yeah. It's a slippery slope that is just way to shiny. We all get fascinated by it, end up on that slope, and before you know it, smelly messes everywhere. The alternatives are just so much more promising for the majority of cases.

3 comments:

MenTaLguY said...

I don't know that you're really forced to deal with conflict resolution -- most of the popular STM implementations deal with conflict for you by terminating the conflicted transactions and automatically retrying (perhaps with backoff). Conflict resolution is taken care of (or, if you want to be pessimistic, taken out of your hands).

Common optimistic concurrency control techniques make matters a little sticky in that regard: you can easily end up in a situation where a long-running transaction is starved by a steady stream of shorter transactions whose successful commits conflict with it. The transaction manager helpfully keeps retrying the longer transaction for you, but never makes it all the way to the end. Without the ability to reserve objects to the longer-running transaction (i.e. locking), it's hard to resolve such a one-sided conflict.

Contrary to popular belief, STM doesn't require such implementation techniques (you can use a simple lock-based approach just fine), but those techniques are largely responsible for the technique's performance over more traditional lock-based approaches.

All that said, STM still has composition properties that other shared-state techniques do not, and there are always going to be situations where shared state is most appropriate. I think it can be a useful technique where shared state is required, and transaction sizes would be modest and relatively consistent.

Something that may help limit STM to those situations are its practical restrictions on non-transactional effects -- even if you're not in a language where you're forced into an STM monad, any non-transactional effects need to be idempotent because the transaction may be retried arbitrarily many times. (On the other hand, cue novice programmers wondering why the output from their giant transaction occasionally gets duplicated several times...)

Unknown said...

Please reply to your comments as comments. Or alternatively clearly mark where the original text ended. It makes your posts very confusing to read the way you do it now.

Nat Pryce said...

STM has clean compositional semantics as long as all you want to do is safely synchronise concurrent access to shared data. However, that is only one aspect of concurrent programming. You also need to coordinate *activities*. If you use STM you're going to have to rely on other mechanisms for that, so the clean compositionality goes out the window.

Blog Archive

About Me

Portland, Oregon, United States
I'm usually writing from my favorite location on the planet, the pacific northwest of the u.s. I write for myself only and unless otherwise specified my posts here should not be taken as representing an official position of my employer. Contact me at my gee mail account, username patrickdlogan.