We are stupid.

Let's face it, we may be the smartest organism on this planet - though some may doubt that as well - but we are all stupid. We can only do a few things at the same time, and then not well. We can't count higher than about 9 - unless you are Rain Man - without resorting to a sort of mental tagging. Just don't move while I count! In fact, we can't handle the complexity of reality without fuzzing it up until what we see bears very little resemblance to it. Why then do we build software in such a way that only really smart beings can understand it?

I've been writing software since I was about 15, starting out with BASIC, worked my way up to Turbo Pascal, got into university - computer science - and learned Modula-2, object-oriented programming and functional programming. Then I dropped out and started a company writing software in PHP. I know - as I said: we are stupid.

Still, what I've learned about writing medium-sized software I learned from a number of sources, none of which were taught at the university. I learned the value of keeping components very small by writing monsters and mazes in Underworld - a local MUD of the LPMud variety. It also taught me the value of a single simple unifying platform. I learned the value of good abstractions by using many, many very bad ones. I learned the value of openness - the ability to dive under the hood and see how something is achieved - by browsing the web and studying HTML and JavaScript. Mostly I learned the value of simplicity by walking repeatedly into the wall of unintentional complexity.

The most important thing I've learned is that writing any piece of software is complex. The moment you have more code than fits on a single screen, you start to lose control. The human mind is not made to really understand software. You can look at any part and think you understand it, but there are many parts hidden - out of immediate sight - which impact what you are looking at, and you cannot possibly understand all its implications.

When you get to parallel programming, threads, processes or even just asynchronous calls, you are really lost. When anything can happen at any time, we are simply incapable of keeping track of all potential pitfalls. 

There have been many attempts at containing the complexity: higher-level languages vs. machine code, object-oriented programming vs. imperative, fourth-generation languages that try to think for you, and functional programming vs. all of the above.

None of these have been a cure for the problem, and I believe there may be no single cure. The best of these revolve around eliminating shared state as the root of all complexity. Because of shared state you cannot look at a part and know what it will do; you need a holistic understanding of the system. The larger the system, the less feasible that becomes.

Alan Kay - one of the inventors of OOP through Smalltalk and overall less stupid than most of us - has said that the key to containing or reducing complexity is encapsulation. You keep the complexity inside and don't let complex implementation details leak out. He also insists that creating custom languages for a job - Domain-Specific Languages, or DSLs - keeps complexity down. I believe that is true - or at least that it points in a direction that may eventually get us out of this mess.

Someone else who seems relatively intelligent has made the observation that it is important to reduce complexity and that it may be hard to do so - in the short run. Simple does not equal easy.

One of the things he notes is that syntax is complex and data is simple. I think this is one of his more important observations. It is easy to work with data; it is very difficult to work with code or language. However, this points in a different direction than Alan Kay's insistence on DSLs to solve the complexity problem.

Functional programming has solved part of the complexity problem, and it has done so in a very simple and direct way. It has cut the Gordian knot by removing the possibility of shared state. There are no variables - once something has a value it can never change; it has become immutable. If you want a different value - create a new one.
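
To make that concrete, here is a minimal sketch of that style - written in TypeScript rather than a pure functional language, with made-up names - where 'changing' a value always means deriving a new one:

```typescript
// A value, once created, never changes.
type Point = { readonly x: number; readonly y: number };

// "Changing" a point really means deriving a new one;
// the original stays untouched, so nothing else can be surprised by it.
function moveRight(p: Point, dx: number): Point {
  return { ...p, x: p.x + dx };
}

const origin: Point = { x: 0, y: 0 };
const moved = moveRight(origin, 5);
console.log(origin, moved); // { x: 0, y: 0 } { x: 5, y: 0 }
```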

So why don't we all program in Haskell? I think the problem is that by insisting on a single solution for every level of abstraction in software, the complexity of the simple parts of software increases. People find it hard to program functionally, whereas imperative programming is much more natural - more intuitive. By eliminating the possibility of shared state, it has become more difficult to write the pieces of software where that shared state would in fact make things easier to write and comprehend.

Alan Kay has said that he believes he made a mistake by designing Object Oriented Programming at too low a level of abstraction - he believes objects should be 'bigger' than the way they are used now. I think functional programming may suffer from the same problem. Perhaps we should solve the complexity problem one step up the abstraction ladder.

The largest software system currently running has been running for more than 30 years without once failing catastrophically. The complexity of the thing is staggering; the number of people who have worked on it, and are working on it even now, is uncountable. And it has gone through major and minor updates without shutting down for maintenance. It is called the internet, and it shows that human beings are capable of writing and maintaining very complex systems. So what is different about the internet compared to virtually any other software system out there?

Alan Kay speculates that it may be that the internet was designed by some very intelligent people who knew what they were doing. I think they may have been relatively smart - but still stupid. However, they succeeded in creating a system that is locally simple but overall complex. To understand any part of it you don't need to understand the totality - just the part you are looking at. Everything is built using the same system of communication - TCP/IP. Everything is built to cope with the fact that things will break - and to continue on without failing.

Can we not create smaller systems that have these same features - without giving up the tools we already have?

What if we took a page from the functional programming playbook and eliminated shared state? Not at all levels, just where it would become cumbersome. You could create relatively small pieces of software that work relatively well and let them communicate with other pieces of software through a channel that allows no shared state. Something like the TCP/IP network stack, but simpler.

You could then create a new, more complex system by combining these pieces of software and having them communicate through this restricted channel. This larger system could itself communicate with other systems through a similarly restricted channel, thereby encapsulating the internal complexity.

By using a communication channel with similar features at larger and larger levels of abstraction, you can create systems which are fractally similar. Every step you go up or down the abstraction ladder, things look similar. Software communicates and responds in similar ways. And more importantly - there is no need to step between levels of abstraction to know what will happen. There can be no shared state between levels of abstraction - the communication channel will not allow it.

The simplest communication system I can think of is a bus. A bus that allows only data - no code, no syntax, no pointers and no streams - say something like JSON. A bus that is by definition asynchronous and makes no guarantees about ordering or response times.
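
As a rough illustration - a sketch, not a finished design, and with names I made up - such a bus could be as small as this, again in TypeScript:

```typescript
// Messages are plain data: anything that survives a round trip through JSON.
type Message = { readonly topic: string; readonly data: unknown };
type Handler = (data: unknown) => void;

class Bus {
  private handlers = new Map<string, Handler[]>();

  subscribe(topic: string, handler: Handler): void {
    const list = this.handlers.get(topic) ?? [];
    list.push(handler);
    this.handlers.set(topic, list);
  }

  // Delivery is asynchronous, and the JSON round trip ensures that only data
  // crosses the bus: no functions, no object references, no shared state.
  // Subscribers may not rely on ordering or timing.
  publish(msg: Message): void {
    const copy = JSON.parse(JSON.stringify(msg)) as Message;
    for (const handler of this.handlers.get(copy.topic) ?? []) {
      setTimeout(() => handler(copy.data), 0);
    }
  }
}
```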

By creating objects that can only communicate over this bus - and have no other shared state - objects become fully self-contained. It no longer matters on what processor or computer an object is running. As long as it is connected to the same bus, you can combine it with other objects to build a more complex system. A system that keeps the same level of simplicity at every level of abstraction, where you can know what each component does, but not care how it does it.
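
Continuing the hypothetical Bus sketch above, two self-contained components could then be combined like this; neither knows where or how the other runs, only which topics and data shapes they agree on:

```typescript
const bus = new Bus();

// A thermostat component: all it sees is data arriving on a topic.
bus.subscribe('thermostat/set', (data) => {
  console.log('thermostat target is now', data);
});

// A control panel component: it only needs the topic name and the data shape.
bus.publish({ topic: 'thermostat/set', data: { celsius: 21 } });
```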

In fact, people have already hit on this mechanism to manage complexity in large-scale and changing systems: there are many implementations of a concept called the Enterprise Service Bus, which implements just such a scheme for large service components. The difference is that it is applied only at the topmost level of abstraction. I believe it should be applied recursively: create more complex systems by encapsulating simpler ones, so there are many different buses at different levels of abstraction.
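
Sketched with the same hypothetical Bus as above, recursive application could mean that a subsystem keeps its own internal bus and exposes only a few topics to the bus one level up:

```typescript
// Forward only selected topics from a subsystem's internal bus
// to the bus one level of abstraction up; everything else stays hidden.
function bridge(inner: Bus, outer: Bus, exposedTopics: string[]): void {
  for (const topic of exposedTopics) {
    inner.subscribe(topic, (data) => outer.publish({ topic, data }));
  }
}

const heatingSubsystem = new Bus(); // internal complexity lives here
const houseBus = new Bus();         // the next level up sees only 'heating/status'
bridge(heatingSubsystem, houseBus, ['heating/status']);

houseBus.subscribe('heating/status', (data) => console.log('house sees:', data));
heatingSubsystem.publish({ topic: 'heating/status', data: { on: true } });
// internal topics of the heating subsystem never leave it
```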

6 may 2012
Auke van Slooten
