The first paper I read (although the second paper I'll review) in my personal
quest towards becoming a software engineer was Butler Lampson's
Hints for Computer System Design, 1983. It's an excellent paper,
which has given me a great deal of hope that becoming a competent
designer of complex systems is attainable, while reinforcing my
growing belief that such individuals are rather rare.
The paper itself is extremely practical, and walks through a number
of small ideas about improving system design. There is very little
mathematics (in the strictest sense; L. Lamport seems to argue that any sense of mathematics which excludes computer systems engineering is artificially narrow), nor is there much of a logical structure to the hints:
it's simply an approachable collection of good ideas for software engineering.
(The ideas behind the hints are Lampson's, the wording of the hints are modified versions of Lampson's originals, and the commentary following the hints is my own.)
Interfaces separate clients from implementations.
Interfaces are hard to design (they constitute a mini-language).
Inaccurate assumptions can lead to inept interfaces.
Interfaces should be kept as stable as possible.
Interfaces should not expose the secrets of the underlying implementation.
Don't design an interface which contains functionality you don't know how to implement.
Interfaces shouldn't hide power.
Some functionality can be left to the client (i.e. don't solve a simple client problem with a complex server solution).
Poorly defined interfaces kill projects. Again--I repeat from the core
of my being--poorly defined interfaces kill projects. A well-written
interface can shield consumers from a great deal of complication and
frustration, while a bad interface can make a simple process complex
and a quick process slow.
The art of writing interfaces has yet to be reduced to a science; these
hints are a fine start, still awaiting a thorough ending.
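As a tiny illustration of keeping implementation secrets behind an interface (the class and its dict-backed storage are my own invention, not an example from the paper):

```python
class KeyValueStore:
    """Clients program against get/put; whether the data lives in a
    dict, a file, or a remote server is an implementation secret."""

    def __init__(self):
        self._data = {}  # the secret: a plain dict, swappable later

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


store = KeyValueStore()
store.put("answer", 42)
```

Because clients never touch `_data` directly, the storage can later be replaced without changing the interface, which is exactly what keeps the interface stable.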
Systems must support end-to-end recovery.
If you support end-to-end recovery, all other recoveries are optimization.
Actions should be either atomic or restartable.
These three succinct statements have caused me to truly
reconsider system reliability and recovery, to develop
a more systematic approach in the systems I am currently
working on, and to build these fundamentals into the
ground floor of future projects.
These changes go big, and they also go small. Consider the
differences between these two snippets:
# snippet one
for file in files:
    try:
        with open(file, 'r') as f:
            data = f.read()
            # do something
        os.remove(file)
    except:
        attempt_recovery(file)

# snippet two
try:
    for file in files:
        with open(file, 'r') as f:
            data = f.read()
            # do something
    for file in files:
        os.remove(file)
except:
    attempt_recovery(files)
The first snippet makes operating on each
file an atomic operation, but it deletes important
recovery context (the contents of all previously
successful files) as it goes, making it impossible
to restart the entire operation after reaching a failure.
It is likely more efficient to only attempt recovery
of the files where the operation fails, but the option
to roll back and reattempt the process from the beginning
is no longer available. Instead of returning the system
to a known code-path for recovery (reverting to the beginning
and trying again), a recovery-specific code-path must be written.
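One way to keep that rollback option open (a sketch of my own, not from the paper; `process_all` and the move-aside directory are invented names) is to defer every destructive step, moving inputs aside rather than deleting them, so a failure can return the system to its starting state:

```python
import os
import shutil

def process_all(files, done_dir):
    """Restartable batch: inputs are moved aside rather than deleted,
    so on failure the whole operation rolls back to a known state."""
    os.makedirs(done_dir, exist_ok=True)
    moved = []
    try:
        for path in files:
            with open(path, 'r') as f:
                data = f.read()
                # do something with data
            dest = os.path.join(done_dir, os.path.basename(path))
            shutil.move(path, dest)
            moved.append((path, dest))
    except Exception:
        # rollback: restore every moved file, reverting to the beginning
        for original, dest in moved:
            shutil.move(dest, original)
        raise
```

Only once the whole batch succeeds is it safe to delete the move-aside directory; until then the operation is restartable from scratch.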
Generalization & Abstraction
Do one thing, do it well.
Don't generalize for generalization's sake.
Programs maintain a balance between the abstract and the concrete, and veering too far
in one direction or the other has obvious consequences. The greater the number of
individuals who need to understand a given code path, the more I would err
towards simplicity over flexibility and power (not that these things are
necessarily on opposite sides of the spectrum, but often they are).
It is frequently suggested that generalized code is
less efficient than more focused code. If the concrete code is designed
in such a way as to take advantage of additional knowledge about each subset
of problems (optimizing for adding strings as opposed to adding integers, say),
that is certainly true, but often I find that abstraction allows a single
code path to be written with greater care. It is certainly possible to nail
yourself to the wall with either approach.
Handle the best case and worst case separately. The best case must be efficient; the worst case must make progress.
Use static analysis.
Cache answers to expensive questions.
Use hints to speed up calculation.
Use batch processing.
Safety first, speed afterwards.
I think the performance advice is fairly straightforward with
the exception of two bits: handling the best and worst cases
separately, and using hints.
Using hints is the concept that you can often take advantage of
some highly likely but not guaranteed aspect of your
dataset to provide faster operations. For example, you might choose
an insertion sort over a quicksort if you know the majority of
your dataset will already be in-order.
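That insertion-sort hint can be sketched concretely. This is a standard textbook insertion sort; on nearly-sorted input the inner while loop almost never runs, so the expected cost is close to linear:

```python
def insertion_sort(items):
    """Runs in O(n + d) time where d is the number of inversions:
    fast when the input is mostly in order, quadratic at worst."""
    items = list(items)  # work on a copy
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # shift larger elements right; on nearly-sorted data this
        # loop body almost never executes
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current
    return items
```

The hint is only a hint: if the in-order assumption turns out to be wrong, the result is still correct, just slower.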
When making changes, keep a place to stand.
Plan to throw one away.
Divide & conquer.
These three pieces of advice speak strongly to the design
decisions my team has been making over the last half-year or so.
We've had to upgrade one interface quite substantially, while
leaving the previous version intact. Certain parts of the
system have been rewritten with the benefit of a clearer
understanding of the problem they are solving.
In my experience thus far, the hardest part is always
finding an optimal way to slice a project up such that
each individual has a portion that is suited to their
skill-level and preferences, while still making sure
the project actually gets done. System design strikes me as
intimately involved with understanding the engineers
who are doing the implementation.
Higher Order Functions
Use procedures as function arguments.
This is a wonderful hint, the value of which is often obscured
by the recent reliance on languages which only support the
object-oriented paradigm for programming.
A personal example for passing function arguments is when
performing stream-processing on incoming data, where you need
to perform different operations (or a different order of
operations) depending on the input.
This leads to a concise, simple and composable stream processing mechanism
which can be easily modified in the future. Certainly this can be done
without passing functions as parameters, but it is a fairly clean solution.
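A minimal sketch of that idea (the `pipeline` helper and the stage names are my own, chosen to illustrate the pattern): each stage is an ordinary function, and the order of operations is simply the order of arguments.

```python
def pipeline(*stages):
    """Compose per-record stages into a single processing function."""
    def run(record):
        for stage in stages:
            record = stage(record)
        return record
    return run

# hypothetical stages; swap or reorder them per input type
strip_whitespace = str.strip
lowercase = str.lower

process = pipeline(strip_whitespace, lowercase)
```

Adapting the processing to a new kind of input is then a matter of building a different pipeline, not of touching the stages themselves.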
That said, what Lampson was really thinking of was probably more along the
lines of writing a generic filter function which takes a predicate function
as a parameter.
This is a rather handy pattern for utility functions.
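Python's built-in `filter` is exactly this pattern; a hand-rolled version (my own sketch) makes it explicit:

```python
def my_filter(predicate, items):
    """Keep only the items for which the caller's predicate holds:
    one generic function serves every selection criterion."""
    return [item for item in items if predicate(item)]

evens = my_filter(lambda n: n % 2 == 0, range(10))
```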
Promising Reading Fodder from References
Looking through the references, there are a number
of papers which seem to carry some promise, which
I'll be looking for in the future. That said, I
haven't read these papers yet, so it may turn out
my optimism is unfounded; please point out my
folly if you've read one of the papers before
and found it lacking.
Gray, J. et al. The recovery manager of the System R database manager. Computing Surveys 13, 2, June 1981, pp 223-242.
Hoare, C.A.R. Hints on programming language design. SIGACT/SIGPLAN Symposium on Principles of Programming Languages, Boston, Oct. 1973.
Lampson, B.W. and Sturgis, H.E. Atomic transactions. In Distributed Systems — An Advanced Course, Lecture Notes in Computer Science 105, Springer, 1981, pp 246-265.
Lampson, B.W. and Sturgis, H.E. Reflections on an operating system design. Comm. ACM 19, 5, May 1976, pp 251-265.
Reed, D. Naming and Synchronization in a Decentralized Computer System, MIT LCS TR-205. Sep. 1978.
As I said in the introduction, this is a tremendous paper; it is
extremely approachable for laymen, provides a great many thought-provoking
ideas, and comes from an extremely successful engineer. Looking
through the hints, essentially all of them directly apply to decisions
my team has made or is making at work. Areas where we made conflicting decisions
have often turned into swampy code pits.
This paper probably doesn't have too much depth to offer an experienced
developer, but provides a great deal of breadth, and serves well as a
checklist for software engineering decisions.