The Logic of Service-Orientation Plus 14 SO Tenets and Practical Principles
by Juval Löwy
What exactly is service orientation, and what does it mean for the future of the software industry? What are the principles that should guide any developer using it? My recently published book, Programming WCF Services, is all about designing and developing service-oriented applications using WCF. But here I want to present my understanding of what service-orientation is all about, and what it means in practical terms.
To understand where the software industry is heading with service orientation, it helps to first appreciate where it came from. Almost nothing in this methodology is entirely new; it is the result of a gradual evolution of methodology and practice that has spanned decades. I'll begin with a brief history of software engineering and the arc of its development. I'll then describe what makes a service-oriented application (as opposed to mere service-oriented architecture), explain what services themselves are, and examine the benefits of using the service-oriented approach. I'll conclude by presenting several principles of service orientation and augment these abstract tenets with a few more practical and concrete requirements that must be met by most applications.
From Machine Code to Services: A Brief History of Software Engineering
The first modern computer was an electromechanical, typewriter-size device developed in Germany shortly after World War I for enciphering messages. The device was later sold to the German Commerce Ministry, and in the 1930s was adopted by the German military for enciphering all communication. Today we know it as the Enigma. Enigma used mechanical rotors for changing the route of electrical current flow from a typed key to a light board with a different letter on it (the ciphered letter). Enigma was not a general-purpose computer: it could only do enciphering and deciphering (today we call it encryption and decryption). If the operator wanted to change the encryption algorithm, he had to change the mechanical structure of the machine by changing the rotors, their order, their initial positions, and the wired plugs that connected the keyboard to the light board. The "program" was therefore coupled in the extreme to the problem it was designed to solve (encryption), and to the mechanical design of the computer.
The late 1940s and the 1950s saw the introduction of the first general-purpose electronic computers for defense purposes. These machines could run code that addressed any problem, not just a single predetermined task. The downside was that the code executed on those computers was in a machine-specific "language" with the program coupled to the hardware itself. Code developed for one machine could not run on another. Initially this was not a cause for concern since there were only a handful of computers in the world anyway.
Unstructured Programming: Assembly and Higher-Level Languages
As machines proliferated, the emergence of assembly language in the 1950s decoupled code from specific machines and enabled it to run on multiple machines. However, the code was now coupled to machine architecture. Code written for an 8-bit machine could not run on a 16-bit machine, let alone withstand differences in the registers or available memory and memory layout. As a result, the cost of owning and maintaining a program began to escalate. This coincided more or less with the widespread adoption of computers by the civilian and government sectors, where the more limited resources and budgets necessitated a better solution.
In the late 1950s and the 1960s, higher-level languages such as COBOL and FORTRAN introduced the notion of a compiler. A developer could write in an abstraction of a machine program (the language), and the compiler would translate that into actual assembly code. Compilers for the first time decoupled the code from the hardware and its architecture. The problem with those first-generation higher-level languages was that programs written in them were unstructured: the code was coupled to its own structure through the use of jump or go-to statements. Minute changes to the code structure had devastating effects throughout the program.
Structured Programming: C and Pascal
The 1970s saw the emergence of structured programming via languages such as C and Pascal, which decoupled the code from its internal layout and structure by using functions and structures. During this same period, developers and researchers began to examine software as an engineered entity. To drive down the cost of ownership, companies had to start thinking about reuse--how could a piece of code be written so that it could be reused in other contexts? With languages like C, the basic unit of reuse is the function, but the problem with function-based reuse is that the function is coupled to the data it manipulates, and if the data is global, a change to benefit one function in one reuse context damages another function used somewhere else.
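The problem with function-based reuse over global data can be sketched in a few lines. This is only an illustration, written in Python for brevity rather than C; the `settings` dictionary and both function names are hypothetical.

```python
# Hypothetical global data shared by two functions reused in different contexts.
settings = {"rounding": 2}

def format_price(amount):
    """Reused by an invoicing module that expects two decimal places."""
    return f"{amount:.{settings['rounding']}f}"

def format_measurement(value):
    """Reused by a lab-report module that reads the same global."""
    return f"{value:.{settings['rounding']}f}"

before = format_price(19.999)

# A change made to benefit format_measurement's reuse context...
settings["rounding"] = 4

# ...silently changes what format_price produces somewhere else.
after = format_price(19.999)
```

Neither function was edited, yet the invoicing output changed. The coupling runs through the shared data rather than through any visible call.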
Object-Orientation: C++ and Smalltalk
The solution to these problems was object-orientation, which appeared in the 1980s with languages such as Smalltalk, and later, C++. With object-orientation, the functions and the data they manipulate are packaged together in an object. The functions (now called methods) encapsulate the logic, and the object encapsulates the data. Object-orientation enables domain modeling in the form of a class hierarchy. The mechanism of reuse is class-based, enabling both direct reuse and specialization via inheritance.
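As a rough sketch of that packaging, again in Python for brevity, the class below owns its data and a derived class specializes it. `Account` and `SavingsAccount` are invented names for illustration only.

```python
class Account:
    """The object encapsulates its data; methods encapsulate the logic."""

    def __init__(self, balance=0):
        self._balance = balance  # state owned by the object, not a global

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    @property
    def balance(self):
        return self._balance


class SavingsAccount(Account):
    """Class-based reuse: specialization via inheritance."""

    def __init__(self, balance=0, rate=0.25):
        super().__init__(balance)
        self._rate = rate

    def add_interest(self):
        # Reuses the base-class logic instead of touching shared data.
        self.deposit(self.balance * self._rate)


savings = SavingsAccount(100, rate=0.25)
savings.add_interest()
```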
But object-orientation is not without its own acute problems. First, the generated application (or code artifact) is a single monolithic application. Languages like C++ have nothing to say about the binary representation of the generated code. Developers had to deploy huge code bases every time, even for minute changes. This had a detrimental effect on the development process, quality, time to market, and cost. While the basic unit of reuse was a class, it was a class in source format. Consequently, the application was coupled to the language used. You could not have a Smalltalk client consuming a C++ class or deriving from it. Moreover, it turned out that inheritance is a poor mechanism for reuse, often doing more harm than good, because the developer of the derived class needs to be intimately aware of the implementation of the base class. This introduces vertical coupling across the class hierarchy.
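One well-known way this vertical coupling shows up is the so-called fragile base class problem. The sketch below is my own illustration in Python, with hypothetical `Bag` and `CountingBag` classes. The derived class looks correct in isolation but miscounts, because the base class happens to implement one method in terms of another.

```python
class Bag:
    """Base class as shipped by a hypothetical library."""

    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def add_all(self, items):
        # Implementation detail: add_all is built on top of self.add.
        for item in items:
            self.add(item)


class CountingBag(Bag):
    """Derived class that tries to count insertions."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def add(self, item):
        self.count += 1
        super().add(item)

    def add_all(self, items):
        self.count += len(items)   # reasonable without knowledge of the base class
        super().add_all(items)     # but the base calls add(), so items are counted twice


bag = CountingBag()
bag.add_all(["a", "b", "c"])
# bag.count is now 6, although only 3 items were stored.
```

Fixing `CountingBag` requires knowing how `Bag.add_all` is implemented, and the fix breaks again if the base class later changes that detail.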
Object-orientation was also oblivious to real-life challenges, such as deployment and versioning. Serialization and persistence posed yet another set of problems. Most applications did not start by plucking objects out of thin air--they had some persistent state that needed to be hydrated into their objects, and yet there was no way of enforcing compatibility between the persisted state and the potentially new object code. If the objects were distributed across multiple processes or machines, there was no way of using raw C++ for the invocation, since C++ required direct memory reference and did not support distribution. Developers had to write host processes and use some remote call technology such as TCP sockets to remote the calls, but such invocations looked nothing like native C++ calls and did not benefit from it.
Component-Orientation: COM and .NET
The solutions to the problems of object-orientation evolved over time, involving technologies such as the static library (.lib) and the dynamic library (.dll), culminating in 1994 with the release by Microsoft of Component Object Model (COM), the first component-oriented technology. Component-orientation provides interchangeable and interoperable binary components. With COM, the client and the server agree on a binary type system (such as IDL) and a way of representing the metadata inside the opaque binary components, instead of sharing source files. COM components are discovered and loaded at runtime, enabling scenarios such as the dropping of a control on a form and having that control automatically loaded at runtime on the client's machine. The client only programs against an abstraction of the service provided by the COM object--a contract called the interface. As long as the interface is immutable, the service is free to evolve at will. A proxy can implement the same interface and thus enable seamless remote calls by encapsulating the low-level mechanics of the remote call.
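The interface-and-proxy idea can be sketched independently of COM itself. The following is a simplified illustration in Python, not COM code: `ICalculator`, `Calculator`, `CalculatorProxy`, and `fake_transport` are all hypothetical names, and a JSON round trip stands in for the real low-level remote call.

```python
import json
from abc import ABC, abstractmethod


class ICalculator(ABC):
    """The contract: the only thing the client programs against."""

    @abstractmethod
    def add(self, x, y):
        ...


class Calculator(ICalculator):
    """The actual implementation, conceptually on the server side."""

    def add(self, x, y):
        return x + y


def fake_transport(request, target):
    """Stand-in for a remote call: decode the message, invoke, encode the reply."""
    message = json.loads(request)
    result = getattr(target, message["method"])(*message["args"])
    return json.dumps({"result": result})


class CalculatorProxy(ICalculator):
    """Implements the same interface but forwards each call as a message."""

    def __init__(self, target):
        self._target = target

    def add(self, x, y):
        request = json.dumps({"method": "add", "args": [x, y]})
        reply = fake_transport(request, self._target)
        return json.loads(reply)["result"]


def client(calc: ICalculator):
    # The client cannot tell a local object from a proxy.
    return calc.add(2, 3)


local_result = client(Calculator())
remote_result = client(CalculatorProxy(Calculator()))
```

Because the client depends only on `ICalculator`, either implementation can be swapped in without changing client code.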
The availability of a common binary type system in COM enables cross-language interoperability, and so a Visual Basic client can consume a C++ COM component. The basic unit of reuse is the interface, not the component, and polymorphic implementations are interchangeable. Versioning is controlled by assigning a unique identifier for every interface, COM object, and type library.
While COM was a fundamental breakthrough in modern software engineering, most developers found it unpalatable. COM was unnecessarily ugly because it was bolted on top of an operating system that was unaware of it, and the languages used for writing COM components (such as C++ and Visual Basic) were at best object-oriented but not component-oriented. This greatly complicated the programming model, requiring frameworks such as ATL to bridge the two worlds.
Recognizing these limitations, Microsoft released .NET 1.0 in 2002. .NET is (in the abstract) nothing more than cleaned-up COM, C++, and Windows, all working seamlessly together under a single, new component-oriented runtime. .NET supports all the advantages of COM, and mandates and standardizes many ingredients such as type metadata sharing, serialization, and versioning. While .NET is at least an order of magnitude easier to work with than COM, both COM and .NET suffer from a similar set of problems:
Technology and platform
The application and its code are coupled to the technology and the platform. Both COM and .NET are only available on Windows. Both COM and .NET expect the client and the service to be either COM- or .NET-based, and cannot interoperate natively with other technologies, whether they're on Windows or not. While bridging technologies such as web services make interoperability possible, they force developers to let go of almost all of the benefits of working with the native framework, and they introduce their own complexities.
Concurrency management
When shipping a component, a vendor cannot assume that its clients will not access it from multiple threads concurrently. In fact, the only safe assumption the vendor can make is that the component will be accessed by multiple threads. As a result, the components must be thread-safe and equipped with a synchronization lock. If an application developer is building an application by aggregating multiple components from multiple vendors, the introduction of multiple locks renders the application deadlock-prone. Avoiding the deadlock couples the application and the components.
Transactions
If the application wishes to have the components participate in a single transaction, the application that hosts them must coordinate the transaction and flow it from one component to the next, which is a serious programming feat. It also introduces coupling between the application and the components.
Communication protocols
If components are deployed across process or machine boundaries, they are coupled to the details of the remote calls, the transport protocol used, and that protocol's implications for the programming model (such as reliability and security).
Communication patterns
Components can be invoked synchronously or asynchronously, and can be connected or disconnected. A given component may support only some of these modes, and the application must be aware of its exact preference.
Versioning
Applications may be written against one version of a component and yet encounter another in production. Dealing robustly with versioning issues couples the application to the components it uses.
Security
Components may need to authenticate and authorize their callers, and yet how would the component know which security authority it should use or which user is a member of which role? Not only that, but the component may want to ensure that the communication from its clients is secure, which of course imposes certain restrictions on the clients and in turn couples them to the security needs of the component.
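To make the concurrency item in the list above concrete, here is a minimal sketch in Python with a hypothetical `Component` class. Two components each guard their state with their own lock. The barriers and the timeout exist only to make the demonstration deterministic and finite; without the timeout, both threads would wait forever.

```python
import threading


class Component:
    """Hypothetical vendor component guarded by its own lock."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()


def call_through(first, second, both_locked, both_tried, results):
    """Hold first's lock, then try to call into second while still holding it."""
    with first.lock:
        both_locked.wait()                           # each thread now holds one lock
        acquired = second.lock.acquire(timeout=0.2)  # a plain acquire() would never return
        results[first.name] = acquired
        both_tried.wait()                            # keep holding until both attempts ended
        if acquired:
            second.lock.release()


a, b = Component("A"), Component("B")
both_locked = threading.Barrier(2)
both_tried = threading.Barrier(2)
results = {}

threads = [
    threading.Thread(target=call_through, args=(a, b, both_locked, both_tried, results)),
    threading.Thread(target=call_through, args=(b, a, both_locked, both_tried, results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Neither thread obtained the other component's lock.
```

Each component is thread-safe on its own, yet combining them creates the deadlock. Avoiding it means the application has to know each component's locking behavior, which is exactly the coupling described in the list.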
Both COM and .NET tried to address some (but not all) of these challenges using technologies such as COM+ and Enterprise Services, respectively (similarly, Java introduced J2EE), but in reality, such applications were inundated with plumbing. In a decent-size application, the bulk of the effort, development, and debugging time is spent on addressing such plumbing issues, as opposed to business logic and features. To make things even worse, since the end customer (or the development manager) rarely cares about plumbing (as opposed to features), the developers are not given adequate time to develop robust plumbing. Instead, most handcrafted plumbing solutions are proprietary (which hinders reuse, migration, and hiring), and low quality, because most developers are not security or synchronization experts and were never given the time and resources to develop the plumbing properly.
As you consider the brief history of software engineering I've just outlined, you'll see a pattern: each new generation of technology incorporates the benefits, and improves on the deficiencies, of the technology that preceded it. However, every generation also introduces its own challenges, and I would say that modern software engineering is the ongoing refinement of the ever-increasing degrees of decoupling. Yet, while the history of software shows that coupling is bad, it also suggests that coupling is unavoidable. An absolutely decoupled application is useless because it adds no value. Developers can only add value by coupling things together. The very act of writing code is coupling one thing to another. The real question is how to wisely choose what to be coupled to.
I believe there are two types of coupling. Good coupling is business-level coupling. Developers add value by implementing a system use case or a feature, by coupling software functionality together. Bad coupling is anything to do with writing plumbing. What is wrong with .NET and COM is not the concept, but the fact that developers still have to write so much plumbing.
Recognizing the problems of the past, the service-oriented methodology emerged in the 2000s as the answer to the shortcomings of the object- and component-oriented methodologies, which I'll discuss in greater detail in the next section. For Microsoft developers, the methodology is embodied in the Windows Communication Foundation (WCF), which was released in November 2006 as part of the .NET Framework 3.0.