|
December 2007
December 31, 2007
SOA Is Not About Business
Sam Gentile comments about my attempts to define SOA (Part I, Part II, and more to come) by saying that:
That's all well and true, but any definition of SOA must encompass the business drivers and business reasons, as SOA is not really about technology. It is about a better alignment of business and IT through business processes and services. The goal is to create a dynamic, more Agile and Dynamic IT that can respond quickly to new business opportunities and threats by quickly assembling new capabilities from putting together composite applications (and even Mash-ups) from reusable business services...
Sorry Sam, but I beg to differ -- not about the importance of the business drivers behind implementing SOA, but about what SOA is. The culprit, in my opinion, is terminology overloading.
SOA is, as I said in the above-mentioned post and numerous other times, first and foremost an architectural style -- and as an architectural style it offers several architectural benefits and poses several architectural constraints. This has nothing to do with business drivers. It has to do with defining components, relations, attributes on relations and components, as well as constraints. Now you can take that set of rules and use (or misuse) it as you like, in the context of a subsystem, single project, product line, or enterprise -- this is your choice.
Applying SOA, on the other hand, has everything to do with the business. I'll take Sam's post word-for-word, but instead of using the word SOA, I would prefer using the term SOA initiative. An SOA initiative is the effort of applying SOA in a wide context for an enterprise, aiming to increase the alignment of IT and the business, etc. I would have to say, though, that in my experience such an effort would rarely use SOA alone. It would also include other distributed architectural styles that help with loose coupling, such as EDA and REST, to name a couple.
By the way, SOA has nothing to do with technology either. You can implement SOA using WS-*, AtomPub, MSMQ, or CORBA, just as you can implement REST with quite a few technologies. It so happens that WS-* is a common implementation technology for SOA, and that HTTP is a common implementation technology for REST, but both styles live independently of the technologies.
Posted by Arnon Rotem-Gal-Oz at 05:53 PM Permalink
|
December 26, 2007
Defining SOA: Part II, Client/Server
In the previous post on defining SOA I claimed that SOA is an architectural style building on four other architectural styles. The first one of these is Client/Server.
Describing client/server is easy -- not because I am such a genius (far from it), but because it has already been done numerous times. Let's take a look at the definition from Roy Fielding's now-famous dissertation (the link is to Chapter 3; REST is defined in Chapter 5 if you are interested):
The client-server style is the most frequently encountered of the architectural styles for network-based applications. A server component, offering a set of services, listens for requests upon those services. A client component, desiring that a service be performed, sends a request to the server via a connector. The server either rejects or performs the request and sends a response back to the client. A variety of client-server systems are surveyed by Sinha [123] and Umar [131]. Andrews [6] describes client-server components as follows: A client is a triggering process; a server is a reactive process. Clients make requests that trigger reactions from servers. Thus, a client initiates activity at times of its choosing; it often then delays until its request has been serviced. On the other hand, a server waits for requests to be made and then reacts to them. A server is usually a non-terminating process and often provides service to more than one client.
Separation of concerns is the principle behind the client-server constraints. A proper separation of functionality should simplify the server component in order to improve scalability. This simplification usually takes the form of moving all of the user interface functionality into the client component. The separation also allows the two types of components to evolve independently, provided that the interface doesn't change.
The basic form of client-server does not constrain how application state is partitioned between client and server components. It is often referred to by the mechanisms used for the connector implementation, such as remote procedure call [23] or message-oriented middleware [131].
SOA takes from the client/server style the two roles -- i.e., in each interaction one party is the client (what I call "service consumer") and the other is the server (service), which handles the request coming from the client*. Unlike traditional client/server, the roles are held only for a particular set of interactions -- a given interface that the service exposes. In another set of interactions the roles can be reversed, and a component that once was a server can now act as a client, even when working with the very same component that was previously its client.
Like REST, SOA takes the constraint of separation of concerns, which allows the service and its service consumers to evolve independently (as long as the interface is kept).
To support this, a service should take care of all its internal state without exposing that state or its internal structures outside of the service. This also allows the service to scale behind the interface, but for that we also need constraints and capabilities from the next architectural style, layered system, which I'll discuss in the next installment on this subject.
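To make the two points above a bit more concrete, here is a minimal Python sketch. Everything in it is hypothetical (the `OrderService` and `InventoryService` names, their methods, and the stock numbers are mine, for illustration only), and plain method calls stand in for whatever connector a real system would use. It shows each component acting as the server for one interface and as the consumer of another, while keeping its state private.

```python
class InventoryService:
    """Server for `reserve`; consumer of OrderService in `restock`."""

    def __init__(self):
        self._stock = {"widget": 3}  # internal state, never handed out
        self._orders = None

    def connect(self, orders):
        self._orders = orders

    def reserve(self, item, qty):
        # Interface exposed to consumers: here this service is the server.
        if self._stock.get(item, 0) < qty:
            return False
        self._stock[item] -= qty
        return True

    def restock(self, item, qty):
        # Roles reversed: this service is now the consumer of OrderService.
        self._stock[item] = self._stock.get(item, 0) + qty
        return self._orders.retry_pending(item)


class OrderService:
    """Consumer of InventoryService in `place`; server for `retry_pending`."""

    def __init__(self, inventory):
        self._inventory = inventory
        self._pending = []  # internal state: (item, qty) pairs

    def place(self, item, qty):
        if self._inventory.reserve(item, qty):
            return "confirmed"
        self._pending.append((item, qty))
        return "pending"

    def retry_pending(self, item):
        confirmed, still_pending = 0, []
        for pending_item, qty in self._pending:
            if pending_item == item and self._inventory.reserve(pending_item, qty):
                confirmed += 1
            else:
                still_pending.append((pending_item, qty))
        self._pending = still_pending
        return confirmed


inventory = InventoryService()
orders = OrderService(inventory)
inventory.connect(orders)
print(orders.place("widget", 2))
print(orders.place("widget", 2))
print(inventory.restock("widget", 5))
```

Neither class ever returns its internal dictionary or list, so either one could change how it stores state without its consumer noticing, as long as the interface is kept.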
*You can compose SOA with other architectural styles such as EDA and have the service also push data, but that isn't something SOA does in its basic form.
Posted by Arnon Rotem-Gal-Oz at 05:13 PM Permalink
|
December 20, 2007
Make It Easy To Do the Right Thing
Wes Dyer, one of the principal people behind the Volta tier-splitting, was kind enough to leave a comment on my previous post. Here is one quote from that comment:
I do want to clear up a few things about Volta that we apparently didn't make clear enough. We do not believe that you can develop an application as if it will run on a single tier and then just sprinkle a few custom attributes here and there and be done with it. More than anything else, programmers need brains. Volta does not claim that programmer brains can be checked at the door. When the programmer wants to divide the application across a particular boundary then things like network latency, new failure modes, concurrency, etc. need to be considered at that boundary. What Volta does is make expressing the transition between boundaries easier. It reduces the accidental complexity of writing all of the boilerplate code to express the programmer's intention. This allows the programmer to focus on the essential complexity of his problem domain -- figuring out how to write effective code for that particular tier boundary.
For one, it is good to hear that the architects behind Volta have a deeper understanding of distributed computing challenges -- even if the first version doesn't seem to show it. I haven't used Microsoft Volta enough to say whether the problem really is not with its inherent capabilities and design (let's just take Wes's word for that). I am also not against saving the boilerplate code (though I would favor libraries over code generation, and I would try to keep the "generation" gap to a minimum, i.e., the amount of generated code or the distance between the abstraction and the next concrete level).
Lastly, I am also in favor of trusting that developers have brains and that it is okay to give developers "sharp tools". So if all is good, where's the problem?
The problem is that you have to make it "easy to do the right thing" and provide the means to do the more complicated, less safe things. When I teach my young kids (and I can objectively say they are very smart :) ) to use a knife, I don't hand them the razor-sharp butcher knife first. They start with the plastic ones. When they've mastered that, they can try something more dangerous. When you allow distributing something at the flick of an attribute, and put a marketing blurb on the site that makes it compelling to use, you create the wrong impression for less experienced folks.
In one project whose architecture I reviewed, the (very talented) architect/developer designed his own distributed transactions system (he shouldn't have been doing that in the first place -- but that's for another post). When designing this he built in a lot of ways to control the transaction behavior, including the option to allow transaction participants to prevent rollback without failing the transaction. Circumventing the transaction was as easy as making it work properly. Are there edge cases where you may need to have one participant violate the ACIDness of the transaction? I guess so -- but that is not the general rule. Most of the time when you commit a transaction you expect it to be ACID. If for some reason it didn't behave that way, you want to know about it, even if it didn't actually fail. When you don't make it easy to do the right thing, you get unexpected behaviors, you get hard-to-explain bugs, you get slow performance, etc.
Developers using tools, smart as they may be, don't usually go and read all the source code of the tool/framework they are using (maybe they should). If two options are just as easy to use, it seems safe to assume they are equally right. Things which are unsafe should be clearly marked as such to prevent misuse by inexperienced users. This is especially true for tools that are targeted for common use and meant to ease the life of inexperienced developers.
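As an illustration of what "clearly marked" could look like in an API, here is a small Python sketch. It is not the transaction system from the project I reviewed; the `Transaction` class and its method names are hypothetical and invented for this example. The safe path is the short, obvious call. The unsafe path has a long, scary name, demands a reason, emits a warning, and leaves a record when atomicity is actually broken.

```python
import warnings


class Transaction:
    def __init__(self):
        self._participants = []  # (name, commit, rollback, exempt)
        self.atomicity_violations = []  # who kept their work after a rollback

    def enlist(self, name, commit, rollback):
        """The easy, default path: the participant is bound by the outcome."""
        self._participants.append((name, commit, rollback, False))

    def enlist_unsafe_exempt_from_rollback(self, name, commit, reason):
        """Deliberately awkward: long name, mandatory reason, and a warning."""
        if not reason:
            raise ValueError("a reason is required to bypass rollback")
        warnings.warn(f"{name} is exempt from rollback: {reason}", stacklevel=2)
        self._participants.append((name, commit, None, True))

    def run(self):
        done = []
        try:
            for participant in self._participants:
                participant[1]()  # call the participant's commit step
                done.append(participant)
        except Exception:
            # Undo completed work in reverse order; exempt participants keep
            # theirs, but the violation is recorded so the caller can see it.
            for name, _commit, rollback, exempt in reversed(done):
                if exempt:
                    self.atomicity_violations.append(name)
                else:
                    rollback()
            return False
        return True
```

A caller who only ever uses `enlist` gets all-or-nothing behavior without thinking about it. A caller who needs the edge case can still have it, but cannot reach it by accident, and can inspect `atomicity_violations` afterwards.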
Posted by Arnon Rotem-Gal-Oz at 04:50 PM Permalink
|
December 12, 2007
Why Arbitrary Tier-splitting Is Bad
I got a couple of emails with questions regarding my previous post on Volta. So here's another go at explaining why dynamic-tiering is not a good move--this time in technicolor.
Let's start with a simple example. The diagram below represents a typical local component (A) in its environment. As a component that works locally, it has access to other local components which it interacts with. These can be objects it created by itself or objects that were injected into it. The likely design for local components is to have a chatty interaction. After all, objects can talk to instances of other objects quite easily.
[Figure: local component A and the local components it interacts with]
Now enters Volta (or any other such framework, and I've seen a few, I'm ashamed to say. I even wrote one about 15 years ago). So Volta says "we'll just mark things we want to execute on a different server and everything is fine." What you get is something like the illustration below:
[Figure: component A moved to a different server, with the same interactions now crossing the network]
We have the same number of interactions -- only now all the interactions between A and its (used to be) near environment require serialization, network interaction, possibly encryption, authentication, authorization, and what not. You can imagine that this type of interaction can have a heavy hit on performance and scalability if it wasn't pre-designed somehow.
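Here is a rough Python sketch of that effect. It is not Volta code; the `RemoteProxy` and `Customer` classes and the 50 ms latency figure are assumptions made up for the illustration. The proxy simply counts every call made through it as one network round trip, so you can compare a chatty interface with a coarse-grained one once a tier boundary is dropped between the caller and the object.

```python
class RemoteProxy:
    """Wraps an object and counts every method call as one round trip."""

    def __init__(self, target):
        self._target = target
        self.round_trips = 0

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        method = getattr(self._target, name)

        def call(*args, **kwargs):
            self.round_trips += 1
            return method(*args, **kwargs)

        return call


class Customer:
    def __init__(self):
        self._data = {"name": "Ada", "city": "Haifa", "tier": "gold"}

    # Fine-grained, chatty interface: natural for local objects.
    def get_name(self):
        return self._data["name"]

    def get_city(self):
        return self._data["city"]

    def get_tier(self):
        return self._data["tier"]

    # Coarse-grained alternative designed with the boundary in mind.
    def get_summary(self):
        return dict(self._data)


def estimated_overhead_ms(round_trips, latency_ms):
    """Network overhead only, under an assumed fixed per-call latency."""
    return round_trips * latency_ms


chatty = RemoteProxy(Customer())
label = f"{chatty.get_name()} ({chatty.get_city()}, {chatty.get_tier()})"

coarse = RemoteProxy(Customer())
summary = coarse.get_summary()

print(chatty.round_trips, estimated_overhead_ms(chatty.round_trips, 50))
print(coarse.round_trips, estimated_overhead_ms(coarse.round_trips, 50))
```

The code on the calling side looks equally innocent in both cases, which is exactly the problem: nothing in the chatty version hints that each getter now pays for a network hop.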
This is a bit of hand-waving, so let me also give you an example from a real project. About three years ago I was invited to consult on a project. This was the kind of project that interacts with real things like sensors, etc. I'll use an automated irrigation system to illustrate its architectural components. One type of component is "Things"; these represent real devices you can interact with, like sprinklers, soil, sensors, etc. Things represent the logical state of the real devices and cannot talk to each other. When two Things need to interact (e.g., you want to turn on the sprinkler if the soil is dry), I introduce another architectural component, which I call an "Interaction", which looks at the state of the Things and can then act upon it. The last major type is "Services" (not services in the SOA sense); e.g., we can have a Service that reads the weather. Services can't interact with Things directly, but they can interact with one another and with Interactions. This particular system had dozens of Things, and hundreds of Interactions and Services. The tiers/process boundaries were as follows:
[Figure: tier/process boundaries of the original design]
Interactions have to know about changes in both Things and Services, so messages keep flying around this system to keep the Interactions in sync as well as to propagate decisions made by Interactions. The outcome of this "smart" design is that every status change in a Thing results in an order of magnitude more messages to react to the change in status. I was brought in to find a way to get an in-order, reliable message flow fast enough between the different tiers. I did my best and left. What they didn't want to listen to, and the better solution, is to give a lot of thought to related Things, Interactions, and Services and bundle them together into "tierable" components. The interactions within these "chunks" would be local and would then inflict a whole lot fewer messages on the system. In our example it makes sense to bundle the four components (sprinkler etc.) into a single tier, and possibly the same process, and increase the overall performance significantly while also giving us more cohesive boundaries.
[Figure: related Things, Interactions, and Services bundled into a single tier]
(As an aside, I'll just mention that I ran into someone who is part of this particular project a few days ago. They are still struggling with performance and stability problems...)
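To put rough numbers on the bundling argument, here is a small Python sketch that counts how many messages have to cross a tier boundary under two placements of the same components. The component names, the message list, and both placements are hypothetical; in particular, I am assuming the original design put each kind of component on its own tier, and the `dashboard` component is invented just to have something outside the bundle.

```python
def cross_tier_messages(placement, messages):
    """Count messages whose sender and receiver live on different tiers."""
    return sum(1 for src, dst in messages if placement[src] != placement[dst])


# Messages triggered by one status change in the soil sensor (made up).
messages = [
    ("soil_sensor", "irrigation_rule"),
    ("weather_service", "irrigation_rule"),
    ("irrigation_rule", "sprinkler"),
    ("sprinkler", "irrigation_rule"),  # status confirmation
    ("irrigation_rule", "dashboard"),
]

# Placement by kind: Things, Interactions, and Services on separate tiers.
by_kind = {
    "soil_sensor": "things",
    "sprinkler": "things",
    "irrigation_rule": "interactions",
    "weather_service": "services",
    "dashboard": "ui",
}

# Placement by cohesion: the related components share one tier.
by_cohesion = {
    "soil_sensor": "irrigation",
    "sprinkler": "irrigation",
    "irrigation_rule": "irrigation",
    "weather_service": "irrigation",
    "dashboard": "ui",
}

print(cross_tier_messages(by_kind, messages))
print(cross_tier_messages(by_cohesion, messages))
```

With these made-up numbers, every message crosses a boundary in the first placement and only the one to the dashboard does in the second. The messages themselves did not change; only where the boundary sits did.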
Anyway, one could argue that frameworks like Volta would allow you to move from the bad partitioning to the good one more easily -- but this is not really so, since when you rearrange the components you also have to remodel the messages that flow between the new partitions. Also, this is not to say that having the ability to run a system in local and in distributed modes does not have value. As I said in the previous post, it is the assumption that you can easily move this boundary and still get a viable solution that is wrong. Also, if you are going to allow running in local and distributed modes, that doesn't have to spell the "dark magic" of MSIL rewrites and compilations.
In another (SOA) project we designed services so that in a small-scale installation you would be able to instantiate services in the same process. Services were constructed as Active Services (i.e., they have at least one thread of control). If you wanted to let two services run in the same process you just had to write a new ServiceHost and a new ServiceBus. The new ServiceHost has to provide each service with its own thread or thread pool, and the ServiceBus has to work in-memory by passing message objects around rather than serializing/deserializing and sending them over the network. On a small installation this works better than multiple processes (but not as well as a system designed specifically to run on a single tier). Note that this is the opposite of what Volta does: it takes a distributed solution and lets it run locally, rather than the other way around.
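The idea can be sketched in a few lines of Python. This is not the code from that project; the class names `InMemoryServiceBus` and `InProcessServiceHost`, the handler signature, and the two toy services are assumptions for illustration. The host gives each service its own thread, and the bus hands the message object itself to the receiver's inbox with no serialization.

```python
import queue
import threading


class InMemoryServiceBus:
    """Passes message objects between services without serializing them."""

    def __init__(self):
        self._inboxes = {}

    def register(self, service_name):
        inbox = queue.Queue()
        self._inboxes[service_name] = inbox
        return inbox

    def send(self, to, message):
        self._inboxes[to].put(message)  # the same object, no copy


class InProcessServiceHost:
    """Runs each service on its own thread inside a single process."""

    _STOP = object()  # sentinel that tells a service loop to exit

    def __init__(self, bus):
        self._bus = bus
        self._running = []  # (thread, inbox) pairs

    def start(self, name, handler):
        inbox = self._bus.register(name)

        def loop():
            while True:
                message = inbox.get()
                if message is self._STOP:
                    break
                handler(message, self._bus)

        thread = threading.Thread(target=loop, name=name, daemon=True)
        thread.start()
        self._running.append((thread, inbox))

    def stop(self):
        for _thread, inbox in self._running:
            inbox.put(self._STOP)
        for thread, _inbox in self._running:
            thread.join()


results = queue.Queue()
bus = InMemoryServiceBus()
host = InProcessServiceHost(bus)
# Two toy services: "pricing" answers requests, "orders" collects the replies.
host.start("pricing", lambda msg, b: b.send("orders", {"item": msg["item"], "price": 10}))
host.start("orders", lambda msg, b: results.put(msg))
bus.send("pricing", {"item": "widget"})
print(results.get(timeout=5))
host.stop()
```

The services only ever see a handler and a bus, so swapping this host and bus for networked ones would not touch the service code. The services were still designed as if every message crossed a boundary, which is why moving them together is the cheap direction.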
The other part of Volta is the C# to JavaScript cross compiler. This may have a future -- but it really depends on the attention Microsoft will put into this direction. Google does something similar on its Android mobile platform, where it takes Java bytecode and translates it into bytecode for the Dalvik VM. But for Google, that's a strategic platform. With Microsoft's investments in Silverlight (which I personally prefer), I would guess the effort would always lag behind (though I hope they'd get it to be better than it is today).
Posted by Arnon Rotem-Gal-Oz at 04:28 PM Permalink
|
December 07, 2007
Microsoft Volta: Oh My, Oh My
Microsoft uses the "live labs" to release all sorts of test balloons. Sometimes we get really nifty stuff like Photosynth or SeaDragon. Unfortunately, sometimes we get stupid not-so-bright ideas like Volta.
What is Volta? Here's what the project's homepage has to say (emphasis mine):
The Volta technology preview is a developer toolset that enables you to build multi-tier web applications by applying familiar techniques and patterns. First, design and build your application as a .NET client application, then assign the portions of the application to run on the server and the client tiers late in the development process. The compiler creates cross-browser JavaScript for the client tier, web services for the server tier, and communication, serialization, synchronization, security, and other boilerplate code to tie the tiers together.
Developers can target either web browsers or the CLR as clients and Volta handles the complexities of tier-splitting for you. Volta comprises tools such as end-to-end profiling to make architectural refactoring and optimization simple and quick. In effect, Volta offers a best-effort experience in multiple environments without any changes to the application.
The idea sounds very compelling. So what's the problem?
The first issue is that, as a platform/framework (MS would say factory), Volta tries to accomplish too much. On one hand, Volta is another go at the web/desktop convergence trend. On the other hand, it is supposed to be a solution for "painless" tier-splitting. Both of these tasks are very heavy. My opinion is that the Single Responsibility Principle (while originally defined for objects) applies here, and Volta should choose one thing and try to excel in that.
What's more disturbing to me is the automatic handling of the "complexities of tier-splitting". Here's another excerpt from the Volta site which further explains the "tier-splitting" concept:
Objective
We have an application that runs in a monolithic environment, say the browser. We want parts of this application to run in other environments, such as servers. We don’t want to litter the application with plumbing code.
Rationale
The standard techniques for distributed applications infuse our code everywhere with information about what parts run where. This makes the code hard to change. Typically, once we make these decisions we can’t change them because it is too expensive. However, environments, requirements, and performance profiles change and we’re stuck with applications that can’t adapt to new realities. We need to separate the concerns about what the application does from the concerns about where parts of the application run.
Without Volta, we are forced to decide where code runs before we know everything it is going to do, in particular before we know the communication frequencies and delays. Development methodologies force us to make irreversible decisions too early in the application lifecycle. Volta gives us the means to delay decisions until we have adequate information to base them on.
Recipe
Volta tier splitting automates the creation of the communication plumbing code, serialization, and remoting. Simply mark classes or methods with a custom attribute that tells the Volta compiler where they should run. Unmarked classes and methods continue to run on the client.
We may base our decisions about tier assignment on any criteria we like, such as performance or location of critical assets and capabilities. Because Volta automates boilerplate code and processes for dispersing code, it is easy for us to experiment with and change assignments of classes and methods to tiers.
Wow! Agile development at its best, allowing us to postpone architectural decisions. That just sounds too good to be true. Well, the problem is that it is too good to be true. Abstracting the network away and providing location transparency without thinking about the implications of distribution is the reason "distributed objects" failed. For example, here is what Harry Pierson (DevHawk) had to say about distributed objects:
...back in 2003, mainstream platforms typically used a distributed object approach to building distributed apps. Distributed objects were widely implemented and fairly well understood. You created an object like normal, but the underlying platform would create the actual object on a remote machine. You'd call functions on your local proxy and the platform would marshal the call across the network to the real object. The network hop would still be there, but the platform abstracted away the mechanics of making it. Examples of distributed object platforms include CORBA via IOR, Java RMI, COM via DCOM and .NET Remoting.
The (now well-documented and understood) problem with this approach is that distributed objects can't be designed like other objects. For performance reasons, distributed objects have to have what Martin Fowler called a "coarse-grained interface", a design which sacrifices flexibility and extensibility in return for minimizing the number of cross-network calls. Because the network overhead can't be abstracted away, distributed objects are a very leaky abstraction.
So here comes Volta and tells us to just put a [RunAtOrigin] attribute on the code you want on another tier, and if you don't like that you can change it to another place in your application, and what not. Note that the notion that you can automate some or maybe even all of the distribution "boilerplate" code may be viable. The problem is in the premise that you can seamlessly move that boundary around. There's a fundamental difference between tiers and layers. Tiers should be treated as a boundary. Volta's designers do talk about security, but they seem to forget a few of the other fallacies of distributed computing...
Posted by Arnon Rotem-Gal-Oz at 07:09 AM Permalink
|
December 05, 2007
Who Needs an Architect Anyway? Part III: The Other Responsibilities
In the previous installment, I talked about the architect and architectural decisions. I also said (okay, "wrote") that architects do more than that. Well, here are a few of the duties I think architects should have (sometimes not exclusively).
Project CTO. Tom Berray has an excellent paper describing four models for the role of a CTO. Three of them can be applied to software architects (within their projects):
- "Big Thinker". This is somewhat akin to the role discussed in the previous post.
- "External Facing Technologist". I have usually seen this in larger projects, but it is also applicable to smaller ones. There are many occasions where the technical capabilities of the project have to be presented to and/or negotiated with external stakeholders. Architects are in a good position to perform this, as they should have a good understanding of both the business and the technology. Additionally, making architectural decisions already requires the architect to understand the different stakeholders' needs.
- The third model is called "Technology Visionary and Operations Manager" -- making sure that technology works to deliver business goals -- but how is that done?
In their book Organizational Patterns of Agile Software Development, Jim Coplien and Neil Harrison talk about the Quattro Pro for Windows (QPW) development team. According to the case study, Borland had a team of four architects who worked together to produce what the authors call "prototypes"*. Six months later these architects were joined by additional developers to produce the product. During the development the architects kept meeting on a daily basis to coordinate their efforts (sort of like a daily stand-up in a Scrum of Scrums).
The situation in the QPW team is probably close to the ideal architect involvement in a project -- coding architects who work closely with the team while driving technical and architectural decisions. The availability of multiple architects (but few enough to prevent the "design by committee" effect) also enhances the overall quality of the solution.
Another aspect of the architect's work is to act as a coach/tutor. It isn't enough for the architect to "know best". We already know that architects must also be able to explain the reasoning behind their recommendations/decisions, but that's just part of the story. Helping other team members get better at what they do means that they'd be able to do their job better, come up with their own ideas (and bring more fresh ideas into the discussions), and produce better software. Since the architect is ultimately responsible for the quality of the solution, making others perform better should be a top priority for the architect. Being considered a source of knowledge will help an architect perform his/her role, even when they don't have an architect title.
*Actually, what they did were POCs or spikes (see Architecture Evaluation In Code for an explanation of the differences).
Posted by Arnon Rotem-Gal-Oz at 04:21 PM Permalink