Architecture & Design
IF YOU BUILD IT

... Will they Come?

by Arnon Rotem-Gal-Oz

June 2007


June 28, 2007

Events and Temporal Coupling


You raise an event when something interesting happens. You think it is important, but you don't care enough to know who is interested. You are even less interested in personally going to each and every interested party and letting them know. So instead, you raise an event -- and let the poor buggers take care of any implications by themselves. We raise the event "now" when the change happened. It is only important now anyway....

Looking from the "poor buggers" point of view (the event consumer), things are more complicated. There are events which are cyclic in nature (like stock price updates, the blips from a sonar etc.) where if you miss a blip, then it isn't really important. You'll get the right information in the next update (actually, that's not always true; keep reading). Then there are the events which only occur once. Sometimes it isn't important for you to hear them if you are not up and running at the time they are raised. Other times you can't afford to lose an event; for instance, if your ordering service (or component for that matter) communicates with the invoicing service using events, missing the event of a new order means you lose money.

This basically means that the event producer and the event consumer are coupled in time. Which means that you need to make sure both of these services are available at the same time -- if the invoicing service has crashed, then processing orders should be suspended. This doesn't mean that you don't accept orders, just that you don't process them.

Okay, maybe we can just raise the event "transactionally". This would probably work, but you need to remember that the event producer doesn't really care about the event consumers. Why would it want to fail because of them?!

Maybe a better way would be to "raise" the event over some reliable transport. But this has a few problems. For one thing, we've passed the problem to the connection between the event producer and the transport. It might be acceptable to have a transaction between the event producer and the transport. But as I've already said, the producer doesn't care much about the consumers.

We can have persistent subscriptions for existing consumers to prevent events from getting lost, but this creates a minor problem: new consumers can't see past events. It also carries the risk that an existing subscriber disappears and its queue then grows endlessly (or until an administrator removes the subscription).

Okay, let's look at the problem from a different perspective. Looking at the events, what we can really see is that an event has a time-to-live (TTL) as far as the event consumer is concerned. For instance, in the case of the cyclic events, the TTL is the interval until the next event. Actually, even with cyclic events the TTL might be larger if we are also interested in analyzing trends or abnormal occurrences (which is why I said it isn't entirely true we don't care about old events). In the case of one-time events, the TTL might be indefinite, or even then it might be some definite value (one day, week, year etc.). Since the producer can't know the TTL each consumer needs, it can be a good idea to make past events available somehow.

Thus, when you design an event-centric architecture like EDA (whether on top of SOA or not), it is important to think about event consumers. We don't want to think about specific consumers, since that negates the benefits of thinking in events. However, I would say that you want to think about event consumers in general; after all, your component is also an event consumer ("do unto others as you would have them do unto you").

One option which I already talked about is to make past events available as a feed. Event consumers can then come at their own leisure and consume past events (this can be in addition to raising the events in real-time). This provides a partial solution, as the maximal TTL is determined by the event producer (after which the event is deleted from the feed). This may be acceptable, but you must be aware of it.

The other option is to log all the events and provide an API to retrieve past events. In a sense the maximum TTL is still in the hands of the event producer, but if you use a database it will probably be much longer than with a feed. Alternatively, the events can be logged by a central "always present" event aggregator (in a manner similar to the aggregated reporting pattern I described for SOA).
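To make the log-plus-API option a bit more concrete, here is a minimal in-memory sketch. Everything in it (the `EventLog` class, `raise_event`, `events_since`, the sequence numbers and the retention period) is hypothetical and only illustrates the idea; a real producer would most likely keep the log in a database.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class Event:
    seq: int          # position in the log, assigned by the producer
    timestamp: float  # when the event was raised, in seconds
    payload: Any


class EventLog:
    """Hypothetical producer-side log with a fixed retention (the max TTL)."""

    def __init__(self, retention_seconds: float) -> None:
        self.retention_seconds = retention_seconds
        self._events: List[Event] = []
        self._next_seq = 1

    def raise_event(self, payload: Any, now: float) -> Event:
        # Record the event; real-time delivery could happen here as well.
        event = Event(self._next_seq, now, payload)
        self._next_seq += 1
        self._events.append(event)
        return event

    def expire(self, now: float) -> None:
        # Drop events that are older than the producer's maximum TTL.
        cutoff = now - self.retention_seconds
        self._events = [e for e in self._events if e.timestamp >= cutoff]

    def events_since(self, last_seen_seq: int) -> List[Event]:
        # A consumer passes the last sequence number it processed
        # and gets back everything it missed that is still retained.
        return [e for e in self._events if e.seq > last_seen_seq]
```

A consumer that was down for a while would call `events_since` with the last sequence number it handled. Anything older than the retention period is gone, which is exactly the "max TTL is in the hands of the producer" limitation.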

To sum up: Events seem to matter only at the instant they are created. We are used to that thinking from building object-oriented systems, where all the components are co-located in the same address-space and time (even there I can think of scenarios where we would want past events). In a distributed world events need to have a TTL. The TTLs can vary and are determined by the event consumers. Lastly, as I demonstrated in the previous paragraphs, there are several strategies we can use to help solve the event TTL dilemma (and there are probably a few others).

Posted by Arnon Rotem-Gal-Oz at 10:24 AM


June 23, 2007

BI and SOA: A Question


A few months ago I wrote here about solving the mismatch between Service Oriented Architecture (SOA) and Business Intelligence (BI).

I recently got this question from Ben:

One major question I have is around large data sets. As an experienced BI/DW architect and developer I have worked on a number of large scale data warehouses. Retrieving large data sets (i.e. millions of records) doesn't seem to fit well into SOA. As you state in your article, we could have another point-to-point interface, where the service which houses data we need gets a request and writes out a batch file (xml or plain ascii text). Then using typical ETL, we grab the file and load it. The underlying source system (service) can use optimization in generating a large data set (vs. record by record) and the data warehouse can correspondingly load in bulk.

Like most architectural questions, the answer is "it depends". For instance, if you do a run-of-the-mill ETL as a one-time setup, then it is just that -- a one-time setup, and I don't see any contradiction between it and the goals or tenets of SOA.

I do think that it is better to enhance SOA with EDA interactions to provide a long-term solution to the BI problem. You can also have a dedicated component that aggregates the information that flows in with these events and builds batch files suited for the ETL you used during the setup phase (mentioned above).
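As a rough illustration of such a dedicated component, the sketch below buffers incoming events and turns them into CSV batches that a bulk loader could pick up. The `BatchingAggregator` name, the field names and the batch size are all assumptions made for the example, and the batches are kept in memory where a real component would write files.

```python
import csv
import io
from typing import Dict, List


class BatchingAggregator:
    """Hypothetical component: buffers events and emits CSV batches for ETL."""

    def __init__(self, fieldnames: List[str], batch_size: int) -> None:
        self.fieldnames = fieldnames
        self.batch_size = batch_size
        self._buffer: List[Dict[str, object]] = []
        self.batches: List[str] = []  # stands in for files handed to the ETL

    def on_event(self, event: Dict[str, object]) -> None:
        # Called for every event that flows in; flush once a batch is full.
        self._buffer.append(event)
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        # Write the buffered events as one CSV document with a header row.
        if not self._buffer:
            return
        out = io.StringIO()
        writer = csv.DictWriter(out, fieldnames=self.fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(self._buffer)
        self.batches.append(out.getvalue())
        self._buffer = []
```

The point of the design is that the data warehouse still loads in bulk, while the services only ever publish events and never learn about the ETL.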

It is true though that moving an already-in-place SOA to EDA is not a small feat, but adding EDA layers does not have to mean that the old interfaces go away -- especially not immediately (remember to treat services as products).

If you have a business that generates millions of records on a daily basis, then the situation is more complicated. Now you have to think about the trade-offs between "compromising" SOA and adding a dedicated interface (or a backdoor to the database) for the ETL vs. the implications of performance, bandwidth, transition costs, ROI, etc. of pushing that information with EDA. I believe in pragmatism and the "no-silver-bullet" approach, so I can't say that EDA is always the best solution. (As an aside, this is part of the reason my book refers to "patterns," not "best-practices guidance.") You may find that ETL is the best trade-off in your situation. Yes, I know that it isn't a definitive answer, but real life is (usually) a little more complicated than black-and-white solutions. As architects we need to find the best trade-off for the situation at hand.


By the way, if you have a question regarding anything I write here or anything else related to software architecture and you want to hear what I think about it, feel free to send it to ask@rgoarchitects.com.

Posted by Arnon Rotem-Gal-Oz at 04:13 AM


June 19, 2007

Observations on REST and Contracts


I thought I had this RESTful web services thing figured out, but following one of the threads on the Yahoo group on Service-Oriented Architecture I came to the conclusion that maybe I don't.



Steve Jones tested his understanding of REST by giving an example, and that example was corrected by Anne Thomas Manes (a research director with the Burton Group, which recently stated that the future of SOA is REST).
Here are the examples from the above-mentioned thread:

POST http://example.org/customer
HTTP message body contains a representation of "anne"
server creates a subordinate resource called http://example.org/customer/anne

GET http://example.org/customer/anne
returns a representation of "anne"

GET http://example.org/customer/personByName?name=anne
returns a representation of "anne"
or perhaps returns the URI of the "anne" resource
or perhaps returns a list of URIs of all people named "anne"
might also be specified more simply as
GET http://example.org/customer?name=anne

GET http://example.org/customer/personByAge?age=27
returns a list of URIs of people whose age is 27
or perhaps returns a collection of representations of all people aged 27
might also be specified more simply as
GET http://example.org/customer?age=27

PUT http://example.org/customer/anne
HTTP message body contains a representation of "anne"
either creates a new resource called "anne" (if none exists)
or replaces the existing "anne" resource

PUT http://example.org/company/newco
HTTP message body contains a representation of "newco"
either creates a new resource called "newco" (if none exists)
or replaces the existing "newco" resource

If you prefer the server to assign the URI you would instead say

POST http://example.org/company
HTTP message body contains a representation of "newco"
server creates a subordinate resource called http://example.org/company/newco

POST http://example.org/customer/anne?addCompany=http://example.org/company/newco
this would append the newco company reference to the "anne" resource


You can see another example of what I am talking about on Jon Udell's blog, which gives an example from RESTful Web Services, by Leonard Richardson and Sam Ruby, covering how to do a transaction in RESTful style.

If all these are indeed "legal" or "correct" RESTful interactions, I have two observations to make.

First, I guess Pat Helland is right when he said "Every noun can be verbed", since I don't see the real difference between having a contract with a PersonsByAge request which returns a document* of Persons and a REST request like "GET http://example.org/customer/personByAge?age=27" or even "GET http://example.org/customer?age=27".

The second observation has to do with the so-called "uniform interface". I would argue that the resources and their attributes (age=27, name="anne") are the interface. The POST, GET etc. uniform interface does not mean much more than the "uniform" SEND, BROADCAST interface of messaging.

Furthermore, if resources and their attributes are indeed "the interface", then not only does REST not have a uniform contract -- it actually has a dynamic one which changes at run time as new resources are created -- such as the "POST http://example.org/company" which creates a new resource "http://example.org/company/newco" in the example above.
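To show what I mean by a contract that changes at run time, here is a toy in-memory sketch. It is not an HTTP server; `ResourceStore` and its methods are made up for the illustration, and I am assuming the server derives the new URI from a "name" field in the representation.

```python
from typing import Dict, List, Optional


class ResourceStore:
    """Toy stand-in for a server: maps URIs to representations."""

    def __init__(self) -> None:
        self._resources: Dict[str, dict] = {}

    def post(self, collection_uri: str, representation: dict) -> str:
        # The server assigns the URI of the new subordinate resource
        # (assumption: it uses the "name" field of the representation).
        uri = f"{collection_uri}/{representation['name']}"
        self._resources[uri] = representation
        return uri

    def put(self, uri: str, representation: dict) -> None:
        # Creates the resource if none exists, otherwise replaces it.
        self._resources[uri] = representation

    def get(self, uri: str) -> Optional[dict]:
        return self._resources.get(uri)

    def known_uris(self) -> List[str]:
        # The set of addressable resources at this moment.
        return sorted(self._resources)
```

Before the POST, `known_uris()` is empty; after it, "http://example.org/company/newco" is something a client can GET. The verbs stayed the same, but what you can address with them did not.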




* I think it is very important for SOA to have document-oriented messages and not RPC ones; I'll blog in a separate post about the differences. For now it suffices to say that the REST hypermedia notion of returning the URIs of all the relevant persons should also be present (one way or another) in a good document-oriented message, even if you are using WS-* or plain messaging as transport.

Posted by Arnon Rotem-Gal-Oz at 01:58 AM


June 13, 2007

Manning Early Access Program and My SOA Book


In addition to the drafts of selected patterns I publish on my site, you can now get my book and others (such as Ajax in Practice and SOA Security in Action, among others) via the Manning Early Access Program (MEAP).


MEAP means you can get chapter drafts as I write them and the complete book when it's done (e-book or printed). Here is Manning's explanation:

Buy now through MEAP (Manning Early Access Program) and get early access to the book, chapter by chapter, as soon as they become available. You choose the format -- PDF or ThoutReader (or both). By subscribing to MEAP chapters, you get an opportunity to participate in the most sensitive, final piece of the publishing cycle by offering feedback to the author. Reader feedback to the author is welcome in the Author Online forum. As new chapters are released, announcements are made in the MEAP Announcement Forum. After all chapters are released, you will be able to download the complete edited ebook. If you order the print edition, we will ship it to you upon release, direct from the bindery, weeks before it is widely available elsewhere.

By the way, this is probably also a good time to mention that I'll be speaking about quite a few of the patterns at Architecture & Design World 2007, which will take place this July.

Anyway, there is still a lot of work, but I'd already like to thank all the people at Manning who helped me get this far, especially Cynthia Kane, my editor (hey, maybe now she'll give me more slack :) )
Okay, 'nuff blubbering, back to completing Chapter 5....

Posted by Arnon Rotem-Gal-Oz at 04:38 AM


June 11, 2007

Agile Architecture and Documentation


Agile and documentation? What gives?

First things first. Documentation is not something that is prohibited by the Agile Manifesto. Working software is definitely preferred over "comprehensive documentation" but there can also be some value in documentation.

The first question is why we would want to document anything if we have working software. I think there are several stakeholders, like project newcomers, maintainers, etc., who will be interested in something that lets them get up to speed and gives them an overview of what's going on before they delve into code. You can read more on that in a post I wrote almost a year ago called "Who needs a software architecture document", but in essence the main motivation for documentation is that, assuming the software is successful, it will outlive the team -- the people who built the software will not be the ones who have to develop, maintain, and support it for most of the software's life.

If we agree that we need a software architecture document, the question is what to document and when.

There are two main "things" you can document in regard to architecture. The first is the obvious one -- the architecture itself. In my experience the most value can be derived from documenting recipes, i.e. how to do stuff that is common in the architecture. These recipes can be a short description of the context and then a pointer to a test (or tests) and an implementation that exercises it. You can think of the recipes as a type of tutorial to the architecture.

Other documentation-worthy elements related to the architecture are an overview and a technology mapping (including what a developer needs to install to start working). The overview allows a newcomer to understand where to find what; the technology mapping allows her to understand what she needs to learn and install to be productive. Note that to be useful the overview should be at a higher level of abstraction than the code -- otherwise you run the risk of missing the forest for the trees, or at least not saving any time.

It is obvious that documenting any of this before your architecture is more or less stable is useless. As a rule of thumb I would say this can be around the 5th-6th iteration -- assuming the team has to grow during the project. If the team stays stable for the duration of the project, this documentation can take place towards the end of the project (though I would probably add recipes to a wiki or something similar during the project as development patterns emerge).

The second "thing" you can document in regard to architecture is the options you decided against. In my opinion this is more important than all of the other items mentioned above. The reason is that, while it might take a while to understand well-written software and infer its architecture, it can be done; it is virtually impossible, however, to work out which options were disqualified by looking at the chosen solution.

Understanding the options that weren't used can save the reader time, both in understanding why things are the way they are and in not retrying things that didn't work. It can also provide clues to other options when the circumstances change (since, as we all know, requirements change...).

The best time to document options you decided not to take is when you opt not to use them -- this is when you best remember the "why". For instance, in my current project we use X.509 certificates to authenticate clients and we decided to use Kerberos tickets to authenticate components within the service. There's a reason for making that translation*. There's also a reason for making the transition by replacing the client certificate with the edge component's credentials instead of mapping the client's certificate to a Kerberos ticket using an identity provider*. We had two developers spike different options for two weeks until we came to the current solution instead of the more obvious choice of passing the X.509 certificate from the edge into the service and using the client's credentials. This question is likely to come up when/if someone else takes over the project, when the technology is updated, etc. Again, if we know why we didn't make that choice, we can better decide what to do when the circumstances change.

To sum up, there are a few architecture-related issues that are worth documenting even in agile projects. Some of them can be postponed; some of them are worth documenting a little earlier. In any event it is better to document after the fact and to keep the documentation light.


* It all has to do with limitations of WCF in regard to the transports we use (HTTP, MSMQ and TCP) and the request/reaction pattern (asynchronous communication) we use.

Posted by Arnon Rotem-Gal-Oz at 05:53 PM


June 09, 2007

Considering REST


DevHawk (Harry Pierson) raised a question a few days ago which I've been toying with for a while now: If REST is an architectural style, can it exist without the specific technologies that define it today?

Or as Harry put it:

  1. REST is an "architectural style for distributed hypermedia systems".
  2. REST "has been used to guide the design and development" of HTTP and URI.
  3. Therefore REST as an architectural style is independent of HTTP and URI.
  4. Yet I get the feeling that the REST community would consider a solution that uses the REST architectural style but not HTTP and/or URI as "not RESTful".

What I had in mind, for example, is to use messaging where the equivalent of the URI would be a topic hierarchy.

A topic hierarchy allows you to have a unique "URI" for each resource.

The next thing we need to take care of are the PUT, GET, POST, and DELETE verbs -- we can do that by making the verbs part of the message headers.

As an aside I'll also say that if we try to think about it as an architectural constraint, then we don't necessarily have to use these verbs; a more general rule would say that the verbs are uniform and well known rather than specific ones.

The rest (no pun intended) of the concerns, like specifying related states etc., can be dealt with by establishing conventions on the message formats.
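Here is a small sketch of what that might look like, just to make the idea tangible. It assumes nothing about any real messaging product: `Message`, `TopicResourceHandler`, the "verb" header and the status values are all invented for the example, and POST is left out to keep it short.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Message:
    topic: str  # plays the role of the URI, e.g. "customer/anne"
    headers: Dict[str, str] = field(default_factory=dict)
    body: Optional[dict] = None


class TopicResourceHandler:
    """Hypothetical consumer that applies uniform verbs to topic-addressed resources."""

    def __init__(self) -> None:
        self._resources: Dict[str, dict] = {}

    def handle(self, message: Message) -> Message:
        # The verb travels in a message header instead of the HTTP request line.
        verb = message.headers.get("verb")
        if verb == "PUT":
            self._resources[message.topic] = message.body or {}
            return Message(message.topic, {"status": "ok"})
        if verb == "GET":
            if message.topic in self._resources:
                return Message(message.topic, {"status": "ok"}, self._resources[message.topic])
            return Message(message.topic, {"status": "not-found"})
        if verb == "DELETE":
            existed = self._resources.pop(message.topic, None) is not None
            return Message(message.topic, {"status": "ok" if existed else "not-found"})
        return Message(message.topic, {"status": "unsupported-verb"})
```

Sending a message with topic "customer/anne" and the header verb=PUT stores the representation, and a later verb=GET on the same topic returns it, so the topic behaves like a resource identifier and the header carries the uniform verb.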

Is that still REST?! I wonder...

In any event, what worries me the most in regard to REST is the religious manner in which some people seem to treat it. By the way, that is the same phenomenon we see with some of the Agile folks. As for me? Well, I don't really care if I fit this label or the other. I am just paid to deliver working and viable software :), but hey, that's another discussion.

(For more on REST, see the article by Eric Bruno entitled SOA, Web Services, and RESTful Systems in the July 2007 issue of Dr. Dobb's Journal.)


Posted by Arnon Rotem-Gal-Oz at 07:50 AM


June 05, 2007

Treating Services as Products


Yesterday I attended an SOA governance presentation by Brent Carlson. The presentation was basically an updated version of an article he authored in 2006, "SOA Governance Best Practices: Architectural, Organizational and SDLC implications".


As a tool vendor, Brent focuses a lot on governance processes, which I don't completely agree with (I prefer Jim Coplien's organizational patterns approach; see my post from last week). I also think the reuse figures he cites (registration required) are a little optimistic for what I consider the right granularity for services.

He also made a few points that I strongly agree with:

  • Brent talked about the difference between the needs of a run-time service repository (e.g. UDDI or an ESB) and a development-time one. You need to address the services and their interactions during development, and you need to do that in a way that is easy for the development teams. For example, one thing you want to log is usage, i.e. who is using the services, since that will let you perform impact analysis when you have to make a change.
  • Building an SOA for an organization is an iterative process, not a "big-bang" effort. This means you can't do just top-down design. You need to be pragmatic and also roll out working services.

The reason for this post, however, is the insight Brent gave regarding treating services as products rather than applications.

Treating services as products is important because even if you don't believe that the SOA initiative should be an iterative process, once the move is finished you will have quite a few services deployed in your organization. These services will integrate and interact with other services -- some of which are outside of your organization. You will also want to capitalize on the flexibility claim that SOA makes and adapt your services to changing business needs.

The challenges you face regarding updating and upgrading functionality, anticipating consumers' needs, allowing consumers to get used to changes, etc. are exactly the challenges product management techniques and principles come to answer.

Treating services as products means a lot of things. Let's look at a few examples. For one, it means predictable release cycles: services, like products, get updated over time, and you want service users to be able to cope with these changes. Predictable release cycles mean they can get organized in advance. Another aspect is the emphasis on backward compatibility, e.g. orderly deprecation of features and version management. One other thing is introducing a "product manager": someone whose responsibility is to interact with customers and potential customers, understand their needs, and build a release road map for the services.

You might be used to doing some of that with applications, but thinking about services as products makes all this more explicit, and that in itself is also important.

Posted by Arnon Rotem-Gal-Oz at 07:21 AM


