December 21, 2006
SOA and BI Impedance Mismatch
Service-Oriented Architecture is about autonomous, loosely coupled components. These traits give you lots of benefits, but they also mean that services have internal data: data that you don't want to expose to the outside, because exposing it would decrease autonomy and increase coupling. This is why services expose data and processes only via contracts and never via their internal structure.
That is all fine until you start to think about business intelligence. The cornerstone of any business intelligence initiative is gathering and consolidating data from all over the place. Once you have the data, you can use tools to analyze it, data mine it, slice and dice it, aggregate it, and whatnot. Traditionally BI builds on ETL (Extract, Transform, Load), which goes directly to the databases of the source systems involved.
And here lies the problem: on the one hand we have services that want to keep their data private, and on the other we have a data mart or warehouse that wants that data badly.
- If you go with traditional ETL, you introduce coupling into your service.
- If you rely only on contracts that were constructed for business processes, you may be missing out on important data.
- If you build a specific contract that exposes "all" the data, you are back at point-to-point integration -- and solving point-to-point integration is one of the reasons we want SOA in the first place.
The second option seems to be the most reasonable choice of the three -- but it also has several problems. One problem is that the BI system needs to know about all the contracts. The second was already mentioned -- important data might be missing. The third problem is that the BI system needs to fetch data from the services, which means it may miss changes that happen in the intervals between requests. On the other hand, requests that are too frequent can congest your network.
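To make the third problem concrete, here is a minimal sketch of a BI system polling a service contract. Everything in it is hypothetical and for illustration only: the `OrderService` class, its `get_status` contract, the status values, and the polling interval are assumptions, not part of any real system.

```python
class OrderService:
    """Hypothetical service: its state is private, only get_status() is public."""

    def __init__(self):
        self._status = "new"
        # Kept only so the simulation can compare reality with what BI saw.
        self.actual_history = ["new"]

    def set_status(self, status):
        # Internal business operation; the BI system never sees this call.
        self._status = status
        self.actual_history.append(status)

    def get_status(self):
        # The contract: returns the current state only, not the history.
        return self._status


def poll_every(events, interval):
    """Replay events against the service, polling after every `interval` events."""
    service = OrderService()
    observed = [service.get_status()]
    for i, status in enumerate(events, start=1):
        service.set_status(status)
        if i % interval == 0:
            observed.append(service.get_status())
    return service.actual_history, observed


events = ["paid", "cancelled", "reopened", "shipped"]
actual, observed = poll_every(events, interval=2)

# States that existed in the service but never reached the BI system.
missed = [s for s in actual if s not in observed]
print(missed)
```

In this sketch, polling after every second change means the BI system never learns that the order was paid or reopened. Polling after every change (`interval=1`) captures everything, but in a real deployment that is exactly the request volume that risks congesting the network.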
Clearly we need a fourth option -- and I'll talk about that option next time.
Posted by Arnon Rotem-Gal-Oz at 03:48 PM