IF YOU BUILD IT

... Will they Come?

by Arnon Rotem-Gal-Oz
December 03, 2006

Optimizing Serialization: Is It Worth Your While?

I’ve received a few comments on my earlier post arguing that optimizing serialization speed can be a good option for making the communication between two tiers (client-to-server or server-to-server) "better".

My claim is that in most cases optimizing things like serialization is a poor excuse for not designing the communications between tiers, services, and the like.

For one thing, networks are slower than memory and CPUs, and serialization is just one of the things you need to do. You also need encryption, signing, logging, and the like. And if you think about the whole cycle from a request to a reply, there’s also authentication, authorization, possible transformation (from external formats to internal) and what not. Is optimizing serialization really the answer to all these?

I guess that in most cases it all boils down to the problem of distributed objects. A lot of developers and architects are used to doing object-oriented design -- and for a good reason. It has proven to be a very good paradigm for developing systems -- until we got to distributed systems, where it didn’t really work well (which is why we need newer architectural styles like SOA).

Objects are not meant to be distributed. OO assumes locality. In classical OO you even control the lifetime of the objects you use (techniques like Dependency Injection allow us to lift this barrier). You assume the other object is there, so you can just call its methods, read its properties, and so on. If every method call translates to a call on the wire, then yes, you will soon find yourself optimizing stuff like serialization. But as Mr. Miyagi once said in The Karate Kid, the "best defense is no be there": don’t get into these situations in the first place.

If you consider that a tier is a boundary, then you understand that the better approach is to carefully consider what data needs to be moved back and forth. If you design what data will travel, in most cases you can use the verbose XML format rather than JSON or proprietary binary serialization. You won’t mind serialization time that much, as it is just one link in a chain that takes more time overall.
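To make the difference concrete, here is a minimal Python sketch of the two styles. Everything in it is hypothetical and for illustration only: the `Wire` class merely counts round trips instead of touching a network, and `RemoteCustomer`, `get_customer_summary`, and the sample record are invented names and data. It is not a real remoting framework, just a way to count how many times each design crosses the boundary.

```python
import xml.etree.ElementTree as ET


class Wire:
    """Hypothetical stand-in for a tier boundary: counts round trips."""

    def __init__(self):
        self.round_trips = 0

    def call(self, handler, *args):
        self.round_trips += 1
        return handler(*args)


# Hypothetical server-side data.
_CUSTOMERS = {42: {"name": "Ada", "email": "ada@example.com", "city": "London"}}


class RemoteCustomer:
    """Distributed-object style: every property read goes over the wire."""

    def __init__(self, wire, customer_id):
        self._wire = wire
        self._id = customer_id

    def _get(self, field):
        return self._wire.call(lambda: _CUSTOMERS[self._id][field])

    @property
    def name(self):
        return self._get("name")

    @property
    def email(self):
        return self._get("email")

    @property
    def city(self):
        return self._get("city")


def get_customer_summary(customer_id):
    """Message style: one designed document, serialized as plain verbose XML."""
    root = ET.Element("customer", id=str(customer_id))
    for field, value in _CUSTOMERS[customer_id].items():
        ET.SubElement(root, field).text = value
    return ET.tostring(root, encoding="unicode")


def chatty_round_trips():
    wire = Wire()
    customer = RemoteCustomer(wire, 42)
    _ = (customer.name, customer.email, customer.city)  # three wire calls
    return wire.round_trips


def chunky_round_trips():
    wire = Wire()
    doc = ET.fromstring(wire.call(get_customer_summary, 42))  # one wire call
    fields = {child.tag: child.text for child in doc}
    return wire.round_trips, fields
```

In this sketch the object-style client crosses the boundary once per property, while the message-style client crosses it once and gets the same data. With real network latency on each crossing, the number of crossings is likely to matter far more than how fast each payload is serialized.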

One of the biggest advantages of SOA (as I see it) is that it explicitly makes you think about the distribution aspects of the system (vs. OO which doesn’t).

If you are building a simple solution, then it might not be worth your while to build a scalable distributed system. You may not need to integrate your solution with other systems, so you can serialize datasets, expose internal structure, or do anything else that would otherwise inhibit future system growth. But in this case you probably won’t need to optimize serialization either, since you have modest requirements anyway.

Granted, there may be extreme cases where it just isn’t enough to think about what data should be transferred. In one system we built we had to transmit data through something like a 1200-baud modem, and no matter how much we tried to minimize the traffic, eventually we had to find some smart way to compress the data. But even in this case the bottleneck was the amount of data we could push over the wire in a given time, and not the time it takes to prepare the message.
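A rough back-of-the-envelope sketch of why the wire dominates on a link like that. The assumptions are mine, not measurements from the system described: one bit per symbol, 10 bits on the line per byte (8 data bits plus start and stop bits), and an invented, highly repetitive sample message compressed with Python's standard `zlib` module.

```python
import zlib


def transmit_seconds(payload_bytes, baud=1200, bits_per_byte=10):
    """Seconds needed to push a payload over a serial link.

    Assumes one bit per symbol and 10 line bits per byte (8N1 framing);
    both are illustrative assumptions.
    """
    return payload_bytes * bits_per_byte / baud


# Hypothetical, highly repetitive message (not data from the real system).
message = b"<reading sensor='A' value='17'/>" * 100
compressed = zlib.compress(message, 9)

raw_time = transmit_seconds(len(message))
packed_time = transmit_seconds(len(compressed))

print(f"raw:        {len(message)} bytes, {raw_time:.1f} s on the wire")
print(f"compressed: {len(compressed)} bytes, {packed_time:.1f} s on the wire")
```

Under these assumptions the link moves about 120 bytes per second, so every byte removed from the message saves several milliseconds of transfer time. Preparing the message is unlikely to cost anything close to that.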

So in my opinion, optimizing serialization is a good answer -- but to the wrong question.

Posted by Arnon Rotem-Gal-Oz at 02:05 PM