October 10, 2007
Grid-Enabling Resource-Intensive ApplicationsArchitectures
Distributed programming has several basic common architectures: client-server, peer-to-peer, distributed objects, clustering, and grid computing. These architectures are not all mutually exclusive, and particular implementations may be combinations of these architectures.
Communication Strategies
There are three common strategies for implementing a communications scheme in client-server architectures (Table 2). These are presented in the order of oldest and most standard to newest and most modular. Berkeley sockets (or just "sockets") are the earliest version we discuss, followed by RPC, and then RMI. The latter two strategies are both based upon sockets yet remove some of the management overhead normally required by sockets.
Table 2: Communication strategies.
Sockets are defined as endpoints for communication, with one socket at each endpoint of the communications channel. A socket works with Internet Protocol (IP) and either Transmission Control Protocol (TCP) or UDP (User Datagram Protocol). The difference between TCP and UDP is that TCP provides error checking and is bidirectional. UDP is unidirectional and provides no error checking. In this project, we needed TCP because it provides feedback that a connection has actually been established. With sockets the client has to manage the socket connection by opening it, closing it, and establishing an input stream to read from the socket. The server employs a "listener" to monitor a specific port. When a client sends a request for a connection, the server will accept the request and the connection will be established. With sockets each connection is unique. The client needs to know the address of the server, but the server does not need to know the address of the client prior to the connection being established. Once a connection is established, both sides can send and receive information.
The main disadvantage of plain sockets is the overhead of creating and managing the connections. There are newer communications strategies such as RPC and RMI that simplify these connection maintenance details.
When a process calls a procedure on a remote application in a distributed system, it is known as an RPC. With RPCs, an ordinary data structure is used in passing data to a remote procedure. The essential concept of RPC is hiding all the network code within "stub" procedures. The goal of RPC is to simplify writing distributed applications by making the networking code transparent. With RPC, a daemon or "stub" runs on a port of a remote system and listens for messages addressed to it. These stubs locate the appropriate port on the client or server and package the parameters into a form that can be transmitted over the network. This is known as "marshaling." The main drawbacks of RPC are that in passing parameters, true pass-by-reference is not permitted, it is difficult to send and make sense of complex data types such as user-defined types, exception handling is more difficult, and there can be issues with data representation (www.wikipedia.org/wiki/Remote_procedure_call). There is a selection of standardized RPC systems available such as Microsoft's .NET Remoting and Java's Remote Method Invocation.
For reasons of portability and tool availability, we decided to use RMI. With RMI, objects are passed using remote method calls known as "callbacks" instead of data structures. RMI simplifies the design of the client because all it does is get a proxy for the remote object from entries in the RMI registry. It then simply calls the method in the same way it would call a local method. Since RMI is Java specific, it doesn't provide a direct connection between objects implemented in different languages. Using Java's Native Interface (JNI) API, however, it is possible to wrap existing C or C++ code such as the CodeMatch DLL with a Java interface, then export this interface remotely through RMI (see Figure 1). The JNI lets Java code that runs inside a Java Virtual Machine (JVM) interoperate with applications and libraries written in other programming languages.
[Click image to view at full size]
Figure 1: CodeGrid network overview.
|
|
|||||||||||||||||||||||||||||||||||||||||||||
|
|
|
|