Distributed Data Retrieval Protocol

Portal Reference Implementation Design



This document describes the reference implementation of the Portal Component of the Distributed Data Retrieval Protocol (DDRP). Please refer to the Requirements Document for an overview of the entire system. The Portal Component is an application that communicates with multiple providers and performs operations to retrieve and integrate data. The reference implementation of the portal follows the protocol. It is a set of classes built to interact with distributed providers, and clients access it through a set of well-defined API calls.

Object Model

Class Diagram


PortalServices is the entry point into the portal. All "clients" interface with the portal via API calls made to PortalServices or via streamed HTTP requests. It can be thought of as the main class of the portal component. During construction, the following things will occur:
The configuration object containing required values. The configuration object is generally loaded from a configuration file.
RegistryAccess is the API into the provider registry. Under the current assumptions, it will be a wrapper around UDDI SOAP requests (i.e., the caller makes API calls that are converted internally to the appropriate SOAP requests).
Manages provider data obtained from the registry (via RegistryAccess) and from the providers directly (i.e. metadata information).
Provider is a data object (a bean) representing an individual datasource. Note that the requirements allow each physical provider to host many databases, whereas a Provider object relates to one database only; an array of databases within the Provider object is not supported. Perhaps this object should be called Datasource to avoid confusion. Accessors and modifiers are provided for each of the above attributes but are omitted here for brevity.
Data object representing the metadata associated with an individual datasource. This object encapsulates a particular provider's classification and offerings. The metadata requirements have yet to be defined. Accessors and modifiers are provided for each of the above attributes but are omitted for brevity. An assumption is made that a class of constants (or constant keys) will exist for type data such as supportedOperations. Acceptable or valid values for such type data will be specified in the protocol.
The individual processor of a request. The PortalRequestHandler is the basic workflow component. It pares down the available providers based on their offerings, marshals the request into protocol-compliant XML, submits the request to the various providers on separate threads, collects the responses, and unmarshals them. Not all of this work is done in this class alone; rather, the PortalRequestHandler acts as a controller for these processes. It will be a runnable object, since the Portal must be able to handle any number of requests at a given time. Pooling of this object will likely be implemented in order to manage resources.
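As a rough illustration of the controller role described above, the following Java sketch strings the steps together: filter the providers, marshal the request, submit it to each remaining provider on its own thread, then collect and merge the responses. Only the name PortalRequestHandler comes from this document. The constructor arguments, the string-based request and response types, and the per-request thread pool are hypothetical placeholders, and the sketch uses Callable (a close relative of Runnable that can return a result) rather than claiming that is what the design intends.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch: the collaborators are passed in as functions so the
// controller itself contains only workflow, not filtering or marshalling logic.
class PortalRequestHandler implements Callable<String> {
    private final String request;                                // client request (placeholder type)
    private final List<String> providers;                        // candidate provider endpoints
    private final Predicate<String> providerFilter;              // stands in for a provider filter
    private final Function<String, String> requestMarshaller;    // request -> protocol XML
    private final BiFunction<String, String, String> transport;  // (provider, xml) -> response

    PortalRequestHandler(String request, List<String> providers,
                         Predicate<String> providerFilter,
                         Function<String, String> requestMarshaller,
                         BiFunction<String, String, String> transport) {
        this.request = request;
        this.providers = providers;
        this.providerFilter = providerFilter;
        this.requestMarshaller = requestMarshaller;
        this.transport = transport;
    }

    @Override
    public String call() {
        // 1. Pare down the providers to those able to serve the request.
        List<String> selected = new ArrayList<>();
        for (String provider : providers) {
            if (providerFilter.test(provider)) selected.add(provider);
        }
        if (selected.isEmpty()) return "";

        // 2. Marshal the request once; every provider receives the same XML.
        String xml = requestMarshaller.apply(request);

        // 3. Submit one task per selected provider.
        ExecutorService pool = Executors.newFixedThreadPool(selected.size());
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String provider : selected) {
                pending.add(pool.submit(() -> transport.apply(provider, xml)));
            }
            // 4. Collect responses in submission order and append them.
            StringBuilder merged = new StringBuilder();
            for (Future<String> f : pending) merged.append(f.get());
            return merged.toString();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for providers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a provider request failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Passing the filter, marshaller and transport in from outside keeps the handler a pure controller, which matches the description above; a real implementation would presumably substitute the portal's own classes for these function types.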
A worker thread controlled by a PortalRequestHandler. One instance of this thread is created per request to a single database of a provider. It streams the request to the provider and awaits the response. Some pooling could occur here, or at least a maximum limit could be placed on the number of threads spawned at one time.
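One possible way to cap the number of worker threads is a fixed-size thread pool. The Java sketch below is only an illustration under stated assumptions: the names ProviderRequestTask and BoundedDispatcher, the send function that stands in for the HTTP exchange, and the static counters (added purely so the concurrency limit can be observed) are all hypothetical and do not appear in the design.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical worker: one instance per request to a single provider database.
class ProviderRequestTask implements Callable<String> {
    static final AtomicInteger active = new AtomicInteger(); // workers running right now
    static final AtomicInteger peak = new AtomicInteger();   // highest concurrency observed

    private final String endpoint;
    private final Function<String, String> send; // stands in for the HTTP exchange

    ProviderRequestTask(String endpoint, Function<String, String> send) {
        this.endpoint = endpoint;
        this.send = send;
    }

    @Override
    public String call() {
        int now = active.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        try {
            return send.apply(endpoint); // send the request and wait for the response
        } finally {
            active.decrementAndGet();
        }
    }
}

class BoundedDispatcher {
    // Runs one task per endpoint, with at most maxThreads executing at once.
    static List<String> dispatch(List<String> endpoints,
                                 Function<String, String> send,
                                 int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String endpoint : endpoints) {
                tasks.add(new ProviderRequestTask(endpoint, send));
            }
            List<String> responses = new ArrayList<>();
            // invokeAll returns futures in the same order as the submitted tasks.
            for (Future<String> f : pool.invokeAll(tasks)) responses.add(f.get());
            return responses;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while dispatching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a provider request failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With a fixed pool, excess tasks wait in the pool's queue instead of spawning new threads, so the limit holds regardless of how many providers a request fans out to.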
An interface specifying the methods that a provider filter is required to implement. The interface abstracts the filter component so that any number of filters, based on varying schemas and rulesets, may be implemented.
An implementation of ProviderFilterer based on the Darwin Core v.2.0 Federation Schema.
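To make the filter abstraction concrete, here is a minimal Java sketch of what such an interface and one implementation might look like. Only the name ProviderFilterer is taken from this document. The method signature, the ProviderInfo stand-in for provider metadata, and the OperationFilterer rule (keep providers that advertise the requested operation) are assumptions for illustration; the sketch does not attempt to reproduce the rules of the Darwin Core schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Minimal stand-in for provider metadata; the fields are assumptions.
class ProviderInfo {
    final String name;
    final Set<String> supportedOperations;

    ProviderInfo(String name, String... operations) {
        this.name = name;
        this.supportedOperations = new HashSet<>(Arrays.asList(operations));
    }
}

// Hypothetical shape of the filter interface; the actual method set is not
// specified in this document.
interface ProviderFilterer {
    List<ProviderInfo> filter(List<ProviderInfo> candidates, String requestedOperation);
}

// Example rule set: keep only providers that advertise the requested operation.
class OperationFilterer implements ProviderFilterer {
    @Override
    public List<ProviderInfo> filter(List<ProviderInfo> candidates, String requestedOperation) {
        List<ProviderInfo> kept = new ArrayList<>();
        for (ProviderInfo provider : candidates) {
            if (provider.supportedOperations.contains(requestedOperation)) {
                kept.add(provider);
            }
        }
        return kept;
    }
}
```

Because callers depend only on the interface, a schema-specific filter can be swapped in without changes to the code that invokes it.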
Simply translates requests in one form into another form. This class could possibly be static.
Simply translates responses in one form into another form. Largely, ResponseMarshaller may just append numerous responses together for streaming back to the caller. This class could possibly be static.
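The append-and-stream behaviour described for ResponseMarshaller could be sketched in Java as a class with only static methods. The method names, the use of strings for individual responses, and the wrapping responses element are assumptions made for this example, not part of the protocol.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.List;

// Hypothetical static-only marshaller: no instance state is needed.
final class ResponseMarshaller {
    private ResponseMarshaller() {}

    // Streams each provider response, in order, inside an assumed wrapper element.
    static void marshal(List<String> providerResponses, Writer out) throws IOException {
        out.write("<responses>");
        for (String response : providerResponses) {
            out.write(response);
        }
        out.write("</responses>");
        out.flush();
    }

    // Convenience overload that collects the combined output into a string.
    static String marshal(List<String> providerResponses) {
        StringWriter out = new StringWriter();
        try {
            marshal(providerResponses, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

Writing to a Writer rather than building one large string lets the portal begin streaming to the caller before every response has been appended.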


Some additional design considerations: