Distributed Data Retrieval Protocol

Portal Reference Implementation Design

Introduction

This document describes the reference implementation of the Portal Component of the Distributed Data Retrieval Protocol (DDRP). Please refer to the Requirements Document for an overview of the entire system. The Portal Component is an application that communicates with multiple providers and performs operations to retrieve and integrate data. The reference implementation follows the protocol: it is a set of classes built to interact with distributed providers, and clients access it through a set of well-defined API calls.

Object Model

Class Diagram

Objects

PortalServices
The entry point into the portal. All "clients" interface with the portal via API calls made to PortalServices or via streamed HTTP requests. This can be thought of as the main class of the portal component. During construction, the following things will occur:
PortalConfig
The configuration object containing required values. The configuration object is generally loaded from a configuration file.
RegistryAccess
The API into the provider registry. Under current assumptions, this will be a wrapper around UDDI SOAP requests; that is, the caller makes API calls that are converted internally to the appropriate SOAP requests.
ProviderCache
Manages provider data obtained from the registry (via RegistryAccess) and from the providers directly (i.e., metadata).
Provider
A data object (bean) representing an individual datasource. Note that the requirements allow each physical provider to host many databases; a Provider here corresponds to exactly one database, and the Provider object does not hold an array of databases. Perhaps this object should be called Datasource to avoid confusion. Accessors and modifiers are provided for each of the above attributes but are omitted here for brevity.
Metadata
A data object representing the metadata associated with an individual datasource. This object encapsulates a particular provider's classification and offerings. The metadata requirements have yet to be defined. Accessors and modifiers are provided for each of the above attributes but are omitted for brevity. It is assumed that a class of constants (or constant keys) will exist for type data such as supportedOperations. The valid values for such type data will be specified in the protocol.
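As a rough illustration of the constants class assumed above, the following Java sketch shows one possible shape. The class name MetadataKeys, the key name, and the two operation values are hypothetical placeholders; the real set of valid values is to be specified by the protocol.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical constants class; names and values are placeholders,
// not values defined by the protocol.
final class MetadataKeys {
    private MetadataKeys() {}  // constants holder, never instantiated

    // Key under which a datasource's supported operations are stored (assumed name).
    static final String SUPPORTED_OPERATIONS = "supportedOperations";

    // Placeholder operation values; the protocol will define the real ones.
    static final String OP_SEARCH = "search";
    static final String OP_INVENTORY = "inventory";

    // Read-only set of the values currently treated as valid.
    static final Set<String> VALID_OPERATIONS =
        Collections.unmodifiableSet(new HashSet<>(Arrays.asList(OP_SEARCH, OP_INVENTORY)));

    // Returns true if the given operation is one of the known constants.
    static boolean isValidOperation(String operation) {
        return VALID_OPERATIONS.contains(operation);
    }
}
```

Keeping the valid values in one place would let the Metadata object and any provider filter validate against the same list.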
PortalRequestHandler
The individual processor of a request. The PortalRequestHandler is the basic workflow component: it pares down the available providers based on their offerings, marshals the request into protocol-compliant XML, submits requests to the various providers on separate threads, collects the responses, and unmarshals them. Not all of this work is done in this class alone; rather, the PortalRequestHandler acts as a controller for these processes. It will be a runnable object, since the Portal must be able to handle any number of requests at a given time. Pooling of this object will likely be implemented in order to manage resources.
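The controller role described above might look roughly like the following Java sketch. It is only an illustration under stated assumptions: the class name RequestHandlerSketch is hypothetical, providers are reduced to plain strings, each stage (filter, marshal, send, combine) is passed in as a function rather than as the real portal classes, and the providers are contacted sequentially to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical controller sketch: every stage is injected, so this class
// only coordinates the workflow and does none of the work itself.
final class RequestHandlerSketch implements Runnable {
    private final String request;
    private final List<String> providers;                   // provider ids (assumed representation)
    private final Predicate<String> filter;                 // stands in for a ProviderFilterer
    private final Function<String, String> marshal;         // stands in for RequestMarshaller
    private final BiFunction<String, String, String> send;  // (provider, payload) -> response
    private final Function<List<String>, String> combine;   // stands in for ResponseMarshaller
    private volatile String result;

    RequestHandlerSketch(String request, List<String> providers,
                         Predicate<String> filter,
                         Function<String, String> marshal,
                         BiFunction<String, String, String> send,
                         Function<List<String>, String> combine) {
        this.request = request;
        this.providers = providers;
        this.filter = filter;
        this.marshal = marshal;
        this.send = send;
        this.combine = combine;
    }

    @Override
    public void run() {
        String payload = marshal.apply(request);             // marshal once, reuse for all providers
        List<String> responses = new ArrayList<>();
        for (String provider : providers) {
            if (filter.test(provider)) {                     // skip providers that cannot serve the request
                responses.add(send.apply(provider, payload));
            }
        }
        result = combine.apply(responses);                   // merge responses for the caller
    }

    String getResult() {
        return result;
    }
}
```

Because the sketch implements Runnable, an instance could be handed to a thread or an executor, which is one way to process many requests at once.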
PortalRequestHandlerThread
A worker thread controlled by a PortalRequestHandler. One instance is created for each request sent to a single database of a provider. It streams the request to the provider and awaits the response. Some pooling may be applied here, or at least a maximum limit placed on the number of threads spawned at a time.
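One way to cap the number of worker threads, as suggested above, is a fixed-size thread pool from java.util.concurrent. The sketch below is an assumption about how that could be done, not the reference implementation: the class name BoundedDispatchSketch, the dispatch method, and the use of plain strings for targets and responses are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: one task per target database, at most maxThreads running at once.
final class BoundedDispatchSketch {
    private BoundedDispatchSketch() {}

    static List<String> dispatch(List<String> targets,
                                 Function<String, String> sendToTarget,
                                 int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String target : targets) {
                tasks.add(() -> sendToTarget.apply(target));
            }
            List<String> responses = new ArrayList<>();
            // invokeAll blocks until every task has finished and returns
            // the futures in the same order as the submitted tasks.
            for (Future<String> future : pool.invokeAll(tasks)) {
                responses.add(future.get());
            }
            return responses;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt status
            throw new IllegalStateException("dispatch interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a provider request failed", e.getCause());
        } finally {
            pool.shutdown();                     // release the worker threads
        }
    }
}
```

A real portal would probably share one long-lived pool across requests instead of creating one per call; the per-call pool here only keeps the example self-contained.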
ProviderFilterer
An interface specifying the methods that a provider filter must implement. The interface abstracts the filter component so that any number of filters, based on varying schemas and rule sets, may be implemented.
Darwin2ProviderFilter
An implementation of ProviderFilterer based on the Darwin Core v.2.0 Federation Schema.
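To make the filter abstraction more concrete, here is a minimal Java sketch. The method signature on ProviderFilterer is an assumption, since this document does not list the required methods; ProviderInfo is a hypothetical stand-in for the Provider and Metadata objects; and OperationProviderFilter is an invented example rule set, not the Darwin2ProviderFilter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Minimal stand-in for the Provider/Metadata pair (hypothetical fields).
final class ProviderInfo {
    final String name;
    final Set<String> supportedOperations;

    ProviderInfo(String name, String... operations) {
        this.name = name;
        this.supportedOperations = new HashSet<>(Arrays.asList(operations));
    }
}

// Assumed shape of the filter contract; the real method set is not given here.
interface ProviderFilterer {
    List<ProviderInfo> filter(List<ProviderInfo> candidates, String requestedOperation);
}

// Example rule set: keep only providers that advertise the requested operation.
final class OperationProviderFilter implements ProviderFilterer {
    @Override
    public List<ProviderInfo> filter(List<ProviderInfo> candidates, String requestedOperation) {
        List<ProviderInfo> kept = new ArrayList<>();
        for (ProviderInfo provider : candidates) {
            if (provider.supportedOperations.contains(requestedOperation)) {
                kept.add(provider);
            }
        }
        return kept;  // input order is preserved
    }
}
```

A schema-specific filter would implement the same interface with its own rules, so the request handler would not need to know which filter it is using.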
RequestMarshaller
Translates requests from one form into another. This class could possibly be static.
ResponseMarshaller
Translates responses from one form into another. In most cases, ResponseMarshaller may simply append multiple responses together for streaming back to the caller. This class could possibly be static.
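The "append responses together" behaviour could be sketched as a static utility, in line with the remark that the class might be static. The class name ResponseMarshallerSketch and the wrapper element name "responses" are hypothetical; the actual response envelope would be defined by the protocol.

```java
import java.util.List;

// Hypothetical static utility; the wrapper element name is a placeholder.
final class ResponseMarshallerSketch {
    private ResponseMarshallerSketch() {}  // static use only

    // Concatenates provider response fragments inside a single wrapper element.
    static String combine(List<String> providerResponses) {
        StringBuilder out = new StringBuilder("<responses>");
        for (String fragment : providerResponses) {
            if (fragment != null) {        // skip providers that returned nothing
                out.append(fragment);
            }
        }
        return out.append("</responses>").toString();
    }
}
```

The sketch assumes each fragment is already well-formed XML; it does no validation or escaping.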

Notes

Some additional design considerations: