DeVisa Architecture

The PMML Model Service is a web service that exposes a set of specialized operations. It is an abstract computational entity that provides access to the concrete services. The service receives and returns SOAP messages that carry queries expressed in PMQL. To answer an incoming request, the web service extracts the PMQL fragment from the message and passes it to the PMQL Engine.
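The message handling described above can be sketched in Python as follows. This is only an illustration: the `urn:example:pmql` namespace and the `query` and `score` element names are assumptions made for the example, not the actual PMQL vocabulary, and the functions are not part of DeVisa.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
PMQL_NS = "urn:example:pmql"  # hypothetical; the real PMQL namespace is not given here

# Hypothetical request: element and attribute names are invented for illustration.
REQUEST = f"""<soap:Envelope xmlns:soap="{SOAP_NS}">
  <soap:Body>
    <pmql:query xmlns:pmql="{PMQL_NS}">
      <pmql:score model="iris-tree"/>
    </pmql:query>
  </soap:Body>
</soap:Envelope>"""


def extract_pmql(soap_message: str) -> ET.Element:
    """Return the PMQL query element found inside the SOAP body."""
    envelope = ET.fromstring(soap_message)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    if body is None:
        raise ValueError("SOAP message has no Body element")
    query = body.find(f"{{{PMQL_NS}}}query")
    if query is None:
        raise ValueError("SOAP body carries no PMQL query")
    return query


def wrap_answer(answer: ET.Element) -> str:
    """Wrap a PMQL answer element in a SOAP envelope for the response."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(answer)
    return ET.tostring(envelope, encoding="unicode")


# The fragment that would be handed to the query engine.
pmql_fragment = extract_pmql(REQUEST)
```

Keeping extraction and wrapping separate mirrors the division of labour in the text: the service only deals with the SOAP layer, while the interpretation of the PMQL fragment is left to the engine.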

The Admin PMML Service is based on the XML-RPC or SOAP protocols and consists of methods for storing and retrieving PMML models. DeVisa extends the basic SOAP store/retrieve web service with PMML-specific features. When a model is uploaded to the repository, it is validated either against the PMML Schema or with the XSLT-based PMML validation script provided by DMG. The model is then placed in the appropriate collection (based on the domain/producer), and the catalog is updated with the new model’s metadata. The service also supports updating or replacing an existing model with a newer one via XUpdate instructions.
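A minimal sketch of the upload workflow (validate, file under a collection, update the catalog) might look like the following. It makes several assumptions: the repository and catalog are plain in-memory Python structures rather than an XML database, the function names are hypothetical, and `check_pmml` is only a lightweight structural check standing in for real validation against the PMML Schema or the DMG XSLT script.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Abbreviated sample document, used only to exercise the sketch.
SAMPLE = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header><Application name="ExampleMiner"/></Header>
  <DataDictionary numberOfFields="0"/>
</PMML>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def check_pmml(document: str) -> ET.Element:
    """Stand-in for schema validation: parse and check the basic PMML shape."""
    root = ET.fromstring(document)  # raises ParseError if not well-formed
    if local_name(root.tag) != "PMML":
        raise ValueError("root element is not PMML")
    children = [local_name(child.tag) for child in root]
    if "Header" not in children or "DataDictionary" not in children:
        raise ValueError("PMML document lacks Header or DataDictionary")
    return root


def store_model(repository: dict, catalog: list, domain: str,
                name: str, document: str) -> None:
    """Validate a model, file it under its domain collection, update the catalog."""
    root = check_pmml(document)
    # The producer application is read from Header/Application/@name.
    app = next((el for el in root.iter()
                if local_name(el.tag) == "Application"), None)
    producer = app.get("name") if app is not None else "unknown"
    repository.setdefault(domain, {})[name] = document
    catalog.append({
        "collection": domain,
        "model": name,
        "producer": producer,
        "uploaded": date.today().isoformat(),
    })


repository, catalog = {}, []
store_model(repository, catalog, "botany", "iris-tree", SAMPLE)
```

Because validation runs before anything is written, a rejected document leaves both the repository and the catalog unchanged.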

The PMML Model Repository is a collection of models stored in PMML format. It relies on the native XML storage features of the underlying XML database system; DeVisa uses eXist for this purpose. A PMML document contains one or more models that share the same schema. The models are organized in collections (corresponding to domains) and identified via XML namespace facilities (which link them to the producer application). The documents in the repository are indexed for fast retrieval using structural, full-text, and range indexes.

The PMQL-LIB module is a collection of functions written entirely in XQuery for querying PMML. The PMQL Engine calls these functions during the query plan execution phase. A scoring function takes a PMQL query plan as input and produces a PMQL query answer.

The PMQL Engine is the DeVisa component that processes and executes queries expressed in PMQL. After syntactic and semantic validation and query rewriting, it executes the query plan by invoking functions in the internal PMQL-LIB module.
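The validate, rewrite, execute sequence can be illustrated with a toy pipeline. Everything here is an assumption made for the sketch: queries are plain dictionaries rather than PMQL, `LIBRARY` is a hypothetical Python stand-in for the XQuery functions of PMQL-LIB, and the "model" is a simple threshold rule.

```python
from typing import Callable


def score_threshold(args: dict) -> dict:
    """Toy 'model': label a value as high or low against a cut-off."""
    label = "high" if args["value"] >= args["cutoff"] else "low"
    return {"prediction": label}


# Hypothetical stand-in for PMQL-LIB: named functions the engine can invoke.
LIBRARY: dict[str, Callable[[dict], dict]] = {"score": score_threshold}


def validate(query: dict) -> None:
    """Syntactic check (required keys) and semantic check (known operation)."""
    for key in ("operation", "args"):
        if key not in query:
            raise ValueError(f"query is missing '{key}'")
    if query["operation"] not in LIBRARY:
        raise ValueError(f"unknown operation: {query['operation']}")


def rewrite(query: dict) -> list[dict]:
    """Turn the query into a plan: here, one step with defaults filled in."""
    args = {"cutoff": 0.5, **query["args"]}
    return [{"call": query["operation"], "args": args}]


def execute(plan: list[dict]) -> list[dict]:
    """Run each plan step by dispatching to the library function it names."""
    return [LIBRARY[step["call"]](step["args"]) for step in plan]


def run_query(query: dict) -> list[dict]:
    """Full pipeline: validation, rewriting, then plan execution."""
    validate(query)
    return execute(rewrite(query))


answer = run_query({"operation": "score", "args": {"value": 0.8}})
```

The point of the sketch is the separation of stages: the engine owns validation and planning, while the actual work on models is delegated to library functions looked up by name.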

The Metadata Catalog contains metadata about the PMML models, stored in a specific XML format. The catalog XML Schema can be found here. The catalog holds the following types of information: available collections, model schema, model information (algorithm, producer application, upload date), statistics (e.g. univariate statistics: mean, minimum, maximum, standard deviation, frequencies), model performance (e.g. precision, accuracy, sensitivity, misclassification rate, complexity), etc. This component is a materialized view over the PMML repository. In DeVisa the Metadata Catalog depends strongly on the indexing system of the underlying XML database, so retrieval performance is influenced by the active configuration of a particular database instance.
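To make the "materialized view" idea concrete, the sketch below derives catalog entries from the PMML documents themselves. It is an illustration under assumptions: the entry layout is invented for the example (the actual catalog schema is not reproduced here), the sample document is abbreviated and would not pass schema validation, and model elements are recognised heuristically by their `functionName` attribute.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample document; a real TreeModel would have more content.
PMML_DOC = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header><Application name="ExampleMiner"/></Header>
  <DataDictionary numberOfFields="2">
    <DataField name="petal_length" optype="continuous" dataType="double"/>
    <DataField name="species" optype="categorical" dataType="string"/>
  </DataDictionary>
  <TreeModel modelName="iris-tree" functionName="classification"/>
</PMML>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def catalog_entry(collection: str, document: str) -> dict:
    """Derive one catalog entry (hypothetical layout) from a PMML document."""
    root = ET.fromstring(document)
    schema, models, producer = [], [], None
    for el in root.iter():
        name = local_name(el.tag)
        if name == "DataField":
            schema.append({"name": el.get("name"),
                           "optype": el.get("optype"),
                           "dataType": el.get("dataType")})
        elif name == "Application":
            producer = el.get("name")
        elif el.get("functionName") is not None:
            # Heuristic: PMML model elements carry a functionName attribute.
            models.append({"name": el.get("modelName"),
                           "algorithm": name,
                           "function": el.get("functionName")})
    return {"collection": collection, "producer": producer,
            "schema": schema, "models": models}


def rebuild_catalog(repository: dict[str, list[str]]) -> list[dict]:
    """Recompute the whole view from the repository contents."""
    return [catalog_entry(collection, doc)
            for collection, docs in repository.items()
            for doc in docs]


catalog = rebuild_catalog({"botany": [PMML_DOC]})
```

Since every entry is computed from repository content, the catalog can always be regenerated; storing it is a trade of space for faster lookups, which is why retrieval performance ends up tied to how the underlying database indexes those documents.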