The OWLIM Plug-in API is a framework and a set of public classes and interfaces that allow developers to extend OWLIM in many useful ways. These extensions are bundled into plug-ins that OWLIM discovers during its initialisation phase and then uses to delegate parts of its query processing tasks. The plug-ins are given low-level access to OWLIM repository data that enables them to do their job efficiently. The plug-ins are discovered via the Java service discovery mechanism which enables dynamic addition/removal of plug-ins from the system without having to recompile OWLIM or change any configuration files.
An OWLIM plug-in is a Java class that implements the com.ontotext.trree.sdk.Plugin interface. All the public classes and interfaces of the plug-in API are located in this Java package, i.e. com.ontotext.trree.sdk, so the package name is omitted for the rest of this section. Here is what the Plugin interface looks like in abbreviated form:
Being derived from the Service interface means that plug-ins will be automatically discovered at run-time provided that the following conditions also hold:
The only method introduced by the Service interface is getName() which provides the plug-in's (service's) name. This name should be unique within a particular OWLIM repository and serves as a plug-in identifier that can be used at any time to retrieve a reference to the plug-in instance. The rest of the base Plugin methods will be described further in the following sections.
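The discovery and name-based look-up described above can be illustrated with the standard java.util.ServiceLoader class. This is only a minimal sketch: SketchPlugin and SketchPluginRegistry are hypothetical stand-ins invented for the illustration and are not part of the OWLIM SDK, and rejecting duplicate names is an assumption about how uniqueness might be enforced.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ServiceLoader;

// Hypothetical stand-in for a named, discoverable plug-in; not the real SDK interface.
interface SketchPlugin {
    String getName();
}

final class SketchPluginRegistry {
    private final Map<String, SketchPlugin> byName = new LinkedHashMap<>();

    // Registers a plug-in, rejecting a second plug-in that reports the same name.
    void register(SketchPlugin plugin) {
        SketchPlugin previous = byName.putIfAbsent(plugin.getName(), plugin);
        if (previous != null) {
            throw new IllegalStateException("Duplicate plug-in name: " + plugin.getName());
        }
    }

    // Looks a plug-in up by name; returns null when no such plug-in is registered.
    SketchPlugin locate(String name) {
        return byName.get(name);
    }

    // Loads every provider listed in a META-INF/services file named after the
    // interface; with no such file on the classpath the registry stays empty.
    static SketchPluginRegistry discover() {
        SketchPluginRegistry registry = new SketchPluginRegistry();
        for (SketchPlugin plugin : ServiceLoader.load(SketchPlugin.class)) {
            registry.register(plugin);
        }
        return registry;
    }
}
```

Because providers are found through classpath resources, adding or removing a plug-in jar changes what discover() returns without recompiling the host application.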
There are a lot more functions (interfaces) that a plug-in could implement, but these are all optional and are declared in separate interfaces. Implementing any such complementary interface is the means to announce to the system what this particular plug-in can do in addition to its mandatory plug-in responsibilities. It is then automatically used as appropriate.
A plug-in's life-cycle is separated into several phases:
In order to enable efficient request processing plug-ins are given low-level access to the repository data and internals. This is done through the Statements and Entities interfaces.
The Entities interface represents a set of RDF objects (URIs, blank nodes and literals). All such objects are termed entities and are given unique long identifiers. The Entities instance is responsible for resolving those objects from their identifiers and inversely for looking up the identifier of a given entity. Most plug-ins will process entities using their identifiers, because dealing with integer identifiers is a lot more efficient than working with the actual RDF entities they represent. The Entities interface is the single entry point available to plug-ins for entity management. It supports the addition of new entities, entity replacement, look-up of entity type and properties, resolving entities, listening for entity change events, etc.
It is possible in an OWLIM repository to declare two RDF objects to be equivalent (e.g. by using owl:sameAs). In order to provide a way to use such declarations, the Entities interface assigns a class identifier to each entity. For newly created entities this class identifier is the same as the entity identifier. When two entities are declared equivalent, one of them adopts the class identifier of the other and thus they become members of the same equivalence class. The Entities interface exposes the entity class identifier so that plug-ins can determine which entities are equivalent.
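The relationship between entity identifiers and class identifiers can be sketched with a toy in-memory pool. ToyEntities and its method names (put, get, classOf, declareSame) are assumptions made for this illustration and do not reproduce the SDK's Entities interface.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy entity pool; a simplified stand-in, not the SDK's Entities interface.
final class ToyEntities {
    private final List<String> values = new ArrayList<>();  // index (id - 1) -> entity value
    private final List<Long> classIds = new ArrayList<>();  // index (id - 1) -> class id
    private final Map<String, Long> ids = new HashMap<>();  // entity value -> id

    // Returns the existing id for the value, or assigns the next free one.
    long put(String value) {
        Long existing = ids.get(value);
        if (existing != null) {
            return existing;
        }
        values.add(value);
        long id = values.size();
        classIds.add(id); // a new entity starts in a class of its own
        ids.put(value, id);
        return id;
    }

    // Resolves an identifier back to the entity value.
    String get(long id) {
        return values.get((int) (id - 1));
    }

    // Returns the equivalence class identifier of the entity.
    long classOf(long id) {
        return classIds.get((int) (id - 1));
    }

    // Declares b equivalent to a: every member of b's class adopts a's class id.
    void declareSame(long a, long b) {
        long target = classOf(a);
        long source = classOf(b);
        for (int i = 0; i < classIds.size(); i++) {
            if (classIds.get(i) == source) {
                classIds.set(i, target);
            }
        }
    }
}
```

Two entities are then equivalent exactly when classOf() returns the same value for both, while their own identifiers remain distinct.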
The Statements interface represents a set of RDF statements where statement means a quadruple of subject, predicate, object and context RDF entity identifiers. Statements can be added, removed and searched for. Additionally, a plug-in can subscribe to receive statement event notifications (e.g. "statement was added").
An important abstract class which is related to OWLIM internals is StatementIterator. It has a single abstract method - boolean next() - which attempts to scroll the iterator onto the next available statement and returns true only if it succeeded. In the case of success its subject, predicate, object and context fields will be initialised with the respective components of the next statement.
Here is a brief example that puts Statements, Entities and StatementIterator together in order to output all literals that are related to a given URI:
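A minimal, self-contained sketch of this idea follows. It uses simplified stand-in types (ToyStatementIterator, ToyStatements, LiteralsOfUri) rather than the real SDK classes, treats the identifier 0 as a wildcard, and recognises literals by a leading quote character; all of these are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the abstract iterator described above.
abstract class ToyStatementIterator {
    long subject, predicate, object, context;

    // Moves to the next matching statement; returns false when there is none.
    abstract boolean next();
}

// Toy statement store over entity ids; 0 acts as a wildcard (an assumption).
final class ToyStatements {
    private final List<long[]> quads = new ArrayList<>();

    void put(long s, long p, long o, long c) {
        quads.add(new long[] {s, p, o, c});
    }

    ToyStatementIterator get(final long s, final long p, final long o, final long c) {
        return new ToyStatementIterator() {
            private int position = 0;

            @Override
            boolean next() {
                while (position < quads.size()) {
                    long[] q = quads.get(position++);
                    if (matches(s, q[0]) && matches(p, q[1])
                            && matches(o, q[2]) && matches(c, q[3])) {
                        subject = q[0];
                        predicate = q[1];
                        object = q[2];
                        context = q[3];
                        return true;
                    }
                }
                return false;
            }
        };
    }

    private static boolean matches(long pattern, long value) {
        return pattern == 0 || pattern == value;
    }
}

final class LiteralsOfUri {
    // entityValues[id] holds the lexical form of entity id; literals start with a quote here.
    static List<String> literalsFor(long uriId, ToyStatements statements, String[] entityValues) {
        List<String> result = new ArrayList<>();
        ToyStatementIterator it = statements.get(uriId, 0, 0, 0);
        while (it.next()) {
            String value = entityValues[(int) it.object];
            if (value.startsWith("\"")) {
                result.add(value);
            }
        }
        return result;
    }
}
```

The loop works purely on identifiers and resolves an entity only when its value is actually needed, which is the access pattern the text recommends.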
Getting to know these interfaces should be sufficient for a plug-in developer to make full use of OWLIM repository data.
As already mentioned, a plug-in's interaction with each of the request-processing phases is optional. The plug-in declares whether it plans to participate in a phase by implementing the appropriate interface.
A plug-in willing to participate in request pre-processing should implement the Preprocessor interface. It looks like this:
The preprocess() method receives the request object and returns a RequestContext instance. The Request instance passed as the parameter will be an instance of a different class depending on the type of the request (e.g. SPARQL/Update or "get statements"). The plug-in can change the request object as necessary and should initialise and return its context object, which will be passed back to it in every other method during the request processing phase. The returned request context may be null and, whatever it is, it will only be visible to the plug-in that initialised it. It can be used to store data that is visible for (and only for) this whole request, e.g. to pass data relating to two different statement patterns recognised by the plug-in. The request context gives further request processing phases access to the Request object reference. Plug-ins that opt to skip this phase will not have a request context and will not be able to get access to the original Request object.
This is one of the most important phases in the lifetime of a plug-in. In fact most plug-ins need to participate in exactly this phase. This is the point where request statement patterns need to get evaluated and statement results are returned. For example, consider the following SPARQL query:
There is just one statement pattern inside this query: ?s <http://example/predicate> ?o. All plug-ins that have implemented the PatternInterpreter interface (thus declaring that they intend to participate in the pattern interpretation phase) will be asked if they can interpret this pattern. The first one to accept it and return results for it will be used. If no plug-in interprets the pattern, it will be looked up using the repository's physical statements, i.e. the ones persisted on disk.
Here is the PatternInterpreter interface:
The estimate() and interpret() methods take the same arguments and are used in the following way:
The requestContext parameter is the value returned by the preprocess() method if one exists or null otherwise.
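The "first plug-in to accept the pattern wins" rule, with the fall-back to stored statements, can be sketched as follows. ToyPatternInterpreter and PatternDispatch are hypothetical stand-ins, and the convention that an interpreter declines a pattern by returning null is an assumption made for this sketch.

```java
import java.util.List;

// Hypothetical stand-in: returns result rows for a pattern, or null to decline it.
interface ToyPatternInterpreter {
    long[][] interpret(long subject, long predicate, long object, long context);
}

final class PatternDispatch {
    // Asks each interpreter in turn; the first non-null answer is used.
    // If every interpreter declines, the stored statements answer the pattern.
    static long[][] evaluate(List<ToyPatternInterpreter> interpreters,
                             ToyPatternInterpreter storedStatements,
                             long s, long p, long o, long c) {
        for (ToyPatternInterpreter interpreter : interpreters) {
            long[][] rows = interpreter.interpret(s, p, o, c);
            if (rows != null) {
                return rows;
            }
        }
        return storedStatements.interpret(s, p, o, c);
    }
}
```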
The plug-in framework also supports the interpretation of an extended type of list pattern. Consider the following SPARQL query:
If a plug-in wants to handle such list patterns it has to implement an interface very similar to the PatternInterpreter interface - ListPatternInterpreter:
It differs only in that multiple objects are passed as an array of longs instead of a single long object. The semantics of both methods are the same as in the basic pattern interpretation case.
There are cases when a plug-in would like to modify or otherwise filter the final results of a request. This is where the Postprocessor interface comes into play:
The postprocess() method is called for each binding set that is to be returned to the repository client. This method may modify the binding set and return it or alternatively return null in which case the binding set is removed from the result set. After a binding set is processed by a plug-in, the possibly modified binding set is passed to the next plug-in having post-processing functionality enabled. After the binding set is processed by all plug-ins (in the case where no plug-in deletes it) it is returned to the client. Finally, after all results are processed and returned, each plug-in's flush() method is called to introduce new binding set results in the result set. These in turn are finally returned to the client.
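The order of operations described above (chain every binding set through the plug-ins, drop it if any plug-in returns null, then append whatever each plug-in flushes) can be sketched like this. ToyPostprocessor and PostprocessPipeline are hypothetical stand-ins, and binding sets are modelled as plain maps purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a post-processing plug-in; binding sets are plain maps here.
interface ToyPostprocessor {
    // Returns the (possibly modified) binding set, or null to remove it.
    Map<String, String> postprocess(Map<String, String> bindingSet);

    // Returns extra binding sets to append after all results are processed.
    List<Map<String, String>> flush();
}

final class PostprocessPipeline {
    static List<Map<String, String>> run(List<Map<String, String>> results,
                                         List<ToyPostprocessor> plugins) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> row : results) {
            Map<String, String> current = row;
            for (ToyPostprocessor plugin : plugins) {
                current = plugin.postprocess(current);
                if (current == null) {
                    break; // a plug-in removed this binding set
                }
            }
            if (current != null) {
                out.add(current);
            }
        }
        // Flushed rows are appended only after every result has been processed.
        for (ToyPostprocessor plugin : plugins) {
            out.addAll(plugin.flush());
        }
        return out;
    }
}
```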
As well as query/read processing, plug-ins are able to process update operations for statement patterns containing specific predicates. In order to intercept updates, a plug-in must implement the UpdateInterpreter interface. During initialisation the getPredicatesToListenFor() method is called once by the framework, so that the plug-in can indicate which predicates it is interested in.
From then onwards, the plug-in framework will filter updates for statements using these predicates and notify the plug-in. Filtered updates are not processed further by OWLIM, so if the insert or delete operation should be persisted, then the plug-in must handle this by using the Statements object passed to it.
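The routing of updates by predicate can be pictured with the following sketch. UpdateRouting is a hypothetical class written for this illustration; it only models the filtering decision and does not reproduce the UpdateInterpreter interface.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of update filtering; not the SDK's UpdateInterpreter mechanism.
final class UpdateRouting {
    final List<long[]> handedToPlugin = new ArrayList<>();
    final List<long[]> storedNormally = new ArrayList<>();
    private final Set<Long> listenedPredicates = new HashSet<>();

    // The predicate ids are collected once, mirroring the single call at initialisation.
    UpdateRouting(long[] predicatesToListenFor) {
        for (long predicate : predicatesToListenFor) {
            listenedPredicates.add(predicate);
        }
    }

    // Statements with a listened-for predicate go to the plug-in and nowhere else.
    void add(long s, long p, long o, long c) {
        long[] quad = {s, p, o, c};
        if (listenedPredicates.contains(p)) {
            handedToPlugin.add(quad);
        } else {
            storedNormally.add(quad);
        }
    }
}
```

In this model a filtered statement never reaches storedNormally, which corresponds to the point that the plug-in itself must persist such statements if they are meant to be kept.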
The example plug-in will have two responsibilities:
For the first part, it is clear that the plug-in should implement the PatternInterpreter interface. A date/time literal should be stored as a request-scope entity to avoid cluttering the repository with extra literals.
For the second requirement the plug-in must first take part in the pre-processing phase in order to inspect the query and detect the FROM clause. Then the plug-in must hook into the post-processing phase where, if the pre-processing phase has detected the desired FROM clause, it should delete all query results (in postprocess()) and return (in flush()) a single result containing the binding set specified by the requirements. Again, request-scoped literals should be created.
The plug-in implementation extends the PluginBase class that provides a default implementation of the Plugin methods:
In this basic implementation the plug-in name is defined and during initialization a single system-scope predicate is registered. It is important not to forget to register the plug-in in the META-INF/services/com.ontotext.trree.sdk.Plugin file in the classpath.
The next step is to implement the first of the plug-in's requirements - the pattern interpretation part:
The interpret() method only processes patterns with a predicate matching the desired predicate identifier. Further on it simply creates a new date/time literal (in the request scope) and places its identifier in the object position of the returned single result. The estimate() method always returns 1, because this is the exact size of the result set.
Finally, the second requirement, concerning the interpretation of the FROM clause, is implemented:
The plug-in provides a custom implementation of the RequestContext interface, which can hold a reference to the desired single BindingSet with the date/time literal bound to every variable name in the query projection. The postprocess() method filters out all results if the requestContext is non-null (i.e. if the FROM clause was detected by preprocess()). Finally, flush() returns a singleton iterator containing the desired binding set in the required case and returns nothing otherwise.
Most plug-ins need some configuration. There are two ways for OWLIM plug-ins to receive it. The first is to define magic system predicates that can be used to pass configuration values to the plug-in through a query at run-time. This approach is appropriate whenever the configuration should change from one plug-in usage scenario to another, i.e. when there are no globally valid parameters for the plug-in. However, in many cases the plug-in behaviour has to be configured "globally", and for this the plug-in framework provides a suitable mechanism through the Configurable interface.
A plug-in implements the Configurable interface to announce its configuration parameters to the system. This allows it to read parameter values during initialization from the repository configuration and have them merged with all other repository parameters (accessible through the SystemOptions instance passed during the configuration phase).
This is the Configurable interface:
The plug-in needs to enumerate its configuration parameter names. The example plug-in will be extended with the ability to define the name of the special predicate it uses. The parameter is called predicate-uri and it should accept a URI value.
Now that the plug-in parameter has been declared, it can be configured either by adding the http://www.ontotext.com/trree/owlim#predicate-uri parameter to the OWLIM configuration or by setting a Java system property using -Dpredicate-uri parameter for the JVM running OWLIM.
There is also a special kind of configuration parameter called a "memory" parameter. Memory parameters configure the amount of memory available for the plug-in to use. If a plug-in has such parameters it should implement the MemoryConfigurable interface:
The getMemoryParameters() method enumerates the names of the plug-in's memory parameters in a similar way to Configurable.getParameters(). During the configuration phase, the plug-in's setMemoryParameter() method will be called once for each such parameter with its respective configured value in bytes. The parameters defined as memory parameters can be given values like "1g" or "300M", but such values will be interpreted and converted to bytes.
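The conversion of values such as "1g" or "300M" into bytes might look like the sketch below. MemoryValues.toBytes is a hypothetical helper, and the choice of binary multiples (1k = 1024 bytes) and of the suffixes k, m and g is an assumption; the actual OWLIM parser may behave differently.

```java
// Hypothetical helper; not part of the OWLIM SDK.
final class MemoryValues {
    // Assumes binary multiples (1k = 1024 bytes) and the suffixes k, m and g
    // in either case. A value without a suffix is taken as a byte count.
    static long toBytes(String value) {
        String v = value.trim();
        if (v.isEmpty()) {
            throw new IllegalArgumentException("empty memory value");
        }
        char suffix = Character.toLowerCase(v.charAt(v.length() - 1));
        long multiplier;
        switch (suffix) {
            case 'k': multiplier = 1024L; break;
            case 'm': multiplier = 1024L * 1024L; break;
            case 'g': multiplier = 1024L * 1024L * 1024L; break;
            default:  return Long.parseLong(v); // plain byte count
        }
        String digits = v.substring(0, v.length() - 1).trim();
        return Long.parseLong(digits) * multiplier;
    }
}
```

Under these assumptions toBytes("1g") yields 1073741824 and toBytes("300M") yields 314572800.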
A special property of the memory parameters is that they can be configured as a group. OWLIM accepts a parameter called cache-memory. This parameter accumulates the values of a group of other parameters: tuple-index-memory, fts-memory and predicate-memory. Declaring a memory parameter automatically adds it to the group of parameters accumulated by cache-memory. The advantage of this approach is that if cache-memory is configured to some amount and any of the grouped memory parameters are not configured (unknown), the amount remaining from cache-memory is divided among all unknown memory parameters, thus providing the user with a simple way to control the memory requirements of many plug-ins using a single parameter. For instance, suppose that cache-memory is configured to "100m", tuple-index-memory is configured to "20m", there are no predicate lists configured (which automatically disables the predicate-memory parameter) and there are 4 memory parameters declared by several plug-ins that were not explicitly configured. The effect of such a setup is that 80M (100M - 20M) is divided among the 4 memory parameters and each of them is set to 20M. This value is then reported to the plug-ins in bytes using their setMemoryParameter() method.
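The arithmetic of this example can be reproduced with a short sketch. CacheMemorySplit is a hypothetical helper, the plug-in parameter names used with it are invented, and the use of integer division with no special handling of remainders or over-allocation is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper illustrating the cache-memory split; not OWLIM code.
final class CacheMemorySplit {
    // configured maps each grouped parameter name to its value in bytes,
    // or to null when the parameter was not explicitly configured.
    static Map<String, Long> split(long cacheMemory, Map<String, Long> configured) {
        long remaining = cacheMemory;
        int unknown = 0;
        for (Long value : configured.values()) {
            if (value == null) {
                unknown++;
            } else {
                remaining -= value; // explicitly configured amounts are reserved first
            }
        }
        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : configured.entrySet()) {
            Long value = entry.getValue();
            // Unknown parameters share the remainder equally (integer division).
            result.put(entry.getKey(), value != null ? value : remaining / unknown);
        }
        return result;
    }
}
```

With cache-memory at 100M, tuple-index-memory at 20M and four unconfigured plug-in parameters, each of the four receives (100M - 20M) / 4 = 20M.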
Plug-ins are able to make use of the functionality of other plug-ins. For example, the Lucene-based full-text search plug-in can make use of the rank values provided by the RDFRank plug-in to facilitate query result scoring and ordering. This is not a matter of re-using program code (e.g. in a jar with common classes), rather it is about re-using data. The mechanism to do this allows plug-ins to obtain references to other plug-in objects by knowing their names. To achieve this they only need to implement the PluginDependency interface:
They are then injected with an instance of the PluginLocator interface (during the configuration phase) that does the actual plug-in discovery for them:
Having a reference to another plug-in is all that is needed to call its methods directly and make use of its services.
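The dependency mechanism can be sketched with stand-in types. ToyPlugin, ToyLocator, ToyDependency and the two example plug-ins are hypothetical, and the method names setLocator() and locate() are assumptions chosen for the illustration rather than confirmed SDK signatures.

```java
// Hypothetical stand-ins for the plug-in, locator and dependency interfaces.
interface ToyPlugin {
    String getName();
}

interface ToyLocator {
    // Returns the plug-in registered under the name, or null if there is none.
    ToyPlugin locate(String name);
}

interface ToyDependency {
    void setLocator(ToyLocator locator);
}

// A provider plug-in exposing data: a made-up rank value per entity id.
final class ToyRankPlugin implements ToyPlugin {
    public String getName() {
        return "rank";
    }

    double rankOf(long entityId) {
        return entityId % 2 == 0 ? 0.9 : 0.1;
    }
}

// A consumer plug-in that re-uses the provider's data through the locator.
final class ToySearchPlugin implements ToyPlugin, ToyDependency {
    private ToyLocator locator;

    public String getName() {
        return "search";
    }

    public void setLocator(ToyLocator locator) {
        this.locator = locator;
    }

    // Scores an entity with the rank plug-in's value, or 0.0 if it is unavailable.
    double score(long entityId) {
        ToyPlugin other = locator.locate("rank");
        if (other instanceof ToyRankPlugin) {
            return ((ToyRankPlugin) other).rankOf(entityId);
        }
        return 0.0;
    }
}
```

Checking the located reference before casting lets the consumer degrade gracefully when the other plug-in is not installed.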