Primer Introduction to OWLIM

OWLIM Documentation

OWLIM 4.2 (this version)

OWLIM is a high-performance semantic repository, implemented in Java and packaged as a Storage and Inference Layer (SAIL) for the Sesame RDF database. This section describes the various editions of OWLIM.

OWLIM is based on Ontotext's Triple Reasoning and Rule Entailment Engine (TRREE), a native RDF rule-entailment engine. The supported semantics can be configured through the definition of rule-sets. The most expressive pre-defined rule-set combines unconstrained RDFS and OWL-Lite. Custom rule-sets allow tuning for optimal performance and expressivity. OWLIM supports RDFS (section 3.1.2), OWL DLP (section 3.1.5.1), OWL Horst (section 3.1.5.2), most of OWL Lite (section 3.1.5.4) and OWL2 RL (section 3.1.5.3).
The three editions of OWLIM are OWLIM-Lite, OWLIM-SE (standard edition) and OWLIM-Enterprise (cluster configuration). With OWLIM-Lite, reasoning and query evaluation are performed in-memory, while, at the same time, a reliable persistence strategy assures data preservation, consistency, and integrity. OWLIM-SE is the high-performance 'enterprise' edition that scales to massive quantities of data. Typically, OWLIM-Lite can manage millions of explicit statements on desktop hardware, whereas OWLIM-SE can manage billions of statements and multiple simultaneous user sessions. OWLIM-Enterprise is an enterprise-grade cluster management component that uses a collection of OWLIM instances to provide a resilient, high-performance semantic database.
The key differences between the editions of OWLIM are discussed in section 4.5 and in the OWLIM presentation [28]. The results from a number of benchmarks, as well as plenty of other performance evaluation and analysis information, are available on the web site http://www.ontotext.com/owlim/.

Advantages of OWLIM

One of the main advantages of OWLIM-Lite is the in-memory reasoning implementation: the full content of the repository is loaded and maintained in main memory, which allows for efficient retrieval and query answering. Although the reasoning is handled in-memory, the OWLIM-Lite SAIL offers a relatively comprehensive persistence and backup strategy.
OWLIM-Lite implements persistence by writing to files in N-Triples format. The repository can be split into several files, all but one of which are read-only; the writable file is both a source from which triples are loaded and the target where new statements are stored. This backup strategy ensures that newly asserted triples are not lost in cases of power failure or abnormal termination. Although relatively simple, this strategy has proven to be very efficient and reliable over the years [22].
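
The file layout described above can be sketched roughly as follows. This is only an illustration in Python, not OWLIM's actual implementation: the function names `load_repository` and `add_statement`, the one-statement-per-line handling and the flush-to-disk step are all assumptions made for the example.

```python
import os
import tempfile

def load_repository(read_only_files, writable_file):
    """Load statements (one N-Triples line each) from every file.
    The writable file is read as well, since it is also a source."""
    statements = set()
    for path in list(read_only_files) + [writable_file]:
        if not os.path.exists(path):
            continue
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#"):
                    statements.add(line)
    return statements

def add_statement(statements, writable_file, triple_line):
    """Append a new statement to the writable file, forcing it to disk
    before the in-memory set is updated (an assumed durability step)."""
    if triple_line in statements:
        return False
    with open(writable_file, "a", encoding="utf-8") as f:
        f.write(triple_line + "\n")
        f.flush()
        os.fsync(f.fileno())
    statements.add(triple_line)
    return True

def demo():
    with tempfile.TemporaryDirectory() as d:
        base = os.path.join(d, "base.nt")      # read-only part
        writable = os.path.join(d, "new.nt")   # single writable part
        with open(base, "w", encoding="utf-8") as f:
            f.write("<http://example.org/a> <http://example.org/p> <http://example.org/b> .\n")
        repo = load_repository([base], writable)
        add_statement(repo, writable,
                      "<http://example.org/b> <http://example.org/p> <http://example.org/c> .")
        # Simulate a restart: everything is reloaded from the files.
        return load_repository([base], writable)
```

Calling `demo()` reloads both the original statement and the newly asserted one, which is the property the backup strategy is meant to guarantee.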

Limitations of OWLIM

The limitations of OWLIM are related to its reasoning strategy. In general, the expressivity of the supported language cannot be extended in the Description Logic direction, because the semantics must be expressible as (Horn) rules. The total materialisation strategy has drawbacks when changes to the explicitly asserted statements occur frequently. For expressive semantics and certain ontologies, the number of implicit statements can grow quickly, with a corresponding degradation in performance. OWLIM-SE has a number of optimisations to reduce this problem, e.g. special handling of owl:sameAs. Removing explicit statements can adversely affect performance if the full closure needs to be recomputed. Again, OWLIM-SE uses special techniques to avoid this situation. Another limitation of OWLIM-Lite is that the volume of data it can process is limited by the size of the computer's main memory. Considering currently available commodity hardware, OWLIM-Lite can handle millions of statements on desktop machines and above ten million on entry-level servers.
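
To see why owl:sameAs is singled out, consider what naive materialisation does with it. The sketch below (plain Python, with the hypothetical helpers `same_as_closure` and `chain`) closes a chain of owl:sameAs statements under symmetry and transitivity only. It illustrates the growth of the closure; it does not show how OWLIM-SE's special handling works.

```python
def same_as_closure(pairs):
    """Naively close a set of (x, y) owl:sameAs pairs under
    symmetry and transitivity until nothing new is inferred."""
    closure = set(pairs)
    while True:
        new = {(y, x) for (x, y) in closure}
        new |= {(x, z) for (x, y) in closure for (y2, z) in closure if y == y2}
        if new <= closure:
            return closure
        closure |= new

def chain(n):
    """n resources linked by n - 1 explicit owl:sameAs statements."""
    return {(f"r{i}", f"r{i + 1}") for i in range(n - 1)}

for n in (2, 5, 10):
    print(n, len(chain(n)), len(same_as_closure(chain(n))))
# prints:
# 2 1 4
# 5 4 25
# 10 9 100
```

With n equivalent resources, n - 1 explicit statements expand to n² materialised ones, before counting the copies of every other statement that mentions any of those resources.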

OWLIM Interoperability and Architecture

OWLIM version 3.X is packaged as a Storage and Inference Layer (SAIL) for Sesame version 2.x and makes extensive use of the features and infrastructure of Sesame, especially the RDF model, RDF parsers and query engines.
Inference is performed by the TRREE engine [39], where the explicit and inferred statements are stored in highly-optimized data structures that are kept in-memory for query evaluation and further inference. The inferred closure is updated through inference at the end of each transaction that modifies the repository.

Figure 5 - OWLIM Usage and Relationship to Sesame and ORDI

OWLIM implements the Sesame SAIL interface so that it can be integrated with the rest of the Sesame framework, e.g. the query engines and the web UI. A user application can be designed to use OWLIM directly through the Sesame SAIL API or via the higher-level functional interfaces. When an OWLIM repository is exposed using the Sesame HTTP Server, users can manage the repository through the Sesame Workbench Web application, or with other tools integrated with Sesame, e.g. ontology editors like Protégé and TopBraid Composer.
The easiest way for developers to integrate their applications with OWLIM is to use it with the Sesame framework as a set of libraries. The installation and configuration of OWLIM are discussed in the quick start and user guides. More information on the various aspects of the Sesame specifications, its architecture and implementations can be found in section 3.2.

The TRREE Engine

OWLIM is implemented on top of the TRREE engine. TRREE [39] stands for 'Triple Reasoning and Rule Entailment Engine'. The TRREE performs reasoning based on forward-chaining of entailment rules over RDF triple patterns with variables. TRREE's reasoning strategy is total materialisation, see section 3.1.7, although various optimisations are used as described in the following sections.
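
As a rough illustration of forward-chaining with total materialisation, the following Python sketch applies rules over triple patterns until no new statements can be inferred. It is not TRREE code: the representation of rules as `(premises, head)` tuples, the `?`-prefixed variables and the function names are assumptions for the example, and none of TRREE's optimisations are shown.

```python
def match(pattern, triple, binding):
    """Unify one triple pattern with one triple.
    Returns the extended binding, or None if they do not match."""
    b = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if b.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return b

def bindings_for(premises, triples, binding=None):
    """Yield every binding that satisfies all premises."""
    binding = binding or {}
    if not premises:
        yield binding
        return
    for triple in triples:
        b = match(premises[0], triple, binding)
        if b is not None:
            yield from bindings_for(premises[1:], triples, b)

def materialise(explicit, rules):
    """Apply all rules repeatedly until a fixpoint is reached."""
    closure = set(explicit)
    while True:
        inferred = set()
        for premises, head in rules:
            for b in bindings_for(premises, closure):
                inferred.add(tuple(b.get(term, term) for term in head))
        if inferred <= closure:
            return closure
        closure |= inferred

RULES = [
    # Transitivity of rdfs:subClassOf
    ([("?a", "rdfs:subClassOf", "?b"), ("?b", "rdfs:subClassOf", "?c")],
     ("?a", "rdfs:subClassOf", "?c")),
    # Instances of a class are instances of its superclasses
    ([("?x", "rdf:type", "?a"), ("?a", "rdfs:subClassOf", "?b")],
     ("?x", "rdf:type", "?b")),
]

EXPLICIT = {
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
    ("ex:rex", "rdf:type", "ex:Dog"),
}

closure = materialise(EXPLICIT, RULES)
```

Here three explicit statements yield a closure of six. Because all inferred statements are stored, a query such as "is ex:rex an ex:Animal?" becomes a simple lookup in `closure`.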

The semantics used is based on R-entailment [37] with the following differences:

  • Free variables in the head of a rule (without a binding in the body) are treated as blank nodes. This feature can be considered 'syntactic sugar';
  • Variable inequality constraints can be specified in the body of the rules, in addition to the triple patterns. This leads to lower complexity as compared to R-entailment;
  • The [cut] operator can be associated with rule premises; the TRREE compiler interprets it like the ! operator in Prolog;
  • Two types of inconsistency checks are supported. Checks without any consequences indicate a consistency violation if the body can be satisfied. Consistency checks with consequences indicate a consistency violation if the inferred statements do not exist in the repository;
  • Axioms can be provided as a set of statements, although those are not modelled as rules with empty bodies.
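
Two of the extensions listed above, inequality constraints and consistency checks without consequences, can be sketched in Python as follows. The dictionary-based rule format and the names `solutions`, `infer` and `violations` are hypothetical and chosen only for this illustration; the actual rule syntax is described in the user guides.

```python
def solutions(premises, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not premises:
        yield binding
        return
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(premises[0], triple):
            if term.startswith("?"):
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:  # no mismatch: continue with the remaining premises
            yield from solutions(premises[1:], triples, candidate)

def holds(binding, inequalities):
    """True if every pair of variables is bound to different values."""
    return all(binding[a] != binding[b] for a, b in inequalities)

def infer(rule, triples):
    """Statements produced by a rule whose body includes inequalities."""
    return {tuple(b.get(term, term) for term in rule["head"])
            for b in solutions(rule["premises"], triples)
            if holds(b, rule["inequalities"])}

def violations(check, triples):
    """A check without consequences is violated whenever its body is satisfied."""
    return [b for b in solutions(check["premises"], triples)
            if holds(b, check["inequalities"])]

FUNCTIONAL = {
    "premises": [("?p", "rdf:type", "owl:FunctionalProperty"),
                 ("?x", "?p", "?y"), ("?x", "?p", "?z")],
    "inequalities": [("?y", "?z")],  # avoids inferring the trivial y sameAs y
    "head": ("?y", "owl:sameAs", "?z"),
}

DISJOINT_CHECK = {
    "premises": [("?c1", "owl:disjointWith", "?c2"),
                 ("?x", "rdf:type", "?c1"), ("?x", "rdf:type", "?c2")],
    "inequalities": [],
}

DATA = {
    ("ex:hasMother", "rdf:type", "owl:FunctionalProperty"),
    ("ex:tom", "ex:hasMother", "ex:ann"),
    ("ex:tom", "ex:hasMother", "ex:anna"),
    ("ex:Cat", "owl:disjointWith", "ex:Dog"),
    ("ex:rex", "rdf:type", "ex:Dog"),
    ("ex:rex", "rdf:type", "ex:Cat"),
}
```

With this data, `infer(FUNCTIONAL, DATA)` returns only the two non-trivial owl:sameAs statements between ex:ann and ex:anna, and `violations(DISJOINT_CHECK, DATA)` reports a single violation for ex:rex.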

Further details of the rule language can be found in the corresponding user guides.
The TRREE can be configured via the rule-sets parameter, which identifies a file containing the entailment rules, consistency checks and axiomatic triples. The implementation of TRREE relies on a compile stage, during which custom rule-sets are compiled into Java code that is further compiled and merged into the inference engine.

The edition of TRREE used in OWLIM-Lite is referred to as 'SwiftTRREE' and performs reasoning and query evaluation in-memory. The edition of TRREE used in OWLIM-SE is referred to as 'BigTRREE' and utilises data structures backed by the file-system. These data structures are organized to allow query optimizations that dramatically improve performance with large datasets, e.g. with one of the standard tests OWLIM-SE evaluates queries against 7 million statements three times faster than OWLIM-Lite, although it takes between two and three times more time to initially load the data.

Comparison of OWLIM-Lite and OWLIM-SE

The two OWLIM editions, OWLIM-Lite and OWLIM-SE, are identical in terms of usage and integration except for a few minor differences in some configuration parameters. The editions differ in which version of the TRREE engine they are based upon, but share the same inference and semantics (rule-compiler, etc.).
OWLIM-Lite is designed for medium data volumes and prototyping. Its key features are:

  • reasoning and query evaluation performed in main memory;
  • persistence strategy that assures data preservation and consistency;
  • extremely fast loading of data (including inference and storage).

OWLIM-SE is suitable for massive volumes of data and heavy query loads. It is designed as an enterprise-grade semantic repository system. It features:

  • file-based indices, which enable it to scale to billions of statements even on desktop machines;
  • inference and query optimizations, which ensure fast query evaluation.


Parameter                   OWLIM-Lite                  OWLIM-SE
Scale                       10 MSt, using 1.6 GB RAM    130 MSt, using 1.6 GB
                            100 MSt, using 16 GB RAM    1068 MSt, using 12 GB
Processing speed            30 KSt/s on notebook        5 KSt/s on notebook
(load+infer+store)          200 KSt/s on server         60 KSt/s on server
Query optimization          No                          Yes
Persistence                 Back-up in N-Triples        Binary data files and indices
Efficient owl:sameAs        No                          Yes
Advanced features           none                        RDF Rank
                                                        Full-text search
                                                        Geo-spatial extension
Licence and Availability    Free-for-use                Commercial.
                                                        Research and evaluation copies provided for free

Table 2 - Comparison between OWLIM-Lite and OWLIM-SE
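
The figures in Table 2 can be turned into rough per-statement numbers. The Python sketch below is simple arithmetic on the values quoted in the table; it assumes that 1 GB is 10^9 bytes, that the OWLIM-SE memory figures also refer to RAM, and that load time scales linearly with the number of statements. None of these assumptions is stated in the table itself.

```python
# Figures copied from Table 2
# (MSt = million statements, KSt/s = thousand statements per second).
SCALE = {
    "OWLIM-Lite": [(10_000_000, 1.6), (100_000_000, 16.0)],
    "OWLIM-SE": [(130_000_000, 1.6), (1_068_000_000, 12.0)],
}
SERVER_SPEED = {"OWLIM-Lite": 200_000, "OWLIM-SE": 60_000}

def ram_bytes_per_statement(statements, ram_gb):
    # Assumption: 1 GB = 10**9 bytes.
    return ram_gb * 10**9 / statements

def load_seconds(statements, statements_per_second):
    # Assumption: throughput stays constant over the whole load.
    return statements / statements_per_second

for edition, rows in SCALE.items():
    for statements, ram_gb in rows:
        print(edition, statements,
              round(ram_bytes_per_statement(statements, ram_gb), 1))

for edition, speed in SERVER_SPEED.items():
    print(edition, round(load_seconds(100_000_000, speed)), "s to load 100 MSt")
```

Under these assumptions, OWLIM-Lite needs about 160 bytes of RAM per statement, while OWLIM-SE needs roughly 11 to 12 bytes of RAM per statement because its indices are kept on disk. Loading 100 MSt on the server would take about 500 s with OWLIM-Lite and about 1667 s with OWLIM-SE.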

Supported Semantics

OWLIM offers several predefined semantics by way of standard rule sets (files), but can also be configured to use custom rule sets with semantics better tuned to the particular domain. The required semantics can be specified through the ruleset for each specific repository instance. Applications that do not need the most expressive supported semantics can choose one of the less complex rule sets, which results in faster inference.

Pre-defined Rule Sets

The pre-defined rule-sets are layered such that each one extends the preceding one. The following list is ordered by increasing expressivity:

  • empty: no reasoning, i.e. OWLIM operates as a plain RDF store;
  • rdfs: supports standard RDFS semantics;
  • owl-horst: OWL dialect close to OWL Horst; the differences are discussed below;
  • owl-max: a combination of most of OWL-Lite with RDFS;
  • owl2-rl: fully conformant OWL2 RL profile [44] except for D-Entailment, i.e. reasoning about data types.

Custom Rule-Sets

OWLIM has an internal rule compiler that can be used to configure the TRREE with a custom set of inference rules and axioms. The user may define a custom rule-set in a *.pie file (e.g. MySemantics.pie). The easiest way to do this is to start modifying one of the .pie files that were used to build the precompiled rule-sets – all pre-defined .pie files are included in the distribution. The syntax of the .pie files is easy to follow.

OWL Compliance

Regarding OWL compliance, OWLIM supports several OWL-like dialects: OWL Horst [37] (owl-horst), OWL Max (owl-max), which covers most of OWL-Lite and RDFS, OWL2 QL (owl2-ql) and OWL2 RL (owl2-rl).

With the owl-max rule-set OWLIM supports the following semantics:

  • full RDFS semantics without constraints or limitations, apart from the entailments related to typed literals (known as D-entailment). For instance, meta-classes (and any arbitrary mixture of class, property, and individual) can be combined with the supported OWL semantics;
  • most of OWL-Lite;
  • all of OWL DLP.

The differences between OWL Horst [37] and the OWL dialects supported by OWLIM (owl-horst and owl-max) can be summarized as follows:

  • OWLIM does not provide the extended support for typed literals, introduced with the D*-entailment extension of the RDFS semantics. Although such support is conceptually clear and easy to implement, it is our understanding that the performance penalty is too high for most applications. One can easily implement the rules defined for this purpose by ter Horst and add them to a custom rule-set;
  • There are no inconsistency rules by default;
  • A few more OWL primitives are supported by OWLIM (rule-set owl-max). These are listed in the OWLIM User Guides;
  • There is extended support for schema-level (T-Box) reasoning in OWLIM.

Even though the concrete rules pre-defined in OWLIM differ from those defined in OWL Horst, the complexity and decidability results reported for R-entailment are relevant for TRREE and OWLIM. To put it more precisely, the rules in the owl-horst rule-set do not introduce new B-Nodes, which means that R-entailment with respect to them takes polynomial time. In KR terms, this means that the owl-horst inference within OWLIM is tractable.
Inference using owl-horst is of lower complexity than other formalisms that combine DL formalisms with rules. In addition, it imposes no constraints on meta-modelling.

The correctness of the support for OWL semantics (for those primitives which are supported) is checked against the normative Positive- and Negative-entailment OWL test cases [7]. These tests are provided in the OWLIM distribution and documented in the OWLIM user guides.
