CAESAR

STEP/EPISTLE and the RDF

A discussion of the relationship

This Version:
http://www.cedarlon.demon.co.uk/epistle/epistle-rdf-note-2001-07-17.html
Author:
David Leal, CAESAR Systems

Foreword

The Resource Description Framework (RDF) is a key part of the 'Semantic Web' [BERNERS-LEE98]. It enables a vocabulary to be defined, and information to be recorded with respect to that vocabulary. By the use of standard vocabularies, information on the web can be made computer interpretable.

The European Process Industry STEP Technical Liaison Executive (EPISTLE) is an association devoted to the development of STEP (STandard for the Exchange of Product data) for use within the process industry. STEP is the name given to the family of standards developed by ISO TC184/SC4.

EPISTLE has developed a generic information model for the exchange of product data in EXPRESS, which can be used in conjunction with a Reference Data Library (or vocabulary).

This document:

  1. shows that the RDF and EPISTLE approaches are similar and compatible;
  2. shows that the RDF implementation using XML is much more flexible than anything available for the implementation of EXPRESS models; and
  3. argues that RDF should be adopted as an implementation route for EPISTLE, and that the EPISTLE RDL should be published as an RDF vocabulary.


1. Introduction

1.1 Two communities

The Web and STEP have different origins:

Web
hypertext links between person readable documents, which may be stored on different computers;
STEP
the exchange of product data files between different CAD or CAE applications.

Both communities are moving beyond this original scope:

Web
computer readable as well as person readable data;
STEP/EPISTLE
engineering databases of broad scope which may exist throughout the life of a product.

NOTE 1 - The extension of the STEP scope has been pioneered by EPISTLE. Related approaches are now being followed by other projects, such as PLCS (Product Life Cycle Support).

The two communities use different technologies:

Semantic Web
The base technology for the 'semantic web' is XML. An XML document is a hierarchical (parent and child) structure of elements. Meaning is conveyed by the type of an element.
STEP
The base technology for STEP is EXPRESS. A STEP exchange file is a sequence of records, which are instances of entity types in an EXPRESS schema. Meaning is conveyed by the type of an entity.

NOTE 2 - XML, like HTML, is derived from SGML. In HTML the meaning of an element is restricted to the role of the element within a document and to the presentation of the element to a person.

NOTE 3 - EXPRESS is very similar to UML, and in most cases conversion between the two languages is trivial. EXPRESS is stronger than UML in defining data constraints. UML has the object oriented concepts of interfaces and methods, which are not present in EXPRESS.

NOTE 4 - There is an algorithm for the representation of data defined by EXPRESS as an XML file. Just like XMI (the equivalent algorithm for UML), this gives a messy and complicated result.

Both communities have realised that a 'suck it and see' approach to the assignment of meaning to elements/entities is inadequate for this new scope. Instead, more formal methods based upon vocabularies are required. RDF and EPISTLE have defined similar and compatible ways of extending the base technology, as explained below.

NOTE 5 - There is an important difference between the web and STEP - the web user base is many orders of magnitude larger than the STEP user base. The web can ignore STEP, but if STEP ignores the web it is dead.

1.2 Formal and informal XML

There are two approaches to defining meaning within an XML document:

informal
The meaning of an XML element type is defined by natural language text in a person readable document.

The meaning of the child elements is also defined in natural language text. Constraints on the specification of the child elements are made computer interpretable by an XML DTD.

meta-data
The meaning of an XML element type is defined within a vocabulary, which is itself an XML document. This vocabulary contains a combination of natural language text and computer interpretable relationships.

The meaning of the child elements is also defined in the vocabulary. Each child element is understood as a statement made about the parent.

NOTE 1 - The use of 'informal' here means without a definition expressed in a computer interpretable language. The definition may be quite formal and precise to a person.

Most XML documents today use the informal approach.

RDF is an alternative approach based upon a vocabulary. The use of standard vocabularies will make the Web computer interpretable.

1.3 Formal and informal EXPRESS

An EXPRESS schema defines entity types, and their attributes. An attribute can be another entity type or a literal (string, number etc.).

There are two approaches to defining meaning within an EXPRESS implementation:

informal
The meaning of each entity type is defined by natural language text in a person readable document.

The meaning of the attributes is also defined in natural language text. Constraints on the attributes are part of the EXPRESS schema.

Additional meaning, beyond the scope of the schema, is added by text attributes.

meta-data
The EXPRESS schema is kept small, and limited to a few generic concepts, such as thing, class, relation, etc.

The precise meaning of an entity instance is defined by a classification relationship with a standard instance of class.
The standard instances of class are identified and described using the same EXPRESS schema. The set of standard instances of class is a vocabulary or Reference Data Library (RDL).

Most EXPRESS based standards today use the informal approach. This approach requires a new schema for each application.

EPISTLE defines an alternative approach based upon a small generic schema used in conjunction with a Reference Data Library.

2. RDF

2.1 RDF statement

The Resource Description Framework (RDF) Model and Syntax [RDFMS] specifies a method for recording statements or propositions. A statement is an expression that evaluates to give a Boolean value.

NOTE 1 - It is traditional to convey information by recording a set of statements, and asserting that each evaluates to 'true'. The alternative is possible but silly.

NOTE 2 - Within the EPISTLE team, the word 'fact' has been used as a synonym for 'statement'.

A statement has three parts:

subject
a thing to which the predicate is applied
predicate
a function which has the subject within its domain
object
a thing within the range of the function

A statement takes the value 'true' if the application of the predicate to the subject gives the object.

NOTE 3 - In practice things are a bit messy. A predicate can be a multi-valued function (i.e. a mapping that is not a true function). In this case, the statement evaluates to true if the object is one of the values that is given by the application of the 'function'.
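The evaluation rule described above can be sketched in a few lines of Python. This is only an illustration, not part of RDF: the names `PREDICATES` and `holds` are hypothetical, and a predicate is assumed to be stored as a mapping from a subject to the set of values it yields, so that a multi-valued 'function' needs no special treatment.

```python
# Hypothetical sketch: a predicate is modelled as a mapping from a subject
# to the set of values it yields (one value for a true function, several
# for a multi-valued mapping).
PREDICATES = {
    "name": {"jbc": {"Joe Bloggs and Co."}},
    "companySites": {"jbc": {"Gas", "Sticks"}},
}


def holds(subject, predicate, obj):
    """Return True if applying the predicate to the subject gives the object."""
    values = PREDICATES.get(predicate, {}).get(subject, set())
    return obj in values


print(holds("jbc", "name", "Joe Bloggs and Co."))  # single-valued predicate
print(holds("jbc", "companySites", "Gas"))         # multi-valued predicate
print(holds("jbc", "companySites", "Elsewhere"))   # object not among the values
```

The same test covers both cases because a true function is simply a mapping whose value set has exactly one member.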

NOTE 4 - A function, or mapping, can be derived from a set theory relationship, which is not a function or mapping.

Classification is not a mapping. However, there is a corresponding classification mapping that consists of all (thing, class) pairs, where the thing is a member of the class.

The relationship between a thing that has a mass of 10 kg and the class 10 kg is classification. 10 kg is a mass value within the 1D space of mass values. There is a corresponding mass function that consists of all (thing, mass value) pairs, where the thing is a member of the mass value.
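As a rough sketch of how a mapping can be derived from the classification relationship, the Python below stores classification as a set of (thing, class) pairs and restricts it to mass values to obtain a mass function. The identifiers `pump_7` and `vessel_2`, the class names and the function names are all invented for illustration.

```python
# Hypothetical data: classification recorded as (thing, class) pairs.
CLASSIFICATION = {
    ("pump_7", "10 kg"),
    ("pump_7", "Pump"),
    ("vessel_2", "20 kg"),
}

# The classes that are values in the 1D space of mass values (assumed).
MASS_VALUES = frozenset({"10 kg", "20 kg"})


def classes_of(thing):
    """The classification mapping: every class the thing is a member of."""
    return {cls for (member, cls) in CLASSIFICATION if member == thing}


def mass_of(thing):
    """The mass function: classification restricted to mass values."""
    return classes_of(thing) & MASS_VALUES


print(classes_of("pump_7"))
print(mass_of("pump_7"))
```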

The object of a statement can be:

The RDF Model and Syntax specifies a format for the following (and little else):

2.2 RDF example

Consider the recording of the following information about a company:

2.2.1 Informal approach

An informal approach to the recording of the information is as follows:

<companyOrganisation
  xmlns="http://www.joe_bloggs.com/organisation/schema#">
  <Company ID="jbc">
    <name>Joe Bloggs and Co.</name>
    <companySites>
      <Site id="Gas">
        <name>Gasworks Road</name>
      </Site>
      <Site id="Sticks">
        <name>Out-in-the-sticks Park</name>
      </Site>
    </companySites>
    <companyDepartments>
      <Department id="stress">
        <name>Stress office</name>
        <departmentSite>
          <Site id="Gas"/>
        </departmentSite>
      </Department>
      <Department id="design">
        <name>Design office</name>
        <departmentSite>
          <Site id="Sticks"/>
        </departmentSite>
      </Department>
    </companyDepartments>   
  </Company>  
</companyOrganisation>

NOTE 1 - It is necessary to give the sites IDs so that they can be defined in one place and referenced from elsewhere.

This document can be understood if there are text definitions for the different element types.

2.2.2 Formal approach

The information can be regarded as a number of statements. Each of these statements has a predicate that is one of the following functions:

name
which can be applied to any object and evaluates to give a text literal;
companySites
which can be applied to a company and evaluates to give a set of sites;
companyDepartments
which can be applied to a company and evaluates to give a set of departments;
departmentSite
which can be applied to a department and evaluates to give a single site.

NOTE 1 - In the RDF, a function or mapping that can be used as a predicate is called a property.

The classes that are not functions are Company, Department and Site.

NOTE 2 - The usual naming convention has been followed in which:
- classes that are not properties begin with a capital letter;
- classes that are properties begin with a lower case letter.

A reworking of the example, which uses the RDF syntax to make the nature of the relationships clear, is as follows:

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://www.joe_bloggs.com/organisation/schema.rdf#">
  <Company rdf:ID="jbc">
    <name>Joe Bloggs and Co.</name>
    <companySites>
      <rdf:Bag> 
        <rdf:li>
          <Site rdf:ID="Gas">
            <name>Gasworks Road</name>
          </Site>
        </rdf:li>
        <rdf:li>
          <Site rdf:ID="Sticks">
            <name>Out-in-the-sticks Park</name>
          </Site>
        </rdf:li> 
      </rdf:Bag> 
    </companySites>
    <companyDepartments>
      <rdf:Bag> 
        <rdf:li>
          <Department rdf:ID="stress">
            <name>Stress office</name>
            <departmentSite>
              <Site rdf:about="#Gas"/>
            </departmentSite>
          </Department>
        </rdf:li>
        <rdf:li>
          <Department rdf:ID="design">
            <name>Design office</name>
            <departmentSite>
              <Site rdf:about="#Sticks"/>
            </departmentSite>
          </Department>
        </rdf:li> 
      </rdf:Bag> 
    </companyDepartments>   
  </Company>  
</rdf:RDF>

The changes between the two examples are:

2.2.3 Flat or hierarchical document

An RDF document can be either:

hierarchical
This is shown in the example in section 2.2.2. The subject of each statement is the parent element. This type of document is easy to transform into a convenient HTML document using XSLT.
flat
This is shown below. With this format, each statement is an independent object, as in KIF [KIF] or MathML [MathML].

The following example shows that a statement about companySites need not be a child element of Company.

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://www.joe_bloggs.com/organisation/schema.rdf#">
  <Company rdf:ID="jbc">
    <name>Joe Bloggs and Co.</name>
  </Company>

  <Site rdf:ID="Gas">
    <name>Gasworks Road</name>
  </Site>

  <Site rdf:ID="Sticks">
    <name>Out-in-the-sticks Park</name>
  </Site>

  <rdf:Description rdf:about="#jbc">
    <companySites>
      <rdf:Bag> 
        <rdf:li rdf:resource="#Gas" />
        <rdf:li rdf:resource="#Sticks" />
      </rdf:Bag>
    </companySites> 
  </rdf:Description>
</rdf:RDF>

NOTE 1 - A hierarchical RDF document would be a very effective publication route for the EPISTLE RDL. A flat RDF document is very similar in structure to an exchange file for the EPISTLE Core Model.
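The relationship between the hierarchical and flat forms can be sketched in Python. The nested dictionary layout and the function `flatten` are assumptions made for this illustration only. As a simplification, the set of sites is emitted as repeated statements rather than as a single rdf:Bag.

```python
# Hypothetical nested description: each node has an id, a class and
# properties whose values are literals, nested nodes, or lists of either.
company = {
    "id": "jbc",
    "type": "Company",
    "props": {
        "name": "Joe Bloggs and Co.",
        "companySites": [
            {"id": "Gas", "type": "Site", "props": {"name": "Gasworks Road"}},
            {"id": "Sticks", "type": "Site",
             "props": {"name": "Out-in-the-sticks Park"}},
        ],
    },
}


def flatten(node, out=None):
    """Walk the hierarchy and emit independent (subject, predicate, object) statements."""
    if out is None:
        out = []
    out.append((node["id"], "type", node["type"]))
    for predicate, value in node["props"].items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                # A nested node becomes a reference plus its own statements.
                out.append((node["id"], predicate, item["id"]))
                flatten(item, out)
            else:
                out.append((node["id"], predicate, item))
    return out


for statement in flatten(company):
    print(statement)
```

In the hierarchical form the subject is implied by the parent; in the flat form every statement names its subject explicitly, which is why the flat form resembles a sequence of exchange file records.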

RDF supports an even more formal rendering of the statement, as follows:

<rdf:Statement>
  <rdf:subject rdf:resource="#jbc" />
  <rdf:predicate rdf:resource="#companySites" />
  <rdf:object>
    <rdf:Bag> 
      <rdf:li rdf:resource="#Gas" />
      <rdf:li rdf:resource="#Sticks" />
    </rdf:Bag>
  </rdf:object> 
</rdf:Statement>

There is a very simple XSLT transformation in both directions, between this and the equivalent MathML:

<reln>
  <eq/>
  <apply>
    <ci>companySites</ci>
    <ci>jbc</ci>
  </apply>
  <set> 
    <ci>Gas</ci>
    <ci>Sticks</ci>
  </set>
</reln>
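The text above refers to an XSLT transformation. As a rough illustration of one direction of the same mapping, written here in Python with the standard `xml.etree.ElementTree` module instead of XSLT, the sketch below converts the rdf:Statement into the MathML form. The function names are hypothetical, and the sketch assumes that the object of the statement is always an rdf:Bag of resource references.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

STATEMENT = f"""
<rdf:Statement xmlns:rdf="{RDF}">
  <rdf:subject rdf:resource="#jbc" />
  <rdf:predicate rdf:resource="#companySites" />
  <rdf:object>
    <rdf:Bag>
      <rdf:li rdf:resource="#Gas" />
      <rdf:li rdf:resource="#Sticks" />
    </rdf:Bag>
  </rdf:object>
</rdf:Statement>
""".strip()


def referenced_name(element):
    """Fragment identifier in the element's rdf:resource attribute, without '#'."""
    return element.get(f"{{{RDF}}}resource").lstrip("#")


def statement_to_mathml(xml_text):
    """Build the MathML <reln> equivalent of a statement whose object is a Bag."""
    stmt = ET.fromstring(xml_text)
    reln = ET.Element("reln")
    ET.SubElement(reln, "eq")
    # predicate(subject) on the left-hand side of the equation
    apply_ = ET.SubElement(reln, "apply")
    ET.SubElement(apply_, "ci").text = referenced_name(stmt.find(f"{{{RDF}}}predicate"))
    ET.SubElement(apply_, "ci").text = referenced_name(stmt.find(f"{{{RDF}}}subject"))
    # the members of the Bag on the right-hand side
    members = ET.SubElement(reln, "set")
    for li in stmt.iter(f"{{{RDF}}}li"):
        ET.SubElement(members, "ci").text = referenced_name(li)
    return reln


print(ET.tostring(statement_to_mathml(STATEMENT), encoding="unicode"))
```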

2.3 RDF schema example

The class and property element types in an RDF document indicate the meaning. The Resource Description Framework (RDF) Schema [RDFS] defines an RDF document that specifies:

Each of the example RDF documents above references the RDF schema: http://www.joe_bloggs.com/organisation/schema.rdf. This documents the nature of the classes and properties in the example, as follows:

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">

  <rdfs:Class rdf:ID="Company">
    <rdfs:comment>a trading organisation</rdfs:comment>
    <rdfs:subClassOf
      rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource" />
  </rdfs:Class>

  <rdfs:Class rdf:ID="Department">
    <rdfs:comment>a part of an organisation</rdfs:comment>
    <rdfs:subClassOf
      rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource" />
  </rdfs:Class>

  <rdfs:Class rdf:ID="Site">
    <rdfs:comment>area of land</rdfs:comment>
    <rdfs:subClassOf
      rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource" />
  </rdfs:Class>

  <rdf:Property rdf:ID="name">
    <rdfs:comment>person memorable identification</rdfs:comment>
    <rdfs:domain
      rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource" />
    <rdfs:range
      rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal" />
  </rdf:Property>

  <rdf:Property rdf:ID="companySites">
    <rdfs:comment>sites at which a company operates</rdfs:comment>
    <rdfs:domain rdf:resource="#Company" />
    <rdfs:range
      rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag" />
  </rdf:Property>

  <rdf:Property rdf:ID="companyDepartments">
    <rdfs:comment>parts of a company</rdfs:comment>
    <rdfs:domain rdf:resource="#Company" />
    <rdfs:range
      rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag" />
  </rdf:Property>

  <rdf:Property rdf:ID="departmentSite">
    <rdfs:comment>site of a department</rdfs:comment>
    <rdfs:domain rdf:resource="#Department" />
    <rdfs:range rdf:resource="#Site" />
  </rdf:Property>
</rdf:RDF>

NOTE 1 - A 'resource' is the RDF word for thing. Hence the class Resource is the set of all things.

NOTE 2 - The subClassOf property can be used to create hierarchies, such as: Resource > Organization > Company.
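One use of such a schema is to check statements against the declared domain and range of a property. The Python below is a minimal sketch of that idea under simplifying assumptions: the schema is copied by hand into dictionaries, anything without a declared type is treated as a literal, and the Bag-valued properties are left out. None of the names come from the RDF specifications.

```python
# Hypothetical in-memory copy of part of the example schema.
SUBCLASS_OF = {"Company": "Resource", "Department": "Resource", "Site": "Resource"}

# property -> (domain, range)
PROPERTIES = {
    "name": ("Resource", "Literal"),
    "departmentSite": ("Department", "Site"),
}

# Declared classes of the resources in the example data.
TYPES = {"jbc": "Company", "stress": "Department", "Gas": "Site"}


def is_a(cls, ancestor):
    """Follow subClassOf links upwards from cls looking for ancestor."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def type_of(value):
    """Typed resources are looked up in TYPES; anything else is treated as a literal."""
    return TYPES.get(value, "Literal")


def conforms(subject, predicate, obj):
    """Check a statement against the domain and range of its property."""
    domain, range_ = PROPERTIES[predicate]
    return is_a(type_of(subject), domain) and is_a(type_of(obj), range_)


print(conforms("stress", "departmentSite", "Gas"))  # department to site
print(conforms("jbc", "departmentSite", "Gas"))     # a company is not a department
print(conforms("Gas", "name", "Gasworks Road"))     # any resource can have a name
```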

3. EPISTLE

3.1 Core model and reference data library

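EPISTLE has two strands: the EPISTLE Core Model (ECM) [ECM], which is a generic information model defined in EXPRESS, and the EPISTLE Reference Data Library (ERDL) [ERDL].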

The ERDL consists of standard instances of entities in the ECM. The ERDL can be extended to support different engineering applications.

NOTE 1 - It is possible to absorb the vast majority of the ECM into the ERDL, leaving the ECM consisting of only a handful of axiomatic entities. However at present, the ECM exists independently of the ERDL.

3.2 EPISTLE example

The RDF was explained by presenting an example in three stages:

  1. an informal XML document;
  2. an RDF document;
  3. the corresponding RDF schema.

The corresponding approach for EPISTLE will consist of:

  1. an informal EXPRESS schema and instantiation;
  2. an instantiation of the ECM using reference data;
  3. the ERDL reference data for the example, as an ECM instantiation.

3.2.1 Informal approach

An informal EXPRESS schema is as follows:

[Figure: An informal EXPRESS schema]

The instantiation of this schema for the example is as follows:

#1 = COMPANY (
      'Joe Bloggs and Co.',
      (#2, #3),
      (#4, #5));

#2 = SITE ('Gasworks Road');
#3 = SITE ('Out-in-the-sticks Park');

#4 = DEPARTMENT ('Stress office', #2);
#5 = DEPARTMENT ('Design office', #3);
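For readers more familiar with programming languages than with EXPRESS, the informal approach corresponds roughly to the Python data classes below. The attribute names `sites`, `departments` and `site` are assumptions inferred from the order of the values in the instantiation above, since the schema figure is not reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Site:
    name: str


@dataclass
class Department:
    name: str
    site: Site  # exactly one site per department


@dataclass
class Company:
    name: str
    sites: List[Site]
    departments: List[Department]


# The same five instances as in the exchange file (#1 to #5).
gas = Site("Gasworks Road")                    # corresponds to #2
sticks = Site("Out-in-the-sticks Park")        # corresponds to #3
stress = Department("Stress office", gas)      # corresponds to #4
design = Department("Design office", sticks)   # corresponds to #5
jbc = Company("Joe Bloggs and Co.", [gas, sticks], [stress, design])  # #1

print(jbc.departments[0].site.name)
```

As with the informal EXPRESS schema, the meaning of each class and attribute exists only in its name and in any accompanying documentation; a new application needs new classes.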

3.2.2 Formal approach

A simplification of the EPISTLE Core Model, containing parts relevant to the example, is as follows:

[Figure: A simplification of the EPISTLE schema]

The instantiation of this schema for the example is as follows:

#1 = INDIVIDUAL ();
#101 = CLASSIFICATION (#1, #9001);
#102 = CLASS_OF_IDENTIFICATION (#1, #103);
#103 = EXPRESS_STRING ('Joe Bloggs and Co.');

#2 = INDIVIDUAL ();
#201 = CLASSIFICATION (#2, #9002);
#202 = CLASS_OF_IDENTIFICATION (#2, #203);
#203 = EXPRESS_STRING ('Gasworks Road');

#3 = INDIVIDUAL ();
#301 = CLASSIFICATION (#3, #9002);
#302 = CLASS_OF_IDENTIFICATION (#3, #303);
#303 = EXPRESS_STRING ('Out-in-the-sticks Park');

#4 = INDIVIDUAL ();
#401 = CLASSIFICATION (#4, #9003);
#402 = CLASS_OF_IDENTIFICATION (#4, #403);
#403 = EXPRESS_STRING ('Stress office');

#5 = INDIVIDUAL ();
#501 = CLASSIFICATION (#5, #9003);
#502 = CLASS_OF_IDENTIFICATION (#5, #503);
#503 = EXPRESS_STRING ('Design office');

#6 = OTHER_RELATION (#1, #2);
#601 = CLASSIFICATION (#6, #9004);
#7 = OTHER_RELATION (#1, #3);
#701 = CLASSIFICATION (#7, #9004);
#8 = OTHER_RELATION (#1, #4);
#801 = CLASSIFICATION (#8, #9005);
#9 = OTHER_RELATION (#1, #5);
#901 = CLASSIFICATION (#9, #9005);
#10 = OTHER_RELATION (#2, #4);
#1001 = CLASSIFICATION (#10, #9006);
#11 = OTHER_RELATION (#3, #5);
#1101 = CLASSIFICATION (#11, #9006);

The types of individual and types of relation are indicated by references to instances #9001 to #9006. These are standard instances of class.

NOTE 1 - The instantiation of the EPISTLE schema is much less readable than the instantiation of the informal schema. This is a drawback of the EPISTLE approach, which has impeded its adoption. The RDF does not have this drawback.
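The readability problem is largely one of gathering the facts that are scattered around each record. The Python below is a hedged sketch of how an application might do this for a few of the records above. The `RECORDS` dictionary and the function names are invented for the illustration and do not belong to any STEP toolkit.

```python
# Hypothetical in-memory form of a few records from the instantiation above:
# record number -> (entity type, attribute values).
RECORDS = {
    1: ("INDIVIDUAL", ()),
    101: ("CLASSIFICATION", (1, 9001)),
    102: ("CLASS_OF_IDENTIFICATION", (1, 103)),
    103: ("EXPRESS_STRING", ("Joe Bloggs and Co.",)),
    2: ("INDIVIDUAL", ()),
    201: ("CLASSIFICATION", (2, 9002)),
    202: ("CLASS_OF_IDENTIFICATION", (2, 203)),
    203: ("EXPRESS_STRING", ("Gasworks Road",)),
    6: ("OTHER_RELATION", (1, 2)),
    601: ("CLASSIFICATION", (6, 9004)),
}


def classes_of(record_id):
    """Record numbers of the classes that classify the given record."""
    return [args[1] for kind, args in RECORDS.values()
            if kind == "CLASSIFICATION" and args[0] == record_id]


def name_of(record_id):
    """Text identifier attached to the record, or None if it has none."""
    for kind, args in RECORDS.values():
        if kind == "CLASS_OF_IDENTIFICATION" and args[0] == record_id:
            return RECORDS[args[1]][1][0]
    return None


def describe(record_id):
    """Gather the facts about one record into a single dictionary."""
    return {"id": record_id, "name": name_of(record_id),
            "classes": classes_of(record_id)}


print(describe(1))
print(describe(6))
```

Every question about an object requires a search through the other records, whereas in a hierarchical RDF document the same facts are already grouped under their subject.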

3.3 Example RDL

The standard instances of class for the example can be held as instances of the ECM, as follows:

#9001 = CLASS ();
#9012 = CLASS_OF_IDENTIFICATION (#9001, #9013);
#9013 = EXPRESS_STRING ('company');

#9002 = CLASS ();
#9022 = CLASS_OF_IDENTIFICATION (#9002, #9023);
#9023 = EXPRESS_STRING ('site');

#9003 = CLASS ();
#9032 = CLASS_OF_IDENTIFICATION (#9003, #9033);
#9033 = EXPRESS_STRING ('department');

#9004 = OTHER_CLASS_OF_RELATION (#9001, #9002);
#9042 = CLASS_OF_IDENTIFICATION (#9004, #9043);
#9043 = EXPRESS_STRING ('site of company');

#9005 = OTHER_CLASS_OF_RELATION (#9001, #9003);
#9052 = CLASS_OF_IDENTIFICATION (#9005, #9053);
#9053 = EXPRESS_STRING ('department of company');

#9006 = OTHER_CLASS_OF_RELATION (#9002, #9003);
#9062 = CLASS_OF_IDENTIFICATION (#9006, #9063);
#9063 = EXPRESS_STRING ('site of department');

NOTE 1 - Each standard instance of class can have a person readable text description. This has been omitted for simplicity.

NOTE 2 - Each standard instance of other_class_of_relation can have cardinalities specified. For example, a company can have one or more sites; and a department (for the purpose of this example) has exactly one site.
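A cardinality rule of this kind could be checked along the following lines. This is a sketch under stated assumptions: the rule table, the string labels and the function `violations` are hypothetical and are not taken from the ECM, which records cardinalities in its own way.

```python
from collections import Counter

# Hypothetical cardinality rules: class of relation -> (min, max) number of
# relations in which one object may play the counted role.
# A max of None means unbounded.
CARDINALITY = {
    "site of company": (1, None),   # a company has one or more sites
    "site of department": (1, 1),   # a department has exactly one site
}

# (class of relation, object whose relations are counted)
RELATIONS = [
    ("site of company", "jbc"),
    ("site of company", "jbc"),
    ("site of department", "stress"),
    ("site of department", "design"),
]


def violations(relations, objects_by_class):
    """List (class of relation, object, count) entries that break a rule."""
    counts = Counter(relations)
    problems = []
    for rel_class, (low, high) in CARDINALITY.items():
        for obj in objects_by_class.get(rel_class, []):
            n = counts[(rel_class, obj)]
            if n < low or (high is not None and n > high):
                problems.append((rel_class, obj, n))
    return problems


objects = {
    "site of company": ["jbc"],
    "site of department": ["stress", "design"],
}
print(violations(RELATIONS, objects))
```

Note that such a check only counts the relations that happen to be present; it cannot establish that the recorded sites are the complete set.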

NOTE 3 - It is difficult using EPISTLE to say that a company has a set of sites, and then to specify all the members of that set. The usual EPISTLE approach is looser - one or more company site relations are specified. It is not possible, using the current EPISTLE model, to enumerate the complete membership of a set.

The example makes reference between the data (about Joe Bloggs and Co.) and the meta-data in the RDL using record numbers, e.g. #9001. This approach works for an implementation as a single file or within a single database. If the data is in one file, and the RDL is in another, then the referencing mechanism is more complicated.

NOTE 4 - The Web is very good at references. EPISTLE struggles by comparison.

4. Conclusions

The RDF is as flexible and generic as EPISTLE, so EPISTLE will lose none of its generality if it is implemented as an RDF application.

The RDF is much easier to implement than EPISTLE for some applications. Its great strengths are:

NOTE 1 - A hierarchical document solves the EPISTLE 'cloud and nail' problem. An object in EPISTLE is a 'nail', which has a 'cloud' of facts about it. A parent/child hierarchy can link the cloud to the nail for presentation to a user.

The publication of the EPISTLE RDL as an RDF document is a priority. This is a convenient and general approach to publication, which will gain EPISTLE a new market.

5. References

[BERNERS-LEE98]
What the Semantic Web can represent, Tim Berners-Lee, 1998
http://www.w3.org/DesignIssues/RDFnot.html
[RDFMS]
Resource Description Framework (RDF) Model and Syntax, W3C Recommendation, 22 February 1999
http://www.w3.org/TR/1999/REC-rdf-syntax-19990222
[RDFS]
Resource Description Framework (RDF) Schema Specification 1.0, W3C Candidate Recommendation, 27 March 2000
http://www.w3.org/TR/2000/CR-rdf-schema-20000327
[ECM]
EPISTLE Core Model version 4, working draft of ISO 15926-2, 2001-03-07
http://www.iso15926.org/ecm_400/index.html
[ERDL]
EPISTLE Reference Data Library, working draft of ISO 15926-4
http://www.stepcom.ncl.ac.uk/epistle/erdl1.htm
[KIF]
Knowledge Interchange Format (KIF)
[MathML]
Mathematical Markup Language (MathML)