STEP/EPISTLE and the RDF
Foreword
The Resource Description Framework (RDF) is a key part of the 'Semantic Web'
[BERNERS-LEE98]. It enables a vocabulary to be
defined, and information to be recorded with respect to that vocabulary. By the
use of standard vocabularies, information on the web can be computer
interpretable.
The European Process Industry STEP Technical Liaison Executive (EPISTLE) is
an association devoted to the development of STEP (STandard for the Exchange of
Product data) for use within the process industry. STEP is the name given to
the family of standards developed by ISO TC184/SC4.
EPISTLE has developed a generic information model for the exchange of
product data in EXPRESS, which can be used in conjunction with a Reference Data
Library (or vocabulary).
This document:
- shows that the RDF and EPISTLE approaches are similar and compatible;
- shows that the RDF implementation using XML is much more flexible than
anything available for the implementation of EXPRESS models; and
- argues that RDF should be adopted as an implementation route for EPISTLE, and
that the EPISTLE RDL should be published as an RDF vocabulary.
1. Introduction
1.1 Two communities
The Web and STEP have different origins:
- Web
- hypertext links between person readable documents, which may be stored on
different computers;
- STEP
- the exchange of product data files between different CAD or CAE
applications.
Both communities are moving beyond this original scope:
- Web
- computer readable as well as person readable data;
- STEP/EPISTLE
- engineering databases of broad scope which may exist throughout the life of
a product.
NOTE 1 - The extension of the STEP scope has been pioneered
by EPISTLE. Related approaches are now being followed by other projects, such
as PLCS (Product Life Cycle Support).
The two communities use different technologies:
- Semantic Web
- The base technology for the 'semantic web' is XML. An XML document is a
hierarchical (parent and child) structure of elements. Meaning is conveyed by
the type of an element.
- STEP
- The base technology for STEP is EXPRESS. A STEP exchange file is a sequence
of records, which are instances of entity types in an EXPRESS schema. Meaning
is conveyed by the type of an entity.
NOTE 2 - XML generalises the element markup familiar from HTML; both derive
from SGML. In HTML the meaning of an element is restricted to the role of the
element within a document and to the presentation of the element to a person.
NOTE 3 - EXPRESS is very similar to UML, and in most cases
conversion between the two languages is trivial. EXPRESS is stronger than UML
in defining data constraints. UML has the object oriented concepts of
interfaces and methods, which are not present in EXPRESS.
NOTE 4 - There is an algorithm for the representation of
data defined by EXPRESS as an XML file. Just like XMI (the equivalent algorithm
for UML), this gives a messy and complicated result.
Both communities have realised that a 'suck it and see' approach to the
assignment of meaning to elements/entities is inadequate for this new scope.
Instead more formal methods based upon vocabularies are required. RDF and
EPISTLE have defined similar and compatible ways of extending the base
technology, as explained below.
NOTE 5 - There is an important difference between the web
and STEP - the web user base is many orders of magnitude larger than the STEP
user base. The web can ignore STEP, but if STEP ignores the web it is dead.
1.2 Formal and informal XML
There are two approaches to defining meaning within an XML document:
- informal
- The meaning of an XML element type is defined by natural language text in a
person readable document.
The meaning of the child elements is also defined in natural language text.
Constraints on the specification of the child elements are made computer
interpretable by an XML DTD.
- meta-data
- The meaning of an XML element type is defined within a vocabulary, which is
itself an XML document. This vocabulary contains a combination of natural
language text and computer interpretable relationships.
The meaning of the child elements is also defined in the vocabulary. Each child
element is understood as a statement made about the parent.
NOTE 1 - The use of 'informal' here means without a
definition expressed in a computer interpretable language. The definition may
be quite formal and precise to a person.
Most XML documents today use the informal approach.
RDF is an alternative approach based upon a vocabulary. The use of standard
vocabularies will make the Web computer interpretable.
1.3 Formal and informal EXPRESS
An EXPRESS schema defines entity types, and their attributes. An attribute
can be another entity type or a literal (string, number etc.).
There are two approaches to defining meaning within an EXPRESS
implementation:
- informal
- The meaning of each entity type is defined by natural language text in a
person readable document.
The meaning of the attributes is also defined in natural language text.
Constraints on the attributes are part of the EXPRESS schema.
Additional meaning, beyond the scope of the schema, is added by text
attributes.
- meta-data
- The EXPRESS schema is kept small, and limited to a few generic concepts,
such as thing, class, relation, etc.
The precise meaning of an entity instance is defined by a classification
relationship with a standard instance of class.
The standard instances of class are identified and described using the same
EXPRESS schema. The set of standard instances of class is a vocabulary or
Reference Data Library (RDL).
|
Most EXPRESS based standards today use the informal approach. This approach
requires a new schema for each application.
EPISTLE defines an alternative approach based upon:
- a small schema which is independent of application; and
- the use of application specific RDLs.
2. RDF
2.1 RDF statement
The Resource Description Framework (RDF) Model and Syntax [RDFMS] specifies a method for recording statements or
propositions. A statement is an expression that evaluates to give a Boolean
value.
NOTE 1 - It is traditional to convey information by
recording a set of statements, and asserting that each evaluates to 'true'. The
alternative is possible but silly.
NOTE 2 - Within the EPISTLE team, the word 'fact' has been
used as a synonym for 'statement'.
A statement has three parts:
- subject
- a thing to which the predicate is applied
- predicate
- a function which has the subject within its domain
- object
- a thing within the range of the function
A statement takes the value 'true' if the application of the predicate to
the subject gives the object.
NOTE 3 - In practice things are a bit messy. A predicate can
be a multi-valued function (i.e. a mapping that is not a true function). In
this case, the statement evaluates to true if the object is one of the values
that is given by the application of the 'function'.
NOTE 4 - A function, or mapping, can be derived from a set
theory relationship, which is not a function or mapping.
Classification is not a mapping. However, there is a corresponding
classification mapping that consists of all (thing, class) pairs, where the
thing is a member of the class.
The relationship between a thing that has a mass of 10 kg and the class 10 kg
is classification. 10 kg is a mass value within the 1D space of mass values.
There is a corresponding mass function that consists of all (thing, mass value)
pairs, where the thing is a member of the mass value.
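The evaluation rule in NOTE 3 and the classification mapping in NOTE 4 can be sketched in a few lines of Python. This is an illustration only: the class `Predicate`, the function `holds` and the sample identifiers (such as `pump_101`) are assumptions introduced here, not part of RDF or EPISTLE.

```python
from collections import defaultdict


class Predicate:
    """A possibly multi-valued mapping from subjects to objects (hypothetical helper)."""

    def __init__(self, name):
        self.name = name
        self._values = defaultdict(set)

    def add(self, subject, obj):
        # Record one (subject, object) pair of the mapping.
        self._values[subject].add(obj)

    def apply(self, subject):
        # Return every value the mapping gives for this subject.
        return set(self._values.get(subject, set()))


def holds(subject, predicate, obj):
    """A statement is true if obj is one of the values of predicate(subject)."""
    return obj in predicate.apply(subject)


# Classification recast as a mapping of (thing, mass value) pairs.
mass = Predicate("mass")
mass.add("pump_101", "10 kg")

# A multi-valued predicate: one company, two sites.
company_sites = Predicate("companySites")
company_sites.add("jbc", "Gas")
company_sites.add("jbc", "Sticks")

print(holds("pump_101", mass, "10 kg"))      # True
print(holds("jbc", company_sites, "Gas"))    # True
print(holds("jbc", company_sites, "Leeds"))  # False
```

Storing the values as a set per subject is what lets a single helper cover both true functions (one value) and multi-valued mappings (several values).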
The object of a statement can be:
- a set of things; or
- a list of things.
The RDF Model and Syntax specifies a format for the following (and little
else):
- a statement, distinguishing between the subject, predicate and object
within a statement;
- the identification of a subject, predicate or object;
- a set or list.
2.2 RDF example
Consider the recording of the following information about a company:
- the company is 'Joe Bloggs and Co.';
- the company has two sites 'Gasworks Road' and 'Out-in-the-sticks Park';
- the company has two departments 'Stress office' and 'Design office';
- the Stress Office is at the Gasworks Road site, and the Design office is at
the Out-in-the-sticks Park site.
2.2.1 Informal approach
An informal approach to the recording of the information is as follows:
<companyOrganisation
xmlns="http://www.joe_bloggs.com/organisation/schema#">
<Company id="jbc">
<name>Joe Bloggs and Co.</name>
<companySites>
<Site id="Gas">
<name>Gasworks Road</name>
</Site>
<Site id="Sticks">
<name>Out-in-the-sticks Park</name>
</Site>
</companySites>
<companyDepartments>
<Department id="stress">
<name>Stress office</name>
<departmentSite>
<Site id="Gas"/>
</departmentSite>
</Department>
<Department id="design">
<name>Design office</name>
<departmentSite>
<Site id="Sticks"/>
</departmentSite>
</Department>
</companyDepartments>
</Company>
</companyOrganisation>
NOTE 1 - It is necessary to give the sites IDs so that they
can be defined in one place and referenced from elsewhere.
This document can be understood if there are text definitions for the
different element types.
2.2.2 Formal approach
The information can be regarded as a number of statements. Each of these
statements has a predicate that is one of the following functions:
- name
- which can be applied to any object and evaluates to give a text literal;
- companySites
- which can be applied to a company and evaluates to give a set of sites;
- companyDepartments
- which can be applied to a company and evaluates to give a set of
departments;
- departmentSite
- which can be applied to a department and evaluates to give a single site.
NOTE 1 - In the RDF, a function or mapping that can be used
as a predicate is called a property.
The classes that are not functions are:
- Company;
- Site; and
- Department.
NOTE 2 - The usual naming convention has been followed in
which:
- classes that are not properties begin with a capital letter;
- classes that are properties begin with a lower case letter.
A reworking of the example, which uses the RDF syntax to make the nature of
the relationships clear, is as follows:
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.joe_bloggs.com/organisation/schema.rdf#">
<Company rdf:ID="jbc">
<name>Joe Bloggs and Co.</name>
<companySites>
<rdf:Bag>
<rdf:li>
<Site rdf:ID="Gas">
<name>Gasworks Road</name>
</Site>
</rdf:li>
<rdf:li>
<Site rdf:ID="Sticks">
<name>Out-in-the-sticks Park</name>
</Site>
</rdf:li>
</rdf:Bag>
</companySites>
<companyDepartments>
<rdf:Bag>
<rdf:li>
<Department rdf:ID="stress">
<name>Stress office</name>
<departmentSite>
<Site rdf:resource="#Gas"/>
</departmentSite>
</Department>
</rdf:li>
<rdf:li>
<Department rdf:ID="design">
<name>Design office</name>
<departmentSite>
<Site rdf:resource="#Sticks"/>
</departmentSite>
</Department>
</rdf:li>
</rdf:Bag>
</companyDepartments>
</Company>
</rdf:RDF>
The changes between the two examples are:
- making all elements children, directly or indirectly, of <rdf:RDF>;
- using the class <rdf:Bag> and its property <rdf:li> to
indicate predicate objects that are an aggregate;
- using the rdf identification and referencing convention, which is:
- rdf:ID identifies a new resource;
- rdf:resource references an existing resource.
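Because every child element is a statement about its parent, the hierarchical document can be reduced to its statements mechanically. The Python sketch below illustrates this on a shortened copy of the example. It handles only the constructs used here (typed nodes, literal properties, and a Bag of list items) and is not a conforming RDF parser. The helper names `local`, `node_id` and `walk`, the predicate label `type`, and the choice to emit one statement per Bag member rather than one statement whose object is the Bag, are all assumptions made for brevity.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NS = "http://www.joe_bloggs.com/organisation/schema.rdf#"

DOC = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns="{NS}">
  <Company rdf:ID="jbc">
    <name>Joe Bloggs and Co.</name>
    <companySites>
      <rdf:Bag>
        <rdf:li><Site rdf:ID="Gas"><name>Gasworks Road</name></Site></rdf:li>
        <rdf:li><Site rdf:ID="Sticks"><name>Out-in-the-sticks Park</name></Site></rdf:li>
      </rdf:Bag>
    </companySites>
  </Company>
</rdf:RDF>
"""


def local(tag):
    # Strip the '{namespace}' prefix that ElementTree puts on tag names.
    return tag.split("}", 1)[-1]


def node_id(node):
    # rdf:ID introduces a resource; rdf:resource refers to an existing one.
    ref = node.get(f"{{{RDF}}}resource")
    return ref.lstrip("#") if ref else node.get(f"{{{RDF}}}ID")


def walk(node, triples):
    """Append (subject, predicate, object) triples for one typed node."""
    subject = node_id(node)
    if node.get(f"{{{RDF}}}ID"):
        triples.append((subject, "type", local(node.tag)))
    for prop in node:
        children = list(prop)
        if not children:  # property with a literal value
            triples.append((subject, local(prop.tag), (prop.text or "").strip()))
            continue
        value = children[0]
        if value.tag == f"{{{RDF}}}Bag":
            members = [li[0] for li in value]  # the node inside each rdf:li
        else:
            members = [value]
        for member in members:
            triples.append((subject, local(prop.tag), node_id(member)))
            walk(member, triples)
    return triples


triples = []
for top in ET.fromstring(DOC):
    walk(top, triples)
for triple in triples:
    print(triple)
```

The output is a flat list of statements, which is the form discussed in the next section.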
2.2.3 Flat or hierarchical document
An RDF document can be either:
- hierarchical
- This is shown in the example in section 2.2.2. The subject of each statement is the
parent element. This type of document is easy to transform into a convenient
HTML document using XSLT.
- flat
- This is shown below. With this format, each statement is an independent
object, as in KIF [KIF] or MathML [MathML].
The following example shows that a statement about companySites need
not be a child element of Company.
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://www.joe_bloggs.com/organisation/schema.rdf#">
<Company rdf:ID="jbc">
<name>Joe Bloggs and Co.</name>
</Company>
<Site rdf:ID="Gas">
<name>Gasworks Road</name>
</Site>
<Site rdf:ID="Sticks">
<name>Out-in-the-sticks Park</name>
</Site>
<rdf:Description rdf:about="#jbc">
<companySites>
<rdf:Bag>
<rdf:li rdf:resource="#Gas" />
<rdf:li rdf:resource="#Sticks" />
</rdf:Bag>
</companySites>
</rdf:Description>
</rdf:RDF>
NOTE 1 - A hierarchical RDF document would be a very
effective publication route for the EPISTLE RDL. A flat RDF document is very
similar in structure to an exchange file for the EPISTLE Core Model.
RDF supports an even more formal rendering of the statement, as follows:
<rdf:Statement>
<rdf:subject rdf:resource="#jbc" />
<rdf:predicate rdf:resource="#companySites" />
<rdf:object>
<rdf:Bag>
<rdf:li rdf:resource="#Gas" />
<rdf:li rdf:resource="#Sticks" />
</rdf:Bag>
</rdf:object>
</rdf:Statement>
There is a very simple XSLT transformation in both directions, between this
and the equivalent MathML:
<reln>
<eq/>
<apply>
<ci>companySites</ci>
<ci>jbc</ci>
</apply>
<set>
<ci>Gas</ci>
<ci>Sticks</ci>
</set>
</reln>
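The text proposes XSLT for this transformation. As a rough indication of how direct the mapping is, the sketch below performs the RDF to MathML direction in Python using only the standard library. The function names `ref`, `ci` and `statement_to_mathml` are hypothetical, and the statement is assumed to have a Bag of resource references as its object, as in the example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

STATEMENT = f"""
<rdf:Statement xmlns:rdf="{RDF}">
  <rdf:subject rdf:resource="#jbc" />
  <rdf:predicate rdf:resource="#companySites" />
  <rdf:object>
    <rdf:Bag>
      <rdf:li rdf:resource="#Gas" />
      <rdf:li rdf:resource="#Sticks" />
    </rdf:Bag>
  </rdf:object>
</rdf:Statement>
"""


def ref(element):
    # Read rdf:resource and drop the leading '#'.
    return element.get(f"{{{RDF}}}resource").lstrip("#")


def ci(parent, name):
    # Append a MathML <ci> identifier element to parent.
    ET.SubElement(parent, "ci").text = name


def statement_to_mathml(stmt):
    """Build <reln>: eq, apply(predicate, subject), set(members)."""
    reln = ET.Element("reln")
    ET.SubElement(reln, "eq")
    apply_ = ET.SubElement(reln, "apply")
    ci(apply_, ref(stmt.find(f"{{{RDF}}}predicate")))
    ci(apply_, ref(stmt.find(f"{{{RDF}}}subject")))
    members = ET.SubElement(reln, "set")
    for li in stmt.iter(f"{{{RDF}}}li"):
        ci(members, ref(li))
    return reln


mathml = statement_to_mathml(ET.fromstring(STATEMENT))
print(ET.tostring(mathml, encoding="unicode"))
```

The reverse direction follows the same pattern: the first identifier under the apply element becomes the predicate, the second the subject, and the members of the set become list items.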
2.3 RDF schema example
The class and property element types in an RDF document indicate the
meaning. The Resource Description Framework (RDF) Schema [RDFS] defines an RDF document that specifies:
- the classes and properties in an RDF schema; and
- the top resources that are referenced by an RDF schema.
Each of the example RDF documents above references the RDF schema:
http://www.joe_bloggs.com/organisation/schema.rdf. This documents
the nature of the classes and properties in the example, as follows:
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<rdfs:Class rdf:ID="Company">
<rdfs:comment>a trading organisation</rdfs:comment>
<rdfs:subClassOf
rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"
/>
</rdfs:Class>
<rdfs:Class rdf:ID="Department">
<rdfs:comment>a part of an organisation</rdfs:comment>
<rdfs:subClassOf
rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"
/>
</rdfs:Class>
<rdfs:Class rdf:ID="Site">
<rdfs:comment>area of land</rdfs:comment>
<rdfs:subClassOf
rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"
/>
</rdfs:Class>
<rdf:Property rdf:ID="name">
<rdfs:comment>person memorable identification</rdfs:comment>
<rdfs:domain
rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"
/>
<rdfs:range
rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"
/>
</rdf:Property>
<rdf:Property rdf:ID="companySites">
<rdfs:comment>sites at which a company operates</rdfs:comment>
<rdfs:domain rdf:resource="#Company" />
<rdfs:range
rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag"
/>
</rdf:Property>
<rdf:Property rdf:ID="companyDepartments">
<rdfs:comment>parts of a company</rdfs:comment>
<rdfs:domain rdf:resource="#Company" />
<rdfs:range
rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag"
/>
</rdf:Property>
<rdf:Property rdf:ID="departmentSite">
<rdfs:comment>site of a department</rdfs:comment>
<rdfs:domain rdf:resource="#Department" />
<rdfs:range rdf:resource="#Site" />
</rdf:Property>
</rdf:RDF>
NOTE 1 - A 'resource' is the RDF word for thing. Hence the
class Resource is the set of all things.
NOTE 2 - The subClassOf property can be used to
create hierarchies, such as: Resource > Organization > Company.
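One use of such a schema is to check that the subject and object of a statement fit the domain and range declared for its property. The Python sketch below illustrates the idea. The dictionaries are copied by hand from the example schema rather than parsed from it, and treating domain and range as hard constraints (rather than as a basis for inference) is an assumption of this sketch.

```python
# Hypothetical in-memory copy of the example schema: property -> (domain, range).
SCHEMA = {
    "name": ("Resource", "Literal"),
    "companySites": ("Company", "Bag"),
    "companyDepartments": ("Company", "Bag"),
    "departmentSite": ("Department", "Site"),
}

# subClassOf links; every class in the example is a direct subclass of Resource.
SUBCLASS_OF = {"Company": "Resource", "Department": "Resource", "Site": "Resource"}


def is_a(cls, ancestor):
    """True if cls equals ancestor or reaches it by following subClassOf."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def check(subject_type, prop, object_type):
    """Check a statement's subject and object types against domain and range."""
    domain, range_ = SCHEMA[prop]
    return is_a(subject_type, domain) and is_a(object_type, range_)


print(check("Department", "departmentSite", "Site"))  # True
print(check("Company", "departmentSite", "Site"))     # False
print(check("Site", "name", "Literal"))               # True
```

Adding an intermediate class such as Organization only requires another entry in `SUBCLASS_OF`; the check itself does not change.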
3. EPISTLE
3.1 Core model and reference data library
EPISTLE has two strands:
- the EPISTLE Core Model [ECM]; and
- the EPISTLE Reference Data Library [ERDL].
The ERDL consists of standard instances of entities in the ECM. The ERDL can
be extended to support different engineering applications.
NOTE 1 - It is possible to absorb the vast majority of the
ECM into the ERDL, leaving the ECM consisting of only a handful of axiomatic
entities. However at present, the ECM exists independently of the ERDL.
3.2 EPISTLE example
The RDF was explained by presenting an example in three stages:
- an informal XML document;
- an RDF document;
- the corresponding RDF schema.
The corresponding approach for EPISTLE will consist of:
- an informal EXPRESS schema and instantiation;
- an instantiation of the ECM using reference data;
- the ERDL reference data for the example, as an ECM instantiation.
3.2.1 Informal approach
An informal EXPRESS schema is as follows:

The instantiation of this schema for the example is
as follows:
#1 = COMPANY (
'Joe Bloggs and Co.',
(#2, #3),
(#4, #5));
#2 = SITE ('Gasworks Road');
#3 = SITE ('Out-in-the-sticks Park');
#4 = DEPARTMENT ('Stress office', #2);
#5 = DEPARTMENT ('Design office', #3);
3.2.2 Formal approach
A simplification of the EPISTLE Core Model, containing parts relevant to the
example, is as follows:
The instantiation of this schema for
the example is as follows:
#1 = INDIVIDUAL ();
#101 = CLASSIFICATION (#1, #9001);
#102 = CLASS_OF_IDENTIFICATION (#1, #103);
#103 = EXPRESS_STRING ('Joe Bloggs and Co.');
#2 = INDIVIDUAL ();
#201 = CLASSIFICATION (#2, #9002);
#202 = CLASS_OF_IDENTIFICATION (#2, #203);
#203 = EXPRESS_STRING ('Gasworks Road');
#3 = INDIVIDUAL ();
#301 = CLASSIFICATION (#3, #9002);
#302 = CLASS_OF_IDENTIFICATION (#3, #303);
#303 = EXPRESS_STRING ('Out-in-the-sticks Park');
#4 = INDIVIDUAL ();
#401 = CLASSIFICATION (#4, #9003);
#402 = CLASS_OF_IDENTIFICATION (#4, #403);
#403 = EXPRESS_STRING ('Stress office');
#5 = INDIVIDUAL ();
#501 = CLASSIFICATION (#5, #9003);
#502 = CLASS_OF_IDENTIFICATION (#5, #503);
#503 = EXPRESS_STRING ('Design office');
#6 = OTHER_RELATION (#1, #2);
#601 = CLASSIFICATION (#6, #9004);
#7 = OTHER_RELATION (#1, #3);
#701 = CLASSIFICATION (#7, #9004);
#8 = OTHER_RELATION (#1, #4);
#801 = CLASSIFICATION (#8, #9005);
#9 = OTHER_RELATION (#1, #5);
#901 = CLASSIFICATION (#9, #9005);
#10 = OTHER_RELATION (#2, #4);
#1001 = CLASSIFICATION (#10, #9006);
#11 = OTHER_RELATION (#3, #5);
#1101 = CLASSIFICATION (#11, #9006);
The types of individual and types of relation are indicated by references to
instances #9001 to #9006. These are standard instances of class.
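The correspondence with RDF can be made concrete by resolving each classification and reading every relation as a (subject, predicate, object) statement. The Python sketch below does this for a short extract of the instantiation. It is an illustration under stated assumptions: the regular expression covers only the simple record layout used in this example, not the full STEP exchange file syntax, and the labels in `CLASS_NAMES` are borrowed from the RDF example for readability, with #9002 taken to be the class of sites because #2 is the Gasworks Road site.

```python
import re

# A short extract of the instantiation: one company, one site, one relation.
DATA = """
#1 = INDIVIDUAL ();
#101 = CLASSIFICATION (#1, #9001);
#2 = INDIVIDUAL ();
#201 = CLASSIFICATION (#2, #9002);
#6 = OTHER_RELATION (#1, #2);
#601 = CLASSIFICATION (#6, #9004);
"""

# Assumed person readable labels for the standard instances of class.
CLASS_NAMES = {"#9001": "Company", "#9002": "Site", "#9004": "companySites"}

# Matches records of the form: #id = ENTITY (arg, arg, ...);
RECORD = re.compile(r"(#\d+)\s*=\s*(\w+)\s*\(([^)]*)\);")

records = {}
for rec_id, entity, args in RECORD.findall(DATA):
    records[rec_id] = (entity, [a.strip() for a in args.split(",") if a.strip()])

# Map each classified record to the label of its class.
class_of = {
    args[0]: CLASS_NAMES[args[1]]
    for entity, args in records.values()
    if entity == "CLASSIFICATION"
}

triples = []
for rec_id, (entity, args) in records.items():
    if entity == "INDIVIDUAL":
        triples.append((rec_id, "type", class_of[rec_id]))
    elif entity == "OTHER_RELATION":
        # The classification of the relation plays the role of the RDF predicate.
        triples.append((args[0], class_of[rec_id], args[1]))

for triple in triples:
    print(triple)
```

Each relation record together with its classification record collapses into a single statement, which is the main reason the RDF rendering is shorter than the EPISTLE one.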
NOTE 1 - The instantiation of the EPISTLE schema is much
less readable than the instantiation of the informal schema. This is a drawback
of the EPISTLE approach, which has impeded its adoption. The RDF does not have
this drawback.
3.3 Example RDL
The standard instances of class for the example can be held as instances of
the ECM, as follows:
#9001 = CLASS ();
#9012 = CLASS_OF_IDENTIFICATION (#9001, #9013);
#9013 = EXPRESS_STRING ('company');
#9002 = CLASS ();
#9022 = CLASS_OF_IDENTIFICATION (#9002, #9023);
#9023 = EXPRESS_STRING ('site');
#9003 = CLASS ();
#9032 = CLASS_OF_IDENTIFICATION (#9003, #9033);
#9033 = EXPRESS_STRING ('department');
#9004 = OTHER_CLASS_OF_RELATION (#9001, #9002);
#9042 = CLASS_OF_IDENTIFICATION (#9004, #9043);
#9043 = EXPRESS_STRING ('site of company');
#9005 = OTHER_CLASS_OF_RELATION (#9001, #9003);
#9052 = CLASS_OF_IDENTIFICATION (#9005, #9053);
#9053 = EXPRESS_STRING ('department of company');
#9006 = OTHER_CLASS_OF_RELATION (#9002, #9003);
#9062 = CLASS_OF_IDENTIFICATION (#9006, #9063);
#9063 = EXPRESS_STRING ('site of department');
NOTE 1 - Each standard instance of class can have a
person readable text description. This has been omitted for simplicity.
NOTE 2 - Each standard instance of
other_class_of_relation can have cardinalities specified. For example, a
company can have one or more sites; and a department (for the purpose of this
example) has exactly one site.
NOTE 3 - It is difficult using EPISTLE to say that a company
has a set of sites, and then to specify all the members of that set. The usual
EPISTLE approach is looser - one or more company site relations are specified.
It is not possible, using the current EPISTLE model, to enumerate the complete
membership of a set.
The example makes reference between the data (about Joe Bloggs and Co.) and
the meta-data in the RDL using record numbers, e.g. #9001. This approach works
for an implementation as a single file or within a single data base. If the
data is in one file, and the RDL is in another, then the referencing mechanism
is more complicated.
NOTE 4 - The Web is very good at references. EPISTLE
struggles by comparison.
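The contrast can be illustrated with the standard URL handling in Python. In the sketch below a reference within the same file and a reference to a class in another file are resolved by the same rule. The data file URL is hypothetical, and the helper name `absolute` is introduced here only for illustration.

```python
from urllib.parse import urldefrag, urljoin

DATA_URL = "http://www.joe_bloggs.com/organisation/data.rdf"      # hypothetical data file
SCHEMA_URL = "http://www.joe_bloggs.com/organisation/schema.rdf"  # schema from the RDF example


def absolute(base, reference):
    """Resolve a same-file ('#Gas') or cross-file reference against a base URL."""
    return urljoin(base, reference)


local_ref = absolute(DATA_URL, "#Gas")
remote_ref = absolute(DATA_URL, SCHEMA_URL + "#Site")

print(local_ref)
print(remote_ref)
# The part before '#' names the file that holds the referenced resource.
print(urldefrag(remote_ref).url == SCHEMA_URL)
```

A record number such as #9001 has no equivalent of the part before the '#', which is why a reference from a data file into a separately held RDL needs an additional mechanism.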
4. Conclusions
The RDF is as flexible and generic as EPISTLE, so EPISTLE will lose none of
its generality if it is implemented as an RDF application.
The RDF is much easier to implement than EPISTLE for some applications. Its
great strengths are:
- The RDF has the ability to support flat and hierarchical documents. A
hierarchical document can be turned into a person readable HTML document by a
simple XSLT transformation. A flat document can be loaded into a database, in
the same way as an EPISTLE file.
- The RDF takes full advantage of the referencing methodology of the Web.
NOTE 1 - A hierarchical document solves the EPISTLE 'cloud
and nail' problem. An object in EPISTLE is a 'nail', which has a 'cloud' of
facts about it. A parent/child hierarchy can link the cloud to the nail for
presentation to a user.
The publication of the EPISTLE RDL as an RDF document is a priority. This is
a convenient and general approach to publication, which will gain EPISTLE a new
market.