Introduction

A theory has only the alternative of being right or wrong. A model has a third possibility: it may be right but irrelevant.

— Manfred Eigen
The Physicists Conception of Nature (ed. Jagdish Mehra) Reidel

There is only one reality, however, there are multiple ways to model that reality. Information modelling by nature is a reduction exercise — given that information models are used within particular contexts, only those aspects useful to those contexts should remain in the resulting information models.

Omission of essential aspects of the reality in an information model can be detrimental, and to a lesser extent, representation of non-essential aspects are superfluous.

The art of creating information models interoperable between groups and domains is the more challenging as multiple domains require different aspects of the same information model.

The practice of information modelling is one that rewards care, accuracy and expertise, and this course explains the tried and true best practices for the art of modelling with EXPRESS.

Modellers and the modelling team

In an information modelling exercise with a slightly larger scope requires collaboration between individuals, which we call the "modelling team".

The modeling team consists of three groups:

  1. Technical experts who know the model domain.

  2. Modeling experts who help the technical experts develop the model.

  3. Reviewers, who are domain experts, to ‘keep the developers honest’

Process of modelling

Generally, modelling consists of the following steps:

  1. Define the scope

  2. ‘Find the objects’

  3. ‘Sketch’ the structure

  4. Generalisation and specialisation

  5. Refine the structure

  6. Eliminate synonyms and homonyms

  7. Cardinality constraints

  8. Local constraints

  9. Global constraints

  10. Partition

Principles of modelling

These principles form the core knowledge of modelling best practices. Each aspect below are elaborated separately in the sections that follow.

Best practices

Scoping

A model must have a well defined scope (otherwise how do you know when you are finished?).

Before modeling:

  • Define the intended model scope.

  • Document the assumptions.

Readability

An information model, although if defined via EXPRESS is computer interpretable, should primarily be human understandable.

Modeling constructs should be chosen to aid the reader rather than obfuscate understanding by using complex, intertwined or opaque definitional relationships.

Simple Types

The EXPRESS simple types (i.e Integer, String, etc) carry virtually no semantics.

As far as possible, avoid the use of simple types as entity attributes. Instead, use Types to provide the semantics.

ENTITY antique_object;
  produced : date;
  maker    : name;
DERIVE
  years_old : age := elapsed(produced);
END_ENTITY;

TYPE name = STRING; END_TYPE;
TYPE age = INTEGER; END_TYPE;

Independence

Implementation independence

An information model defines the information necessary for some purpose; it does not define the data structuring.

  • Model objects, not syntax.

  • Model the ‘real world’ not a technology-based representation.

  • Do not define ‘precision’ of data types.

  • Do not define ‘default’ data.

Schema independence

In EXPRESS, each Schema defines a scope; definition names need only be unique within a Schema.

Attempt to maintain name uniqueness across all schemas in a model (see the Nym Principle). This will assist when restructuring a model, if necessary, by modifying the schema boundaries.

Context independence

Each entity exists in a context in which it may be used. This may vary from extremely broad to highly specific. An entity definition should be as context independent as possible, yet as context specific as required.

  • Only apply the minimum necessary number of constraints.

  • Use Subtyping to get more specificity.

Structural

Partitioning

  • A SCHEMA defines a scope and a context.

  • A model is often partitioned into a set of schemas (e.g to increase readability, to partition the development work, etc.)

  • Minimise the interaction between schemas.

  • Within a schema, minimise the constraints on the objects in question (to promote re-usability).

Nym Principle

In general, ‘one name, one meaning, one definition’. Synonyms and homonyms in a model are a fruitful and never-ending source of confusion.

  • ‘If things are the same, then they should have the same name.’

  • ‘If things are not the same, then they are different.’

  • ‘Different things should have different names.’

Invariance

The meaning of an entity should not be dependent on the values of its attributes. Do not use ‘flags’ to change meanings.

ENTITY poor_person_model;
  sex : enumeration_of_male_female;
  ... -- gender related attributes
  ... -- non-gender attributes
END_ENTITY;

ENTITY good_person_model
  SUPERTYPE OF (ONEOF(male, female));
  ... -- non-gender attributes
END_ENTITY;
  -- gender related attributes put into
  -- the relevant subtypes

Redundancy

A model should not contain redundant information; redundancy leads to the possibility of data inconsistencies.

ENTITY circle;
  center : point;
  radius : REAL;
DERIVE
  perimeter : REAL := 2.0*PI*radius;
  diameter : REAL := 2.0*radius;
END_ENTITY;

Inheritance

A Subtype inherits all the properties of its Supertype.

For readability it may appear desirable to migrate the common properties down to the leaves of the supertype tree. This, however, implies that the common properties are semantically different.

All common properties should be moved as close to the root of the Supertype tree as possible. This demonstrates that they ARE common.

SCHEMA INTERFACING
SCHEMA first;
  ENTITY aaa;
    -- attributes
  END_ENTITY;

  ENTITY original;
    attr : NUMBER;
  END_ENTITY;
END_SCHEMA; -- first

SCHEMA second;
  USE FROM first (aaa AS bbb);
  REFERENCE FROM first (original);

  ENTITY constrained
    SUBTYPE OF (original);
    attr : INTEGER(7);
    WHERE
      positive : attr > 0;
    END_ENTITY;
END_SCHEMA; -- second

Constraints

General

An EXPRESS information model is permissive (i.e what is not explicitly prohibited is permissable).

Add all necessary constraints — a model is as much about the limitations of objects as it is about the objects themselves.

TYPE age = INTEGER;
WHERE
  non_negative : SELF >= 0;
END_TYPE;

Ordering of constraints

Specify constraints by the following ordered preferences:

  1. Model structure

  2. Local constraints

  3. Global rules

Global rules

ENTITY male SUBTYPE OF (person);
  wife : OPTIONAL female;
  -- other attributes
END_ENTITY;

ENTITY female SUBTYPE OF (person);
  husband : OPTIONAL male;
  -- other attributes
END_ENTITY;

RULE married FOR (male, female);
  -- check declared husbands
  -- and wives match each other
END_RULE;

Local rules

ENTITY male SUBTYPE OF (person);
  wife : OPTIONAL female;
  -- other attributes
WHERE
  -- check wife says she is
  -- married to me
END_ENTITY;

ENTITY female SUBTYPE OF (person);
  husband : OPTIONAL male;
  -- other attributes
WHERE
  -- check husband says he is
  -- married to me
END_ENTITY;

Structural

ENTITY male SUBTYPE OF (person);
  -- other attributes
END_ENTITY;

ENTITY female SUBTYPE OF (person);
  -- other attributes
END_ENTITY;

ENTITY married;
  husband : male;
  wife    : female;
UNIQUE
  no_bigamy    : husband;
  no_polyandry : wife;
END_ENTITY;