Hypermedia authoring tools usually suffer from a lack of validation capabilities that would make it possible to check a document against temporal inconsistencies. The document design methodology proposed in the paper overcomes this problem by translating a high-level model of an hypermedia document into a RT-LOTOS formal specification, on which standard reachability analysis may be applied for verification purposes. The proposed approach extends an early work to verify both intrinsic and extrinsic temporal consistency properties of an hypermedia document.
Among various facilities offered by hypermedia authoring tools, few attention has been paid to the design of documents that would be free from inconsistencies that come from temporal constraints applied to their components through various relationships among their anchors. In general, inconsistencies result from several combined temporal constraints that depend on the presentation environment. Their checking is a hard task, and the need for a formal design validation methodology is obvious. The benefits of formal methods for multimedia and hypermedia document validation have been exposed in several papers [Cour96], [Cour95b], [Jour97],[Will96].
Early results on temporal consistency validation for multimedia and hypermedia documents were presented in [Cour96]. A new temporal synchronization model, relying on the composition of basic presentation and constraints objects, has been used as authoring model. That approach was in fact an ad-hoc proposal based on a reverse engineering from the underlying formalism, the RT-LOTOS process algebra; it presented three major drawbacks:
An enhanced approach is discussed in this paper, with two major contributions.
The proposed methodology includes an automatic translation from a high-level hypermedia model into a RT-LOTOS formal specification. It relies on those objects that are usually found in hypermedia document descriptions, namely nodes (including composite nodes), anchors and links. The ideas behind the general model discussed in the paper are currently being implemented within the HyperProp System [Casa91].
Hypermedia document checking against temporal inconsistencies does take into account the specific platform on which the document will be presented. The platform introduces so-called extrinsic constraints. The latter correspond to the undesired delays introduced by multimedia presentation devices, and by concurrent access to common resources. In the paper, we particularly demonstrate how the specification of the presentation devices (RT-LOTOS processes) can easily be composed with the RT-LOTOS specification derived from the high-level hypermedia model.
The paper is organized as follows. Section 3 introduces the high level model used to describe documents. Sections 4 and 5 present how the Formal Description Technique RT-LOTOS has been used to formalize the document model with the purpose of verifying intrinsic and extrinsic temporal consistency properties. Section 6 illustrates the proposed validation methodology and its associated support tools on simple examples. Finally, Section 7 draws some conclusions.
Hypermedia authoring systems support many tasks, such as: integration of different media types (often distributed), definition of temporal and spatial constraints among document components, creation of structured documents, navigation control and presentation monitoring. The authoring tool must be simple to use and sufficiently expressive in order to not impose a cognitive overhead to authors. Usually, better expressiveness implies a greater probability to generate inconsistent documents. In order to overcome this problem, some systems only permit the use of causal relations between events (start or end of a component presentation) [Buch93], [Ross93]. Others, however, consider relative synchronization, where inconsistent presentation specifications can happen [Jour97].
Multimedia and hypermedia applications are real-time and their performance is limited by the available resources. For example, video and audio media can need a delay before starting their presentation in order to decompress the information, and the duration and quality of the presentation may be sensitive to the network traffic. Moreover, hypermedia presentations are dynamic and depend on user interactions. For example, if the same information can be presented as a text or an audio (and consequently, with different duration), depending on a user interaction, synchronization constraints will be defined at presentation time. Consistency verification becomes much more difficult in these cases.
Authoring techniques usually postpone design validation to the run-time stage. This is often too late, and the lack of well-established procedure makes the validation process uncertain. A complete formal validation approach is, however, too complex for naive users. They usually prefer a pictorial language ranging from basic timelines to object oriented models [Casa91],[Frai97]. Consequently, the proposed approach makes it possible for the designer to use his or her favorite tool. The latter's output will then be translated into a RT-LOTOS specification for checking against design errors, using partial (simulation) and, if possible, exhaustive (formal verification) validation techniques. The methodology discussed in the paper goes one step beyond: predefined resource specifications written in RT-LOTOS can be associated with the RT-LOTOS specification derived from the high-level document model. A dynamic validation can then be applied to the resulting global specification, taking into account a particular platform.
The possibility to have relationships among components of a document represents the main characteristic of a hypermedia system. However, it is important to organize the document in order to reduce the so-called lost in the hyperspace problem [Much97]. The structured definition of documents is desirable as it carries built-in concepts of modularity, encapsulation and abstraction. Conceptual models with composite nodes support such mechanisms, in particular, models that allow nested compositions.
The definition of hypermedia documents in the Nested Context Model (NCM), the conceptual model of the HyperProp system, is based on two familiar concepts, namely nodes and links. Nodes are fragments of information and links interconnect nodes into networks of related nodes. The model goes further and distinguishes two basic classes of nodes, called content and composite nodes, the latter being the central concept of the model [Soar95].
Intuitively, the content node contains data whose internal structure, if any, is application dependent (they are the usual hypermedia nodes). The class of content nodes can be specialized into other classes (text, video, audio, image, etc.), as required by applications.
A composite node C is a node whose content is a collection S of nodes and links such that every base node of every link occurring in S is either C itself or a node occurring in S (the definition of the base node of a link is given below). It is considered that an entity E in S is a component of C and that E is contained in C. We also say that a node A is recursively contained in C iff A is contained in C or A is contained in a node recursively contained in C. Note that a component may be included more than once in S. However, in this paper, we will assume links and nodes collection as a set, without loss of generality for our discussion. An important restriction however must be done: a node cannot be recursively contained in itself. It is worth to note that composite nodes contain nodes and links, generalizing some models that group only nodes into compositions.
As the model permits a node to be recursively contained in different composite nodes and composite nodes to be nested to any depth, it is necessary to introduce the concept of perspective. Intuitively, the perspective of a node identifies through which sequence of nested composite nodes a given node instance is being observed. Formally, a perspective of a node N is a sequence P=(Nm,...,N1), with m>1, such that N1=N, Ni+1 is a composite node, Ni is contained in Ni+1, for 1 < i <m and Nm is not contained in any node. Note that there can be several different perspectives for the same node N, if this node is contained in more than one composite node.
Composite nodes, as defined, have several desired properties, such as:
A detailed comparison between NCM and hypermedia models used in Firefly, CMIF, I-HTSPN and others, with respect to temporal and spatial synchronization aspects as well as to the semantic power of their compositions can be found in [Rodr97].
A link has three main attributes, the source end point set, the destination or target end point set and the meeting point. The set of source and destination end points will define events. The meeting point will define relationships among events. An event is defined by the presentation of a marked set of information units of a node (presentation event) or by its selection (selection event) or by the changing of an attribute of a node (attribution event).
The end points are tuples of the form <(Nk,...,N1), a, type> such that N1 is a node, Ni+1 is a composite node and Ni is contained in Ni+1, for 1 < i <k, with k > 0, a is a set of information units (an anchor) of N1 or an attribute identifier (and its value) of N1, and type specifies the event type associated with a (selection, presentation or attribution). The node Nk is called a base node of the link. The meeting point contains one operation, as usual, composed by one condition and one action. Once the condition is satisfied, the associated action is triggered. The conditions evaluate logical expressions over event states and attribute values of nodes specified by the source end points of the link. Actions in a meeting point are operations that must be executed over the destination end points of the link, or are just to introduce a time delay. Examples of simple actions are: start, pause, stop, prepare a presentation event E; wait a time delay; etc.
For each presentation event, one should be able to specify how and by which tool the associated node (data object) will be presented. In HyperProp, descriptor objects contain these specifications. The independence between descriptors and data objects will permit better reuse of objects. Using distinct descriptors, one can define different presentations for the same data object. For example, a text media segment can be presented as text, using descriptor D1, or it can be synthesized as audio using descriptor D2. The aggregation of a data object and a descriptor in order to present a component is called a representation object.
The descriptor can be specified in composite nodes, in links, in nodes itself or in their classes. When presenting a node, the descriptor explicitly defined on-the-fly by the end user bypasses the descriptors defined during the authoring phase. These in turn have the following precedence order: first, that defined in the composite node that contains the node, if it is the case; second, that defined in the link used to reach the node; third, that defined within the node; and finally the default descriptor defined in the node class.
The purpose here is not to define a new formalism for the description of temporal structure in hypermedia documents, but instead, to formalize basic and high-level hypermedia objects via a mapping of these objects onto a general-purpose formal description technique (FDT in short). The FDT is therefore completely hidden to the authors. General-purpose validation techniques, such as reachability analysis or model-checking can be applied to an FDT-based document specification in order to analyze application-oriented properties of the design, particularly consistency properties [Cour96].
How to perform such a mapping is the question. The selected method should present some unique features in order to be useful, as well as efficient:
Process algebra meet the first three requirements and LOTOS (Language of Temporal Ordering Specification) becomes a strong candidate because it has been an international standard for a decade. Since standard LOTOS is not able to express time-constrained behaviors, different extensions have been proposed within the international LOTOS community, and are currently being standardized. Among these proposals, we selected RT-LOTOS (Real-Time LOTOS) proposed at LAAS-CNRS [Cour95a],[Cour95b], in particular because the RTL (Real-Time LOTOS Laboratory) software tool is available and operational [Cour94]. The same approach might easily be adapted to the new emerging LOTOS standard (E-LOTOS) once stabilized with an adequate tool support.
The approach developed in the paper addresses the formal semantics of the basic objects of a hypermedia document by providing a mapping between these objects and RT-LOTOS processes, and another mapping between object composition rules and RT-LOTOS process parallel compositions. The approach relies on general mapping rules, as well as on the definition of RT-LOTOS process libraries specifying the behavior of reusable basic hypermedia objects, like content nodes and links. The approach is therefore general and might be fully automated within an authoring system.
...
hide user, start, end, trigger, endRequest, grant in
(Hypermedia[user, start, end, trigger, endRequest, grant] |[user]| Users[user] )
where
process Hypermedia[user, start, end, trigger, endRequest, grant] : exit :=
compositeNode[user, start, end, trigger, endRequest, grant] (...)
|[start,end, trigger, endRequest]|
( parallel composition of the different synchronization links )
endproc
process Users[user] : exit :=
endproc
Figure 1 - Specification overall architecture
The overall architecture of the specification is presented in Figure 1, where the specification of an hypermedia document (process Hypermedia) is composed in parallel with a process representing the document outside environment (process Users), the resulting process being itself composed with a process specifying at a high abstract level the multimedia platform on which the document is presented (process MultimediaPlatform). user, start, end and grant characterize gates on which the composed processes must synchronize themselves.
Process Hypermedia is itself specified as the instanciation of a process specifying the document top level (composite) node, synchronized with processes describing the links defined for this node.
The basic structure in a hierarchical composition model of documents is the Composite Node. This general structure is described by the compositeNode RT-LOTOS process definition as presented in Figure 2.
Process compositeNode has three formal parameters characterizing respectively the node ID, and its minimal and maximal presentation duration. It features six external gates: gates start and end associated with the start and end of the node presentation; gates user and trigger related to the capture of user interactions and to the trigger of an anchor, respectively; gate endRequest related to the termination of the node, and gate grant related to the synchronization with the multimedia platform. All these gates are parameterized in order to identify the node ID and the media type (for gates start, end and endRequest), the anchor ID (for gates user and trigger), different parameters related to a presentation device (for gate grant).
When process compositeNode is instanciated with some ID value and other parameters, the process first synchronizes on gate start. Using the recursive definition of compositeNode, one may note that the composite node may be re-instantiated again Once the synchronization on start is performed, the behavior of the process is essentially described by process bodyNode together with the specification of the termination conditions of the composite node (see process endNode);
Process bodyNode permits to select the body of the current composite node with respect to the node ID; assuming some generic node identifier i, the associated bodyNode process is bodyNodei; it describes the internal logical structure of composite node i; it is made up by the parallel composition of the RT-LOTOS processes characterizing content nodes and/or other composite nodes nested within composite node i.
(* RT-LOTOS formal specification of a generic composite node *)
process compositeNode[start, end, user, trigger, endRequest, grant] (ID, dmin, dmax : nat) : exit :=
let composite : nat = 0 in
start!ID!composite;
( ( bodyNode[start, end, user, trigger, endRequest, grant](ID)
[> endNode[endRequest,end] (ID,dmin,dmax) )
||| compositeNode[start, end, user, trigger, endRequest, grant] (ID, dmin, dmax) )
where
process bodyNode[start, end, user, trigger, endRequest, grant] (ID:nat) : exit :=
[ID == i] -> bodyNodei[start, end, user, trigger, endRequest, grant]
endproc
process endNode[endRequest,end] (ID,dmin,dmax : nat) : exit :=
let composite : nat = 0 in
process bodyNodei[start, end, user, trigger, endRequest, grant] : exit :=
... ||| contentNode[start, end, user, trigger, endRequest, grant] (...)
||| ... ||| compositeNode[start, end, user, trigger, endRequest, grant] (...)
endproc
(* RT-LOTOS formal specification of a generic content node *)
process contentNode[start, end, user, trigger, grant] (ID, dmin, dmax, media : nat) : exit :=
||| contentNode[start, end, user, trigger, grant] (ID, dmin, dmax, media) )
Figure 2 - compositeNode and contentNode generic process definitions
The modularity and hierarchy of a RT-LOTOS specification make it easy to compose the process specifying the document presentation with another process modeling an available multimedia platform. As all interactions are performed by synchronization at the process gates, it is not necessary to distinguish between a centralized and a distributed platform model.
The multimedia platform considered in the paper is rather simple and abstract. It consists essentially in:
As suggested by the excerpts of specifications provided in Figures 1 and 2, we did not try to get the most readable formal RT-LOTOS specification, but instead the most general (with respect to the expressive power of the NCM model) and generic, based on reusable components. The goal was to completely automatize the derivation of the formal specification from an NCM authoring. As a consequence, our specification methodology is based on the identification and formalization of several process libraries characterizing, respectively, basic content nodes, basic links, basic composition of nodes and links and basic termination conditions. Some components of these libraries have been inherited from previous work on a simpler hypermedia synchronization model [Cour96].
Using formal methods within the context of a high-level authoring tool brings two main advantages. First, it provides a formal semantics to the high-level model, describing without any ambiguity the behavior of a presentation. Second, it permits to check consistency properties on the formal specification derived from the high-level model, using standard validation techniques developed and implemented for the RT-LOTOS in RTL software tool [Cour94]. The verification consists in interpreting the minimal reachability graph that has been achieved by means of RTL. Our purpose in the verification process is to prove that the action corresponding to the end of a document presentation is always reachable from the initial state of the associated minimal reachability graph. If this occurs, the document presentation is, by definition [Cour96], consistent. Each node in the graph represents a reachable state. Each arc corresponds to either an action occurrence (event) or a global time progression (arc labeled by time).
Synchronization constraints expressed when specifying a hypermedia document using a high-level approach imply several complex causal and temporal relationships to be fulfilled together when presenting the document. Depending on how these synchronization constraints relate to each other, there is a potential risk to create inconsistent situations, i.e. situations where different contradictory synchronization requirements cannot be handled together. Consequently, the document will deadlock at some point during the presentation. In [Cour96], two different inconsistent situations have been identified, namely the qualitative and the quantitative inconsistencies. An inconsistent situation is qualitative if it does not depend on the particular duration of object presentations and quantitative, otherwise.
In this work, we extended the consistency concept, taking into account the interaction between the object presentations and platform resources. Through this new approach, documents may be verified against two types of property, named intrinsic and extrinsic temporal consistency.
A document is intrinsically consistent if no internal synchronization constraint can lead to inconsistent situations when the presentation is performed over an ideal platform. However, even if a document is intrinsically consistent, his actual presentation can potentially lead to a deadlock situation, because the previous analysis did not take into account time constraints, possibly enforced by the multimedia platform, that may affect the timing of the presentation. Thus, a document is extrinsically consistent if its presentation on an actual and resource-limited platform is also consistent. In this case, the verification will be performed over the global RT-LOTOS specification resulting from the composition of the document specification with the specification of a model of the underlying platform.
Figure 3 depicts the logical structure of a subset of a hypermedia document used as a running example throughout this section. For space and clarity reasons, the example is kept reasonably simple. It nevertheless suffices for illustrating the potential and benefits of the methodology discussed in previous sections.
Composite node C is made up of four content nodes V, A, AT and I, and it also includes three links l3, l4 and l5. The presentation of this composition can be started through links l1 or l2, coming from other parts of the document. Node V contains a video. Node A contains a music that can be played in background. Node AT has a content, which can be either exhibited as a text or synthesized as an audio, depending on the start link chosen. Finally, node I contains a still picture. The end of the presentation of node I triggers link l6, leading to the presentation of another part of the document.

There are two different ways to start the presentation of the composition. When C starts through link l1, nodes V and AT must start at the same time. Node AT must be presented as an audio during d1 seconds, and node V must be presented as a video during d2 seconds. On the other hand, if C starts through link l2, nodes V and A must start at the same time with a duration of d2 seconds and d5 seconds, respectively. So, the presentation of node C depends on the external context, that is, on the navigation that led to presentation of the composite node. Furthermore, C can be presented with a background music (navigation coming through link l2) or without any background music (navigation coming through link l1).

Link l3 contains a temporal relation stating that node AT must start its presentation d3 seconds after the start of node A. It further provides a descriptor, which defines if AT will be presented either as an audio (of duration d1) or as a text (of duration d4). Link l4 specifies that the image I should be presented just after the end of V and that its presentation may last indefinitely. Consequently, the end of the presentation of node I must be defined by another constraint. This is achieved by link l5, which specifies that node I should end as soon as the presentation of node AT ends.
Figure 4 presents the two presentation scenarios expected by the author (expressed by means of timelines)[2]. Figure 4a illustrates what happens when one navigates through link l1, assuming that d1>d2. Figure 4b shows another scenario when one navigates through link l2, assuming that node AT is presented as a text and that d3 + d4 > d2. No temporal inconsistency is featured in these figures.
Let us now assume that the descriptor of node AT has been inappropriately inverted by the document author:

The scenario corresponding to the timeline of Figure 5a is not consistent for the following reason: node AT ended before the start of node I, then node I will never end, and consequently the document presentation will never end either, leading to a deadlock situation. On the other hand, the scenario corresponding to the timeline of Figure 5b is still consistent, even if the descriptor of AT has been modified.
Temporal consistency has been until now studied case by case analyzing the timelines corresponding to the different presentation scenarios. A global verification method, based on the minimal reachability graph that can automatically be generated from the RT-LOTOS specification of an hypermedia document, is now used instead.
Figure 6 depicts the minimal reachability graph associated with the last version of the document. One may note that the reachability of action endC is only valid for the upper branch of the reachability graph (see the selection of anchor2 corresponding to link l2 ). For the other branch (see the selection of anchor1 corresponding to link l1), the transition sequence from state 23 to state 1 shows that the presentation of I has started and after this, only a time progression action can be performed. Action endC is clearly not reachable following this branch. As a consequence, the document is not intrinsically consistent.
The previous intrinsic inconsistency may be corrected in different ways. A easy way consists in adding a delay d6 in the definition of link l1 to postpone the presentation of node AT as a text. The resulting minimal reachability graph is presented in Figure 7, where one can see that action endC is reachable for all the branches of the graph.


Considering the previous example, it was assumed that: (1) the presentation was performed over an ideal multimedia platform with no resource constraints and (2) no additional delays were introduced by the presentation devices. Therefore, presentations started as soon as their logical and temporal constraints were satisfied and no resource contention occurred during the presentation. However, if resources limitations are considered, even if the intrinsic consistency is satisfied, a document may become inconsistent on a specific platform. Figure 8a shows this situation: the consistent presentation (see the reachability graph of Figure 7) features a conflict between nodes AT and A for accessing the audio channel. In our example, this is considered as a design error, since:
This resource conflict can easily be seen on the reachability graph of Figure 9, where, following the upper branch (anchor2) of the graph, one can see a transition sequence with event startA followed by event grantA (grant of the audio channel to the presentation of node A) followed by startAT-A (request of the audio channel for the presentation of node AT as an audio) without any intermediate endA to release the resource previously granted to the presentation of node A, and leading finally to a deadlock situation.
The previous analysis did not either consider additional delays introduced by the resources (due to a CPU sharing, bandwidth limitation, access delays, etc.) during the document presentation. With the platform specification, it is possible to model the elapsed time between the instant where a resource is requested to present a content node and the instant when the presentation effectively starts. Figure 8b shows how the intrinsically consistent presentation of the Figure 5c becomes inconsistent if a delay is introduced before video decompression. Resource delays can be adjusted by a simple parameter modification in the RT-LOTOS platform specification. The lower branch (anchor1) of the reachability graph depicted in Figure 9 corresponding to a document presentation with the video decompression delay of 5 seconds. It is assumed that there is no audio presentation delay. Note that the action endC cannot be executed when the node presentation starts through link l1 (anchor1), because the duration of V is modified by a platform delay and it finishes after the end of AT node. In consequence, the presentation is deadlocked on the presentation of image I. Thus, in all previous cases, action endC is not reachable from initial state. As a consequence, the document is not extrinsically consistent due to a resource conflict (first case) and to platform delays (second case).


From our work, partially discussed in this paper, the following remarks may be drawn. First, it is extremely easy and flexible to work with high-level models at the authoring stage. Second, it is easier for the author to handle document logical structuring than only document presentation structuring. Hence, models based on compositions allowing every type of relationships among their components become very important [Much97]. Third, the use of a hidden formal description technique, like RT-LOTOS, brings two important advantages: (i) it provides a formal semantics for high level hypermedia models, permitting better understanding of some critical behaviors that may be expressed by the model (ii) it allows applying exhaustive validation techniques to check against document design errors.
The methodology discussed allowed pre-defined resource specifications, in RT-LOTOS, to be associated with the RT-LOTOS specification derived from the high-level document model, so that a dynamic validation could be applied to the resulting global specification, taking into account a particular platform. Hypermedia document checking against temporal inconsistencies can now be carried out considering the specific platform on which the document will be presented. The limitation of the proposed method is related to the state space explosion of the reachability graph. The definition of the aggregation method proposed in [Sant98] appears to be an important step towards the verification of complex structured documents.
The work reported in this paper is funded by a CNRS research grant (Action
Télécoms). The first author is supported by a grant from the
Brazilian Government (CAPES). The cooperation between LAAS-CNRS and PUC/RJ is
funded by a grant from CAPES-COFECUB.