This paper presents the idea of interval scripts as a means for programming interactive environments and computer characters.  The main issue being addressed is the need to deal with user and media actions that occur over time rather than instantly (states take time). 

 

The paper starts out describing scenarios where traditional systems break down and presents an alternative to event-based systems.  The authors claim that the use of a constraint-based network provides a more maintainable and expandable way to program interaction sequences.

 

Programming is accomplished by establishing temporal relationships as constraints between the intervals. Unlike previous temporal constraint-based programming languages we employ a strong temporal algebra based in Allen's interval algebra with the ability to express mutually exclusive intervals and to define complex temporal structures. To avoid the typical computational complexity of strong temporal algebras we propose a method, PNF propagation, that projects the network implicit in the program into a simpler, 3-valued (past, now, future) network where constraint propagation can be conservatively approximated in linear time.
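To make the PNF idea concrete, here is a minimal sketch (mine, not the paper's actual algorithm) of a conservative restriction step for a single "A before B" constraint, where each interval's possible state is a subset of {past, now, future}:

```python
# Illustrative sketch of PNF-style restriction for one "A before B"
# constraint. The representation and rules are my assumptions, not the
# paper's exact PNF propagation algorithm.
PAST, NOW, FUTURE = "P", "N", "F"

def restrict_before(a_states, b_states):
    """Conservatively shrink the PNF sets of A and B under 'A before B'.
    B may have started (NOW or PAST) only if A can already be PAST, and
    A may be unfinished (NOW or FUTURE) only if B can still be FUTURE."""
    new_a, new_b = set(a_states), set(b_states)
    if PAST not in a_states:      # A cannot have finished yet...
        new_b &= {FUTURE}         # ...so B cannot have started
    if FUTURE not in b_states:    # B has already started or finished...
        new_a &= {PAST}           # ...so A must already be over
    return new_a, new_b
```

Running one such restriction per constraint gives the kind of linear-time, conservative approximation the quoted passage describes; a set shrinking to empty would signal an inconsistent state.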

 

An important observation is that interval scripts decouple actual state from desired state.  This allows for methods of recovery from sensor error resulting in a more robust system.

 

All actions have non-zero duration.

 

in interval scripts we de-couple the goal of starting or stopping an action (represented by the START and STOP functions) from the actual state of its associated node.

 

This paper is very interesting because it simplifies the way time is modeled in interactive installations.  I have two critiques to make. 

 

First, there needs to be in the nomenclature a way of specifying causality in primitive relationships.  In the various cases like A before B or A during B, there needs to be a reference to whether A caused B or whether A and B simply happen to occur in this given relationship.  For instance, in the A before B relationship, it could be that the system sets a flag after A occurs and then opens a gate to allow B to happen in relation to some other event.  Or, it could be that A occurs and then, some specified (perhaps random) time later, B occurs because A occurred.  It seems that the authors are assuming the latter because they are using a constraint analysis to determine what comes next.  What is not clear is how conflicts are resolved in the situation where A causes B and C causes D but D is incompatible with B.
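The unresolved-conflict question can be phrased concretely. In the sketch below (notation and representation are mine; the paper handles mutual exclusion inside its PNF network), causal rules force both B and D, and a mutual-exclusion check can only flag the clash, not resolve it:

```python
# Sketch of the conflict raised above: A causes B and C causes D, but
# B and D are mutually exclusive. Names and data structures are my own
# illustration, not the paper's.
def find_conflicts(causes, mutex, occurred):
    """Given causal rules {trigger: effect}, a list of mutually exclusive
    pairs, and the set of intervals that have occurred, report effect
    pairs that the rules would force to run together."""
    forced = {effect for trig, effect in causes.items() if trig in occurred}
    return [(x, y) for (x, y) in mutex if x in forced and y in forced]
```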

 

Another confusing thing about the primitive relationships is that the wording is not intuitive in the inverse case.  Furthermore, the inverse relationships seem redundant: A i-before B is the same as B before A.  The model can be simplified to 7 time relationships by removing the inverse cases and adding a convenient 'not' or 'inverse' operator.  Furthermore, they use complex compound statements to express simple relationships, which calls into question the simplicity of the thinking required to program and debug a system.  For instance, the phrase A start OR equal OR i-start B represents the case where the sensor and the action have the same starting time.  Why not introduce an A causes B phrase to represent this case?
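The suggested simplification is easy to state as a rewrite rule: store only the base relations and normalize any inverse by swapping its operands. The relation names follow Allen's algebra; the code itself is my illustration:

```python
# Sketch of normalizing away inverse Allen relations, as the critique
# suggests. "i-" marks an inverse; equals is its own inverse.
INVERSE = {
    "before": "i-before", "meets": "i-meets", "overlaps": "i-overlaps",
    "starts": "i-starts", "during": "i-during", "finishes": "i-finishes",
    "equals": "equals",
}
INVERSE.update({v: k for k, v in INVERSE.items()})  # add reverse mapping

def normalize(a, rel, b):
    """Rewrite an inverse relation as its base form with operands
    swapped, so the model only ever stores 7 primitive relations."""
    if rel.startswith("i-"):
        return (b, INVERSE[rel], a)
    return (a, rel, b)
```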

 

Another thing that can simplify the model is to assume that the before and meet cases are the same thing, and that an expression is associated with the relationship of A to B that evaluates to a time value representing the amount of time between the end of one and the beginning of the other.  This expression could also be applied to the overlap and during cases (the time is implied in the equal, start, and finish cases).  Now the model is down to 6 cases with a time expression and a 'not' operator.  With this expression, the finish and start cases are specific instances of the overlap case, where the time expression is B - A for the finish case and 0 for start.

 

Also, the means by which the synchronization occurs is a component that has to be considered.  In the equal case, how does the system make the start and end times equivalent?  Does it cut one off if it's too long?  Is the short one sped up or slowed down?  And so on.

 

A equal B (the activities begin and end together; a 'how' expression applies here)

A before B (time between, must be positive)

A overlap B (time B overlaps A, must be less than A's length and positive)

A start B (system makes them start together)

A finish B (the system makes them finish together)

A during B (time to start A after B starts, must be less than B length - A length and positive)
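The reduced model listed above can be sketched as a single validity check, where each relation carries a gap value constrained as noted in the parentheses (the function and names are mine, not the paper's):

```python
# Sketch of the reduced 6-relation model proposed above. The gap checks
# encode the parenthetical constraints; everything here is illustrative.
def valid_gap(rel, gap, len_a, len_b):
    """Return True if 'gap' is admissible for relation 'rel' between
    intervals of length len_a and len_b."""
    if rel == "before":   # time between end of A and start of B
        return gap > 0
    if rel == "overlap":  # time B overlaps A
        return 0 < gap < len_a
    if rel == "during":   # time to start A after B starts
        return 0 < gap < len_b - len_a
    if rel in ("equal", "start", "finish"):
        return gap == 0   # the gap is implied by the relation itself
    raise ValueError(f"unknown relation: {rel}")
```

Removing the range checks would yield the degenerate cases noted below.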

 

(Actually, the constraints could be removed to create degenerate cases.)  Another observation is that there are at least three interacting processes going on here.  The constraint-based analysis over the PNF network is the most visible component.  However, the intervals A and B are sensing activities as well as control activities (here I am using the word control to mean anything the computer does in response).  Because sensing is involved, there is a degree of uncertainty in its calculation that must be taken into account.  Rightly, the authors note that these activities have duration, but they don't note that constraints based on sensing can be relaxed or strengthened depending on the technique used to form the constraint.  The other process occurring here relates to the timing of events with respect to each other.  While duration is noted, there is no discussion of how much time should elapse between A and B if A causes B to occur.

 

The system could be reduced further by noting that all that is really necessary is to specify the time between the events (possibly negative, possibly random, possibly boolean) and the duration of each of the interrelating events.  But that's not the point; the point is that the authors are creating a means to organize responses to interactions through a constraint-based language. 

 

The fundamental concept of the interval scripts paradigm is that all actions and states are associated with the temporal interval in which they occur. However, the actual beginning and end time of the intervals are not part of the script. Instead, the script contains a description of the temporal relationships or constraints that the intervals should satisfy during run-time. These constraints are enforced by a run-time engine that examines the current states of all intervals, and then coordinates the system's reaction to ensure that the interval's states respect the constraints.
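The quoted run-time behavior, together with the START/STOP decoupling noted earlier, can be sketched as a single comparison step. This is a toy of my own, with PNF-style states 'P'/'N'/'F'; the paper's engine is more sophisticated:

```python
# Toy control step in the spirit of the quoted description: compare the
# desired interval states (after constraint propagation) against the
# sensed ones, and emit START/STOP goals rather than forcing states
# directly. States are "P" (past), "N" (now), "F" (future).
def control_step(desired, actual):
    """Return a dict of START/STOP goals for intervals whose sensed
    state disagrees with the desired one."""
    commands = {}
    for name, want in desired.items():
        have = actual.get(name, "F")
        if want == "N" and have == "F":
            commands[name] = "START"   # should be running but isn't
        elif want == "P" and have == "N":
            commands[name] = "STOP"    # should be over but is running
    return commands
```

The point of returning goals instead of states is exactly the decoupling the review highlights: the node's actual state may lag, fail, or be mis-sensed without breaking the script.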

 

The key is that this scheme (at least in my analysis of the problem) relates to the interrelationship between parallel sequences, processes, or virtual entities.  In a control system that has multiple layers of activity, there sometimes need to be synchronization points where activity occurs in concert.  The sensing aspect of an interactive system is in effect another layer of activity that occurs asynchronously with the control layer, with both layers affecting each other back and forth.

 

The other thing that is a bit hard to accept is the idea that the computer will resolve what happens when, based on a set of rules or an algebra network.  While this is useful in a character situation involving a virtual actor, it might not be as appropriate in other situations where causal relationships need to be established. 

 

This kind of programming could be especially problematic in interaction situations where the user is required to do something to move the system forward but the programming is too rigid to allow for a softening of constraints.  However, this may not be a valid critique, because the softening could occur through the way the states of the constraints are calculated.  For example, in their scenario where a conflict occurs, the camera can't take a picture until the user stops moving.  If the calculation of "the user stops moving" is sloppy, then the constraint is effectively relaxed.

 

This paper exposes a lot of issues in interaction that are interesting to think about and poses an interesting approach to a solution of these problems.  Yet it hasn't quite hit the right notes, although it has stimulated the conversation in this area.

 

Interesting statements:

 

1. little attention has been paid to the special requirements imposed by interactive environments on programming methods. In particular, in highly immersive situations both the actions and states of the environment and its computational agents, as well as the actions of the users, are not instantaneous events, but take time to be performed. This makes difficult the use of the traditional event-based programming paradigm commonly employed in desktop interaction.

 

Although the problem of system actions with nonzero duration has been the object of research in the multimedia community, interactive environments extend significantly the problem since they include situations where the user's actions also extend through periods of time.

 

2. The most common technique used to program and control interactive applications is to describe the interaction through finite-state machines. This is the case of one of the most popular languages for the developing of multimedia software, Macromedia Director's Lingo [19].  In Lingo the interaction is described through the handling of events whose context is associated with specific parts of the animation. There are no provisions to remember and reason about the history of the interaction and the management of story lines. The same situation occurs with Max [20], a popular language for control of music devices (see Roads [21]) and adopted in some interactive art installations [22].

 

Videogames are traditionally implemented through similar event-loop techniques [23]. To represent the interaction history, the only resort is to use state descriptors whose maintenance tends to become a burden as the complexity increases. Most of all, a fundamental problem with the finite state model is that it lacks appropriate ways to represent the duration and complexity of human action, computational agents, and interactive environments: hidden in the structure is an assumption that actions and occurrences are pinpoint-like events in time (coming from the typical point-and-click interfaces for which those languages are designed).

 

3. Although an appealing model, the use of multiple agents without centralized control makes authoring extremely difficult in practice.