2-D ANALOGUES OF ALLEN INTERVAL ALGEBRA FOR
IMAGE ANALYSIS: TOWARDS JUSTIFICATION
SCOTT A. STARKS, DIMA IOURINSKI, and VLADIK KREINOVICH
NASA Pan-American Center for Earth and Environmental Studies
University of Texas, El Paso, TX 79968, USA, vladik@cs.utep.edu

Abstract

1.2

In reasoning about time and duration, researchers often use Allen’s Interval Algebra.
This algebra describes possible relations between 1-D intervals. An interval can precede
the other one, follow the other one, start the
other one, etc. This algebra describes the relationship between different intervals in terms
of words from natural language. To give a natural language description of 2D images, it is
desirable to develop a similar approach for describing the relationship between 2-D objects
in a picture. In their recent papers, Jim Keller
and his collaborators proposed a new approach
based on a simulation of a “force” between
these objects. In this paper, we show that their
force formula is theoretically optimal.

Keywords
Image analysis, relation between images.

1 Introduction
1.1

How to describe relation between
2-D objects: informal description
of the problem

In reasoning about time and duration, researchers often use Allen’s Interval Algebra;
see, e.g., [2–5]. This algebra describes possible
relations between 1-D intervals. An interval can
precede the other one, follow the other one, start
the other one, etc. This algebra describes the relationship between different intervals in terms
of words from natural language.
To give a natural language description of 2D
images, it is desirable to develop a similar approach for describing the relationship between
2-D objects in a picture.

Force histogram method of
describing the relation between
2-D objects

A new method of describing the relation between the 2-D objects, a method that seems to
be in a very good accordance with the expert
reasoning, was recently described in [6], [7],
[9]. Namely, to describe the relation between
the two sets A and B , this “force histogram”
method proposed, crudely speaking, to do the
following:
 take all possible pairs a 2 A and b 2 B ,
 compute, for each pair, the “force” whose
direction is from a to b and whose value is
r?2 , and then,
 add up these force vectors.

The orientation of the resulting vector F~ describe the intuitive understanding of where the
sets are in relation to each other, and the length
of this vector describes how close these sets are
to each other.

1.3

This method is somewhat ad hoc

In the above description of the force
method, the idea that we should:




take all possible pairs compute, for each
pair of points a 2 A and b 2 B , the
“force” whose direction is from a to b and
whose value f (r) depends on the distance
between the points, and then,

add up these force vectors
is very natural. What is not very natural is the
exact form of the dependence on f (r) on r. It is
therefore desirable to consider all possible dependencies and come up with the optimal (best
possible) function f (r).

Proceedings of the 2001 IEEE Systems, Man, and Cybernetics Conference
Copyright c 2001

1.4

What we are planning to do

In order to solve this problem, we must
describe, in precise terms, which functions
are possible and which functions are the best.
In this paper, we propose such a description,
and show that this description leads to a new
symmetry-based justification for the above formula (actually, for a slightly more general formula r? ).
This new justification is in line with the
general symmetry-based approach which has
been shown, in [8], to explain similar heuristic formulas in fuzzy, neural, genetic, and other
approaches, and to explain a similar heuristic
force formula in robotic control.

2 First Idea: Only Decreasing
Functions f (r) Make Sense
The function f (r) describes the “weight”
with which directions between different pairs
(a; b) influence the resulting sum. Intuitively,
when we decide which of the two objects is,
say, to the left and which is to right, we pay
more attention to close points and less attention
to points which are far away from each other.
Therefore, it is reasonable to require that the
weight f (r) should be larger for closer points
and smaller for more distant points. In other
words, it is reasonable to require that the function f (r) be strictly decreasing.

3 Second Idea: We Must Choose a
Family of Functions, Not a Single
Function
Once we select a force function f (r), for
every two sets A and B , the orientation of the
resulting vector F~ describe the intuitive understanding of where the sets are in relation to each
def
other, and the length ` = = jF~ j of this vector
describes how close these sets are to each other.
Our main goal is to describe the orientation,
i.e., the typical angle between the two sets. We
also get some information from the value of the
“closeness” `, but this information comes only
from the comparison of different pairs of sets. If
we simply say that the closeness between two
sets A and B is, say, 1.5, we did not gain any
knowledge about how close they are.

If, however, for some other pair of sets A0 and
B 0 , the closeness is 4.5, we can already make
some meaningful conclusions: namely, we can
say that A is closer to B than A0 is close to B 0 .
We can even say that A is three times closer to
B than A0 is close to B 0 .
From this viewpoint, the absolute values of
F do not carry much information, what carries
information is the relation between the values F
corresponding to different pairs of sets.
In particular, if we multiply all the values of
the original function f (r) by a constant, i.e., if
we replace the function f (r) with a new funcdef
tion fe(r) = C  f (r) for some constant C , then
all the values of the forces get multiplied by C
and hence, all the resulting vectors F~ are multiplied by the same constant C . When we multiply a vector by a constant, it orientation does
not change, and its length gets multiplied by the
same constant.
Thus, if we replace the function f (r) by a
new function C  f (r), for every pair of sets, we
get the exact same orientation as before, and for
every two pairs, we get the exact same ratio between their “closenesses”. So, intuitively, there
is no big difference between using the original
function f (r) and the new function C  f (r).
Hence, we cannot select a unique function
f (r) and claim it to be the best, because for every function f (r), the function C  f (r) leads to
exactly the same results. In view of this, instead
of formulating a problem of choosing the best
force function, it is more natural to formulate a
problem of choosing the best family fC  f (r)gC
of force functions.

4 Which Family Is the Best? We
May Need Non-Numerical
Optimality Criteria
Among all the families fC f (r)gC , we want
to choose the best one.
In mathematical optimization problems, numerical criteria are most frequently used, when
to every alternative (in our case, to each family) we assign some value expressing its performance, and we choose an alternative (in
our case, a family) for which this value is the
largest. In our problem, as such a numerical criterion, we can select, e.g., the average approximation error A, measured as the mean square
deviation between the orientation generated by
the corresponding force method and the orientation marked by an expert.

Proceedings of the 2001 IEEE Systems, Man, and Cybernetics Conference
Copyright c 2001

However, it is not necessary to restrict ourselves to such numerical criteria only. For example, if we have several different families that
have the same average approximation error A,
we can choose between them the one that has
the minimal computation time T . In this case,
the actual criterion that we use to compare two
families is not numerical, but more complicated: a family F1 is better than the family
F2 if and only if either A(F1 ) < A(F2 ), or
A(F1 ) = A(F2 ) and T (F1 ) < T (F2 ). A criterion can be even more complicated. What a
criterion must do is to allow us for every pair of
families to tell whether the first family is better
with respect to this criterion (we’ll denote it by
F1  F2 ), or the second is better (F1  F2 ) or
these families have the same quality in the sense
of this criterion (we’ll denote it by F1  F2 ).
Of course, it is necessary to demand that these
choices be consistent, e.g., if F1  F2 and
F2  F3 then F1  F3 .

5 The Optimality Criterion Must
Select a Unique Optimal Family
Another natural demand is that this criterion must choose a unique optimal family (i.e.,
a family that is better with respect to this criterion than any other family). The reason for this
demand is very simple.
If a criterion does not choose any family at
all, then it is of no use.
If several different families are “the best”
according to this criterion, then we still have a
problem to choose among those “best”. Therefore, we need some additional criterion for that
choice. For example, if several families turn out
to have the same average approximation error,
we can choose among them a family with the
minimal computation time.
So what we actually do in this case is abandon that criterion for which there were several
“best” families, and consider a new “composite” criterion instead: F1 is better than F2 according to this new criterion if either it was better according to the old criterion or according to
the old criterion they had the same quality and
F1 is better than F2 according to the additional
criterion.
In other words, if a criterion does not allow
us to choose a unique best family it means that
this criterion is not final. We have to modify it
until we come to a final criterion that will have
that property.

6 The Optimality Criterion Must
Be Scale-Invariant
The next natural condition that the criterion
must satisfy is connected with the fact that the
numerical value of the distance r depends on
the choice of the unit for measuring distance.
If we replace the original unit of length by
a new unit which is  times larger (i.e., replace
feet by meters), then numerical values change
from r to re = r=. How will the force function
look in the new units? Let us assume that in the
new units, the distance between the two points
equals re. Then, the same distance in the old
units is equal to r =   re. Thus, the force between the two points is equal to f (r) = f (  e).
Thus, if we know the distance re in the new units,
we can compute the corresponding force value
as fe(re), where fe(z ) denotes f (  z ). So, the
same force function which, in the old units, had
the form f (r), in the new units, has a new form
f (  r).
Since this change is simply a change in a
unit of length, it is reasonable to require that going from f (r) from f (  r) should not change
the relative quality of the force functions, i.e.,
if a family fC  f (x)gC is better that the family
fC  g(r)gC , then for every  > 0, the family fC  f (  r)gC must be still better than the
family fC  g (  r)gC .
So, we arrive at the following definitions.

7 Definitions and the Main Result
Definition 1.
 By a force function we mean a strictly decreasing function from non-negative real
numbers to non-negative real numbers.
 By a family of functions we mean the family fC f (r)gC , where f (r) is a given force
function and C runs over arbitrary positive
real numbers.
 A pair of relations (; ) is called consistent [8] if it satisfies the following conditions:
(1) if a  b and b  c then a  c;
(2) a  a;
(3) if a  b then b  a;
(4) if a  b and b  c then a  c;
(5) if a  b and b  c then a  c;
(6) if a  b and b  c then a  c;
(7) if a  b, then b  a or a  b are
impossible.

Proceedings of the 2001 IEEE Systems, Man, and Cybernetics Conference
Copyright c 2001

Definition 2.
 Assume a set A is given. Its elements will
be called alternatives. By an optimality
criterion we mean a consistent pair (; )
of relations on the set A of all alternatives.
If b  a, we say that a is better than b; if
a  b, we say that the alternatives a and
b are equivalent with respect to this criterion.
 We say that an alternative a is optimal (or
best) with respect to a criterion (; ) if
for every other alternative b either b  a or
a  b.
 We say that a criterion is final if there exists an optimal alternative, and this optimal
alternative is unique.
 Let  > 0 be a real number. By the
-rescaling R (f ) of a function f (r) we
def
mean a function (R (f ))(r) = f (  r).
 By the -rescaling R (F ) of a family F ,
we mean the set of the functions that are
obtained from f 2 F by -rescaling.
In this paper, we consider optimality criteria on
the set F of all families.
Definition 3. We say that an optimality criterion on F is scale-invariant if for every two
families F and G and for every number  > 0,
the following two conditions are true:
i) if F is better than G in the sense of this
criterion (i.e., G  F ), then

ii)

R (G)  R (F );
if F is equivalent to G in the sense of this
criterion (i.e., F  G), then
R (F )  R (G):

As we have already remarked, the demands
that the optimality criterion is final and scaleinvariant are quite reasonable. The only problem with them is that at first glance they may
seem rather weak. However, they are not, as the
following Theorem shows:
Theorem. If a family F is optimal in the sense
of some optimality criterion that is final and
scale-invariant, then every force function f (r)
from this optimal family F which has the form
f (r) = A  r? for some real numbers A and .

Thus, our theorem says, in effect, that the force
functions used by J. Keller (or, to be more precise, a slightly more general class of force functions) are indeed optimal.

8 Proof
1. Let us first prove that the optimal family Fopt
is scale-invariant, i.e., that R (Fopt ) = Fopt for
all  and .
Indeed, let  > 0 be a positive real number. By definition of optimality, for every family F 2 F , we have F  Fopt . In particular,
we can conclude that R1= (F )  Fopt . From
the scale-invariance of the optimality criterion,
we can now conclude that R (R1= (F )) 
R (Fopt ), i.e., that F  R (Fopt ). This is true
for every family F 2 F and therefore, the family R (Fopt ) is optimal. But since the criterion
is final, there is only one optimal family; hence,
R (Fopt ) = Fopt . So, the optimal family is
indeed scale-invariant.
2. Scale-invariance R (Fopt ) = Fopt means, in
particular, that for every function f (r) from the
optimal family Fopt , we have R (f ) 2 Fopt .
By definition, R (f ) means f (  r), and Fopt
consists of all the functions of the type C  f (r).
Thus, we conclude that for every  > 0, there
exists a C > 0 (depending on ) for which, for
every r, we have

f (  r) = C ()  f (r):

(1)

All decreasing solutions of this functional equations are known (see, e.g., [1]); these solutions
are f (r) = A  r? . Thus, the theorem is
proven.
For readers’ convenience, let us describe
how this can be proven when f (r) is a differentiable function. In this case, f (  r) is a
differentiable function as well, hence their ratio C () = f (  r)=f (r) is also differentiable.
Thus, both sides of the equality (1) are differentiable with respect to . Differentiating both
sides relative to  and substituting  = 1, we
get the following equation:

r  dfdr(r) = C0  f (r);

(2)

where C0 denotes the derivative of the function
C () at  = 1. Multiplying both sides of the
equation (2) by dr=(r  f ), we conclude that:

Proceedings of the 2001 IEEE Systems, Man, and Cybernetics Conference
Copyright c 2001

C0  drr :
Integrating, we get ln(f ) = C0  ln(r) + C1 .
df
f

=

Hence,

f = exp(C  ln(r) + C1 ) = exp(C1 )  rC0 ;
i.e., the desired formula for A = exp(C1 ) and
= ?C0 .
Acknowledgments
This work was supported in part by NASA
grants NCC5-209 and NCC 2-1232, by the Air
Force Office of Scientific Research grant number F49620-00-1-0365, by Grant No. W-00016
from the U.S.-Czech Science and Technology
Joint Fund, and by Grant NSF 9710940 Mexico/Conacyt.
The authors are thankful to Jim Keller for
useful discussions.

References
[1] J. Aczel, Lectures on functional equations
and their applications, Academic Press, N.Y.–
London, 1966.
[2] J. Allen, Maintaining knowledge about temporal intervals”, Communications of the ACM,
1983, Vol. 26, pp. 832–843 (reprinted in [10]).
[3] J. Allen, Towards a general theory of action
and time, Artificial Intelligence, 1984, Vol. 23,
pp. 123–154.

[4] J. Allen and P. Hayes, A common-sense theory of time, Proceedings of the 9th International
Joint Conference on Artificial Intelligence, Los
Angeles, 1985, pp. 528–531.
[5] J. F. Allen and H. A. Kautz, A model of
naive temporal reasoning, In: J. R. Hobbs and
R. C. Moore (eds), Formal theories of the commonsense world, Ablex, Norwood, NJ, 1985,
pp. 251–268.
[6] J. M. Keller and P. Matsakis, Aspects
of hugh-level computer vision using fuzzy
sets, Proceedings FUZZ-IEEE’99, Seoul, Korea, 1999, pp. 847–852.
[7] P. Matsakis and L. Wendling, A new way
to represent the relative position between areal
objects, IEEE Transactions on Pattern Analysis
and Machine Intelligence, 1999, Vol. 21, No. 7,
pp. 634–643.
[8] H. T. Nguyen and V. Kreinovich, Applications of continuous mathematics to computer
science, Kluwer, Dordrecht, 1997.
[9] O. Sjahputera, J. M. Keller, P. Mastakis,
P. Gader, and J. Marjamaa, Histogram-based
matching measures, Proceedings of the 19th International Conference of the North American
Fuzzy Information Society NAFIPS’2000, Atlanta, Georgia, July 13–15, 2000, pp. 392–396.
[10] D. Weld and J. de Kleer, Qualitative reasoning about physical systems, Morgan Kaufmann, San Mateo, CA, 1989.

Proceedings of the 2001 IEEE Systems, Man, and Cybernetics Conference
Copyright c 2001