$Id: FEATURES,v 1.7 1996/02/11 01:07:47 tiggr Exp $

LANGUAGE CONCEPTS

    TOM is an object oriented language [note 1].  Syntactically it has the
    usual C constructs like `if', `for', `while', and expressions; its
    notion of objects is based on Objective-C.  Apart from that,
    everything is different.

  Typing

    TOM has basic types for things processors are good at [note 1a, 1b]:
    8 bit unsigned `byte', 16 bit unsigned `char', 32 bit signed `int', 64
    bit signed `long', 32 bit IEEE single precision floating point
    `float', 64 bit IEEE double precision `double', and the `boolean'.
    The single non-basic type is the object type [note 1c].

    Elements of all types are unboxed, except for the object type, which
    is boxed: an object typed variable is, by definition, a reference.
    This implies that objects can not be statically allocated, be it in
    another object, in the program's data segment or on the stack: all
    objects are allocated on the heap.

    TOM has the notion of tuple types, though the tuple is not a first
    class type: it is not possible to declare a variable with a tuple
    type.  Tuple types are used in method invocations, both as argument
    type and return type, and in simultaneous assignments.  As a
    notational convenience, a singleton tuple denotes both the tuple type
    and the type of its element [note 1c1].

    There is one other type indicator: `id'.  `id' stands for the type of
    the receiver of the current method.  Thus, the method `self' has `id'
    as the return type.  Like any object class indication, `id' can be
    modified by shifting the meta level to its class or instance, as in
    `class (id)' or `instance (id)'.  For example, the root class defines
    an `instance (id) alloc'.

    Arrays are provided by classes which are part of the standard
    environment [note 1d].

  Binding

    All operations on the basic types are statically bound.  Methods,
    instance variables, and class variables are bound dynamically.

    For variables, binding is slightly more expensive than accessing a
    member of a virtual base class in C++; a method invocation is slightly
    more expensive than the invocation of a virtual member function.

    The memory overhead of all binding is one pointer per object.  In
    contrast, in C++ the overhead can be one pointer for the virtual table
    plus one for every superclass needing such a table.  And there is, of
    course, the overhead of initializing the virtual table pointers for
    each object that is created.

  Inheritance

    Like Smalltalk and Objective-C, tom has the notion of instances and
    class objects [note 2].  Unlike Smalltalk, tom allows multiple
    inheritance.

    A class can be deferred; in this case it (or one of its superclasses)
    does not provide the implementation of some method it (or one of its
    superclasses) declared as being deferred.  Only non-deferred
    subclasses of the top `State' class can be instantiated [note 3];
    class objects and meta class objects always exist.

    A class or instance can inherit the behaviour of another class or
    instance.  Such behaviour can be a pure interface, containing only
    deferred methods, or a full implementation, or somewhere in between.
    Behavioural instances may not have instance variables.  Since a class
    always has state, behavioural classes may define class variables [note
    3a].  Obviously, if an instance behaviour needs access to state,
    deferred accessor methods can be declared.  An example of behavioural
    inheritance is the `Common' class, which is inherited by both the
    State class and the State instance.  (The Common instance serves no
    purpose and is empty.)

    Multiple inheritance raises the issue of repeated inheritance.  In
    C++, the use of a virtual base class indicates that the subclass
    inheriting the base class twice will carry the base class' state only
    once.  If the repeated base class is not virtual the subclass
    inheriting the base N times will carry N copies of the base class.
    These semantics would, as such, be acceptable, were it not that the
    base class defines how it is inherited.  In the context of usability
    of libraries distributed without source, however, they are
    unacceptable [note 4].

    In Eiffel, whether inheritance is repeated or copied can be defined in
    the inheriting class, on a per attribute basis.  This provides great
    flexibility but it also implies that either the sources to all
    superclasses are needed and must be recompiled for the subclass or
    that an elaborate virtual table scheme is needed to access the
    instance variables.  Clearly, again in the context of usability, the
    first implication is unacceptable.  The second solution requires a
    varying value of `this' in methods implemented by different classes,
    which is also unacceptable [note 5].

    In tom, the semantics of repeated inheritance are that the repeated
    superclass is shared between the inheriting subclasses.
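
    The sharing semantics can be pictured with a diamond-shaped
    hierarchy.  The sketch below is written in Python, not tom, purely
    as an illustration; Python happens to share a repeated superclass
    too.  The names `Base', `Left', `Right' and `Both' are hypothetical.

```python
class Base:
    def __init__(self):
        # State of the repeated superclass; it should exist once per object.
        self.counter = 0

    def bump(self):
        self.counter += 1
        return self.counter


class Left(Base):
    def bump_left(self):
        return self.bump()


class Right(Base):
    def bump_right(self):
        return self.bump()


class Both(Left, Right):
    """Inherits Base twice, once through Left and once through Right."""


def shared_counter_demo():
    both = Both()
    both.bump_left()
    both.bump_right()
    # Both paths updated the same, single copy of Base's state.
    return both.counter
```

    Had `Both' carried two copies of `Base', each path would have
    incremented its own counter and neither would have reached 2.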

  Encapsulation

    Instance variables can not be directly accessed from within any class
    or instance but the instance which declared them, and, if qualified as
    such, by instances of a subclass.  In the first case, the variable was
    declared `private'; in the second case `protected' (the default).  A
    variable can be declared `public'; this declaration implicitly defines
    an accessor method with the same name as the variable.  If it is
    declared `mutable' a modifier method is implicitly defined; for the
    variable `<type> foo', the modifier method will be `void setFoo:
    <type> value'.

    The accessibility of class variables is the same as for instance
    variables, irrespective of whether the variable is accessed from
    within a class or instance method.

    Methods can be declared `public' (the default), implying availability
    to all.  `protected' implies availability to the class itself and
    subclasses thereof.  `private' means availability only to the class
    itself.  `private' methods are intended to be used like C macros and
    are statically bound.

  Usability

    The most important feature of tom is the way it advocates usability.
    Usability is the extent to which a class can be tailored to specific
    needs in case access to the source of certain classes is restricted.
    Such a restriction can be caused by the unavailability of the source
    of a binary distributed library, or it can be imposed on certain
    sources by the policy of a design team.

    If a class exists, but it does not totally fulfil your needs, due to
    bugs for instance, it is a waste of time to have to completely write
    it, or a very similar class, from scratch.  Furthermore, if instances
    of the faulty class are allocated in a location beyond your control
    (within that library, for instance), not being able to tailor it might
    just mean total abstention from using the whole library.

    There are a lot of other circumstances where usability is hampered by
    a language.  For instance, in Eiffel, the availability of only the
    short form of a class, as opposed to its long form, imposes
    restrictions on subclassing that class.  Another example is the
    declaration of `virtual' methods in C++: If a superclass does not
    declare a method as being `virtual', a subclass can not override it.
    This implies that the user of a library has his or her actions
    constrained by the designer of the library being used.

    TOM has the following features to increase its usability beyond
    anything achieved by other languages.

    Dynamic Method Binding

      A class can in no way restrict the way in which a subclass overrides
      any method.

    Extensions

      An extension adds state and behaviour to an existing class.  This
      addition can be performed explicitly, or by inheritance [note 5a].
      An extension may replace the implementation of a method by one
      deemed more appropriate.

      In the context of dynamic loading, if the class which is extended,
      or any of its subclasses, already had instances allocated, the
      extension may not add instance variables.
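
      As a rough analogy only, the effect of an extension resembles
      adding and replacing methods on an existing class at run time.
      The Python sketch below assumes a hypothetical library class
      `Shape' and a hypothetical helper `extend_shape'; it shows added
      and replaced behaviour, not added state, and it is not tom syntax.

```python
class Shape:
    """Stands in for a class from a library whose source is unavailable."""

    def area(self):
        return 0

    def describe(self):
        return "shape with area %d" % self.area()


def make_shape():
    # Library code allocating an instance beyond the user's control.
    return Shape()


def _doubled_area(self):
    return 2 * self.area()


def _better_describe(self):
    return "Shape(area=%d)" % self.area()


def extend_shape():
    """The 'extension': add one method and replace another."""
    Shape.doubled_area = _doubled_area   # new behaviour
    Shape.describe = _better_describe    # replaced implementation
```

      Instances handed out by `make_shape' before and after the call to
      `extend_shape' both pick up the replacement, and the replaced
      `describe' is no longer reachable.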

    Class Posing

      A class can be declared to pose as another class.  The effect of
      this is similar to that of an extension, with one major difference,
      being that, when `replacing' a method, the original method is still
      available for invocation.

      In the context of dynamic loading, if the class which is posed, or
      any of its subclasses, already had instances allocated, the posing
      class may not add instance variables.
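
      A loose Python analogy of posing, under stated assumptions: the
      posing class takes over the name of the posed class, and the
      original method remains reachable from the replacement.  The
      names `Greeter', `LoudGreeter', `library_factory' and
      `pose_loud_greeter' are hypothetical.  Unlike what the text above
      suggests for tom, this sketch only affects instances created
      after posing.

```python
class Greeter:
    """Stands in for a library class."""

    def greet(self, name):
        return "hello " + name


def library_factory():
    # Library code; looks up the name Greeter each time it is called.
    return Greeter()


class LoudGreeter(Greeter):
    """The posing class: replaces greet but can still reach the original."""

    def greet(self, name):
        original = super().greet(name)   # original method still available
        return original.upper() + "!"


def pose_loud_greeter():
    """Make LoudGreeter answer to the name Greeter from now on."""
    global Greeter
    Greeter = LoudGreeter
```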

    `this' (`self')

      For any incarnation of an object, the value of `this' (C++ speak;
      `self' in Smalltalk speak) is identical.  Thus, in tom, each object
      has a unique identity which can not be changed.

      This is unlike C++, where a `FooBar', inheriting from both `Foo' and
      `Bar', has a different `this' in methods implemented by the `Bar'
      and in methods implemented by the `Foo'.  Even worse, if a `FooBar'
      is passed to a method as a `Bar' and later on retrieved, it won't
      behave as a `FooBar' anymore; it will have become a `Bar', even
      though, by inheritance, it is contained in a `FooBar' [note 6].

      When `this' is identical in all methods, from any superclass of an
      object, `eq' testing (the fastest and semantically most transparent
      way of testing for equality, or better, testing for identity) is
      meaningful.  `Sometimes the whole is greater than the sum of the
      parts'.  Keeping `this' identical in all incarnations of an object
      implies that the whole is always greater than the sum of the parts.
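
      The property described here can be observed in any language that
      keeps a single identity per object.  The following Python sketch
      (not tom; the helper `store_as_bar' is hypothetical) checks that
      the receiver is the same object in methods of both superclasses
      and that a `FooBar' stored as a `Bar' comes back as a `FooBar'.

```python
class Foo:
    def foo_self(self):
        return self


class Bar:
    def bar_self(self):
        return self


class FooBar(Foo, Bar):
    pass


def store_as_bar(container, bar):
    # The container only promises to hold Bars.
    container.append(bar)


def identity_demo():
    fb = FooBar()
    container = []
    store_as_bar(container, fb)
    retrieved = container[0]
    return (
        fb.foo_self() is fb.bar_self(),   # same receiver in both superclasses
        retrieved is fb,                  # identity survives the round trip
        isinstance(retrieved, FooBar),    # and it is still a FooBar
    )
```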

    Dynamic Loading

      Dynamic loading (of classes and extensions) is defined as part of
      the standard environment.

  Overloading

    Methods can be overloaded on the type of the arguments.  Note that
    objects all share the same type, the object type, implying that
    overloading on object class is not possible.  The reason for this is
    that, for the sake of usability, the signature of the method to be
    invoked can not be decided at compile time from the class of an
    argument, and that deducing it at run time is considered too
    expensive.

    Note that the return type of a method is not part of the method's
    signature: the return type is insignificant when selecting which
    method to invoke.  The return types of methods with the same name but
    different signatures do not need to be equal.
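
    One way to picture these rules is as a lookup key built from the
    method name and the argument types, in which every object counts as
    the same type and the return type does not appear.  The Python
    sketch below is only a model of that idea; the function `signature'
    and the mapping from Python types to tom type names are assumptions
    made for illustration.

```python
# Hypothetical mapping from Python types to a few tom basic type names.
TOM_BASIC = {bool: "boolean", int: "int", float: "double"}


def signature(name, arguments):
    """Return the key that tells overloaded methods apart.

    Basic types are distinguished; anything else is just 'object'.
    The return type is deliberately not part of the key.
    """
    kinds = tuple(TOM_BASIC.get(type(a), "object") for a in arguments)
    return (name, kinds)
```

    With this model, `signature("put", [1])' and `signature("put",
    [1.5])' differ, while a string argument and a list argument yield
    the same key because both are merely objects.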

    Operators can not be overloaded in the C++ sense: addition of objects
    is undefined.

  Method Arguments and Return Values

    Method arguments are passed by value.  Any number of values can be
    returned by a method, by collecting them in a tuple.  This way, `in'
    parameters are normal arguments, `out' parameters are part of the
    return value, and `inout' parameters are handled through a combination
    of an `in' parameter and an `out' parameter.

    The arguments in a method invocation are evaluated from left to right.
    The elements of a tuple argument are evaluated from left to right.
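
    A small sketch of the same conventions in Python, which also
    returns multiple values as a tuple and evaluates arguments from
    left to right.  This is not tom syntax; `divide', `advance' and
    `traced' are hypothetical names.

```python
def divide(dividend, divisor):
    """Two 'in' parameters, two 'out' values returned as one tuple."""
    return dividend // divisor, dividend % divisor


def advance(position, step):
    """'position' acts as 'inout': passed by value, returned updated."""
    return position + step


def traced(log, label, value):
    """Record the moment an argument expression is evaluated."""
    log.append(label)
    return value


def demo():
    log = []
    # Simultaneous assignment from the returned tuple.
    quotient, remainder = divide(traced(log, "dividend", 17),
                                 traced(log, "divisor", 5))
    position = 10
    position = advance(position, 3)   # caller stores the 'out' half
    return quotient, remainder, position, log
```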

  Interfacing to Other Languages

    The tom compiler translates tom to C [note 7].  Any tom method can be
    implemented in C by declaring it `foreign "C"'; tom methods can be
    easily invoked from C, and most interesting C functions can be invoked
    from within tom.  Note that such `normal' C functions are still
    dynamically bound.

  Namespaces and scoping

    The global namespace holds units.  A unit is conceptually like a
    library; it holds classes and extensions, the latter probably adding
    or modifying classes from other units.

    Within a unit, all class names must be unique.  Class names can be
    qualified.  When referring to a class without qualifying its name, it
    either denotes a class in the current unit, or must uniquely denote a
    class in any other known unit.

    Extensions are never referred to from within code, but they do have a
    name to be able to discern all extensions of a class.  The names of
    all extensions of a class must be unique within that class.  Thus,
    extension names are never qualified.

  Casting

    Numeric types can be cast to each other; the compiler will perform
    such a cast implicitly if necessary.

    An object can be implicitly cast to one of its superclasses.  It can
    explicitly be cast to one of its subclasses.  The reason for this is to
    accommodate the following, normal, scenario: Suppose the existence of
    an object of which some instance variable can be read and set through
    the methods `(A) value' and `(void) setValue: (A)', respectively.
    This variable can be set to any B, as long as B is a subclass of A.
    If the value is then retrieved, the object is still a B, and should be
    allowed to be a B.  The B can then be used as an argument to another
    method, which expects an object of type C, as long as B is a subclass
    of C.
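
    The scenario can be sketched in Python with type hints standing in
    for declared types.  This is an illustration only: the classes `A',
    `B', `C' and `Holder' follow the text, while `downcast_to_b' and
    its run time check are assumptions, not a statement of how tom
    implements an explicit cast.

```python
class A:
    pass


class C:
    def c_name(self):
        return "C"


class B(A, C):
    """B is a subclass of both A and C."""


class Holder:
    def __init__(self):
        self._value: A = A()

    def value(self) -> A:
        return self._value

    def set_value(self, value: A) -> None:
        self._value = value


def downcast_to_b(obj: A) -> B:
    """Explicit cast to a subclass, checked here for safety."""
    if not isinstance(obj, B):
        raise TypeError("object is not a B")
    return obj


def use_c(c: C) -> str:
    return c.c_name()


def demo() -> str:
    holder = Holder()
    holder.set_value(B())                 # implicit cast from B to A
    b = downcast_to_b(holder.value())     # declared A, still a B
    return use_c(b)                       # implicit cast from B to C
```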

    Note that this free casting is only possible if for all incarnations
    of an object, the value of `this' does not vary, or an elaborate
    virtual table administration is maintained and casting becomes an
    expensive operation due to the checks needed [note 8].

TOM RUNTIME

    This section presents some features available to tom programs which
    are provided primarily by the tom runtime.

  Memory Management

    The tom runtime provides automatic, configurable, time constrained,
    garbage collection.

  Flexible Method Forwarding

    If an object does not respond to a method, the method `forward::' is
    invoked instead, which is passed the original selector and an array of
    boxed arguments (i.e. the original arguments with each unboxed
    argument wrapped in an object).

    The basic method for computed method invocation is through
    `perform::', which accepts as its arguments the selector of the method
    to be invoked, and an array of boxed arguments.  The implementation of
    `perform::' as inherited from the State class will invoke the
    indicated method of the receiver with the indicated arguments.  Note
    that this circumvents the encapsulation of methods as known to the
    compiler.  Also note that methods declared `private' can not be
    perform::ed.

    These two methods together form the basis of a lot of possible
    abstract functionality, such as method currying and distributed
    objects.
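
    How the two methods cooperate can be sketched in Python.  This is a
    model under assumptions, not the tom runtime: selectors are plain
    strings, a leading underscore marks a private method, forwarding is
    only triggered through `perform', and the classes `Account' and
    `Proxy' are hypothetical.

```python
class State:
    """Hypothetical root class offering perform and forward."""

    def perform(self, selector, arguments):
        # Private methods (leading underscore here) can not be performed.
        if selector.startswith("_"):
            raise AttributeError("private method: " + selector)
        method = getattr(self, selector, None)
        if callable(method):
            return method(*arguments)
        # The object does not respond: hand the call over to forward.
        return self.forward(selector, arguments)

    def forward(self, selector, arguments):
        raise AttributeError("does not respond to " + selector)


class Account(State):
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        return self.balance


class Proxy(State):
    """Forwards everything it does not understand to a target object."""

    def __init__(self, target):
        self.target = target

    def forward(self, selector, arguments):
        return self.target.perform(selector, arguments)
```

    A `Proxy' wrapping an `Account' answers `perform("deposit", [5])'
    by forwarding it, which is the kernel of a distributed object: the
    target could just as well live in another process.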

NOTES

  1 The word `tom' is written in same-caps: normally lower case but if
    rules apply which state that the word should be capitalized, tom is
    written in upper case.

 1a Yes, I know I should renumber the notes.

 1b These basic types very much resemble the types as found in Java.

 1c Enumerations and bitsets are being considered as a type.

1c1 Obviously, this is more than a notational convenience; it is also
    needed to be able to put parentheses around an expression.

 1d Arrays are also being considered as a type.  If included, arrays will
    be available in various forms, with fixed or flexible base index and
    with a fixed or flexible upper bound.  Every array access will of
    course be bounds checked.

  2 Of course, meta class objects also exist in the runtime; they are
    needed to describe the behaviour and state of the class objects; the
    behaviour of meta class objects is defined by the State meta class
    object; meta class objects do not have state.

    On a related issue, in Objective-C the distinction between a class and
    its instances is rather blurred.  For example, it does not matter if
    you ask `[a isKindOf: [Foo self]]' or `[a isKindOf: [Foo class]]',
    whereas at most one of these expressions will return TRUE in tom.
    Other examples are the usage of `+new' to set instance variables and
    the fact that class objects can not conform to instance methods set
    for a protocol, or the other way around.

  3 The `State' class is needed by the runtime system.  The `State'
    instance provides a reference (by the name of `isa') to the object's
    class.  The `State' class provides an identically named reference to
    the class' class.

 3a However, if a behaviour class defines state, it may not be inherited
    as behaviour for an instance.

  4 The term `usability' is defined later in this document.

  5 Why the value of `this' is not allowed to change is explained later in
    the document, in the section on usability.

 5a With this definition, a class is nothing more than a container of
    extensions, the main extension being mandatory.

  6 Only through virtual methods will the retrieved Bar behave as a
    FooBar.

  7 A better compiler would generate assembly instead of C.  For instance,
    a C compiler can not know that the variable `isa' and the binding (and
    other runtime) information it points to are all constant.

  8 Also note that casting can be used too liberally by calling
    everything an `Any', the superclass of all classes, although arrays
    and other containers are not parameterized in the objects they hold;
    the element of a generic container is always an Any.