$Id: FEATURES,v 1.7 1996/02/11 01:07:47 tiggr Exp $
LANGUAGE CONCEPTS
TOM is an object oriented language [note 1]. Syntactically it has the
usual C constructs like `if', `for', `while', and expressions; its
notion of objects is based on Objective-C. Apart from that,
everything is different.
Typing
TOM has basic types for things processors are good at [note 1a, 1b]:
8 bit unsigned `byte', 16 bit unsigned `char', 32 bit signed `int', 64
bit signed `long', 32 bit IEEE single precision floating point
`float', 64 bit IEEE double precision `double', and the `boolean'.
The single non-basic type is the object type [note 1c].
Elements of all types are unboxed, except for the object type, which
is boxed: an object typed variable is, by definition, a reference.
This implies that objects can not be statically allocated, be it in
another object, in the program's data segment or on the stack: all
objects are allocated on the heap.
TOM has the notion of tuple types, though the tuple is not a first
class type: it is not possible to declare a variable with a tuple
type. Tuple types are used in method invocations, both as argument
type and return type, and in simultaneous assignments. As a
notational convenience, a singleton tuple denotes both the tuple type
and the type of its element [note 1c1].
There is one other type indicator: `id'. `id' stands for the type of
the receiver of the current method. Thus, the method `self' has `id'
as the return type. Like any object class indication, `id' can be
modified by shifting the meta level to its class or instance, as in
`class (id)' or `instance (id)'. For example, the root class defines
an `instance (id) alloc'.
Arrays are provided by classes which are part of the standard
environment [note 1d].
Binding
All operations on the basic types are statically bound. Methods,
instance variables, and class variables are bound dynamically.
For variables, binding is slightly more expensive than accessing a
member of a virtual base class in C++; a method invocation is slightly
more expensive than the invocation of a virtual member function.
The memory overhead of all binding is one pointer per object. In
contrast, in C++ the overhead can be one pointer for the virtual table
plus one for every superclass needing such a table. And there is, of
course, the overhead of initializing the virtual table pointers for
each object that is created.
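The C++ side of this comparison can be observed with `sizeof'. The
following is a minimal sketch, not a measurement of tom; the classes
`Foo', `Bar' and `FooBar' are hypothetical and the exact numbers are
implementation-defined.

```cpp
#include <cstddef>

// Hypothetical classes, used only to observe per-object overhead in C++.
struct Foo { virtual ~Foo() = default; int a = 0; };
struct Bar { virtual ~Bar() = default; int b = 0; };
struct FooBar : Foo, Bar { int c = 0; };

// Bytes of real state carried by a FooBar.
constexpr std::size_t payload = 3 * sizeof(int);

// Bytes spent on binding information and padding. The value is
// implementation-defined; common ABIs store one virtual table pointer
// for each of the two polymorphic bases.
constexpr std::size_t overhead = sizeof(FooBar) - payload;
```

On the usual implementations `overhead' is at least two pointers,
where the text above claims a single pointer per object for tom.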
Inheritance
Like Smalltalk and Objective-C, tom has the notion of instances and
class objects [note 2]. Unlike Smalltalk, tom allows multiple
inheritance.
A class can be deferred; in this case it (or one of its superclasses)
does not provide the implementation of some method it (or one of its
superclasses) declared as being deferred. Only non-deferred
subclasses of the top `State' class can be instantiated [note 3];
class objects and meta class objects always exist.
A class or instance can inherit the behaviour of another class or
instance. Such behaviour can be a pure interface, containing only
deferred methods, or a full implementation, or somewhere in between.
Behavioural instances may not have instance variables. Since a class
always has state, behavioural classes may define class variables [note
3a]. Obviously, if an instance behaviour needs access to state,
deferred accessor methods can be declared. An example of behavioural
inheritance is the `Common' class, which is inherited by both the
State class and the State instance. (The Common instance serves no
purpose and is empty.)
Multiple inheritance raises the issue of repeated inheritance. In
C++, the use of a virtual base class indicates that the subclass
inheriting the base class twice will carry the base class' state only
once. If the repeated base class is not virtual the subclass
inheriting the base N times will carry N copies of the base class.
These semantics would be acceptable as such, were it not that the way
a base class is inherited is fixed in the superclasses rather than
chosen by the inheriting subclass. In the context of usability of
libraries distributed without source, however, this is unacceptable
[note 4].
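The two C++ behaviours described above can be shown side by side. This
is a small sketch with hypothetical class names. Note that the keyword
`virtual' appears in the intermediate classes, not in the final
subclass which inherits the base twice.

```cpp
struct Base { int state = 0; };

// Non-virtual repeated inheritance: Twice carries two Base subobjects.
struct LeftCopy : Base {};
struct RightCopy : Base {};
struct Twice : LeftCopy, RightCopy {};

// Virtual repeated inheritance: Once carries a single, shared Base.
struct LeftShared : virtual Base {};
struct RightShared : virtual Base {};
struct Once : LeftShared, RightShared {};

// True when both inheritance paths reach the same Base subobject.
bool shares_base(Twice &t) {
    Base *via_left = static_cast<LeftCopy *>(&t);
    Base *via_right = static_cast<RightCopy *>(&t);
    return via_left == via_right;
}

bool shares_base(Once &o) {
    Base *via_left = static_cast<LeftShared *>(&o);
    Base *via_right = static_cast<RightShared *>(&o);
    return via_left == via_right;
}
```

`shares_base' yields false for a `Twice' and true for a `Once'. The
author of `Twice' or `Once' has no say in the matter if the
intermediate classes come from a binary library.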
In Eiffel, whether inheritance is repeated or copied can be defined in
the inheriting class, on a per attribute basis. This provides great
flexibility but it also implies that either the sources to all
superclasses are needed and must be recompiled for the subclass or
that an elaborate virtual table scheme is needed to access the
instance variables. Clearly, again in the context of usability, the
first implication is unacceptable. The second solution requires a
varying value of `this' in methods implemented by different classes,
which is also unacceptable [note 5].
In tom, the semantics of repeated inheritance are that the repeated
superclass is shared between the inheriting subclasses.
Encapsulation
Instance variables can be accessed directly only from within the
instance which declared them and, if qualified as such, from
instances of a subclass. In the first case, the variable is
declared `private'; in the second case `protected' (the default). A
variable can be declared `public'; this declaration implicitly defines
an accessor method with the same name as the variable. If it is
declared `mutable' a modifier method is implicitly defined; for the
variable `<type> foo', the modifier method will be `void setFoo:
<type> value'.
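As a rough analogy only, the methods implied by a `public' and
`mutable' variable `int foo' correspond to the following hand-written
C++ pair. The class name `Point' is hypothetical, and tom generates
these methods implicitly instead of requiring them to be written out.

```cpp
class Point {
    int foo_ = 0;  // plays the role of a `public mutable int foo'

public:
    // Accessor with the same name as the variable.
    int foo() const { return foo_; }

    // Modifier; corresponds to `void setFoo: int value' in the text.
    void setFoo(int value) { foo_ = value; }
};
```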
The accessibility of class variables is the same as for instance
variables, irrespective of whether the variable is accessed from
within a class or instance method.
Methods can be declared `public' (the default), implying availability
to all. `protected' implies availability to the class itself and
subclasses thereof. `private' means availability only to the class
itself. `private' methods are intended to be used like C macros and
are statically bound.
Usability
The most important feature of tom is the way it advocates usability.
Usability is the extent to which a class can be tailored to specific
needs in case access to the source of certain classes is restricted.
Such a restriction can be caused by the unavailability of the source
of a binary distributed library, or it can be imposed on certain
sources by the policy of a design team.
If a class exists, but it does not totally fulfil your needs, due to
bugs for instance, it is a waste of time to have to completely write
it, or a very similar class, from scratch. Furthermore, if instances
of the faulty class are allocated in a location beyond your control
(within that library, for instance), not being able to tailor it might
just mean abstaining from using the whole library altogether.
There are a lot of other circumstances where usability is hampered by
a language. For instance, in Eiffel, the availability of only the
short form of a class, as opposed to its long form, imposes
restrictions on subclassing that class. Another example is the
declaration of `virtual' methods in C++: If a superclass does not
declare a method as being `virtual', a subclass can not override it.
This implies that the user of a library has his or her actions
constrained by the designer of the library being used.
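The C++ restriction just mentioned can be illustrated as follows. The
sketch assumes a hypothetical library class `Widget' whose designer
made only one of its two methods `virtual'.

```cpp
#include <string>

// Stands in for a class from a library; only dynamic() is virtual.
struct Widget {
    std::string plain() const { return "Widget"; }
    virtual std::string dynamic() const { return "Widget"; }
    virtual ~Widget() = default;
};

// A user's subclass tries to replace both methods.
struct MyWidget : Widget {
    // Hides Widget::plain but does not override it.
    std::string plain() const { return "MyWidget"; }
    std::string dynamic() const override { return "MyWidget"; }
};

// Library-side code, which knows only the Widget interface.
std::string call_plain(const Widget &w) { return w.plain(); }
std::string call_dynamic(const Widget &w) { return w.dynamic(); }
```

Given a `MyWidget', `call_dynamic' reaches the subclass method while
`call_plain' still runs the library's version: the subclass cannot
change what the library itself does with `plain'.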
TOM has the following features to increase its usability beyond
anything achieved by other languages.
Dynamic Method Binding
A class can in no way restrict the way in which a subclass overrides
any method.
Extensions
An extension adds state and behaviour to an existing class. This
addition can be performed explicitly, or by inheritance [note 5a].
An extension may replace the implementation of a method by one
deemed more appropriate.
In the context of dynamic loading, if the class which is extended,
or any of its subclasses, already had instances allocated, the
extension may not add instance variables.
Class Posing
A class can be declared to pose as another class. The effect of
this is similar to that of an extension, with one major difference,
being that, when `replacing' a method, the original method is still
available for invocation.
In the context of dynamic loading, if the class which is posed, or
any of its subclasses, already had instances allocated, the posing
class may not add instance variables.
`this' (`self')
For any incarnation of an object, the value of `this' (C++ speak;
`self' in Smalltalk speak) is identical. Thus, in tom, each object
has a unique identity which can not be changed.
This is unlike C++, where a `FooBar', inheriting from both `Foo' and
`Bar', has a different `this' in methods implemented by the `Bar'
and in methods implemented by the `Foo'. Even worse, if a `FooBar'
is passed to a method as a `Bar' and later on retrieved, it won't
behave as a `FooBar' anymore; it will have become a `Bar', even
though, by inheritance, it is contained in a `FooBar' [note 6].
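The C++ behaviour can be made visible with a short sketch. The method
names `foo_this' and `bar_this' are hypothetical. Passing the object
by value is one reading of `passed to a method as a Bar'; the
by-reference variant is included for contrast.

```cpp
struct Foo {
    int foo_state = 0;
    const void *foo_this() const { return this; }
    virtual ~Foo() = default;
};

struct Bar {
    int bar_state = 0;
    const void *bar_this() const { return this; }
    virtual ~Bar() = default;
};

struct FooBar : Foo, Bar {};

// By value, only the Bar part is copied; the copy is a plain Bar.
bool still_foobar_by_value(Bar b) {
    return dynamic_cast<FooBar *>(&b) != nullptr;
}

// By reference, the FooBar can be recovered, at the cost of a checked cast.
bool still_foobar_by_ref(Bar &b) {
    return dynamic_cast<FooBar *>(&b) != nullptr;
}
```

For one and the same `FooBar', `foo_this()' and `bar_this()' return
different addresses, so comparing them says nothing about identity.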
When `this' is identical in all methods, from any superclass of an
object, `eq' testing (the fastest and semantically most transparent
way of testing for equality, or better, testing for identity) is
meaningful. `Sometimes the whole is greater than the sum of the
parts'. Keeping `this' identical in all incarnations of an object
implies that the whole is always greater than the sum of the parts.
Dynamic Loading
Dynamic loading (of classes and extensions) is defined as part of
the standard environment.
Overloading
Methods can be overloaded on the type of the arguments. Note that
objects all share the same type, the object type, implying that
overloading on object class is not possible. The reason for this is
that, because of the usability features, the signature of the method
to be invoked can not be decided at compile time from the class of
an argument, and deducing it at run time is considered too
expensive.
Note that the return type of a method is not part of the method's
signature: the return type is insignificant when selecting which
method to invoke. The return types of methods with the same name but
different signatures do not need to be equal.
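For comparison, C++ does allow overloading on object class, but
selects the method from the static type of the argument. The sketch
below uses hypothetical names; it shows both that kind of overloading
and overloading on basic types with differing return types.

```cpp
#include <string>

// Overloading on basic types; the return types differ as well.
struct Printer {
    std::string show(int) const { return "int"; }
    double show(double x) const { return x * 2; }
};

struct Any { virtual ~Any() = default; };
struct Apple : Any {};

// Overloading on object class, as C++ permits it.
std::string describe(const Any &) { return "Any"; }
std::string describe(const Apple &) { return "Apple"; }

// The overload is selected at compile time from the static type Any.
std::string describe_via_any(const Any &a) { return describe(a); }
```

An `Apple' passed through `describe_via_any' is described as an
`Any': the class of the actual object plays no part in the selection.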
Operators can not be overloaded in the C++ sense: addition of objects
is undefined.
Method Arguments and Return Values
Method arguments are passed by value. Any number of values can be
returned by a method, by collecting them in a tuple. This way, `in'
parameters are normal arguments, `out' parameters are part of the
return value, and `inout' parameters are handled through a combination
of an `in' parameter and an `out' parameter.
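The same convention can be approximated in C++ with `std::tuple'. This
is only a sketch of the idea, with hypothetical function names; it is
not tom syntax.

```cpp
#include <tuple>

// Two `in' parameters, two `out' values collected in the return tuple.
std::tuple<int, int> divmod(int dividend, int divisor) {
    return {dividend / divisor, dividend % divisor};
}

// An `inout' parameter: counter is passed in by value and handed back
// as the first element of the result.
std::tuple<int, bool> bump(int counter, int limit) {
    int next = counter + 1;
    return {next, next >= limit};
}
```

A caller can unpack the result with `std::tie (c, done) = bump (c, 2)',
which is the closest C++ gets to a simultaneous assignment.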
The arguments in a method invocation are evaluated from left to right.
The elements of a tuple argument are evaluated from left to right.
Interfacing to Other Languages
The tom compiler translates tom to C [note 7]. Any tom method can be
implemented in C by declaring it `foreign "C"'; tom methods can be
easily invoked from C, and most interesting C functions can be invoked
from within tom. Note that such `normal' C functions are still
dynamically bound.
Namespaces and scoping
The global namespace holds units. A unit is conceptually like a
library; it holds classes and extensions, the latter probably adding
or modifying classes from other units.
Within a unit, all class names must be unique. Class names can be
qualified. When referring to a class without qualifying its name, it
either denotes a class in the current unit, or must uniquely denote a
class in any other known unit.
Extensions are never referred to from within code, but they do have a
name to be able to discern all extensions of a class. The names of
all extensions of a class must be unique within that class. Thus,
extension names are never qualified.
Casting
Numeric types can be cast to each other; the compiler will perform
such a cast implicitly if necessary.
An object can be implicitly cast to one of its superclasses. It can
explicitly be cast to one of its subclasses. The reason for this is to
accommodate the following, normal, scenario: Suppose the existence of
an object of which some instance variable can be read and set through
the methods `(A) value' and `(void) setValue: (A)', respectively.
This variable can be set to any B, as long as B is a subclass of A.
If the value is then retrieved, the object is still a B, and should be
allowed to be a B. The B can then be used as an argument to another
method, which expects an object of type C, as long as B is a subclass
of C.
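The scenario above looks as follows when written in C++. The class
`Holder' and the function `extra_of' are hypothetical; the checked
`dynamic_cast' stands in for tom's explicit cast to a subclass.

```cpp
struct A { virtual ~A() = default; };
struct B : A { int extra = 42; };

// An object with a value that can be read and set as an A.
class Holder {
    A *value_ = nullptr;

public:
    A *value() const { return value_; }
    void setValue(A *v) { value_ = v; }  // a B* converts implicitly
};

// Explicit cast back to the subclass; yields -1 if the value is not a B.
int extra_of(const Holder &h) {
    B *b = dynamic_cast<B *>(h.value());
    return b ? b->extra : -1;
}
```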
Note that this free casting is only possible if the value of `this'
is the same for all incarnations of an object. Otherwise an elaborate
virtual table administration must be maintained, and casting becomes
an expensive operation due to the checks needed [note 8].
TOM RUNTIME
This section presents some features which are available to tom
programs and which are provided primarily by the tom runtime.
Memory Management
The tom runtime provides automatic, configurable, time constrained,
garbage collection.
Flexible Method Forwarding
If an object does not respond to a method, the method `forward::' is
invoked instead, which is passed the original selector and an array of
boxed arguments (i.e. the original arguments with each unboxed
argument wrapped in an object).
The basic method for computed method invocation is `perform::',
which accepts as its arguments the selector of the method to be
invoked and an array of boxed arguments. The implementation of
`perform::' as inherited from the State class will invoke the
indicated method of the receiver with the indicated arguments. Note
that this circumvents the encapsulation of methods as known to the
compiler. Also note that methods declared `private' can not be
perform::ed.
These two methods together form the basis of a lot of possible
abstract functionality, such as method currying and distributed
objects.
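How the two methods cooperate can be sketched in C++ with a method
table keyed by selector. Everything here is an assumption made for
illustration: the names `Object', `Proxy' and `define', the use of
`int' in place of boxed arguments, and the choice to raise an error
in the default `forward'. It does not describe the tom runtime.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using Args = std::vector<int>;  // stands in for the array of boxed arguments
using Method = std::function<int(const Args &)>;

class Object {
    std::map<std::string, Method> methods_;

public:
    virtual ~Object() = default;

    void define(const std::string &selector, Method m) {
        methods_[selector] = std::move(m);
    }

    // Computed invocation: look the selector up at run time and fall
    // back to forward() if the object does not respond to it.
    int perform(const std::string &selector, const Args &args) {
        auto it = methods_.find(selector);
        if (it != methods_.end()) return it->second(args);
        return forward(selector, args);
    }

    // Default behaviour (an assumption of this sketch): report an error.
    virtual int forward(const std::string &selector, const Args &) {
        throw std::runtime_error("unrecognized selector: " + selector);
    }
};

// A proxy responds to nothing itself and forwards everything.
class Proxy : public Object {
    Object &target_;

public:
    explicit Proxy(Object &target) : target_(target) {}

    int forward(const std::string &selector, const Args &args) override {
        return target_.perform(selector, args);
    }
};
```

A `Proxy' whose target lives in another process would be the starting
point for the distributed objects mentioned above.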
NOTES
1 The word `tom' is written in same-caps: normally lower case but if
rules apply which state that the word should be capitalized, tom is
written in upper case.
1a Yes, I know I should renumber the notes.
1b These basic types very much resemble the types as found in Java.
1c Enumerations and bitsets are being considered as a type.
1c1 Obviously, this is more than a notational convenience; it is also
needed to be able to put parentheses around an expression.
1d Arrays are also being considered as a type. If included, arrays will
be available in various forms, with fixed or flexible base index and
with a fixed or flexible upper bound. Every array access will of
course be bounds checked.
2 Of course, meta class objects also exist in the runtime; they are
needed to describe the behaviour and state of the class objects; the
behaviour of meta class objects is defined by the State meta class
object; meta class objects do not have state.
On a related issue, in Objective-C the distinction between a class and
its instances is rather blurred. For example, it does not matter if
you ask `[a isKindOf: [Foo self]]' or `[a isKindOf: [Foo class]]',
whereas at most one of these expressions will return TRUE in tom.
Other examples are the usage of `+new' to set instance variables and
the fact that class objects can not conform to instance methods set
for a protocol, or the other way around.
3 The `State' class is needed by the runtime system. The `State'
instance provides a reference (by the name of `isa') to the object's
class. The `State' class provides an identically named reference to
the class' class.
3a However, if a behaviour class defines state, it may not be inherited
as behaviour for an instance.
4 The term `usability' is defined later in this document.
5 Why the value of `this' is not allowed to change is defined later in
the document, in the section on usability.
5a With this definition, a class is nothing more than a container of
extensions, the main extension being mandatory.
6 Only through virtual methods will the retrieved Bar behave as a
FooBar.
7 A better compiler would generate assembly instead of C. For instance,
a C compiler can not know that the variable `isa' and the binding (and
other runtime) information it points to are all constant.
8 Also note that casting can be used too liberally by calling
everything an `Any', the superclass of all classes, although arrays
and other containers are not parameterized in the objects they hold;
the element of a generic container is always an Any.