-*- fill-prefix: " " -*-
1 [Thu Jan 11 15:10:23 1996] Created.
2 [Thu Jan 11 15:10:43 1996] A little something on (implicit) up and
down casting, together with the problem of casting through mutability.
The latter is the problem of being able to address an Array as a
MutableArray, if the latter plainly is a subclass of the former and
downcasting is allowed. Obviously, upcasting is always allowed: if
Foo is a Bar, then one can pass a Foo where a Bar is expected.
A possible solution for the mutable subclass problem seems to be to
extend the language to be able to have two classes: `Array' and
`mutable Array', and to not allow downcasting through a mutable
qualification. However, it is unclear what kind of implications this
will have.
Another possibility is the following: Up to now, I always thought that
casting an A to a B was allowable if they had a common subclass, since
then it is possible for something assumed to be an A to actually be a
C which also happens to be a subclass of B. However, this can only be
true if A and B are not both state instances. Thus, either A is a
state instance, or B is, but not both. I guess that the proper use of
behaviour added to state instances could solve the mutable subclass
problem, though I still have to work this out.
3 [Thu Jan 11 17:51:45 1996] There exists public, protected and private
at the level of classes. Is it desirable for such hiding to exist at
the level of modules/units/packages/insert-your-favourite-name?
4 [Thu Jan 11 22:50:08 1996] One moment to stop and write about units,
now that I have written a bit of them, and of the parser. A unit file
declares the classes and extensions present in the unit. The compiler
resolves qualified class names using the unit definition files. The
resolver will use them to know what to collect from
5 [Fri Jan 12 13:04:43 1996] Nice abstraction, that Smalltalk `Process'
class.
6 [Wed Jan 17 14:50:19 1996] When a system is built to include
condition checking, the post conditions of methods not catching an
exception must also be checked. This probably means that including
condition checking requires recompilation (at least of those parts of
which conditions are to be checked?).
7 [Wed Jan 17 16:36:23 1996] Institutionalize uniqued strings, which
actually employ a subclass of the normal (non-mutable) strings, and
which override `hash'. Hashing (strings) repeatedly while searching
through several dictionaries (or sets) is a waste of time.
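A minimal sketch in C of the idea, assuming a hypothetical
`unique_string' record that caches its hash when it is uniqued, so
that later dictionary lookups neither rehash nor compare characters.
The names (`unique_string', `intern', `ustr_hash') and the fixed-size
bucket table are illustrative only and not taken from any existing
runtime.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical uniqued string: one instance per distinct contents. */
struct unique_string {
    const char *chars;
    unsigned long hash;          /* computed once, at uniquing time */
    struct unique_string *next;  /* bucket chain */
};

#define N_BUCKETS 64
static struct unique_string *buckets[N_BUCKETS];

/* Ordinary string hash (djb2); only run when a string is uniqued. */
static unsigned long hash_chars(const char *s)
{
    unsigned long h = 5381;

    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Return the unique instance for S, creating it on first use. */
struct unique_string *intern(const char *s)
{
    unsigned long h = hash_chars(s);
    struct unique_string *u;
    char *copy;

    for (u = buckets[h % N_BUCKETS]; u; u = u->next)
        if (u->hash == h && strcmp(u->chars, s) == 0)
            return u;

    u = malloc(sizeof *u);
    copy = malloc(strlen(s) + 1);
    assert(u != NULL && copy != NULL);
    strcpy(copy, s);
    u->chars = copy;
    u->hash = h;
    u->next = buckets[h % N_BUCKETS];
    buckets[h % N_BUCKETS] = u;
    return u;
}

/* The overridden `hash': a field read instead of a pass over the chars. */
unsigned long ustr_hash(const struct unique_string *u)
{
    return u->hash;
}
```

Two uniqued strings with equal contents are the same object, so a
dictionary keyed on them can compare pointers and reuse the stored
hash in every table it searches.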
8 [Sat Jan 20 19:42:12 1996] [The results of Michael and Tiggr visiting
the Amsterdam Arena.] Multiple state inheritance, as implemented in
C++, has the disadvantage that for `FooBar', inheriting from `Foo' and
`Bar', the value of `self' (or `this' in C++ speak) is different in
methods implemented by `Foo' and those implemented by `Bar'. In fact,
passing a `FooBar' where a `Bar' is expected may not pass a pointer to
the `FooBar' but to the `Bar' part of it. In short, if I have a
`FooBar', pass it as a `Bar' and later retrieve it (thus as a `Bar'),
I won't be able to handle it as a `FooBar', even though it is, or
better, was.
Thus, in short, the problem is that the value of `self' is not
identical in all methods which can ever be invoked for a given
instance of a class having multiple state superclasses.
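The pointer adjustment can be mimicked in plain C, assuming the layout
a typical C++ compiler would pick for non-virtual multiple inheritance
(the `Foo' part first, the `Bar' part after it). The struct and
function names are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct Foo { int foo_ivar; };
struct Bar { int bar_ivar; };

/* Assumed layout of a class inheriting state from both Foo and Bar. */
struct FooBar {
    struct Foo foo;   /* at offset 0: `self' for Foo's methods */
    struct Bar bar;   /* at a non-zero offset: `self' for Bar's methods */
    int foobar_ivar;
};

/* Passing a FooBar where a Bar is expected passes the Bar part. */
struct Bar *as_bar(struct FooBar *fb)
{
    assert(fb != NULL);
    return &fb->bar;
}

/* Going back only works if it is known that this Bar lives in a FooBar. */
struct FooBar *bar_to_foobar(struct Bar *b)
{
    return (struct FooBar *)((char *)b - offsetof(struct FooBar, bar));
}
```

The pointer returned by `as_bar' differs from the pointer to the whole
object, and nothing in the `Bar' part itself records how to get back.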
A possible solution to this problem is to access instance variables
indirectly, much like pointers to method implementations are retrieved
indirectly. The latter are keyed on the selector of the method being
invoked. Instance variables can be related to a unique value, which
can then be used to retrieve the offset from `self' to the desired
instance variable. This value is called the `ivar index value' (iiv).
Of course, for every class, all instance variables, including those
inherited, must have a unique index value. In the context of dynamic
loading, which can introduce new classes, this implies that all
instance variables of all classes must have a unique index value and
also that the index value can not be precomputed for dynamically
loaded code. In fact, libraries, which are not built in the context
of the resulting application executable also can not use precomputed
iiv's. If this doesn't seem obvious, consider two libraries, each
having been built in an empty context, which are then used in the same
application.
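A minimal sketch in C of such an indirect access, assuming each class
carries a table (the iit) that maps an iiv to a byte offset from
`self'. The layout, the `NO_IVAR' marker and all names are
hypothetical. The iiv's are kept in global variables because, as
noted above, they can not be precomputed for libraries or dynamically
loaded code.

```c
#include <assert.h>
#include <stddef.h>

#define NO_IVAR ((size_t)-1)     /* class does not have this ivar */

struct class_info {
    size_t n_iivs;               /* number of entries in iit */
    const size_t *iit;           /* iiv -> offset from `self' */
};

/* Every object starts with a pointer to its class. */
struct object {
    const struct class_info *isa;
};

/* Return the address of the ivar with index value IIV within SELF. */
void *ivar_address(void *self, size_t iiv)
{
    const struct class_info *cls = ((struct object *)self)->isa;

    assert(iiv < cls->n_iivs && cls->iit[iiv] != NO_IVAR);
    return (char *)self + cls->iit[iiv];
}

/* Index values, assigned at link or load time rather than compiled in. */
size_t iiv_foo_x = 0;
size_t iiv_bar_y = 1;

/* A FooBar instance: Foo's ivar `x' followed by Bar's ivar `y'. */
struct foobar {
    struct object base;
    int x;
    int y;
};

static const size_t foobar_iit[] = {
    offsetof(struct foobar, x),
    offsetof(struct foobar, y),
};
const struct class_info foobar_class = { 2, foobar_iit };
```

A method written for `Bar' would reach `y' through
`ivar_address(self, iiv_bar_y)' and so works unchanged on a `foobar',
with `self' pointing at the start of the whole object.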
The multiple inheritance semantics resulting from this mechanism for
instance variable access imply that repeated inheritance is always
shared inheritance. This seems not illogical, since copied
inheritance raises the question whether the distinction between the
`is-a' and `part-of' hierarchies has been garbled.
Another observation, resulting from having an identical `self' in all
methods invokable for a given object, is that it implies maximum code
reuse, as opposed to the source reuse of Eiffel, where inheritance
sharing can be defined for each distinct instance variable, but only
if the long form of the superclasses is available (the short form does
not suffice for this).
A (raw, imprecise, header file based, not showing `hidden' classes)
count of instance variables in Objective-C class libraries shows +/-
150 for <foundation/foundation.h>, +/- 800 for the <appkit/appkit.h>,
+/- 300 for <dbkit/dbkit.h>, +/- 200 for <eoaccess/eoaccess.h>, and
+/- 200 for <eointerface/eointerface.h>.
This implies that, for a normal EOF application, which uses appkit,
foundation, eoaccess, and eointerface, there are more than 1000
instance variables. Clearly, a straight table for +/- 200 classes in
that application would induce too high a memory penalty: 200 * 1000 *
4 = 800k of memory.
Fortunately, the instance variable index tables are very sparse, and
only those locations containing information are actually being used.
This means that tables can be shared as long as they do not contain
conflicting information. Such conflicts arise because of multiple
inheritance: The tables for `Foo' and `Bar' can not be merged and used
for `FooBar', since the offsets for the ivars from one of the classes
will have shifted according to the size of the other. Observe that in
a single state inheritance lattice, a single table can be used.
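The sharing condition can be sketched in C as follows, assuming a
table is a plain array indexed by iiv in which unused slots hold a
hypothetical `EMPTY' marker. Two tables may share storage when no
slot holds different offsets in both. The function names are
illustrative and not part of any existing runtime.

```c
#include <assert.h>
#include <stddef.h>

#define EMPTY ((size_t)-1)       /* slot carries no information */

/* Do tables A and B (N slots each) agree wherever both are in use? */
int iit_compatible(const size_t *a, const size_t *b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (a[i] != EMPTY && b[i] != EMPTY && a[i] != b[i])
            return 0;
    return 1;
}

/* Overlay SRC onto DST if they are compatible; return 1 on success. */
int iit_try_merge(size_t *dst, const size_t *src, size_t n)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    if (!iit_compatible(dst, src, n))
        return 0;
    for (i = 0; i < n; i++)
        if (dst[i] == EMPTY)
            dst[i] = src[i];
    return 1;
}
```

With `Foo' using slot 0 and `Bar' using slot 1, both at offset 4, the
two tables merge without conflict. A `FooBar' table that places
Bar's ivar at offset 8 conflicts with the merged table in slot 1 and
needs storage of its own.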
9 [Sun Jan 21 14:38:18 1996] A quick test (see test-iit.c) on the m68k
shows that, with SELF on the stack, a direct ivar read reference is
two (memory referencing) instructions. A reference through the ivar
index table (iit) using a precomputed iiv is two instructions extra
(both memory referencing, though GCC 2.7.0 throws in an extra, and
unnecessary, register-to-register transfer); in case the iiv is not
precomputed another extra reference is needed. Of course, these
numbers change with register allocation and CSE activity.
10 [Sun Jan 21 21:50:35 1996] Running a little test on C++, it shows that
a direct ivar reference is equally expensive (obviously), being 2
instructions. When accessing a member of a virtual superclass, one
extra instruction is needed. This is cheaper than using a precomputed
iiv, but at the expense of the virtual superclass table pointer in
each instance.
In fact, given that virtual inheritance or virtual methods are not
that uncommon, the following statement can be assumed to be generic: a
class inheriting from multiple superclasses not only needs a (pointer
to) a virtual table itself, but also inherits a (pointer to) the
virtual table from each of its superclasses. This implies that, for a
class A: B, C, each of which needs a virtual table, 12 bytes per
instance are needed for table maintenance.
11 [Sun Jan 21 22:20:58 1996] Obviously, the size of the tom ivar index
tables can be greatly reduced by providing only one entry per class
introducing new instance variables. However, this implies that, for
all but the first ivar of each class, the offset must be computed by
adding the `class ivar offset' retrieved from the iit and the offset
of the desired ivar, which is known at compile time. Thus, it is
questionable whether the extra add instruction is desirable and worth
it for saving some memory in the iit's.
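A sketch in C of this more compact variant, assuming the table holds
one base offset per class that introduces ivars and that the compiler
knows each ivar's offset within its introducing class. For brevity
the table pointer sits directly in the object here; all names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Object header: a table with one entry per ivar-introducing class. */
struct object {
    const size_t *class_offsets;   /* class index -> offset of its ivars */
    size_t n_classes;
};

/* Address of an ivar: table lookup plus a compile-time constant. */
void *ivar_address(void *self, size_t class_index, size_t offset_in_class)
{
    const struct object *obj = self;

    assert(class_index < obj->n_classes);
    /* The extra add in question is the `+ offset_in_class' below. */
    return (char *)self + obj->class_offsets[class_index] + offset_in_class;
}

/* Ivars introduced by a hypothetical class Bar. */
struct bar_ivars { int first; int second; };

/* An instance of FooBar: header, Foo's ivar, then Bar's ivars. */
struct foobar {
    struct object base;
    int foo_ivar;
    struct bar_ivars bar;
};

enum { CLASS_FOO, CLASS_BAR };

static const size_t foobar_offsets[] = {
    offsetof(struct foobar, foo_ivar),
    offsetof(struct foobar, bar),
};
```

For the first ivar of a class `offset_in_class' is zero, so a compiler
could drop the add in that case, which matches the "all but the first
ivar" remark above.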
Actually, if the iit holds an entry per ivar (not per class) and if
we're not using precomputed iiv's, recompilation of all existing
methods for a class is not necessary if instance variables are added,
since the global variables already holding, at run time, the iit's
will not have changed. Of course, resolving and linking must still be
redone.
12 [Tue Jan 23 17:14:36 1996, tiggr@tom] Another thing about dynamic ivar
binding: Currently, I expect all stuff needed by the runtime to
be created by the resolver in one big C file, and that all selectors
and iit's will be close to their own species and ordered by class.
This implies such a locality of reference that it will increase the
CPU's cache efficiency, compared to the case where all stuff needed
would be put in the C file resulting from compiling the tom file in
which the selector or ivar was defined.
13 [Wed Apr 24 01:35:23 1996, tiggr@tricky.es.ele.tue.nl] Instance-less
classes can be useful.
14 [Wed Apr 24 01:35:34 1996, tiggr@tricky.es.ele.tue.nl] Optional parts
in method names can be useful. The idea is that every argument can be
assigned a default value. This value will be used (at compile time!)
to complement the argument provided by the caller. Thus, actual
shorter selectors will not be used (this raises the question what to
do with subclasses defining methods with more optional parts).
15 [Wed Apr 24 01:55:04 1996, tiggr@tricky.es.ele.tue.nl] Names must be
fully scopable. Thus, just like class names can be scoped by the
unit, any element defined for a class must be scopable by the class
name. Clashes between instance and class elements become an error.
16 [Sun Apr 28 00:26:57 1996, tiggr@tricky.es.ele.tue.nl] Suppose you
have a FileStream---an open file abstraction. It is a subclass of the
fundamental stream abstraction, BasicStream. However, the BasicStream
has many implementations on different platforms: BasicUnixStream is
different from BasicVMSStream. Still, you always want FileStream to
be a subclass of the actual implementation of the BasicStream. Posing
(by the actual implementing class for this environment) solves this
problem (that FileStream never needs to inherit from a stream for a
specific environment, but can inherit the generic BasicStream), but
the question is whether it suffices. Another question is what problem
I'm trying to solve, and whether every extension to the language made
to solve problems like this isn't just an implementation of a
particular use of cpp...
17 [Mon Apr 29 23:35:35 1996, tiggr@tricky.es.ele.tue.nl] What if the two
implicit arguments SELF and CMD are put in a tuple to be a single
first implicit argument typed `(id, selector)'?
18 [Tue Jan 28 22:20:37 1997, tiggr@tricky.es.ele.tue.nl] Units as the
unit of linkage do not suffice. For example, when DCE is available
on HP-UX, the Thread class in the tom unit will provide an abstraction
to the threads provided by DCE, and every TOM program will need to be
linked against libdce. As a result, all kinds of library and system
calls will come with a significant penalty due to the locking and
unlocking of critical regions (even when running single-threaded but
that shows only part of libdce's braindeadness).
Units should provide only a namespace. The unit of linkage should be
the class or extension, commonly packed in an object file. Should
this be implemented, a mechanism is needed to indicate which classes
and extensions are needed by some class or extension. Obviously,
using a class implicitly needs the class' superclasses and invoking a
method which is implemented by an extension marks that extension as
needed.
For example, in the current setup of the TOM unit, the Thread class is
needed when the method `performInThread' provided by the All instance
is used. This suggests that this method should be provided by an
optional extension of All which is depended upon by the Thread class,
as the means of starting new threads, and which itself depends on the
Thread class to provide its functionality. Furthermore, the Thread
class should have linkage information that makes libdce linked in when
the Thread class is.
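One way to picture such a mechanism is as a transitive closure over a
`needs' relation. The sketch below, in C, assumes each linkable item
(class or extension) lists the items it needs by index. The structure
and names are invented for the illustration and say nothing about how
the resolver actually records dependencies.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NEEDS 4

/* A class or extension, packed in its own object file. */
struct link_item {
    const char *name;
    size_t n_needs;
    size_t needs[MAX_NEEDS];   /* indices of items this one requires */
    int needed;                /* set once the item must be linked */
};

/* Mark item I, and everything it transitively needs, as needed. */
void mark_needed(struct link_item *items, size_t n_items, size_t i)
{
    size_t k;

    assert(i < n_items);
    if (items[i].needed)
        return;                /* already visited; also breaks cycles */
    items[i].needed = 1;
    for (k = 0; k < items[i].n_needs; k++)
        mark_needed(items, n_items, items[i].needs[k]);
}
```

With the Thread class and the optional extension of All depending on
each other, a program that uses neither leaves both unmarked, and
marking either one pulls in the other. Library requirements such as
libdce could then hang off the marked items.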