In this note I propose simplified, modern definitions for "object"
and "object oriented". A modern definition is needed
because we have learned quite
a bit over the last 20 years since the last time there was a concerted
effort to define objects. Due to extensive experimentation, we can
distinguish what is absolutely essential from what is common and
useful. Eliminating the non-essential allows the definitions to
be simpler. Simplicity will help in communicating essential ideas broadly,
while still enabling more detailed discussion of advanced features
within the OO community.
Simplicity is almost always
the result of hard work, it is not the starting point.
This effort is entirely one of wording and presentation.
The fundamental nature of objects is well-known and understood
by the OO community, and these definitions will not change those
characteristics. However, this proposal does change some of the
words that we use to talk about our shared understanding.
This is necessary because our previous definitions have become
unwieldy and out of sync with everyday usage.
It is becoming clear to me that this summary is far too concise.
If there is interest, I might expand it with examples and additional
discussion. It might even turn into a book, if I find the time. If
you want to discuss this topic in more detail and have something
to contribute, please email me. We have an active discussion group.
This note is a companion to my essay
On understanding data abstraction, revisited [1].
For additional background, see [2,3].
This note begins with the proposed definitions followed
by some discussion of terms used in the definitions. It then reviews
related concepts, including inheritance, mutable state, etc. I also
discuss the implications of these definitions for describing and
classifying existing programming languages. Finally, I provide some
background on previous definitions of "object" and "object oriented".
Definitions
The definitions given below are intended to capture the characteristics that
are essential and required for something to be an object, not what features are useful
or frequently included in object-oriented languages. Since there are many ways to
implement objects, the definitions are formulated in terms of the properties that
an object must satisfy, not the internal details of how they are constructed.
An
object is a first-class, dynamically dispatched behavior.
A
behavior is a collection of named operations that can be invoked by clients
where the operations may share additional hidden details.
Dynamic dispatch means that different objects can implement the same operation name(s) in different ways,
so the specific operation to be invoked must come from the object identified in the client's request.
First class means that objects have the same capabilities as other kinds of values,
including being passed to operations or returned as the result of an operation.
A language or system is
object oriented if it supports the dynamic
creation and use of objects.
Support means that objects
are easy to define and use. It is possible to encode objects in C
or Haskell, but an encoding is not support.
Behavior
The new definition focuses on the key characteristic of objects
as
behavioral abstractions. In other words, objects are known only by what they
do.
In common usage, "behavior is the range of
responses that an agent may have in response to stimulus from their environment.
For objects, the environment is represented by
clients, which are external
entities that make use of the object. The stimulus from the environment to
the object is represented by
invocation of an operation on the object.
Thus the collection of operations provided by the object defines the
range of behaviors it may have to stimulus from clients, which represent its
environment.
The use of the word "behavior" makes the definition somewhat abstract, but it
avoids tying the definition to any implementation technology. However, it is
perfectly reasonable to substitute more specific words in place of behavior, and
the definition is still meaningful. Consider this slight variation of the
definition: "An
object is a first-class, dynamically dispatched X that
provides named operations that can be invoked by clients." In this version,
X can be replaced by "module", "record", "structure", "function", "entity",
or many other words and still be essentially valid definitions. The point is
that the essential characteristics are "dynamic dispatch", "first-class",
and "collection of operations". Exactly how the object is concretely represented
is not important.
Behavioral abstractions are effective for modeling
a wide range of concepts, including real-world objects, processes,
algorithms and also data. But it has not been proven that objects
are the only way, or are even an inherently better way, to model
real-world objects. Modeling concepts (including data) by their
observable behavior contrasts sharply with other approaches based
on data structures, relations, or algebraic data types. One might
argue from the other direction, that any program that uses behavioral
abstraction to model data is object oriented.
Taken to the extreme, we might say that only thing that can be known about an object is its behavior.
This idea is implemented in Microsoft COM, for example. However, it is often expedient or
useful to allow more kinds of interactions between clients and the objects they manipulate.
The definition requires that objects have behavior, but it does not prohibit other
features from being included.
However, objects are required to present a behavioral interface to clients,
not a pure structural one. This means that some classes in Java do
not create objects, but instead create structures. An object that
contains operations and public fields is a hybrid. I think this
fits well with accepted usage of terminology.
First-class values and declarations
There is a subtle distinction
between use of a value being first class and a declaration form
being first class. The definition of objects invokes the usage
criteria: it says that objects can be used anywhere that values can
be used. There is general consensus that first-class functions
require both the usage and the declaration criteria: functions are
first class only if they can be used anywhere that other values can
be used, and also that functions can be declared anywhere in the
program. If functions only had to meet the usage criteria, then C
would have first-class functions, because it supports function
pointers. To be truly first class, a declaration form must be allowed
anywhere and also have access to all the lexical context of enclosing
scopes. For example, class declarations are not truly first class
in C++: a class can be defined inside a function, but the class
does not have access to the lexically enclosing function arguments.
We do
not require objects to satisfy the first-class
declaration criteria, because that would require objects/classes
to be definable anywhere in a program, which is clearly not the
case in some object-oriented languages, including Eiffel. Some
object-oriented languages, including Beta, Grace, Scala, JavaScript
and Self, do have first-class object/class declarations. Java
satisfies the declaration criteria since the introduction of anonymous
and inner classes. Smalltalk also satisfies the declaration criteria,
because classes can be created dynamically anywhere. Nested class
definitions in C++ are second class: they can be defined anywhere,
but do not have access to the enclosing lexical scopes.
Being first-class means that objects have a recognizable
identity.
This is necessary for them to be passed to operations or returned
as results of operations. However, it does not imply that the system
must allow identities of two objects to be compared for equality.
Given the definition of "objects as first-class
behaviors", it should be clear that objects can be used to encode
first-class functions, because an object can contain a single function.
Scala and Java 8 provide syntactic support to make this encoding
easy to use. There is a nice parallel between "first-class functions"
and "first-class behaviors".
It is very important to understand that
objects can also represent what is normally understood as "data".
They do so by representing "data" as "behavior". That is, the data
is defined by what it can
do not by any idea of structure
that can be inspected. This is why many object-oriented languages
do not have any concept of "data structures", for example structs/records
and union in C, Pascal, ML or Haskell.
Given dynamic dispatch, object-oriented languages have little need for
"pattern matching", which tends to expose internal representations
anyway. It is possible to define views on objects that support
pattern matching without exposing the internal representation, e.g.
"unapply" in Scala.
Dynamic Dispatch
Dynamic dispatch of operations is the essential characteristic of
objects. It means that the operation to be invoked is a dynamic
property of the object itself. Operations cannot be identified
statically, and there is no way in general to exactly what operation
will executed in response to a given request, except by running it.
This is exactly the same as with first-class functions, which are
always dynamically dispatched.
Dynamic dispatch is sometimes called "message passing", "late binding",
"dynamic binding",
or "polymorphism".
Programming language researchers
have identified several kinds of polymorphism, including subtype
polymorphism, ad-hoc polymorphism and parametric polymorphism.
Polymorphism is often associated with type systems, but in object-oriented
programming it is a dynamic property.
The Greek word "polymorphism" means roughly "having many forms".
It does not mean that a particular object has many forms, but rather
that a client can interact with many different forms of objects
without having to know exactly which kind is being used.
We call this
object polymorphism.
Object-oriented languages introduced two forms of extensibility:
dynamic dispatch and inheritance. They are often tied together, but they are in fact
separate concepts.
Dynamic dispatch allows new kinds of
objects to be defined that implement the same behavioral interfaces
as existing objects. The new objects can then be mixed together
with the existing objects, without rewriting existing code.
Inheritance is a mechanism for incremental modification of
self-referential structures. Dynamic dispatch is essential to objects,
while inheritance is useful but not absolutely essential.
Discussion
The definitions are certainly simple and, I hope, clearly stated.
Remember that the goal is to capture the
essential
characteristics at an appropriate level of abstraction, using
familiar terminology. That is, the definitions specify what is
required, not what is useful or familiar.
I believe that the definition of "object",
while presented using different terminology, corresponds very closely
to the broad understanding of the concept, and to previous descriptions
in the literature. Some examples from the literature are discussed
at the very end of this note.
The proposed
definition of "object oriented" does not match some previous definitions,
because it leaves out topics that are often considered essential
to objects, especially mutable state, classes, inheritance, and
identity. I argue below that these related ideas, while widely used
and very useful, are not absolutely essential.
The proposed definition
does brings the term "object-oriented" into closer alignment with
current usage. For example, under many previous definitions,
JavaScript is
not an object-oriented language, because it
does not have classes and inheritance. The
definition
on Wikipedia is fairly close to the proposal given here.
Related Concepts
In seeking a fundamental, primitive concept
of "object", there are several ways to identify non-essential
features. One is that if a feature already has a good name and is
orthogonal to the concept of "object", then there is no reason to
include the feature in the definition of "object". Thus, for example,
we might have "mutable objects" and "non-mutable objects" if the
concept of mutation is orthogonal. Another technique is to imagine
whether a significant and useful object-oriented program can be
written without the feature, then that feature is probably not
essential. We can also example the variety of languages that have
been defined to identify their common features. I propose that the
following features are not essential parts of the definition of an
object.
It is interesting that much of the criticism of object-oriented
programming focuses on these optional features.
Mutable State
Mutable state is not essential to
the definition of objects. It is useful and common, but not essential.
It is clearly possible to define objects that do not allow mutation.
One kind of immutable object is the Value Objects, which are quite
common and useful. Obviously mutable state can exist without objects.
But the key point is that, at the level of language semantics,
combining first-class behaviors and mutable state doesn't introduce
any complex interactions. They just work together fine.
There are some interactions
between mutable state and inheritance. At the implementation level
there can also be complex issues. But conceptually, objects and
mutability don't have any fundamental interference between each
other. Thus they are orthogonal.
Excluding
mutable state as a required part of the definition of "object" is
certainly radical. But our understanding of mutable state has
increased considerably in the last 20 years. We have better ways
to program without mutation and better ways to control it. Of the
23 patterns in "Design Patterns: Elements of Reusable Object-Oriented
Software", only State, Memento, Observer, Decorator, and possibly
Chain or Responsibility and Adapter, require mutability. Patterns
involving representing data as behavior but without mutation are
becoming more common.
Inheritance
Inheritance is neither essential
nor only useful for object-oriented programming. You can define objects
perfectly well without (implementation) inheritance in dynamic
languages or static languages with interfaces. Unfortunately, in
some languages (e.g. Simula and C++) implementation inheritance is
the only way to achieve dynamic dispatch. In other languages (e.g. Java)
is is less work to achieve dynamic dispatch using implementation
inheritance than with interfaces. This problems only arise in
statically typed languages that allow classes to be used as types.
In C++, the convention is to use fully abstract classes as interfaces.
Inheriting from such a class is really just a way to declare interface
compatibility. To summarize, in Java "extends" is not essential but
"implements" is. In other words, as long as the language supports
dynamic dispatch, inheritance is not essential. On the other hand,
Java has class
inheritance but not interface inheritance. This is because the
implicit self-reference in an interface is not modified when
an interface extends another interface. Java allows extension of
interfaces, but not true inheritance of interfaces.
Inheritance
is not required for polymorphism/dynamic dispatch. For example, in Smalltalk two
classes can implement the same methods, which allows their objects
to be used interchangeably, even if neither class inherits the
other. Go has dynamic dispatch but not inheritance.
On the other hand, inheritance can also be used for wrapping functions,
deriving new data types, or extending ML-style modules. Functional
programming papers that include a notion of data extensibility often
use a form of "open recursion", which is the functional programmer's
name for inheritance. The essence of inheritance is composition
where the "self reference" of the inherited parts is modified to
refer to the combined structure. As just one example, type and
function inheritance is used extensively in Wouter Swierstra's
paper on Data Type a la Carte.
Delegation is the dynamic analog of inheritance, which is usually
defined statically.
Classes
There are successful and useful object-oriented languages
that do not include a concept of "class". The most well known example
is Self. To me, classes are best understood as factories for creating
objects. As such, they can be implemented as ordinary functions
with a nested object definition. Even in a statically typed language,
there is no requirement that a class act as a type. The idea that
classes are types has caused more confusion and poorly designed
code than anything else I know, other than the use of "null" for
references.
Identity
Identity is often listed as an essential property of
objects. As mentioned above, objects have identity in the
sense that they exist. This basic idea of identity allows
object to be referenced by clients, and also refer to themselves.
However, identity can also mean the ability to determine if two
references denote the same object. By "the same object" I mean that
both references refer to the result of a single object creation
event. In many languages, identity corresponds to the ability to
compare object references for pointer equality. It is possible to
have mutable state without supporting an operation to compare two
references for equality. While it is possible to support identity
on immutable objects, I'm not sure if that is useful. In either
case, this demonstrates that identity and mutability are orthogonal
concepts.
There are good reasons why objects should not be
required to have identity. Objects should be able to impersonate
other objects. This is necessary for the Wrapper pattern to work
properly. Identity also conflicts with the principle that objects
should have absolute control over what clients know about them. If
identity is needed, it is easy to define and implement an interface
that allows identity checking, although this does not prohibit
impostors from faking an identity. It can certainly be useful to
have non-forgeable identities, and this feature can be added to
objects without unwanted interactions. For example, Mark Miller
argues that true identity is required to solve
the Grant
Matcher Puzzle.
One reason for including identity comparison is to be able to write
low-level libraries that need some form of identity, for example serialization
of a graph of objects. However, I believe that all such cases can
be defined by adding a public operation to the object itself,
rather than building identity into the system.
Multi-methods
Mutli-methods do not fit very well into the
definition of objects given above. A multi-method call is not a
request on a particular object to perform an operation. It is a
request to the system to select an appropriate method to operate
on a combination of arguments. However, multi-methods include the
kind of object-oriented dynamic dispatch described above as a special
case, where the dispatch is performed on just one argument. The
question of whether multi-methods are better than other approaches
is still a matter of debate. Proponents argue that multi-methods
give significant additional expressive power, while others argue
that multi-methods reduce the modularity of solutions. The bottom
line is that multi-methods support object-oriented programming, but
multi-methods in their full generality are different from objects
as defined here.
Purity
In a
pure object system, everything is an object.
This is an idea that is difficult to achieve, but Smalltalk
and Self come close.
Other features
Other features, including reflection, static
typing, interfaces, first-class classes, concurrent objects,
synchronization, etc. are included in some OO languages but not
others. This means they are not essential to the
definition of "object", even if they are very useful.
Implications for Languages
The proposed definitions of object
and object-oriented programming are more liberal than previous
definitions. In particular, it allows more languages to be called
"object-oriented" than previous, more restrictive definitions.
The definition does not preclude languages from
supporting other features.
All familiar object-oriented languages are included in
the definition, including Simula, Smalltalk, C++, Eiffel, Beta,
Objective-C, Visual Basic, Self, Java, C#, Perl, Matlab, PHP, Python,
Scala, OCaml, and Ruby. I'm not sure if Lua is object oriented. Of
the top 20 languages in the TIOBE Programming Community, 14 are
object oriented.
One important point is that using an object-oriented
language does not automatically imply that one is doing object-oriented
programming. The definition of "object oriented" states that the
language must
support the creation and use of objects, but
it does not require that they be used. Most languages include a
wide range of features, and even object-oriented features can be
used for other purposes besides objects.
For example,
classes in C++ and Java do not necessarily create objects. In other
words, it is also possible to
not do object-oriented
programming in Java. If you work at it, you can. Just as there were
programmers who wrote Lisp as if it were FORTAN, there are Java
programmers who write as if Java were C or ML. They use classes as
if they were structs, define lots of static methods, and use
"instanceof" for representation-based pattern matching. There is
even an undergraduate PL textbook that uses Java this way. As I
have said before, it is possible to do object-oriented programming
in Java. Since Java is so widely known, there is a tendency to
assume that object-oriented programming is defined to be "anything
you can do in Java". But just because you are using Java doesn't
mean your program is object-oriented.
ML, Modula-2 and the original version of Ada support modules, but are not
object oriented because the modules are not first-class.
Modula-3 is object oriented, and Ada 95 added support for objects to Ada.
JavaScript was not object oriented by most previous
definitions, because it lacks classes and inheritance. It is
object-oriented under the proposed definition, because it supports
both dynamic dispatch and delegation for extensibility.
Erlang is an interesting case. It is
object oriented, (pure) functional, and imperative all at the same
time. Sequential Erlang is pure functional it does not support
mutable data. Concurrent Erlang is imperative because it has aliasing
of mutable processes. This mutable state does not exist within a
process, but rather exists in the system at the level of references
to a process. In other words, the behavior of an process reference
can change over time. The transitions between states of an process
are pure functional. Finally, by sending structured messages that
specify an operation request, an Erlang process can act as an object.
The process is a behavior that uses a dispatch
function to invoke different operation within the object. One might
argue that this is an encoding of objects, not true support, but
the practice seems to be reasonably common in real Erlang programs.
Erlang can also implement objects using parameterized modules.
There is also a convention for describing interfaces, called
behaviors in Erlang.
Erlang can be understood as an
object-oriented language that is locally functional and globally
imperative.
As mentioned above, multi-methods support
objects as as special case, by defining multi-methods that dispatch
on only one argument. CLOS, Dylan, and Cecil are object oriented.
However, it is also possible to program multi-method abstractions
that are not objects.
All languages in the Lisp
family, including Scheme and Racket are object-oriented because it
is possible to define a few simple macros that allow objects to be
created and used as if they were built-in features of the languages.
These macros count as "support" even though they are in some sense
an encoding. Usually objects are defined as closures that dispatch
to the appropriate operation based on an operation name in the
argument list. The result is very similar to the way objects are
implement in Erlang. Such macros exist for all Lisp-based languages,
and in many cases they macros are included as part
of the base language.
Haskell is not object-oriented.
It includes a feature called
type classes that involves
creating "classes" and "instances" of those classes. However, type
classes, as normally used, are statically bound. Hence they do not
meet the requirement that operations are late-bound, or polymorphic,
in objects. One might also argue that the instances are not
first-class. There are many ways to encode objects in Haskell,
including some based on combining type classes and existentials.
There is a recent trend in functional programming to
adopt behavioral representations for data, especially when tackling
the Expression Problem. One great example is Carette, Kiselyov and
Shan's "Tagless, Finally" representation, which is related to the
work on "Polymorphic Embedding of DSLs" in Scala. The influences
between OO and FP go in both directions.
Go is
object-oriented because it supports dynamic dispatch, even though it
does not have inheritance. Protocols in Clojure provide support
for object-oriented programming. [if you can provide more details
on these, please write to me].
As I have suggested
before, the untyped lambda calculus is also object oriented, because
the only way to implement "data" is with behavior. If this is true,
then object-oriented programming and functional programming were
born at the same moment. The story of their fall from grace and
potential future reconciliation is a long and fascinating one.
Discussion
David Barbour argues that any definition of object
must include state. He tweeted "Specific procedures are pure.
Procedures include functions. Does this mean effects aren't the
essential nature of procedures?" I responded "It means that Procedure
= Function + Mutable State. We decompose ideas into their primitive,
orthogonal components." Exactly how we decompose concepts, and what
names we choose to use for these concepts, is not predefined. He
responded "Similarly, I would suggest that 'Object' is not a
primitive, orthogonal concept. Try: Object = Existential + Mutable
State." But this is incorrect, because existentials can be used for
other things, especially ADTs. Pierce's book presents two standard
encodings for objects, but only one of them uses existentials. I
believe that existential encodings of objects are provably
non-essential, because the existentials can always be removed by
simple transformations. In addition, Abadi and Cardelli's object
calculus does not require existentials.
He also
questions whether "object" should be considered a fundamental,
orthogonal concept. It could be explained as a form of codata, for
example. Or we could simply use "first-class module" instead. Some
people clearly have an agenda to preclude using "object" as a name
for a fundamental concept. Other people, including myself, prefer
to use the term "object" in a fundamental way. My guess is that the
winner gets to pick the terminology, just as the winner gets to
write the history.
Thanks to all those who provided
comments and suggestions for previous versions, and especially the
members of
IFIP
WG 2.16 on Language Design for their input.
There
is no way to prove that terminology is correct. Deciding what words
to use for what concepts is a social problem, not a mathematical
one. However, I have talked with hundreds of people about these
issues over the years, and I believe that my proposal is reasonable
given the current usage of terminology in the community.
Historical Notes
Kristen Nygaard, one of the designers of Simula,
wrote that object-oriented "program execution is regarded as a
physical model, simulating the behavior of either a real or imaginary
part of the world." As stated, this is a way of regarding or viewing
an object-oriented program. It is generally understood that Nygaard
intended it as a definition, that the execution of an object is a
physical model of the part of a world. While this is perfectly fine
as a statement of the purpose or goal of using objects, I do not
think it is suitable as a
definition of object-oriented
programming. The problem is that it provides very little guidance
about what an object actually is. In other words, its not implementable
just based on the description. What I take as the key point is that
focus on
behavior, which is central to the definition of
objects proposed here.
In Jul 2003 Alan Kay wrote
"OOP to me means only messaging, local retention and protection and
hiding of state-process, and extreme late-binding of all things.
It can be done in Smalltalk and in LISP. There are possibly other
systems in which this is possible, but I'm not aware of them." The
proposed definition shares a focus on polymorphic invocation of
behavior, which Kay called "messaging". He also mentions state,
which we should assume to be mutable. But the focus of that part
of the discussion is on "hiding", which is also important in the
proposed definition. The final requirement, for "extreme late-binding
of all things" is less easy to interpret. It is this requirement
that leads Kay to reject all other programming languages that were
widely known at the time. I'm curious if he would allow Ruby or
Python to be object oriented.
In another email he
wrote "The big idea is 'messaging' - that is what the kernal of
Smalltalk/Squeak is all about... The key in making great and growable
systems is much more to design how its modules communicate rather
than what their internal properties and behaviors should be."
Peter Wegner's write-up of his OOPSLA 89 Keynote, "Concepts
and Paradigms of Object-Oriented Programming Expansion," gives what
I think is a widely accepted definition:
"Objects are
collections of operations that share a state. The operations determine
the messages (calls) to which the object can respond, while the
shared state is hidden from the outside world and is accessible
only to the object's operations (see Figure 1). Variables representing
the internal state of an object are called instance variables and
its operations are called methods. Its collection of methods
determines its interface and its behavior."
Note that
his first paragraph does not explicitly require the state to be
mutable. His first example does include mutable state, but it is
not highlighted strongly in the definition.
He goes
on to say "The object's behavior is entirely determined by its
responses to acceptable messages and is independent of the data
representation of its instance variables. Moreover, the object's
knowledge of its callers is entirely determined by its messages.
Object-oriented message passing facilitates two-way abstraction:
senders have an abstract view of receivers and receivers have an
abstract view of senders."
Peter Wegner proposed a widely used classification
system: "object-based languages: the class of all languages that
support objects class-based languages: the subclass that requires
all objects to belong to a class object-oriented languages: the
subclass that requires classes to support inheritance"
The proposal given here simplifies this and removes the requirements
to support classes and inheritance.
Alan Snyder: The Essence of Objects: Concepts and
Terms [IEEE Software 10(1): 31-42 (1993)] says "An object embodies
an abstraction characterized by services." He goes on to explain
each of these concepts in detail and mentions that the objects in
question can have mutable state. But the essential concept is the
idea of clients requesting behavior of an object via its public
services.
There many more early
important papers that define "object" in similar ways. Mutable state
is always included in the definition, but it is not the primary
characteristic.
Revised July 13, 12:34pm CDT: Fixed treatment of "functional" and discussion of language implications.
Revised July 13, 6:34pm CDT: Revised discussion of Common Lisp.
Revised July 14, 12:22pm CDT: Added Alan Kay quote. Modified module section slightly.
Revised July 19, 10:46am CDT: Revised extensively to use "behavior" instead of "module".