A Proposal for Simplified, Modern Definitions of "Object" and "Object Oriented"

In this note I propose simplified, modern definitions for "object" and "object oriented". A modern definition is needed because we have learned quite a bit in the 20 years since the last concerted effort to define objects. Due to extensive experimentation, we can distinguish what is absolutely essential from what is common and useful. Eliminating the non-essential allows the definitions to be simpler. Simplicity will help in communicating essential ideas broadly, while still enabling more detailed discussion of advanced features within the OO community. Simplicity is almost always the result of hard work; it is not the starting point.

This effort is entirely one of wording and presentation. The fundamental nature of objects is well-known and understood by the OO community, and these definitions will not change those characteristics. However, this proposal does change some of the words that we use to talk about our shared understanding. This is necessary because our previous definitions have become unwieldy and out of sync with everyday usage.

It is becoming clear to me that this summary is far too concise. If there is interest, I might expand it with examples and additional discussion. It might even turn into a book, if I find the time. If you want to discuss this topic in more detail and have something to contribute, please email me. We have an active discussion group.

This note is a companion to my essay On understanding data abstraction, revisited [1]. For additional background, see [2,3]. This note begins with the proposed definitions followed by some discussion of terms used in the definitions. It then reviews related concepts, including inheritance, mutable state, etc. I also discuss the implications of these definitions for describing and classifying existing programming languages. Finally, I provide some background on previous definitions of "object" and "object oriented".

Definitions

The definitions given below are intended to capture the characteristics that are essential and required for something to be an object, not what features are useful or frequently included in object-oriented languages. Since there are many ways to implement objects, the definitions are formulated in terms of the properties that an object must satisfy, not the internal details of how it is constructed.

An object is a first-class, dynamically dispatched behavior. A behavior is a collection of named operations that can be invoked by clients where the operations may share additional hidden details. Dynamic dispatch means that different objects can implement the same operation name(s) in different ways, so the specific operation to be invoked must come from the object identified in the client's request. First class means that objects have the same capabilities as other kinds of values, including being passed to operations or returned as the result of an operation.
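As a rough illustration of this definition, here is a minimal sketch in Python. The names `Circle`, `Square`, `describe`, and `largest` are hypothetical, chosen only for this example. Each object offers the same named operations, keeps its details hidden, and can be passed around like any other value.

```python
import math

class Circle:
    def __init__(self, radius):
        self._radius = radius  # hidden detail shared by the operations

    def area(self):
        return math.pi * self._radius ** 2

    def name(self):
        return "circle"

class Square:
    def __init__(self, side):
        self._side = side  # a different hidden detail

    def area(self):
        return self._side ** 2

    def name(self):
        return "square"

def describe(shape):
    # The client only names the operations; the object identified in the
    # request supplies the implementation (dynamic dispatch).
    return f"{shape.name()} with area {shape.area():.2f}"

def largest(shapes):
    # First class: objects are stored in a list, passed in, and returned.
    return max(shapes, key=lambda s: s.area())
```

Note that `describe` and `largest` never ask what kind of shape they were given; they rely only on the operations it provides.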

A language or system is object oriented if it supports the dynamic creation and use of objects. Support means that objects are easy to define and use. It is possible to encode objects in C or Haskell, but an encoding is not support.

Behavior

The new definition focuses on the key characteristic of objects as behavioral abstractions. In other words, objects are known only by what they do. In common usage, "behavior" is the range of responses that an agent may have to stimulus from its environment. For objects, the environment is represented by clients, which are external entities that make use of the object. The stimulus from the environment to the object is represented by invocation of an operation on the object. Thus the collection of operations provided by the object defines the range of responses it may have to stimulus from clients, which represent its environment.

The use of the word "behavior" makes the definition somewhat abstract, but it avoids tying the definition to any implementation technology. However, it is perfectly reasonable to substitute more specific words in place of behavior, and the definition is still meaningful. Consider this slight variation of the definition: "An object is a first-class, dynamically dispatched X that provides named operations that can be invoked by clients." In this version, X can be replaced by "module", "record", "structure", "function", "entity", or many other words, and the result is still an essentially valid definition. The point is that the essential characteristics are "dynamic dispatch", "first-class", and "collection of operations". Exactly how the object is concretely represented is not important.

Behavioral abstractions are effective for modeling a wide range of concepts, including real-world objects, processes, algorithms and also data. But it has not been proven that objects are the only way, or are even an inherently better way, to model real-world objects. Modeling concepts (including data) by their observable behavior contrasts sharply with other approaches based on data structures, relations, or algebraic data types. One might argue from the other direction, that any program that uses behavioral abstraction to model data is object oriented.

Taken to the extreme, we might say that the only thing that can be known about an object is its behavior. This idea is implemented in Microsoft COM, for example. However, it is often expedient or useful to allow more kinds of interactions between clients and the objects they manipulate. The definition requires that objects have behavior, but it does not prohibit other features from being included. However, objects are required to present a behavioral interface to clients, not a purely structural one. This means that some classes in Java do not create objects, but instead create structures. An object that contains operations and public fields is a hybrid. I think this fits well with accepted usage of terminology.

First-class values and declarations

There is a subtle distinction between use of a value being first class and a declaration form being first class. The definition of objects invokes the usage criterion: it says that objects can be used anywhere that values can be used. There is general consensus that first-class functions require both the usage and the declaration criteria: functions are first class only if they can be used anywhere that other values can be used, and also that functions can be declared anywhere in the program. If functions only had to meet the usage criterion, then C would have first-class functions, because it supports function pointers. To be truly first class, a declaration form must be allowed anywhere and also have access to all the lexical context of enclosing scopes. For example, class declarations are not truly first class in C++: a class can be defined inside a function, but the class does not have access to the lexically enclosing function arguments.

We do not require objects to satisfy the first-class declaration criterion, because that would require objects/classes to be definable anywhere in a program, which is clearly not the case in some object-oriented languages, including Eiffel. Some object-oriented languages, including Beta, Grace, Scala, JavaScript and Self, do have first-class object/class declarations. Java has satisfied the declaration criterion since the introduction of anonymous and inner classes. Smalltalk also satisfies the declaration criterion, because classes can be created dynamically anywhere. Nested class definitions in C++ are second class: they can be defined anywhere, but do not have access to the enclosing lexical scopes.

Being first-class means that objects have a recognizable identity. This is necessary for them to be passed to operations or returned as results of operations. However, it does not imply that the system must allow identities of two objects to be compared for equality.

Given the definition of "objects as first-class behaviors", it should be clear that objects can be used to encode first-class functions, because an object can contain a single function. Scala and Java 8 provide syntactic support to make this encoding easy to use. There is a nice parallel between "first-class functions" and "first-class behaviors".
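The encoding of functions as single-operation objects can be sketched as follows. This is only an illustration in Python; `AddN`, `Compose`, `twice`, and the operation name `apply` are hypothetical names, not taken from any particular library.

```python
class AddN:
    """An object with a single operation, standing in for the function x -> x + n."""

    def __init__(self, n):
        self._n = n

    def apply(self, x):
        return x + self._n

class Compose:
    """Function composition, itself expressed as a single-operation object."""

    def __init__(self, f, g):
        self._f, self._g = f, g

    def apply(self, x):
        return self._f.apply(self._g.apply(x))

def twice(f, x):
    # `f` may be any object that provides `apply`; the client cannot tell
    # whether it is an AddN, a Compose, or something defined later.
    return f.apply(f.apply(x))
```

For instance, `twice(AddN(3), 10)` applies the "function" twice, just as a higher-order function would with a lambda.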

It is very important to understand that objects can also represent what is normally understood as "data". They do so by representing "data" as "behavior". That is, the data is defined by what it can do, not by any idea of structure that can be inspected. This is why many object-oriented languages do not have any concept of "data structures", for example structs/records and unions in C, Pascal, ML or Haskell. Given dynamic dispatch, object-oriented languages have little need for "pattern matching", which tends to expose internal representations anyway. It is possible to define views on objects that support pattern matching without exposing the internal representation, e.g. "unapply" in Scala.
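One way to picture "data as behavior" is a set of integers that is known only through a membership test. The sketch below is in Python, and the names `Empty`, `Insert`, `Evens`, `Union`, and `contains` are assumptions made for this example. No client can inspect how a set is stored; it can only ask questions of it.

```python
class Empty:
    def contains(self, x):
        return False

class Insert:
    """The set `rest` with one more element added."""

    def __init__(self, rest, item):
        self._rest, self._item = rest, item

    def contains(self, x):
        return x == self._item or self._rest.contains(x)

class Evens:
    """An infinite set: there is no finite structure to inspect."""

    def contains(self, x):
        return x % 2 == 0

class Union:
    def __init__(self, a, b):
        self._a, self._b = a, b

    def contains(self, x):
        # Works for any two sets, whatever their hidden representations.
        return self._a.contains(x) or self._b.contains(x)
```

Because `Union` relies only on `contains`, it can combine a finite set with an infinite one, which a representation-based definition would find awkward.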

Dynamic Dispatch

Dynamic dispatch of operations is the essential characteristic of objects. It means that the operation to be invoked is a dynamic property of the object itself. Operations cannot be identified statically, and there is no way in general to know exactly what operation will be executed in response to a given request, except by running it. This is exactly the same as with first-class functions, which are always dynamically dispatched.

Dynamic dispatch is sometimes called "message passing", "late binding", "dynamic binding", or "polymorphism". Programming language researchers have identified several kinds of polymorphism, including subtype polymorphism, ad-hoc polymorphism and parametric polymorphism. Polymorphism is often associated with type systems, but in object-oriented programming it is a dynamic property. The word "polymorphism" comes from Greek and means roughly "having many forms". It does not mean that a particular object has many forms, but rather that a client can interact with many different forms of objects without having to know exactly which kind is being used. We call this object polymorphism.

Object-oriented languages introduced two forms of extensibility: dynamic dispatch and inheritance. They are often tied together, but they are in fact separate concepts.

Dynamic dispatch allows new kinds of objects to be defined that implement the same behavioral interfaces as existing objects. The new objects can then be mixed together with the existing objects, without rewriting existing code. Inheritance is a mechanism for incremental modification of self-referential structures. Dynamic dispatch is essential to objects, while inheritance is useful but not absolutely essential.
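The extensibility provided by dynamic dispatch, with no inheritance involved, can be sketched like this in Python. `Item`, `Bundle`, and `total_cost` are hypothetical names used only for illustration.

```python
class Item:
    def __init__(self, price):
        self._price = price

    def cost(self):
        return self._price

def total_cost(items):
    # Existing client code: written once and not edited afterwards.
    return sum(item.cost() for item in items)

# Added later: a new kind of object that implements the same behavioral
# interface (`cost`) without extending Item.
class Bundle:
    def __init__(self, items, discount):
        self._items, self._discount = items, discount

    def cost(self):
        return total_cost(self._items) * (1 - self._discount)
```

A `Bundle` can be mixed into the same list as plain `Item` objects, and `total_cost` handles both without being rewritten.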

Discussion

The definitions are certainly simple and, I hope, clearly stated. Remember that the goal is to capture the essential characteristics at an appropriate level of abstraction, using familiar terminology. That is, the definitions specify what is required, not what is useful or familiar. I believe that the definition of "object", while presented using different terminology, corresponds very closely to the broad understanding of the concept, and to previous descriptions in the literature. Some examples from the literature are discussed at the very end of this note.

The proposed definition of "object oriented" does not match some previous definitions, because it leaves out topics that are often considered essential to objects, especially mutable state, classes, inheritance, and identity. I argue below that these related ideas, while widely used and very useful, are not absolutely essential.

The proposed definition does bring the term "object-oriented" into closer alignment with current usage. For example, under many previous definitions, JavaScript is not an object-oriented language, because it does not have classes and inheritance. The definition on Wikipedia is fairly close to the proposal given here.

Related Concepts

In seeking a fundamental, primitive concept of "object", there are several ways to identify non-essential features. One is that if a feature already has a good name and is orthogonal to the concept of "object", then there is no reason to include the feature in the definition of "object". Thus, for example, we might have "mutable objects" and "non-mutable objects" if the concept of mutation is orthogonal. Another technique is to ask whether a significant and useful object-oriented program can be written without the feature; if so, that feature is probably not essential. We can also examine the variety of languages that have been defined to identify their common features. I propose that the following features are not essential parts of the definition of an object.

It is interesting that much of the criticism of object-oriented programming focuses on these optional features.

Mutable State

Mutable state is not essential to the definition of objects. It is useful and common, but not essential. It is clearly possible to define objects that do not allow mutation. One kind of immutable object is the Value Object, which is quite common and useful. Obviously mutable state can exist without objects. But the key point is that, at the level of language semantics, combining first-class behaviors and mutable state doesn't introduce any complex interactions. They just work together fine. There are some interactions between mutable state and inheritance. At the implementation level there can also be complex issues. But conceptually, objects and mutability don't have any fundamental interference between each other. Thus they are orthogonal.
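A value object of this kind might look like the following Python sketch. `Money`, `plus`, and `cents` are hypothetical names, and blocking assignment through `__setattr__` is just one of several ways to discourage mutation in Python.

```python
class Money:
    """An immutable value object: operations return new objects."""

    __slots__ = ("_cents",)

    def __init__(self, cents):
        # Bypass our own __setattr__ once, during construction only.
        object.__setattr__(self, "_cents", cents)

    def __setattr__(self, name, value):
        raise AttributeError("Money is immutable")

    def cents(self):
        return self._cents

    def plus(self, other):
        # No state changes; the result is a fresh object.
        return Money(self._cents + other.cents())
```

The object still has behavior, hidden details, and dynamic dispatch; only mutation is missing.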

Excluding mutable state as a required part of the definition of "object" is certainly radical. But our understanding of mutable state has increased considerably in the last 20 years. We have better ways to program without mutation and better ways to control it. Of the 23 patterns in "Design Patterns: Elements of Reusable Object-Oriented Software", only State, Memento, Observer, Decorator, and possibly Chain of Responsibility and Adapter, require mutability. Patterns involving representing data as behavior but without mutation are becoming more common.

Inheritance

Inheritance is neither essential to object-oriented programming nor useful only for it. You can define objects perfectly well without (implementation) inheritance in dynamic languages or static languages with interfaces. Unfortunately, in some languages (e.g. Simula and C++) implementation inheritance is the only way to achieve dynamic dispatch. In other languages (e.g. Java) it is less work to achieve dynamic dispatch using implementation inheritance than with interfaces. These problems arise only in statically typed languages that allow classes to be used as types. In C++, the convention is to use fully abstract classes as interfaces. Inheriting from such a class is really just a way to declare interface compatibility. To summarize, in Java "extends" is not essential but "implements" is. In other words, as long as the language supports dynamic dispatch, inheritance is not essential. On the other hand, Java has class inheritance but not interface inheritance. This is because the implicit self-reference in an interface is not modified when an interface extends another interface. Java allows extension of interfaces, but not true inheritance of interfaces.

Inheritance is not required for polymorphism/dynamic dispatch. For example, in Smalltalk two classes can implement the same methods, which allows their objects to be used interchangeably, even if neither class inherits the other. Go has dynamic dispatch but not inheritance.

On the other hand, inheritance can also be used for wrapping functions, deriving new data types, or extending ML-style modules. Functional programming papers that include a notion of data extensibility often use a form of "open recursion", which is the functional programmer's name for inheritance. The essence of inheritance is composition where the "self reference" of the inherited parts is modified to refer to the combined structure. As just one example, type and function inheritance is used extensively in Wouter Swierstra's paper "Data Types à la Carte".
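The idea that inheritance is composition with a modified self-reference can be sketched without using any class mechanism at all. In the Python sketch below, which is only one possible encoding, `base`, `override_name`, and `new` are hypothetical names; an "object" is a dictionary of operations, and operations reach each other only through an explicit `self` parameter.

```python
def base(self):
    # A generator of operations. `self` is a function returning the finished
    # object, so "greet" looks up "name" late, through the self-reference.
    return {
        "name": lambda: "base",
        "greet": lambda: "hello from " + self()["name"](),
    }

def override_name(parent):
    # "Inherit" from `parent`: reuse its operations and replace one of them.
    def gen(self):
        ops = dict(parent(self))  # parent receives the *combined* self
        ops["name"] = lambda: "derived"
        return ops
    return gen

def new(gen):
    # Tie the knot: the self-reference points at the combined structure.
    def self():
        return obj
    obj = gen(self)
    return obj
```

The inherited `greet` operation was written in `base`, yet in the derived object it reports "derived", because its self-reference now denotes the combined structure.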

Delegation is the dynamic analog of inheritance, which is usually defined statically.

Classes

There are successful and useful object-oriented languages that do not include a concept of "class". The most well known example is Self. To me, classes are best understood as factories for creating objects. As such, they can be implemented as ordinary functions with a nested object definition. Even in a statically typed language, there is no requirement that a class act as a type. The idea that classes are types has caused more confusion and poorly designed code than anything else I know, other than the use of "null" for references.
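Viewing a class as a factory can be sketched as an ordinary function containing a nested object definition. This Python example is illustrative only, and `make_point`, `get_x`, `get_y`, and `moved` are hypothetical names.

```python
def make_point(x, y):
    # The "class" is just a function. Its arguments are hidden details,
    # reachable only through the operations defined below.
    class Point:
        def get_x(self):
            return x

        def get_y(self):
            return y

        def moved(self, dx, dy):
            # Creating another object is just another call to the factory.
            return make_point(x + dx, y + dy)

    return Point()
```

Clients depend on the operations a point provides, not on which factory produced it, so a second factory with a different representation could be used interchangeably.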

Identity

Identity is often listed as an essential property of objects. As mentioned above, objects have identity in the sense that they exist. This basic idea of identity allows objects to be referenced by clients, and also to refer to themselves.

However, identity can also mean the ability to determine if two references denote the same object. By "the same object" I mean that both references refer to the result of a single object creation event. In many languages, identity corresponds to the ability to compare object references for pointer equality. It is possible to have mutable state without supporting an operation to compare two references for equality. While it is possible to support identity on immutable objects, I'm not sure if that is useful. In either case, this demonstrates that identity and mutability are orthogonal concepts.

There are good reasons why objects should not be required to have identity. Objects should be able to impersonate other objects. This is necessary for the Wrapper pattern to work properly. Identity also conflicts with the principle that objects should have absolute control over what clients know about them. If identity is needed, it is easy to define and implement an interface that allows identity checking, although this does not prohibit impostors from faking an identity. It can certainly be useful to have non-forgeable identities, and this feature can be added to objects without unwanted interactions. For example, Mark Miller argues that true identity is required to solve the Grant Matcher Puzzle. One reason for including identity comparison is to be able to write low-level libraries that need some form of identity, for example serialization of a graph of objects. However, I believe that all such cases can be defined by adding a public operation to the object itself, rather than building identity into the system.
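An identity check defined as a public operation, together with a wrapper that impersonates the object it wraps, might be sketched as follows in Python. `Account`, `LoggingAccount`, `same_as`, and `identity_key` are hypothetical names. As noted above, nothing here stops an impostor from faking an identity.

```python
class Account:
    def __init__(self, account_id, balance):
        self._id, self._balance = account_id, balance

    def balance(self):
        return self._balance

    def identity_key(self):
        return self._id

    def same_as(self, other):
        # Identity is an ordinary operation that the object controls.
        return self._id == other.identity_key()

class LoggingAccount:
    """A wrapper that impersonates the account it wraps."""

    def __init__(self, inner, log):
        self._inner, self._log = inner, log

    def balance(self):
        self._log.append("balance")
        return self._inner.balance()

    def identity_key(self):
        return self._inner.identity_key()

    def same_as(self, other):
        return self._inner.same_as(other)
```

A built-in pointer comparison (`is` in Python) would distinguish the wrapper from the wrapped account, which is exactly the behavior the wrapper is trying to avoid.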

Multi-methods

Multi-methods do not fit very well into the definition of objects given above. A multi-method call is not a request on a particular object to perform an operation. It is a request to the system to select an appropriate method to operate on a combination of arguments. However, multi-methods include the kind of object-oriented dynamic dispatch described above as a special case, where the dispatch is performed on just one argument. The question of whether multi-methods are better than other approaches is still a matter of debate. Proponents argue that multi-methods give significant additional expressive power, while others argue that multi-methods reduce the modularity of solutions. The bottom line is that multi-methods support object-oriented programming, but multi-methods in their full generality are different from objects as defined here.

Purity

In a pure object system, everything is an object. This is an idea that is difficult to achieve, but Smalltalk and Self come close.

Other features

Other features, including reflection, static typing, interfaces, first-class classes, concurrent objects, synchronization, etc. are included in some OO languages but not others. This means they are not essential to the definition of "object", even if they are very useful.

Implications for Languages

The proposed definitions of object and object-oriented programming are more liberal than previous definitions. In particular, they allow more languages to be called "object-oriented" than previous, more restrictive definitions. The definitions do not preclude languages from supporting other features.

All familiar object-oriented languages are included in the definition, including Simula, Smalltalk, C++, Eiffel, Beta, Objective-C, Visual Basic, Self, Java, C#, Perl, Matlab, PHP, Python, Scala, OCaml, and Ruby. I'm not sure if Lua is object oriented. Of the top 20 languages in the TIOBE Programming Community, 14 are object oriented.

One important point is that using an object-oriented language does not automatically imply that one is doing object-oriented programming. The definition of "object oriented" states that the language must support the creation and use of objects, but it does not require that they be used. Most languages include a wide range of features, and even object-oriented features can be used for other purposes besides objects.

For example, classes in C++ and Java do not necessarily create objects. In other words, it is also possible to not do object-oriented programming in Java. If you work at it, you can. Just as there were programmers who wrote Lisp as if it were FORTRAN, there are Java programmers who write as if Java were C or ML. They use classes as if they were structs, define lots of static methods, and use "instanceof" for representation-based pattern matching. There is even an undergraduate PL textbook that uses Java this way. As I have said before, it is possible to do object-oriented programming in Java. Since Java is so widely known, there is a tendency to assume that object-oriented programming is defined to be "anything you can do in Java". But just because you are using Java doesn't mean your program is object-oriented.
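The contrast between the two styles can be sketched in Python, using `isinstance` as a stand-in for Java's "instanceof". All names below (`Lit`, `Add`, `evaluate`, `LitObj`, `AddObj`, `eval`) are hypothetical and chosen only for illustration.

```python
# Style 1: classes used as structs. Fields are public and a free-standing
# function selects behavior by inspecting the representation.
class Lit:
    def __init__(self, value):
        self.value = value

class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right

def evaluate(e):
    if isinstance(e, Lit):
        return e.value
    if isinstance(e, Add):
        return evaluate(e.left) + evaluate(e.right)
    raise TypeError("unknown expression")

# Style 2: objects. Each expression is known only by its `eval` operation.
class LitObj:
    def __init__(self, value):
        self._value = value

    def eval(self):
        return self._value

class AddObj:
    def __init__(self, left, right):
        self._left, self._right = left, right

    def eval(self):
        return self._left.eval() + self._right.eval()
```

Both styles use the `class` keyword, but only the second one creates objects in the sense defined in this note. Adding a new kind of expression means editing `evaluate` in the first style and adding a new class in the second.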

ML, Modula-2 and the original version of Ada support modules, but are not object oriented because the modules are not first-class. Modula-3 is object oriented, and Ada 95 added support for objects to Ada.

JavaScript was not object oriented by most previous definitions, because it lacks classes and inheritance. It is object-oriented under the proposed definition, because it supports both dynamic dispatch and delegation for extensibility.

Erlang is an interesting case. It is object oriented, (pure) functional, and imperative all at the same time. Sequential Erlang is pure functional: it does not support mutable data. Concurrent Erlang is imperative because it has aliasing of mutable processes. This mutable state does not exist within a process, but rather exists in the system at the level of references to a process. In other words, the behavior of a process reference can change over time. The transitions between states of a process are pure functional. Finally, by sending structured messages that specify an operation request, an Erlang process can act as an object. The process is a behavior that uses a dispatch function to invoke different operations within the object. One might argue that this is an encoding of objects, not true support, but the practice seems to be reasonably common in real Erlang programs. Erlang can also implement objects using parameterized modules. There is also a convention for describing interfaces, called behaviors in Erlang. Erlang can be understood as an object-oriented language that is locally functional and globally imperative.

As mentioned above, multi-methods support objects as a special case, by defining multi-methods that dispatch on only one argument. CLOS, Dylan, and Cecil are object oriented. However, it is also possible to program multi-method abstractions that are not objects.

All languages in the Lisp family, including Scheme and Racket, are object-oriented because it is possible to define a few simple macros that allow objects to be created and used as if they were built-in features of the languages. These macros count as "support" even though they are in some sense an encoding. Usually objects are defined as closures that dispatch to the appropriate operation based on an operation name in the argument list. The result is very similar to the way objects are implemented in Erlang. Such macros exist for all Lisp-based languages, and in many cases the macros are included as part of the base language.
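The closure-based technique can be sketched outside Lisp as well. The Python version below is only an approximation of the idea, without the macros that would make it pleasant to use; `make_stack` and the operation names "push", "top", and "size" are hypothetical.

```python
def make_stack(items=()):
    items = tuple(items)  # hidden in the closure; clients cannot reach it

    def dispatch(op, *args):
        # The object *is* this function: it selects an operation based on
        # the operation name passed in the argument list.
        if op == "push":
            return make_stack(items + (args[0],))
        if op == "top":
            return items[-1]
        if op == "size":
            return len(items)
        raise ValueError(f"unknown operation: {op}")

    return dispatch
```

A request such as `stack("push", 1)` plays the role of a method call, and any other closure that answers the same operation names can be substituted for this one.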

Haskell is not object-oriented. It includes a feature called type classes that involves creating "classes" and "instances" of those classes. However, type classes, as normally used, are statically bound. Hence they do not meet the requirement that operations are late-bound, or polymorphic, in objects. One might also argue that the instances are not first-class. There are many ways to encode objects in Haskell, including some based on combining type classes and existentials.



There is a recent trend in functional programming to adopt behavioral representations for data, especially when tackling the Expression Problem. One great example is Carette, Kiselyov and Shan's "Tagless, Finally" representation, which is related to the work on "Polymorphic Embedding of DSLs" in Scala. The influences between OO and FP go in both directions.

Go is object-oriented because it supports dynamic dispatch, even though it does not have inheritance. Protocols in Clojure provide support for object-oriented programming. [if you can provide more details on these, please write to me].

As I have suggested before, the untyped lambda calculus is also object oriented, because the only way to implement "data" is with behavior. If this is true, then object-oriented programming and functional programming were born at the same moment. The story of their fall from grace and potential future reconciliation is a long and fascinating one.
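The standard Church encoding of booleans shows what "data implemented with behavior" means in the lambda calculus. Here it is transcribed into Python lambdas as a sketch; `to_bool` is a hypothetical helper added only to observe the results.

```python
# A boolean is nothing but its behavior: given two alternatives, pick one.
TRUE = lambda a: lambda b: a
FALSE = lambda a: lambda b: b

# Operations are defined by invoking that behavior, never by inspecting
# a representation.
NOT = lambda p: p(FALSE)(TRUE)
AND = lambda p: lambda q: p(q)(FALSE)
OR = lambda p: lambda q: p(TRUE)(q)

def to_bool(p):
    # Observe a Church boolean by asking it to choose between two values.
    return p(True)(False)
```

There is no way to ask a Church boolean "which one are you?" other than to invoke it, which is the same restriction that objects place on their clients.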

Discussion

David Barbour argues that any definition of object must include state. He tweeted "Specific procedures are pure. Procedures include functions. Does this mean effects aren't the essential nature of procedures?" I responded "It means that Procedure = Function + Mutable State. We decompose ideas into their primitive, orthogonal components." Exactly how we decompose concepts, and what names we choose to use for these concepts, is not predefined. He responded "Similarly, I would suggest that 'Object' is not a primitive, orthogonal concept. Try: Object = Existential + Mutable State." But this is incorrect, because existentials can be used for other things, especially ADTs. Pierce's book presents two standard encodings for objects, but only one of them uses existentials. I believe that existential encodings of objects are provably non-essential, because the existentials can always be removed by simple transformations. In addition, Abadi and Cardelli's object calculus does not require existentials.

He also questions whether "object" should be considered a fundamental, orthogonal concept. It could be explained as a form of codata, for example. Or we could simply use "first-class module" instead. Some people clearly have an agenda to preclude using "object" as a name for a fundamental concept. Other people, including myself, prefer to use the term "object" in a fundamental way. My guess is that the winner gets to pick the terminology, just as the winner gets to write the history.

Thanks to all those who provided comments and suggestions for previous versions, and especially the members of IFIP WG 2.16 on Language Design for their input.



There is no way to prove that terminology is correct. Deciding what words to use for what concepts is a social problem, not a mathematical one. However, I have talked with hundreds of people about these issues over the years, and I believe that my proposal is reasonable given the current usage of terminology in the community.

Historical Notes

Kristen Nygaard, one of the designers of Simula, wrote that object-oriented "program execution is regarded as a physical model, simulating the behavior of either a real or imaginary part of the world." As stated, this is a way of regarding or viewing an object-oriented program. It is generally understood that Nygaard intended it as a definition, that the execution of an object is a physical model of a part of the world. While this is perfectly fine as a statement of the purpose or goal of using objects, I do not think it is suitable as a definition of object-oriented programming. The problem is that it provides very little guidance about what an object actually is. In other words, it's not implementable just based on the description. What I take as the key point is the focus on behavior, which is central to the definition of objects proposed here.

In July 2003 Alan Kay wrote "OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I'm not aware of them." The proposed definition shares a focus on polymorphic invocation of behavior, which Kay called "messaging". He also mentions state, which we should assume to be mutable. But the focus of that part of the discussion is on "hiding", which is also important in the proposed definition. The final requirement, for "extreme late-binding of all things", is less easy to interpret. It is this requirement that leads Kay to reject all other programming languages that were widely known at the time. I'm curious if he would allow Ruby or Python to be object oriented.

In another email he wrote "The big idea is 'messaging' - that is what the kernal of Smalltalk/Squeak is all about... The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be."

Peter Wegner's "Concepts and Paradigms of Object-Oriented Programming", an expansion of his OOPSLA 89 keynote, gives what I think is a widely accepted definition:

"Objects are collections of operations that share a state. The operations determine the messages (calls) to which the object can respond, while the shared state is hidden from the outside world and is accessible only to the object's operations (see Figure 1). Variables representing the internal state of an object are called instance variables and its operations are called methods. Its collection of methods determines its interface and its behavior."

Note that his first paragraph does not explicitly require the state to be mutable. His first example does include mutable state, but it is not highlighted strongly in the definition.

He goes on to say "The object's behavior is entirely determined by its responses to acceptable messages and is independent of the data representation of its instance variables. Moreover, the object's knowledge of its callers is entirely determined by its messages. Object-oriented message passing facilitates two-way abstraction: senders have an abstract view of receivers and receivers have an abstract view of senders."

Peter Wegner proposed a widely used classification system: "object-based languages: the class of all languages that support objects; class-based languages: the subclass that requires all objects to belong to a class; object-oriented languages: the subclass that requires classes to support inheritance."

The proposal given here simplifies this and removes the requirements to support classes and inheritance.

Alan Snyder: The Essence of Objects: Concepts and Terms [IEEE Software 10(1): 31-42 (1993)] says "An object embodies an abstraction characterized by services." He goes on to explain each of these concepts in detail and mentions that the objects in question can have mutable state. But the essential concept is the idea of clients requesting behavior of an object via its public services.

There are many more important early papers that define "object" in similar ways. Mutable state is always included in the definition, but it is not the primary characteristic.


References

Revision History

Revised July 13, 12:34pm CDT: Fixed treatment of "functional" and discussion of language implications.
Revised July 13, 6:34pm CDT: Revised discussion of Common Lisp.
Revised July 14, 12:22pm CDT: Added Alan Kay quote. Modified module section slightly.
Revised July 14, 3:33pm CDT: Revised multi-method section.
Revised July 19, 10:46am CDT: Revised extensively to use "behavior" instead of "module".

62 comments:

Daniel Moisset said...

I like how your definition took out a lot of non-essentials, however I think there is an aspect of OO which was left out in the definition: I would argue that there has to be a mechanism to create new modules (i.e. instances) to be OO.

If you create a language without this property (for example, take Python without the class construct, and use imported modules polymorphically, which fits your definition) you lose most of the patterns of the OO style, and I think most programs you create would look very non-OO.

Will Cook said...

This is part of the definition: "A language or system is object-oriented if it supports the creation and use of objects." Is there a problem?

Sandro Magi said...

Is a module with only public functions an object?

Will Cook said...

Yes, if it is first class. There is no requirement for objects to have fields or state. And clients can't tell if the info is stored in a field or a method body if it doesn't change.
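To illustrate that last point, here is a small sketch in Python. The class names and the `shout` client are hypothetical. One object keeps its answer in a field, the other has no fields at all, and the client cannot tell them apart through the operation:

```python
class StoredGreeting:
    """Keeps the answer in a field, exposed only through an operation."""
    def __init__(self):
        self._text = "hello"

    def greeting(self):
        return self._text


class ComputedGreeting:
    """Has no fields at all; the answer lives in the method body."""
    def greeting(self):
        return "hello"


def shout(obj):
    # The client sees only the operation, not where the value comes from.
    return obj.greeting().upper()

print(shout(StoredGreeting()), shout(ComputedGreeting()))  # HELLO HELLO
```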

Sandro Magi said...

Isn't your definition then inconsistent? You say that a module with only public fields is not an object, but a module with only public fields of type function is an object. You seem to be distinguishing functions from other types of values in a way that isn't explicitly stated. You hint at this distinction here and there via mentioning "operations" and "behaviour", but it's not really part of your core definition.

I agree with your characterization of objects as first-class modules, I'm just trying to be precise.

Anonymous said...

This is my reductionist definition of object orientation:
A struct type, and a collection of functions that take a pointer to a struct as a parameter (usually but not necessarily passed with different syntax than normal parameters)
The pointer-to-struct is always a first class object, and in more enlightened languages the collection of functions is also.

Matt said...

> "Unfortunately, in some languages (e.g. Simula and C++) implementation inheritance is the only way to achieve polymorphism. "
Not really, in C++ inheritance is only needed for inclusion polymorphism (dynamic / run-time polymorphism). However, C++ supports all four forms of polymorphism described by Cardelli and Wegner [CW85], see: Polymorphism In Object-Oriented Languages

[CW85] Luca Cardelli and Peter Wegner: On Understanding Types, Data Abstraction and Polymorphism. Computing Surveys, vol. 17, no. 4, December 1985, pp. 471-522

Will Cook said...

You are right that not all the technical terms are explicitly defined. I'm depending on everyone to have a typical understanding of "field" and "operation". A "field" is normally a mutable variable that is part of a record or class definition. An operation is a procedure or function. Your previous question did not mention fields, by the way.

You are right that it is possible for a field to contain an operation, but this is not the typical case. I'm not sure it helps my goal to consider this kind of strange special case. The reason it is strange is that normally we do not want clients to be able to change the operations of an object. But if you want to analyze it, I would say that any module with operations in it (even if they are stored in fields) is an object. Given that, I would have to amend my rule to "a module with only public (non-operation) fields is not an object."

Will Cook said...

Hi Matt. I thought it would be clear that I was talking only about the kind of polymorphism that is the focus of the paper, or inclusion polymorphism. I'll see if I can fix it.

Matt said...

Well, why such a narrow focus, then? :-) Other forms are very useful and relevant, also in the context of object orientation!

BTW, your definition proposal is perhaps close to the one by ChrisDate mentioned here:
DefinitionsForOo

Will Cook said...

Matt, I only discuss inclusion polymorphism because that is the only kind of polymorphism that is required for objects. Parametric polymorphism (generics) is very useful, but nobody has said that it is required as part of the definition of object. It is an orthogonal feature that is out of scope for my topic.

And no, I can't see any connection between my definition and Chris Date's.

Bruno Oliveira said...

William,

One comment:

Instead of saying that "a module that only contains public fields is not an object"

you could maybe say

"a module that does not contain public operations is not an object"

This rules out the "module with only public fields" without fiddling with the notion of fields or whether they can be functions or not. Then you could illustrate with your example.

A couple of typos/typesetting issues:

"Module 3" --> "Modula 3"

Can you use a space before the main titles?

Randy A MacDonald said...

In APL, you define 'new' and 'to' and it all seems to be there:

anObject←aParameter new'class'
aResult←someParameters 'aMethodName' to anObject

⍝⍝ and 'anObject' doesn't actually need to be defined...

aResult←someParameters 'aMethodName' to aParameter new'class'

Will Cook said...

Thanks Bruno, I've used your changes. But I'm still fighting WordPress to get it to add more space.

Randy, I can believe APL can define objects, but you are going to have to explain things a lot more. I don't understand your code at all, because I don't know APL.

Jeremy Siek said...

To get a feel for your definition William, would a record calculus count as object oriented? As in the language in Fig 15-3 of Types and Programming Languages? Looks to me as if it does. It supports collections of operations (functions as members of records), it can hide members via width subtyping, and it supports subtype polymorphism. Did you intend to include such a language as object oriented? It's missing direct support for some things one might think of as object-oriented, such as support for one operation accessing the other operations or fields, which is usually available through 'this'.

Anonymous said...

It's not clear to me why your definition rules out Haskell. You say it does, but it seems to me that Haskell meets your definition via typeclasses, which allow a form of inheritance, support information hiding, are polymorphic, are first class, etc. I would agree that Haskell isn't an OO language, but I don't see how your definition rules that out.

Will Cook said...

Haskell type classes are statically bound. They are not first class either, because the type class itself is not a first-class value. They also do not support inheritance (although they do support a different form of extension).

I've been struggling with how to state the requirement that objects are late bound or invoked via dynamic dispatch. Currently I'm using the word "polymorphism" for this. But I'm looking for a way to simplify and make it more clear.
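A minimal Python sketch of what late binding means here; the shape classes are hypothetical and are not taken from the article. The two classes share no base class and declare no types, yet the client works with both because the choice of `area` is made by each object at call time:

```python
import math

class Circle:
    def __init__(self, radius):
        self._radius = radius

    def area(self):
        return math.pi * self._radius ** 2


class Square:
    def __init__(self, side):
        self._side = side

    def area(self):
        return self._side ** 2


def total_area(shapes):
    # Which `area` runs is decided at call time by each object,
    # not by the client and not by a static type.
    return sum(shape.area() for shape in shapes)

print(total_area([Square(2), Circle(1)]))
```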

Will Cook said...

Hi Jeremy, I don't have TaPL with me right now. I claim that the use of records containing functions is a perfectly reasonable form of object-oriented programming. Can't the record be defined in a lexical scope that allows the functions to share information?

Will Cook said...

Frank Atanassow blogged that identity is not essential for OO, but several people disagreed. I'm curious what they will say given my attempt to cut out even more things that are often considered essential.

Spider said...

Cool. So, basically, an object is some kind of artefact with an interface and possibly multiple implementations, which is tangible enough to use as a value in any context. I don't immediately see any problems with this definition, so let's try to use it from now on as "objects in a broad sense" (just as "grammars in a broad sense" were defined in "Toward an Engineering Discipline of Grammarware"). Any chance you can package this as a PDF and hurl it to arxiv?

Minor: "there were programming" => "... programmers"; "neither ... or" => "neither ... nor"

Blackheart said...

To recap, I wrote this on G+:

--
1. One of the things in my mind which distinguishes modules from objects is the form of operations. Objects with state S have operations of type S -> ... Modules' operations can have any form.

2. Apparently you want your definition to also cover objects in untyped languages. Since there is only one type in those languages, the "polymorphic" requirement is trivial in that case. Maybe that is your intention.

3. "A language or system is object-oriented if it supports the creation and use of objects." As far as I'm concerned, if you can encode it, the language supports it, for example because it's easy to add new constructs to hide an encoding. But I don't consider C or Haskell OO languages either, so your definition doesn't work for me. (In contrast there is a hard distinction between languages that do and don't "support" higher-order functions.)
--

and William replied:

--
1. There are two ways to interpret your comment. If you are thinking of the coalgebraic signature, then you have lifted the discussion to a theoretical level way above the familiar notion of "module". In other words, we would have to have a semantics of modules to make sense of the comparison. If you are talking about self-reference, then the "extra argument" is not essential. The extra argument arises due to a particular handling of fixed-points. In particular, using the self-application encoding:
fact(n) = if n==0 then 1 else n*fact(n-1)
fixedpoint version:
fact = Fix(\f. \n. if n==0 then 1 else n*f(n-1) )
self-application version:
fact(f, n) = if n==0 then 1 else n*f(f, n-1)
The last explanation is not fundamental, although it appears in some object encodings and the implementation of C++.

2. I'm talking about OO polymorphism, sometimes called inclusion polymorphism or dynamic dispatch, which does not depend upon static types. So no, your interpretation is not what I meant.

3. Your comment is self-contradictory. You can encode first-class functions in any language, including C, by closure conversion. The ability to encode does depend upon the power of the underlying language. I would say Scheme is OO because its macros enable a clean encoding of objects. But C macros are not powerful enough for a clean encoding.
--
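The three factorial definitions in the reply above can be written out in runnable form. This is a sketch in Python; the names `fix`, `fact_fix` and `fact_self` are illustrative, and `fix` is eta-expanded because Python evaluates arguments eagerly:

```python
def fact(n):
    # Direct recursion: the function refers to itself by name.
    return 1 if n == 0 else n * fact(n - 1)

def fix(f):
    # Fixed-point combinator; the inner lambda delays unfolding until a call.
    return lambda n: f(fix(f))(n)

# Fixed-point version: recursion is supplied from outside as `f`.
fact_fix = fix(lambda f: lambda n: 1 if n == 0 else n * f(n - 1))

def fact_self(f, n):
    # Self-application version: the function receives itself as an extra argument.
    return 1 if n == 0 else n * f(f, n - 1)

print(fact(5), fact_fix(5), fact_self(fact_self, 5))  # 120 120 120
```

Only the last form carries the extra argument, which matches the claim that the argument comes from one particular way of handling fixed points.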

My response is below because of the character limit.

Blackheart said...

1. I was thinking of the former, coalgebraic signature aspect. You are right that it changes the question from "what is an object?" to "what is a module?", but that is part of my point. Of course, it is perfectly legitimate to define one thing as a special case of another, but that works best when the more general thing has a well-defined, unambiguous definition.

I can think of several distinct notions of modules, and I don't think there is any broad agreement that one of them is the "correct" notion. For some people a module is just something to control names and name conflicts. For others, it also has to support type abstraction. The particular form of type abstraction is also debated. Some people think modules should also be parametrizable. Consequently, to me the notion of a module is a bit nebulous: it covers many incompatible formalizations and requirements. (The notion of first-class module is even more varied.)

My problem is that the notion of object is also rather nebulous, but less so than that of modules, so I would rather see a stronger definition.

The other half of my complaint is simply the form of the operations. I did not want to use the word "coalgebraic" here because I know we don't see eye to eye on that, but I am only speaking about the type signature. Everything I recognize as an object has operations where the object (or its state) is an explicit or implicit argument; but module operations don't require that. At one extreme, a module could have exclusively algebraic operations as well. If such a module were first-class, would you call it an "object"?

You called Modula-3 object-oriented, and as I recall it doesn't say anything about the form of operations, so maybe you are OK with this, but to me it is too general.

2. Whatever sort of polymorphism it is, parametric or subtyping or what-have-you, it seems intrinsic to me that it deals with multiple types.

3. C is not even in the universe of things I would venture to have my definitions apply to. Do you want your definition to apply to assembly language as well? When I say "some languages have higher-order functions and some do not", I am admittedly restricting myself in the domain of things under consideration.

I guess you are looking for a definition of "object" which is informal and necessarily imprecise. That's fine, I guess, but given that proviso I'm not sure yours is an improvement precision-wise on Rumbaugh's identity + classification + inheritance + polymorphism one, though it is rather different.

(That's one reason I like the coalgebra model; one might disagree that it characterizes object-orientation, but it does make a formal distinction between languages in the same way that higher-order functions do.)

Will Cook said...

1. Actually, I am perfectly happy with the coalgebraic explanation of objects. Where did you get the idea that I am against it?

Most people seem pretty happy with my use of "module" because it has a fairly precise informal meaning, which is not affected by all the complex module-foo that goes on in the PL community.

The extra argument comes from either state or recursion, but I am avoiding both these topics, because they are orthogonal to the issues. As you have yourself suggested, we can talk about pure objects that return new objects (instead of mutating). In this case, is there an extra argument?
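A pure object of the kind mentioned here might look like the following minimal sketch. The `Point` class and its method names are illustrative assumptions, not taken from the discussion. Python happens to spell the receiver as an explicit `self` parameter; whether that counts as the "extra argument" is exactly the question being asked.

```python
class Point:
    """A pure object: no operation mutates it."""
    def __init__(self, x, y):
        self._x, self._y = x, y

    def moved(self, dx, dy):
        # Instead of updating fields in place, answer a new object.
        return Point(self._x + dx, self._y + dy)

    def coords(self):
        return (self._x, self._y)

origin = Point(0, 0)
shifted = origin.moved(3, 4)
print(origin.coords(), shifted.coords())  # (0, 0) (3, 4)
```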

2. I disagree that polymorphism requires a type system. The "types" are implicit, and the "form" in question is just the particular protocol or interface that the object implements. Think about Smalltalk.

I am not sure that Bob's dynamic typing = one type formula is helpful in understanding this. While it may be true in some theoretical sense, it somehow misses the point.

3. If I substitute "module" with "collection of operations", are you happy? I still think your idea of "encoding" is inconsistent. Would you say ACL2 supports higher order functions because it can encode them with closure conversion?
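For readers unfamiliar with closure conversion, here is a small sketch in Python. All names are hypothetical. The converted form uses no captured variables: a closure becomes a pair of a top-level code function and an explicit environment record, and every call site has to go through `apply_cc`:

```python
# Source program: a real closure that captures k.
def make_adder(k):
    return lambda n: n + k

# Closure-converted form: the code is a top-level function with no free variables.
def adder_code(env, n):
    return n + env["k"]

def make_adder_cc(k):
    # A "closure" is now just a code reference plus an environment record.
    return (adder_code, {"k": k})

def apply_cc(closure, arg):
    # Every call site must unpack the pair and pass the environment explicitly.
    code, env = closure
    return code(env, arg)

print(make_adder(3)(4), apply_cc(make_adder_cc(3), 4))  # 7 7
```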

Jeremy Siek said...

Regarding operations referring to each other, lexical scope alone doesn't quite do it. You also need some recursion. The recursion can be open or closed. Now you've said that open recursion isn't part of your definition. So another way to ask my question is whether you think recursion (never mind which kind) is an essential part of OO.
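The distinction between closed and open recursion can be sketched with records of functions. This is a minimal Python illustration; the names and the even/odd example are assumptions, not taken from TaPL:

```python
def make_closed():
    # Closed recursion: the operations refer to each other through
    # names fixed in the enclosing lexical scope.
    def is_even(n):
        return n == 0 or is_odd(n - 1)

    def is_odd(n):
        return n != 0 and is_even(n - 1)

    return {"is_even": is_even, "is_odd": is_odd}

def make_open():
    # Open recursion: each operation receives the record itself (`self`),
    # so a variant of the record could replace one of the operations.
    return {
        "is_even": lambda self, n: n == 0 or self["is_odd"](self, n - 1),
        "is_odd": lambda self, n: n != 0 and self["is_even"](self, n - 1),
    }

closed = make_closed()
opened = make_open()
print(closed["is_even"](10), opened["is_odd"](opened, 7))  # True True
```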

Blackheart said...

1. I think you said you found the coalgebraic model insufficient because it didn't handle binary methods soundly.

At the risk of sounding too arch about it, I suspect those people are happy with your use of "module" because they only think about it in an untyped context, where the denotation of a module is just a tuple and the only way to approximate an abstract type is to hide operations.

Typing reveals the real issues that modularization poses and all that complex module-fu is there to make it sound and safe.

2. We're not going to agree on this.

3. I'm not familiar with ACL2, and the pages I read didn't offer a concise summary.

The idea of encoding is pretty standard, for example Church numerals, or the encoding of datatypes in System F. Closure conversion doesn't count because it's not a local transformation.

If you allow that a language "supports" X just because you can use it as the target for compilation (global transformations), then the term becomes almost meaningless. At best it devolves to the proposition that support = syntactic sugar.

But for Turing-complete languages it's already meaningless because every computable type is encodeable by definition. What I am really talking about is the language minus recursion: whether its models are cartesian-closed or not.

Of course, you are right that, just because something is encodeable, that does not mean it is practicable. I would not want to do arithmetic in Haskell with Church numerals -- it would take ages. But the point is rather that if you can encode something, then it is a conservative extension and so you can add new primitives which make it efficient. (Since the extension is conservative, you can continue to reason about the primitives using your existing denotational model.)
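As a concrete reminder of what a Church-numeral encoding looks like, here is a sketch in Python rather than Haskell. The helper `to_int` is an illustrative decoder and not part of the encoding itself:

```python
# Church numerals: the number n is "apply f n times to x".
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
mul = lambda m: lambda n: lambda f: m(n(f))

def to_int(n):
    # Decode by counting applications of +1 starting from 0.
    return n(lambda k: k + 1)(0)

two = succ(succ(zero))
three = succ(two)
print(to_int(add(two)(three)), to_int(mul(two)(three)))  # 5 6
```

Each numeral does work proportional to its value when decoded, which is the impracticality the comment alludes to.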

Sukant Hajra said...

I'm glad you wrote this up because for some audiences, it's more accessible than your "On Understanding Data Abstraction, Revisited" essay.

I think what might help is to get some active feedback from seminal people involved in pulling the semantics of "object" in other directions.

For instance, on the theoretical side, I'd be curious if both Cardelli and Pierce would agree in full to this approach to defining objects.

On a less theoretic side, I think Alan Kay's discussion points are often used as substrate to couple object-orientation to the encapsulation of mutable state. You've quoted aspects of Kay's discussions that seem very compatible with your approach, but Kay has left so much more discussion that's been interpreted differently, and he's even been so bold to claim ownership of the phrase "object orientation," which seems unchallenged. From a social perspective this gives his thoughts on this matter some (perhaps illogical) weight, especially in a matter that is more or less a semantic debate.

It's probably impossible to bring together all parties. But I see no reason that your approach can not be built/refined into a majority consensus. But you're absolutely right that this is a social problem and not a technical one. As such, I think it would help if there were more consensus among those regarded with more authority on the matter.

I've become somewhat attached to this problem of defining "objects" recently. This is mostly because I work in an industry that I believe actively justifies horrible architectural decisions pursuing aspects of object-orientation you've explicitly defined as nonessential. The value of mutable state, subtype inheritance, and object identity is very debatable. Dynamic dispatch through Curry-Howard may disallow certain classifications of proofs about our programs, but this seems to be a convenience we should use intentionally. Why prove things we don't really care about? In this regard, I find your essential qualities of objects very useful. Most attacks on OO are really just on what you've called nonessential (tricky, proof-fracturing invariance violations).

Accepting this approach to defining objects as I see it augments utility and decreases controversy (for instance, the false dichotomy between OO and FP becomes even more clear). I think pursuing a more impactful consensus is worth the effort.

rdm said...

I would restate Randy MacDonald's point like this:

If object creation and message passing can be defined as words, the implementation might not matter.

Names can be strings (though the validity of this implementation choice might vary depending on language -- for example, some languages have complexities which are difficult to manage without type checking and some languages cannot take advantage of distinctions between strings in the processing they use for type checking).

You need a way of finding a module, given an object name, to implement object creation ('new').

You need a way of finding a method within a module, given a method name, to implement message passing ('to').

A remaining constraint is that the system of definition for objects (and probably classes?) must be easy to use. For APL that should be trivial to implement -- and it's probably easy to implement for many other languages also.
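The two lookups described above can be sketched in Python instead of APL. The `CLASSES` registry, the `Account` class, and the argument order of `new` and `to` are all assumptions made for illustration:

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance

# Hypothetical registry mapping class names (strings) to their definitions.
CLASSES = {"Account": Account}

def new(class_name, *args):
    # Object creation: find the definition by name, then instantiate it.
    return CLASSES[class_name](*args)

def to(obj, method_name, *args):
    # Message passing: find the method by name on the receiver, then call it.
    return getattr(obj, method_name)(*args)

account = new("Account", 100)
print(to(account, "deposit", 25))  # 125
```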

Will Cook said...

Jeremy, the short answer is that I don't think recursion is essential to the definition of "object". It might be essential to actually creating useful objects, but I'm trying to define the term, not characterize utility. Also, I have tried to remove any well-defined concepts that are orthogonal to "object".

Will Cook said...

Hi Frank (Blackheart),
1. I'm certain that "module" is not the perfect word to use, but it's the best one I've found.
2. I'm OK with that. But see http://en.wikipedia.org/wiki/Polymorphism_in_object-oriented_programming
3. Felleisen's "Macro Expressibility" is probably the right thing: http://dl.acm.org/citation.cfm?id=651590 . Are you actually unhappy with my use of the term "supports" or are you just picking nits?

Will Cook said...

Sukant, thanks for pointing out that attacks on OO typically go after the nonessential parts. I'll mention that in the paper.

Tracy Harms said...

While the code posted by Randy MacDonald and the discussion extended by RDM bring out how readily object-techniques can be used in APL, I think it does not fit Will Cook's criteria. In particular consider this: "First-class means that the object can be used in any context where a value is expected"

Almost everything is a "value" in APL. No value can be used in *any* context because syntactic contexts constrain what parts of speech can be valid, and thus which values might be used. I see this as precluding APL and similar languages by definition. As to whether that exclusion is desirable, I have no opinion.

Tracy Harms said...

In your discussion of functional programming you say why you see C as not in that category. This sentence was part of that:

"To be truly first-class, a declaration form must be allowed anywhere and also have access to all the lexical context of enclosing scopes."

Could you elaborate why this scoping power seems vital? Scoping model seems orthogonal to me.

Will Cook said...

Normally if an expression or declaration is inside a function, then the function arguments can be used. But with C++, a nested class cannot access the lexically enclosing variables:

int test(int x) {
    class A {
    public:
        int lookup() { return x; }  // error: a local class cannot use the enclosing function's variable x
    };
    return x + A().lookup();        // valid use of x
}

Thus class definitions are not truly first-class.

Warren Harris said...

Ocaml recently added first-class modules in addition to its object system, and I was wondering how these may or may not fit into your definition of objects.

Anonymous said...

> All languages in the Lisp family, including Scheme, Racket, and Common Lisp are object-oriented because it is possible to define a few simple macros that allow objects to be created and used as if they were built-in features of the languages. Usually objects are defined as closures that dispatch to the appropriate operation based on an operation name in the argument list. The result is very similar to the way objects are implemented in Erlang. Such macros exist for all Lisp-based languages that I know of, and in many cases the macros are included as part of the base language. As mentioned above, CLOS is another way to do object-oriented programming in Common Lisp.

CLOS is not implemented by a bunch of macros, nor does it use closures. CLOS is also not 'another' way to do object-oriented programming in Common Lisp. It is the built-in and standardized object-oriented subsystem of Common Lisp.
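The closure-dispatch style described in the quoted passage (as distinct from CLOS) can be sketched as follows. This uses Python rather than Lisp, and `make_stack` with its operation names is a hypothetical example:

```python
def make_stack():
    items = []  # state shared by all operations, invisible to callers

    def dispatch(operation, *args):
        # The object is a single closure; the first argument names the operation.
        if operation == "push":
            items.append(args[0])
            return None
        if operation == "pop":
            return items.pop()
        if operation == "size":
            return len(items)
        raise ValueError(f"unknown operation: {operation}")

    return dispatch

stack = make_stack()
stack("push", 1)
stack("push", 2)
print(stack("pop"), stack("size"))  # 2 1
```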

Matt said...

> Matt, I only discuss inclusion polymorphism because that is the only kind of polymorphism that is required for objects. Parametric polymorphism (generics) is very useful, but nobody has said that it is required as part of the definition of object. It is an orthogonal feature that is out of scope for my topic.

What I'm asking is -- why, in your view, is inclusion polymorphism the only kind of polymorphism that is required for objects?

I understand that we can treat parametric polymorphism as an orthogonal feature.

And say I assume the requirement of yours as given: "It does not mean that a particular object has many forms, but rather that a client can interact with many different forms of objects without having to know exactly which kind is being used."

I understand that would rule out overloading (classified as one of the forms of ad hoc polymorphism in Cardelli and Wegner [CW85]), since the client (say, an operation/function acting on an object) would have to know the kind -- do I understand you correctly / is this your argument here?

However, I don't think this applies to coercion (classified as another form of ad hoc polymorphism in Cardelli and Wegner [CW85]) -- see the example in the "Polymorphism In Object-Oriented Languages" article I've linked to. If you disagree, could you explain why?

> And no, I can't see any connection between my definition and Chris Date's.

To recap: "An object is essentially just a value (if immutable) or a variable (otherwise)." (24.1 Introduction to Database Systems)

He seems to agree with you that the notion of inheritance is non-essential (no mention of it above) and that objects are first class citizens (and he also agrees that mutable state is not essential, hence both values & variables are allowed forms of first class citizenship).

Allen Wirfs-Brock said...

Your definitions resonate strongly with me. Perhaps this is not too surprising as you essentially present a behavioralist view of objects and the W-B household has always been a nest of object behavioralists.

I do have a possible addition. You include a definition of object-oriented languages/systems. I recommend also considering defining what "object-oriented analysis/modeling/design/programming" means. My first cut at such a definition would be along the lines of: Software development practices where objects are used as the (or perhaps "a") primary abstraction mechanism. From there, it directly follows that OOA/M/D/P focuses on the behavioral aspects of software systems.

I agree with all of the ways you have minimized the essential object characteristics and your rationales for them. While concepts such as identity and compositional definition mechanisms (eg, inheritance) are important and useful in the design space of OO languages they are not essential to the understanding of "objects". If anything, they are a distraction, at this level. In particular, as you address, identity is difficult to define above the implementation level and pragmatic languages have had to provide mechanisms (eg Smalltalk becomes, proxies in various languages) that support situations where it is necessary or useful to circumvent their deeply ingrained identity concepts.

I agree with your use of "module" in defining object. However, I'm a bit worried that your use of the term may not be widely appreciated outside the academic PL community. Many of our commonly used languages have weak module abstractions or conflate modules with the physical structuring of program source code. In particular, first class modules are rare in current mainstream languages except for their manifestation as objects. For example, the ECMAScript standards committee has committed to a static, non-first class set of modularity extensions. One of the many motivations is that we already have a dynamic, first-class modularity facility. We call it "objects". From the perspective of a programmer who is only familiar with such languages your definition is almost equivalent to "An object is a polymorphic first-class object". But I still like your usage better than the more traditional terms you considered. It brings a fresh emphasis to the perspective that objects are a means to modularize the behavioral aspects of a system. Perhaps there is some other term that would work as well or a new term that could be invented. However, I like thinking about what you are doing here as a small step towards a modern normalization of the PL terminology that has emerged over the last 40 years. It's not clear that inventing new terms would be a desirable part of such an effort.

This brings me to a larger challenge for you. I found your definitions really come to life in the commentary where you define the terms your object/OO definitions depend upon. You can't really define the object terminology without also defining related terms such as "module", "polymorphism", "first class", "operation", etc. These, of course, have other dependencies and share terminology and concepts with other programming paradigms. Please don't stop here. I'm guessing that the complete set of core concept definitions that PLs and programming in general build upon comes to somewhere between 15 and 30 terms. I'd love to see a modernized PL vocabulary that integrates the full breadth of PL concepts and purges legacy implementation and factional biases.

Will Cook said...

Thank you very much for the comments, Allen. Very thoughtful, constructive, and motivating. I'm still not sure about "module", but it seems to be working for me right now. We'll see.

Anonymous said...

Your definition is complex and overly detailed. Object orientation is really not hard to comprehend. The central idea is that objects are intelligent beings that work with each other: if an object lacks the capability for a task, it asks another object "Hi, can you help me?" rather than "Can you give me your information?". I don't think the terminology of object orientation is that important.

Unknown said...

I don't think the use of the term "module" is a weakness but "hidden details" looks like a hole in the definition and actually the object concept is all about those hidden details and the responsibility for them: CRUD for hidden details. Modularity is an afterthought. It comes after the hidden details and encapsulates them together with the operations that support CRUD on them.

Are the hidden details mutable? If not one has to ask if there is another entity which can mutate them, something which is more deeply hidden, like an engine or a runtime environment, "implementation details" which go out of sight and are abstracted away to "keep the semantics clean". If this is so, the system is not object oriented but conforms to Wadler's half ironic depiction of the schizoid Cartesian subject in one of his essays about monadic effects. One has to somehow engage a pineal gland to the "impure world" or many of them to cure the mind-body split.

Avoiding the split and not curing it at all can cause all sorts of confusion because one moves from objects in a software system to objects everywhere else and connects all of them through messages and distinguishes them only by their encodings. No one is interested in software in itself.

In a way I'm quite content with "objects have failed" and OO being a failed paradigm in a radical sense: a vision and a hype which didn't live up to its promises. Technical definitions which are meant to appease the FP crowd just carry forward the failure and the FP people don't care about OO anyway but have their own agenda.

Benjamin Jordan said...

Very interesting article. I saw this a few days ago, and today I ran across Dr. Alan Kay's definition (the inventor of Smalltalk). I thought I'd post it here:

"OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I'm not aware of them."

Anonymous said...

Bjarne Stroustrup in his paper "What is Object-Oriented Programming" said: OOP is programming with inheritance.

asavinov said...

Identity is of primary importance because it manifests the fact of existence. In other words, an object without identity does not exist. However, in classical object-orientation, the mechanism of identification itself is not part of the model. Only objects can be modeled while references are provided automatically by the compiler. In contrast, concept-oriented programming assumes that identities and objects are two parts of one thing and therefore should both have the same rights (hence a new construct generalizing classes is introduced, called concept). What you describe as identity is more reference resolution mechanism where many references can be resolved in the same lower level reference representing one object. Yet, identity is how objects are represented (relation between reference and object) rather than how references are resolved (relation between references).

Tracy Harms said...

Ah, so it is a matter of having all the qualities otherwise available. That makes complete sense to me. (It isn't about scope, per se, it's only about scope when that proves to be a way to identify that something is not first-class.)

Tony Morris said...

The extraordinary effort that OOeationists go to in order to save their thesis from its well-deserved dismissal amazes me every time.

You might be getting tired of hearing facts, but this is unpersuasive as is the proposed epistemological spaghetti.

Keep it coming I say.

Will Cook said...

What thesis do you refer to? I made no claims for OO being better for any particular purpose. I believe it is useful, but that is not the purpose of this document. What is the problem?

Sandro Magi said...

Some interesting discussion here. I've given it some more thought, and I think the core of OO is really about procedural abstraction. Your characterization of objects as first-class modules is exactly this: the primary means of modelling problem domains is via procedural abstraction, where the primary means of modelling problem domains in functional languages is data abstraction.

I believe your paper on OO that was discussed on LtU really captures this concept well.

Matt Havener said...

What did you mean by "The idea that classes are types has caused more confusion and poorly designed code than anything else I know, other than the use of "null" for references"? Any opinions on the use of value objects for storing pure data?

Rob said...

I was writing a comment to the next post, and I came across the following problem:

I have no particular problem with the use of "polymorphism" to mean the thing that you mean, because I know if I'm talking to you I can just use "parametric polymorphism" to mean the thing that I usually mean by "polymorphism."

But what are the correct/acceptable modifiers if I'm talking to you and want to clarify that I mean the thing *you* usually mean? You mention "subtype polymorphism," but unless I'm wrong that's a different thing. "Behavioral polymorphism?" Just calling it "object-oriented polymorphism" seems a bit self-referential.

Will Cook said...

Yes, you are right about that. Unfortunately "subtype polymorphism" doesn't quite capture the idea of polymorphism as used in objects. Informal discussions of object polymorphism are usually quite clear, but the theoretical formalism missed the point slightly. The key idea is the relation between objects and interfaces, or the elements and their types, rather than a relationship between types. I am revising the definition and may avoid use of the term "polymorphism" in the definition, to get around this problem.

Sandro Magi said...

Will, row polymorphism in record systems accurately captures the "object polymorphism" semantics. Row polymorphism has been used to encode first-class messaging before.

Will Cook said...

Sandro, that is true, but not necessary. A simple record calculus, like the kind described in Cardelli's original "Semantics of Multiple Inheritance", will do, even without subtyping! (BTW, that paper is about subtyping, not inheritance.)
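
As a rough illustration of the "simple records" point (not taken from Cardelli's paper), the sketch below builds two objects as plain records of functions, with no classes and no subtype relation between them. The names make_counter, make_doubler, and bump_twice are hypothetical, and since Python is dynamically typed this shows only the behavior, not the typing.

```python
def make_counter(start):
    """Return a record (dict) of operations closing over hidden state."""
    state = {"n": start}

    def increment():
        state["n"] += 1

    def value():
        return state["n"]

    return {"increment": increment, "value": value}


def make_doubler(start):
    """A different implementation offering the same two operations."""
    state = {"n": start}

    def increment():
        state["n"] *= 2

    def value():
        return state["n"]

    return {"increment": increment, "value": value}


def bump_twice(obj):
    """Client code: depends only on the operations found in the record."""
    obj["increment"]()
    obj["increment"]()
    return obj["value"]()


counter_result = bump_twice(make_counter(3))   # 3 -> 4 -> 5
doubler_result = bump_twice(make_doubler(3))   # 3 -> 6 -> 12
```

The client never learns which implementation it was given; it only uses the operations in the record.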

Will Cook said...

Hi Matt. A type describes properties of a value. A class defines a particular implementation of an object. When you use a class as a type, you are specifying that the objects must have that implementation (or a subclass). This exposes implementation information about objects, which makes the program less extensible.
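
A minimal sketch of the difference, assuming Python's typing.Protocol as a stand-in for an interface. The names Stack, ListStack, LinkedStack, and the two sum_top_two functions are hypothetical. Python does not enforce annotations at runtime, so the restriction described in the comments is what a static type checker would report.

```python
from typing import Protocol


class Stack(Protocol):
    """An interface: describes what a stack can do, not how it does it."""

    def push(self, item: int) -> None: ...
    def pop(self) -> int: ...


class ListStack:
    """One particular implementation, backed by a Python list."""

    def __init__(self) -> None:
        self._items = []

    def push(self, item: int) -> None:
        self._items.append(item)

    def pop(self) -> int:
        return self._items.pop()


class LinkedStack:
    """An unrelated implementation, backed by nested tuples."""

    def __init__(self) -> None:
        self._head = None

    def push(self, item: int) -> None:
        self._head = (item, self._head)

    def pop(self) -> int:
        item, self._head = self._head
        return item


def sum_top_two_by_class(s: ListStack) -> int:
    # Annotated with a class: a static checker would reject a LinkedStack here.
    return s.pop() + s.pop()


def sum_top_two_by_interface(s: Stack) -> int:
    # Annotated with an interface: any object with push and pop is acceptable.
    return s.pop() + s.pop()


linked = LinkedStack()
linked.push(2)
linked.push(5)
total = sum_top_two_by_interface(linked)
```

Only the second function stays open to implementations written later, which is the extensibility point made above.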

As for value classes, I'm not sure exactly what you mean. A class that only has data fields is not an object according to the proposed definition. I'm working on another note that will give some examples of OO and compare them to functional approaches.

rdm said...

@Tracy Harms -- I am not sure I agree with you about the "any place where a value is expected" issue, in the context of APL, mostly because I can't think of any meaningful examples where that is a problem.

If I have a value context that wants a string, and I provide a reference which represents something which is not a string but a number? Does that mean that that reference is not first class?

Anyways, it seems to me that we have to be careful about statements that describe "any context". If I have an object which has a method which provides a numeric value, that is not sufficient for the object itself to be valid in numeric contexts.

Perhaps there's a point here, to be made about contexts and polymorphism and guarantees provided by the language. For example: perhaps we would say that a language is only object oriented if [for example] we can extend any "type specific context" to allow previously undeclared types? Thus, for example, if our language supports selecting a value from a list based on its index and it does not allow the index operator to be extended to use objects which represent collections of unit quaternions as indices, then that language could be said to be "not object oriented"? Here, in this hypothetical case, if the language allows the programmer to include any declarations which forbid such things then that language would not be object oriented...

rdm said...

@Will Cook -- I think that a type is, at its core, about characterizing the code which handles that type. I think that any properties associated with the type are abstractions about the code which handles values of that type.

This is an important distinction, to me, because numbers are examples of things which have multiple types. And I deal with numbers. The same value (let's take the number 1, for example) can be an integer, a real number or a complex number (or any of a variety of other types). And this entity can be represented in any of a variety of ways. So, for me, saying that "type describes my data" does not lead me any place useful -- that number 1 is the same data, to me, regardless of the type that I use to represent it.
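
One way to see this concretely, using Python's built-in numeric types as an assumed example (the variable names are hypothetical): the three representations of 1 compare equal and collapse to a single set element, yet each has a different type.

```python
one_int = 1
one_float = 1.0
one_complex = complex(1, 0)

# The three representations compare equal as values ...
same_value = one_int == one_float == one_complex

# ... and count as a single element when collected in a set ...
unique_values = {one_int, one_float, one_complex}

# ... but each one carries a different type.
distinct_types = {type(one_int), type(one_float), type(one_complex)}
```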

So, instead, for me, "type" classifies my code -- for me, "type" is telling me how I have represented this number in this context.

And, polymorphism is only useful to me as a way of managing this kind of issue. I often see polymorphism used in other ways -- but I feel that such practices are a mistake.

Will Cook said...

The idea of "first-class" is a general statement, that objects can be used in the kind of contexts where other values are used. It does not mean that a particular value must be allowed to be used in every place in a particular program. That requirement doesn't make sense, because it would end up leading to (runtime or static) type errors. So think of it as a general statement about kinds of values and kinds of uses in the language, not a specific statement about particular values and programs.
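
A small sketch of what this means in practice, using a hypothetical Greeter class: the object is created at runtime, returned from a function, passed as an argument, and stored in a data structure. Using it where a number is expected is still an error, which matches the point that first-class does not mean "valid everywhere".

```python
class Greeter:
    def __init__(self, greeting):
        self._greeting = greeting

    def greet(self, name):
        return f"{self._greeting}, {name}!"


def make_greeter(greeting):
    # Objects can be returned from functions ...
    return Greeter(greeting)


def greet_all(greeter, names):
    # ... passed as arguments ...
    return [greeter.greet(n) for n in names]


# ... and stored in data structures, like any other value.
greeters = {"en": make_greeter("Hello"), "fr": make_greeter("Bonjour")}
messages = greet_all(greeters["fr"], ["Ada", "Alan"])

# First-class does not mean usable in every context: a Greeter is not a number.
try:
    greeters["en"] + 1
    misuse_failed = False
except TypeError:
    misuse_failed = True
```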

Tony Clark said...

I strongly support the proposed definitions for OO. However I would have expected the basic definitions of 'object' and 'behaviour' to be tied together somehow with the notion of 'self'. Although I understand the relationship of self-reference and inheritance through fixed-points, I think there ought to be a link between object, identity, behaviour (as a collection of operations) and self reference during behaviour invocation. Somehow together these define an 'object'.
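
The link between self-reference and inheritance can be sketched roughly as follows. This is an assumption-laden illustration, not the formal fixed-point semantics: the helper fix ties the knot by mutating a dictionary rather than using a true fixed-point combinator, and the names fix, point_gen, and shifted_gen are hypothetical.

```python
def fix(generator):
    """Build a record whose methods see the finished record as `self`."""
    self = {}
    self.update(generator(self))
    return self


def point_gen(self):
    # Methods reach other methods through `self`, looked up at call time.
    return {
        "x": lambda: 3,
        "y": lambda: 4,
        "norm_sq": lambda: self["x"]() ** 2 + self["y"]() ** 2,
    }


def shifted_gen(self):
    # "Inheritance": reuse the parent generator with the same self,
    # then override one method.
    parent = point_gen(self)
    return {**parent, "x": lambda: 6}


point = fix(point_gen)
shifted = fix(shifted_gen)

point_norm = point["norm_sq"]()      # 3**2 + 4**2
shifted_norm = shifted["norm_sq"]()  # 6**2 + 4**2, via the overridden x
```

The inherited norm_sq picks up the overridden x because both generators share one self, which is where behaviour invocation and self-reference meet.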

Jeremy Siek said...

The new definition based on what you can do with an object is much improved! I also like that the definition is more self contained.

Tom Schrijvers said...

The definition of "object oriented" seems rather vague or arbitrary. Can you be more precise?

From your examples it seems that a language is object oriented because it either intends to be object oriented or because it allows encoding objects and provides a macro system to facilitate using the encoding. It is not clear to me why a macro system makes the difference in the latter case. One would expect that the encoding is still leaky in ways that the macro system cannot hide. Moreover, an encoding is not necessarily more verbose than the syntactic forms in self-declared object-oriented languages like Java. So what does support mean?

Anonymous said...

"I'm not sure if Lua is object oriented."

Because you don't know it sufficiently to judge, or for any other reason?

http://lua-users.org/wiki/ObjectOrientedProgramming
Lua is very similar to JavaScript.

Will Cook said...

@Tom Schrijvers: I think the definition of "object oriented" is simple and elegant, based on the definition of "object". The problem is that it does not correspond to the widely accepted definition, which usually requires many more things (e.g. inheritance).

The idea of "supports" is fairly informal. This may be considered a matter of degree rather than a binary choice. Rather than ask if your language is or is not object oriented, ask how object oriented it is. The notion of "expressiveness" and "encoding" are pervasive and hard to pin down. If you push too hard, then we will have to say that all languages are "functional" because it is possible to encode pure functions (and even laziness) in them. But at that point our definitions are useless.

polux said...

A small meta-comment: seems like something went wrong with the formatting of this post (there's no vertical space before and after the section headers).

I often point people to this post and I generally consider it as a reference post. It's silly but I think that the formatting can sometimes deter them from reading it.