In certain computer programming languages, data types are classified as either value types or reference types, where reference types are always implicitly accessed via references, whereas value type variables directly contain the values themselves.[1] [2]
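As a rough illustration of this distinction, the following Java sketch contrasts copying an `int` (a value type in Java) with copying an array variable (a reference type in Java). The class and method names are hypothetical and exist only for this example.

```java
public class ValueVsReference {
    // Copies an int (a value type in Java) and then changes the copy.
    static int copyPrimitive() {
        int a = 1;
        int b = a;   // b receives its own copy of the value
        b = 2;       // changing b does not affect a
        return a;
    }

    // Copies an array variable (a reference type in Java) and mutates through the copy.
    static int copyReference() {
        int[] a = {1};
        int[] b = a; // b refers to the same array object as a
        b[0] = 2;    // the change is visible through a as well
        return a[0];
    }

    public static void main(String[] args) {
        System.out.println(copyPrimitive()); // prints 1
        System.out.println(copyReference()); // prints 2
    }
}
```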
Even among languages that have this distinction, the exact properties of value and reference types vary from language to language, but typical properties include:
Even when function arguments are passed using "call by value" semantics (which is always the case in Java, and is the case by default in C#), a value of a reference type is intrinsically a reference; so if a parameter belongs to a reference type, the resulting behavior bears some resemblance to "call by reference" semantics. This behavior is sometimes called call by sharing.
Call by sharing resembles call by reference in the case where a function mutates an object that it received as an argument: when that happens, the mutation is visible to the caller as well, because the caller and the function hold references to the same object. It differs from call by reference in the case where a function assigns a different reference to its parameter: when that happens, the assignment is not visible to the caller, because the caller and the function hold separate references, even though both references initially point to the same object.
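The two cases can be sketched in Java as follows. This is a minimal illustration, not a definitive treatment; `Box`, `mutate`, and `reassign` are hypothetical names introduced only for this example.

```java
public class CallBySharing {
    // Hypothetical mutable class used only for this illustration.
    static class Box {
        int value;
        Box(int value) { this.value = value; }
    }

    // Mutates the object that the parameter refers to; the caller sees the change.
    static void mutate(Box box) {
        box.value = 42;
    }

    // Rebinds only the local parameter; the caller's variable is unaffected.
    static void reassign(Box box) {
        box = new Box(99);
    }

    public static void main(String[] args) {
        Box b = new Box(1);
        mutate(b);
        System.out.println(b.value); // prints 42
        reassign(b);
        System.out.println(b.value); // still prints 42
    }
}
```

Under true call by reference, the assignment inside `reassign` would instead make the caller's variable refer to the new object.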
Many languages have explicit pointers or references. Reference types differ from these in that the entities they refer to are always accessed via references; for example, whereas in C++ it is possible to have either a `std::string` or a `std::string *`, where the former is a mutable string and the latter is an explicit pointer to a mutable string (unless it is a null pointer), in Java it is only possible to have a `StringBuilder`, which is implicitly a reference to a mutable string (unless it is a null reference).
While C++'s approach is more flexible, use of non-references can lead to problems such as object slicing, at least when inheritance is used; in languages where objects belong to reference types, these problems are automatically avoided, at the cost of removing some options from the programmer.
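The following Java sketch suggests why slicing does not arise when objects belong to reference types: assigning a subclass instance to a superclass variable copies only the reference, so the object keeps its subclass state and behavior. `Animal`, `Dog`, and `describe` are hypothetical names used only for illustration.

```java
public class NoSlicing {
    static class Animal {
        String sound() { return "..."; }
    }

    static class Dog extends Animal {
        String name = "Rex"; // state that exists only in the subclass
        @Override
        String sound() { return "Woof"; }
    }

    static String describe() {
        Dog dog = new Dog();
        Animal animal = dog; // copies the reference, not the object
        // Dynamic dispatch still reaches Dog.sound(), and the Dog-only field is intact.
        return animal.sound() + " " + ((Dog) animal).name;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "Woof Rex"
    }
}
```

In C++, by contrast, initializing a base-class object by value from a derived-class object copies only the base-class part.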
In most programming languages, it is possible to change the variable of a reference type to refer to another object, i.e. to rebind the variable to another object.
For example, in the following Java code:

    Foo a = new Foo();
    Foo b = a;
    a.prop = 3;
    a = new Foo();
    a.prop = 1;

`Foo` is a reference type. `a` is initially assigned a reference to a new object, and `b` is assigned the same object reference, i.e. it is bound to the same object as `a`; therefore, changes made through `a` are also visible through `b`. Afterwards, `a` is assigned a reference (rebound) to another new object, so `a` and `b` now refer to different objects. At the end, `a` refers to the second object, whose `prop` field has the value 1, while `b` refers to the first object, whose `prop` field has the value 3.
However, in some languages, such as C++, the term "reference type" is used to mean an alias, and it is not possible to rebind a variable of a reference type once it has been created, as it is an alias of the original object.

    Foo a;
    a.prop = 1;
    Foo &b = a;
    Foo c = a;
    a.prop = 3;

In C++, all non-reference class types have value semantics. In the above example, `b` is declared to be a reference (alias) of `a`, and for all purposes `a` and `b` are the same thing. It is impossible to rebind `b` to refer to something else. After the above example is run, `a` and `b` are the same `Foo` object with `prop` being 3, while `c` is a copy of the original `a` with `prop` being 1.
In C#, apart from the distinction between value types and reference types, there is also a separate concept called reference variables.[3] A reference variable, once declared and bound, behaves as an alias of the original variable, but it can also be rebound to another variable by using the reference assignment operator `= ref`. The aliased variable itself can be of any type, including value types and reference types. For example, when a variable of a reference type is passed by reference (as an alias) to a function, the function can change which object that variable refers to, in addition to changing the object itself (if it is mutable).
If an object is immutable and object equality is tested on content rather than identity, the distinction between value types and reference types is no longer clear, because the object itself cannot be modified; it can only be replaced as a whole (for a value type) or have its reference pointed to another object (for a reference type). Passing such immutable objects between variables makes no observable difference whether the object is copied or passed by reference, unless object identity is tested. In a functional programming language where nothing is mutable (such as Haskell), the distinction is not observable at all and becomes an implementation detail.
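As a small illustration in Java, whose `String` class is immutable, two separately created strings with the same content are indistinguishable under content equality and differ only when identity is tested. The class and method names below are hypothetical.

```java
public class ImmutableEquality {
    // Content equality: the two strings hold the same characters.
    static boolean sameContent() {
        String a = new String("abc");
        String b = new String("abc");
        return a.equals(b);
    }

    // Identity comparison: each `new String(...)` creates a distinct object.
    static boolean sameIdentity() {
        String a = new String("abc");
        String b = new String("abc");
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(sameContent());  // prints true
        System.out.println(sameIdentity()); // prints false
    }
}
```

Code that only ever uses `equals` cannot tell whether such strings are shared or copied; only the `==` identity test exposes the difference.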
| Language | Value type | Reference type |
|---|---|---|
| Java[4] | all non-object types, including (e.g.) booleans and numbers | all object types, including (e.g.) arrays |
| | all data types, except reference types, array types and function types | arrays and functions |
| C#[5] | all non-object types, including structures and enumerations as well as primitive types | all object types, including both classes and interfaces |
| Swift[6] [7] | structures (including e.g. booleans, numbers, strings, and sets) and enumerations (including e.g. optionals) | functions, closures, classes |
| Python[8] | | all types |
| JavaScript[9] | all non-objects, including booleans, floating-point numbers, and strings, among others | all objects, including functions and arrays, among others |
| OCaml[10] [11] | immutable characters, immutable integer numbers, immutable floating-point numbers, immutable tuples, immutable enumerations (including immutable units, immutable booleans, immutable lists, immutable optionals), immutable exceptions, immutable formatting strings | arrays, immutable strings, byte strings, dictionaries |
| | non-object types, such as primitives and arrays | all object (class) types[12] |