Routing in delay-tolerant networking concerns itself with the ability to transport, or route, data from a source to a destination, which is a fundamental ability all communication networks must have. Delay- and disruption-tolerant networks (DTNs) are characterized by their lack of connectivity, resulting in a lack of instantaneous end-to-end paths. In these challenging environments, popular ad hoc routing protocols such as AODV[1] and DSR[2] fail to establish routes. This is due to these protocols trying to first establish a complete route and then, after the route has been established, forward the actual data. However, when instantaneous end-to-end paths are difficult or impossible to establish, routing protocols must take to a "store and forward" approach, where data is incrementally moved and stored throughout the network in hopes that it will eventually reach its destination.[3] [4] [5] A common technique used to maximize the probability of a message being successfully transferred is to replicate many copies of the message in hopes that one will succeed in reaching its destination.[6]
There are many characteristics DTN protocols, including routing, must take into consideration. A first consideration is if information about future contacts is readily available. For example, in interplanetary communications, many times a planet or moon is the cause of contact disruption, and large distance is the cause of communication delay. However, due to the laws of physics, it is possible to predict the future in terms of the times contacts will be available, and how long they will last. These types of contacts are known as scheduled or predictable contacts.[7] On the contrary, in disaster recovery networks the future location of communicating entities, such as emergency responders, may not be known. These types of contacts are known as intermittent or opportunistic contacts.
A second consideration is if mobility can be exploited and, if so, which nodes are mobile. There are three major cases, classifying the level of mobility in the network. First, it is possible that there are no mobile entities. In this case, contacts appear and disappear based solely on the quality of the communication channel between them. For instance, in interplanetary networks, large objects in space, such as planets, can block communicating nodes for a set period of time. Second, it is possible that some, but not all, nodes in the network are mobile. These nodes, sometimes referred to as Data Mules,[8] [9] are exploited for their mobility. Since they are the primary source of transitive communication between two non-neighboring nodes in the network, an important routing question is how to properly distribute data among these nodes. Third, it is possible that the vast majority, if not all, nodes in the network are mobile. In this case, a routing protocol will most likely have more options available during contact opportunities, and may not have to utilize each one.[10] [11] [12] An example of this type of network is a disaster recovery network where all nodes (generally people and vehicles) are mobile.[13] A second example is a vehicular network where mobile cars, trucks, and buses act as communicating entities.
A third consideration is the availability of network resources. Many nodes, such as mobile phones, are limited in terms of storage space, transmission rate, and battery life. Others, such as buses on the road, may not be as limited. Routing protocols can utilize this information to best determine how messages should be transmitted and stored to not over-burden limited resources. As of April 2008, only recently has the scientific community started taking resource management into consideration, and this is still an active area of research.
While there are many characteristics of routing protocols, one of the most immediate ways to create a taxonomy is based on whether or not the protocol creates replicas of messages. Routing protocols that never replicate a message are considered forwarding-based, whereas protocols that do replicate messages are considered replication-based.
There are both advantages and disadvantages to each approach, and the appropriate approach to use is probably dependent on the scenario at hand. Forwarding-based approaches are generally much less wasteful of network resources, as only a single copy of a message exists in storage in the network at any given time.[14] Furthermore, when the destination receives the message, no other node can have a copy. This eliminates the need for the destination to provide feedback to the network (except for, perhaps, an acknowledgments sent to the sender), to indicate outstanding copies can be deleted. Unfortunately, forwarding-based approaches do not allow for sufficient message delivery rates in many DTNs. Replication-based protocols, on the other hand, allow for greater message delivery rates, since multiple copies exist in the network, and only one (or in some cases, as with erasure coding, a few) must reach the destination. However, the tradeoff here is that these protocols can waste valuable network resources. Furthermore, many flooding-based protocols are inherently not scalable. Some protocols, such as Spray and Wait, attempt to compromise by limiting the number of possible replicas of a given message.
It is important to note that the vast majority of DTN routing protocols are heuristic-based, and non-optimal. This is due to optimality being, in the general DTN case, NP-hard. More specifically "online algorithms without complete future knowledge and with unlimited computational power, or computationally limited algorithms with complete future knowledge, can be arbitrarily far from optimal".
Replication-based protocols have recently obtained much attention in the scientific community, as they can allow for substantially better message delivery ratios than in forwarding-based protocols. These types of routing protocols allow for a message to be replicated; each of the replicas, as well as the original message itself, are generally referred to as message copies or message replicas. Possible issues with replication-based routing include:
Since network resources may quickly become constrained, deciding which messages to transmit first and which messages to drop first play critical roles in many routing protocols.
Epidemic routing is flooding-based in nature, as nodes continuously replicate and transmit messages to newly discovered contacts that do not already possess a copy of the message. In the most simple case, epidemic routing is flooding; however, more sophisticated techniques can be used to limit the number of message transfers. Epidemic routing has its roots in ensuring distributed databases remain synchronized, and many of these techniques, such as rumor mongering, can be directly applied to routing.
Epidemic routing is particularly resource hungry because it deliberately makes no attempt to eliminate replications that would be unlikely to improve the delivery probability of messages. This strategy is effective if the opportunistic encounters between nodes are purely random, but in realistic situations, encounters are rarely totally random. Data Mules (mostly associated with a human) move in a society and accordingly tend to have greater probabilities of meeting certain Mules than others. The Probabilistic Routing Protocol using History of Encounters and Transitivity (PRoPHET) protocol uses an algorithm that attempts to exploit the non-randomness of real-world encounters by maintaining a set of probabilities for successful delivery to known destinations in the DTN (delivery predictabilities) and replicating messages during opportunistic encounters only if the Mule that does not have the message appears to have a better chance of delivering it. This strategy was first documented in a paper from 2003.[15]
An adaptive algorithm is used to determine the delivery predictabilities in each Mule. The Mule M stores delivery predictabilities P(M,D) for each known destination D. If the Mule has not stored a predictability value for a destination P(M,D) is assumed to be zero. The delivery predictabilities used by each Mule are recalculated at each opportunistic encounter according to three rules:
The protocol has been incorporated into the reference implementation maintained by the IRTF DTN Research Group and the current version is documented in RFC 6693. The protocol has been trialled in real world situations during the Sámi Network Connectivity (SNC) project and is being further developed during the EU Framework Programme 7 project Networking for Communications Challenged Communities (N4C).
MaxProp was developed at the University of Massachusetts, Amherst and was, in part, funded by DARPA and the National Science Foundation. The original paper is found in the IEEE INFOCOM 2006 conference. MaxProp is flooding-based in nature, in that if a contact is discovered, all messages not held by the contact will attempt to be replicated and transferred. The intelligence of MaxProp comes in determining which messages should be transmitted first and which messages should be dropped first. In essence, MaxProp maintains an ordered-queue based on the destination of each message, ordered by the estimated likelihood of a future transitive path to that destination.
To obtain these estimated path likelihoods, each node maintains a vector of size
n-1
n
n-1
1 | |
|n|-1 |
j
jth
When two nodes meet, they first exchange their estimated node-meeting likelihood vectors. Ideally, every node will have an up-to-date vector from every other node. With these n vectors at hand, the node can then compute a shortest path via a depth-first search where path weights indicate the probability that the link does not occur (note that this is 1 minus the value found in the appropriate vector). These path weights are summed to determine the total path cost, and are computed over all possible paths to the destinations desired (destinations for all messages currently being held). The path with the least total weight is chosen as the cost for that particular destination. The messages are then ordered by destination costs, and transmitted and dropped in that order.
In conjunction with the core routing described above, MaxProp allows for many complementary mechanisms, each helping the message delivery ratio in general. First, acknowledgements are injected into the network by nodes that successfully receive a message (and are the final destination of that message). These acknowledgements are 128-bit hashes of the message that are flooded into the network, and instruct nodes to delete extra copies of the message from their buffers. This helps free space so outstanding messages are not dropped as often. Second, packets with low hop-counts are given higher priority. This helps promote initial rapid message replication to give new messages a "head start". Without this head start, newer messages can be quickly starved by older messages, since there are generally less copies of new messages in the network. Third, each message maintains a "hop list" indicating nodes it has previously visited to ensure that it does not revisit a node.
RAPID, which is an acronym for Resource Allocation Protocol for Intentional DTN routing, was developed at the University of Massachusetts, Amherst. It was first introduced in the SIGCOMM 2007 publication, DTN Routing as a Resource Allocation Problem. The authors of RAPID argue as a base premise that prior DTN routing algorithms incidentally effect performance metrics, such as average delay and message delivery ratio. The goal of RAPID is to intentionally effect a single routing metric. At the time of publication, RAPID has been instrumented to intentionally minimize one of three metrics: average delay, missed deadlines, and maximum delay.
The core of the RAPID protocol is based around the concept of a utility function. A utility function assigns a utility value,
Ui
i
Ui
i
Ui=-D(i)
The overall protocol is composed of four steps:
Spray and Wait is a routing protocol that attempts to gain the delivery ratio benefits of replication-based routing as well as the low resource utilization benefits of forwarding-based routing. Spray and Wait was developed by researchers at the University of Southern California. It was first presented at the 2005 ACM SIGCOMM conference, under the publication "Spray and Wait: An Efficient Routing Scheme for Intermittently Connected Mobile Networks". Spray and Wait achieves resource efficiency by setting a strict upper bound on the number of copies per message allowed in the network.
The Spray and Wait protocol is composed of two phases: the spray phase and the wait phase. When a new message is created in the system, a number
L
L
There are two main versions of Spray and Wait: vanilla and binary. The two versions are identical except for how the
L
L
L-1
A second version, referred to as Binary Spray and Wait. Here, the source starts, as before, with
L
floor(L/2)
Bubble Rap[16] first introduces the understanding of human mobility into the DTN design. They study the social structures of the between devices and leverage them in the design of forwarding algorithms for Pocket Switched Networks(PSNs). With experiments of real world traces, they discover that human interaction is heterogeneous both in terms of hubs and groups or communities. According to this finding, they propose Bubble Rap, a social-based forwarding algorithm, to improve the forwarding efficiency significantly compared to history-based PROPHET and social-based SimBet algorithms. This algorithm also shows how it can be implemented in a distributed way, which demonstrates that it is applicable in the decentralized environment of PSNs.
CafRep[17] is a fully localised adaptive forwarding & replication protocol with congestion control and avoidance to enable congestion-aware mobile social framework in heterogeneous DTNs. CafRep uses a combined social, buffer and delay metrics for congestion-aware message forwarding and replication that maximises message delivery ratio and availability of nodes while minimising latency and packet loss rates at times of increasing congestion levels. At the core of CafRep is a combined relative utility driven heuristics that allow highly adaptive forwarding and replication policies by managing to detect and offload congested parts of the network and adapting the sending/forwarding rates based on resource and contact predictions.
RACOD: Routing Using Ant Colony Optimization in DTN [18] introduces learning of paths using ACO and also intelligently decides which message to drop and which message to transfer. In DTN, there is no exact knowledge of destination and thus we need to spread messages in all direction to search for the destination. ACO helps in wandering and building shortest path effectively. Protocol uses light-weight messages called ant to build shortest paths, the ant’s movement in ACO can be mapped with propagation of messages that are replicated in DTN and look for their destination. Moreover, this protocol also gives a better buffer management technique, it introduces a 3-way sort technique which helps in dropping old-aged or malicious messages and thus, reduces buffer overhead.
DTLSR is implemented in the DTN2 BP implementation and aims to provide a straightforward extension of link-state routing.[19] With DTLSR, link state announcements are sent as in OLSR, but links that are deemed 'down' are not immediately removed from the graph. Instead, 'downed' links are aged out by increasing their metrics until some maximum is reached, at which point they are removed from the graph. The intent of this is to cause data to continue to flow along paths that used to be supported in the hope that they will be supported again in the future.
The SABR protocol is an extension of Contact Graph Routing [20] that seeks to provide a routing solution for a wide range of scenarios that include both scheduled and discovered connectivity. For the scheduled connectivity regime, SABR uses a 'contact plan' provided by network management describing the current connectivity and future connectivity schedule. SABR then makes forwarding decisions based on an earliest-arrival-time metric where bundles are routed over the time-varying connectivity graph. SABR uses historical contact information and neighbor discovery to address routing over non-scheduled links. The SABR protocol is being standardized by the Consultative Committee for Space Data Systems.
The majority of existing routing and data delivery protocols for DTNs assume that mobile nodes willingly participate in data delivery, share their resources with each other, and follow the rules of underlying networking protocols. Nevertheless, rational nodes in real-world scenarios have strategic interactions and may exhibit selfish behaviours due to various reasons (such as resource limitations, the lack of interest in data, or social preferences).[21] For example, in case a node has limited battery resources or the cost of the network bandwidth delivered by mobile network operators is high, it would not be willingly to relay data for others until appropriate incentives are provided. Meanwhile, malicious nodes may attack the network in different ways to disturb the normal operation of the data transmission process. An adversary, for example, may drop received messages but produce forged routing metrics or false information with the aim of either attracting more messages or decreasing its detection probability. This issue becomes more challenging when some colluding attackers boost their metrics to deceive the attack detection systems. However, dealing with the non-cooperative behaviours of mobile nodes in DTNs is very challenging because of the distributed network model and intermittent access of nodes to centralised authorities.