DESIGN AND ANALYSIS OF DISTRIBUTED ALGORITHMS, Part 10

DETECTING STABLE PROPERTIES

... access to its own local clock $c_x$, so the value $c_x(t)$ of $x$'s clock at real time $t$ might be different from that of other entities at the same time, and all of them different from $t$. Furthermore, unless the additional restrictions of full synchronicity hold, the local clocks might have different speeds, the distance between consecutive ticks of the same clock might change over time, there are no time bounds on communication delays, and so forth. In other words, within the system, there is no common notion of time.

Fortunately, in practically all cases a common notion of time, although useful, is not needed. To understand what is sufficient for our purposes, observe that “real time” gives a total order to all the events and the actions that occur in the system: We can say whether two events occur at the same time, whether an action is performed before an event takes place, and so forth. In other words, given any two actions or events that occurred in the system, we (external observers) can say (using real time) whether one occurred before, at the same time as, or after the other.

The entities in the system, with access only to their local clocks, have much less knowledge about the temporal relationships of actions and events; however, they do have some. In particular,

• each entity has complete temporal knowledge of the events and actions occurring locally;
• when a message arrives, the receiving entity also knows that the action of transmitting this message happened before its reception.

It turns out that this knowledge is indeed sufficient for obtaining a consistent snapshot. To see how, let us first of all generalize the notion of snapshot and introduce that of a cut.

Let $t_1, t_2, \ldots, t_n$ be instants of real time, not necessarily distinct, and let $x_1, x_2, \ldots, x_n$ be the entities; then $C(x_i)[t_i]$ denotes the state of entity $x_i$ in computation $C$ at time $t_i$.
The set $T = \{t_1, t_2, \ldots, t_n\}$ is called a time cut, and the set $C[T] = \{C(x_1)[t_1], C(x_2)[t_2], \ldots, C(x_n)[t_n]\}$ of the associated entities' states is called the snapshot of $C$ at time cut $T$. Notice that if all $t_i$ are the same, the corresponding snapshot is perfect.

A cut partitions a computation into three temporal sides: before the cut, at the cut, and after the cut. This is very clear if one looks at the Time × Event Diagram (TED) (introduced in Chapter 1) of the computation $C$. For example, Figure 8.10 shows the TED of a simple computation $C$ and three cuts (in bold) $T_1$, $T_2$, and $T_3$ for $C$. Anything before the cut is called past, the cut is called present, and anything after the cut is called future.

[FIGURE 8.10: Cut $T_1$ generates a perfect snapshot; $T_2$ gives a consistent snapshot; $T_3$ generates an inconsistent snapshot. Each of the three panels is a TED over the entities $x$, $y$, $z$, $w$.]

Consider an event $e$ occurring in $C$; this event was either generated by some action (i.e., sending a message or setting the alarm clock) or happened spontaneously (i.e., an impulse). Clearly, the real time of the generating action is before the real time of the generated event. Informally, the snapshot generated by a cut is consistent if it preserves this temporal relationship. Let us express this concept more precisely.

Let $x_i$ and $x_j$ denote the entities where the action and the event occurred, respectively (in the case of a spontaneous event, $x_i = x_j$); and let $t^-$ and $t^+$ denote the times when the action and the event occurred, respectively (in the case of a spontaneous event, $t^- = t^+$). Consider now a snapshot $C[T]$ corresponding to a cut $T = \{t_1, t_2, \ldots, t_n\}$; the snapshot $C[T]$ is consistent if for every event $e$ occurring in $C$ the following condition holds:

if $t^- \geq t_i$ then $t^+ > t_j$.    (8.5)

In other words, the snapshot generated by a cut is consistent if, in the cut, no message is received before it is sent.
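Condition (8.5) can be checked mechanically by an external observer who knows the real times of all send and arrival events. The following is a minimal sketch under that assumption; the names `Message`, `is_consistent`, and the sample cuts are hypothetical and introduced only for illustration, and only message events (not spontaneous ones) are modeled.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str     # entity x_i where the send action occurs
    receiver: str   # entity x_j where the arrival event occurs
    t_send: float   # real time t^- of the send action
    t_recv: float   # real time t^+ of the arrival event


def is_consistent(cut: dict[str, float], messages: list[Message]) -> bool:
    """Return True if every message satisfies: t^- >= t_i implies t^+ > t_j."""
    for m in messages:
        sent_at_or_after_cut = m.t_send >= cut[m.sender]
        received_after_cut = m.t_recv > cut[m.receiver]
        if sent_at_or_after_cut and not received_after_cut:
            return False
    return True


# Hypothetical computation: x sends to w at real time 5; it arrives at time 7.
msgs = [Message("x", "w", 5.0, 7.0)]
perfect = {"x": 4.0, "w": 4.0}        # all t_i equal: a perfect snapshot
skewed_ok = {"x": 6.0, "w": 8.0}      # message sent before the cut at x
inconsistent = {"x": 3.0, "w": 8.0}   # sent in the future of x, received in the past of w
```

The third cut mirrors the situation described for $T_3$: the message is sent after the cut at its sender but received before the cut at its receiver, so `is_consistent` rejects it.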
For example, of the snapshots generated by the three cuts shown in Figure 8.10, the ones generated by $T_1$ and $T_2$ are consistent; indeed, the former is a perfect snapshot. On the contrary, the snapshot generated by cut $T_3$ is not consistent: The message from $x$ to $w$ is sent in the future of $T_3$, but it is received in the past.

Summarizing, our strategy to resolve the personal query problem is to collect at the initiator $x$ a consistent snapshot $C[T]$ by having each entity $x_j$ send its internal state $C(x_j)[t_j]$ to $x$. We must now show that consistent snapshots are sufficient for answering a personal query. This is indeed the case (Exercise ??):

Property 8.4.1 Let $C[T]$ be a consistent snapshot. If $P(C)$ holds for the cut $T$, it holds for every $T' \geq T$.

As a consequence,

Property 8.4.2 Let $x = x_i$ start the collection of the snapshot $C[T]$ at time $t$ and terminate at time $t'$, $t \leq t_i \leq t'$; then
1. if $P(C)$ holds at time $t$, then $P(C)$ holds for the cut $T$;
2. if $P(C)$ does not hold for the cut $T$, then $P(C)$ does not hold at time $t$.

Thus, our problem is now how to compute a consistent snapshot, which we will examine next.

8.4.3 Computing a Consistent Snapshot

Our task is to design a protocol to compute a consistent snapshot. To achieve this task, each entity $x_i$ must select a time $t_i$, and these local choices must be such that the snapshot generated by the resulting cut is consistent. Specifically, each $t_i$ must be such that if $x_i$ sends a C-message to a neighbor $x_j$ at or after $t_i$, this message arrives at $x_j$ after time $t_j$. The difficulty is that, as communication delays are unpredictable, $x_i$ does not know when its message arrives.

Fortunately, there is a very simple way to achieve our goal. Notice that as we have assumed FIFO links, when an entity $y$ receives a message from a neighbor $x$, $y$ knows that all messages sent by $x$ to $y$ before transmitting this one have already arrived. We can use this fact as follows.
Consider the following generalization of WFlood from Chapter 2:

Protocol WFlood+:
1. an initiator sends a wake-up to all neighbors;
2. a noninitiator, upon receiving a wake-up message for the first time, sends a wake-up to all its neighbors.

Notice that the only difference between WFlood+ and WFlood is that now a noninitiator sends a wake-up message also to the entity that woke it up. Let $t_i$ be the time when $x_i$ becomes “awake” (i.e., it initiates WFlood+ or receives the first “wake-up” message). An interesting and important property is the following:

Property 8.4.3 If $x_i$ sends a C-message to $x_j$ at time $t > t_i$, then this message will arrive at $x_j$ at a time $t' > t_j$.

Proof. Consider a C-message sent by an entity $x_i$ to a neighbor $x_j$ at time $t > t_i$; this message will arrive at $x_j$ at some time $t'$. Recall that $x_i$ at time $t_i$ sent a “wake-up” message to all its neighbors, including $x_j$; as links are FIFO, this “wake-up” message arrived at $x_j$ at some time $t''$ before the C-message, that is, $t' > t''$. When $x_j$ receives the “wake-up” message from $x_i$, either it is already awake or it is woken up by it. In either case, $t'' \geq t_j$; as $t' > t''$, it follows that $t' > t_j$. ∎

This means that in the time cut $T = \{t_1, t_2, \ldots, t_n\}$ defined by these time values, no C-message is sent at $T$ and every C-message sent after $T$ also arrives after $T$. In other words,

Property 8.4.4 The snapshot $C[T]$ is consistent.

Thus the problem of constructing a consistent snapshot is solved by simply executing a wake-up using WFlood+. The cost is easy to determine: In the execution of WFlood+, regardless of the number of initiators, exactly two messages are sent on each link, one in each direction. Thus, a total of $2m$ messages are sent.

8.4.4 Summary: Putting All Together

We have just seen how to determine a consistent snapshot $C[T]$ (Protocol WFlood+) with multiple initiators.
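To make Protocol WFlood+ and Property 8.4.3 concrete, here is a small simulation sketch. It is not the book's code: the function name `run_wflood_plus`, the step counter used as a stand-in for real time, the random scheduler, and the sample ring are all assumptions made for illustration. Links are modeled as one FIFO queue per direction, and C-messages are injected at random moments while the wake-up spreads.

```python
import random
from collections import deque


def run_wflood_plus(adj, initiators, seed=0, c_sends=50):
    """Simulate WFlood+ on a connected undirected graph with FIFO links.

    adj maps each entity to the list of its neighbors.
    Returns (wake_time, wakeup_count, c_log), where c_log holds one tuple
    (sender, receiver, t_send, t_recv) per delivered C-message.
    """
    rng = random.Random(seed)
    links = {(u, v): deque() for u in adj for v in adj[u]}  # FIFO queue per directed link
    wake_time, c_log = {}, []
    wakeup_count = 0
    clock = 0  # observer's step counter standing in for real time

    def wake(x):
        nonlocal wakeup_count
        wake_time[x] = clock
        for y in adj[x]:  # wake-up goes to ALL neighbors, including the waker
            links[(x, y)].append(("wake-up", None))
            wakeup_count += 1

    for x in initiators:  # initiators are assumed to be distinct
        clock += 1
        wake(x)

    while c_sends > 0 or any(links.values()):
        clock += 1
        busy = [link for link, queue in links.items() if queue]
        if c_sends > 0 and (not busy or rng.random() < 0.5):
            u = rng.choice(list(adj))      # some entity sends a C-message
            v = rng.choice(adj[u])
            links[(u, v)].append(("C", clock))
            c_sends -= 1
        else:
            u, v = rng.choice(busy)        # arbitrary delay, FIFO order per link
            kind, t_send = links[(u, v)].popleft()
            if kind == "wake-up":
                if v not in wake_time:     # only the first wake-up has an effect
                    wake(v)
            else:
                c_log.append((u, v, t_send, clock))
    return wake_time, wakeup_count, c_log


# Hypothetical 4-node ring (m = 4 links) with two initiators.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
m = sum(len(nbrs) for nbrs in ring.values()) // 2
wake_time, wakeups, c_log = run_wflood_plus(ring, initiators=[0, 2], seed=1)
violations = [c for c in c_log
              if c[2] > wake_time[c[0]] and c[3] <= wake_time[c[1]]]
```

In this sketch `wakeups` counts the wake-up messages, which should equal `2 * m`, and `violations` lists C-messages sent after the sender's wake-up time but delivered at or before the receiver's; by Property 8.4.3 the list should be empty for any schedule that respects FIFO order.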
Once this is done, the entities still have to determine whether or not property $P$ holds for $C[T]$. This can be accomplished by having each $x_i$ send its local state $C(x_i)[t_i]$ to some predefined entity (e.g., the initiator in the case of a single initiator, or the saturated nodes over an existing spanning tree, or a previously elected leader); this entity will collect these fragments of the snapshot, construct from them snapshot $C[T]$, determine locally whether or not property $P$ holds for $C[T]$, and (if required) notify all other entities of the result of the local query. Depending on the size of the local fragments of the snapshot, the amount of information transmitted can be prohibitive.

An alternative to this centralized solution is to compute $P(C)$ at $T$ distributively. This, however, requires knowledge of the nature of property $P$, something that we neither have nor want to require; recall that our original goal is to design a protocol to detect a stable property $P$ regardless of its nature.

At this point, we have a (centralized or decentralized) protocol $Q$ for solving the personal query problem. We can then follow strategy RepeatQuery and repeatedly execute $Q$ until the stable property $P(C)$ is detected to hold. As already mentioned, the overall cost is the cost of $Q$ times the number of times $Q$ is invoked; as we already observed in the case of termination detection, without any control, this cost is unbounded.

Summarizing, we have seen how to solve the global detection problem for stable properties by repeatedly taking consistent snapshots of the system; such a snapshot is sometimes called a global state of the system. This solution is independent of the stable property and thus can be applied to any stable property. We have also seen that the cost of the solution we have designed can be prohibitive and, without some other control, it is possibly unbounded.
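The centralized combination of a personal query $Q$ with strategy RepeatQuery can be sketched as follows. This is only an illustration under stated assumptions: `collect` stands in for whatever mechanism delivers a consistent snapshot to the collecting entity, `holds` is the predicate $P$, and `max_rounds` is a safety cap added for the sketch (the strategy itself has no bound). All names are hypothetical.

```python
from typing import Callable, Dict, Optional

Snapshot = Dict[str, int]  # entity name -> local state C(x_i)[t_i] (ints in this toy)


def personal_query(collect: Callable[[], Snapshot],
                   holds: Callable[[Snapshot], bool]) -> bool:
    """One execution of Q: gather the fragments, then evaluate P locally."""
    snapshot = collect()
    return holds(snapshot)


def repeat_query(collect: Callable[[], Snapshot],
                 holds: Callable[[Snapshot], bool],
                 max_rounds: Optional[int] = None) -> int:
    """Run Q repeatedly until P is detected; return the number of executions."""
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        rounds += 1
        if personal_query(collect, holds):
            return rounds
    raise RuntimeError("property not detected within max_rounds")


# Toy stand-in for a computation: pending work shrinks by one unit per query.
pending = {"x": 2, "y": 3, "z": 1}


def collect() -> Snapshot:
    snap = dict(pending)                 # the snapshot handed to the collector
    for name in pending:                 # the computation keeps progressing
        pending[name] = max(0, pending[name] - 1)
    return snap


rounds = repeat_query(collect, lambda s: all(v == 0 for v in s.values()),
                      max_rounds=10)
```

The returned `rounds` is the multiplier in the overall cost: each execution of $Q$ pays for one snapshot plus the collection of its fragments, which is why the total is unbounded when nothing limits the number of repetitions.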
Clearly, for specific properties we can use knowledge of the property to reduce the costs (e.g., the number of times $Q$ is executed, as we did in the case of termination) or to develop different ad hoc solutions (as we did in the case of deadlock); for example, see Problem 8.6.16.

8.5 BIBLIOGRAPHICAL NOTES

The problem of distributed deadlock detection has been extensively studied, and a very large number of solutions have been designed, proposed, and analyzed. However, not all these attempts have been successful: some fail to work correctly, either detecting false deadlocks or failing to detect existing deadlocks, and others exhibit very poor performance. As deadlock can occur in almost any application area, solutions have been developed by researchers in all these areas (from distributed databases to systems of finite state machines, from distributed operating systems to distributed transactions to distributed simulation), many times unaware of (and sometimes reproducing) each other's efforts and results. Also, deadlocks in different types of request systems (single request, AND, OR, etc.) have oftentimes been studied in isolation as different problems, overlooking the similarities and the commonalities and sometimes proposing the same techniques. In addition, because of its link with cycle detection and with knot detection, some aspects of deadlock detection have also been studied by investigators in distributed graph algorithms.

Interestingly, one of the earliest algorithms, LockGrant, is not only the most efficient (in order of magnitude) protocol for personal detection with a single initiator in a static graph but also the most general, as it can be used (efficiently) in all types of request systems. It was designed by Gabriel Bracha and Sam Toueg [2], and their static protocol can be modified to work efficiently also on dynamic graphs in all request systems (Problem 8.6.9).
The number of messages has been subsequently reduced from $4m$ to $2m$ by Ajay Kshemkalyani and Mukesh Singhal [11]. In the presence of multiple initiators, the idea of integrating a leader-election process into the detection protocol (Problem 8.6.2) was first proposed by Israel Cidon [6].

The simpler problem of personal knot detection was first solved by Mani Chandy and Jayadev Misra [4] for a single initiator with $4m$ messages and later with $2m$ messages by Azzedine Boukerche and Carl Tropper [1]. A protocol for multiple initiators that uses only $3m + O(n \log n)$ messages has been designed by Israel Cidon [6].

The problem of detecting global termination of a computation was first posed by Nissim Francez [9] and by Edsger Dijkstra and Carel Scholten [8]. Protocol TerminationQuery for the personal termination query problem was designed by Rodney Topor [21] and used in strategy RepeatQuery for the personal termination detection problem. The more efficient protocol Shrink for a single initiator is due to Edsger Dijkstra and Carel Scholten [8]; its extension to multiple initiators, protocol MultiShrink, has been designed by Nir Shavit and Nissim Francez [18].

The idea of message counting was first employed by Mani Chandy and Jayadev Misra [5] and refined by Friedemann Mattern [13]. Other mechanisms and ideas employed to detect termination include the following: “markers,” proposed by Jayadev Misra [16]; “credits,” suggested by Friedemann Mattern [14]; and “timestamps,” proposed by S. Rana [17].

The relationship between the problem of garbage collection and that of global termination detection was first observed by Carel Scholten [unpublished], made explicit (in one direction) by Gerard Tel, Richard Tan, and Jan van Leeuwen [20], and analyzed (in the other direction: Problem 8.6.14) by Gerard Tel and Friedemann Mattern [19].

The fact that Protocol WFlood+ constructs a consistent snapshot was first observed by Mani Chandy and Leslie Lamport [3].
Protocols to construct a consistent snapshot when the links are not FIFO were designed by Ten Lai and Tao Yang [12] and Friedemann Mattern [15]; they, however, require C-messages to contain control information. The strategy of constructing and checking a consistent snapshot has been used by Gabriel Bracha and Sam Toueg for deadlock detection in dynamic graphs [2], and by Shing-Tsaan Huang [10] and Friedemann Mattern [13] for termination detection.

8.6 EXERCISES, PROBLEMS, AND ANSWERS

8.6.1 Exercises

Exercise 8.6.1 Prove that protocol GeneralSimpleCheck would solve the personal and component deadlock detection problem.
Exercise 8.6.2 Show the existence of wait-for graphs of $n$ nodes in which protocol GeneralSimpleCheck would require a number of messages exponential in $n$.
Exercise 8.6.3 Show a situation where, when executing protocol LockGrant, an entity receives a “Grant” message after it has terminated its execution of Shout.
Exercise 8.6.4 Prove that in protocol LockGrant, if an entity sends a “Grant” message to a neighbor, it will receive a “Grant-Ack” from that neighbor within finite time.
Exercise 8.6.5 Prove that in protocol LockGrant, if an entity sends a “Shout” message to a neighbor, it will receive a “Reply” from that neighbor within finite time.
Exercise 8.6.6 Prove that in protocol LockGrant, if a “Grant” message has not been acknowledged at time $t$, the initiator $x_0$ has not yet received a “Reply” from all its neighbors at that time.
Exercise 8.6.7 Prove that in protocol LockGrant, if an entity receives a “Grant” message from all its out-neighbors, then it is not deadlocked.
Exercise 8.6.8 Prove that in protocol LockGrant, if an entity is not deadlocked, it will receive a “Grant” message from all its out-neighbors within finite time.
Exercise 8.6.9 Modify the definition of a solution protocol for the collective deadlock detection problem in the dynamic case.
Exercise 8.6.10 Prove that in the dynamic single-request model, once formed, the core of a crown will remain unchanged.
Exercise 8.6.11 Prove that in the dynamic single-request model, if the initiator $x_0$ is in a rooted tree that is not going to become (part of) a crown, then its message is eventually going to reach the root of the tree.
Exercise 8.6.12 Prove that in the dynamic single-request model, if a new crown is formed while the “Check” message started by $x_0$ is still traveling, the protocol will correctly notify $x_0$ that it is involved in a deadlock.
Exercise 8.6.13 Prove that in the dynamic single-request model, if a new crown is formed while the “Check” message started by $x_0$ is still traveling, the protocol will correctly notify $x_0$ that it is involved in a deadlock.
Exercise 8.6.14 Modify protocol LockGrant so that it solves the personal and the collective deadlock detection problem in the OR-Request model. Assume a single initiator. Prove the correctness and analyze the cost of the resulting protocol. Implement and thoroughly test your protocol. Compare the experimental results with the theoretical bounds.
Exercise 8.6.15 Implement and thoroughly test the protocol designed in Exercise 8.6.14. Compare the experimental results with the theoretical bounds.
Exercise 8.6.16 Modify protocol LockGrant so that it solves the personal and the collective deadlock detection problem in the p-OF-q Request model. Assume a single initiator. Prove the correctness and analyze the cost of the resulting protocol. Implement and thoroughly test your protocol. Compare the experimental results with the theoretical bounds.
Exercise 8.6.17 Implement and thoroughly test the protocol designed in Exercise 8.6.16. Compare the experimental results with the theoretical bounds.
Exercise 8.6.18 Modify protocol LockGrant so that it solves the personal and the collective deadlock detection problem in the Generalized Request model.
Assume a single initiator. Prove the correctness and analyze the cost of the resulting protocol. Implement and thoroughly test your protocol. Compare the experimental results with the theoretical bounds.
Exercise 8.6.19 Implement and thoroughly test the protocol designed in Exercise 8.6.18. Compare the experimental results with the theoretical bounds.
Exercise 8.6.20 Prove that protocol TerminationQuery is a correct personal query protocol, that is, show that Property 8.3.1 holds.
Exercise 8.6.21 Prove that using strategy RepeatQuery+, protocol $Q$ is executed at most $T \leq M(C)$ times. Show an example in which $T = M(C)$.
Exercise 8.6.22 Let $Q$ be a multiple-initiators personal query protocol. Modify strategy RepeatQuery+ to work with multiple initiators.
Exercise 8.6.23 Consider strategy Shrink for personal termination detection with a single initiator. Show that at any time, all black nodes form a tree rooted in the initiator and all white nodes are singletons.
Exercise 8.6.24 Consider strategy Shrink for personal termination detection with a single initiator. Prove that if all nodes are white at time $t$, then $C$ is terminated at that time.
Exercise 8.6.25 Consider strategy Shrink for personal termination detection with a single initiator. Prove that if $C$ is terminated at time $t$, then there is a $t' \geq t$ such that all nodes are white at time $t'$.
Exercise 8.6.26 Consider strategy Shrink for personal termination detection with multiple initiators. Show that at any time, the black nodes form a forest of trees, each rooted in one of the initiators, and the white nodes are singletons.
Exercise 8.6.27 Consider strategy Shrink for personal termination detection with multiple initiators. Prove that if all nodes are white at time $t$, then $C$ is terminated at that time.
Exercise 8.6.28 Consider strategy Shrink for personal termination detection with multiple initiators.
Prove that if $C$ is terminated at time $t$, then there is a $t' \geq t$ such that all nodes are white at time $t'$.
Exercise 8.6.29 Consider protocol MultiShrink for personal termination detection with multiple initiators. Prove that when a saturated node becomes white, all other nodes are also white.
Exercise 8.6.30 Consider protocol MultiShrink for personal termination detection with multiple initiators. Explain why it is possible that only one entity becomes saturated. Show an example.
Exercise 8.6.31 () Prove that for every computation $C$, every protocol must send at least $2n - 1$ messages in the worst case to detect the global termination of $C$.

8.6.2 Problems

Problem 8.6.1 Write the set of rules of protocol DeadCheck implementing the simple check strategy for personal and for collective deadlock detection in the single resource model. Implement and thoroughly test your protocol. Compare the experimental results with the theoretical bounds.
Problem 8.6.2 () For the problem of personal deadlock detection with multiple initiators, consider the strategy of integrating into the solution an election process among the initiators. Design a protocol for the single-request model to implement this strategy efficiently; its total cost should be $o(kn)$ messages in the worst case, where $k$ is the number of initiators and $n$ is the number of entities. Prove the correctness and analyze the cost of your design. Implement and thoroughly test your protocol. Compare the experimental results with the theoretical bounds.
Problem 8.6.3 Implement protocol LockGrant, both for personal and for collective deadlock detection. Thoroughly test your protocol. Compare the experimental results with the theoretical bounds.
Problem 8.6.4 () In protocol LockGrant employ Shout+ instead of Shout, so as to use at most $4|E(x_0)|$ messages in the worst case. Write the corresponding set of rules. Implement and thoroughly test your protocol.
Compare the experimental results with the theoretical bounds.
Problem 8.6.5 () For the problem of personal deadlock detection with multiple initiators, consider the strategy of integrating into the solution an election process among the initiators. Design a protocol for the AND-request model to implement this strategy efficiently; its total cost should be $o(km)$ messages in the worst case, where $k$ is the number of initiators and $m$ is the number of links in the wait-for graph. Prove the correctness and analyze the cost of your design. Implement and thoroughly test your protocol. Compare the experimental results with the theoretical bounds.
Problem 8.6.6 () Modify protocol LockGrant so that, with a single initiator, it works correctly also in a dynamic wait-for graph. Prove the correctness and analyze the cost of the modified protocol.
Problem 8.6.7 () For the problem of personal deadlock detection with multiple initiators, consider the strategy of integrating into the solution an election process among the initiators. Design a protocol for the OR-request model to implement this strategy efficiently; its total cost should be $o(km)$ messages in the worst case, where $k$ is the number of initiators and $m$ is the number of links in the wait-for graph. Prove the correctness and analyze the cost of your design. Implement and thoroughly test your protocol. Compare the experimental results with the theoretical bounds.
Problem 8.6.8 () For the problem of personal deadlock detection with multiple initiators, consider the strategy of integrating into the solution an election process among the initiators. Design a protocol for the p-OF-q request model to implement this strategy efficiently; its total cost should be $o(km)$ messages in the worst case, where $k$ is the number of initiators and $m$ is the number of links in the wait-for graph. Prove the correctness and analyze the cost of your design. Implement and thoroughly test your protocol.
Compare the experimental results with the theoretical bounds.
Problem 8.6.9 () For the problem of personal deadlock detection with multiple initiators, consider the strategy of integrating into the solution an election process among the initiators. Design a protocol for the Generalized request model to implement this strategy efficiently; its total cost should be $o(km)$ messages in the worst case, where $k$ is the number of initiators and $m$ is the number of links in the wait-for graph. [...]

BIBLIOGRAPHY

[1] A. Boukerche and C. Tropper. A distributed graph algorithm for the detection of local cycles and knots. IEEE Transactions on Parallel and Distributed Systems, 9(8):748–757, August 1998.
[2] G. Bracha and S. Toueg. Distributed deadlock detection. Distributed Computing, 2:127–138, 1987.
[3] K. M. Chandy and L. Lamport. Distributed snapshots: Determining global states of distributed systems. ACM Transactions...

... entered between $x_i$ and $x_{i+1}$, that is, $Q[t] = x_1, x_2, \ldots, x_i, y, x_{i+1}, \ldots, x_k$. In other words, the execution of protocol OnDemandTraversal can be viewed as the management of a distributed ordered queue. This point of view opens an interesting and surprising connection between the problem of distributed mutual exclusion and that of fair management of a distributed queue: Any fair distributed queue-management...

... of distributed mutual exclusion and that of managing a distributed queue (another continuous computation). In particular, we will see how any protocol for fair management of a distributed queue can be used to solve the problem of distributed mutual exclusion. Throughout, we will assume restrictions IR.

9.3.2 A Simple and Efficient Solution

The problem of distributed mutual exclusion has a very simple and...
... Efficient algorithms for distributed snapshots and global virtual time approximation. Journal of Parallel and Distributed Computing, 18(4):423–434, August 1993.
[16] J. Misra. Detecting termination of distributed computations using markers. In 2nd Symposium on Principles of Distributed Computing, pages 290–294, Montreal, 1983.
[17] S. P. Rana. A distributed solution of the distributed...
[4] K. M. Chandy and J. Misra. A distributed graph algorithm: knot detection. ACM Transactions on Programming Languages and Systems, 4:144–156, 1982.
[5] K. M. Chandy and J. Misra. A paradigm for detecting quiescent properties in distributed computations. In K. R. Apt (Ed.), Logic and Models of Concurrent Systems, 1985.
[6] I. Cidon. An efficient distributed knot-detection algorithm. IEEE Transactions on Software Engineering,...

... clocks, controlling access to a shared resource or service, maintaining a distributed queue, and detecting and resolving deadlocks.

Design and Analysis of Distributed Algorithms, by Nicola Santoro. Copyright © 2007 John Wiley & Sons, Inc.

CONTINUOUS COMPUTATIONS

Some continuous problems are just the (endless) repetition of a terminating problem (plus adjustments); others could be solved in that...

... event $b$, and denote this fact by $a \to b$, if one of the following conditions holds:
1. both $a$ and $b$ occur at the same entity and $t(a) < t(b)$;
2. $a$ is the event at $x$ whose reaction is the transmission of a message to neighbor $y$, and $b$ is the arrival at $y$ of that message;
3. there exists a sequence $e_1, e_2, \ldots, e_k$ of events such that $e_1 = a$, $e_k = b$, and $e_i \to e_{i+1}$.
We will say that two events $a$ and $b$ are...
..., regardless of the location of $y$ in the network. It uses a spanning tree of the network; it also requires the existence and availability of a correct routing mechanism (possibly, using only edges of the tree). The strategy of Arrow is based on two ideas: (i) the entity holding the token knows the identity of the first entity in the queue, and every entity in the queue knows the identity of the next one...

... the queue, and it already knows the identity of the entity $x_2$ that should receive the token when it has finished. In other words, the handling of the token is done independently of the handling of the requests and is implemented using a correct routing protocol; thus, as long as every entity in the queue knows the identity of the next, token transfers pose no problems. Consider now how to handle the requests...

... needed structure and information is in place, a single request for the token can be easily and simply handled, correctly maintaining and updating the structure and information. If there are several concurrent requests for the token, the handling of one could interfere with the handling of another, for example, when trying to root the tree in the “last” entity in the queue: Indeed, which of them is going...