Diffstat (limited to 'src/testbed/barriers.README.org')
-rw-r--r-- | src/testbed/barriers.README.org | 95 |
1 files changed, 0 insertions, 95 deletions
diff --git a/src/testbed/barriers.README.org b/src/testbed/barriers.README.org deleted file mode 100644 index 159e1c355..000000000 --- a/src/testbed/barriers.README.org +++ /dev/null | |||
@@ -1,95 +0,0 @@ | |||
1 | * Description | ||
2 | The testbed subsystem's barriers API facilitates coordination among the peers | ||
3 | run by the testbed and the experiment driver. The concept is similar to the | ||
4 | barrier synchronisation mechanism found in parallel programming or | ||
5 | multi-threading paradigms - a peer waits at a barrier upon reaching it until the | ||
6 | barrier is reached by a predefined number of peers. This predefined number of | ||
7 | peers required to cross a barrier is also called the quorum. We say a peer has | |
8 | reached a barrier if the peer is waiting for the barrier to be crossed. | |
9 | Similarly, a barrier is said to be reached if the required quorum of peers has | |
10 | reached it. A barrier which is reached is deemed crossed after all the | |
11 | peers waiting on it are notified. | ||
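
The terminology above can be modelled in a few lines of C. This is a minimal sketch, not the testbed's actual data structure: the struct and every `model_` name are hypothetical and exist only to show how "reached by a peer", "reached" and "crossed" differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a barrier; not the testbed's real representation. */
struct model_barrier
{
  unsigned int quorum;    /* number of peers required to cross the barrier */
  unsigned int n_waiting; /* peers that have reached the barrier and wait */
  bool crossed;           /* set once all waiting peers have been notified */
};

/* A peer reaches the barrier: it starts waiting for it to be crossed. */
void
model_peer_reaches (struct model_barrier *b)
{
  assert (NULL != b);
  b->n_waiting++;
}

/* The barrier is reached once the quorum of peers is waiting on it. */
bool
model_barrier_reached (const struct model_barrier *b)
{
  assert (NULL != b);
  return b->n_waiting >= b->quorum;
}

/* Notify the waiting peers; only then is the barrier deemed crossed.
   Returns the number of peers notified, or 0 if the barrier is not yet
   reached or was already crossed. */
unsigned int
model_barrier_cross (struct model_barrier *b)
{
  assert (NULL != b);
  if (b->crossed || ! model_barrier_reached (b))
    return 0;
  b->crossed = true;
  return b->n_waiting;
}
```

With a quorum of 2, a single call to model_peer_reaches() leaves the barrier unreached; after the second call the barrier is reached, and it becomes crossed only once model_barrier_cross() has run.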
12 | |||
13 | The barriers API provides the following functions: | ||
14 | 1) GNUNET_TESTBED_barrier_init(): function to initialise a barrier in the | ||
15 | experiment | ||
16 | 2) GNUNET_TESTBED_barrier_cancel(): function to cancel a barrier which has been | ||
17 | initialised before | ||
18 | 3) GNUNET_TESTBED_barrier_wait(): function to signal the barrier service that the | |
19 | caller has reached a barrier and is waiting for it to be crossed | ||
20 | 4) GNUNET_TESTBED_barrier_wait_cancel(): function to stop waiting for a barrier | ||
21 | to be crossed | ||
22 | |||
23 | Among the above functions, the first two, namely GNUNET_TESTBED_barrier_init() | ||
24 | and GNUNET_TESTBED_barrier_cancel() are used by experiment drivers. All barriers | |
25 | should be initialised by the experiment driver by calling | ||
26 | GNUNET_TESTBED_barrier_init(). This function takes a name to identify the | ||
27 | barrier, the quorum required for the barrier to be crossed and a notification | ||
28 | callback for notifying the experiment driver when the barrier is crossed. The | ||
29 | function GNUNET_TESTBED_barrier_cancel() cancels an initialised barrier and | |
30 | frees the resources allocated for it. This function can be called on an | |
31 | initialised barrier before it is crossed. | |
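
The driver-side lifecycle described above (initialise with a name, a quorum and a callback; cancel before crossing) can be sketched as follows. This is not the GNUNET_TESTBED API: all `model_` names and signatures are hypothetical. It also assumes that a barrier releases its own resources once it has been crossed, which the text above does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: invoked when the barrier is crossed. */
typedef void (*model_crossed_cb) (void *cls, const char *name);

/* Hypothetical driver-side handle for a barrier. */
struct model_barrier
{
  char *name;           /* identifies the barrier */
  unsigned int quorum;  /* peers required for the barrier to be crossed */
  model_crossed_cb cb;  /* notifies the experiment driver on crossing */
  void *cb_cls;         /* closure handed to cb */
};

/* Initialise a barrier. Returns NULL on invalid arguments or if memory
   allocation fails. */
struct model_barrier *
model_barrier_init (const char *name, unsigned int quorum,
                    model_crossed_cb cb, void *cb_cls)
{
  struct model_barrier *b;
  size_t len;

  if ((NULL == name) || (0 == quorum))
    return NULL;
  b = malloc (sizeof (*b));
  if (NULL == b)
    return NULL;
  len = strlen (name) + 1;
  b->name = malloc (len);
  if (NULL == b->name)
  {
    free (b);
    return NULL;
  }
  memcpy (b->name, name, len);
  b->quorum = quorum;
  b->cb = cb;
  b->cb_cls = cb_cls;
  return b;
}

/* Cancel a barrier that has not been crossed yet and free its resources. */
void
model_barrier_cancel (struct model_barrier *b)
{
  assert (NULL != b);
  free (b->name);
  free (b);
}

/* The barrier was crossed: notify the driver, then release the barrier
   (assumption of this sketch). The handle must not be used afterwards. */
void
model_barrier_crossed (struct model_barrier *b)
{
  assert (NULL != b);
  if (NULL != b->cb)
    b->cb (b->cb_cls, b->name);
  model_barrier_cancel (b);
}

/* Example callback: counts crossings in the unsigned int cls points to. */
void
model_count_cb (void *cls, const char *name)
{
  unsigned int *count = cls;

  (void) name;
  (*count)++;
}
```

A driver would keep the handle returned by model_barrier_init() and either cancel it or wait for the callback, never both.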
32 | |||
33 | The remaining two functions GNUNET_TESTBED_barrier_wait() and | ||
34 | GNUNET_TESTBED_barrier_wait_cancel() are used in the peers' processes. | |
35 | GNUNET_TESTBED_barrier_wait() connects to the local barrier service running on | ||
36 | the same host the peer is running on and registers that the caller has reached | ||
37 | the barrier and is waiting for the barrier to be crossed. Note that this | ||
38 | function can only be used by peers which are started by the testbed, as this function | |
39 | tries to access the local barrier service which is part of the testbed | ||
40 | controller service. Calling GNUNET_TESTBED_barrier_wait() on an uninitialised | ||
41 | barrier results in failure. GNUNET_TESTBED_barrier_wait_cancel() cancels the | ||
42 | notification registered by GNUNET_TESTBED_barrier_wait(). | ||
43 | |||
44 | |||
45 | * Implementation | ||
46 | Since barriers involve coordination between the experiment driver and the peers, the | |
47 | barrier service in the testbed controller is split into two components. The | |
48 | first component responds to the messages generated by the barrier API used by the | |
49 | experiment driver (functions GNUNET_TESTBED_barrier_init() and | |
50 | GNUNET_TESTBED_barrier_cancel()) and the second component to the messages | |
51 | generated by the barrier API used by peers (functions GNUNET_TESTBED_barrier_wait() | |
52 | and GNUNET_TESTBED_barrier_wait_cancel()). | ||
53 | |||
54 | Calling GNUNET_TESTBED_barrier_init() sends a BARRIER_INIT message to the master | ||
55 | controller. The master controller then registers a barrier and calls | ||
56 | GNUNET_TESTBED_barrier_init() for each of its subcontrollers. In this way, barrier | |
57 | initialisation is propagated through the controller hierarchy. While propagating | |
58 | initialisation, any errors at a subcontroller such as a timeout during further | |
59 | propagation are reported up the hierarchy back to the experiment driver. | ||
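
The propagation of initialisation can be pictured as a recursive walk over the controller tree. The sketch below is a simplified, synchronous model with hypothetical names; the real controllers exchange BARRIER_INIT messages asynchronously, and the `fail_init` flag merely stands in for an error such as a timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_CHILDREN 4

/* Hypothetical controller node; the field names are illustrative only. */
struct model_controller
{
  struct model_controller *children[MODEL_MAX_CHILDREN]; /* subcontrollers */
  unsigned int n_children;
  bool fail_init;          /* simulates an error such as a timeout */
  bool barrier_registered; /* set once the barrier is registered here */
};

/* Register the barrier at this controller, then propagate to each of its
   subcontrollers. Returns false if this controller or any descendant
   fails, so the error travels up the hierarchy back to the caller, which
   stands in for the experiment driver. */
bool
model_propagate_init (struct model_controller *c)
{
  assert (NULL != c);
  assert (c->n_children <= MODEL_MAX_CHILDREN);
  if (c->fail_init)
    return false;
  c->barrier_registered = true;
  for (unsigned int i = 0; i < c->n_children; i++)
    if (! model_propagate_init (c->children[i]))
      return false;
  return true;
}
```

In this model a failure stops the walk at the first failing subcontroller; cleaning up barriers that were already registered elsewhere is left out for brevity.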
60 | |||
61 | Similar to GNUNET_TESTBED_barrier_init(), GNUNET_TESTBED_barrier_cancel() | ||
62 | propagates a BARRIER_CANCEL message, which causes controllers to remove an | |
63 | initialised barrier. | ||
64 | |||
65 | The second component is implemented as a separate service in the binary | ||
66 | `gnunet-service-testbed' which already has the testbed controller service. | ||
67 | Although this deviates from the GNUnet process architecture of having one | |
68 | service per binary, it is needed in this case as this component needs access to | ||
69 | barrier data created by the first component. This component responds to | ||
70 | BARRIER_WAIT messages from local peers when they call | ||
71 | GNUNET_TESTBED_barrier_wait(). Upon receiving a BARRIER_WAIT message, the service | |
72 | checks if the requested barrier has been initialised. If it was not | |
73 | initialised, an error status is sent through a BARRIER_STATUS message to the local | |
74 | peer and the connection from the peer is terminated. If the barrier has been | |
75 | initialised, the barrier's counter for reached peers is incremented and a | |
76 | notification is registered to notify the peer when the barrier is reached. The | ||
77 | connection from the peer is left open. | ||
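
The handling of a BARRIER_WAIT message described above amounts to a lookup followed by one of two outcomes. The following sketch models that decision with hypothetical types and status codes; it does not reproduce the real message formats or the service's client handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome of a wait request; not the real status codes. */
enum model_status
{
  MODEL_STATUS_ERROR,  /* barrier was never initialised */
  MODEL_STATUS_WAITING /* peer is registered and keeps waiting */
};

/* Hypothetical record of an initialised barrier. */
struct model_barrier
{
  const char *name;
  unsigned int n_reached; /* counter for peers that reached the barrier */
};

/* Hypothetical view of the connection from a local peer. */
struct model_client
{
  bool connected;         /* false once the service drops the connection */
  bool notify_registered; /* true if the peer awaits the crossing */
};

/* Find an initialised barrier by name; returns NULL if there is none. */
struct model_barrier *
model_lookup (struct model_barrier *barriers, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (0 == strcmp (barriers[i].name, name))
      return &barriers[i];
  return NULL;
}

/* Handle a wait request from a local peer for the barrier called name. */
enum model_status
model_handle_wait (struct model_barrier *barriers, size_t n,
                   const char *name, struct model_client *client)
{
  struct model_barrier *b;

  assert (NULL != name);
  assert (NULL != client);
  b = model_lookup (barriers, n, name);
  if (NULL == b)
  {
    /* Unknown barrier: report the error and terminate the connection. */
    client->connected = false;
    return MODEL_STATUS_ERROR;
  }
  /* Known barrier: count the peer and keep the connection open so the
     peer can be notified later. */
  b->n_reached++;
  client->notify_registered = true;
  return MODEL_STATUS_WAITING;
}
```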
78 | |||
79 | When enough peers to attain the quorum have sent BARRIER_WAIT messages, the | |
80 | controller sends a BARRIER_STATUS message to its parent, informing it that the | |
81 | barrier is crossed. If the controller has started further subcontrollers, it | ||
82 | delays this message until it receives a similar notification from each of those | ||
83 | subcontrollers. Finally, the barriers API at the experiment driver receives the | ||
84 | BARRIER_STATUS when the barrier is reached at all the controllers. | ||
85 | |||
86 | The barriers API at the experiment driver responds to the BARRIER_STATUS message | ||
87 | by echoing it back to the master controller and notifying the experiment | ||
88 | driver through the notification callback that a barrier has been crossed. | |
89 | The echoed BARRIER_STATUS message is propagated by the master controller through the | |
90 | controller hierarchy. This propagation triggers the notifications registered by | ||
91 | peers at each of the controllers in the hierarchy. Note the difference between | |
92 | this downward propagation of the BARRIER_STATUS message and its upward | |
93 | propagation: the upward propagation is needed to ensure that the barrier is | |
94 | reached by all the controllers, and the downward propagation triggers the | |
95 | notifications that the barrier is crossed. | |
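
The two propagation phases can be contrasted in a small model. This sketch makes several simplifying assumptions: the tree is walked synchronously instead of exchanging BARRIER_STATUS messages, each controller is given its own local quorum, and all `model_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_CHILDREN 4

/* Hypothetical controller node used to illustrate the two phases. */
struct model_controller
{
  struct model_controller *children[MODEL_MAX_CHILDREN]; /* subcontrollers */
  unsigned int n_children;
  unsigned int quorum;     /* local peers required at this controller */
  unsigned int n_waiting;  /* local peers that sent a wait request */
  unsigned int n_notified; /* local peers told the barrier is crossed */
};

/* Upward phase: a controller may report to its parent only when its own
   quorum is met and every subcontroller has reported as well. */
bool
model_reached (const struct model_controller *c)
{
  assert (NULL != c);
  if (c->n_waiting < c->quorum)
    return false;
  for (unsigned int i = 0; i < c->n_children; i++)
    if (! model_reached (c->children[i]))
      return false;
  return true;
}

/* Downward phase: the echoed status triggers the notifications registered
   by waiting peers at each controller. Returns the total number of peers
   notified in the subtree rooted at c. */
unsigned int
model_cross (struct model_controller *c)
{
  unsigned int total;

  assert (NULL != c);
  c->n_notified = c->n_waiting;
  total = c->n_notified;
  for (unsigned int i = 0; i < c->n_children; i++)
    total += model_cross (c->children[i]);
  return total;
}

/* Driver side: echo the status downward only after the upward phase has
   completed at the master controller. Returns the peers notified. */
unsigned int
model_driver_step (struct model_controller *master)
{
  if (! model_reached (master))
    return 0;
  return model_cross (master);
}
```

As long as any subcontroller is short of its quorum, model_driver_step() notifies nobody, which mirrors why no peer is released before the whole hierarchy has reported.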