author | Sree Harsha Totakura <totakura@in.tum.de> | 2013-09-13 13:01:09 +0000 |
---|---|---|
committer | Sree Harsha Totakura <totakura@in.tum.de> | 2013-09-13 13:01:09 +0000 |
commit | b218a20819932c4c594ea2933ff3f4f28b398325 (patch) | |
tree | cda48d07bf2050bfcd1012d40093f9f4fbf8794f /src/testbed | |
parent | 30af0b72dd40084d2fa48cffe5e49521b46b4136 (diff) | |
- doc
Diffstat (limited to 'src/testbed')
-rw-r--r-- | src/testbed/barriers.README.org | 94 |
1 files changed, 59 insertions, 35 deletions
diff --git a/src/testbed/barriers.README.org b/src/testbed/barriers.README.org index 4547009e2..ed39903c0 100644 --- a/src/testbed/barriers.README.org +++ b/src/testbed/barriers.README.org | |||
@@ -1,14 +1,15 @@ | |||
1 | * Description | 1 | * Description |
2 | The barriers component of testbed facilitates coordination among the peers run | 2 | The testbed's barriers API facilitates coordination among the peers run by the |
3 | by the testbed and the experiment driver. The concept is similar to the barrier | 3 | testbed and the experiment driver. The concept is similar to the barrier |
4 | synchronisation mechanism found in parallel programming or multithreading | 4 | synchronisation mechanism found in parallel programming or multithreading |
5 | paradigms - a peer waits at a barrier upon reaching it until it is crossed i.e, | 5 | paradigms - a peer waits at a barrier upon reaching it until the barrier is |
6 | reached by a predefined number of peers. This predefined number peers required | 6 | crossed i.e, the barrier is reached by a predefined number of peers. This |
7 | to cross a barrier is also called quorum. | 7 | predefined number peers required to cross a barrier is also called quorum. We |
8 | say a peer has reached a barrier if the peer is waiting for the barrier to be | ||
9 | crossed. Similarly a barrier is said to be reached if the required quorum of | ||
10 | peers reach the barrier. | ||
8 | 11 | ||
9 | Coordination among the peers and the experiment driver is achieved through the | 12 | The barriers API provides the following functions: |
10 | barriers service and its respective barriers API. The barriers API provides the | ||
11 | following functions: | ||
12 | 13 | ||
13 | 1) barrier_init(): function to initialse a barrier in the experiment | 14 | 1) barrier_init(): function to initialse a barrier in the experiment |
14 | 2) barrier_cancel(): function to cancel a barrier which has been initialised | 15 | 2) barrier_cancel(): function to cancel a barrier which has been initialised |
@@ -20,28 +21,30 @@ following functions: | |||
20 | Among the above functions, the first two, namely barrier_init() and | 21 | Among the above functions, the first two, namely barrier_init() and |
21 | barrier_cacel() are used by experiment drivers. All barriers should be | 22 | barrier_cacel() are used by experiment drivers. All barriers should be |
22 | initialised by the experiment driver by calling barrier_init(). This function | 23 | initialised by the experiment driver by calling barrier_init(). This function |
23 | takes a name to identify the barrier and a notification callback for notifying | 24 | takes a name to identify the barrier, the quorum required for the barrier to be |
24 | the experiment driver when the barrier is crossed. The function | 25 | crossed and a notification callback for notifying the experiment driver when the |
25 | barrier_cancel() cancels an initialised barrier and frees the resources | 26 | barrier is crossed. The function barrier_cancel() cancels an initialised |
26 | allocated for it. This function can be called upon a initialised barrier before | 27 | barrier and frees the resources allocated for it. This function can be called |
27 | it is crossed. | 28 | upon a initialised barrier before it is crossed. |
28 | 29 | ||
29 | The remaining two functions barrier_wait() and barrier_wait_cancel() are used in | 30 | The remaining two functions barrier_wait() and barrier_wait_cancel() are used in |
30 | the peer's processes. barrier_wait() connects to the local barrier service and | 31 | the peer's processes. barrier_wait() connects to the local barrier service |
31 | registers that the caller has reached the barrier and is waiting for the barrier | 32 | running on the same host the peer is running on and registers that the caller |
32 | to be crossed. Note that this function can only be used by peers which are | 33 | has reached the barrier and is waiting for the barrier to be crossed. Note that |
33 | started by testbed as this function tries to access the local barrier service | 34 | this function can only be used by peers which are started by testbed as this |
34 | which is part of the testbed controller service. Calling barrier_wait() on an | 35 | function tries to access the local barrier service which is part of the testbed |
35 | uninitialised barrier (or not-yet-initialised) barrier results in failure. | 36 | controller service. Calling barrier_wait() on an uninitialised barrier barrier |
36 | barrier_wait_cancel() cancels the notification registered by barrier_wait(). | 37 | results in failure. barrier_wait_cancel() cancels the notification registered |
38 | by barrier_wait(). | ||
37 | 39 | ||
38 | 40 | ||
39 | * Implementation | 41 | * Implementation |
40 | Since barriers involve coordination between experiment driver and peers the | 42 | Since barriers involve coordination between experiment driver and peers, the |
41 | barrier service is split into two components. The first component responds to | 43 | barrier service in the testbed controller is split into two components. The |
42 | the barrier API used by the experiment driver (functions barrier_init() and | 44 | first component responds to the message generated by the barrier API used by the |
43 | barrier_cancel()) and the second component to the barrier API used by peers | 45 | experiment driver (functions barrier_init() and barrier_cancel()) and the second |
44 | (functions barrier_wait() and barrier_wait_cancel()) | 46 | component to the messages generated by barrier API used by peers (functions |
47 | barrier_wait() and barrier_wait_cancel()) | ||
45 | 48 | ||
46 | Calling barrier_init() sends a BARRIER_INIT message to the master controller. | 49 | Calling barrier_init() sends a BARRIER_INIT message to the master controller. |
47 | The master controller then registers a barrier and calls barrier_init() for each | 50 | The master controller then registers a barrier and calls barrier_init() for each |
@@ -53,13 +56,34 @@ hierarchy back to the experiment driver. | |||
53 | Similar to barrier_init(), barrier_cancel() propagates BARRIER_CANCEL message | 56 | Similar to barrier_init(), barrier_cancel() propagates BARRIER_CANCEL message |
54 | which causes controllers to remove an initialised barrier. | 57 | which causes controllers to remove an initialised barrier. |
55 | 58 | ||
56 | The second component, according to gnunet architecture, is actually an another | 59 | The second component is implemented as a separate service in the binary |
57 | service but runs in the same binary `gnunet-service-testbed'; the reason is | 60 | `gnunet-service-testbed' which already has the testbed controller service. |
58 | that it requires access to barrier data created by the first component. This | 61 | Although this deviates from the gnunet process architecture of having one |
59 | component responds to BARRIER_WAIT messages from local peers when they call | 62 | service per binary, it is needed in this case as this component needs access to |
60 | barrier_wait(). Upon receiving BARRIER_WAIT message, the service checks if the | 63 | barrier data created by the first component. This component responds to |
61 | requested barrier has been initialised before and it was not initialised the | 64 | BARRIER_WAIT messages from local peers when they call barrier_wait(). Upon |
62 | an error status is sent through BARRIER_STATUS message to the local peer and the | 65 | receiving BARRIER_WAIT message, the service checks if the requested barrier has |
63 | connection from the peer is terminated. If the barrier is initialised before, | 66 | been initialised before and if it was not initialised, an error status is sent |
64 | the barrier's counter for reached peers is incremented and a notification is | 67 | through BARRIER_STATUS message to the local peer and the connection from the |
65 | registered to notify this peer when the barrier is reached. | 68 | peer is terminated. If the barrier is initialised before, the barrier's counter |
69 | for reached peers is incremented and a notification is registered to notify the | ||
70 | peer when the barrier is reached. The connection from the peer is left open. | ||
71 | |||
72 | When enough peers required to attain the quorum send BARRIER_WAIT messages, the | ||
73 | controller sends a BARRIER_STATUS message to its parent informing that the | ||
74 | barrier is crossed. If the controller has started further subcontrollers, it | ||
75 | delays this message until it receives a notification from each of those | ||
76 | subcontrollers that the barrier is crossed. Finally, the barriers API at the | ||
77 | experiment driver receives the BARRIER_STATUS when the barrier is reached at all | ||
78 | the controllers. | ||
79 | |||
80 | The barriers API at the experiment driver responds to the BARRIER_STATUS message | ||
81 | by echoing it back to the master controller and notifying the experiment | ||
82 | controller through the notification callback that a barrier has been crossed. | ||
83 | The echoed BARRIER_STATUS message is propagated by the master controller to the | ||
84 | controller hierarchy. This progation triggers the notifications registered by | ||
85 | peers at each of the controllers in the hierarchy. Note the difference between | ||
86 | this downward propagation of the BARRIER_STATUS message from its upward | ||
87 | propagation -- the upward propagation is needed for ensuring that the barrier is | ||
88 | reached by all the controllers and the downward propagation is for triggering | ||
89 | that the barrier is crossed. | ||
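
The upward and downward propagation of BARRIER_STATUS described in the new README text can be illustrated with a small in-memory model. This is only a sketch, not GNUnet code: the `Controller` class, `run_driver()`, the method names, and the string return values are hypothetical, messages are modelled as direct method calls, and each controller is assumed to have its own local quorum count (the README does not say how the quorum is divided among controllers).

```python
class Controller:
    """Hypothetical model of one testbed controller in the hierarchy."""

    def __init__(self, name, local_quorum, parent=None):
        self.name = name
        self.local_quorum = local_quorum  # assumption: per-controller quorum
        self.parent = parent
        self.children = []
        self.initialised = False
        self.waiting = []           # peers whose BARRIER_WAIT is pending
        self.children_done = set()  # subcontrollers that reported upward
        self.reported = False       # upward BARRIER_STATUS already sent?
        self.released = []          # peers notified that the barrier is crossed
        self.on_master_status = None  # set on the master by the driver
        if parent is not None:
            parent.children.append(self)

    def barrier_init(self):
        # BARRIER_INIT travels down the controller hierarchy.
        self.initialised = True
        for child in self.children:
            child.barrier_init()

    def barrier_wait(self, peer):
        # Waiting on an uninitialised barrier is an error.
        if not self.initialised:
            return "ERROR"
        self.waiting.append(peer)
        self._maybe_report()
        return "WAITING"

    def _maybe_report(self):
        # Upward propagation: report only when the local quorum is met
        # and every subcontroller has already reported.
        if self.reported:
            return
        if len(self.waiting) < self.local_quorum:
            return
        if len(self.children_done) < len(self.children):
            return
        self.reported = True
        if self.parent is not None:
            self.parent._child_status(self)
        elif self.on_master_status is not None:
            self.on_master_status()

    def _child_status(self, child):
        self.children_done.add(child.name)
        self._maybe_report()

    def barrier_crossed(self):
        # Downward propagation: trigger the notifications of waiting peers.
        self.released.extend(self.waiting)
        self.waiting.clear()
        for child in self.children:
            child.barrier_crossed()


def run_driver(master, crossed_callback):
    """Hypothetical driver side: initialise the barrier, echo the status."""
    def on_status():
        crossed_callback()        # notify the experiment driver
        master.barrier_crossed()  # echo BARRIER_STATUS back down
    master.on_master_status = on_status
    master.barrier_init()


if __name__ == "__main__":
    master = Controller("master", local_quorum=1)
    sub = Controller("sub", local_quorum=2, parent=master)
    run_driver(master, lambda: print("barrier crossed"))
    master.barrier_wait("p0")
    sub.barrier_wait("p1")
    sub.barrier_wait("p2")  # last missing peer; the callback fires here
```

In this model no peer is released when its own controller attains the local quorum. Release happens only after the master has collected the status from the whole hierarchy and the driver has echoed it back, which is the distinction the README draws between the two directions of propagation.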