author Sree Harsha Totakura <totakura@in.tum.de> 2013-09-13 13:01:09 +0000
committer Sree Harsha Totakura <totakura@in.tum.de> 2013-09-13 13:01:09 +0000
commit b218a20819932c4c594ea2933ff3f4f28b398325
tree cda48d07bf2050bfcd1012d40093f9f4fbf8794f /src/testbed
parent 30af0b72dd40084d2fa48cffe5e49521b46b4136
- doc
Diffstat (limited to 'src/testbed')
-rw-r--r-- src/testbed/barriers.README.org 94
1 files changed, 59 insertions, 35 deletions
diff --git a/src/testbed/barriers.README.org b/src/testbed/barriers.README.org
index 4547009e2..ed39903c0 100644
--- a/src/testbed/barriers.README.org
+++ b/src/testbed/barriers.README.org
@@ -1,14 +1,15 @@
 * Description
-The barriers component of testbed facilitates coordination among the peers run
-by the testbed and the experiment driver. The concept is similar to the barrier
+The testbed's barriers API facilitates coordination among the peers run by the
+testbed and the experiment driver. The concept is similar to the barrier
 synchronisation mechanism found in parallel programming or multithreading
-paradigms - a peer waits at a barrier upon reaching it until it is crossed i.e,
-reached by a predefined number of peers. This predefined number peers required
-to cross a barrier is also called quorum.
+paradigms - a peer waits at a barrier upon reaching it until the barrier is
+crossed i.e, the barrier is reached by a predefined number of peers. This
+predefined number peers required to cross a barrier is also called quorum. We
+say a peer has reached a barrier if the peer is waiting for the barrier to be
+crossed. Similarly a barrier is said to be reached if the required quorum of
+peers reach the barrier.
 
-Coordination among the peers and the experiment driver is achieved through the
-barriers service and its respective barriers API. The barriers API provides the
-following functions:
+The barriers API provides the following functions:
 
 1) barrier_init(): function to initialse a barrier in the experiment
 2) barrier_cancel(): function to cancel a barrier which has been initialised
@@ -20,28 +21,30 @@ following functions:
 Among the above functions, the first two, namely barrier_init() and
 barrier_cacel() are used by experiment drivers. All barriers should be
 initialised by the experiment driver by calling barrier_init(). This function
-takes a name to identify the barrier and a notification callback for notifying
-the experiment driver when the barrier is crossed. The function
-barrier_cancel() cancels an initialised barrier and frees the resources
-allocated for it. This function can be called upon a initialised barrier before
-it is crossed.
+takes a name to identify the barrier, the quorum required for the barrier to be
+crossed and a notification callback for notifying the experiment driver when the
+barrier is crossed. The function barrier_cancel() cancels an initialised
+barrier and frees the resources allocated for it. This function can be called
+upon a initialised barrier before it is crossed.
 
 The remaining two functions barrier_wait() and barrier_wait_cancel() are used in
-the peer's processes. barrier_wait() connects to the local barrier service and
-registers that the caller has reached the barrier and is waiting for the barrier
-to be crossed. Note that this function can only be used by peers which are
-started by testbed as this function tries to access the local barrier service
-which is part of the testbed controller service. Calling barrier_wait() on an
-uninitialised barrier (or not-yet-initialised) barrier results in failure.
-barrier_wait_cancel() cancels the notification registered by barrier_wait().
+the peer's processes. barrier_wait() connects to the local barrier service
+running on the same host the peer is running on and registers that the caller
+has reached the barrier and is waiting for the barrier to be crossed. Note that
+this function can only be used by peers which are started by testbed as this
+function tries to access the local barrier service which is part of the testbed
+controller service. Calling barrier_wait() on an uninitialised barrier barrier
+results in failure. barrier_wait_cancel() cancels the notification registered
+by barrier_wait().
 
 
 * Implementation
-Since barriers involve coordination between experiment driver and peers the
-barrier service is split into two components. The first component responds to
-the barrier API used by the experiment driver (functions barrier_init() and
-barrier_cancel()) and the second component to the barrier API used by peers
-(functions barrier_wait() and barrier_wait_cancel())
+Since barriers involve coordination between experiment driver and peers, the
+barrier service in the testbed controller is split into two components. The
+first component responds to the message generated by the barrier API used by the
+experiment driver (functions barrier_init() and barrier_cancel()) and the second
+component to the messages generated by barrier API used by peers (functions
+barrier_wait() and barrier_wait_cancel())
 
 Calling barrier_init() sends a BARRIER_INIT message to the master controller.
 The master controller then registers a barrier and calls barrier_init() for each
@@ -53,13 +56,34 @@ hierarchy back to the experiment driver.
 Similar to barrier_init(), barrier_cancel() propagates BARRIER_CANCEL message
 which causes controllers to remove an initialised barrier.
 
-The second component, according to gnunet architecture, is actually an another
-service but runs in the same binary `gnunet-service-testbed'; the reason is
-that it requires access to barrier data created by the first component. This
-component responds to BARRIER_WAIT messages from local peers when they call
-barrier_wait(). Upon receiving BARRIER_WAIT message, the service checks if the
-requested barrier has been initialised before and it was not initialised the
-an error status is sent through BARRIER_STATUS message to the local peer and the
-connection from the peer is terminated. If the barrier is initialised before,
-the barrier's counter for reached peers is incremented and a notification is
-registered to notify this peer when the barrier is reached.
+The second component is implemented as a separate service in the binary
+`gnunet-service-testbed' which already has the testbed controller service.
+Although this deviates from the gnunet process architecture of having one
+service per binary, it is needed in this case as this component needs access to
+barrier data created by the first component. This component responds to
+BARRIER_WAIT messages from local peers when they call barrier_wait(). Upon
+receiving BARRIER_WAIT message, the service checks if the requested barrier has
+been initialised before and if it was not initialised, an error status is sent
+through BARRIER_STATUS message to the local peer and the connection from the
+peer is terminated. If the barrier is initialised before, the barrier's counter
+for reached peers is incremented and a notification is registered to notify the
+peer when the barrier is reached. The connection from the peer is left open.
+
+When enough peers required to attain the quorum send BARRIER_WAIT messages, the
+controller sends a BARRIER_STATUS message to its parent informing that the
+barrier is crossed. If the controller has started further subcontrollers, it
+delays this message until it receives a notification from each of those
+subcontrollers that the barrier is crossed. Finally, the barriers API at the
+experiment driver receives the BARRIER_STATUS when the barrier is reached at all
+the controllers.
+
+The barriers API at the experiment driver responds to the BARRIER_STATUS message
+by echoing it back to the master controller and notifying the experiment
+controller through the notification callback that a barrier has been crossed.
+The echoed BARRIER_STATUS message is propagated by the master controller to the
+controller hierarchy. This progation triggers the notifications registered by
+peers at each of the controllers in the hierarchy. Note the difference between
+this downward propagation of the BARRIER_STATUS message from its upward
+propagation -- the upward propagation is needed for ensuring that the barrier is
+reached by all the controllers and the downward propagation is for triggering
+that the barrier is crossed.
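
The upward and downward BARRIER_STATUS propagation described in the new README text can be illustrated with a small toy model. This is only a sketch in Python and not the GNUnet implementation: the Controller class, its methods and the demo() helper are hypothetical names introduced for illustration, messages are modelled as plain method calls, and the quorum is assumed to be a per-controller count of local peers (the README does not say how the quorum is divided among controllers).

```python
class Controller:
    """Toy model of one controller's state for a single barrier (hypothetical)."""

    def __init__(self, name, quorum, parent=None):
        self.name = name
        self.quorum = quorum            # assumption: local peers needed here
        self.parent = parent
        self.children = []
        self.waiting = []               # notifications registered by peers
        self.crossed_children = set()   # subcontrollers that reported upward
        self.reported = False
        self.on_master_status = None    # set on the master controller only
        if parent is not None:
            parent.children.append(self)

    def barrier_wait(self, notify):
        """Model a BARRIER_WAIT from a local peer: count it, keep its callback."""
        self.waiting.append(notify)
        self._maybe_report_up()

    def _maybe_report_up(self):
        """Send the upward status once the local quorum and all children are done."""
        if self.reported or len(self.waiting) < self.quorum:
            return
        if len(self.crossed_children) < len(self.children):
            return                      # delay until every subcontroller reports
        self.reported = True
        if self.parent is not None:
            self.parent.crossed_children.add(self)
            self.parent._maybe_report_up()
        elif self.on_master_status is not None:
            self.on_master_status()     # status reaches the experiment driver

    def propagate_down(self):
        """Model the echoed status: release local peers, then forward it down."""
        for notify in self.waiting:
            notify()
        self.waiting = []
        for child in self.children:
            child.propagate_down()


def demo():
    events = []
    master = Controller("master", quorum=1)
    sub_a = Controller("sub_a", quorum=2, parent=master)
    sub_b = Controller("sub_b", quorum=1, parent=master)

    def driver_status():
        # The driver notifies the experiment and echoes the status back down.
        events.append("driver: barrier crossed")
        master.propagate_down()

    master.on_master_status = driver_status

    def peer(name):
        return lambda: events.append(name + ": released")

    master.barrier_wait(peer("p0"))
    sub_a.barrier_wait(peer("p1"))
    sub_b.barrier_wait(peer("p2"))
    assert events == []                 # sub_a has not reached its quorum yet
    sub_a.barrier_wait(peer("p3"))      # last missing peer arrives
    return events


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In this model no peer is released while any controller is still short of its quorum, which is the role the README assigns to the upward pass. The release itself happens only during the downward pass started by the driver's echo.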