Diffstat (limited to 'src/testbed/barriers.README.org')
-rw-r--r-- src/testbed/barriers.README.org 95
1 files changed, 0 insertions, 95 deletions
diff --git a/src/testbed/barriers.README.org b/src/testbed/barriers.README.org
deleted file mode 100644
index 159e1c355..000000000
--- a/src/testbed/barriers.README.org
+++ /dev/null
@@ -1,95 +0,0 @@
* Description
The testbed subsystem's barriers API facilitates coordination among the peers
run by the testbed and the experiment driver. The concept is similar to the
barrier synchronisation mechanism found in parallel programming and
multi-threading paradigms: a peer that reaches a barrier waits there until a
predefined number of peers have reached it. This predefined number of peers
required to cross a barrier is called the quorum. We say a peer has reached a
barrier if the peer is waiting for the barrier to be crossed. Similarly, a
barrier is said to be reached if the required quorum of peers has reached it.
A barrier which is reached is deemed crossed once all the peers waiting on it
have been notified.
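
The distinction between a peer reaching a barrier, the barrier being reached,
and the barrier being crossed can be sketched as a small state model. This is
only an illustration of the definitions above: the struct and the
barrier_model_* functions are hypothetical names and are not part of the
testbed API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a single barrier; not a GNUnet data structure. */
struct barrier_model
{
  unsigned int quorum;    /* peers required before the barrier is reached */
  unsigned int nreached;  /* peers currently waiting at the barrier */
  unsigned int nnotified; /* waiting peers already told about the crossing */
};

/* A peer reaches the barrier and starts waiting on it. */
static void
barrier_model_arrive (struct barrier_model *b)
{
  assert (b != NULL);
  b->nreached++;
}

/* The barrier is reached once the quorum of peers is waiting on it. */
static bool
barrier_model_is_reached (const struct barrier_model *b)
{
  return b->nreached >= b->quorum;
}

/* Notify one waiting peer. Returns false if the barrier is not yet
   reached or if every waiting peer has already been notified. */
static bool
barrier_model_notify_one (struct barrier_model *b)
{
  if (! barrier_model_is_reached (b) || b->nnotified >= b->nreached)
    return false;
  b->nnotified++;
  return true;
}

/* The barrier is crossed once it is reached and all waiting peers have
   been notified. */
static bool
barrier_model_is_crossed (const struct barrier_model *b)
{
  return barrier_model_is_reached (b) && b->nnotified == b->nreached;
}
```

With a quorum of 2, the model reports the barrier as reached after the second
arrival and as crossed only after both waiting peers have been notified.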

The barriers API provides the following functions:
1) GNUNET_TESTBED_barrier_init(): function to initialise a barrier in the
   experiment
2) GNUNET_TESTBED_barrier_cancel(): function to cancel a barrier which has been
   initialised before
3) GNUNET_TESTBED_barrier_wait(): function to signal the barrier service that
   the caller has reached a barrier and is waiting for it to be crossed
4) GNUNET_TESTBED_barrier_wait_cancel(): function to stop waiting for a barrier
   to be crossed

Among the above functions, the first two, namely GNUNET_TESTBED_barrier_init()
and GNUNET_TESTBED_barrier_cancel(), are used by experiment drivers. All
barriers should be initialised by the experiment driver by calling
GNUNET_TESTBED_barrier_init(). This function takes a name to identify the
barrier, the quorum required for the barrier to be crossed, and a notification
callback for notifying the experiment driver when the barrier is crossed. The
function GNUNET_TESTBED_barrier_cancel() cancels an initialised barrier and
frees the resources allocated for it. This function can be called on an
initialised barrier before it is crossed.
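
The driver-side life cycle described above (initialise with a name, a quorum
and a callback; optionally cancel before the barrier is crossed) might be
modelled as follows. This is a simplified sketch and does not reproduce the
real GNUNET_TESTBED_barrier_init() signature: the struct, the driver_barrier_*
functions and the count_crossings() callback are all hypothetical. Refusing a
cancel after the crossing is an assumption drawn from the text, not confirmed
behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: invoked when the barrier is crossed. */
typedef void (*crossed_cb) (void *cls, const char *name);

/* Hypothetical driver-side handle for one barrier. */
struct driver_barrier
{
  const char *name;    /* identifies the barrier */
  unsigned int quorum; /* peers required to cross it */
  crossed_cb cb;       /* notification callback */
  void *cls;           /* closure passed to the callback */
  bool active;         /* initialised and not cancelled */
  bool crossed;        /* crossing already reported */
};

/* Example callback: counts how often a crossing was reported. */
static void
count_crossings (void *cls, const char *name)
{
  (void) name;
  (*(unsigned int *) cls)++;
}

/* Initialise a barrier with its name, quorum and callback. */
static void
driver_barrier_init (struct driver_barrier *b, const char *name,
                     unsigned int quorum, crossed_cb cb, void *cls)
{
  assert (b != NULL && cb != NULL);
  b->name = name;
  b->quorum = quorum;
  b->cb = cb;
  b->cls = cls;
  b->active = true;
  b->crossed = false;
}

/* Cancel an initialised barrier; only permitted before it is crossed. */
static bool
driver_barrier_cancel (struct driver_barrier *b)
{
  if (! b->active || b->crossed)
    return false;
  b->active = false;
  return true;
}

/* Called when the crossing is reported to the driver. */
static void
driver_barrier_on_crossed (struct driver_barrier *b)
{
  if (! b->active || b->crossed)
    return; /* cancelled or already reported: no callback */
  b->crossed = true;
  b->cb (b->cls, b->name);
}
```

In this model a cancelled barrier never invokes its callback, and a barrier
that has already been crossed can no longer be cancelled.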

The remaining two functions, GNUNET_TESTBED_barrier_wait() and
GNUNET_TESTBED_barrier_wait_cancel(), are used in the peers' processes.
GNUNET_TESTBED_barrier_wait() connects to the local barrier service running on
the same host as the peer and registers that the caller has reached the barrier
and is waiting for it to be crossed. Note that this function can only be used
by peers which are started by testbed, as it tries to access the local barrier
service, which is part of the testbed controller service. Calling
GNUNET_TESTBED_barrier_wait() on an uninitialised barrier results in failure.
GNUNET_TESTBED_barrier_wait_cancel() cancels the notification registered by
GNUNET_TESTBED_barrier_wait().


* Implementation
Since barriers involve coordination between the experiment driver and peers,
the barrier service in the testbed controller is split into two components. The
first component responds to the messages generated by the barrier API used by
the experiment driver (functions GNUNET_TESTBED_barrier_init() and
GNUNET_TESTBED_barrier_cancel()) and the second component to the messages
generated by the barrier API used by peers (functions
GNUNET_TESTBED_barrier_wait() and GNUNET_TESTBED_barrier_wait_cancel()).

Calling GNUNET_TESTBED_barrier_init() sends a BARRIER_INIT message to the master
controller. The master controller then registers a barrier and calls
GNUNET_TESTBED_barrier_init() for each of its subcontrollers. In this way
barrier initialisation is propagated down the controller hierarchy. While
propagating initialisation, any errors at a subcontroller, such as a timeout
during further propagation, are reported up the hierarchy back to the
experiment driver.
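
The propagation of initialisation and the upward reporting of errors can be
pictured as a recursive walk over the controller tree. The sketch below is a
synchronous simplification of what is really an exchange of messages between
processes; struct controller_model, its fail_init flag and propagate_init()
are hypothetical, and cleaning up barriers that were registered before a
failure is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SUBCONTROLLERS 4

/* Hypothetical controller node in the hierarchy. */
struct controller_model
{
  const char *name;
  bool fail_init;          /* simulate an error such as a timeout here */
  bool barrier_registered; /* barrier has been registered at this node */
  struct controller_model *subs[MAX_SUBCONTROLLERS];
  size_t nsubs;
};

/* Register the barrier at this controller, then at each subcontroller.
   Returns NULL on success, or the controller at which initialisation
   failed, so that the failure travels back up to the caller. */
static const struct controller_model *
propagate_init (struct controller_model *c)
{
  assert (c != NULL);
  if (c->fail_init)
    return c;
  c->barrier_registered = true;
  for (size_t i = 0; i < c->nsubs; i++)
  {
    const struct controller_model *failed = propagate_init (c->subs[i]);

    if (failed != NULL)
      return failed; /* report the error up the hierarchy */
  }
  return NULL;
}
```

Calling propagate_init() on the master controller returns NULL only if every
controller below it registered the barrier; otherwise the caller, standing in
for the experiment driver, learns which controller failed.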

Similar to GNUNET_TESTBED_barrier_init(), GNUNET_TESTBED_barrier_cancel()
propagates a BARRIER_CANCEL message, which causes controllers to remove an
initialised barrier.

The second component is implemented as a separate service in the binary
`gnunet-service-testbed', which already contains the testbed controller service.
Although this deviates from the GNUnet process architecture of having one
service per binary, it is needed in this case as this component needs access to
barrier data created by the first component. This component responds to
BARRIER_WAIT messages from local peers when they call
GNUNET_TESTBED_barrier_wait(). Upon receiving a BARRIER_WAIT message, the
service checks whether the requested barrier has been initialised. If it has
not, an error status is sent through a BARRIER_STATUS message to the local peer
and the connection from the peer is terminated. If the barrier has been
initialised, the barrier's counter for reached peers is incremented and a
notification is registered to notify the peer when the barrier is reached. The
connection from the peer is left open.
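
The handling of a BARRIER_WAIT message described above amounts to a lookup
followed by either an error or a counter increment. The following sketch
assumes a small fixed-size table keyed by barrier name; struct barrier_table,
the table_* functions, handle_wait() and enum wait_result are hypothetical and
only mirror the decision logic, not the service's actual code or message
formats.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BARRIERS 8

/* Hypothetical outcome of processing one BARRIER_WAIT message. */
enum wait_result
{
  WAIT_REGISTERED,         /* counter incremented, connection kept open */
  WAIT_ERROR_UNINITIALISED /* error status sent, connection terminated */
};

struct barrier_entry
{
  const char *name;
  unsigned int quorum;
  unsigned int nreached; /* peers that have reached this barrier */
};

/* Barrier data shared by both components of the service. */
struct barrier_table
{
  struct barrier_entry entries[MAX_BARRIERS];
  size_t count;
};

/* Find an initialised barrier by name, or return NULL. */
static struct barrier_entry *
table_lookup (struct barrier_table *t, const char *name)
{
  assert (t != NULL && name != NULL);
  for (size_t i = 0; i < t->count; i++)
    if (0 == strcmp (t->entries[i].name, name))
      return &t->entries[i];
  return NULL;
}

/* First component: register a barrier. Returns 0 on success, -1 if the
   table is full or the name is already in use. */
static int
table_init_barrier (struct barrier_table *t, const char *name,
                    unsigned int quorum)
{
  if (t->count >= MAX_BARRIERS || NULL != table_lookup (t, name))
    return -1;
  t->entries[t->count].name = name;
  t->entries[t->count].quorum = quorum;
  t->entries[t->count].nreached = 0;
  t->count++;
  return 0;
}

/* Second component: process a BARRIER_WAIT for the named barrier. */
static enum wait_result
handle_wait (struct barrier_table *t, const char *name)
{
  struct barrier_entry *e = table_lookup (t, name);

  if (NULL == e)
    return WAIT_ERROR_UNINITIALISED;
  e->nreached++;
  return WAIT_REGISTERED;
}
```

Sharing one table between table_init_barrier() and handle_wait() reflects why
both components live in the same binary: the second needs the barrier data
created by the first.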

When enough peers to attain the quorum have sent BARRIER_WAIT messages, the
controller sends a BARRIER_STATUS message to its parent informing it that the
barrier is crossed. If the controller has started further subcontrollers, it
delays this message until it receives a similar notification from each of those
subcontrollers. Finally, the barriers API at the experiment driver receives the
BARRIER_STATUS message when the barrier is reached at all the controllers.
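
The rule for when a controller reports upward can be sketched as a condition
on two counters. The text does not say how the quorum is divided among
controllers, so the per-controller local_quorum field below is an assumption
made for illustration; struct ctrl and its functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-controller view of one barrier. */
struct ctrl
{
  unsigned int local_quorum;  /* assumed: waits needed at this controller */
  unsigned int local_reached; /* BARRIER_WAIT messages received locally */
  unsigned int nsubs;         /* subcontrollers started by this controller */
  unsigned int subs_reported; /* subcontrollers that have reported upward */
  bool reported_up;           /* BARRIER_STATUS already sent to the parent */
  struct ctrl *parent;        /* NULL for the master controller */
};

/* A controller may report once its local quorum is attained and every
   subcontroller it started has reported. */
static bool
ready_to_report (const struct ctrl *c)
{
  return c->local_reached >= c->local_quorum && c->subs_reported == c->nsubs;
}

/* Send the status upward at most once, then let the parent re-check. */
static void
maybe_report_up (struct ctrl *c)
{
  assert (c != NULL);
  if (c->reported_up || ! ready_to_report (c))
    return;
  c->reported_up = true;
  if (NULL != c->parent)
  {
    c->parent->subs_reported++;
    maybe_report_up (c->parent);
  }
}

/* A local peer sent BARRIER_WAIT to this controller. */
static void
on_barrier_wait (struct ctrl *c)
{
  c->local_reached++;
  maybe_report_up (c);
}
```

For the master controller, which has no parent, reported_up stands for the
BARRIER_STATUS message being delivered to the experiment driver.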

The barriers API at the experiment driver responds to the BARRIER_STATUS message
by echoing it back to the master controller and notifying the experiment
driver through the notification callback that a barrier has been crossed.
The echoed BARRIER_STATUS message is propagated by the master controller down
the controller hierarchy. This propagation triggers the notifications registered
by peers at each of the controllers in the hierarchy. Note the difference
between the downward propagation of the BARRIER_STATUS message and its upward
propagation: the upward propagation ensures that the barrier is reached at all
the controllers, and the downward propagation signals to the waiting peers that
the barrier is crossed.
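
The downward pass can be sketched as a second tree walk that releases the
peers waiting at each controller. As with the other sketches, this is a
synchronous stand-in for message passing, and struct node and
propagate_crossed() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUBS 4

/* Hypothetical controller node holding peers that wait on one barrier. */
struct node
{
  unsigned int waiting;  /* peers with a registered notification */
  unsigned int released; /* peers already notified of the crossing */
  struct node *subs[MAX_SUBS];
  size_t nsubs;
};

/* Deliver the echoed BARRIER_STATUS to this controller and everything
   below it. Returns the number of peers notified in the subtree. */
static unsigned int
propagate_crossed (struct node *n)
{
  unsigned int total;

  assert (n != NULL);
  total = n->waiting;
  n->released += n->waiting; /* trigger the registered notifications */
  n->waiting = 0;
  for (size_t i = 0; i < n->nsubs; i++)
    total += propagate_crossed (n->subs[i]);
  return total;
}
```

Running the walk a second time notifies nobody, since no peer is left waiting
once the barrier has been crossed.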