author Sree Harsha Totakura <totakura@in.tum.de> 2013-09-16 14:05:48 +0000
committer Sree Harsha Totakura <totakura@in.tum.de> 2013-09-16 14:05:48 +0000
commit 77139316eeecda84e45710d4dfd7c74dcff47942
tree 79902e46fb92f29523bbaad730b15679ca31b10f /src/testbed
parent b9a26a3891508c9891ca2ceb142caef2e3b08f31
- fix crash when disconnecting from misbehaving client
Diffstat (limited to 'src/testbed')
-rw-r--r-- src/testbed/barriers.README.org 118
-rw-r--r-- src/testbed/gnunet-service-testbed_barriers.c 10
-rw-r--r-- src/testbed/test_testbed_api_barriers.conf 2
3 files changed, 66 insertions, 64 deletions
diff --git a/src/testbed/barriers.README.org b/src/testbed/barriers.README.org
index ed39903c0..40488a0cc 100644
--- a/src/testbed/barriers.README.org
+++ b/src/testbed/barriers.README.org
@@ -1,87 +1,93 @@
 * Description
-The testbed's barriers API facilitates coordination among the peers run by the
-testbed and the experiment driver. The concept is similar to the barrier
-synchronisation mechanism found in parallel programming or multithreading
-paradigms - a peer waits at a barrier upon reaching it until the barrier is
-crossed i.e, the barrier is reached by a predefined number of peers. This
-predefined number peers required to cross a barrier is also called quorum. We
-say a peer has reached a barrier if the peer is waiting for the barrier to be
-crossed. Similarly a barrier is said to be reached if the required quorum of
-peers reach the barrier.
+The testbed subsystem's barriers API facilitates coordination among the peers
+run by the testbed and the experiment driver. The concept is similar to the
+barrier synchronisation mechanism found in parallel programming or
+multi-threading paradigms - a peer waits at a barrier upon reaching it until the
+barrier is reached by a predefined number of peers. This predefined number of
+peers required to cross a barrier is also called quorum. We say a peer has
+reached a barrier if the peer is waiting for the barrier to be crossed.
+Similarly a barrier is said to be reached if the required quorum of peers reach
+the barrier. A barrier which is reached is deemed as crossed after all the
+peers waiting on it are notified.
 
 The barriers API provides the following functions:
+1) GNUNET_TESTBED_barrier_init(): function to initialse a barrier in the
+   experiment
+2) GNUNET_TESTBED_barrier_cancel(): function to cancel a barrier which has been
+   initialised before
+3) GNUNET_TESTBED_barrier_wait(): function to signal barrier service that the
+   caller has reached a barrier and is waiting for it to be crossed
+4) GNUNET_TESTBED_barrier_wait_cancel(): function to stop waiting for a barrier
+   to be crossed
 
-1) barrier_init(): function to initialse a barrier in the experiment
-2) barrier_cancel(): function to cancel a barrier which has been initialised
-   before
-3) barrier_wait(): function to signal barrier service that the caller has reached
-   a barrier and is waiting for it to be crossed
-4) barrier_wait_cancel(): function to stop waiting for a barrier to be crossed
+Among the above functions, the first two, namely GNUNET_TESTBED_barrier_init()
+and GNUNET_TESTBED_barrier_cacel() are used by experiment drivers. All barriers
+should be initialised by the experiment driver by calling
+GNUNET_TESTBED_barrier_init(). This function takes a name to identify the
+barrier, the quorum required for the barrier to be crossed and a notification
+callback for notifying the experiment driver when the barrier is crossed. The
+GNUNET_TESTBED_function barrier_cancel() cancels an initialised barrier and
+frees the resources allocated for it. This function can be called upon a
+initialised barrier before it is crossed.
 
-Among the above functions, the first two, namely barrier_init() and
-barrier_cacel() are used by experiment drivers. All barriers should be
-initialised by the experiment driver by calling barrier_init(). This function
-takes a name to identify the barrier, the quorum required for the barrier to be
-crossed and a notification callback for notifying the experiment driver when the
-barrier is crossed. The function barrier_cancel() cancels an initialised
-barrier and frees the resources allocated for it. This function can be called
-upon a initialised barrier before it is crossed.
-
-The remaining two functions barrier_wait() and barrier_wait_cancel() are used in
-the peer's processes. barrier_wait() connects to the local barrier service
-running on the same host the peer is running on and registers that the caller
-has reached the barrier and is waiting for the barrier to be crossed. Note that
-this function can only be used by peers which are started by testbed as this
-function tries to access the local barrier service which is part of the testbed
-controller service. Calling barrier_wait() on an uninitialised barrier barrier
-results in failure. barrier_wait_cancel() cancels the notification registered
-by barrier_wait().
+The remaining two functions GNUNET_TESTBED_barrier_wait() and
+GNUNET_TESTBED_barrier_wait_cancel() are used in the peer's processes.
+GNUNET_TESTBED_barrier_wait() connects to the local barrier service running on
+the same host the peer is running on and registers that the caller has reached
+the barrier and is waiting for the barrier to be crossed. Note that this
+function can only be used by peers which are started by testbed as this function
+tries to access the local barrier service which is part of the testbed
+controller service. Calling GNUNET_TESTBED_barrier_wait() on an uninitialised
+barrier results in failure. GNUNET_TESTBED_barrier_wait_cancel() cancels the
+notification registered by GNUNET_TESTBED_barrier_wait().
 
 
 * Implementation
 Since barriers involve coordination between experiment driver and peers, the
 barrier service in the testbed controller is split into two components. The
 first component responds to the message generated by the barrier API used by the
-experiment driver (functions barrier_init() and barrier_cancel()) and the second
-component to the messages generated by barrier API used by peers (functions
-barrier_wait() and barrier_wait_cancel())
+experiment driver (functions GNUNET_TESTBED_barrier_init() and
+GNUNET_TESTBED_barrier_cancel()) and the second component to the messages
+generated by barrier API used by peers (functions GNUNET_TESTBED_barrier_wait()
+and GNUNET_TESTBED_barrier_wait_cancel()).
 
-Calling barrier_init() sends a BARRIER_INIT message to the master controller.
-The master controller then registers a barrier and calls barrier_init() for each
-its subcontrollers. In this way barrier initialisation is propagated to the
-controller hierarchy. While propagating initialisation, any errors at a
-subcontroller such as timeout during further propagation are reported up the
-hierarchy back to the experiment driver.
+Calling GNUNET_TESTBED_barrier_init() sends a BARRIER_INIT message to the master
+controller. The master controller then registers a barrier and calls
+GNUNET_TESTBED_barrier_init() for each its subcontrollers. In this way barrier
+initialisation is propagated to the controller hierarchy. While propagating
+initialisation, any errors at a subcontroller such as timeout during further
+propagation are reported up the hierarchy back to the experiment driver.
 
-Similar to barrier_init(), barrier_cancel() propagates BARRIER_CANCEL message
-which causes controllers to remove an initialised barrier.
+Similar to GNUNET_TESTBED_barrier_init(), GNUNET_TESTBED_barrier_cancel()
+propagates BARRIER_CANCEL message which causes controllers to remove an
+initialised barrier.
 
 The second component is implemented as a separate service in the binary
 `gnunet-service-testbed' which already has the testbed controller service.
 Although this deviates from the gnunet process architecture of having one
 service per binary, it is needed in this case as this component needs access to
 barrier data created by the first component. This component responds to
-BARRIER_WAIT messages from local peers when they call barrier_wait(). Upon
-receiving BARRIER_WAIT message, the service checks if the requested barrier has
-been initialised before and if it was not initialised, an error status is sent
-through BARRIER_STATUS message to the local peer and the connection from the
-peer is terminated. If the barrier is initialised before, the barrier's counter
-for reached peers is incremented and a notification is registered to notify the
-peer when the barrier is reached. The connection from the peer is left open.
+BARRIER_WAIT messages from local peers when they call
+GNUNET_TESTBED_barrier_wait(). Upon receiving BARRIER_WAIT message, the service
+checks if the requested barrier has been initialised before and if it was not
+initialised, an error status is sent through BARRIER_STATUS message to the local
+peer and the connection from the peer is terminated. If the barrier is
+initialised before, the barrier's counter for reached peers is incremented and a
+notification is registered to notify the peer when the barrier is reached. The
+connection from the peer is left open.
 
 When enough peers required to attain the quorum send BARRIER_WAIT messages, the
 controller sends a BARRIER_STATUS message to its parent informing that the
 barrier is crossed. If the controller has started further subcontrollers, it
-delays this message until it receives a notification from each of those
-subcontrollers that the barrier is crossed. Finally, the barriers API at the
-experiment driver receives the BARRIER_STATUS when the barrier is reached at all
-the controllers.
+delays this message until it receives a similar notification from each of those
+subcontrollers. Finally, the barriers API at the experiment driver receives the
+BARRIER_STATUS when the barrier is reached at all the controllers.
 
 The barriers API at the experiment driver responds to the BARRIER_STATUS message
 by echoing it back to the master controller and notifying the experiment
 controller through the notification callback that a barrier has been crossed.
 The echoed BARRIER_STATUS message is propagated by the master controller to the
-controller hierarchy. This progation triggers the notifications registered by
+controller hierarchy. This propagation triggers the notifications registered by
 peers at each of the controllers in the hierarchy. Note the difference between
 this downward propagation of the BARRIER_STATUS message from its upward
 propagation -- the upward propagation is needed for ensuring that the barrier is
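
Reviewer note: the quorum bookkeeping that the README hunk describes (a counter of reached peers compared against the quorum, and failure when waiting on an uninitialised barrier) might look roughly like the sketch below. It is only an illustration under stated assumptions: `struct SketchBarrier`, `sketch_barrier_init()` and `sketch_barrier_wait()` are hypothetical names, not the GNUnet API, and the message passing between controllers is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified barrier record; NOT GNUnet's struct Barrier. */
struct SketchBarrier
{
  const char *name;       /* name identifying the barrier */
  unsigned int quorum;    /* number of peers required to reach the barrier */
  unsigned int nreached;  /* number of peers that have reached it so far */
  bool initialised;       /* set once the experiment driver initialised it */
};

/* Driver side: initialise a barrier with a name and a quorum. */
void
sketch_barrier_init (struct SketchBarrier *b, const char *name,
                     unsigned int quorum)
{
  assert (NULL != b);
  assert (quorum > 0);
  b->name = name;
  b->quorum = quorum;
  b->nreached = 0;
  b->initialised = true;
}

/* Peer side: record that one more peer has reached the barrier.
   Returns -1 if the barrier was never initialised (the failure case),
   1 if the quorum is met once this peer is counted, and 0 otherwise. */
int
sketch_barrier_wait (struct SketchBarrier *b)
{
  if ((NULL == b) || (! b->initialised))
    return -1;
  b->nreached++;
  return (b->nreached >= b->quorum) ? 1 : 0;
}
```

In the service described by the README, a return value of 1 would correspond to the point where the controller reports BARRIER_STATUS to its parent; here it is only a return code.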
diff --git a/src/testbed/gnunet-service-testbed_barriers.c b/src/testbed/gnunet-service-testbed_barriers.c
index 68011065d..5668d03cf 100644
--- a/src/testbed/gnunet-service-testbed_barriers.c
+++ b/src/testbed/gnunet-service-testbed_barriers.c
@@ -360,13 +360,7 @@ remove_barrier (struct Barrier *barrier)
   while (NULL != (ctx = barrier->head))
   {
     GNUNET_CONTAINER_DLL_remove (barrier->head, barrier->tail, ctx);
-    GNUNET_SERVER_client_drop (ctx->client);
-    ctx->client = NULL;
-    if (NULL != ctx->tx)
-    {
-      GNUNET_SERVER_notify_transmit_ready_cancel (ctx->tx);
-      ctx->tx = NULL;
-    }
+    cleanup_clientctx (ctx);
   }
   GNUNET_free (barrier->name);
   GNUNET_SERVER_client_drop (barrier->mc);
@@ -532,6 +526,8 @@ disconnect_cb (void *cls, struct GNUNET_SERVER_Client *client)
   if (NULL == client)
     return;
   client_ctx = GNUNET_SERVER_client_get_user_context (client, struct ClientCtx);
+  if (NULL == client_ctx)
+    return;
   cleanup_clientctx (client_ctx);
 }
 
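
Reviewer note: the guard added to `disconnect_cb()` follows a common pattern, namely that a disconnect handler must tolerate a client that has no user context attached. The standalone sketch below illustrates that pattern only. All names in it (`struct SketchClient`, `struct SketchClientCtx`, `sketch_cleanup_ctx()`, `sketch_disconnect_cb()`) are hypothetical stand-ins, not the GNUnet server API. That a misbehaving client is one which disconnects without ever having a context attached is an assumption drawn from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-client state; NOT GNUnet's struct ClientCtx. */
struct SketchClientCtx
{
  int pending_tx;  /* stands in for a pending transmission handle */
};

/* Hypothetical client handle that may or may not carry a context. */
struct SketchClient
{
  struct SketchClientCtx *user_ctx;
};

/* Release everything owned by a client context. */
void
sketch_cleanup_ctx (struct SketchClientCtx *ctx)
{
  assert (NULL != ctx);  /* callers must check for NULL first */
  ctx->pending_tx = 0;   /* "cancel" the pending transmission */
  free (ctx);
}

/* Disconnect handler; returns 1 if a context was cleaned up, else 0. */
int
sketch_disconnect_cb (struct SketchClient *client)
{
  struct SketchClientCtx *ctx;

  if (NULL == client)
    return 0;
  ctx = client->user_ctx;
  if (NULL == ctx)
    return 0;  /* assumed case: client never had a context attached */
  client->user_ctx = NULL;
  sketch_cleanup_ctx (ctx);
  return 1;
}
```

Without the second NULL check, the cleanup routine would dereference a NULL context, which is the kind of crash the commit message refers to.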
diff --git a/src/testbed/test_testbed_api_barriers.conf b/src/testbed/test_testbed_api_barriers.conf
index 758cbd8c9..056cfac22 100644
--- a/src/testbed/test_testbed_api_barriers.conf
+++ b/src/testbed/test_testbed_api_barriers.conf
@@ -13,7 +13,7 @@ PORT = 12366
 [test-barriers]
 AUTOSTART = YES
 PORT = 12114 #not really used
-BINARY = gnunet-service-test-barriers
+BINARY = /home/totakura/gnunet/src/testbed/gnunet-service-test-barriers
 
 [fs]
 AUTOSTART = NO