Diffstat (limited to 'doc/chapters/developer.texi')
-rw-r--r-- | doc/chapters/developer.texi | 7486 |
1 files changed, 7486 insertions, 0 deletions
diff --git a/doc/chapters/developer.texi b/doc/chapters/developer.texi new file mode 100644 index 000000000..ce6b16087 --- /dev/null +++ b/doc/chapters/developer.texi | |||
@@ -0,0 +1,7486 @@ | |||
1 | @c *************************************************************************** | ||
2 | @node GNUnet Developer Handbook | ||
3 | @chapter GNUnet Developer Handbook | ||
4 | |||
5 | This book is intended to be an introduction for programmers that want to | ||
6 | extend the GNUnet framework. GNUnet is more than a simple peer-to-peer | ||
7 | application. For developers, GNUnet is: | ||
8 | |||
9 | @itemize @bullet | ||
10 | @item Free software under the GNU General Public License, with a community | ||
11 | that believes in the GNU philosophy | ||
12 | @item | ||
13 | A set of standards, including coding conventions and architectural rules | ||
14 | @item | ||
15 | A set of layered protocols, both specifying the communication between peers as | ||
16 | well as the communication between components of a single peer. | ||
17 | @item | ||
18 | A set of libraries with well-defined APIs suitable for writing extensions | ||
19 | @end itemize | ||
20 | |||
21 | In particular, the architecture specifies that a peer consists of many | ||
22 | processes communicating via protocols. Processes can be written in almost | ||
23 | any language. C and Java APIs exist for accessing existing services and for | ||
24 | writing extensions. It is possible to write extensions in other languages by | ||
25 | implementing the necessary IPC protocols. | ||
26 | |||
27 | GNUnet can be extended and improved along many possible dimensions, and anyone | ||
28 | interested in free software and freedom-enhancing networking is welcome to | ||
29 | join the effort. This developer handbook attempts to provide an initial | ||
30 | introduction to some of the key design choices and central components of the | ||
31 | system. This manual is far from complete, and we welcome informed | ||
32 | contributions, be it in the form of new chapters or insightful comments. | ||
33 | |||
34 | However, the website is experiencing a constant onslaught of sophisticated | ||
35 | link-spam entered manually by exploited workers solving puzzles and | ||
36 | customizing text. To limit this commercial defacement, we are strictly | ||
37 | moderating comments and have disallowed "normal" users from posting new | ||
38 | content. However, this is really only intended to keep the spam at bay. If | ||
39 | you are a real user or aspiring developer, please drop us a note (IRC, e-mail, | ||
40 | contact form) with your user profile ID number included. We will then relax | ||
41 | these restrictions on your account. We're sorry for this inconvenience; | ||
42 | however, few people would want to read this site if 99% of it was | ||
43 | advertisements for bogus websites. | ||
44 | |||
45 | |||
46 | |||
47 | @c *************************************************************************** | ||
48 | |||
49 | |||
50 | |||
51 | |||
52 | |||
53 | |||
54 | |||
55 | |||
56 | @menu | ||
57 | * Developer Introduction:: | ||
58 | * Code overview:: | ||
59 | * System Architecture:: | ||
60 | * Subsystem stability:: | ||
61 | * Naming conventions and coding style guide:: | ||
62 | * Build-system:: | ||
63 | * Developing extensions for GNUnet using the gnunet-ext template:: | ||
64 | * Writing testcases:: | ||
65 | * GNUnet's TESTING library:: | ||
66 | * Performance regression analysis with Gauger:: | ||
67 | * GNUnet's TESTBED Subsystem:: | ||
68 | * libgnunetutil:: | ||
69 | * The Automatic Restart Manager (ARM):: | ||
70 | * GNUnet's TRANSPORT Subsystem:: | ||
71 | * NAT library:: | ||
72 | * Distance-Vector plugin:: | ||
73 | * SMTP plugin:: | ||
74 | * Bluetooth plugin:: | ||
75 | * WLAN plugin:: | ||
76 | * The ATS Subsystem:: | ||
77 | * GNUnet's CORE Subsystem:: | ||
78 | * GNUnet's CADET subsystem:: | ||
79 | * GNUnet's NSE subsystem:: | ||
80 | * GNUnet's HOSTLIST subsystem:: | ||
81 | * GNUnet's IDENTITY subsystem:: | ||
82 | * GNUnet's NAMESTORE Subsystem:: | ||
83 | * GNUnet's PEERINFO subsystem:: | ||
84 | * GNUnet's PEERSTORE subsystem:: | ||
85 | * GNUnet's SET Subsystem:: | ||
86 | * GNUnet's STATISTICS subsystem:: | ||
87 | * GNUnet's Distributed Hash Table (DHT):: | ||
88 | * The GNU Name System (GNS):: | ||
89 | * The GNS Namecache:: | ||
90 | * The REVOCATION Subsystem:: | ||
91 | * GNUnet's File-sharing (FS) Subsystem:: | ||
92 | * GNUnet's REGEX Subsystem:: | ||
93 | @end menu | ||
94 | |||
95 | @node Developer Introduction | ||
96 | @section Developer Introduction | ||
97 | |||
98 | This developer handbook is intended as a first introduction to GNUnet for new | ||
99 | developers that want to extend the GNUnet framework. After the introduction, | ||
100 | each of the GNUnet subsystems (directories in the src/ tree) is (supposed to | ||
101 | be) covered in its own chapter. In addition to this documentation, GNUnet | ||
102 | developers should be aware of the services available on the GNUnet server to | ||
103 | them. | ||
104 | |||
105 | New developers can have a look at the GNUnet tutorials for C and Java available | ||
106 | in the src/ directory of the repository or under the following links: | ||
107 | |||
108 | @itemize @bullet | ||
109 | @item GNUnet C tutorial | ||
110 | @item GNUnet Java tutorial | ||
111 | @end itemize | ||
112 | |||
113 | In addition to this book, the GNUnet server contains various resources for | ||
114 | GNUnet developers. They are all conveniently reachable via the "Developer" | ||
115 | entry in the navigation menu. Some additional tools (such as static analysis | ||
116 | reports) require a special developer access to perform certain operations. If | ||
117 | you feel you need access, you should contact | ||
118 | @uref{http://grothoff.org/christian/, Christian Grothoff}, GNUnet's maintainer. | ||
119 | |||
120 | The public subsystems on the GNUnet server that help developers are: | ||
121 | |||
122 | @itemize @bullet | ||
123 | @item The Version control system keeps our code and enables distributed | ||
124 | development. Only developers with write access can commit code; everyone else | ||
125 | is encouraged to submit patches to the | ||
126 | @uref{http://mail.gnu.org/mailman/listinfo/gnunet-developers, developer | ||
127 | mailing list}. | ||
128 | @item The GNUnet bugtracking system is used to track feature requests, open bug | ||
129 | reports and their resolutions. Anyone can report bugs; only developers can | ||
130 | claim to have fixed them. | ||
131 | @item A buildbot is used to check GNUnet builds automatically on a range of | ||
132 | platforms. Builds are triggered automatically after 30 minutes of no changes to | ||
133 | Git. | ||
134 | @item The current quality of our automated test suite is assessed using Code | ||
135 | coverage analysis. This analysis is run daily; however, the webpage is only | ||
136 | updated if all automated tests pass at that time. Testcases that improve our | ||
137 | code coverage are always welcome. | ||
138 | @item We try to automatically find bugs using a static analysis scan. This scan | ||
139 | is run daily; however, the webpage is only updated if all automated tests pass | ||
140 | at the time. Note that not everything that is flagged by the analysis is a bug; | ||
141 | sometimes even good code can be marked as possibly problematic. Nevertheless, | ||
142 | developers are encouraged to at least be aware of all issues in their code that | ||
143 | are listed. | ||
144 | @item We use Gauger for automatic performance regression visualization. Details | ||
145 | on how to use Gauger are here. | ||
146 | @item We use @uref{http://junit.org/, junit} to automatically test gnunet-java. | ||
147 | Automatically generated, current reports on the test suite are here. | ||
148 | @item We use Cobertura to generate test coverage reports for gnunet-java. | ||
149 | Current reports on test coverage are here. | ||
150 | @end itemize | ||
151 | |||
152 | |||
153 | |||
154 | @c *************************************************************************** | ||
155 | @menu | ||
156 | * Project overview:: | ||
157 | @end menu | ||
158 | |||
159 | @node Project overview | ||
160 | @subsection Project overview | ||
161 | |||
162 | The GNUnet project consists at this point of several sub-projects. This section | ||
163 | is supposed to give an initial overview about the various sub-projects. Note | ||
164 | that this description also lists projects that are far from complete, including | ||
165 | even those that have literally not a single line of code in them yet. | ||
166 | |||
167 | GNUnet sub-projects in order of likely relevance are currently: | ||
168 | |||
169 | @table @asis | ||
170 | |||
171 | @item svn/gnunet Core of the P2P framework, including file-sharing, VPN and | ||
172 | chat applications; this is what the developer handbook covers mostly | ||
173 | @item svn/gnunet-gtk/ Gtk+-based user interfaces, including gnunet-fs-gtk | ||
174 | (file-sharing), gnunet-statistics-gtk (statistics over time), | ||
175 | gnunet-peerinfo-gtk (information about current connections and known peers), | ||
176 | gnunet-chat-gtk (chat GUI) and gnunet-setup (setup tool for "everything") | ||
177 | @item svn/gnunet-fuse/ Mounting directories shared via GNUnet's file-sharing on Linux | ||
178 | @item svn/gnunet-update/ Installation and update tool | ||
179 | @item svn/gnunet-ext/ | ||
180 | Template for starting 'external' GNUnet projects | ||
181 | @item svn/gnunet-java/ | ||
182 | Java APIs for writing GNUnet services and applications | ||
183 | @item svn/gnunet-www/ | ||
184 | Code and media helping drive the GNUnet website | ||
185 | @item svn/eclectic/ | ||
186 | Code to run GNUnet nodes on testbeds for research, development, testing and evaluation | ||
187 | @item svn/gnunet-qt/ qt-based GNUnet GUI (dead?) | ||
188 | @item svn/gnunet-cocoa/ | ||
189 | cocoa-based GNUnet GUI (dead?) | ||
190 | |||
191 | @end table | ||
192 | |||
193 | We are also working on various supporting libraries and tools: | ||
194 | |||
195 | @table @asis | ||
196 | @item svn/Extractor/ GNU libextractor (meta data extraction) | ||
197 | @item svn/libmicrohttpd/ GNU libmicrohttpd (embedded HTTP(S) server library) | ||
198 | @item svn/gauger/ Tool for performance regression analysis | ||
199 | @item svn/monkey/ Tool for automated debugging of distributed systems | ||
200 | @item svn/libmwmodem/ Library for accessing satellite connection quality reports | ||
201 | @end table | ||
202 | |||
203 | Finally, there are various external projects (see links for a list of those | ||
204 | that have a public website) which build on top of the GNUnet framework. | ||
205 | |||
206 | @c *************************************************************************** | ||
207 | @node Code overview | ||
208 | @section Code overview | ||
209 | |||
210 | This section gives a brief overview of the GNUnet source code. Specifically, we | ||
211 | sketch the function of each of the subdirectories in the @code{gnunet/src/} | ||
212 | directory. The order given is roughly bottom-up (in terms of the layers of the | ||
213 | system). | ||
214 | @table @asis | ||
215 | |||
216 | @item util/ --- libgnunetutil Library with general utility functions; all | ||
217 | GNUnet binaries link against this library. Anything from memory allocation and | ||
218 | data structures to cryptography and inter-process communication. The goal is to | ||
219 | provide an OS-independent interface and more 'secure' or convenient | ||
220 | implementations of commonly used primitives. The API is spread over more than a | ||
221 | dozen headers; developers should study those closely to avoid duplicating | ||
222 | existing functions. | ||
223 | @item hello/ --- libgnunethello HELLO messages are used to | ||
224 | describe under which addresses a peer can be reached (for example, protocol, | ||
225 | IP, port). This library manages parsing and generating of HELLO messages. | ||
226 | @item block/ --- libgnunetblock The DHT and other components of GNUnet store | ||
227 | information in units called 'blocks'. Each block has a type and the type | ||
228 | defines a particular format and how that binary format is to be linked to a | ||
229 | hash code (the key for the DHT and for databases). The block library is a | ||
230 | wrapper around block plugins which provide the necessary functions for each | ||
231 | block type. | ||
232 | @item statistics/ The statistics service enables associating | ||
233 | values (of type uint64_t) with a component name and a string. The main uses are | ||
234 | debugging (counting events), performance tracking and user entertainment (what | ||
235 | did my peer do today?). | ||
236 | @item arm/ The automatic-restart-manager (ARM) service | ||
237 | is the GNUnet master service. Its role is to start gnunet-services, to re-start | ||
238 | them when they crash and finally to shut down the system when requested. | ||
239 | @item peerinfo/ The peerinfo service keeps track of which peers are known to | ||
240 | the local peer and also tracks the validated addresses of each of those peers | ||
241 | (in the form of a HELLO message). The peer is not necessarily | ||
242 | connected to all peers known to the peerinfo service. Peerinfo provides | ||
243 | persistent storage for peer identities --- peers are not forgotten just because | ||
244 | of a system restart. | ||
245 | @item datacache/ --- libgnunetdatacache The datacache | ||
246 | library provides (temporary) block storage for the DHT. Existing plugins can | ||
247 | store blocks in Sqlite, Postgres or MySQL databases. All data stored in the | ||
248 | cache is lost when the peer is stopped or restarted (datacache uses temporary | ||
249 | tables). | ||
250 | @item datastore/ The datastore service stores file-sharing blocks in | ||
251 | databases for extended periods of time. In contrast to the datacache, data is | ||
252 | not lost when peers restart. However, quota restrictions may still cause old, | ||
253 | expired or low-priority data to be eventually discarded. Existing plugins can | ||
254 | store blocks in Sqlite, Postgres or MySQL databases. | ||
255 | @item template/ Template | ||
256 | for writing a new service. Does nothing. | ||
257 | @item ats/ The automatic transport | ||
258 | selection (ATS) service is responsible for deciding which address (i.e. which | ||
259 | transport plugin) should be used for communication with other peers, and at | ||
260 | what bandwidth. | ||
261 | @item nat/ --- libgnunetnat Library that provides basic | ||
262 | functions for NAT traversal. The library supports NAT traversal with manual | ||
263 | hole-punching by the user, UPnP and ICMP-based autonomous NAT traversal. The | ||
264 | library also includes an API for testing if the current configuration works and | ||
265 | the @code{gnunet-nat-server} which provides an external service to test the | ||
266 | local configuration. | ||
267 | @item fragmentation/ --- libgnunetfragmentation Some | ||
268 | transports (UDP and WLAN, mostly) have restrictions on the maximum transfer | ||
269 | unit (MTU) for packets. The fragmentation library can be used to break larger | ||
270 | packets into chunks of at most 1k and transmit the resulting fragments | ||
271 | reliably (with acknowledgement, retransmission, timeouts, etc.). | ||
272 | @item transport/ The transport service is responsible for managing the basic P2P | ||
273 | communication. It uses plugins to support P2P communication over TCP, UDP, | ||
274 | HTTP, HTTPS and other protocols. The transport service validates peer addresses, | ||
275 | enforces bandwidth restrictions, limits the total number of connections and | ||
276 | enforces connectivity restrictions (i.e. friends-only). | ||
277 | @item peerinfo-tool/ | ||
278 | This directory contains the gnunet-peerinfo binary which can be used to inspect | ||
279 | the peers and HELLOs known to the peerinfo service. | ||
280 | @item core/ The core | ||
281 | service is responsible for establishing encrypted, authenticated connections | ||
282 | with other peers, encrypting and decrypting messages and forwarding messages to | ||
283 | higher-level services that are interested in them. | ||
284 | @item testing/ --- | ||
285 | libgnunettesting The testing library allows starting (and stopping) peers for | ||
286 | writing testcases.@ | ||
287 | It also supports automatic generation of configurations for | ||
288 | peers ensuring that the ports and paths are disjoint. libgnunettesting is also | ||
289 | the foundation for the testbed service. | ||
290 | @item testbed/ The testbed service is | ||
291 | used for creating small or large scale deployments of GNUnet peers for | ||
292 | evaluation of protocols. It facilitates peer deployments on multiple hosts (for | ||
293 | example, in a cluster) and establishing various network topologies (both | ||
294 | underlay and overlay). | ||
295 | @item nse/ The network size estimation (NSE) service | ||
296 | implements a protocol for (securely) estimating the current size of the P2P | ||
297 | network. | ||
298 | @item dht/ The distributed hash table (DHT) service provides a | ||
299 | distributed implementation of a hash table to store blocks under hash keys in | ||
300 | the P2P network. | ||
301 | @item hostlist/ The hostlist service allows learning about | ||
302 | other peers in the network by downloading HELLO messages from an HTTP server, | ||
303 | can be configured to run such an HTTP server and also implements a P2P protocol | ||
304 | to advertise and automatically learn about other peers that offer a public | ||
305 | hostlist server. | ||
306 | @item topology/ The topology service is responsible for | ||
307 | maintaining the mesh topology. It tries to maintain connections to friends | ||
308 | (depending on the configuration) and also tries to ensure that the peer has a | ||
309 | decent number of active connections at all times. If necessary, new connections | ||
310 | are added. All peers should run the topology service, otherwise they may end up | ||
311 | not being connected to any other peer (unless some other service ensures that | ||
312 | core establishes the required connections). The topology service also tells the | ||
313 | transport service which connections are permitted (for friend-to-friend | ||
314 | networking). | ||
315 | @item fs/ The file-sharing (FS) service implements GNUnet's | ||
316 | file-sharing application. Both anonymous file-sharing (using gap) and | ||
317 | non-anonymous file-sharing (using dht) are supported. | ||
318 | @item cadet/ The CADET | ||
319 | service provides a general-purpose routing abstraction to create end-to-end | ||
320 | encrypted tunnels in mesh networks. We wrote a paper documenting key aspects of | ||
321 | the design. | ||
322 | @item tun/ --- libgnunettun Library for building IPv4, IPv6 | ||
323 | packets and creating checksums for UDP, TCP and ICMP packets. The header | ||
324 | defines C structs for common Internet packet formats and in particular structs | ||
325 | for interacting with TUN (virtual network) interfaces. | ||
326 | @item mysql/ --- | ||
327 | libgnunetmysql Library for creating and executing prepared MySQL statements and | ||
328 | to manage the connection to the MySQL database. Essentially a lightweight | ||
329 | wrapper for the interaction between GNUnet components and libmysqlclient. | ||
330 | @item dns/ Service that allows intercepting and modifying DNS requests of the | ||
331 | local machine. Currently used for IPv4-IPv6 protocol translation (DNS-ALG) as | ||
332 | implemented by "pt/" and for the GNUnet naming system. The service can also be | ||
333 | configured to offer an exit service for DNS traffic. | ||
334 | @item vpn/ The virtual | ||
335 | public network (VPN) service provides a virtual tunnel interface (VTUN) for IP | ||
336 | routing over GNUnet. Needs some other peers to run an "exit" service to work. | ||
337 | Can be activated using the "gnunet-vpn" tool or integrated with DNS using the | ||
338 | "pt" daemon. | ||
339 | @item exit/ Daemon to allow traffic from the VPN to exit this | ||
340 | peer to the Internet or to specific IP-based services of the local peer. | ||
341 | Currently, an exit service can only be restricted to IPv4 or IPv6, not to | ||
342 | specific ports and/or IP address ranges. If this is not acceptable, additional | ||
343 | firewall rules must be added manually. exit currently only works for normal | ||
344 | UDP, TCP and ICMP traffic; DNS queries need to leave the system via a DNS | ||
345 | service. | ||
346 | @item pt/ Protocol translation daemon. This daemon enables 4-to-6, | ||
347 | 6-to-4, 4-over-6 or 6-over-4 transitions for the local system. It essentially | ||
348 | uses "DNS" to intercept DNS replies and then maps results to those offered by | ||
349 | the VPN, which then sends them using mesh to some daemon offering an | ||
350 | appropriate exit service. | ||
351 | @item identity/ Management of egos (alter egos) of a | ||
352 | user; identities are essentially named ECC private keys and used for zones in | ||
353 | the GNU name system and for namespaces in file-sharing, but might find other | ||
354 | uses later. | ||
355 | @item revocation/ Key revocation service, can be used to revoke the | ||
356 | private key of an identity if it has been compromised. | ||
357 | @item namecache/ Cache | ||
358 | for resolution results for the GNU name system; data is encrypted and can be | ||
359 | shared among users, loss of the data should ideally only result in a | ||
360 | performance degradation (persistence not required). | ||
361 | @item namestore/ Database | ||
362 | for the GNU name system with per-user private information, persistence required. | ||
363 | @item gns/ GNU name system, a GNU approach to DNS and PKI. | ||
364 | @item dv/ A plugin | ||
365 | for distance-vector (DV)-based routing. DV consists of a service and a | ||
366 | transport plugin to provide peers with the illusion of a direct P2P connection | ||
367 | for connections that use multiple (typically up to 3) hops in the actual | ||
368 | underlay network. | ||
369 | @item regex/ Service for the (distributed) evaluation of | ||
370 | regular expressions. | ||
371 | @item scalarproduct/ The scalar product service offers an | ||
372 | API to perform a secure multiparty computation which calculates a scalar | ||
373 | product between two peers without exposing the private input vectors of the | ||
374 | peers to each other. | ||
375 | @item consensus/ The consensus service will allow a set | ||
376 | of peers to agree on a set of values via a distributed set union computation. | ||
377 | @item rest/ The rest API allows access to GNUnet services using RESTful | ||
378 | interaction. The services provide plugins that can be exposed by the rest server. | ||
379 | @item experimentation/ The experimentation daemon coordinates distributed | ||
380 | experimentation to evaluate transport and ATS properties. | ||
381 | @end table | ||
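
The fragmentation/ entry above says that larger packets are broken into chunks of at most 1k. The following is a minimal sketch of that splitting arithmetic only. It is not the libgnunetfragmentation API: the names FRAGMENT_MAX_SIZE, fragment_count and fragment_bounds are hypothetical, the limit of 1024 bytes is an assumed reading of "1k", and acknowledgement, retransmission and timeouts are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed fragment size limit (the text only says "at most 1k"). */
#define FRAGMENT_MAX_SIZE 1024

/* Number of fragments needed for a message of `len` bytes (hypothetical helper). */
static size_t
fragment_count (size_t len)
{
  return (len + FRAGMENT_MAX_SIZE - 1) / FRAGMENT_MAX_SIZE;
}

/* Compute offset and size of fragment `index` of a message of `len` bytes.
   Returns 0 on success, -1 if `index` is out of range. */
static int
fragment_bounds (size_t len, size_t index, size_t *offset, size_t *size)
{
  assert (NULL != offset);
  assert (NULL != size);
  if (index >= fragment_count (len))
    return -1;
  *offset = index * FRAGMENT_MAX_SIZE;
  *size = len - *offset;
  if (*size > FRAGMENT_MAX_SIZE)
    *size = FRAGMENT_MAX_SIZE;    /* every fragment but the last is full-sized */
  return 0;
}
```

A sender would loop over all indices below fragment_count and transmit the byte range described by fragment_bounds for each one.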
382 | |||
383 | @c *************************************************************************** | ||
384 | @node System Architecture | ||
385 | @section System Architecture | ||
386 | |||
387 | GNUnet developers like legos. The blocks are indestructible, can be stacked | ||
388 | together to construct complex buildings and it is generally easy to swap one | ||
389 | block for a different one that has the same shape. GNUnet's architecture is | ||
390 | based on legos: | ||
391 | |||
392 | |||
393 | |||
394 | This chapter documents the GNUnet lego system, also known as GNUnet's system | ||
395 | architecture. | ||
396 | |||
397 | The most common GNUnet component is a service. Services offer an API (or | ||
398 | several, depending on what you count as "an API") which is implemented as a | ||
399 | library. The library communicates with the main process of the service using a | ||
400 | service-specific network protocol. The main process of the service typically | ||
401 | doesn't fully provide everything that is needed --- it has holes to be filled | ||
402 | by APIs to other services. | ||
403 | |||
404 | User interfaces and daemons are a special kind of component in GNUnet. Like | ||
405 | services, they have holes to be filled by APIs of other services. Unlike | ||
406 | services, daemons do not implement their own network protocol and they have no | ||
407 | API: | ||
408 | |||
409 | The GNUnet system provides a range of services, daemons and user interfaces, | ||
410 | which are then combined into a layered GNUnet instance (also known as a peer). | ||
411 | |||
412 | Note that while it is generally possible to swap one service for another | ||
413 | compatible service, there is often only one implementation. However, during | ||
414 | development we often have a "new" version of a service in parallel with an | ||
415 | "old" version. While the "new" version is not working, developers working on | ||
416 | other parts of the service can continue their development by simply using the | ||
417 | "old" service. Alternative design ideas can also be easily investigated by | ||
418 | swapping out individual components. This is typically achieved by simply | ||
419 | changing the name of the "BINARY" in the respective configuration section. | ||
420 | |||
421 | Key properties of GNUnet services are that they must be separate processes and | ||
422 | that they must protect themselves by applying tight error checking against the | ||
423 | network protocol they implement (thereby achieving a certain degree of | ||
424 | robustness). | ||
425 | |||
426 | On the other hand, the APIs are implemented to tolerate failures of the | ||
427 | service, isolating their host process from errors by the service. If the | ||
428 | service process crashes, other services and daemons around it should not also | ||
429 | fail, but instead wait for the service process to be restarted by ARM. | ||
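
One way a client library could wait for a crashed service to be restarted is to retry the connection with a growing delay. The sketch below shows only the delay computation. Exponential backoff is an assumption chosen for illustration, not a statement of what the GNUnet libraries do, and the names next_backoff_ms, BACKOFF_INITIAL_MS and BACKOFF_MAX_MS as well as their values are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bounds for the retry delay, in milliseconds. */
#define BACKOFF_INITIAL_MS 100
#define BACKOFF_MAX_MS 30000

/* Given the delay used before the previous reconnect attempt (0 if none),
   return the delay to use before the next attempt: double it, but never
   exceed BACKOFF_MAX_MS. */
static uint32_t
next_backoff_ms (uint32_t current_ms)
{
  assert (current_ms <= BACKOFF_MAX_MS);
  if (0 == current_ms)
    return BACKOFF_INITIAL_MS;
  if (current_ms >= BACKOFF_MAX_MS / 2)
    return BACKOFF_MAX_MS;
  return current_ms * 2;
}
```

The caller would reset the delay to 0 once a connection succeeds, so that a later crash starts again from the short initial delay.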
430 | |||
431 | |||
432 | @c *************************************************************************** | ||
433 | @node Subsystem stability | ||
434 | @section Subsystem stability | ||
435 | |||
436 | This page documents the current stability of the various GNUnet subsystems. | ||
437 | Stability here describes the expected degree of compatibility with future | ||
438 | versions of GNUnet. For each subsystem we distinguish between compatibility on | ||
439 | the P2P network level (communication protocol between peers), the IPC level | ||
440 | (communication between the service and the service library) and the API level | ||
441 | (stability of the API). P2P compatibility is relevant in terms of which | ||
442 | applications are likely going to be able to communicate with future versions of | ||
443 | the network. IPC communication is relevant for the implementation of language | ||
444 | bindings that re-implement the IPC messages. Finally, API compatibility is | ||
445 | relevant to developers that hope to be able to avoid changes to applications | ||
446 | built on top of the APIs of the framework. | ||
447 | |||
448 | The following table summarizes our current view of the stability of the | ||
449 | respective protocols or APIs: | ||
450 | |||
451 | @multitable @columnfractions .20 .20 .20 .20 | ||
452 | @headitem Subsystem @tab P2P @tab IPC @tab C API | ||
453 | @item util @tab n/a @tab n/a @tab stable | ||
454 | @item arm @tab n/a @tab stable @tab stable | ||
455 | @item ats @tab n/a @tab unstable @tab testing | ||
456 | @item block @tab n/a @tab n/a @tab stable | ||
457 | @item cadet @tab testing @tab testing @tab testing | ||
458 | @item consensus @tab experimental @tab experimental @tab experimental | ||
459 | @item core @tab stable @tab stable @tab stable | ||
460 | @item datacache @tab n/a @tab n/a @tab stable | ||
461 | @item datastore @tab n/a @tab stable @tab stable | ||
462 | @item dht @tab stable @tab stable @tab stable | ||
463 | @item dns @tab stable @tab stable @tab stable | ||
464 | @item dv @tab testing @tab testing @tab n/a | ||
465 | @item exit @tab testing @tab n/a @tab n/a | ||
466 | @item fragmentation @tab stable @tab n/a @tab stable | ||
467 | @item fs @tab stable @tab stable @tab stable | ||
468 | @item gns @tab stable @tab stable @tab stable | ||
469 | @item hello @tab n/a @tab n/a @tab testing | ||
470 | @item hostlist @tab stable @tab stable @tab n/a | ||
471 | @item identity @tab stable @tab stable @tab n/a | ||
472 | @item multicast @tab experimental @tab experimental @tab experimental | ||
473 | @item mysql @tab stable @tab n/a @tab stable | ||
474 | @item namestore @tab n/a @tab stable @tab stable | ||
475 | @item nat @tab n/a @tab n/a @tab stable | ||
476 | @item nse @tab stable @tab stable @tab stable | ||
477 | @item peerinfo @tab n/a @tab stable @tab stable | ||
478 | @item psyc @tab experimental @tab experimental @tab experimental | ||
479 | @item pt @tab n/a @tab n/a @tab n/a | ||
480 | @item regex @tab stable @tab stable @tab stable | ||
481 | @item revocation @tab stable @tab stable @tab stable | ||
482 | @item social @tab experimental @tab experimental @tab experimental | ||
483 | @item statistics @tab n/a @tab stable @tab stable | ||
484 | @item testbed @tab n/a @tab testing @tab testing | ||
485 | @item testing @tab n/a @tab n/a @tab testing | ||
486 | @item topology @tab n/a @tab n/a @tab n/a | ||
487 | @item transport @tab stable @tab stable @tab stable | ||
488 | @item tun @tab n/a @tab n/a @tab stable | ||
489 | @item vpn @tab testing @tab n/a @tab n/a | ||
490 | @end multitable | ||
491 | |||
492 | Here is a rough explanation of the values: | ||
493 | |||
494 | @table @samp | ||
495 | @item stable | ||
496 | No incompatible changes are planned at this time; for IPC/APIs, if | ||
497 | there are incompatible changes, they will be minor and might only require | ||
498 | minimal changes to existing code; for P2P, changes will be avoided if at all | ||
499 | possible for the 0.10.x-series | ||
500 | |||
501 | @item testing | ||
502 | No incompatible changes are | ||
503 | planned at this time, but the code is still known to be in flux; so while we | ||
504 | have no concrete plans, our expectation is that there will still be minor | ||
505 | modifications; for P2P, changes will likely be extensions that should not break | ||
506 | existing code | ||
507 | |||
508 | @item unstable | ||
509 | Changes are planned and will happen; however, they | ||
510 | will not be totally radical and the result should still resemble what is there | ||
511 | now; nevertheless, anticipated changes will break protocol/API compatibility | ||
512 | |||
513 | @item experimental | ||
514 | Changes are planned and the result may look nothing like | ||
515 | what the API/protocol looks like today | ||
516 | |||
517 | @item unknown | ||
518 | Someone should think about where this subsystem is headed | ||
519 | |||
520 | @item n/a | ||
521 | This subsystem does not have an API/IPC-protocol/P2P-protocol | ||
522 | @end table | ||
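
For tooling that consumes the stability table, the six values explained above could be encoded as follows. This is a sketch under assumptions: the enum, parse_stability and breaking_changes_planned are hypothetical names that do not exist in GNUnet, and the predicate simply restates the definitions above, where only 'unstable' and 'experimental' announce planned incompatible changes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical encoding of the stability values used in the table. */
enum stability
{
  STABILITY_STABLE,
  STABILITY_TESTING,
  STABILITY_UNSTABLE,
  STABILITY_EXPERIMENTAL,
  STABILITY_UNKNOWN,
  STABILITY_NOT_APPLICABLE
};

/* Map a table cell to its enum value; unrecognized text maps to UNKNOWN. */
static enum stability
parse_stability (const char *s)
{
  assert (NULL != s);
  if (0 == strcmp (s, "stable"))
    return STABILITY_STABLE;
  if (0 == strcmp (s, "testing"))
    return STABILITY_TESTING;
  if (0 == strcmp (s, "unstable"))
    return STABILITY_UNSTABLE;
  if (0 == strcmp (s, "experimental"))
    return STABILITY_EXPERIMENTAL;
  if (0 == strcmp (s, "n/a"))
    return STABILITY_NOT_APPLICABLE;
  return STABILITY_UNKNOWN;
}

/* Returns 1 if, per the definitions above, incompatible changes are planned. */
static int
breaking_changes_planned (enum stability level)
{
  return (STABILITY_UNSTABLE == level) || (STABILITY_EXPERIMENTAL == level);
}
```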
523 | |||
524 | @c *************************************************************************** | ||
525 | @node Naming conventions and coding style guide | ||
526 | @section Naming conventions and coding style guide | ||
527 | |||
528 | Here you can find some rules to help you write code for GNUnet. | ||
529 | |||
530 | |||
531 | |||
532 | @c *************************************************************************** | ||
533 | @menu | ||
534 | * Naming conventions:: | ||
535 | * Coding style:: | ||
536 | @end menu | ||
537 | |||
538 | @node Naming conventions | ||
539 | @subsection Naming conventions | ||
540 | |||
541 | |||
542 | @c *************************************************************************** | ||
543 | @menu | ||
544 | * include files:: | ||
545 | * binaries:: | ||
546 | * logging:: | ||
547 | * configuration:: | ||
548 | * exported symbols:: | ||
549 | * private (library-internal) symbols (including structs and macros):: | ||
550 | * testcases:: | ||
551 | * performance tests:: | ||
552 | * src/ directories:: | ||
553 | @end menu | ||
554 | |||
555 | @node include files | ||
556 | @subsubsection include files | ||
557 | |||
558 | @itemize @bullet | ||
559 | @item _lib: library without need for a process | ||
560 | @item _service: library that needs a service process | ||
561 | @item _plugin: plugin definition | ||
562 | @item _protocol: structs used in network protocol | ||
563 | @item exceptions: | ||
564 | @itemize @bullet | ||
565 | @item gnunet_config.h --- generated | ||
566 | @item platform.h --- first included | ||
567 | @item plibc.h --- external library | ||
568 | @item gnunet_common.h --- fundamental routines | ||
569 | @item gnunet_directories.h --- generated | ||
570 | @item gettext.h --- external library | ||
571 | @end itemize | ||
572 | @end itemize | ||
573 | |||
574 | @c *************************************************************************** | ||
575 | @node binaries | ||
576 | @subsubsection binaries | ||
577 | |||
578 | @itemize @bullet | ||
579 | @item gnunet-service-xxx: service process (has listen socket) | ||
580 | @item gnunet-daemon-xxx: daemon process (no listen socket) | ||
581 | @item gnunet-helper-xxx[-yyy]: SUID helper for module xxx | ||
582 | @item gnunet-yyy: command-line tool for end-users | ||
583 | @item libgnunet_plugin_xxx_yyy.so: plugin for API xxx | ||
584 | @item libgnunetxxx.so: library for API xxx | ||
585 | @end itemize | ||
586 | |||
587 | @c *************************************************************************** | ||
588 | @node logging | ||
589 | @subsubsection logging | ||
590 | |||
591 | @itemize @bullet | ||
592 | @item services and daemons use their directory name in GNUNET_log_setup (e.g. | ||
593 | 'core') and log using plain 'GNUNET_log'. | ||
594 | @item command-line tools use their full name in GNUNET_log_setup (e.g. | ||
595 | 'gnunet-publish') and log using plain 'GNUNET_log'. | ||
596 | @item service access libraries log using 'GNUNET_log_from' and use | ||
597 | 'DIRNAME-api' for the component (e.g. 'core-api') | ||
598 | @item pure libraries (without associated service) use 'GNUNET_log_from' with | ||
599 | the component set to their library name (without 'lib' or '.so'), which should | ||
600 | also be their directory name (e.g. 'nat') | ||
601 | @item plugins should use 'GNUNET_log_from' with the directory name and the | ||
602 | plugin name combined to produce the component name (e.g. 'transport-tcp'). | ||
603 | @item logging should be unified per-file by defining a LOG macro with the | ||
604 | appropriate arguments, along these lines:@ | ||
605 | @code{#define LOG(kind,...) GNUNET_log_from (kind, "example-api", __VA_ARGS__)} | ||
606 | @end itemize | ||
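The per-file LOG macro from the last item can be tried out without linking against GNUnet. The following is only a minimal sketch: @code{demo_log_from}, @code{enum demo_kind}, @code{demo_last_line} and @code{example_connect} are hypothetical stand-ins invented for this illustration (they are not GNUnet APIs), and the stand-in simply formats "component kind message" to stderr. In real GNUnet code the macro would expand to @code{GNUNET_log_from} instead.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical log levels; GNUnet has its own error-type enum. */
enum demo_kind { DEMO_ERROR, DEMO_WARNING, DEMO_INFO, DEMO_DEBUG };

/* Most recent log line, kept so that callers can inspect it. */
char demo_last_line[256];

/* Hypothetical stand-in for GNUNET_log_from: logs "component kind message". */
void
demo_log_from (enum demo_kind kind, const char *component,
               const char *fmt, ...)
{
  static const char *const names[] = { "ERROR", "WARNING", "INFO", "DEBUG" };
  char msg[192];
  va_list ap;

  va_start (ap, fmt);
  vsnprintf (msg, sizeof (msg), fmt, ap);
  va_end (ap);
  snprintf (demo_last_line, sizeof (demo_last_line), "%s %s %s",
            component, names[kind], msg);
  fprintf (stderr, "%s\n", demo_last_line);
}

/* Defined once per file: every log call below uses the same component. */
#define LOG(kind, ...) demo_log_from (kind, "example-api", __VA_ARGS__)

/* Hypothetical API function that logs through the per-file macro. */
int
example_connect (int attempts)
{
  if (0 >= attempts)
  {
    LOG (DEMO_ERROR, "invalid number of attempts: %d", attempts);
    return -1;
  }
  LOG (DEMO_DEBUG, "connecting, %d attempts left", attempts);
  return 0;
}
```

Because the component string appears only in the macro definition, renaming the component for a file is a one-line change.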
607 | |||
608 | @c *************************************************************************** | ||
609 | @node configuration | ||
610 | @subsubsection configuration | ||
611 | |||
612 | @itemize @bullet | ||
613 | @item paths (that are substituted in all filenames) are in PATHS (have as few | ||
614 | as possible) | ||
615 | @item all options for a particular module (src/MODULE) are under [MODULE] | ||
616 | @item options for a plugin of a module are under [MODULE-PLUGINNAME] | ||
617 | @end itemize | ||
618 | |||
619 | @c *************************************************************************** | ||
620 | @node exported symbols | ||
621 | @subsubsection exported symbols | ||
622 | |||
623 | @itemize @bullet | ||
624 | @item must start with "GNUNET_modulename_" and be defined in "modulename.c" | ||
625 | @item exceptions: those defined in gnunet_common.h | ||
626 | @end itemize | ||
627 | |||
628 | @c *************************************************************************** | ||
629 | @node private (library-internal) symbols (including structs and macros) | ||
630 | @subsubsection private (library-internal) symbols (including structs and macros) | ||
631 | |||
632 | @itemize @bullet | ||
633 | @item must NOT start with any prefix | ||
634 | @item must not be exported in a way that linkers could use them or other | ||
635 | libraries might see them via headers; they must be either declared/defined in | ||
636 | C source files or in headers that are in the respective directory under | ||
637 | src/modulename/ and NEVER be declared in src/include/. | ||
638 | @end itemize | ||
639 | |||
640 | @node testcases | ||
641 | @subsubsection testcases | ||
642 | |||
643 | @itemize @bullet | ||
644 | @item must be called "test_module-under-test_case-description.c" | ||
645 | @item "case-description" may be omitted if there is only one test | ||
646 | @end itemize | ||
647 | |||
648 | @c *************************************************************************** | ||
649 | @node performance tests | ||
650 | @subsubsection performance tests | ||
651 | |||
652 | @itemize @bullet | ||
653 | @item must be called "perf_module-under-test_case-description.c" | ||
654 | @item "case-description" may be omitted if there is only one performance test | ||
655 | @item Must only be run if HAVE_BENCHMARKS is satisfied | ||
656 | @end itemize | ||
657 | |||
658 | @c *************************************************************************** | ||
659 | @node src/ directories | ||
660 | @subsubsection src/ directories | ||
661 | |||
662 | @itemize @bullet | ||
663 | @item gnunet-NAME: end-user applications (e.g., gnunet-search, gnunet-arm) | ||
664 | @item gnunet-service-NAME: service processes with accessor library (e.g., | ||
665 | gnunet-service-arm) | ||
666 | @item libgnunetNAME: accessor library (_service.h-header) or standalone library | ||
667 | (_lib.h-header) | ||
668 | @item gnunet-daemon-NAME: daemon process without accessor library (e.g., | ||
669 | gnunet-daemon-hostlist) and no GNUnet management port | ||
670 | @item libgnunet_plugin_DIR_NAME: loadable plugins (e.g., | ||
671 | libgnunet_plugin_transport_tcp) | ||
672 | @end itemize | ||
673 | |||
674 | @c *************************************************************************** | ||
675 | @node Coding style | ||
676 | @subsection Coding style | ||
677 | |||
678 | @itemize @bullet | ||
679 | @item GNU guidelines generally apply | ||
680 | @item Indentation is done with spaces, two per level, no tabs | ||
681 | @item C99 struct initialization is fine | ||
682 | @item declare only one variable per line, so@ | ||
683 | @example | ||
684 | int i; | ||
685 | int j; | ||
686 | @end example | ||
687 | |||
688 | instead of | ||
689 | |||
690 | @example | ||
691 | int i,j; | ||
692 | @end example | ||
693 | |||
694 | This helps keep diffs small and forces developers to think precisely about the | ||
695 | type of every variable. Note that @code{char *} is different from @code{const | ||
696 | char*} and @code{int} is different from @code{unsigned int} or @code{uint32_t}. | ||
697 | Each variable type should be chosen with care. | ||
698 | |||
699 | @item While @code{goto} should generally be avoided, having a @code{goto} to | ||
700 | the end of a function to a block of clean up statements (free, close, etc.) can | ||
701 | be acceptable. | ||
702 | |||
703 | @item Conditions should be written with constants on the left (to avoid | ||
704 | accidental assignment) and with the 'true' target being either the 'error' case | ||
705 | or the significantly simpler continuation. For example:@ | ||
706 | |||
707 | @example | ||
708 | if (0 != stat ("filename", &sbuf)) @{ error(); @} else @{ | ||
709 | /* handle normal case here */ | ||
710 | @} | ||
711 | @end example | ||
712 | |||
713 | |||
714 | instead of | ||
715 | @example | ||
716 | if (stat ("filename", &sbuf) == 0) @{ | ||
717 | /* handle normal case here */ | ||
718 | @} else @{ error(); @} | ||
719 | @end example | ||
720 | |||
721 | |||
722 | If possible, the error clause should be terminated with a 'return' (or 'goto' | ||
723 | to some cleanup routine) and in this case, the 'else' clause should be omitted: | ||
724 | @example | ||
725 | if (0 != stat ("filename", &sbuf)) @{ error(); return; @} | ||
726 | /* handle normal case here */ | ||
727 | @end example | ||
728 | |||
729 | |||
730 | This serves to avoid deep nesting. The 'constants on the left' rule applies to | ||
731 | all constants (including @code{GNUNET_SCHEDULER_NO_TASK}, NULL, and enums). | ||
732 | With the two above rules (constants on left, errors in 'true' branch), there is | ||
733 | only one way to write most branches correctly. | ||
734 | |||
735 | @item Combined assignments and tests are allowed if they do not hinder code | ||
736 | clarity. For example, one can write:@ | ||
737 | |||
738 | @example | ||
739 | if (NULL == (value = lookup_function())) @{ error(); return; @} | ||
740 | @end example | ||
741 | |||
742 | |||
743 | @item Use @code{break} and @code{continue} wherever possible to avoid deep(er) | ||
744 | nesting. Thus, we would write:@ | ||
745 | |||
746 | @example | ||
747 | next = head; while (NULL != (pos = next)) @{ next = pos->next; | ||
748 | if (! should_free (pos)) continue; | ||
749 | GNUNET_CONTAINER_DLL_remove (head, tail, pos); GNUNET_free (pos); @} | ||
750 | @end example | ||
751 | |||
752 | |||
753 | instead of | ||
754 | @example | ||
755 | next = head; while (NULL != (pos = next)) @{ next = pos->next; | ||
756 | if (should_free (pos)) @{ | ||
757 | /* unnecessary nesting! */ | ||
758 | GNUNET_CONTAINER_DLL_remove (head, tail, pos); GNUNET_free (pos); @} @} | ||
759 | @end example | ||
760 | |||
761 | |||
762 | @item We primarily use @code{for} and @code{while} loops. A @code{while} loop | ||
763 | is used if the method for advancing in the loop is not a straightforward | ||
764 | increment operation. In particular, we use:@ | ||
765 | |||
766 | @example | ||
767 | next = head; | ||
768 | while (NULL != (pos = next)) | ||
769 | @{ | ||
770 | next = pos->next; | ||
771 | if (! should_free (pos)) | ||
772 | continue; | ||
773 | GNUNET_CONTAINER_DLL_remove (head, tail, pos); | ||
774 | GNUNET_free (pos); | ||
775 | @} | ||
776 | @end example | ||
777 | |||
778 | |||
779 | to free entries in a list (as the iteration changes the structure of the list | ||
780 | due to the free; the equivalent @code{for} loop would no longer follow the | ||
781 | simple @code{for} paradigm of @code{for(INIT;TEST;INC)}). However, for loops | ||
782 | that do follow the simple @code{for} paradigm we do use @code{for}, even if it | ||
783 | involves linked lists: | ||
784 | @example | ||
785 | /* simple iteration over a linked list */ | ||
786 | for (pos = head; NULL != pos; pos = pos->next) | ||
787 | @{ | ||
788 | use (pos); | ||
789 | @} | ||
790 | @end example | ||
791 | |||
792 | |||
793 | @item The first argument to all higher-order functions in GNUnet must be | ||
794 | declared to be of type @code{void *} and is reserved for a closure. We do not | ||
795 | use inner functions, as trampolines would conflict with setups that use | ||
796 | non-executable stacks.@ The first statement in a higher-order function, which | ||
797 | unusually should be part of the variable declarations, should assign the | ||
798 | @code{cls} argument to the precise expected type. For example: | ||
799 | @example | ||
800 | int callback (void *cls, char *args) @{ | ||
801 | struct Foo *foo = cls; int other_variables; | ||
802 | |||
803 | /* rest of function */ | ||
804 | @} | ||
805 | @end example | ||
806 | |||
807 | |||
808 | @item It is good practice to write complex @code{if} expressions instead of | ||
809 | using deeply nested @code{if} statements. However, except for addition and | ||
810 | multiplication, all operators should use parens. This is fine:@ | ||
811 | |||
812 | @example | ||
813 | if ( (1 == foo) || ((0 == bar) && (x != y)) ) | ||
814 | return x; | ||
815 | @end example | ||
816 | |||
817 | |||
818 | However, this is not: | ||
819 | @example | ||
820 | if (1 == foo) | ||
821 | return x; | ||
822 | if (0 == bar && x != y) | ||
823 | return x; | ||
824 | @end example | ||
825 | |||
826 | |||
827 | Note that splitting the @code{if} statement above is debatable as the | ||
828 | @code{return x} is a very trivial statement. However, once the logic after the | ||
829 | branch becomes more complicated (and is still identical), the "or" formulation | ||
830 | should be used for sure. | ||
831 | |||
832 | @item There should be two empty lines between the end of the function and the | ||
833 | comments describing the following function. There should be a single empty line | ||
834 | after the initial variable declarations of a function. If a function has no | ||
835 | local variables, there should be no initial empty line. If a long function | ||
836 | consists of several complex steps, those steps might be separated by an empty | ||
837 | line (possibly followed by a comment describing the following step). The code | ||
838 | should not contain empty lines in arbitrary places; if in doubt, it is likely | ||
839 | better to NOT have an empty line (this way, more code will fit on the screen). | ||
840 | @end itemize | ||
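Several of the rules above (closure as first argument, constants on the left, early return on error, a single cleanup block reached via @code{goto}) can be seen working together in a small, self-contained C sketch. It is an illustration only: @code{struct LineCounter}, @code{LineCallback}, @code{count_line} and @code{for_each_line} are hypothetical names that do not exist in GNUnet, and plain libc calls are used in place of GNUnet's own wrappers so that the example compiles on its own.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical closure type for the callback below. */
struct LineCounter
{
  unsigned int lines;
  unsigned int longest;
};

/* Hypothetical higher-order callback type: the closure comes first. */
typedef int (*LineCallback) (void *cls, const char *line);

/* Callback: the first statement converts 'cls' to its precise type. */
int
count_line (void *cls, const char *line)
{
  struct LineCounter *lc = cls;
  size_t len;

  len = strlen (line);
  lc->lines++;
  if (len > lc->longest)
    lc->longest = (unsigned int) len;
  return 0;
}

/* Calls 'cb' for each line of 'filename' (lines longer than the buffer
   are delivered in pieces); returns 0 on success, -1 on error. */
int
for_each_line (const char *filename, LineCallback cb, void *cb_cls)
{
  FILE *f;
  char *buf;
  int ret;

  f = fopen (filename, "r");
  if (NULL == f)
    return -1;                  /* nothing to clean up yet */
  ret = -1;
  buf = malloc (1024);
  if (NULL == buf)
    goto cleanup;
  while (NULL != fgets (buf, 1024, f))
  {
    buf[strcspn (buf, "\n")] = '\0';
    if (0 != cb (cb_cls, buf))
      goto cleanup;             /* callback asked us to abort */
  }
  ret = 0;
cleanup:
  free (buf);                   /* free (NULL) is a no-op */
  fclose (f);
  return ret;
}
```

A caller would pass a zero-initialized @code{struct LineCounter} as the closure, for example @code{for_each_line ("some-file", &count_line, &lc)}, and read the totals from it afterwards.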
841 | |||
842 | @c *************************************************************************** | ||
843 | @node Build-system | ||
844 | @section Build-system | ||
845 | |||
846 | If you have code that is likely not to compile, or build rules that should not | ||
847 | be triggered for most developers, wrap them in "if HAVE_EXPERIMENTAL" in your | ||
848 | Makefile.am. Then it is OK to (temporarily) add non-compiling (or | ||
849 | known-to-not-port) code. | ||
850 | |||
851 | If you want to compile all testcases but NOT run them, run configure with the@ | ||
852 | @code{--enable-test-suppression} option. | ||
853 | |||
854 | If you want to run all testcases, including those that take a while, run | ||
855 | configure with the@ @code{--enable-expensive-testcases} option. | ||
856 | |||
857 | If you want to compile and run benchmarks, run configure with the@ | ||
858 | @code{--enable-benchmarks} option. | ||
859 | |||
860 | If you want to obtain code coverage results, run configure with the@ | ||
861 | @code{--enable-coverage} option and run the coverage.sh script in contrib/. | ||
862 | |||
863 | @c *************************************************************************** | ||
864 | @node Developing extensions for GNUnet using the gnunet-ext template | ||
865 | @section Developing extensions for GNUnet using the gnunet-ext template | ||
866 | |||
867 | |||
868 | For developers who want to write extensions for GNUnet, we provide the | ||
869 | gnunet-ext template as an easy-to-use skeleton. | ||
870 | |||
871 | gnunet-ext contains the build environment and template files for the | ||
872 | development of GNUnet services, command line tools, APIs and tests. | ||
873 | |||
874 | First of all you have to obtain gnunet-ext from SVN: | ||
875 | |||
876 | @code{svn co https://gnunet.org/svn/gnunet-ext} | ||
877 | |||
878 | The next step is to bootstrap and configure it. Pass configure the path | ||
879 | containing GNUnet with @code{--with-gnunet=/path/to/gnunet} and the prefix where | ||
880 | you want to install the extension with @code{--prefix=/path/to/install}:@* | ||
881 | @code{./bootstrap}@* | ||
882 | @code{./configure --prefix=/path/to/install --with-gnunet=/path/to/gnunet} | ||
883 | |||
884 | When your GNUnet installation is not included in the default linker search | ||
885 | path, you have to add @code{/path/to/gnunet} to the file @code{/etc/ld.so.conf} | ||
886 | and run @code{ldconfig}, or add it to the environment variable | ||
887 | @code{LD_LIBRARY_PATH} by using | ||
888 | |||
889 | @code{export LD_LIBRARY_PATH=/path/to/gnunet/lib} | ||
890 | |||
891 | @c *************************************************************************** | ||
892 | @node Writing testcases | ||
893 | @section Writing testcases | ||
894 | |||
895 | Ideally, any non-trivial GNUnet code should be covered by automated testcases. | ||
896 | Testcases should reside in the same place as the code that is being tested. The | ||
897 | name of source files implementing tests should begin with "test_" followed by | ||
898 | the name of the file that contains the code that is being tested. | ||
899 | |||
900 | Testcases in GNUnet should be integrated with the autotools build system, so | ||
901 | that developers and anyone building binary packages can run all testcases | ||
902 | with @code{make check}. The testcases shipped with the distribution should | ||
903 | output at most some brief progress information and not display debug messages | ||
904 | by default. A testcase must indicate success or failure by returning zero | ||
905 | (success) or non-zero (failure) from its main function. Integration with the | ||
906 | autotools only requires changes to the @code{Makefile.am} in the directory | ||
907 | containing the testcase. For a testcase testing @code{foo.c}, it would contain: | ||
908 | @example | ||
909 | check_PROGRAMS = test_foo | ||
910 | TESTS = $(check_PROGRAMS) | ||
911 | test_foo_SOURCES = test_foo.c | ||
912 | test_foo_LDADD = $(top_builddir)/src/util/libgnunetutil.la | ||
913 | @end example | ||
914 | |||
915 | Naturally, other libraries used by the testcase may be specified in the | ||
916 | @code{LDADD} directive as necessary. | ||
917 | |||
918 | Often testcases depend on additional input files, such as a configuration file. | ||
919 | These support files have to be listed using the EXTRA_DIST directive in order | ||
920 | to ensure that they are included in the distribution. Example: | ||
921 | @example | ||
922 | EXTRA_DIST = test_foo_data.conf | ||
923 | @end example | ||
924 | |||
925 | |||
926 | Executing @code{make check} will run all testcases in the current directory and | ||
927 | all subdirectories. Testcases can be compiled individually by running | ||
928 | @code{make test_foo} and then invoked directly using @code{./test_foo}. Note | ||
929 | that due to the use of plugins in GNUnet, it is typically necessary to run | ||
930 | @code{make install} before running any testcases. Thus the canonical command | ||
931 | @code{make check install} has to be changed to @code{make install check} for | ||
932 | GNUnet. | ||
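As a rough sketch of the exit-status convention described in this section, a testcase can collect its checks in one function whose result becomes the return value of @code{main}. Everything named here is hypothetical: @code{foo_count_char} stands in for whatever @code{foo.c} would export, and @code{test_foo_run} is just one possible way to structure @code{test_foo.c}. Real GNUnet testcases would normally use GNUnet's own helpers instead of bare libc.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical function under test; in practice it would live in foo.c. */
size_t
foo_count_char (const char *s, char c)
{
  size_t n;

  n = 0;
  for (; '\0' != *s; s++)
    if (c == *s)
      n++;
  return n;
}

/* Runs all checks; returns 0 on success and 1 on the first failure.
   Output is limited to a brief message when something goes wrong. */
int
test_foo_run (void)
{
  if (2 != foo_count_char ("gnunet", 'n'))
  {
    fprintf (stderr, "unexpected count for 'n' in \"gnunet\"\n");
    return 1;
  }
  if (0 != foo_count_char ("", 'x'))
  {
    fprintf (stderr, "unexpected count for empty string\n");
    return 1;
  }
  return 0;
}
```

In @code{test_foo.c}, @code{main} would then consist of little more than @code{return test_foo_run ();}, so that @code{make check} sees zero on success and non-zero on failure.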
933 | |||
934 | @c *************************************************************************** | ||
935 | @node GNUnet's TESTING library | ||
936 | @section GNUnet's TESTING library | ||
937 | |||
938 | The TESTING library is used for writing testcases which involve starting one | ||
939 | or more peers. While peers can also be started by testcases using the ARM | ||
940 | subsystem, the TESTING library provides an elegant way to do this. | ||
941 | The configurations of the peers are auto-generated from a given template to | ||
942 | have non-conflicting port numbers ensuring that peers' services do not run into | ||
943 | bind errors. This is achieved by testing ports' availability by binding a | ||
944 | listening socket to them before allocating them to services in the generated | ||
945 | configurations. | ||
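The port check described above (try to bind a listening socket, and only hand out the port if that works) can be sketched with plain POSIX sockets. This is not the TESTING library's actual code; @code{open_listener}, @code{port_is_free} and @code{find_free_port} are hypothetical helpers written for this illustration, and the sketch only probes TCP on the IPv4 loopback address. Note that such a check is inherently racy: another process may grab the port between the probe and the moment the service binds it.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Binds a TCP listening socket to 127.0.0.1:port.
   Returns the socket descriptor, or -1 if the port cannot be used. */
int
open_listener (uint16_t port)
{
  struct sockaddr_in addr;
  int fd;

  fd = socket (AF_INET, SOCK_STREAM, 0);
  if (-1 == fd)
    return -1;
  memset (&addr, 0, sizeof (addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = htons (port);
  if ( (0 != bind (fd, (const struct sockaddr *) &addr, sizeof (addr))) ||
       (0 != listen (fd, 1)) )
  {
    close (fd);
    return -1;
  }
  return fd;
}

/* Returns 1 if a listening socket could be bound to the port, else 0. */
int
port_is_free (uint16_t port)
{
  int fd;

  fd = open_listener (port);
  if (-1 == fd)
    return 0;
  close (fd);
  return 1;
}

/* Returns the first free port in [low, high], or 0 if there is none. */
uint16_t
find_free_port (uint16_t low, uint16_t high)
{
  uint32_t port;                /* wider type avoids overflow at 65535 */

  for (port = low; port <= high; port++)
    if (port_is_free ((uint16_t) port))
      return (uint16_t) port;
  return 0;
}
```

A configuration generator along these lines would call @code{find_free_port} once per service and remember which ports it has already handed out, so that two services of the same run never receive the same port.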
946 | |||
947 | Another advantage of using TESTING is that it shortens the testcase | ||
948 | startup time as the hostkeys for peers are copied from a pre-computed set of | ||
949 | hostkeys instead of generating them at peer startup which may take a | ||
950 | considerable amount of time when starting multiple peers or on an embedded | ||
951 | processor. | ||
952 | |||
953 | TESTING also allows for certain services to be shared among peers. This feature | ||
954 | is invaluable when testing with multiple peers as it helps to reduce the number | ||
955 | of services run per peer and hence the total number of processes run per | ||
956 | testcase. | ||
957 | |||
958 | The TESTING library only handles creating, starting and stopping peers. Features | ||
959 | useful for testcases such as connecting peers in a topology are not available | ||
960 | in TESTING but are available in the TESTBED subsystem. Furthermore, TESTING | ||
961 | only creates peers on the localhost; by using TESTBED, testcases can | ||
962 | benefit from creating peers across multiple hosts. | ||
963 | |||
964 | @menu | ||
965 | * API:: | ||
966 | * Finer control over peer stop:: | ||
967 | * Helper functions:: | ||
968 | * Testing with multiple processes:: | ||
969 | @end menu | ||
970 | |||
971 | @c *************************************************************************** | ||
972 | @node API | ||
973 | @subsection API | ||
974 | |||
975 | TESTING abstracts a group of peers as a TESTING system. All peers in a system | ||
976 | share a common hostname, and no two services of these peers use the same port | ||
977 | or UNIX domain socket path. | ||
978 | |||
979 | A TESTING system can be created with the function | ||
980 | @code{GNUNET_TESTING_system_create()} which returns a handle to the system. | ||
981 | This function takes a directory path which is used for generating the | ||
982 | configurations of peers, an IP address from which connections to the peers' | ||
983 | services should be allowed, the hostname to be used in peers' configuration, | ||
984 | and an array of shared service specifications of type @code{struct | ||
985 | GNUNET_TESTING_SharedService}. | ||
986 | |||
987 | The shared service specification must specify the name of the service to share, | ||
988 | the configuration pertaining to that shared service and the maximum number of | ||
989 | peers that are allowed to share a single instance of the shared service. | ||
990 | |||
991 | A TESTING system created with @code{GNUNET_TESTING_system_create()} chooses ports | ||
992 | from the default range 12000 - 56000 while auto-generating configurations for | ||
993 | peers. This range can be customised with the function | ||
994 | @code{GNUNET_TESTING_system_create_with_portrange()}. This function is similar | ||
995 | to @code{GNUNET_TESTING_system_create()} except that it takes two additional | ||
996 | parameters --- the start and end of the port range to use. | ||
997 | |||
998 | A TESTING system is destroyed with the function | ||
999 | @code{GNUNET_TESTING_system_destroy()}. This function takes the handle of the | ||
1000 | system and a flag to remove the files created in the directory used to generate | ||
1001 | configurations. | ||
1002 | |||
1003 | A peer is created with the function @code{GNUNET_TESTING_peer_configure()}. | ||
1004 | This function takes the system handle, a configuration template from which the | ||
1005 | configuration for the peer is auto-generated and the index of the hostkey to | ||
1006 | copy for the peer. When successful, this function | ||
1007 | returns a handle to the peer which can be used to start and stop it and to | ||
1008 | obtain the identity of the peer. If unsuccessful, a NULL pointer is returned | ||
1009 | with an error message. This function adjusts the generated configuration to | ||
1010 | have non-conflicting ports and paths. | ||
1011 | |||
1012 | Peers can be started and stopped by calling the functions | ||
1013 | @code{GNUNET_TESTING_peer_start()} and @code{GNUNET_TESTING_peer_stop()} | ||
1014 | respectively. A peer can be destroyed by calling the function | ||
1015 | @code{GNUNET_TESTING_peer_destroy()}. When a peer is destroyed, the ports and | ||
1016 | paths allocated in its configuration are reclaimed for usage in new | ||
1017 | peers. | ||
1018 | |||
1019 | @c *************************************************************************** | ||
1020 | @node Finer control over peer stop | ||
1021 | @subsection Finer control over peer stop | ||
1022 | |||
1023 | Using @code{GNUNET_TESTING_peer_stop()} is normally fine for testcases. | ||
1024 | However, calling this function for each peer is inefficient when trying to | ||
1025 | shut down multiple peers as this function sends the termination signal to the | ||
1026 | given peer process and waits for it to terminate. It would be faster in this | ||
1027 | case to send the termination signals to the peers first and then wait on them. | ||
1028 | This is accomplished by the functions @code{GNUNET_TESTING_peer_kill()} which | ||
1029 | sends a termination signal to the peer, and the function | ||
1030 | @code{GNUNET_TESTING_peer_wait()} which waits on the peer. | ||
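The benefit of signalling first and waiting afterwards can be illustrated with ordinary POSIX process handling. The sketch below does not use the TESTING API at all: @code{spawn_idle_child} and @code{stop_all} are hypothetical helpers, and the idle child processes merely stand in for peers. All children receive @code{SIGTERM} before the first @code{waitpid} call, so they shut down in parallel rather than one after another.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that idles until it is signalled (or, as a safety net,
   exits by itself after 60 seconds).  Returns the child's pid or -1. */
pid_t
spawn_idle_child (void)
{
  pid_t pid;

  pid = fork ();
  if (0 == pid)
  {
    sleep (60);
    _exit (0);
  }
  return pid;
}

/* Sends SIGTERM to all processes first and only then waits for them.
   Returns the number of processes that were terminated by SIGTERM. */
size_t
stop_all (const pid_t *pids, size_t n)
{
  size_t i;
  size_t stopped;
  int status;

  for (i = 0; i < n; i++)
  {
    if (0 >= pids[i])
      continue;                 /* never signal pid 0 or -1 by accident */
    (void) kill (pids[i], SIGTERM);
  }
  stopped = 0;
  for (i = 0; i < n; i++)
  {
    if (0 >= pids[i])
      continue;
    if (pids[i] != waitpid (pids[i], &status, 0))
      continue;
    if (WIFSIGNALED (status) && (SIGTERM == WTERMSIG (status)))
      stopped++;
  }
  return stopped;
}
```

With the TESTING library, the same two-phase structure would presumably be one loop calling @code{GNUNET_TESTING_peer_kill()} for every peer followed by a second loop calling @code{GNUNET_TESTING_peer_wait()}.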
1031 | |||
1032 | Further finer control can be achieved by choosing to stop a peer asynchronously | ||
1033 | with the function @code{GNUNET_TESTING_peer_stop_async()}. This function takes | ||
1034 | a callback parameter and a closure for it in addition to the handle to the peer | ||
1035 | to stop. The callback function is called with the given closure when the peer | ||
1036 | is stopped. Using this function eliminates blocking while waiting for the peer | ||
1037 | to terminate. | ||
1038 | |||
1039 | An asynchronous peer stop can be cancelled by calling the function | ||
1040 | @code{GNUNET_TESTING_peer_stop_async_cancel()}. Note that calling this function | ||
1041 | does not prevent the peer from terminating if the termination signal has | ||
1042 | already been sent to it. It does, however, cancel the callback to be called | ||
1043 | when the peer is stopped. | ||
1044 | |||
1045 | @c *************************************************************************** | ||
1046 | @node Helper functions | ||
1047 | @subsection Helper functions | ||
1048 | |||
1049 | Most of the testcases can benefit from an abstraction which configures a peer | ||
1050 | and starts it. This is provided by the function | ||
1051 | @code{GNUNET_TESTING_peer_run()}. This function takes the testing directory | ||
1052 | pathname, a configuration template, a callback and its closure. This function | ||
1053 | creates a peer in the given testing directory by using the configuration | ||
1054 | template, starts the peer and calls the given callback with the given closure. | ||
1055 | |||
1056 | The function @code{GNUNET_TESTING_peer_run()} starts the ARM service of the | ||
1057 | peer which starts the rest of the configured services. A similar function | ||
1058 | @code{GNUNET_TESTING_service_run} can be used to just start a single service of | ||
1059 | a peer. In this case, the peer's ARM service is not started; instead, only the | ||
1060 | given service is run. | ||
1061 | |||
1062 | @c *************************************************************************** | ||
1063 | @node Testing with multiple processes | ||
1064 | @subsection Testing with multiple processes | ||
1065 | |||
1066 | When testing GNUnet, the splitting of the code into services and clients | ||
1067 | often complicates testing. The solution to this is to have the testcase fork | ||
1068 | @code{gnunet-service-arm}, ask it to start the required server and daemon | ||
1069 | processes and then execute appropriate client actions (to test the client APIs | ||
1070 | or the core module or both). If necessary, multiple ARM services can be forked | ||
1071 | using different ports (!) to simulate a network. However, most of the time only | ||
1072 | one ARM process is needed. Note that on exit, the testcase should shut down ARM | ||
1073 | with a @code{TERM} signal (to give it the chance to cleanly stop its child | ||
1074 | processes). | ||
1075 | |||
1076 | The following code illustrates spawning and killing an ARM process from a | ||
1077 | testcase: | ||
1078 | @example | ||
1079 | static void run (void *cls, char *const *args, const char *cfgfile, | ||
1080 | const struct GNUNET_CONFIGURATION_Handle *cfg) @{ | ||
1081 | struct GNUNET_OS_Process *arm_pid; | ||
1082 | arm_pid = GNUNET_OS_start_process (NULL, NULL, "gnunet-service-arm", "gnunet-service-arm", "-c", cfgfile, NULL); | ||
1083 | /* do real test work here */ | ||
1084 | if (0 != GNUNET_OS_process_kill (arm_pid, SIGTERM)) GNUNET_log_strerror (GNUNET_ERROR_TYPE_WARNING, "kill"); | ||
1085 | GNUNET_assert (GNUNET_OK == GNUNET_OS_process_wait (arm_pid)); | ||
1086 | GNUNET_OS_process_close (arm_pid); @} | ||
1087 | /* in main(): */ | ||
1088 | GNUNET_PROGRAM_run (argc, argv, "NAME-OF-TEST", "nohelp", options, &run, cls); | ||
1089 | @end example | ||
1090 | |||
1091 | |||
1092 | An alternative way that works well to test plugins is to implement a | ||
1093 | mock-version of the environment that the plugin expects and then to simply load | ||
1094 | the plugin directly. | ||
1095 | |||
1096 | @c *************************************************************************** | ||
1097 | @node Performance regression analysis with Gauger | ||
1098 | @section Performance regression analysis with Gauger | ||
1099 | |||
1100 | To help avoid performance regressions, GNUnet uses Gauger. Gauger is a simple | ||
1101 | logging tool that allows remote hosts to send performance data to a central | ||
1102 | server, where this data can be analyzed and visualized. Gauger shows graphs of | ||
1103 | the repository revisions and the performance data recorded for each revision, so | ||
1104 | sudden performance peaks or drops can be identified and linked to a specific | ||
1105 | revision number. | ||
1106 | |||
1107 | In the case of GNUnet, the buildbots log the performance data obtained during | ||
1108 | the tests after each build. The data can be accessed on GNUnet's Gauger page. | ||
1109 | |||
1110 | The menu on the left lets you select either the results of just one build bot | ||
1111 | (under "Hosts") or review the data from all hosts for a given test result | ||
1112 | (under "Metrics"). If the absolute values of the results differ widely, for | ||
1113 | instance arm vs. amd64 machines, the option "Normalize" on a metric view can | ||
1114 | help to get an idea about the performance evolution across all hosts. | ||
1115 | |||
1116 | Using Gauger in GNUnet and having the performance of a module tracked over time | ||
1117 | is very easy. First of course, the testcase must generate some consistent | ||
1118 | metric that makes sense to log. Highly volatile or randomness-dependent | ||
1119 | metrics probably are not ideal candidates for meaningful regression detection. | ||
1120 | |||
1121 | To start logging any value, just include @code{gauger.h} in your testcase code. | ||
1122 | Then, use the macro @code{GAUGER()} to make the buildbots log whatever value is | ||
1123 | of interest for you to @code{gnunet.org}'s Gauger server. No setup is necessary | ||
1124 | as most buildbots already have everything in place and new metrics are created | ||
1125 | on demand. To delete a metric, you need to contact a member of the GNUnet | ||
1126 | development team (a file will need to be removed manually from the respective | ||
1127 | directory). | ||
1128 | |||
1129 | The code in the test should look like this: | ||
1130 | @example | ||
1131 | [other includes] | ||
1132 | #include <gauger.h> | ||
1133 | |||
1134 | int main (int argc, char *argv[]) @{ | ||
1135 | [run test, generate data] | ||
1136 | GAUGER("YOUR_MODULE", "METRIC_NAME", (float)value, "UNIT"); | ||
1137 | @} | ||
1138 | @end example | ||
1139 | |||
1140 | |||
1141 | Where: | ||
1142 | @table @asis | ||
1143 | |||
1144 | @item @strong{YOUR_MODULE} is a category in the Gauger page and should be the | ||
1145 | name of the module or subsystem like "Core" or "DHT" | ||
1146 | @item @strong{METRIC_NAME} is | ||
1147 | the name of the metric being collected and should be concise and descriptive, | ||
1148 | like "PUT operations in sqlite-datastore". | ||
1149 | @item @strong{value} is the value | ||
1150 | of the metric that is logged for this run. | ||
1151 | @item @strong{UNIT} is the unit in | ||
1152 | which the value is measured, for instance "kb/s" or "kb of RAM/node". | ||
1153 | @end table | ||
1154 | |||
1155 | If you wish to use Gauger for your own project, you can grab a copy of the | ||
1156 | latest stable release or check out Gauger's Subversion repository. | ||
1157 | |||
1158 | @c *************************************************************************** | ||
1159 | @node GNUnet's TESTBED Subsystem | ||
1160 | @section GNUnet's TESTBED Subsystem | ||
1161 | |||
1162 | The TESTBED subsystem facilitates testing and measuring of multi-peer | ||
1163 | deployments on a single host or over multiple hosts. | ||
1164 | |||
1165 | The architecture of the testbed module is divided into the following: | ||
1166 | @itemize @bullet | ||
1167 | |||
1168 | @item Testbed API: An API which is used by the testing driver programs. It | ||
1169 | provides functions for creating, destroying, starting and stopping peers, | ||
1170 | etc. | ||
1171 | |||
1172 | @item Testbed service (controller): A service which is started through the | ||
1173 | Testbed API. This service handles operations to create, destroy, start, stop | ||
1174 | peers, connect them, and modify their configurations. | ||
1175 | |||
1176 | @item Testbed helper: When a controller has to be started on a host, the | ||
1177 | testbed API starts the testbed helper on that host which in turn starts the | ||
1178 | controller. The testbed helper receives a configuration for the controller | ||
1179 | through its stdin and changes it to ensure the controller doesn't run into any | ||
1180 | port conflict on that host. | ||
1181 | @end itemize | ||
1182 | |||
1183 | |||
1184 | The testbed service (controller) is different from the other GNUnet services in | ||
1185 | that it is not started by ARM and is not supposed to be run as a daemon. It is | ||
1186 | started by the testbed API through a testbed helper. In a typical scenario | ||
1187 | involving multiple hosts, a controller is started on each host. Controllers | ||
1188 | take up the actual task of creating peers, starting and stopping them on the | ||
1189 | hosts they run on. | ||
1190 | |||
1191 | When running deployments on a single local host, the testbed API starts the | ||
1192 | testbed helper directly as a child process. When running deployments on remote | ||
1193 | hosts, the testbed API starts testbed helpers on each remote host through a | ||
1194 | remote shell. By default, the testbed API uses SSH as the remote shell. This can | ||
1195 | be changed by setting the environment variable GNUNET_TESTBED_RSH_CMD to the | ||
1196 | required remote shell program. This variable can also contain parameters to be | ||
1197 | passed to the remote shell program. For example:@ @code{@ export | ||
1198 | GNUNET_TESTBED_RSH_CMD="ssh -o BatchMode=yes -o | ||
1199 | NoHostAuthenticationForLocalhost=yes %h"@ }@ The command string also allows | ||
1200 | for substitutions through placemarks which begin with a `%'. At present, the | ||
1201 | following substitutions are supported: | ||
1202 | @itemize @bullet | ||
1203 | @item | ||
1204 | %h: hostname | ||
1205 | @item | ||
1206 | %u: username | ||
1207 | @item | ||
1208 | %p: port | ||
1209 | @end itemize | ||
1210 | |||
1211 | Note that the substitution placemark is replaced only when the corresponding | ||
1212 | field is available and only once. Specifying @code{%u@@%h} does not work. | ||
1213 | If you want to use username substitutions for SSH, use the argument @code{-l} | ||
1214 | before the username substitution, for example: @code{ssh -l %u -p %p %h} | ||
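
The substitution rule described above can be sketched in a few lines of C. This
is only an illustration of the documented behaviour (each placemark is replaced
at most once, and only if the corresponding field is available); the function
@code{sketch_expand_rsh_cmd()} is a hypothetical name and is not part of the
testbed API, whose actual implementation may differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of the GNUnet testbed API.
 * Expands %h, %u and %p in 'tmpl' into 'out'.  A placemark is replaced
 * only once and only if its field is non-NULL; otherwise it is copied
 * literally.  Returns 0 on success, -1 if 'out' is too small. */
int
sketch_expand_rsh_cmd (const char *tmpl, const char *host, const char *user,
                       const char *port, char *out, size_t out_size)
{
  int used_h = 0;
  int used_u = 0;
  int used_p = 0;
  size_t len = 0;

  if (0 == out_size)
    return -1;
  for (const char *c = tmpl; '\0' != *c; c++)
  {
    const char *value = NULL;

    if ('%' == c[0])
    {
      if (('h' == c[1]) && (NULL != host) && (! used_h))
      {
        value = host;
        used_h = 1;
      }
      else if (('u' == c[1]) && (NULL != user) && (! used_u))
      {
        value = user;
        used_u = 1;
      }
      else if (('p' == c[1]) && (NULL != port) && (! used_p))
      {
        value = port;
        used_p = 1;
      }
    }
    if (NULL != value)
    {
      size_t vlen = strlen (value);

      if (len + vlen >= out_size)
        return -1;
      memcpy (out + len, value, vlen);
      len += vlen;
      c++; /* skip the placemark letter; the loop skips the '%' */
      continue;
    }
    if (len + 1 >= out_size)
      return -1;
    out[len++] = *c;
  }
  out[len] = '\0';
  return 0;
}
```

With the template @code{ssh -l %u -p %p %h} and all three fields given, the
sketch yields a complete SSH command line; a second @code{%h} in the same
template would be left untouched.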
1215 | |||
1216 | The testbed API and the helper communicate through the helper's stdin and | ||
1217 | stdout. As the helper is started through a remote shell on remote hosts, any | ||
1218 | output messages from the remote shell interfere with the communication and | ||
1219 | result in a failure while starting the helper. For this reason, it is | ||
1220 | suggested to use flags to make the remote shells produce no output messages and | ||
1221 | to have password-less logins. For the default remote shell, SSH, the default | ||
1222 | options are "-o BatchMode=yes -o NoHostAuthenticationForLocalhost=yes". | ||
1223 | Password-less logins should be ensured by using SSH keys. | ||
1224 | |||
1225 | Since the testbed API executes the remote shell as a non-interactive shell, | ||
1226 | certain scripts like .bashrc and .profile may not be executed. If this is the | ||
1227 | case, the testbed API can be forced to execute an interactive shell by setting | ||
1228 | the environment variable `GNUNET_TESTBED_RSH_CMD_SUFFIX' to a shell program. | ||
1229 | An example could be:@ @code{@ export GNUNET_TESTBED_RSH_CMD_SUFFIX="sh -lc"@ }@ | ||
1230 | The testbed API will then execute the remote shell program as: @code{ | ||
1231 | $GNUNET_TESTBED_RSH_CMD -p $port $dest $GNUNET_TESTBED_RSH_CMD_SUFFIX | ||
1232 | gnunet-helper-testbed } | ||
1233 | |||
1234 | On some systems, problems may arise while starting testbed helpers if GNUnet is | ||
1235 | installed into a custom location since the helper may not be found in the | ||
1236 | standard path. This can be addressed by setting the variable | ||
1237 | `HELPER_BINARY_PATH' to the path of the testbed helper. Testbed API will then | ||
1238 | use this path to start helper binaries both locally and remotely. | ||
1239 | |||
1240 | The testbed API can be accessed by including the "gnunet_testbed_service.h" | ||
1241 | file and linking with -lgnunettestbed. | ||
1242 | |||
1243 | |||
1244 | |||
1245 | @c *************************************************************************** | ||
1246 | @menu | ||
1247 | * Supported Topologies:: | ||
1248 | * Hosts file format:: | ||
1249 | * Topology file format:: | ||
1250 | * Testbed Barriers:: | ||
1251 | * Automatic large-scale deployment of GNUnet in the PlanetLab testbed:: | ||
1252 | * TESTBED Caveats:: | ||
1253 | @end menu | ||
1254 | |||
1255 | @node Supported Topologies | ||
1256 | @subsection Supported Topologies | ||
1257 | |||
1258 | While testing multi-peer deployments, it is often needed that the peers are | ||
1259 | connected in some topology. This requirement is addressed by the function | ||
1260 | @code{GNUNET_TESTBED_overlay_connect()} which connects any given two peers in | ||
1261 | the testbed. | ||
1262 | |||
1263 | The API also provides a helper function | ||
1264 | @code{GNUNET_TESTBED_overlay_configure_topology()} to connect a given set of | ||
1265 | peers in any of the following supported topologies: | ||
1266 | @itemize @bullet | ||
1267 | |||
1268 | @item @code{GNUNET_TESTBED_TOPOLOGY_CLIQUE}: All peers are connected with each | ||
1269 | other | ||
1270 | |||
1271 | @item @code{GNUNET_TESTBED_TOPOLOGY_LINE}: Peers are connected to form a line | ||
1272 | |||
1273 | @item @code{GNUNET_TESTBED_TOPOLOGY_RING}: Peers are connected to form a ring | ||
1274 | topology | ||
1275 | |||
1276 | @item @code{GNUNET_TESTBED_TOPOLOGY_2D_TORUS}: Peers are connected to form a 2 | ||
1277 | dimensional torus topology. If the number of peers is not a perfect square, | ||
1278 | the resulting torus may not have uniform poloidal and toroidal | ||
1279 | lengths | ||
1280 | |||
1281 | @item @code{GNUNET_TESTBED_TOPOLOGY_ERDOS_RENYI}: Topology is generated to form | ||
1282 | a random graph. The number of links to be present should be given | ||
1283 | |||
1284 | @item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD}: Peers are connected to form a | ||
1285 | 2D Torus with some random links among them. The number of random links is to | ||
1286 | be given | ||
1287 | |||
1288 | @item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD_RING}: Peers are connected to | ||
1289 | form a ring with some random links among them. The number of random links is | ||
1290 | to be given | ||
1291 | |||
1292 | @item @code{GNUNET_TESTBED_TOPOLOGY_SCALE_FREE}: Connects peers in a topology | ||
1293 | where peer connectivity follows a power law: new peers are connected with high | ||
1294 | probability to well-connected peers. See Emergence of Scaling in Random | ||
1295 | Networks. Science 286, 509-512, 1999. | ||
1296 | |||
1297 | @item @code{GNUNET_TESTBED_TOPOLOGY_FROM_FILE}: The topology information is | ||
1298 | loaded from a file. The path to the file has to be given. See Topology file | ||
1299 | format for the format of this file. | ||
1300 | |||
1301 | @item @code{GNUNET_TESTBED_TOPOLOGY_NONE}: No topology | ||
1302 | @end itemize | ||
1303 | |||
1304 | |||
1305 | The above supported topologies can be specified respectively by setting the | ||
1306 | variable @code{OVERLAY_TOPOLOGY} to the following values in the configuration | ||
1307 | passed to Testbed API functions @code{GNUNET_TESTBED_test_run()} and | ||
1308 | @code{GNUNET_TESTBED_run()}: | ||
1309 | @itemize @bullet | ||
1310 | @item @code{CLIQUE} | ||
1311 | @item @code{LINE} | ||
1312 | @item @code{RING} | ||
1313 | @item @code{2D_TORUS} | ||
1314 | @item @code{RANDOM} | ||
1315 | @item @code{SMALL_WORLD} | ||
1316 | @item @code{SMALL_WORLD_RING} | ||
1317 | @item @code{SCALE_FREE} | ||
1318 | @item @code{FROM_FILE} | ||
1319 | @item @code{NONE} | ||
1320 | @end itemize | ||
1321 | |||
1322 | |||
1323 | Topologies @code{RANDOM}, @code{SMALL_WORLD} and @code{SMALL_WORLD_RING} | ||
1324 | require the option @code{OVERLAY_RANDOM_LINKS} to be set to the number of | ||
1325 | random links to be generated in the configuration. The option will be ignored | ||
1326 | for the rest of the topologies. | ||
1327 | |||
1328 | Topology @code{SCALE_FREE} requires the option @code{SCALE_FREE_TOPOLOGY_CAP} | ||
1329 | to be set to the maximum number of peers which can connect to a peer and | ||
1330 | @code{SCALE_FREE_TOPOLOGY_M} to be set to the minimum number of peers a peer | ||
1331 | should be connected to. | ||
1332 | |||
1333 | Similarly, the topology @code{FROM_FILE} requires the option | ||
1334 | @code{OVERLAY_TOPOLOGY_FILE} to contain the path of the file containing the | ||
1335 | topology information. This option is ignored for the rest of the topologies. | ||
1336 | See Topology file format for the format of this file. | ||
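
To make the shape of the simpler topologies concrete, the following C sketch
enumerates the links of a line, a ring and a clique over @code{n} peers. It is
an illustration only: the names @code{sketch_topology}, @code{sketch_link} and
@code{sketch_make_links()} are hypothetical and do not exist in the testbed
API, and the way TESTBED itself generates links may differ (for instance in how
a ring of two peers is treated).

```c
#include <stddef.h>

/* Hypothetical types, NOT part of the GNUnet testbed API. */
enum sketch_topology
{
  SKETCH_LINE,
  SKETCH_RING,
  SKETCH_CLIQUE
};

struct sketch_link
{
  unsigned int a;
  unsigned int b;
};

/* Writes the links for 'n' peers (ids 0 .. n-1) into 'links' and returns
 * how many were written.  The caller must provide room for n*(n-1)/2
 * entries, which is enough for every topology handled here. */
size_t
sketch_make_links (enum sketch_topology t, unsigned int n,
                   struct sketch_link *links)
{
  size_t cnt = 0;

  if (n < 2)
    return 0;
  switch (t)
  {
  case SKETCH_LINE:
  case SKETCH_RING:
    /* chain peer i to peer i+1 */
    for (unsigned int i = 0; i + 1 < n; i++)
      links[cnt++] = (struct sketch_link) { i, i + 1 };
    /* a ring additionally closes the chain (assumed: only for n > 2,
       to avoid a duplicate link between two peers) */
    if ((SKETCH_RING == t) && (n > 2))
      links[cnt++] = (struct sketch_link) { n - 1, 0 };
    break;
  case SKETCH_CLIQUE:
    /* every unordered pair of peers is linked once */
    for (unsigned int i = 0; i < n; i++)
      for (unsigned int j = i + 1; j < n; j++)
        links[cnt++] = (struct sketch_link) { i, j };
    break;
  }
  return cnt;
}
```

For five peers this gives 4 links for a line, 5 for a ring and 10 for a clique,
which shows how quickly the number of connections grows for dense topologies.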
1337 | |||
1338 | @c *************************************************************************** | ||
1339 | @node Hosts file format | ||
1340 | @subsection Hosts file format | ||
1341 | |||
1342 | The testbed API offers the function @code{GNUNET_TESTBED_hosts_load_from_file()} | ||
1343 | to load from a given file details about the hosts which testbed can use for | ||
1344 | deploying peers. This function is useful to keep the data about hosts separate | ||
1345 | instead of hard-coding it. | ||
1346 | |||
1347 | Another helper function from the testbed API, @code{GNUNET_TESTBED_run()}, also takes a | ||
1348 | hosts file name as its parameter. It uses the above function to populate the | ||
1349 | hosts data structures and start controllers to deploy peers. | ||
1350 | |||
1351 | These functions require the hosts file to be of the following format: | ||
1352 | @itemize @bullet | ||
1353 | @item Each line is interpreted to have details about a host | ||
1354 | @item Host details should include the username to use for logging into the | ||
1355 | host, the hostname of the host and the port number to use for the remote shell | ||
1356 | program. All three values should be given. | ||
1357 | @item These details should be given in the following format: | ||
1358 | @code{<username>@@<hostname>:<port>} | ||
1359 | @end itemize | ||
1360 | |||
1361 | Note that having canonical hostnames may cause problems while resolving the IP | ||
1362 | addresses (See this bug). Hence it is advised to provide the hosts' numerical | ||
1363 | IP addresses as hostnames whenever possible. | ||
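
As an illustration of the line format above, a single hosts file entry could be
parsed as in the following C sketch. The type @code{sketch_host} and the
function @code{sketch_parse_host_line()} are hypothetical and only mirror the
documented format; they are not the parser used by
@code{GNUNET_TESTBED_hosts_load_from_file()}. The sketch assumes hostnames
without colons, so literal IPv6 addresses are not handled.

```c
#include <stdio.h>

/* Hypothetical type, NOT part of the GNUnet testbed API. */
struct sketch_host
{
  char username[64];
  char hostname[256];
  unsigned int port;
};

/* Parses one line of the form <username>@<hostname>:<port>.
 * Returns 0 on success and -1 if a field is missing, the port is out of
 * range, or unexpected text follows the port. */
int
sketch_parse_host_line (const char *line, struct sketch_host *h)
{
  char trailing;
  int n;

  /* The final " %c" only matches if non-blank text follows the port. */
  n = sscanf (line, "%63[^@ \t\n]@%255[^: \t\n]:%5u %c",
              h->username, h->hostname, &h->port, &trailing);
  if (3 != n)
    return -1;
  if ((0 == h->port) || (h->port > 65535))
    return -1;
  return 0;
}
```

A line lacking the username or the port is rejected, matching the requirement
that all three values be given.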
1364 | |||
1365 | @c *************************************************************************** | ||
1366 | @node Topology file format | ||
1367 | @subsection Topology file format | ||
1368 | |||
1369 | A topology file describes how peers are to be connected. It should adhere to | ||
1370 | the following format for testbed to parse it correctly. | ||
1371 | |||
1372 | Each line should begin with the target peer id. This should be followed by a | ||
1373 | colon (`:') and origin peer ids separated by `|'. All spaces except for newline | ||
1374 | characters are ignored. The API will then try to connect each origin peer to | ||
1375 | the target peer. | ||
1376 | |||
1377 | For example, the following file will result in 5 overlay connections: [2->1], | ||
1378 | [3->1], [4->3], [0->3], [2->0]:@ @code{@ 1:2|3@ 3:4| 0@ 0: 2@ } | ||
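
The line format can be illustrated with a small parser for a single topology
line. This is a sketch under assumptions, not the parser used by TESTBED: the
function @code{sketch_parse_topology_line()} is hypothetical, it skips blanks
and tabs around ids and separators only, and it does not check for numeric
overflow.

```c
#include <ctype.h>
#include <stdlib.h>

/* Skip blanks and tabs (but not newlines). */
static const char *
sketch_skip_blanks (const char *s)
{
  while ((' ' == *s) || ('\t' == *s))
    s++;
  return s;
}

/* Read one decimal peer id; returns the position after the id and any
 * following blanks, or NULL if no digit was found. */
static const char *
sketch_read_id (const char *s, unsigned int *id)
{
  char *end;

  s = sketch_skip_blanks (s);
  if (! isdigit ((unsigned char) *s))
    return NULL;
  *id = (unsigned int) strtoul (s, &end, 10);
  return sketch_skip_blanks (end);
}

/* Hypothetical helper, NOT part of the GNUnet testbed API.
 * Parses "target:origin|origin|..." and stores up to 'max_origins'
 * origin ids.  Returns the number of origins, or -1 on a syntax error. */
int
sketch_parse_topology_line (const char *line, unsigned int *target,
                            unsigned int *origins, size_t max_origins)
{
  size_t n = 0;
  const char *s = sketch_read_id (line, target);

  if ((NULL == s) || (':' != *s))
    return -1;
  do
  {
    if (n == max_origins)
      return -1;
    s = sketch_read_id (s + 1, &origins[n]); /* s + 1 skips ':' or '|' */
    if (NULL == s)
      return -1;
    n++;
  } while ('|' == *s);
  if (('\0' != *s) && ('\n' != *s))
    return -1;
  return (int) n;
}
```

Applied to the three lines of the example file, the sketch reports two origins
for target 1, two for target 3 and one for target 0, that is, five connections
in total.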
1379 | |||
1380 | @c *************************************************************************** | ||
1381 | @node Testbed Barriers | ||
1382 | @subsection Testbed Barriers | ||
1383 | |||
1384 | The testbed subsystem's barriers API facilitates coordination among the peers | ||
1385 | run by the testbed and the experiment driver. The concept is similar to the | ||
1386 | barrier synchronisation mechanism found in parallel programming or | ||
1387 | multi-threading paradigms - a peer waits at a barrier upon reaching it until | ||
1388 | the barrier is reached by a predefined number of peers. This predefined number | ||
1389 | of peers required to cross a barrier is also called the quorum. We say a peer has | ||
1390 | reached a barrier if the peer is waiting for the barrier to be crossed. | ||
1391 | Similarly a barrier is said to be reached if the required quorum of peers reach | ||
1392 | the barrier. A barrier which is reached is deemed as crossed after all the | ||
1393 | peers waiting on it are notified. | ||
1394 | |||
1395 | The barriers API provides the following functions: | ||
1396 | @itemize @bullet | ||
1397 | @item @strong{@code{GNUNET_TESTBED_barrier_init()}:} function to initialise a | ||
1398 | barrier in the experiment | ||
1399 | @item @strong{@code{GNUNET_TESTBED_barrier_cancel()}:} function to cancel a | ||
1400 | barrier which has been initialised before | ||
1401 | @item @strong{@code{GNUNET_TESTBED_barrier_wait()}:} function to signal barrier | ||
1402 | service that the caller has reached a barrier and is waiting for it to be | ||
1403 | crossed | ||
1404 | @item @strong{@code{GNUNET_TESTBED_barrier_wait_cancel()}:} function to stop | ||
1405 | waiting for a barrier to be crossed | ||
1406 | @end itemize | ||
1407 | |||
1408 | |||
1409 | Among the above functions, the first two, namely | ||
1410 | @code{GNUNET_TESTBED_barrier_init()} and @code{GNUNET_TESTBED_barrier_cancel()} | ||
1411 | are used by experiment drivers. All barriers should be initialised by the | ||
1412 | experiment driver by calling @code{GNUNET_TESTBED_barrier_init()}. This | ||
1413 | function takes a name to identify the barrier, the quorum required for the | ||
1414 | barrier to be crossed and a notification callback for notifying the experiment | ||
1415 | driver when the barrier is crossed. @code{GNUNET_TESTBED_barrier_cancel()} | ||
1416 | cancels an initialised barrier and frees the resources allocated for it. This | ||
1417 | function can be called upon an initialised barrier before it is crossed. | ||
1418 | |||
1419 | The remaining two functions @code{GNUNET_TESTBED_barrier_wait()} and | ||
1420 | @code{GNUNET_TESTBED_barrier_wait_cancel()} are used in the peer's processes. | ||
1421 | @code{GNUNET_TESTBED_barrier_wait()} connects to the local barrier service | ||
1422 | running on the same host the peer is running on and registers that the caller | ||
1423 | has reached the barrier and is waiting for the barrier to be crossed. Note that | ||
1424 | this function can only be used by peers which are started by testbed as this | ||
1425 | function tries to access the local barrier service which is part of the testbed | ||
1426 | controller service. Calling @code{GNUNET_TESTBED_barrier_wait()} on an | ||
1427 | uninitialised barrier results in failure. | ||
1428 | @code{GNUNET_TESTBED_barrier_wait_cancel()} cancels the notification registered | ||
1429 | by @code{GNUNET_TESTBED_barrier_wait()}. | ||
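
The notion of a quorum can be pictured with a minimal counter. The following C
sketch models only the bookkeeping described above, with the quorum expressed
as a number of peers as in the text; it involves no messaging, and the names
@code{sketch_barrier}, @code{sketch_barrier_init()} and
@code{sketch_barrier_wait()} are hypothetical stand-ins, not the signatures of
the real @code{GNUNET_TESTBED_barrier_*()} functions.

```c
/* Hypothetical model of a barrier, NOT the GNUnet testbed API. */
struct sketch_barrier
{
  unsigned int quorum;   /* peers required for the barrier to be reached */
  unsigned int nwaiting; /* peers currently waiting at the barrier */
};

/* Initialise a barrier with the given quorum. */
void
sketch_barrier_init (struct sketch_barrier *b, unsigned int quorum)
{
  b->quorum = quorum;
  b->nwaiting = 0;
}

/* Register one more waiting peer.  Returns 1 once the quorum of waiting
 * peers is met (the barrier is reached), 0 otherwise. */
int
sketch_barrier_wait (struct sketch_barrier *b)
{
  b->nwaiting++;
  return (b->nwaiting >= b->quorum) ? 1 : 0;
}
```

With a quorum of three, the first two calls to @code{sketch_barrier_wait()}
return 0 and the third returns 1; in the real system the waiting peers are then
notified, at which point the barrier counts as crossed.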
1430 | |||
1431 | |||
1432 | @c *************************************************************************** | ||
1433 | @menu | ||
1434 | * Implementation:: | ||
1435 | @end menu | ||
1436 | |||
1437 | @node Implementation | ||
1438 | @subsubsection Implementation | ||
1439 | |||
1440 | Since barriers involve coordination between experiment driver and peers, the | ||
1441 | barrier service in the testbed controller is split into two components. The | ||
1442 | first component responds to the message generated by the barrier API used by | ||
1443 | the experiment driver (functions @code{GNUNET_TESTBED_barrier_init()} and | ||
1444 | @code{GNUNET_TESTBED_barrier_cancel()}) and the second component to the | ||
1445 | messages generated by barrier API used by peers (functions | ||
1446 | @code{GNUNET_TESTBED_barrier_wait()} and | ||
1447 | @code{GNUNET_TESTBED_barrier_wait_cancel()}). | ||
1448 | |||
1449 | Calling @code{GNUNET_TESTBED_barrier_init()} sends a | ||
1450 | @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_INIT} message to the master | ||
1451 | controller. The master controller then registers a barrier and calls | ||
1452 | @code{GNUNET_TESTBED_barrier_init()} for each of its subcontrollers. In this way | ||
1453 | barrier initialisation is propagated to the controller hierarchy. While | ||
1454 | propagating initialisation, any errors at a subcontroller such as timeout | ||
1455 | during further propagation are reported up the hierarchy back to the experiment | ||
1456 | driver. | ||
1457 | |||
1458 | Similar to @code{GNUNET_TESTBED_barrier_init()}, | ||
1459 | @code{GNUNET_TESTBED_barrier_cancel()} propagates a | ||
1460 | @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_CANCEL} message which causes | ||
1461 | controllers to remove an initialised barrier. | ||
1462 | |||
1463 | The second component is implemented as a separate service in the binary | ||
1464 | `gnunet-service-testbed' which already has the testbed controller service. | ||
1465 | Although this deviates from the GNUnet process architecture of having one | ||
1466 | service per binary, it is needed in this case as this component needs access to | ||
1467 | barrier data created by the first component. This component responds to | ||
1468 | @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages from local peers when | ||
1469 | they call @code{GNUNET_TESTBED_barrier_wait()}. Upon receiving | ||
1470 | @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} message, the service checks if | ||
1471 | the requested barrier has been initialised before and if it was not | ||
1472 | initialised, an error status is sent through | ||
1473 | @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to the local peer and | ||
1474 | the connection from the peer is terminated. If the barrier was initialised | ||
1475 | before, the barrier's counter for reached peers is incremented and a | ||
1476 | notification is registered to notify the peer when the barrier is reached. The | ||
1477 | connection from the peer is left open. | ||
1478 | |||
1479 | When enough peers required to attain the quorum send | ||
1480 | @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages, the controller sends | ||
1481 | a @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to its parent | ||
1482 | informing that the barrier is crossed. If the controller has started further | ||
1483 | subcontrollers, it delays this message until it receives a similar notification | ||
1484 | from each of those subcontrollers. Finally, the barriers API at the experiment | ||
1485 | driver receives the @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} when the | ||
1486 | barrier is reached at all the controllers. | ||
1487 | |||
1488 | The barriers API at the experiment driver responds to the | ||
1489 | @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message by echoing it back to | ||
1490 | the master controller and notifying the experiment driver through the | ||
1491 | notification callback that a barrier has been crossed. The echoed | ||
1492 | @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message is propagated by the | ||
1493 | master controller to the controller hierarchy. This propagation triggers the | ||
1494 | notifications registered by peers at each of the controllers in the hierarchy. | ||
1495 | Note the difference between this downward propagation of the | ||
1496 | @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message from its upward | ||
1497 | propagation --- the upward propagation is needed for ensuring that the barrier | ||
1498 | is reached by all the controllers and the downward propagation is for | ||
1499 | notifying the waiting peers that the barrier is crossed. | ||
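
The two propagation directions can be modelled in a few lines of C. The sketch
below is a simplified, single-process model of the controller hierarchy: the
type @code{sketch_controller} and its functions are hypothetical, no messages
are exchanged, and the echo by the experiment driver is collapsed into a direct
call at the master controller.

```c
#include <stddef.h>

#define SKETCH_MAX_CHILDREN 4

/* Hypothetical model of a controller, NOT the GNUnet implementation. */
struct sketch_controller
{
  struct sketch_controller *parent; /* NULL for the master controller */
  struct sketch_controller *children[SKETCH_MAX_CHILDREN];
  unsigned int nchildren;
  unsigned int children_reached; /* subcontrollers that reported upward */
  unsigned int quorum;           /* local peers required */
  unsigned int nwaiting;         /* local peers waiting */
  int reported;                  /* upward status already sent */
  int crossed;                   /* downward status received */
};

/* Downward propagation: mark the barrier as crossed everywhere.  In the
 * real system this is where waiting local peers would be notified. */
static void
sketch_propagate_down (struct sketch_controller *c)
{
  c->crossed = 1;
  for (unsigned int i = 0; i < c->nchildren; i++)
    sketch_propagate_down (c->children[i]);
}

/* Upward propagation: report only once the local quorum is met and all
 * subcontrollers have reported. */
static void
sketch_check_reached (struct sketch_controller *c)
{
  if (c->reported)
    return;
  if ((c->nwaiting < c->quorum) || (c->children_reached < c->nchildren))
    return;
  c->reported = 1;
  if (NULL == c->parent)
  {
    /* Master controller: the driver would receive the status and echo
       it back, which starts the downward propagation. */
    sketch_propagate_down (c);
    return;
  }
  c->parent->children_reached++;
  sketch_check_reached (c->parent);
}

/* A local peer signals that it waits at the barrier. */
void
sketch_peer_wait (struct sketch_controller *c)
{
  c->nwaiting++;
  sketch_check_reached (c);
}
```

In this model a subcontroller never sets @code{crossed} on its own: even when
its local quorum is met, it waits until the status has travelled up to the
master controller and back down again.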
1500 | |||
1501 | @c *************************************************************************** | ||
1502 | @node Automatic large-scale deployment of GNUnet in the PlanetLab testbed | ||
1503 | @subsection Automatic large-scale deployment of GNUnet in the PlanetLab testbed | ||
1504 | |||
1505 | PlanetLab is a testbed for computer networking and distributed systems | ||
1506 | research. It was established in 2002 and as of June 2010 was composed of 1090 | ||
1507 | nodes at 507 sites worldwide. | ||
1508 | |||
1509 | To automate GNUnet deployment, we created a set of automation tools that | ||
1510 | simplify large-scale deployment. We provide a set of scripts you can use to deploy | ||
1511 | GNUnet on a set of nodes and manage your installation. | ||
1512 | |||
1513 | Please also check @uref{https://gnunet.org/installation-fedora8-svn} and@ | ||
1514 | @uref{https://gnunet.org/installation-fedora12-svn} to find detailed | ||
1515 | instructions on how to install GNUnet on a PlanetLab node. | ||
1516 | |||
1517 | |||
1518 | @c *************************************************************************** | ||
1519 | @menu | ||
1520 | * PlanetLab Automation for Fedora8 nodes:: | ||
1521 | * Install buildslave on PlanetLab nodes running fedora core 8:: | ||
1522 | * Setup a new PlanetLab testbed using GPLMT:: | ||
1523 | * Why do i get an ssh error when using the regex profiler?:: | ||
1524 | @end menu | ||
1525 | |||
1526 | @node PlanetLab Automation for Fedora8 nodes | ||
1527 | @subsubsection PlanetLab Automation for Fedora8 nodes | ||
1528 | |||
1529 | @c *************************************************************************** | ||
1530 | @node Install buildslave on PlanetLab nodes running fedora core 8 | ||
1531 | @subsubsection Install buildslave on PlanetLab nodes running fedora core 8 | ||
1532 | @c ** Actually this is a subsubsubsection, but must be fixed differently | ||
1533 | @c ** as subsubsection is the lowest. | ||
1534 | |||
1535 | Since most of the PlanetLab nodes are running the very old fedora core 8 image, | ||
1536 | installing the buildslave software is quite some pain. For our PlanetLab | ||
1537 | testbed we figured out how to install the buildslave software best. | ||
1538 | |||
1539 | Install Distribute for python:@ @code{@ curl | ||
1540 | http://python-distribute.org/distribute_setup.py | sudo python@ } | ||
1541 | |||
1542 | Install zope.interface <= 3.8.0 (4.0 and 4.0.1 will not work):@ | ||
1543 | @code{@ wget | ||
1544 | http://pypi.python.org/packages/source/z/zope.interface/zope.interface-3.8.0.tar.gz@ | ||
1545 | tar xvfz zope.interface-3.8.0.tar.gz@ cd zope.interface-3.8.0@ sudo python | ||
1546 | setup.py install@ } | ||
1547 | |||
1548 | Install the buildslave software (0.8.6 was the latest version):@ @code{@ wget | ||
1549 | http://buildbot.googlecode.com/files/buildbot-slave-0.8.6p1.tar.gz@ tar xvfz | ||
1550 | buildbot-slave-0.8.6p1.tar.gz@ cd buildbot-slave-0.8.6p1@ sudo python setup.py | ||
1551 | install@ } | ||
1552 | |||
1553 | The setup will download the matching twisted package and install it.@ It will | ||
1554 | also try to install the latest version of zope.interface which will fail to | ||
1555 | install. Buildslave will work anyway since version 3.8.0 was installed before! | ||
1556 | |||
1557 | @c *************************************************************************** | ||
1558 | @node Setup a new PlanetLab testbed using GPLMT | ||
1559 | @subsubsection Setup a new PlanetLab testbed using GPLMT | ||
1560 | |||
1561 | @itemize @bullet | ||
1562 | @item Get a new slice and assign nodes | ||
1563 | Ask your PlanetLab PI to give you a new slice and assign the nodes you need | ||
1564 | @item Install a buildmaster | ||
1565 | You can stick to the buildbot documentation:@ | ||
1566 | @uref{http://buildbot.net/buildbot/docs/current/manual/installation.html} | ||
1567 | @item Install the buildslave software on all nodes | ||
1568 | To install the buildslave on all nodes assigned to your slice you can use the | ||
1569 | tasklist @code{install_buildslave_fc8.xml} provided with GPLMT: | ||
1570 | |||
1571 | @code{@ ./gplmt.py -c contrib/tumple_gnunet.conf -t | ||
1572 | contrib/tasklists/install_buildslave_fc8.xml -a -p <planetlab password>@ } | ||
1573 | |||
1574 | @item Create the buildmaster configuration and the slave setup commands | ||
1575 | |||
1576 | The master and the slaves need to have credentials and the master | ||
1577 | has to have all nodes configured. This can be done with the | ||
1578 | @code{create_buildbot_configuration.py} script in the @code{scripts} directory | ||
1579 | |||
1580 | This script takes a list of nodes retrieved directly from PlanetLab or read | ||
1581 | from a file and a configuration template and creates:@ | ||
1582 | - a tasklist which can be executed with gplmt to setup the slaves@ | ||
1583 | - a master.cfg file containing the PlanetLab nodes | ||
1584 | |||
1585 | A configuration template is included in the <contrib> directory. Most | ||
1586 | importantly, the script replaces the following tags in the template: | ||
1587 | |||
1588 | %GPLMT_BUILDER_DEFINITION :@ GPLMT_BUILDER_SUMMARY@ GPLMT_SLAVES@ | ||
1589 | %GPLMT_SCHEDULER_BUILDERS | ||
1590 | |||
1591 | Create configuration for all nodes assigned to a slice:@ @code{@ | ||
1592 | ./create_buildbot_configuration.py -u <planetlab username> -p <planetlab | ||
1593 | password> -s <slice> -m <buildmaster+port> -t <template>@ }@ Create | ||
1594 | configuration for some nodes in a file:@ @code{@ | ||
1595 | ./create_buildbot_configuration.py -f <node_file> -m <buildmaster+port> -t | ||
1596 | <template>@ } | ||
1597 | |||
1598 | @item Copy the @code{master.cfg} to the buildmaster and start it | ||
1599 | Use @code{buildbot start <basedir>} to start the server | ||
1600 | @item Setup the buildslaves | ||
1601 | @end itemize | ||
1602 | |||
1603 | @c *************************************************************************** | ||
1604 | @node Why do i get an ssh error when using the regex profiler? | ||
1605 | @subsubsection Why do i get an ssh error when using the regex profiler? | ||
1606 | |||
1607 | Why do I get an ssh error "Permission denied (publickey,password)." when using | ||
1608 | the regex profiler although passwordless ssh to localhost works using publickey | ||
1609 | and ssh-agent? | ||
1610 | |||
1611 | You have to generate a public/private-key pair with no password:@ | ||
1612 | @code{ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_localhost}@ | ||
1613 | and then add the following to your ~/.ssh/config file: | ||
1614 | |||
1615 | @code{Host 127.0.0.1@ IdentityFile ~/.ssh/id_localhost} | ||
1616 | |||
1617 | Now make sure your hosts file looks like:@ | ||
1618 | |||
1619 | [USERNAME]@@127.0.0.1:22@ | ||
1620 | [USERNAME]@@127.0.0.1:22 | ||
1621 | |||
1622 | You can test your setup by running `ssh 127.0.0.1` in a terminal and then in | ||
1623 | the opened session run it again. If you were not asked for a password on either | ||
1624 | login, then you should be good to go. | ||
1625 | |||
1626 | @c *************************************************************************** | ||
1627 | @node TESTBED Caveats | ||
1628 | @subsection TESTBED Caveats | ||
1629 | |||
1630 | This section documents a few caveats when using the GNUnet testbed | ||
1631 | subsystem. | ||
1632 | |||
1633 | |||
1634 | @c *************************************************************************** | ||
1635 | @menu | ||
1636 | * CORE must be started:: | ||
1637 | * ATS must want the connections:: | ||
1638 | @end menu | ||
1639 | |||
1640 | @node CORE must be started | ||
1641 | @subsubsection CORE must be started | ||
1642 | |||
1643 | A simple issue is #3993: Your configuration MUST somehow ensure that for each | ||
1644 | peer the CORE service is started when the peer is set up, otherwise TESTBED may | ||
1645 | fail to connect peers when the topology is initialized, as TESTBED will start | ||
1646 | some CORE services but not necessarily all (but it relies on all of them | ||
1647 | running). The easiest way is to set 'FORCESTART = YES' in the '[core]' section | ||
1648 | of the configuration file. Alternatively, having any service that directly or | ||
1649 | indirectly depends on CORE being started with FORCESTART will also do. This | ||
1650 | issue largely arises if users try to over-optimize by not starting any services | ||
1651 | with FORCESTART. | ||
1652 | |||
1653 | @c *************************************************************************** | ||
1654 | @node ATS must want the connections | ||
1655 | @subsubsection ATS must want the connections | ||
1656 | |||
1657 | When TESTBED sets up connections, it only offers the respective HELLO | ||
1658 | information to the TRANSPORT service. It is then up to the ATS service to | ||
1659 | @strong{decide} to use the connection. The ATS service will typically eagerly | ||
1660 | establish any connection if the number of total connections is low (relative to | ||
1661 | bandwidth). Details may further depend on the specific ATS backend that was | ||
1662 | configured. If ATS decides to NOT establish a connection (even though TESTBED | ||
1663 | provided the required information), then that connection will count as failed | ||
1664 | for TESTBED. Note that you can configure TESTBED to tolerate a certain number | ||
1665 | of connection failures (see '-e' option of gnunet-testbed-profiler). This issue | ||
1666 | largely arises for dense overlay topologies, especially if you try to create | ||
1667 | cliques with more than 20 peers. | ||
1668 | |||
1669 | @c *************************************************************************** | ||
1670 | @node libgnunetutil | ||
1671 | @section libgnunetutil | ||
1672 | |||
1673 | libgnunetutil is the fundamental library that all GNUnet code builds upon. | ||
1674 | Ideally, this library should contain most of the platform dependent code | ||
1675 | (except for user interfaces and really special needs that only few applications | ||
1676 | have). It is also supposed to offer basic services that most if not all GNUnet | ||
1677 | binaries require. The code of libgnunetutil is in the src/util/ directory. The | ||
1678 | public interface to the library is in the gnunet_util_lib.h header. The functions | ||
1679 | provided by libgnunetutil fall roughly into the following categories (in | ||
1680 | roughly the order of importance for new developers): | ||
1681 | @itemize @bullet | ||
1682 | @item logging (common_logging.c) | ||
1683 | @item memory allocation (common_allocation.c) | ||
1684 | @item endianness conversion (common_endian.c) | ||
1685 | @item internationalization (common_gettext.c) | ||
1686 | @item String manipulation (string.c) | ||
1687 | @item file access (disk.c) | ||
1688 | @item buffered disk IO (bio.c) | ||
1689 | @item time manipulation (time.c) | ||
1690 | @item configuration parsing (configuration.c) | ||
1691 | @item command-line handling (getopt*.c) | ||
1692 | @item cryptography (crypto_*.c) | ||
1693 | @item data structures (container_*.c) | ||
1694 | @item CPS-style scheduling (scheduler.c) | ||
1695 | @item Program initialization (program.c) | ||
1696 | @item Networking (network.c, client.c, server*.c, service.c) | ||
1697 | @item message queueing (mq.c) | ||
1698 | @item bandwidth calculations (bandwidth.c) | ||
1699 | @item Other OS-related (os*.c, plugin.c, signal.c) | ||
1700 | @item Pseudonym management (pseudonym.c) | ||
1701 | @end itemize | ||
1702 | |||
1703 | It should be noted that only developers who fully understand this entire API | ||
1704 | will be able to write good GNUnet code. | ||
1705 | |||
1706 | Ideally, porting GNUnet should only require porting the gnunetutil library. | ||
1707 | More testcases for the gnunetutil APIs are therefore a great way to make | ||
1708 | porting of GNUnet easier. | ||
1709 | |||
1710 | @menu | ||
1711 | * Logging:: | ||
1712 | * Interprocess communication API (IPC):: | ||
1713 | * Cryptography API:: | ||
1714 | * Message Queue API:: | ||
1715 | * Service API:: | ||
1716 | * Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps:: | ||
1717 | * The CONTAINER_MDLL API:: | ||
1718 | @end menu | ||
1719 | |||
1720 | @c *************************************************************************** | ||
1721 | @node Logging | ||
1722 | @subsection Logging | ||
1723 | |||
1724 | GNUnet is able to log its activity, mostly for the purposes of debugging the | ||
1725 | program at various levels. | ||
1726 | |||
1727 | @code{gnunet_common.h} defines several @strong{log levels}: | ||
1728 | @table @asis | ||
1729 | |||
1730 | @item ERROR for errors (really problematic situations, often leading to | ||
1731 | crashes) | ||
1732 | @item WARNING for warnings (troubling situations that might have | ||
1733 | negative consequences, although not fatal) | ||
1734 | @item INFO for various information. | ||
1735 | Used somewhat rarely, as GNUnet statistics is used to hold and display most of | ||
1736 | the information that users might find interesting. | ||
1737 | @item DEBUG for debugging. | ||
1738 | Does not produce much output on normal builds, but when extra logging is | ||
1739 | enabled at compile time, a staggering amount of data is output under this | ||
1740 | log level. | ||
1741 | @end table | ||
1742 | |||
1743 | |||
1744 | Normal builds of GNUnet (configured with @code{--enable-logging[=yes]}) are | ||
1745 | supposed to log nothing under DEBUG level. The @code{--enable-logging=verbose} | ||
1746 | configure option can be used to create a build with all logging enabled. | ||
1747 | However, such a build will produce large amounts of log data, which is | ||
1748 | inconvenient when one tries to hunt down a specific problem. | ||
1749 | |||
1750 | To mitigate this problem, GNUnet provides facilities to apply a filter to | ||
1751 | reduce the logs: | ||
1752 | @table @asis | ||
1753 | |||
1754 | @item Logging by default When no log levels are configured in any other way | ||
1755 | (see below), GNUnet will default to the WARNING log level. This mostly applies | ||
1756 | to GNUnet command line utilities, services and daemons; tests will always set | ||
1757 | log level to WARNING or, if @code{--enable-logging=verbose} was passed to | ||
1758 | configure, to DEBUG. The default level is suggested for normal operation. | ||
1759 | @item The -L option Most GNUnet executables accept an "-L loglevel" or | ||
1760 | "--log=loglevel" option. If used, it makes the process set a global log level | ||
1761 | to "loglevel". Thus it is possible to run some processes with -L DEBUG, for | ||
1762 | example, and others with -L ERROR to enable specific settings to diagnose | ||
1763 | problems with a particular process. | ||
1764 | @item Configuration files. Because GNUnet | ||
1765 | service and daemon processes are usually launched by gnunet-arm, it is not | ||
1766 | possible to pass different custom command line options directly to every one of | ||
1767 | them. The options passed to @code{gnunet-arm} only affect gnunet-arm and not | ||
1768 | the rest of GNUnet. However, one can specify a configuration key "OPTIONS" in | ||
1769 | the section that corresponds to a service or a daemon, and put a value of "-L | ||
1770 | loglevel" there. This will make the respective service or daemon set its log | ||
1771 | level to "loglevel" (as the value of OPTIONS will be passed as a command-line | ||
1772 | argument). | ||
1773 | |||
1774 | To specify the same log level for all services without creating separate | ||
1775 | "OPTIONS" entries in the configuration for each one, the user can specify a | ||
1776 | config key "GLOBAL_POSTFIX" in the [arm] section of the configuration file. The | ||
1777 | value of GLOBAL_POSTFIX will be appended to all command lines used by the ARM | ||
1778 | service to run other services. It can contain any option valid for all GNUnet | ||
1779 | commands, thus in particular the "-L loglevel" option. The ARM service itself | ||
1780 | is, however, unaffected by GLOBAL_POSTFIX; to set log level for it, one has to | ||
1781 | specify "OPTIONS" key in the [arm] section. | ||
1782 | @item Environment variables. | ||
1783 | Setting global per-process log levels with "-L loglevel" does not offer | ||
1784 | sufficient log filtering granularity, as one service will call interface | ||
1785 | libraries and supporting libraries of other GNUnet services, potentially | ||
1786 | producing lots of debug log messages from these libraries. Also, changing the | ||
1787 | config file is not always convenient (especially when running the GNUnet test | ||
1788 | suite).@ To fix that, and to allow GNUnet to use different log filtering at | ||
1789 | runtime without re-compiling the whole source tree, the log calls were changed | ||
1790 | to be configurable at run time. To configure them one has to define environment | ||
1791 | variables "GNUNET_FORCE_LOGFILE", "GNUNET_LOG" and/or "GNUNET_FORCE_LOG": | ||
1792 | @itemize @bullet | ||
1793 | |||
1794 | @item "GNUNET_LOG" only affects the logging when no global log level is | ||
1795 | configured by any other means (that is, the process does not explicitly set its | ||
1796 | own log level, there are no "-L loglevel" options on command line or in | ||
1797 | configuration files), and can be used to override the default WARNING log | ||
1798 | level. | ||
1799 | |||
1800 | @item "GNUNET_FORCE_LOG" will completely override any other log configuration | ||
1801 | options given. | ||
1802 | |||
1803 | @item "GNUNET_FORCE_LOGFILE" will completely override the location of the file | ||
1804 | to log messages to. It should contain a relative or absolute file name. Setting | ||
1805 | GNUNET_FORCE_LOGFILE is equivalent to passing "--log-file=logfile" or "-l | ||
1806 | logfile" option (see below). It supports "[]" format in file names, but not | ||
1807 | "@{@}" (see below). | ||
1808 | @end itemize | ||
1809 | |||
1810 | |||
1811 | Because environment variables are inherited by child processes when they are | ||
1812 | launched, starting or re-starting the ARM service with these variables will | ||
1813 | propagate them to all other services. | ||
1814 | |||
1815 | "GNUNET_LOG" and "GNUNET_FORCE_LOG" variables must contain a specially | ||
1816 | formatted @strong{logging definition} string, which looks like this:@ @code{@ | ||
1817 | [component];[file];[function];[from_line[-to_line]];loglevel@emph{[/component...]}@ | ||
1818 | }@ That is, a logging definition consists of definition entries, separated by | ||
1819 | slashes ('/'). If only one entry is present, there is no need to add a slash | ||
1820 | to its end (although it is not forbidden either).@ All definition fields | ||
1821 | (component, file, function, lines and loglevel) are mandatory, but (except for | ||
1822 | the loglevel) they can be empty. An empty field means "match anything". Note | ||
1823 | that even if fields are empty, the semicolon (';') separators must be | ||
1824 | present.@ The loglevel field is mandatory, and must contain one of the log | ||
1825 | level names (ERROR, WARNING, INFO or DEBUG).@ The lines field might contain | ||
1826 | one non-negative number, in which case it matches only one line, or a range | ||
1827 | "from_line-to_line", in which case it matches any line in the interval | ||
1828 | [from_line;to_line] (that is, including both start and end line).@ GNUnet | ||
1829 | mostly defaults component name to the name of the service that is implemented | ||
1830 | in a process ('transport', 'core', 'peerinfo', etc), but logging calls can | ||
1831 | specify custom component names using @code{GNUNET_log_from}.@ File name and | ||
1832 | function name are provided by the compiler (__FILE__ and __FUNCTION__ | ||
1833 | built-ins). | ||
1834 | |||
1835 | Component, file and function fields are interpreted as non-extended regular | ||
1836 | expressions (GNU libc regex functions are used). Matching is case-sensitive, ^ | ||
1837 | and $ will match the beginning and the end of the text. If a field is empty, | ||
1838 | its contents are automatically replaced with a ".*" regular expression, which | ||
1839 | matches anything. Matching is done in the default way, which means that the | ||
1840 | expression matches as long as it's contained anywhere in the string. Thus | ||
1841 | "GNUNET_" will match both "GNUNET_foo" and "BAR_GNUNET_BAZ". Use '^' and/or '$' | ||
1842 | to make sure that the expression matches at the start and/or at the end of the | ||
1843 | string.@ The semicolon (';') can't be escaped, and GNUnet will not use it in | ||
1844 | component names (it can't be used in function names and file names anyway).@ | ||
1845 | |||
1846 | @end table | ||
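The logging definition format described above can be illustrated with a small standalone parser for a single entry. This is only a sketch of the format as documented here, not GNUnet's actual parser; @code{struct LogDef}, @code{take_field} and @code{parse_logdef_entry} are names made up for this example, and splitting a full definition at the slashes is left out.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one parsed definition entry. */
struct LogDef
{
  char component[64];
  char file[64];
  char function[64];
  int from_line;                /* 0 if the lines field was empty */
  int to_line;                  /* large value if the lines field was empty */
  char level[16];
};

/* Copy the text up to the next ';' into dst (may yield an empty string).
   Returns a pointer just past the ';', or NULL if the separator is missing
   or the field does not fit. */
static const char *
take_field (const char *s, char *dst, size_t dst_len)
{
  const char *sep = strchr (s, ';');
  size_t n;

  if (NULL == sep)
    return NULL;
  n = (size_t) (sep - s);
  if (n >= dst_len)
    return NULL;
  memcpy (dst, s, n);
  dst[n] = '\0';
  return sep + 1;
}

/* Parse one entry "component;file;function;lines;loglevel".
   Returns 0 on success, -1 on a formatting error. */
int
parse_logdef_entry (const char *entry, struct LogDef *def)
{
  char lines[32];
  const char *p = entry;

  if (NULL == (p = take_field (p, def->component, sizeof (def->component))))
    return -1;
  if (NULL == (p = take_field (p, def->file, sizeof (def->file))))
    return -1;
  if (NULL == (p = take_field (p, def->function, sizeof (def->function))))
    return -1;
  if (NULL == (p = take_field (p, lines, sizeof (lines))))
    return -1;
  if (('\0' == *p) || (strlen (p) >= sizeof (def->level)))
    return -1;                  /* the loglevel must not be empty */
  strcpy (def->level, p);
  def->from_line = 0;
  def->to_line = 0x7fffffff;
  if ('\0' != lines[0])
  {
    /* "N" matches one line, "N-M" matches the inclusive range */
    int n = sscanf (lines, "%d-%d", &def->from_line, &def->to_line);

    if ((n < 1) || (def->from_line < 0))
      return -1;
    if (1 == n)
      def->to_line = def->from_line;
  }
  return 0;
}
```

An empty component, file or function field is stored as an empty string here; a real matcher would treat it as "match anything".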
1847 | |||
1848 | |||
1849 | Every logging call in GNUnet code will be (at run time) matched against the | ||
1850 | log definitions passed to the process. If the fields of a log definition match | ||
1851 | the call arguments, then the call's log level is compared to the log level of | ||
1852 | that definition. If the call's log level is less than or equal to the definition's | ||
1853 | log level, the call is allowed to proceed. Otherwise the logging call is | ||
1854 | forbidden, and nothing is logged. If no definitions matched at all, GNUnet | ||
1855 | will use the global log level or (if a global log level is not specified) will | ||
1856 | default to WARNING (that is, it will allow the call to proceed, if its level | ||
1857 | is less than or equal to the global log level or to WARNING). | ||
1858 | |||
1859 | That is, definitions are evaluated from left to right, and the first matching | ||
1860 | definition is used to allow or deny the logging call. Thus it is advised to | ||
1861 | place narrow definitions at the beginning of the logdef string, and generic | ||
1862 | definitions at the end. | ||
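The first-match rule can be sketched in a few lines of standalone C. This is an illustration under assumptions, not GNUnet's implementation: @code{enum Level}, @code{struct Rule} and @code{call_allowed} are hypothetical names, the numeric level values are chosen only so that ERROR is the lowest and DEBUG the highest, and empty fields are assumed to have been replaced by ".*" already. Matching uses the POSIX basic (non-extended) regex functions.

```c
#include <stddef.h>
#include <regex.h>

enum Level { LEVEL_ERROR = 0, LEVEL_WARNING = 1, LEVEL_INFO = 2, LEVEL_DEBUG = 3 };

/* Hypothetical pre-parsed definition entry. */
struct Rule
{
  const char *component;
  const char *file;
  const char *function;
  int from_line;
  int to_line;
  enum Level level;
};

/* Unanchored match with a basic regular expression; 1 on match. */
static int
matches (const char *pattern, const char *text)
{
  regex_t rx;
  int ok;

  if (0 != regcomp (&rx, pattern, REG_NOSUB))
    return 0;
  ok = (0 == regexec (&rx, text, 0, NULL, 0));
  regfree (&rx);
  return ok;
}

/* Returns 1 if the logging call may proceed, 0 otherwise. */
int
call_allowed (const struct Rule *rules, size_t n_rules,
              enum Level global_level,
              const char *component, const char *file,
              const char *function, int line, enum Level call_level)
{
  for (size_t i = 0; i < n_rules; i++)
  {
    const struct Rule *r = &rules[i];

    if ((line < r->from_line) || (line > r->to_line))
      continue;
    if (matches (r->component, component) &&
        matches (r->file, file) &&
        matches (r->function, function))
      return call_level <= r->level;    /* the first match decides */
  }
  return call_level <= global_level;    /* no definition matched */
}
```

Because the loop returns at the first matching entry, a generic entry placed first would shadow every narrower entry after it.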
1863 | |||
1864 | Whether a call is allowed or not is only decided the first time this particular | ||
1865 | call is made. The evaluation result is then cached, so that any attempts to | ||
1866 | make the same call later will be allowed or disallowed right away. Because of | ||
1867 | that, run-time log level evaluation should not significantly affect the process | ||
1868 | performance.@ Log definition parsing is only done once, at the first call to | ||
1869 | GNUNET_log_setup () made by the process (which is usually done soon after it | ||
1870 | starts). | ||
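The per-call caching described above can be pictured as one static variable per call site. The following is a minimal sketch of that idea, not the actual GNUnet macro; @code{SKETCH_LOG}, @code{evaluate_call_site} and the @code{evaluations} counter are hypothetical and exist only to make the effect observable.

```c
#include <stdio.h>

/* Hypothetical counter so that the effect of caching can be observed. */
static int evaluations;

/* Stand-in for the comparatively expensive matching against definitions. */
static int
evaluate_call_site (const char *file, const char *function, int line)
{
  (void) file;
  (void) function;
  (void) line;
  evaluations++;
  return 1;                     /* pretend the call is allowed */
}

/* Each expansion gets its own static variable: -1 means "not yet decided". */
#define SKETCH_LOG(msg)                                               \
  do {                                                                \
    static int allowed_ = -1;                                         \
    if (-1 == allowed_)                                               \
      allowed_ = evaluate_call_site (__FILE__, __func__, __LINE__);   \
    if (allowed_)                                                     \
      fprintf (stderr, "%s\n", (msg));                                \
  } while (0)

void
log_three_times (void)
{
  for (int i = 0; i < 3; i++)
    SKETCH_LOG ("hello");       /* evaluated once, logged three times */
}
```

Calling @code{log_three_times} repeatedly never triggers a second evaluation, which is why the run-time filtering stays cheap.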
1871 | |||
1872 | At the moment of writing there is no way to specify logging definitions from | ||
1873 | configuration files, only via environment variables. | ||
1874 | |||
1875 | At the moment GNUnet will stop processing a log definition when it encounters | ||
1876 | an error in definition formatting or an error in regular expression syntax, and | ||
1877 | will not report the failure in any way. | ||
1878 | |||
1879 | |||
1880 | @c *************************************************************************** | ||
1881 | @menu | ||
1882 | * Examples:: | ||
1883 | * Log files:: | ||
1884 | * Updated behavior of GNUNET_log:: | ||
1885 | @end menu | ||
1886 | |||
1887 | @node Examples | ||
1888 | @subsubsection Examples | ||
1889 | |||
1890 | @table @asis | ||
1891 | |||
1892 | @item @code{GNUNET_FORCE_LOG=";;;;DEBUG" gnunet-arm -s} Start GNUnet process | ||
1893 | tree, running all processes with DEBUG level (one should be careful with it, as | ||
1894 | log files will grow at alarming rate!) | ||
1895 | @item @code{GNUNET_FORCE_LOG="core;;;;DEBUG" gnunet-arm -s} Start GNUnet process | ||
1896 | tree, running the core service under DEBUG level (everything else will use | ||
1897 | configured or default level). | ||
1898 | @item @code{GNUNET_FORCE_LOG=";gnunet-service-transport_validation.c;;;DEBUG" gnunet-arm -s} | ||
1899 | Start GNUnet process tree, allowing any logging calls from | ||
1900 | gnunet-service-transport_validation.c (everything else will use configured or | ||
1901 | default level). | ||
1902 | @item @code{GNUNET_FORCE_LOG="fs;gnunet-service-fs_push.c;;;DEBUG" gnunet-arm -s} | ||
1903 | Start GNUnet process tree, allowing any logging calls from | ||
1904 | gnunet-service-fs_push.c (everything else will use configured or default | ||
1905 | level). | ||
1906 | @item @code{GNUNET_FORCE_LOG=";;GNUNET_NETWORK_socket_select;;DEBUG" gnunet-arm -s} | ||
1907 | Start GNUnet process tree, allowing any logging calls from the | ||
1908 | GNUNET_NETWORK_socket_select function (everything else will use configured or | ||
1909 | default level). | ||
1910 | @item @code{GNUNET_FORCE_LOG="transport.*;;.*send.*;;DEBUG/;;;;WARNING" gnunet-arm -s} | ||
1911 | Start GNUnet process tree, allowing any logging calls from the components | ||
1912 | that have "transport" in their names, and are made from functions that have | ||
1913 | "send" in their names. Everything else will be allowed to be logged only if it | ||
1914 | has WARNING level. | ||
1915 | @end table | ||
1916 | |||
1917 | |||
1918 | On Windows, one can use batch files to run GNUnet processes with special | ||
1919 | environment variables, without affecting the whole system. Such a batch file will | ||
1920 | look like this:@ @code{@ set GNUNET_FORCE_LOG=;;do_transmit;;DEBUG@ gnunet-arm | ||
1921 | -s@ }@ (note the absence of double quotes in the environment variable | ||
1922 | definition, as opposed to earlier examples, which use the shell).@ Another | ||
1923 | limitation: on Windows, GNUNET_FORCE_LOGFILE @strong{MUST} be set in order for | ||
1924 | GNUNET_FORCE_LOG to work. | ||
1925 | |||
1926 | |||
1927 | @c *************************************************************************** | ||
1928 | @node Log files | ||
1929 | @subsubsection Log files | ||
1930 | |||
1931 | GNUnet can be told to log everything into a file instead of stderr (which is | ||
1932 | the default) using the "--log-file=logfile" or "-l logfile" option. This option | ||
1933 | can be passed via the command line, or from the "OPTIONS" and "GLOBAL_POSTFIX" | ||
1934 | configuration keys (see above). The file name passed with this option is | ||
1935 | subject to GNUnet filename expansion. If specified in "GLOBAL_POSTFIX", it is | ||
1936 | also subject to ARM service filename expansion, in particular, it may contain | ||
1937 | "@{@}" (left and right curly brace) sequence, which will be replaced by ARM | ||
1938 | with the name of the service. This is used to keep logs from more than one | ||
1939 | service separate, while only specifying one template containing "@{@}" in | ||
1940 | GLOBAL_POSTFIX. | ||
1941 | |||
1942 | As part of a secondary file name expansion, the first occurrence of "[]" | ||
1943 | sequence ("left square brace" followed by "right square brace") in the file | ||
1944 | name will be replaced with the process identifier of the process when it | ||
1945 | initializes its logging subsystem. As a result, all processes will log into | ||
1946 | different files. This is convenient for isolating messages of a particular | ||
1947 | process, and prevents I/O races when multiple processes try to write into the | ||
1948 | file at the same time. This expansion is done independently of "@{@}" | ||
1949 | expansion that ARM service does (see above). | ||
1950 | |||
1951 | The log file name that is specified via "-l" can contain format characters | ||
1952 | from the 'strftime' function family. For example, "%Y" will be replaced with | ||
1953 | the current year. Using "basename-%Y-%m-%d.log" would include the current | ||
1954 | year, month and day in the log file. If a GNUnet process runs for long enough | ||
1955 | to need more than one log file, it will eventually clean up old log files. | ||
1956 | Currently, only the last three log files (plus the current log file) are | ||
1957 | preserved. So once the fifth log file goes into use (so after 4 days if you | ||
1958 | use "%Y-%m-%d" as above), the first log file will be automatically deleted. | ||
1959 | Note that if your log file name only contains "%Y", then log files would be | ||
1960 | kept for 4 years and the logs from the first year would be deleted once year 5 | ||
1961 | begins. If you do not use any date-related string format codes, logs would | ||
1962 | never be automatically deleted by GNUnet. | ||
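The date-based naming can be sketched with the standard @code{strftime} function. This is an illustration only: @code{expand_log_name} and @code{needs_new_file} are hypothetical helpers, and the clean-up of old files is not shown.

```c
#include <string.h>
#include <time.h>

/* Expand strftime format characters in a log file name template.
   Returns the length of the result, or 0 if it did not fit. */
size_t
expand_log_name (char *out, size_t out_len, const char *pattern,
                 const struct tm *when)
{
  return strftime (out, out_len, pattern, when);
}

/* Returns 1 if the name expanded for 'now' differs from the name in use,
   which is the point at which a new log file would be started. */
int
needs_new_file (const char *current_name, const char *pattern,
                const struct tm *now)
{
  char next[256];

  if (0 == expand_log_name (next, sizeof (next), pattern, now))
    return 0;
  return 0 != strcmp (current_name, next);
}
```

With a template such as "basename-%Y-%m-%d.log" the expanded name changes once per day; with only "%Y" it changes once per year, which matches the retention behavior described above.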
1963 | |||
1964 | |||
1965 | @c *************************************************************************** | ||
1966 | |||
1967 | @node Updated behavior of GNUNET_log | ||
1968 | @subsubsection Updated behavior of GNUNET_log | ||
1969 | |||
1970 | It's currently quite common to see constructions like this all over the code: | ||
1971 | @example | ||
#if MESH_DEBUG
GNUNET_log (GNUNET_ERROR_TYPE_DEBUG, "MESH: client disconnected\n");
#endif
1974 | @end example | ||
1975 | |||
1976 | The reason for the #if is not to avoid displaying the message when disabled | ||
1977 | (GNUNET_ERROR_TYPE takes care of that), but to avoid the compiler including it | ||
1978 | in the binary at all, when compiling GNUnet for platforms with restricted | ||
1979 | storage space / memory (MIPS routers, ARM plug computers / dev boards, etc). | ||
1980 | |||
1981 | This presents several problems: the code gets ugly, hard to write and it is | ||
1982 | very easy to forget to include the #if guards, creating non-consistent code. A | ||
1983 | new change in GNUNET_log aims to solve these problems. | ||
1984 | |||
1985 | @strong{This change requires running @code{./configure} with at least | ||
1986 | @code{--enable-logging=verbose} to see debug messages.} | ||
1987 | |||
1988 | Here is an example of code with dense debug statements: | ||
1989 | @example | ||
switch (restrict_topology) @{
case GNUNET_TESTING_TOPOLOGY_CLIQUE:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but clique topology\n"));
#endif
  unblacklisted_connections =
    create_clique (pg, &remove_connections, BLACKLIST, GNUNET_NO);
  break;
case GNUNET_TESTING_TOPOLOGY_SMALL_WORLD_RING:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but small world (ring) topology\n"));
#endif
  unblacklisted_connections =
    create_small_world_ring (pg, &remove_connections, BLACKLIST);
  break;
1999 | @end example | ||
2000 | |||
2001 | |||
2002 | Pretty hard to follow, huh? | ||
2003 | |||
2004 | From now on, it is not necessary to include the #if / #endif statements to | ||
2005 | achieve the same behavior. The GNUNET_log and GNUNET_log_from macros take care | ||
2006 | of it for you, depending on the configure option: | ||
2007 | @itemize @bullet | ||
2008 | @item If @code{--enable-logging} is set to @code{no}, the binary will contain | ||
2009 | no log messages at all. | ||
2010 | @item If @code{--enable-logging} is set to @code{yes}, the binary will contain | ||
2011 | no DEBUG messages, and therefore running with -L DEBUG will have no effect. | ||
2012 | Other messages (ERROR, WARNING, INFO, etc) will be included. | ||
2013 | @item If @code{--enable-logging} is set to @code{verbose}, or | ||
2014 | @code{veryverbose} the binary will contain DEBUG messages (still, it will be | ||
2015 | necessary to run with -L DEBUG or set the DEBUG config option to show them). | ||
2016 | @end itemize | ||
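How a logging macro can make the #if guards unnecessary is sketched below. This is not the GNUnet macro itself: @code{SKETCH_LOG}, @code{SKETCH_EXTRA_LOGGING}, @code{emit} and the @code{messages_logged} counter are hypothetical names, with the build switch standing in for the effect of the configure option.

```c
#include <stdio.h>

enum Level { LEVEL_ERROR, LEVEL_WARNING, LEVEL_INFO, LEVEL_DEBUG };

/* Hypothetical build switch: 0 mimics --enable-logging=yes,
   1 mimics --enable-logging=verbose. */
#ifndef SKETCH_EXTRA_LOGGING
#define SKETCH_EXTRA_LOGGING 0
#endif

/* Hypothetical counter so that the effect can be observed. */
static int messages_logged;

static void
emit (enum Level level, const char *msg)
{
  messages_logged++;
  fprintf (stderr, "[%d] %s\n", (int) level, msg);
}

/* For DEBUG calls the condition is a compile-time constant, so an
   optimizing compiler can drop both the call and its string. */
#define SKETCH_LOG(level, msg)                                        \
  do {                                                                \
    if ((SKETCH_EXTRA_LOGGING > 0) || ((level) != LEVEL_DEBUG))       \
      emit ((level), (msg));                                          \
  } while (0)

void
client_disconnected (void)
{
  SKETCH_LOG (LEVEL_DEBUG, "client disconnected");      /* no #if needed */
  SKETCH_LOG (LEVEL_WARNING, "client left unexpectedly");
}
```

Built with the switch at 0, only the WARNING message is emitted; building with @code{-DSKETCH_EXTRA_LOGGING=1} keeps the DEBUG call as well.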
2017 | |||
2018 | |||
2019 | If you are a developer: | ||
2020 | @itemize @bullet | ||
2021 | @item please make sure that you @code{./configure | ||
2022 | --enable-logging=@{verbose,veryverbose@}}, so you can see DEBUG messages. | ||
2023 | @item please remove the @code{#if} statements around @code{GNUNET_log | ||
2024 | (GNUNET_ERROR_TYPE_DEBUG, ...)} lines, to improve the readability of your code. | ||
2025 | @end itemize | ||
2026 | |||
2027 | Since now activating DEBUG automatically makes it VERBOSE and activates | ||
2028 | @strong{all} debug messages by default, you probably want to use the | ||
2029 | https://gnunet.org/logging functionality to filter only relevant messages. A | ||
2030 | suitable configuration could be:@ @code{$ export | ||
2031 | GNUNET_FORCE_LOG="^YOUR_SUBSYSTEM$;;;;DEBUG/;;;;WARNING"}@ Which will behave | ||
2032 | almost like enabling DEBUG in that subsystem before the change. Of course you | ||
2033 | can adapt it to your particular needs, this is only a quick example. | ||
2034 | |||
2035 | @c *************************************************************************** | ||
2036 | @node Interprocess communication API (IPC) | ||
2037 | @subsection Interprocess communication API (IPC) | ||
2038 | |||
2039 | In GNUnet, a variety of new message types might be defined and used in | ||
2040 | interprocess communication. In this tutorial, we use the @code{struct | ||
2041 | AddressLookupMessage} as an example to introduce how to construct our own | ||
2042 | message type in GNUnet and how to implement the message communication between | ||
2043 | service and client.@ (Here, a client uses the @code{struct | ||
2044 | AddressLookupMessage} as a request to ask the server to return the address of | ||
2045 | any other peer connecting to the service.) | ||
2046 | |||
2047 | |||
2048 | @c *************************************************************************** | ||
2049 | @menu | ||
2050 | * Define new message types:: | ||
2051 | * Define message struct:: | ||
2052 | * Client: Establish connection:: | ||
2053 | * Client: Initialize request message:: | ||
2054 | * Client: Send request and receive response:: | ||
2055 | * Server: Startup service:: | ||
2056 | * Server: Add new handles for specified messages:: | ||
2057 | * Server: Process request message:: | ||
2058 | * Server: Response to client:: | ||
2059 | * Server: Notification of clients:: | ||
2060 | * Conversion between Network Byte Order (Big Endian) and Host Byte Order:: | ||
2061 | @end menu | ||
2062 | |||
2063 | @node Define new message types | ||
2064 | @subsubsection Define new message types | ||
2065 | |||
2066 | First of all, you should define the new message type in | ||
2067 | @code{gnunet_protocols.h}: | ||
2068 | @example | ||
// Request to look up addresses of peers in server.
#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP 29
// Response to the address lookup request.
#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY 30
2073 | @end example | ||
2074 | |||
2075 | @c *************************************************************************** | ||
2076 | @node Define message struct | ||
2077 | @subsubsection Define message struct | ||
2078 | |||
2079 | After the type definition, the specified message structure should also be | ||
2080 | described in the header file, e.g. transport.h in our case. | ||
2081 | @example | ||
GNUNET_NETWORK_STRUCT_BEGIN

struct AddressLookupMessage @{
  struct GNUNET_MessageHeader header;
  int32_t numeric_only GNUNET_PACKED;
  struct GNUNET_TIME_AbsoluteNBO timeout;
  uint32_t addrlen GNUNET_PACKED;
  /* followed by 'addrlen' bytes of the actual address, then
     followed by the 0-terminated name of the transport */
@};
GNUNET_NETWORK_STRUCT_END
2090 | @end example | ||
2091 | |||
2092 | |||
2093 | Please note @code{GNUNET_NETWORK_STRUCT_BEGIN} and @code{GNUNET_PACKED}, which | ||
2094 | together ensure a consistent, padding-free layout when sending structs over the network. | ||
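The effect of packing on the wire layout can be checked with a standalone sketch. It assumes GCC or Clang; @code{SKETCH_PACKED} and the two structs are hypothetical stand-ins that only mirror the shape of the message above, with the timestamp modelled as a plain 64-bit field.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a packing macro (GCC/Clang attribute). */
#define SKETCH_PACKED __attribute__ ((packed))

struct SketchHeader
{
  uint16_t size SKETCH_PACKED;
  uint16_t type SKETCH_PACKED;
};

struct SketchLookupMessage
{
  struct SketchHeader header;
  int32_t numeric_only SKETCH_PACKED;
  uint64_t timeout_nbo SKETCH_PACKED;
  uint32_t addrlen SKETCH_PACKED;
};

/* With every field packed, the size is the plain sum of the field sizes
   (4 + 4 + 8 + 4); no trailing padding is added for the 64-bit member. */
size_t
lookup_message_wire_size (void)
{
  return sizeof (struct SketchLookupMessage);
}

/* Offset at which the address length is found in the byte stream. */
size_t
lookup_message_addrlen_offset (void)
{
  return offsetof (struct SketchLookupMessage, addrlen);
}
```

Without the attribute, a typical 64-bit ABI would pad the struct to a multiple of eight bytes, so sender and receiver could disagree about the message size.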
2095 | |||
2096 | @menu | ||
2097 | @end menu | ||
2098 | |||
2099 | @c *************************************************************************** | ||
2100 | @node Client: Establish connection | ||
2101 | @subsubsection Client: Establish connection | ||
2102 | @c %**end of header | ||
2103 | |||
2104 | |||
2105 | First, on the client side, the underlying API is employed to create a new | ||
2106 | connection to a service; in our example, the client connects to the transport | ||
2107 | service. | ||
2108 | @example | ||
struct GNUNET_CLIENT_Connection *client;
client = GNUNET_CLIENT_connect ("transport", cfg);
2111 | @end example | ||
2112 | |||
2113 | @c *************************************************************************** | ||
2114 | @node Client: Initialize request message | ||
2115 | @subsubsection Client: Initialize request message | ||
2116 | @c %**end of header | ||
2117 | |||
2118 | When the connection is ready, we initialize the message. In this step, all the | ||
2119 | fields of the message should be properly initialized, namely the size, type, | ||
2120 | and some extra user-defined data, such as the timeout, the address and the | ||
2121 | name of the transport. | ||
2122 | @example | ||
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
             + addressLen + strlen (nameTrans) + 1;
/* allocate room for the struct plus the trailing address and name */
msg = GNUNET_malloc (len);
msg->header.size = htons (len);
msg->header.type = htons (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP);
msg->timeout = GNUNET_TIME_absolute_hton (abs_timeout);
msg->addrlen = htonl (addressLen);
char *addrbuf = (char *) &msg[1];
memcpy (addrbuf, address, addressLen);
char *tbuf = &addrbuf[addressLen];
memcpy (tbuf, nameTrans, strlen (nameTrans) + 1);
2130 | @end example | ||
2131 | |||
2132 | Note that here the functions @code{htonl}, @code{htons} and | ||
2133 | @code{GNUNET_TIME_absolute_hton} are applied to convert values from host byte | ||
2134 | order into network byte order (big endian). For the usage of the two byte | ||
2135 | orders and the corresponding conversion functions, please refer to the section | ||
2136 | on conversion between network byte order and host byte order. | ||
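The layout "header, then address bytes, then 0-terminated transport name" can be tried out without GNUnet. The following standalone sketch uses a simplified, hypothetical header (@code{struct SketchHeader}) and a made-up helper @code{build_lookup}; it is not the GNUnet client API.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplified header: both fields travel in network byte order. */
struct SketchHeader
{
  uint16_t size;
  uint16_t type;
};

/* Build "header | address bytes | 0-terminated transport name".
   Returns a malloc()ed buffer (the caller frees it) or NULL on failure. */
void *
build_lookup (uint16_t type, const void *address, uint16_t address_len,
              const char *name_trans, uint16_t *total_len)
{
  size_t slen = strlen (name_trans) + 1;
  size_t len = sizeof (struct SketchHeader) + address_len + slen;
  struct SketchHeader *hdr;
  char *payload;

  if (len > UINT16_MAX)
    return NULL;                /* the size field is only 16 bits wide */
  hdr = malloc (len);
  if (NULL == hdr)
    return NULL;
  hdr->size = htons ((uint16_t) len);
  hdr->type = htons (type);
  payload = (char *) &hdr[1];   /* first byte after the header */
  memcpy (payload, address, address_len);
  memcpy (&payload[address_len], name_trans, slen);
  *total_len = (uint16_t) len;
  return hdr;
}
```

Regardless of the host's byte order, the first two bytes of the buffer hold the total size with the most significant byte first.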
2137 | |||
2138 | @c *************************************************************************** | ||
2139 | @node Client: Send request and receive response | ||
2140 | @subsubsection Client: Send request and receive response | ||
2141 | @c %**end of header | ||
2142 | |||
2143 | FIXME: This is very outdated, see the tutorial for the | ||
2144 | current API! | ||
2145 | |||
2146 | Next, the client would send the constructed message as a request to the service | ||
2147 | and wait for the response from the service. To accomplish this goal, there are | ||
2148 | a number of API calls that can be used. In this example, | ||
2149 | @code{GNUNET_CLIENT_transmit_and_get_response} is chosen as the most | ||
2150 | appropriate function to use. | ||
2151 | @example | ||
GNUNET_CLIENT_transmit_and_get_response (client, &msg->header, timeout,
                                         GNUNET_YES,
                                         &address_response_processor,
                                         arp_ctx);
2155 | @end example | ||
2156 | |||
2157 | The argument @code{address_response_processor} is a function with | ||
2158 | @code{GNUNET_CLIENT_MessageHandler} type, which is used to process the reply | ||
2159 | message from the service. | ||
2160 | |||
2161 | @node Server: Startup service | ||
2162 | @subsubsection Server: Startup service | ||
2163 | |||
2164 | To be able to receive the request message, the server runs a standard GNUnet | ||
2165 | service startup sequence using @code{GNUNET_SERVICE_run}, as follows: | ||
2166 | @example | ||
int
main (int argc, char **argv)
@{
  GNUNET_SERVICE_run (argc, argv, "transport",
                      GNUNET_SERVICE_OPTION_NONE, &run, NULL);
@}
2170 | @end example | ||
2171 | |||
2172 | @c *************************************************************************** | ||
2173 | @node Server: Add new handles for specified messages | ||
2174 | @subsubsection Server: Add new handles for specified messages | ||
2175 | @c %**end of header | ||
2176 | |||
2177 | In the function above, the argument @code{run} is used to initialize the | ||
2178 | transport service, and is defined like this: | ||
2179 | @example | ||
static void
run (void *cls, struct GNUNET_SERVER_Handle *serv,
     const struct GNUNET_CONFIGURATION_Handle *cfg)
@{
  GNUNET_SERVER_add_handlers (serv, handlers);
@}
2183 | @end example | ||
2184 | |||
2185 | |||
2186 | Here, @code{GNUNET_SERVER_add_handlers} must be called in the run function to | ||
2187 | add new handlers in the service. The parameter @code{handlers} is a list of | ||
2188 | @code{struct GNUNET_SERVER_MessageHandler} to tell the service which function | ||
2189 | should be called when a particular type of message is received, and should be | ||
2190 | defined in this way: | ||
2191 | @example | ||
static struct GNUNET_SERVER_MessageHandler handlers[] = @{
  @{&handle_start, NULL, GNUNET_MESSAGE_TYPE_TRANSPORT_START, 0@},
  @{&handle_send, NULL, GNUNET_MESSAGE_TYPE_TRANSPORT_SEND, 0@},
  @{&handle_try_connect, NULL, GNUNET_MESSAGE_TYPE_TRANSPORT_TRY_CONNECT,
   sizeof (struct TryConnectMessage)@},
  @{&handle_address_lookup, NULL,
   GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP, 0@},
  @{NULL, NULL, 0, 0@}
@};
2198 | @end example | ||
2199 | |||
2200 | |||
2201 | As shown, the first member of each entry is a callback function, which is | ||
2202 | called to process messages of the type given as the third member. The second | ||
2203 | member is the closure for the callback function, which is set to @code{NULL} | ||
2204 | in most cases, and the last member is the expected size of messages of this | ||
2205 | type. Usually we set it to 0 to accept variable-size messages; for special | ||
2206 | cases, the exact size of the specified message can be set instead. In | ||
2207 | addition, the array must end with the terminator entry @code{@{NULL, NULL, 0, | ||
2208 | 0@}}. | ||
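How such a handler table can drive message dispatch is sketched below in standalone C. The types and the @code{dispatch} function are hypothetical and only mimic the four-member layout described above; they are not the GNUnet server implementation, and the message type numbers are arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*SketchCallback) (void *cls, const void *msg, uint16_t size);

/* Hypothetical handler entry: callback, closure, type, expected size. */
struct SketchHandler
{
  SketchCallback callback;
  void *callback_cls;
  uint16_t type;
  uint16_t expected_size;       /* 0 accepts any size */
};

static int lookups_seen;

static void
handle_lookup (void *cls, const void *msg, uint16_t size)
{
  (void) cls;
  (void) msg;
  (void) size;
  lookups_seen++;
}

static const struct SketchHandler handlers[] = {
  { &handle_lookup, NULL, 29, 0 },      /* variable-size message */
  { &handle_lookup, NULL, 31, 8 },      /* must be exactly 8 bytes */
  { NULL, NULL, 0, 0 }                  /* terminator */
};

/* Returns 1 if a handler ran, 0 if the type is unknown,
   -1 if the size does not match the expected size. */
int
dispatch (const struct SketchHandler *table, uint16_t type,
          const void *msg, uint16_t size)
{
  for (size_t i = 0; NULL != table[i].callback; i++)
  {
    if (table[i].type != type)
      continue;
    if ((0 != table[i].expected_size) && (size != table[i].expected_size))
      return -1;
    table[i].callback (table[i].callback_cls, msg, size);
    return 1;
  }
  return 0;
}
```

The terminator entry is what lets the loop stop without a separate length argument.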
2209 | |||
2210 | @c *************************************************************************** | ||
2211 | @node Server: Process request message | ||
2212 | @subsubsection Server: Process request message | ||
2213 | @c %**end of header | ||
2214 | |||
2215 | After the initialization of transport service, the request message would be | ||
2216 | processed. Before handling the main message data, the validity of this message | ||
2217 | should be checked, e.g., whether the size of the message is correct. | ||
2218 | @example | ||
size = ntohs (message->size);
if (size < sizeof (struct AddressLookupMessage)) @{
  GNUNET_break_op (0);
  GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
  return;
@}
2222 | @end example | ||
2223 | |||
2224 | |||
2225 | Note that, in contrast to the construction of the request message in the | ||
2226 | client, in the server the functions @code{ntohl} and @code{ntohs} should be | ||
2227 | employed during the extraction of the data from the message, so that the data | ||
2228 | in network byte order (big endian) is converted back into host byte order. For | ||
2229 | details, please refer to the section on conversion between byte orders. | ||
2230 | |||
2231 | Moreover in this example, the name of the transport stored in the message is a | ||
2232 | 0-terminated string, so we should also check whether the name of the transport | ||
2233 | in the received message is 0-terminated: | ||
2234 | @example | ||
nameTransport = (const char *) &address[addressLen];
if (nameTransport[size - sizeof (struct AddressLookupMessage)
                  - addressLen - 1] != '\0') @{
  GNUNET_break_op (0);
  GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
  return;
@}
2241 | @end example | ||
2242 | |||
Here, @code{GNUNET_SERVER_receive_done} must be called to tell the server that
processing of the request is finished and that the next message from this
client can be received. The argument @code{GNUNET_SYSERR} indicates that the
service did not understand the request message, so processing of this request
is terminated.

In contrast, when the argument is @code{GNUNET_OK}, the request was handled
successfully and the service continues to process messages from the client.
2250 | |||
2251 | @c *************************************************************************** | ||
2252 | @node Server: Response to client | ||
2253 | @subsubsection Server: Response to client | ||
2254 | @c %**end of header | ||
2255 | |||
Once the current request has been processed, the server should send a response
to the client. The server builds a new @code{struct AddressLookupMessage} in
much the same way as the client did and sends it to the client. Here, however,
the type must be @code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY} rather than
the @code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP} used by the client.
2262 | @example | ||
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
             + addressLen + strlen (nameTrans) + 1;

msg = GNUNET_malloc (len);
msg->header.size = htons (len);
msg->header.type = htons (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);

// ...

struct GNUNET_SERVER_TransmitContext *tc;

tc = GNUNET_SERVER_transmit_context_create (client);
GNUNET_SERVER_transmit_context_append_data (tc, NULL, 0,
    GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
GNUNET_SERVER_transmit_context_run (tc, rtimeout);
2275 | @end example | ||
2276 | |||
2277 | |||
Note that the server API also provides a number of other functions that a
service can use to send messages.
2280 | |||
2281 | @c *************************************************************************** | ||
2282 | @node Server: Notification of clients | ||
2283 | @subsubsection Server: Notification of clients | ||
2284 | @c %**end of header | ||
2285 | |||
2286 | Often a service needs to (repeatedly) transmit notifications to a client or a | ||
2287 | group of clients. In these cases, the client typically has once registered for | ||
2288 | a set of events and then needs to receive a message whenever such an event | ||
2289 | happens (until the client disconnects). The use of a notification context can | ||
2290 | help manage message queues to clients and handle disconnects. Notification | ||
2291 | contexts can be used to send individualized messages to a particular client or | ||
2292 | to broadcast messages to a group of clients. An individualized notification | ||
2293 | might look like this: | ||
2294 | @example | ||
2295 | GNUNET_SERVER_notification_context_unicast(nc, | ||
2296 | client, msg, GNUNET_YES); | ||
2297 | @end example | ||
2298 | |||
2299 | |||
2300 | Note that after processing the original registration message for notifications, | ||
2301 | the server code still typically needs to call@ | ||
2302 | @code{GNUNET_SERVER_receive_done} so that the client can transmit further | ||
2303 | messages to the server. | ||
2304 | |||
2305 | @c *************************************************************************** | ||
2306 | @node Conversion between Network Byte Order (Big Endian) and Host Byte Order | ||
2307 | @subsubsection Conversion between Network Byte Order (Big Endian) and Host Byte Order | ||
2308 | @c %** subsub? it's a referenced page on the ipc document. | ||
2309 | @c %**end of header | ||
2310 | |||
Network byte order is always big endian. Host byte order depends on the CPU;
on many common platforms (for example x86) it is little endian. What is the
difference between the two?

In big endian (network byte order), the most significant byte of a multi-byte
value, for example a 4-byte integer, is stored at the lowest address and the
least significant byte at the highest address. Little endian is the exact
opposite: the least significant byte is stored at the lowest address and the
most significant byte at the highest address.

Hosts communicate by exchanging data packets over the network, and the two
hosts may use different byte orders. To ensure that both sides interpret the
data identically, multi-byte values must be converted to network byte order
before sending and back to host byte order after receiving.
2328 | |||
There are ten convenient functions for converting the byte order in GNUnet:
2331 | @table @asis | ||
2332 | |||
@item uint16_t htons (uint16_t hostshort)
Convert a 16-bit value (short int) from host byte order to network byte order.
@item uint32_t htonl (uint32_t hostlong)
Convert a 32-bit value (long int) from host byte order to network byte order.
@item uint16_t ntohs (uint16_t netshort)
Convert a 16-bit value (short int) from network byte order to host byte order.
@item uint32_t ntohl (uint32_t netlong)
Convert a 32-bit value (long int) from network byte order to host byte order.
@item unsigned long long GNUNET_ntohll (unsigned long long netlonglong)
Convert a long long int from network byte order to host byte order.
@item unsigned long long GNUNET_htonll (unsigned long long hostlonglong)
Convert a long long int from host byte order to network byte order.
@item struct GNUNET_TIME_RelativeNBO GNUNET_TIME_relative_hton (struct GNUNET_TIME_Relative a)
Convert relative time to network byte order.
@item struct GNUNET_TIME_Relative GNUNET_TIME_relative_ntoh (struct GNUNET_TIME_RelativeNBO a)
Convert relative time from network byte order.
@item struct GNUNET_TIME_AbsoluteNBO GNUNET_TIME_absolute_hton (struct GNUNET_TIME_Absolute a)
Convert absolute time to network byte order.
@item struct GNUNET_TIME_Absolute GNUNET_TIME_absolute_ntoh (struct GNUNET_TIME_AbsoluteNBO a)
Convert absolute time from network byte order.
2358 | @end table | ||
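
As an illustration of the four standard conversion functions, the following
self-contained sketch writes a 16-bit and a 32-bit value into a byte buffer in
network byte order and reads them back. The names @code{wire_encode},
@code{wire_decode} and @code{WIRE_LEN} are hypothetical and exist only for
this example; the GNUnet-specific functions from the table are not used here.

```c
#include <arpa/inet.h>  /* htons, htonl, ntohs, ntohl */
#include <assert.h>
#include <stdint.h>
#include <string.h>     /* memcpy */

#define WIRE_LEN 6  /* 2 bytes for the size + 4 bytes for the value */

/* Write size and value to out in network byte order (big endian). */
void
wire_encode (unsigned char out[WIRE_LEN], uint16_t size, uint32_t value)
{
  uint16_t size_nbo = htons (size);
  uint32_t value_nbo = htonl (value);

  assert (NULL != out);
  memcpy (out, &size_nbo, sizeof (size_nbo));
  memcpy (out + sizeof (size_nbo), &value_nbo, sizeof (value_nbo));
}

/* Read size and value from in, converting back to host byte order. */
void
wire_decode (const unsigned char in[WIRE_LEN], uint16_t *size, uint32_t *value)
{
  uint16_t size_nbo;
  uint32_t value_nbo;

  assert ((NULL != in) && (NULL != size) && (NULL != value));
  memcpy (&size_nbo, in, sizeof (size_nbo));
  memcpy (&value_nbo, in + sizeof (size_nbo), sizeof (value_nbo));
  *size = ntohs (size_nbo);
  *value = ntohl (value_nbo);
}
```

Whatever the host byte order is, the most significant byte of each value ends
up first in the buffer, so both peers read the same numbers.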
2359 | |||
2360 | @c *************************************************************************** | ||
2361 | |||
2362 | @node Cryptography API | ||
2363 | @subsection Cryptography API | ||
2364 | @c %**end of header | ||
2365 | |||
The gnunetutil API provides the cryptographic primitives used in GNUnet.
GNUnet uses 2048-bit RSA keys for the session key exchange, for signing
messages by peers and for most other public-key operations. Most researchers
in cryptography consider 2048-bit RSA keys secure and practically unbreakable
for a long time. The API provides functions to create a fresh key pair, read a
private key from a file (or create a new file if the file does not exist),
encrypt, decrypt, sign, verify and extract the public key into a format
suitable for network transmission.
2374 | |||
For the encryption of files and the actual data exchanged between peers,
GNUnet uses 256-bit AES encryption. Fresh session keys are negotiated for
every new connection. Again, there is no published technique to break this
cipher in any realistic amount of time. The API provides functions for
generation of keys, validation of keys (important for checking that
decryptions using RSA succeeded), encryption and decryption.
2381 | |||
2382 | GNUnet uses SHA-512 for computing one-way hash codes. The API provides | ||
2383 | functions to compute a hash over a block in memory or over a file on disk. | ||
2384 | |||
The crypto API also provides functions for randomizing a block of memory,
obtaining a single random number and generating a permutation of the numbers
0 to n-1. Random number generation distinguishes between WEAK and STRONG
random number quality; WEAK random numbers are pseudo-random whereas STRONG
random numbers use entropy gathered from the operating system.
2390 | |||
2391 | Finally, the crypto API provides a means to deterministically generate a | ||
2392 | 1024-bit RSA key from a hash code. These functions should most likely not be | ||
2393 | used by most applications; most importantly,@ | ||
2394 | GNUNET_CRYPTO_rsa_key_create_from_hash does not create an RSA-key that should | ||
2395 | be considered secure for traditional applications of RSA. | ||
2396 | |||
2397 | @c *************************************************************************** | ||
2398 | @node Message Queue API | ||
2399 | @subsection Message Queue API | ||
2400 | @c %**end of header | ||
2401 | |||
2402 | @strong{ Introduction }@ Often, applications need to queue messages that are to | ||
2403 | be sent to other GNUnet peers, clients or services. As all of GNUnet's | ||
2404 | message-based communication APIs, by design, do not allow messages to be | ||
2405 | queued, it is common to implement custom message queues manually when they are | ||
2406 | needed. However, writing very similar code in multiple places is tedious and | ||
2407 | leads to code duplication. | ||
2408 | |||
2409 | MQ (for Message Queue) is an API that provides the functionality to implement | ||
2410 | and use message queues. We intend to eventually replace all of the custom | ||
2411 | message queue implementations in GNUnet with MQ. | ||
2412 | |||
2413 | @strong{ Basic Concepts }@ The two most important entities in MQ are queues and | ||
2414 | envelopes. | ||
2415 | |||
Every queue is backed by a specific implementation (e.g. for mesh, stream,
connection, server client, etc.) that will actually deliver the queued
messages. For convenience, some queues also allow specifying a list of message
handlers. The message queue will then also wait for incoming messages and
dispatch them appropriately.

An envelope holds the memory for a message, as well as metadata (Where is
the envelope queued? What should happen after it has been sent?). Any envelope
can only be queued in one message queue.
2425 | |||
2426 | @strong{ Creating Queues }@ The following is a list of currently available | ||
2427 | message queues. Note that to avoid layering issues, message queues for higher | ||
2428 | level APIs are not part of @code{libgnunetutil}, but@ the respective API itself | ||
2429 | provides the queue implementation. | ||
2430 | @table @asis | ||
2431 | |||
@item @code{GNUNET_MQ_queue_for_connection_client}
Transmits queued messages over a @code{GNUNET_CLIENT_Connection} handle.
Also supports receiving with message handlers.

@item @code{GNUNET_MQ_queue_for_server_client}
Transmits queued messages over a @code{GNUNET_SERVER_Client} handle.
Does not support incoming message handlers.

@item @code{GNUNET_MESH_mq_create}
Transmits queued messages over a @code{GNUNET_MESH_Tunnel} handle.
Does not support incoming message handlers.

@item @code{GNUNET_MQ_queue_for_callbacks}
This is the most general implementation. Instead of delivering and receiving
messages with one of GNUnet's communication APIs, implementation callbacks are
called. Refer to "Implementing Queues" for a more detailed explanation.
2447 | @end table | ||
2448 | |||
2449 | |||
2450 | @strong{ Allocating Envelopes }@ A GNUnet message (as defined by the | ||
2451 | GNUNET_MessageHeader) has three parts: The size, the type, and the body. | ||
2452 | |||
2453 | MQ provides macros to allocate an envelope containing a message conveniently,@ | ||
2454 | automatically setting the size and type fields of the message. | ||
2455 | |||
Consider the following simple message, with the body consisting of a single
number value.
2458 | @example | ||
struct NumberMessage @{
  /** Type: GNUNET_MESSAGE_TYPE_EXAMPLE_1 */
  struct GNUNET_MessageHeader header;
  uint32_t number GNUNET_PACKED;
@};
2462 | @end example | ||
2463 | |||
2464 | An envelope containing an instance of the NumberMessage can be constructed like | ||
2465 | this: | ||
2466 | @example | ||
struct GNUNET_MQ_Envelope *ev;
struct NumberMessage *msg;

ev = GNUNET_MQ_msg (msg, GNUNET_MESSAGE_TYPE_EXAMPLE_1);
msg->number = htonl (42);
2469 | @end example | ||
2470 | |||
2471 | |||
2472 | In the above code, @code{GNUNET_MQ_msg} is a macro. The return value is the | ||
2473 | newly allocated envelope. The first argument must be a pointer to some | ||
2474 | @code{struct} containing a @code{struct GNUNET_MessageHeader header} field, | ||
2475 | while the second argument is the desired message type, in host byte order. | ||
2476 | |||
The @code{msg} pointer now points to an allocated message, where the message
type and the message size are already set. The message's size is inferred from
the type of the @code{msg} pointer: it will be set to @code{sizeof (*msg)},
properly converted to network byte order.

If the message body's size is dynamic, the macro @code{GNUNET_MQ_msg_extra}
can be used to allocate an envelope whose message has additional space
allocated after the @code{msg} structure.
2485 | |||
2486 | If no structure has been defined for the message, | ||
2487 | @code{GNUNET_MQ_msg_header_extra} can be used to allocate additional space | ||
2488 | after the message header. The first argument then must be a pointer to a | ||
2489 | @code{GNUNET_MessageHeader}. | ||
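
To make the effect of these macros more concrete, the following self-contained
sketch shows the part of the work that concerns the message itself: allocating
the fixed structure plus optional extra space and filling in the size and type
fields in network byte order. This is not the MQ implementation. The helper
@code{alloc_message} and the simplified @code{struct MessageHeader} are
hypothetical, and the envelope metadata is left out entirely.

```c
#include <arpa/inet.h>  /* htons */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>     /* calloc */

/* Hypothetical stand-in for a message header (fields in network byte order). */
struct MessageHeader {
  uint16_t size;
  uint16_t type;
};

struct NumberMessage {
  struct MessageHeader header;
  uint32_t number;
};

/*
 * Allocate a zeroed message of base_size + extra bytes and set its header.
 * base_size is the size of the message struct, which must begin with a
 * struct MessageHeader.  Returns NULL if the total does not fit into the
 * 16-bit size field or if the allocation fails.
 */
struct MessageHeader *
alloc_message (size_t base_size, size_t extra, uint16_t type)
{
  size_t total = base_size + extra;
  struct MessageHeader *mh;

  assert (base_size >= sizeof (struct MessageHeader));
  if ((total < base_size) || (total > UINT16_MAX))
    return NULL;
  mh = calloc (1, total);
  if (NULL == mh)
    return NULL;
  mh->size = htons ((uint16_t) total);  /* size covers header, body and extra */
  mh->type = htons (type);              /* type is passed in host byte order */
  return mh;
}
```

With this helper, a @code{struct NumberMessage} would be obtained with
@code{alloc_message (sizeof (struct NumberMessage), 0, type)}, and a header
followed by ten payload bytes with
@code{alloc_message (sizeof (struct MessageHeader), 10, type)}.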
2490 | |||
@strong{Envelope Properties}@ A few functions in MQ allow setting additional
properties on envelopes:
2493 | @table @asis | ||
2494 | |||
@item @code{GNUNET_MQ_notify_sent}
Specifies a function that will be called once the envelope's message has been
sent irrevocably. An envelope can be canceled precisely up to the point where
the notify sent callback has been called.

@item @code{GNUNET_MQ_disable_corking}
No corking will be used when sending the message. Not every queue supports
this flag; by default, envelopes are sent with corking.
2502 | |||
2503 | @end table | ||
2504 | |||
2505 | |||
2506 | @strong{Sending Envelopes}@ Once an envelope has been constructed, it can be | ||
2507 | queued for sending with @code{GNUNET_MQ_send}. | ||
2508 | |||
2509 | Note that in order to avoid memory leaks, an envelope must either be sent (the | ||
2510 | queue will free it) or destroyed explicitly with @code{GNUNET_MQ_discard}. | ||
2511 | |||
2512 | @strong{Canceling Envelopes}@ An envelope queued with @code{GNUNET_MQ_send} can | ||
2513 | be canceled with @code{GNUNET_MQ_cancel}. Note that after the notify sent | ||
2514 | callback has been called, canceling a message results in undefined behavior. | ||
2515 | Thus it is unsafe to cancel an envelope that does not have a notify sent | ||
2516 | callback. When canceling an envelope, it is not necessary@ to call | ||
2517 | @code{GNUNET_MQ_discard}, and the envelope can't be sent again. | ||
2518 | |||
2519 | @strong{ Implementing Queues }@ @code{TODO} | ||
2520 | |||
2521 | @c *************************************************************************** | ||
2522 | @node Service API | ||
2523 | @subsection Service API | ||
2524 | @c %**end of header | ||
2525 | |||
Most GNUnet code lives in the form of services. Services are processes that
offer an API for other components of the system to build on. Those other
components can be command-line tools for users, graphical user interfaces or
other services. Services provide their API using an IPC protocol. To do so,
each service must listen on either a TCP port or a UNIX domain socket, which
the service implementation achieves using the server API. This use of the
server API is exposed directly to the users of the service API. Thus, when
using the service API, one usually also uses large parts of the server API.
The service API provides various convenience functions, such as parsing
command-line arguments and the configuration file, which are not found in the
server API. The dual to the service/server API is the client API, which can be
used to access services.
2538 | |||
2539 | The most common way to start a service is to use the GNUNET_SERVICE_run | ||
2540 | function from the program's main function. GNUNET_SERVICE_run will then parse | ||
2541 | the command line and configuration files and, based on the options found there, | ||
2542 | start the server. It will then give back control to the main program, passing | ||
2543 | the server and the configuration to the GNUNET_SERVICE_Main callback. | ||
2544 | GNUNET_SERVICE_run will also take care of starting the scheduler loop. If this | ||
2545 | is inappropriate (for example, because the scheduler loop is already running), | ||
2546 | GNUNET_SERVICE_start and related functions provide an alternative to | ||
2547 | GNUNET_SERVICE_run. | ||
2548 | |||
2549 | When starting a service, the service_name option is used to determine which | ||
2550 | sections in the configuration file should be used to configure the service. A | ||
2551 | typical value here is the name of the src/ sub-directory, for example | ||
2552 | "statistics". The same string would also be given to GNUNET_CLIENT_connect to | ||
2553 | access the service. | ||
2554 | |||
2555 | Once a service has been initialized, the program should use the@ | ||
2556 | GNUNET_SERVICE_Main callback to register message handlers using | ||
2557 | GNUNET_SERVER_add_handlers. The service will already have registered a handler | ||
2558 | for the "TEST" message. | ||
2559 | |||
2560 | The option bitfield (enum GNUNET_SERVICE_Options) determines how a service | ||
2561 | should behave during shutdown. There are three key strategies: | ||
2562 | @table @asis | ||
2563 | |||
@item instant (GNUNET_SERVICE_OPTION_NONE)
Upon receiving the shutdown signal from the scheduler, the service immediately
terminates the server, closing all existing connections with clients.
@item manual (GNUNET_SERVICE_OPTION_MANUAL_SHUTDOWN)
The service does nothing by itself during shutdown. The main program will need
to take the appropriate action by calling GNUNET_SERVER_destroy or
GNUNET_SERVICE_stop (depending on how the service was initialized) to
terminate the service. This method is used by gnunet-service-arm and is rather
uncommon.
@item soft (GNUNET_SERVICE_OPTION_SOFT_SHUTDOWN)
Upon receiving the shutdown signal from the scheduler, the service immediately
tells the server to stop listening for incoming clients. Requests from normal
existing clients are still processed and the server/service terminates once
all normal clients have disconnected. Clients that are not expected to ever
disconnect (such as clients that monitor performance values) can be marked as
'monitor' clients using GNUNET_SERVER_client_mark_monitor. Those clients will
continue to be processed until all 'normal' clients have disconnected. Then,
the server will terminate, closing the monitor connections. This mode is used,
for example, by 'statistics', allowing existing 'normal' clients to set
(possibly persistent) statistic values before terminating.
2585 | @end table | ||
2586 | |||
2587 | @c *************************************************************************** | ||
2588 | @node Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps | ||
2589 | @subsection Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps | ||
2590 | @c %**end of header | ||
2591 | |||
A commonly used data structure in GNUnet is a (multi-)hash map. It is most
often used to map a peer identity to some data structure, but also to map
arbitrary keys to values (for example to track requests in the distributed
hash table or in file-sharing). As it is commonly used, the hash map is
actually sometimes responsible for a large share of GNUnet's overall memory
consumption (for some processes, 30% is not uncommon). The following text
documents some API quirks (and their implications for applications) that were
recently introduced to minimize the footprint of the hash map.
2600 | |||
2601 | |||
2602 | @c *************************************************************************** | ||
2603 | @menu | ||
2604 | * Analysis:: | ||
2605 | * Solution:: | ||
2606 | * Migration:: | ||
2607 | * Conclusion:: | ||
2608 | * Availability:: | ||
2609 | @end menu | ||
2610 | |||
2611 | @node Analysis | ||
2612 | @subsubsection Analysis | ||
2613 | @c %**end of header | ||
2614 | |||
The main reason for the "excessive" memory consumption by the hash map is that
GNUnet uses 512-bit cryptographic hash codes, and the (multi-)hash map also
uses the same 512-bit @code{struct GNUNET_HashCode}. As a result, storing just
the keys requires 64 bytes of memory for each key. As some applications like
to keep a large number of entries in the hash map (after all, that's what maps
are good for), 64 bytes per hash is significant: keeping a pointer to the
value and having a linked list for collisions consume between 8 and 16 bytes,
and @code{malloc} may add about the same overhead per allocation, putting us
in the 16 to 32 byte per entry ballpark. Adding a 64-byte key then at least
triples the overall memory requirement for the hash map.

To make things "worse", most of the time storing the key in the hash map is
not required: it is typically already in memory elsewhere! In most cases, the
values stored in the hash map are some application-specific struct that
@emph{also} contains the hash. Here is a simplified example:
2630 | @example | ||
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};

// ...
val = GNUNET_malloc (sizeof (struct MyValue));
val->key = key;
val->my_data = 42;
GNUNET_CONTAINER_multihashmap_put (map, &key, val, ...);
2637 | @end example | ||
2638 | |||
2639 | |||
2640 | This is a common pattern as later the entries might need to be removed, and at | ||
2641 | that time it is convenient to have the key immediately at hand: | ||
2642 | @example | ||
2643 | GNUNET_CONTAINER_multihashmap_remove (map, &val->key, val); | ||
2644 | @end example | ||
2645 | |||
2646 | |||
2647 | Note that here we end up with two times 64 bytes for the key, plus maybe 64 | ||
2648 | bytes total for the rest of the 'struct MyValue' and the map entry in the hash | ||
2649 | map. The resulting redundant storage of the key increases overall memory | ||
2650 | consumption per entry from the "optimal" 128 bytes to 192 bytes. This is not | ||
2651 | just an extreme example: overheads in practice are actually sometimes close to | ||
2652 | those highlighted in this example. This is especially true for maps with a | ||
2653 | significant number of entries, as there we tend to really try to keep the | ||
2654 | entries small. | ||
2655 | @c *************************************************************************** | ||
2656 | @node Solution | ||
2657 | @subsubsection Solution | ||
2658 | @c %**end of header | ||
2659 | |||
The solution that has now been implemented is to @strong{optionally} allow the
hash map to not make a (deep) copy of the hash but instead keep a pointer to
the hash/key in the entry. This reduces the memory consumption for the key
from 64 bytes to 4 to 8 bytes. However, it only works if the key is actually
stored in the entry (which is the case most of the time) and if the entry does
not modify the key (which in all of the code I'm aware of has always been the
case when the key is stored in the entry). Finally, when the client stores an
entry in the hash map, it @strong{must} provide a pointer to the key within
the entry, not just a pointer to a transient location of the key. If the
client code does not meet these requirements, the result is a dangling pointer
and undefined behavior of the (multi-)hash map API.
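
The saving can be estimated with a small, self-contained sketch. The structs
below are hypothetical simplifications of a map entry and do not reproduce the
actual layout used inside the (multi-)hash map; they only contrast an entry
that embeds a 64-byte key with one that stores a pointer to a key owned by the
value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 512-bit hash code. */
struct HashCode {
  uint32_t bits[512 / 8 / sizeof (uint32_t)];
};

/* Entry that keeps its own (deep) copy of the key. */
struct EntryWithCopy {
  struct HashCode key;
  void *value;
  struct EntryWithCopy *next;  /* collision chain */
};

/* Entry that only points to a key stored elsewhere, e.g. inside the value.
   The pointed-to key must stay valid and unmodified while the entry exists. */
struct EntryWithPointer {
  const struct HashCode *key;
  void *value;
  struct EntryWithPointer *next;  /* collision chain */
};

/* Bytes saved per entry by storing a pointer instead of a copy. */
size_t
saved_bytes_per_entry (void)
{
  assert (64 == sizeof (struct HashCode));
  return sizeof (struct EntryWithCopy) - sizeof (struct EntryWithPointer);
}
```

On common platforms the result is 64 bytes minus the size of one pointer, that
is 60 bytes with 4-byte pointers and 56 bytes with 8-byte pointers.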
2671 | @c *************************************************************************** | ||
2672 | @node Migration | ||
2673 | @subsubsection Migration | ||
2674 | @c %**end of header | ||
2675 | |||
2676 | To use the new feature, first check that the values contain the respective key | ||
2677 | (and never modify it). Then, all calls to | ||
2678 | @code{GNUNET_CONTAINER_multihashmap_put} on the respective map must be audited | ||
2679 | and most likely changed to pass a pointer into the value's struct. For the | ||
2680 | initial example, the new code would look like this: | ||
2681 | @example | ||
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};

// ...
val = GNUNET_malloc (sizeof (struct MyValue));
val->key = key;
val->my_data = 42;
GNUNET_CONTAINER_multihashmap_put (map, &val->key, val, ...);
2688 | @end example | ||
2689 | |||
2690 | |||
Note that @code{&key} was changed to @code{&val->key} in the argument to the
@code{put} call. This is critical as often @code{key} is on the stack or in
some other transient data structure and thus having the hash map keep a
pointer to @code{key} would not work. Only the key inside of @code{val} has
the same lifetime as the entry in the map (this must of course be checked as
well). Naturally, @code{val->key} must be initialized before the @code{put}
call. Once all @code{put} calls have been converted and double-checked, you
can change the call to create the hash map from
2699 | @example | ||
map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_NO);
2702 | @end example | ||
2703 | |||
2704 | to | ||
2705 | |||
2706 | @example | ||
2707 | map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_YES); | ||
2708 | @end example | ||
2709 | |||
2710 | If everything was done correctly, you now use about 60 bytes less memory per | ||
2711 | entry in @code{map}. However, if now (or in the future) any call to @code{put} | ||
2712 | does not ensure that the given key is valid until the entry is removed from the | ||
2713 | map, undefined behavior is likely to be observed. | ||
2714 | @c *************************************************************************** | ||
2715 | @node Conclusion | ||
2716 | @subsubsection Conclusion | ||
2717 | @c %**end of header | ||
2718 | |||
The new optimization is often applicable and can result in a reduction in
memory consumption of up to 30% in practice. However, it makes the code less
robust as additional invariants are imposed on the multi hash map client. Thus
applications should refrain from enabling the new mode unless the resulting
reduction in memory consumption is deemed significant enough. In particular,
it should generally not be used in new code (wait at least until benchmarks
exist).
2725 | @c *************************************************************************** | ||
2726 | @node Availability | ||
2727 | @subsubsection Availability | ||
2728 | @c %**end of header | ||
2729 | |||
2730 | The new multi hash map code was committed in SVN 24319 (will be in GNUnet | ||
2731 | 0.9.4). Various subsystems (transport, core, dht, file-sharing) were | ||
2732 | previously audited and modified to take advantage of the new capability. In | ||
2733 | particular, memory consumption of the file-sharing service is expected to drop | ||
2734 | by 20-30% due to this change. | ||
2735 | |||
2736 | @c *************************************************************************** | ||
2737 | @node The CONTAINER_MDLL API | ||
2738 | @subsection The CONTAINER_MDLL API | ||
2739 | @c %**end of header | ||
2740 | |||
2741 | This text documents the GNUNET_CONTAINER_MDLL API. The GNUNET_CONTAINER_MDLL | ||
2742 | API is similar to the GNUNET_CONTAINER_DLL API in that it provides operations | ||
2743 | for the construction and manipulation of doubly-linked lists. The key | ||
2744 | difference to the (simpler) DLL-API is that the MDLL-version allows a single | ||
2745 | element (instance of a "struct") to be in multiple linked lists at the same | ||
2746 | time. | ||
2747 | |||
2748 | Like the DLL API, the MDLL API stores (most of) the data structures for the | ||
2749 | doubly-linked list with the respective elements; only the 'head' and 'tail' | ||
2750 | pointers are stored "elsewhere" --- and the application needs to provide the | ||
2751 | locations of head and tail to each of the calls in the MDLL API. The key | ||
2752 | difference for the MDLL API is that the "next" and "previous" pointers in the | ||
2753 | struct can no longer be simply called "next" and "prev" --- after all, the | ||
2754 | element may be in multiple doubly-linked lists, so we cannot just have one | ||
2755 | "next" and one "prev" pointer! | ||
2756 | |||
2757 | The solution is to have multiple fields that must have a name of the format | ||
2758 | "next_XX" and "prev_XX" where "XX" is the name of one of the doubly-linked | ||
2759 | lists. Here is a simple example: | ||
2760 | @example | ||
struct MyMultiListElement @{
  struct MyMultiListElement *next_ALIST;
  struct MyMultiListElement *prev_ALIST;
  struct MyMultiListElement *next_BLIST;
  struct MyMultiListElement *prev_BLIST;
  void *data;
@};
2765 | @end example | ||
2766 | |||
2767 | |||
2768 | Note that by convention, we use all-uppercase letters for the list names. In | ||
2769 | addition, the program needs to have a location for the head and tail pointers | ||
2770 | for both lists, for example: | ||
2771 | @example | ||
static struct MyMultiListElement *head_ALIST;
static struct MyMultiListElement *tail_ALIST;
static struct MyMultiListElement *head_BLIST;
static struct MyMultiListElement *tail_BLIST;
2775 | @end example | ||
2776 | |||
2777 | |||
2778 | Using the MDLL-macros, we can now insert an element into the ALIST: | ||
2779 | @example | ||
2780 | GNUNET_CONTAINER_MDLL_insert (ALIST, head_ALIST, tail_ALIST, element); | ||
2781 | @end example | ||
2782 | |||
2783 | |||
2784 | Passing "ALIST" as the first argument to MDLL specifies which of the next/prev | ||
2785 | fields in the 'struct MyMultiListElement' should be used. The extra "ALIST" | ||
2786 | argument and the "_ALIST" in the names of the next/prev-members are the only | ||
2787 | differences between the MDDL and DLL-API. Like the DLL-API, the MDLL-API offers | ||
2788 | functions for inserting (at head, at tail, after a given element) and removing | ||
2789 | elements from the list. Iterating over the list should be done by directly | ||
2790 | accessing the "next_XX" and/or "prev_XX" members. | ||
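
The following is a minimal sketch of the idea, not the GNUnet implementation. The SKETCH_MDLL_insert and SKETCH_MDLL_remove macros and the two counting helpers are hypothetical stand-ins written for this example; they only illustrate how the list name passed as first argument is pasted onto "next_" and "prev_" to select the right pair of fields, and how iteration follows those fields directly.

```c
#include <assert.h>
#include <stddef.h>

/* Element that can be a member of two lists at once (as in the text). */
struct MyMultiListElement
{
  struct MyMultiListElement *next_ALIST, *prev_ALIST;
  struct MyMultiListElement *next_BLIST, *prev_BLIST;
  void *data;
};

/* Simplified stand-ins for the MDLL macros; NOT the GNUnet definitions.
   The list name is pasted onto "next_" / "prev_" to pick the fields. */
#define SKETCH_MDLL_insert(mdll, head, tail, element) do { \
    (element)->next_ ## mdll = (head);                     \
    (element)->prev_ ## mdll = NULL;                       \
    if (NULL == (tail))                                    \
      (tail) = (element);                                  \
    else                                                   \
      (head)->prev_ ## mdll = (element);                   \
    (head) = (element);                                    \
  } while (0)

#define SKETCH_MDLL_remove(mdll, head, tail, element) do {                \
    if (NULL == (element)->prev_ ## mdll)                                 \
      (head) = (element)->next_ ## mdll;                                  \
    else                                                                  \
      (element)->prev_ ## mdll->next_ ## mdll = (element)->next_ ## mdll; \
    if (NULL == (element)->next_ ## mdll)                                 \
      (tail) = (element)->prev_ ## mdll;                                  \
    else                                                                  \
      (element)->next_ ## mdll->prev_ ## mdll = (element)->prev_ ## mdll; \
    (element)->next_ ## mdll = NULL;                                      \
    (element)->prev_ ## mdll = NULL;                                      \
  } while (0)

/* Head and tail pointers, one pair per list. */
struct MyMultiListElement *head_ALIST, *tail_ALIST;
struct MyMultiListElement *head_BLIST, *tail_BLIST;

/* Iterate over ALIST by following the next_ALIST members directly. */
unsigned int
count_ALIST (void)
{
  unsigned int n = 0;

  for (struct MyMultiListElement *pos = head_ALIST;
       NULL != pos;
       pos = pos->next_ALIST)
    n++;
  return n;
}

/* Iterate over BLIST by following the next_BLIST members directly. */
unsigned int
count_BLIST (void)
{
  unsigned int n = 0;

  for (struct MyMultiListElement *pos = head_BLIST;
       NULL != pos;
       pos = pos->next_BLIST)
    n++;
  return n;
}
```

An element inserted into both lists can be removed from one of them without affecting its membership in the other, because each list only touches its own pair of pointers.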
2791 | |||
2792 | @c *************************************************************************** | ||
2793 | @node The Automatic Restart Manager (ARM) | ||
2794 | @section The Automatic Restart Manager (ARM) | ||
2795 | @c %**end of header | ||
2796 | |||
2797 | GNUnet's Automatic Restart Manager (ARM) is the GNUnet service responsible for | ||
2798 | system initialization and service babysitting. ARM starts and halts services, | ||
2799 | detects configuration changes and restarts the services affected by those | ||
2800 | changes as needed. It is also responsible for restarting services that crash, | ||
2801 | and it is planned to incorporate automatic debugging that diagnoses service | ||
2802 | crashes and gives developers insight into their causes. The purpose of this | ||
2803 | section is to give GNUnet developers an idea of how ARM works and how to | ||
2804 | interact with it. | ||
2805 | |||
2806 | @menu | ||
2807 | * Basic functionality:: | ||
2808 | * Key configuration options:: | ||
2809 | * Availability2:: | ||
2810 | * Reliability:: | ||
2811 | @end menu | ||
2812 | |||
2813 | @c *************************************************************************** | ||
2814 | @node Basic functionality | ||
2815 | @subsection Basic functionality | ||
2816 | @c %**end of header | ||
2817 | |||
2818 | @itemize @bullet | ||
2819 | @item ARM source code can be found under "src/arm". Service processes are | ||
2820 | managed by the functions in "gnunet-service-arm.c", which is controlled with | ||
2821 | "gnunet-arm.c" (the main function in that file is ARM's entry point). | ||
2822 | |||
2823 | @item The functions responsible for communicating with ARM and for starting | ||
2824 | and stopping services (including the ARM service itself) are provided by the | ||
2825 | ARM API in "arm_api.c". The function GNUNET_ARM_connect() returns an ARM | ||
2826 | handle to the caller after binding it to the caller's context (the | ||
2827 | configuration and scheduler in use). The caller can use this handle afterwards | ||
2828 | to communicate with ARM. The functions GNUNET_ARM_start_service() and | ||
2829 | GNUNET_ARM_stop_service() start and stop services, respectively. | ||
2830 | |||
2831 | @item A typical example of using these basic ARM services can be found in the | ||
2832 | file test_arm_api.c. The test case connects to ARM, starts it, then uses it to | ||
2833 | start the "resolver" service, stops the "resolver" and finally stops ARM. | ||
2834 | @end itemize | ||
2835 | |||
2836 | @c *************************************************************************** | ||
2837 | @node Key configuration options | ||
2838 | @subsection Key configuration options | ||
2839 | @c %**end of header | ||
2840 | |||
2841 | The configuration for ARM and the services should be available in a .conf file | ||
2842 | (for an example, see test_arm_api_data.conf). When running ARM, pass the | ||
2843 | configuration file to use on the command line: | ||
2844 | @code{$ gnunet-arm -s -c configuration_to_use.conf}. If no configuration is | ||
2845 | passed, the default configuration file is used (see | ||
2846 | GNUNET_PREFIX/share/gnunet/defaults.conf, which is created from | ||
2847 | contrib/defaults.conf). Each service has a section that starts with the | ||
2848 | service name in square brackets, for example "[arm]". The following options | ||
2849 | configure how ARM configures or interacts with the various services: | ||
2850 | |||
2851 | @table @asis | ||
2852 | |||
2853 | @item PORT | ||
2854 | Port number on which the service is listening for incoming TCP | ||
2855 | connections. ARM will start the service should it notice a request at this | ||
2856 | port. | ||
2857 | @item HOSTNAME | ||
2858 | Specifies on which host the service is deployed. Note that ARM can only start | ||
2859 | services that run on the local system (but it will not check that the hostname | ||
2860 | matches the local machine name). This option is used by the | ||
2861 | @code{gnunet_client_lib.h} implementation to determine which system to connect | ||
2862 | to. The default is "localhost". | ||
2863 | @item BINARY | ||
2864 | The name of the service binary file. | ||
2865 | @item OPTIONS | ||
2866 | Options to be passed to the service. | ||
2867 | @item PREFIX | ||
2868 | A command to prepend to the actual command, for example to run a service | ||
2869 | under "valgrind" or "gdb". | ||
2870 | @item DEBUG | ||
2871 | Run in debug mode (very verbose output). | ||
2872 | @item AUTOSTART | ||
2873 | ARM will listen on the UNIX domain socket and/or TCP port of the service and | ||
2874 | start the service on demand. | ||
2875 | @item FORCESTART | ||
2876 | ARM will always start this service when the peer is started. | ||
2877 | @item ACCEPT_FROM | ||
2878 | IPv4 addresses the service accepts connections from. | ||
2879 | @item ACCEPT_FROM6 | ||
2880 | IPv6 addresses the service accepts connections from. | ||
2881 | |||
2882 | @end table | ||
2883 | |||
2884 | |||
2885 | Options that impact the operation of ARM overall are in the "[arm]" section. | ||
2886 | ARM is a normal service and has (except for AUTOSTART) all of the options that | ||
2887 | other services do. In addition, ARM has the following options: | ||
2888 | @table @asis | ||
2889 | |||
2890 | @item GLOBAL_PREFIX | ||
2891 | Command to be prepended to the command of all services that are going to run. | ||
2892 | |||
2893 | @item GLOBAL_POSTFIX | ||
2894 | Global option that will be supplied to all the services that are going to run. | ||
2895 | |||
2896 | @end table | ||
2897 | |||
2898 | @c *************************************************************************** | ||
2899 | @node Availability2 | ||
2900 | @subsection Availability | ||
2901 | @c %**end of header | ||
2902 | |||
2903 | As mentioned before, one of the features provided by ARM is starting services | ||
2904 | on demand. Consider the example of one service, the "client", that wants to | ||
2905 | connect to another service, the "server". The "client" asks ARM to run the | ||
2906 | "server". ARM starts the "server". The "server" starts listening for incoming | ||
2907 | connections. The "client" establishes a connection with the "server", and the | ||
2908 | two start to communicate. One problem with that scheme is that it is slow: | ||
2909 | the "client" wants to communicate with the "server" at once and is not | ||
2910 | willing to wait for it to be started and listening for incoming connections | ||
2911 | before its request is served. One solution would be for ARM to start all | ||
2912 | services as default services. That would solve the problem, but it is not | ||
2913 | practical, because some of the services started this way may never be used, | ||
2914 | or may only be used after a relatively long time. The approach followed by | ||
2915 | ARM to solve this problem is as follows: | ||
2916 | @itemize @bullet | ||
2917 | |||
2918 | |||
2919 | @item For each service that has a PORT field in the configuration file (that | ||
2920 | is, a service that accepts incoming connections from clients) and that is not | ||
2921 | one of the default services, ARM creates listening sockets for all addresses | ||
2922 | associated with that service. | ||
2923 | |||
2924 | @item The "client" will immediately establish a connection with the "server". | ||
2925 | |||
2926 | @item ARM, pretending to be the "server", will listen on the respective | ||
2927 | port and notice the incoming connection from the "client" (but not accept | ||
2928 | it). | ||
2929 | |||
2930 | @item Once there is an incoming connection, ARM will start the "server", | ||
2931 | passing on the listen sockets (now, the service is started and can do its | ||
2932 | work). | ||
2933 | |||
2934 | @item Other client services can now connect directly to the "server". | ||
2935 | @end itemize | ||
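
The steps above can be sketched with plain POSIX sockets. This is an illustration under stated assumptions, not ARM's code: the sketch_ function names are hypothetical, the sketch uses a single loopback TCP socket, and it leaves out the part where ARM launches the service process and hands over the listen socket. It shows the key trick: a client can complete its connection against a listening socket that nobody has accepted from yet, and the supervisor can notice the pending connection with poll() without accepting it.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill ADDR with the IPv4 loopback address and the given port. */
static void
sketch_loopback_addr (struct sockaddr_in *addr, unsigned short port)
{
  memset (addr, 0, sizeof (*addr));
  addr->sin_family = AF_INET;
  addr->sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr->sin_port = htons (port);
}

/* Supervisor side: create the listen socket on behalf of the service.
   If *PORT is 0 the OS picks a free port, which is stored in *PORT.
   Returns the socket, or -1 on error. */
int
sketch_listen_loopback (unsigned short *port)
{
  struct sockaddr_in addr;
  socklen_t len = sizeof (addr);
  int fd = socket (AF_INET, SOCK_STREAM, 0);

  if (fd < 0)
    return -1;
  sketch_loopback_addr (&addr, *port);
  if ( (0 != bind (fd, (struct sockaddr *) &addr, sizeof (addr))) ||
       (0 != listen (fd, 5)) ||
       (0 != getsockname (fd, (struct sockaddr *) &addr, &len)) )
  {
    close (fd);
    return -1;
  }
  *port = ntohs (addr.sin_port);
  return fd;
}

/* Supervisor side: report whether a client is waiting on LISTEN_FD,
   WITHOUT accepting the connection.  Returns 1 if a connection is
   pending, 0 on timeout and -1 on error. */
int
sketch_connection_pending (int listen_fd, int timeout_ms)
{
  struct pollfd pfd = { .fd = listen_fd, .events = POLLIN };
  int ret = poll (&pfd, 1, timeout_ms);

  if (ret < 0)
    return -1;
  return ((ret > 0) && (0 != (pfd.revents & POLLIN))) ? 1 : 0;
}

/* Client side: connect to the service's port right away.
   Returns the connected socket, or -1 on error. */
int
sketch_connect_loopback (unsigned short port)
{
  struct sockaddr_in addr;
  int fd = socket (AF_INET, SOCK_STREAM, 0);

  if (fd < 0)
    return -1;
  sketch_loopback_addr (&addr, port);
  if (0 != connect (fd, (struct sockaddr *) &addr, sizeof (addr)))
  {
    close (fd);
    return -1;
  }
  return fd;
}
```

Once sketch_connection_pending() reports a waiting client, the supervisor would start the service and pass it the listen socket; the service then calls accept() on that same socket and finds the client already queued.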
2936 | |||
2937 | @c *************************************************************************** | ||
2938 | @node Reliability | ||
2939 | @subsection Reliability | ||
2940 | |||
2941 | One of the features provided by ARM is the automatic restart of crashed | ||
2942 | services. ARM needs to know which of the running services died. The function | ||
2943 | "gnunet-service-arm.c/maint_child_death()" is responsible for that. It is | ||
2944 | scheduled to run upon receiving a SIGCHLD signal. The function then iterates | ||
2945 | over ARM's list of running services and determines which services have died | ||
2946 | (crashed). ARM restarts every service that crashed. Now consider the case of | ||
2947 | a service with a serious problem that causes it to crash each time ARM | ||
2948 | starts it. If ARM kept blindly restarting such a service, we would get the | ||
2949 | pattern start-crash-restart-crash-restart-crash and so forth, which is of | ||
2950 | course not practical. For that reason, ARM schedules the service to be | ||
2951 | restarted after a delay that grows exponentially with each crash/restart | ||
2952 | of that service. To clarify the idea, consider the following | ||
2953 | example: | ||
2954 | @itemize @bullet | ||
2955 | |||
2956 | |||
2957 | @item Service S crashed. | ||
2958 | |||
2959 | @item ARM receives the SIGCHLD and inspects its list of services to find the | ||
2960 | dead one(s). | ||
2961 | |||
2962 | @item ARM finds S dead and schedules it for restarting after the "backoff" | ||
2963 | time, which is initially set to 1ms. ARM then doubles the backoff time | ||
2964 | corresponding to S (now backoff(S) = 2ms). | ||
2965 | |||
2966 | @item Because there is a severe problem with S, it crashes again. | ||
2967 | |||
2968 | @item Again ARM receives the SIGCHLD and detects that it is S that has | ||
2969 | crashed. ARM schedules it for restarting, but after its new backoff time | ||
2970 | (which became 2ms), and doubles its backoff time (now backoff(S) = 4ms). | ||
2971 | |||
2972 | @item And so on, until backoff(S) reaches a certain threshold | ||
2973 | (EXPONENTIAL_BACKOFF_THRESHOLD is set to half an hour). After reaching it, | ||
2974 | backoff(S) remains at half an hour, so ARM does not spend much time trying | ||
2975 | to restart a problematic service. | ||
2976 | @end itemize | ||
2977 | |||
2978 | @c *************************************************************************** | ||
2979 | @node GNUnet's TRANSPORT Subsystem | ||
2980 | @section GNUnet's TRANSPORT Subsystem | ||
2981 | @c %**end of header | ||
2982 | |||
2983 | This chapter documents how the GNUnet transport subsystem works. The GNUnet | ||
2984 | transport subsystem consists of three main components: the transport API (the | ||
2985 | interface used by the rest of the system to access the transport service), the | ||
2986 | transport service itself (most of the interesting functions, such as choosing | ||
2987 | transports, happen here) and the transport plugins. A transport plugin is a | ||
2988 | concrete implementation for how two GNUnet peers communicate; many plugins | ||
2989 | exist, for example for communication via TCP, UDP, HTTP, HTTPS and others. | ||
2990 | Finally, the transport subsystem uses supporting code, especially the NAT/UPnP | ||
2991 | library to help with tasks such as NAT traversal. | ||
2992 | |||
2993 | Key tasks of the transport service include: | ||
2994 | @itemize @bullet | ||
2995 | |||
2996 | |||
2997 | @item Create our HELLO message, notify clients and neighbours if our HELLO | ||
2998 | changes (using NAT library as necessary) | ||
2999 | |||
3000 | @item Validate HELLOs from other peers (send PING), allow other peers to | ||
3001 | validate our HELLO's addresses (send PONG) | ||
3002 | |||
3003 | @item Upon request, establish connections to other peers (using address | ||
3004 | selection from ATS subsystem) and maintain them (again using PINGs and PONGs) | ||
3005 | as long as desired | ||
3006 | |||
3007 | @item Accept incoming connections, give ATS service the opportunity to switch | ||
3008 | communication channels | ||
3009 | |||
3010 | @item Notify clients about peers that have connected to us or that have been | ||
3011 | disconnected from us | ||
3012 | |||
3013 | @item If a (stateful) connection goes down unexpectedly (without explicit | ||
3014 | DISCONNECT), quickly attempt to recover (without notifying clients) but do | ||
3015 | notify clients quickly if reconnecting fails | ||
3016 | |||
3017 | @item Send (payload) messages arriving from clients to other peers via | ||
3018 | transport plugins and receive messages from other peers, forwarding those to | ||
3019 | clients | ||
3020 | |||
3021 | @item Enforce inbound traffic limits (using flow-control if it is applicable); | ||
3022 | outbound traffic limits are enforced by CORE, not by us (!) | ||
3023 | |||
3024 | @item Enforce restrictions on P2P connection as specified by the blacklist | ||
3025 | configuration and blacklisting clients | ||
3026 | @end itemize | ||
3027 | |||
3028 | |||
3029 | Note that the term "clients" in the list above really refers to the GNUnet-CORE | ||
3030 | service, as CORE is typically the only client of the transport service. | ||
3031 | |||
3032 | @menu | ||
3033 | * Address validation protocol:: | ||
3034 | @end menu | ||
3035 | |||
3036 | @node Address validation protocol | ||
3037 | @subsection Address validation protocol | ||
3038 | @c %**end of header | ||
3039 | |||
3040 | This section documents how the GNUnet transport service validates connections | ||
3041 | with other peers. It is a high-level description of the protocol necessary to | ||
3042 | understand the details of the implementation. It should be noted that when we | ||
3043 | talk about PING and PONG messages in this section, we refer to transport-level | ||
3044 | PING and PONG messages, which are different from core-level PING and PONG | ||
3045 | messages (both in implementation and function). | ||
3046 | |||
3047 | The goal of transport-level address validation is to minimize the chances of a | ||
3048 | successful man-in-the-middle attack against GNUnet peers on the transport | ||
3049 | level. Such an attack would not allow the adversary to decrypt the P2P | ||
3050 | transmissions, but a successful attacker could at least measure traffic volumes | ||
3051 | and latencies (raising the adversary's capabilities to those of a global passive | ||
3052 | adversary in the worst case). The scenario we are concerned about is an | ||
3053 | attacker, Mallory, giving a HELLO to Alice that claims to be for Bob, but | ||
3054 | contains Mallory's IP address instead of Bob's (for some transport). Mallory | ||
3055 | would then forward the traffic to Bob (by initiating a connection to Bob and | ||
3056 | claiming to be Alice). As a further complication, the scheme has to work even | ||
3057 | if, say, Alice is behind a NAT without traversal support and hence has no address | ||
3058 | of her own (and thus Alice must always initiate the connection to Bob). | ||
3059 | |||
3060 | An additional constraint is that HELLO messages do not contain a cryptographic | ||
3061 | signature since other peers must be able to edit (i.e. remove) addresses from | ||
3062 | the HELLO at any time (this was not true in GNUnet 0.8.x). A basic | ||
3063 | @strong{assumption} is that each peer knows the set of possible network | ||
3064 | addresses that it @strong{might} be reachable under (so for example, the | ||
3065 | external IP address of the NAT plus the LAN address(es) with the respective | ||
3066 | ports). | ||
3067 | |||
3068 | The solution is the following. If Alice wants to validate that a given address | ||
3069 | for Bob is valid (i.e. that the connection is actually established | ||
3070 | @strong{directly} with the intended target), she sends a PING message over that | ||
3071 | connection to Bob. Note that in this case, Alice initiated the connection, so | ||
3072 | only she knows for sure which address was used (Alice may be behind NAT, so | ||
3073 | whatever address Bob sees may not be an address Alice knows she has). Bob checks | ||
3074 | that the address given in the PING is actually one of his addresses (and does | ||
3075 | not belong to Mallory), and if it is, sends back a PONG (with a signature that | ||
3076 | says that Bob owns/uses the address from the PING). Alice checks the signature | ||
3077 | and is happy if it is valid and the address in the PONG is the address she used. | ||
3078 | This is similar to the 0.8.x protocol, where the HELLO contained a signature | ||
3079 | from Bob for each address used by Bob. Here, the purpose code for the signature | ||
3080 | is @code{GNUNET_SIGNATURE_PURPOSE_TRANSPORT_PONG_OWN}. After this, Alice will | ||
3081 | remember Bob's address and consider the address valid for a while (12h in the | ||
3082 | current implementation). Note that after this exchange, Alice only considers | ||
3083 | Bob's address to be valid; the connection itself is not considered | ||
3084 | 'established'. In particular, Alice may have many addresses for Bob that she | ||
3085 | considers valid. | ||
3086 | |||
3087 | The PONG message is protected with a nonce/challenge against replay attacks | ||
3088 | and uses an expiration time for the signature (but those are almost | ||
3089 | implementation details). | ||
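
The checks performed by both sides can be illustrated with a toy model. Everything in this sketch is an assumption made for the example: the struct layouts, the function names, the address strings and the one-hour signature lifetime are hypothetical and do not reflect GNUnet's wire format, and the cryptographic signature is replaced by a plain flag. The sketch only shows which conditions each side tests.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_ADDR_MAX 64
#define SKETCH_PONG_LIFETIME_S 3600     /* arbitrary value for the sketch */

/* Toy PING: the challenge and the address Alice used to reach Bob. */
struct SketchPing
{
  uint32_t nonce;
  char address[SKETCH_ADDR_MAX];
};

/* Toy PONG: echoes the challenge and the address.  'signed_by_bob'
   stands in for a real signature over the address and expiration. */
struct SketchPong
{
  uint32_t nonce;
  char address[SKETCH_ADDR_MAX];
  uint64_t expires;
  int signed_by_bob;
};

/* Alice: prepare a PING for the address she is about to validate. */
void
sketch_make_ping (struct SketchPing *ping, uint32_t nonce,
                  const char *address)
{
  ping->nonce = nonce;
  snprintf (ping->address, sizeof (ping->address), "%s", address);
}

/* Bob: answer only if the address named in the PING is one of his own.
   Returns 1 and fills PONG, or 0 if Bob refuses to answer. */
int
sketch_bob_handle_ping (const char *const *own, size_t own_len,
                        const struct SketchPing *ping, uint64_t now,
                        struct SketchPong *pong)
{
  for (size_t i = 0; i < own_len; i++)
  {
    if (0 != strcmp (own[i], ping->address))
      continue;
    pong->nonce = ping->nonce;
    snprintf (pong->address, sizeof (pong->address), "%s", ping->address);
    pong->expires = now + SKETCH_PONG_LIFETIME_S;
    pong->signed_by_bob = 1;
    return 1;
  }
  return 0;
}

/* Alice: accept the PONG only if the (stand-in) signature is valid, the
   challenge matches, the address is the one she used, and the signature
   has not expired.  Returns 1 if the address is now considered valid. */
int
sketch_alice_check_pong (const struct SketchPing *sent,
                         const struct SketchPong *pong, uint64_t now)
{
  return pong->signed_by_bob
         && (pong->nonce == sent->nonce)
         && (0 == strcmp (pong->address, sent->address))
         && (now < pong->expires);
}
```

If Mallory hands Alice a HELLO with his own address, Bob never confirms that address in a PONG, so Alice does not mark it as valid for Bob.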
3090 | |||
3091 | @node NAT library | ||
3092 | @section NAT library | ||
3093 | @c %**end of header | ||
3094 | |||
3095 | The goal of the GNUnet NAT library is to provide a general-purpose API for NAT | ||
3096 | traversal @strong{without} third-party support. So protocols that involve | ||
3097 | contacting a third peer to help establish a connection between two peers are | ||
3098 | outside of the scope of this API. That does not mean that GNUnet doesn't | ||
3099 | support involving a third peer (we can do this with the distance-vector | ||
3100 | transport or using application-level protocols), it just means that the NAT API | ||
3101 | is not concerned with this possibility. The API is written so that it will work | ||
3102 | for IPv6-NAT in the future as well as current IPv4-NAT. Furthermore, the NAT | ||
3103 | API is always used, even for peers that are not behind NAT --- in that case, | ||
3104 | the mapping provided is simply the identity. | ||
3105 | |||
3106 | NAT traversal is initiated by calling @code{GNUNET_NAT_register}. Given a set | ||
3107 | of addresses that the peer has locally bound to (TCP or UDP), the NAT library | ||
3108 | will return (via callback) a (possibly longer) list of addresses the peer | ||
3109 | @strong{might} be reachable under. Internally, depending on the configuration, | ||
3110 | the NAT library will try to punch a hole (using UPnP) or just "know" that the | ||
3111 | NAT was manually punched and generate the respective external IP address (the | ||
3112 | one that should be globally visible) based on the given information. | ||
3113 | |||
3114 | The NAT library also supports ICMP-based NAT traversal. Here, the other peer | ||
3115 | can request connection-reversal by this peer (in this special case, the peer is | ||
3116 | even allowed to configure a port number of zero). If the NAT library detects a | ||
3117 | connection-reversal request, it returns the respective target address to the | ||
3118 | client as well. It should be noted that connection-reversal is currently only | ||
3119 | intended for TCP, so other plugins @strong{must} pass @code{NULL} for the | ||
3120 | reversal callback. Naturally, the NAT library also supports requesting | ||
3121 | connection reversal from a remote peer (@code{GNUNET_NAT_run_client}). | ||
3122 | |||
3123 | Once initialized, the NAT handle can be used to test if a given address is | ||
3124 | possibly a valid address for this peer (@code{GNUNET_NAT_test_address}). This | ||
3125 | is used for validating our addresses when generating PONGs. | ||
3126 | |||
3127 | Finally, the NAT library contains an API to test if our NAT configuration is | ||
3128 | correct. Using @code{GNUNET_NAT_test_start} @strong{before} binding to the | ||
3129 | respective port, the NAT library can be used to test if the configuration | ||
3130 | works. The test function acts as a local client, initializes the NAT traversal, | ||
3131 | then contacts a @code{gnunet-nat-server} (running by default on | ||
3132 | @code{gnunet.org}) and asks for a connection to be established. This way, it is | ||
3133 | easy to test if the current NAT configuration is valid. | ||
3134 | |||
3135 | @node Distance-Vector plugin | ||
3136 | @section Distance-Vector plugin | ||
3137 | @c %**end of header | ||
3138 | |||
3139 | The Distance Vector (DV) transport is a transport mechanism that allows peers | ||
3140 | to act as relays for each other, thereby connecting peers that would otherwise | ||
3141 | be unable to connect. This gives a larger connection set to applications that | ||
3142 | may work better with more peers to choose from (for example, File Sharing | ||
3143 | and/or DHT). | ||
3144 | |||
3145 | The Distance Vector transport essentially has two functions. The first is | ||
3146 | "gossiping" connection information about more distant peers to directly | ||
3147 | connected peers. The second is taking messages intended for non-directly | ||
3148 | connected peers and encapsulating them in a DV wrapper that contains the | ||
3149 | required information for routing the message through forwarding peers. Via | ||
3150 | gossiping, optimal routes through the known DV neighborhood are discovered and | ||
3151 | utilized and the message encapsulation provides some benefits in addition to | ||
3152 | simply getting the message from the correct source to the proper destination. | ||
3153 | |||
3154 | The gossiping function of DV provides an up to date routing table of peers that | ||
3155 | are available up to some number of hops. We call this a fisheye view of the | ||
3156 | network (like a fish, nearby objects are known while more distant ones | ||
3157 | unknown). Gossip messages are sent only to directly connected peers, but they | ||
3158 | are sent about other known peers within the "fisheye distance". Whenever two | ||
3159 | peers connect, they immediately gossip to each other about their appropriate | ||
3160 | other neighbors. They also gossip about the newly connected peer to previously | ||
3161 | connected neighbors. In order to keep the routing tables up to date, disconnect | ||
3162 | notifications are propagated as gossip as well (because disconnects may not be | ||
3163 | sent/received, timeouts are also used to remove stagnant routing table entries). | ||
3164 | |||
3165 | Routing of messages via DV is straightforward. When the DV transport is | ||
3166 | notified of a message destined for a non-direct neighbor, the appropriate | ||
3167 | forwarding peer is selected, and the base message is encapsulated in a DV | ||
3168 | message which contains information about the initial peer and the intended | ||
3169 | recipient. At each forwarding hop, the initial peer is validated (the | ||
3170 | forwarding peer ensures that it has the initial peer in its neighborhood, | ||
3171 | otherwise the message is dropped). Next the base message is re-encapsulated in | ||
3172 | a new DV message for the next hop in the forwarding chain (or delivered to the | ||
3173 | current peer, if it has arrived at the destination). | ||
3174 | |||
3175 | Assume a three peer network with peers Alice, Bob and Carol. Assume that Alice | ||
3176 | <-> Bob and Bob <-> Carol are direct (e.g. over TCP or UDP transports) | ||
3177 | connections, but that Alice cannot directly connect to Carol. This may be the | ||
3178 | case due to NAT or firewall restrictions, or perhaps based on one of the peers' | ||
3179 | respective configurations. If the Distance Vector transport is enabled on all | ||
3180 | three peers, it will automatically discover (from the gossip protocol) that | ||
3181 | Alice and Carol can connect via Bob and provide a "virtual" Alice <-> Carol | ||
3182 | connection. Routing between Alice and Carol happens as follows: Alice creates a | ||
3183 | message destined for Carol and notifies the DV transport about it. The DV | ||
3184 | transport at Alice looks up Carol in the routing table and finds that the | ||
3185 | message must be sent through Bob for Carol. The message is encapsulated setting | ||
3186 | Alice as the initiator and Carol as the destination and sent to Bob. Bob | ||
3187 | receives the message, verifies that both Alice and Carol are known to Bob, and | ||
3188 | re-wraps the message in a new DV message for Carol. The DV transport at Carol | ||
3189 | receives this message, unwraps the original message, and delivers it to Carol | ||
3190 | as though it came directly from Alice. | ||
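
The Alice, Bob and Carol walk-through can be modelled with a small routing table. This sketch is only an illustration: the struct layouts and function names are hypothetical, peers are identified by plain strings instead of peer identities, and gossiping, timeouts and the actual DV message format are left out. It shows the lookup at the sender and the validate-then-forward decision at each hop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One routing table entry: reach 'target' by sending to 'next_hop'. */
struct SketchRoute
{
  const char *target;
  const char *next_hop;
  unsigned int hops;
};

/* Toy DV wrapper: initial peer, intended recipient and base message. */
struct SketchDvMessage
{
  const char *sender;
  const char *recipient;
  const char *payload;
};

/* Look up the peer to forward to in order to reach TARGET.
   Returns NULL if TARGET is not in the table. */
const char *
sketch_dv_next_hop (const struct SketchRoute *table, size_t table_len,
                    const char *target)
{
  for (size_t i = 0; i < table_len; i++)
    if (0 == strcmp (table[i].target, target))
      return table[i].next_hop;
  return NULL;
}

/* Decide what peer SELF does with MSG.  Returns SELF if the message has
   arrived and is delivered locally, the next hop if it is forwarded, or
   NULL if it is dropped (unknown initial peer or no route). */
const char *
sketch_dv_forward (const struct SketchRoute *table, size_t table_len,
                   const char *self, const struct SketchDvMessage *msg)
{
  if (0 == strcmp (msg->recipient, self))
    return self;                /* arrived at the destination */
  if (NULL == sketch_dv_next_hop (table, table_len, msg->sender))
    return NULL;                /* initial peer not in neighborhood */
  return sketch_dv_next_hop (table, table_len, msg->recipient);
}
```

With Alice's table listing Carol as reachable via Bob in two hops, Alice hands the wrapped message to Bob, Bob finds both Alice and Carol in his table and forwards it, and Carol delivers it locally.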
3191 | |||
3192 | @node SMTP plugin | ||
3193 | @section SMTP plugin | ||
3194 | @c %**end of header | ||
3195 | |||
3196 | This page describes the new SMTP transport plugin for GNUnet as it exists in | ||
3197 | the 0.7.x and 0.8.x branch. SMTP support is currently not available in GNUnet | ||
3198 | 0.9.x. This page also describes the transport layer abstraction (as it existed | ||
3199 | in 0.7.x and 0.8.x) in more detail and gives some benchmarking results. The | ||
3200 | performance results presented are quite old and may be outdated at this point. | ||
3201 | @itemize @bullet | ||
3202 | @item Why use SMTP for a peer-to-peer transport? | ||
3203 | @item How does it work? | ||
3204 | @item How do I configure my peer? | ||
3205 | @item How do I test if it works? | ||
3206 | @item How fast is it? | ||
3207 | @item Is there any additional documentation? | ||
3208 | @end itemize | ||
3209 | |||
3210 | |||
3211 | @menu | ||
3212 | * Why use SMTP for a peer-to-peer transport?:: | ||
3213 | * How does it work?:: | ||
3214 | * How do I configure my peer?:: | ||
3215 | * How do I test if it works?:: | ||
3216 | * How fast is it?:: | ||
3217 | @end menu | ||
3218 | |||
3219 | @node Why use SMTP for a peer-to-peer transport? | ||
3220 | @subsection Why use SMTP for a peer-to-peer transport? | ||
3221 | @c %**end of header | ||
3222 | |||
3223 | There are many reasons why one would not want to use SMTP: | ||
3224 | @itemize @bullet | ||
3225 | @item SMTP uses more bandwidth than TCP, UDP or HTTP. | ||
3226 | @item SMTP has a much higher latency. | ||
3227 | @item SMTP requires significantly more computation (encoding and decoding time) | ||
3228 | for the peers. | ||
3229 | @item SMTP is significantly more complicated to configure. | ||
3230 | @item SMTP may be abused by tricking GNUnet into sending mail to | ||
3231 | non-participating third parties. | ||
3232 | @end itemize | ||
3233 | |||
3234 | So why would anybody want to use SMTP? | ||
3235 | @itemize @bullet | ||
3236 | @item SMTP can be used to contact peers behind NAT boxes (in virtual private | ||
3237 | networks). | ||
3238 | @item SMTP can be used to circumvent policies that limit or prohibit | ||
3239 | peer-to-peer traffic by masquerading as "legitimate" traffic. | ||
3240 | @item SMTP uses E-mail addresses which are independent of a specific IP, which | ||
3241 | can be useful to address peers that use dynamic IP addresses. | ||
3242 | @item SMTP can be used to initiate a connection (e.g. initial address exchange) | ||
3243 | and peers can then negotiate the use of a more efficient protocol (e.g. TCP) | ||
3244 | for the actual communication. | ||
3245 | @end itemize | ||
3246 | |||
3247 | In summary, SMTP can for example be used to send a message to a peer behind a | ||
3248 | NAT box that has a dynamic IP to tell the peer to establish a TCP connection | ||
3249 | to a peer outside of the private network. Even an extraordinary overhead for | ||
3250 | this first message would be irrelevant in this type of situation. | ||
3251 | |||
3252 | @node How does it work? | ||
3253 | @subsection How does it work? | ||
3254 | @c %**end of header | ||
3255 | |||
3256 | When a GNUnet peer needs to send a message to another GNUnet peer that has | ||
3257 | advertised (only) an SMTP transport address, GNUnet base64-encodes the message | ||
3258 | and sends it in an E-mail to the advertised address. The advertisement | ||
3259 | contains a filter which is placed in the E-mail header, such that the | ||
3260 | receiving host can filter the tagged E-mails and forward them to the GNUnet peer | ||
3261 | process. The filter can be specified individually by each peer and be changed | ||
3262 | over time. This makes it impossible to censor GNUnet E-mail messages by | ||
3263 | searching for a generic filter. | ||
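
The encoding step can be sketched as follows. This is not the plugin's code: the function names are hypothetical, and the mail layout (the peer's filter line as the only header, an empty line, then the base64 body) is an assumption made for the example. The base64 routine itself follows the standard alphabet with '=' padding.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static const char sketch_b64[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Base64-encode LEN bytes from IN into OUT and NUL-terminate OUT.
   OUT must have room for 4 * ((LEN + 2) / 3) + 1 bytes.
   Returns the number of characters written (without the NUL). */
size_t
sketch_base64_encode (const unsigned char *in, size_t len, char *out)
{
  size_t o = 0;

  for (size_t i = 0; i < len; i += 3)
  {
    /* Pack up to three input bytes into one 24-bit group. */
    unsigned long v = (unsigned long) in[i] << 16;

    if (i + 1 < len)
      v |= (unsigned long) in[i + 1] << 8;
    if (i + 2 < len)
      v |= (unsigned long) in[i + 2];
    out[o++] = sketch_b64[(v >> 18) & 63];
    out[o++] = sketch_b64[(v >> 12) & 63];
    out[o++] = (i + 1 < len) ? sketch_b64[(v >> 6) & 63] : '=';
    out[o++] = (i + 2 < len) ? sketch_b64[v & 63] : '=';
  }
  out[o] = '\0';
  return o;
}

/* Build a minimal mail: the receiver's advertised filter line as a
   header, an empty line, then the base64-encoded message.
   Returns 0 on success, -1 if a buffer is too small. */
int
sketch_build_mail (const char *filter, const unsigned char *msg,
                   size_t msg_len, char *buf, size_t buf_len)
{
  char body[1024];
  int n;

  if (4 * ((msg_len + 2) / 3) + 1 > sizeof (body))
    return -1;
  sketch_base64_encode (msg, msg_len, body);
  n = snprintf (buf, buf_len, "%s\r\n\r\n%s\r\n", filter, body);
  return ((n < 0) || ((size_t) n >= buf_len)) ? -1 : 0;
}
```

Because the filter line is supplied per receiver, two peers that advertise different filters produce mails with different headers, which is what makes a generic censorship rule hard to write.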
3264 | |||
3265 | @node How do I configure my peer? | ||
3266 | @subsection How do I configure my peer? | ||
3267 | @c %**end of header | ||
3268 | |||
3269 | First, you need to configure @code{procmail} to filter your inbound E-mail for | ||
3270 | GNUnet traffic. The GNUnet messages must be delivered into a pipe, for example | ||
3271 | @code{/tmp/gnunet.smtp}. You also need to define a filter that is used by | ||
3272 | procmail to detect GNUnet messages. You are free to choose whichever filter | ||
3273 | you like, but you should make sure that it does not occur in your other | ||
3274 | E-mail. In our example, we will use @code{X-mailer: GNUnet}. The | ||
3275 | @code{~/.procmailrc} configuration file then looks like this: | ||
3276 | @example | ||
3277 | :0: | ||
3278 | * ^X-mailer: GNUnet | ||
3279 | /tmp/gnunet.smtp | ||
3280 | # where do you want your other e-mail delivered to (default: /var/spool/mail/) | ||
3281 | :0: | ||
3282 | /var/spool/mail/ | ||
3283 | @end example | ||
3284 | After adding this file, first make sure that your regular E-mail still works | ||
3285 | (e.g. by sending an E-mail to yourself). Then edit the GNUnet configuration. | ||
3286 | In the section @code{SMTP} you need to specify your E-mail address under | ||
3287 | @code{EMAIL}, your mail server (for outgoing mail) under @code{SERVER}, the | ||
3288 | filter (X-mailer: GNUnet in the example) under @code{FILTER} and the name of | ||
3289 | the pipe under @code{PIPE}.@ The completed section could then look like this: | ||
3290 | @example | ||
3291 | EMAIL = me@@mail.gnu.org | ||
3292 | MTU = 65000 | ||
3293 | SERVER = mail.gnu.org:25 | ||
3294 | FILTER = "X-mailer: GNUnet" | ||
3295 | PIPE = /tmp/gnunet.smtp | ||
3296 | @end example | ||
3297 | Finally, you need to add @code{smtp} to the list of @code{TRANSPORTS} in the @code{GNUNETD} | ||
3298 | section. GNUnet peers will use the E-mail address that you specified to contact your peer | ||
3299 | until the advertisement times out. Thus, if you are not sure that everything works properly, | ||
3300 | or if you do not plan to be online for long, you may want to configure a short timeout, | ||
3301 | e.g. just one hour. For this, set @code{HELLOEXPIRES} to @code{1} in the @code{GNUNETD} section. | ||
3302 | |||
3303 | This should be it, but you will probably want to test it first. | ||
3304 | @node How do I test if it works? | ||
3305 | @subsection How do I test if it works? | ||
3306 | @c %**end of header | ||
3307 | |||
3308 | Any transport can be subjected to some rudimentary tests using the | ||
3309 | @code{gnunet-transport-check} tool. The tool sends a message to the local node | ||
3310 | via the transport and checks that a valid message is received. While this test | ||
3311 | does not involve other peers and cannot check if firewalls or other network | ||
3312 | obstacles prohibit proper operation, this is a great test case for the SMTP | ||
3313 | transport since it tests nearly all of the functionality. | ||
3314 | |||
3315 | @code{gnunet-transport-check} should only be used without running | ||
3316 | @code{gnunetd} at the same time. By default, @code{gnunet-transport-check} | ||
3317 | tests all transports that are specified in the configuration file. But you can | ||
3318 | specifically test SMTP by giving the option @code{--transport=smtp}. | ||
3319 | |||
3320 | Note that this test always checks if a transport can receive and send. While | ||
3321 | you can configure most transports to only receive or only send messages, this | ||
3322 | test will only work if you have configured the transport to send and receive | ||
3323 | messages. | ||
3324 | |||
3325 | @node How fast is it? | ||
3326 | @subsection How fast is it? | ||
3327 | @c %**end of header | ||
3328 | |||
3329 | We have measured the performance of the UDP, TCP and SMTP transport layers | ||
3330 | directly and when used from an application using the GNUnet core. Measuring | ||
3331 | just the transport layer gives a better view of the actual overhead of the | ||
3332 | protocol, whereas evaluating the transport from the application puts the | ||
3333 | overhead into perspective from a practical point of view. | ||
3334 | |||
3335 | The loopback measurements of the SMTP transport were performed on three | ||
3336 | different machines spanning a range of modern SMTP configurations. We used a | ||
3337 | PIII-800 running RedHat 7.3 with the Purdue Computer Science configuration | ||
3338 | which includes filters for spam. We also used a Xeon 2 GHz with a vanilla | ||
3339 | RedHat 8.0 sendmail configuration. Furthermore, we used qmail on a PIII-1000 | ||
3340 | running Sorcerer GNU Linux (SGL). The numbers for UDP and TCP are provided | ||
3341 | using the SGL configuration. The qmail benchmark uses qmail's internal | ||
3342 | filtering whereas the sendmail benchmarks rely on procmail to filter and | ||
3343 | deliver the mail. We used the transport layer to send a message of b bytes | ||
3344 | (excluding transport protocol headers) directly to the local machine. This | ||
3345 | way, network latency and packet loss on the wire have no impact on the | ||
3346 | timings. n messages were sent sequentially over the transport layer, sending | ||
3347 | message i+1 after the i-th message was received. All messages were sent over | ||
3348 | the same connection and the time to establish the connection was not taken | ||
3349 | into account since this overhead is minuscule in practice --- as long as a | ||
3350 | connection is used for a significant number of messages. | ||
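
The measurement procedure described above can be sketched as follows. This is only an illustration, assuming a plain UDP socket pair on the loopback interface; the function name @code{measure_sequential} and its parameters are hypothetical and this is not the benchmark code that produced the numbers in the table.

```python
import socket
import time


def measure_sequential(n, b):
    """Send n messages of b bytes over loopback, strictly one after another.

    Message i+1 is only sent after message i has been received, and the
    socket setup happens before the clock starts, as described in the text.
    Returns (number of messages received intact, elapsed seconds).
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))         # let the OS pick a free port
    receiver.settimeout(5.0)                # do not hang forever on a lost packet
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.connect(receiver.getsockname())  # setup is excluded from the timing
    payload = bytes(b)
    received = 0
    start = time.perf_counter()
    try:
        for _ in range(n):
            sender.send(payload)
            data = receiver.recv(65535)
            if len(data) == b:
                received += 1
    finally:
        elapsed = time.perf_counter() - start
        sender.close()
        receiver.close()
    return received, elapsed
```

Calling @code{measure_sequential(100, 407)} would time 100 sequential messages of 407 bytes; because everything stays on the local machine, network latency and packet loss on the wire do not influence the result.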
3351 | |||
3352 | @multitable @columnfractions .20 .15 .15 .15 .15 .15 | ||
3353 | @headitem Transport @tab UDP @tab TCP @tab SMTP (Purdue sendmail) @tab SMTP (RH 8.0) @tab SMTP (SGL qmail) | ||
3354 | @item 11 bytes @tab 31 ms @tab 55 ms @tab 781 s @tab 77 s @tab 24 s | ||
3355 | @item 407 bytes @tab 37 ms @tab 62 ms @tab 789 s @tab 78 s @tab 25 s | ||
3356 | @item 1,221 bytes @tab 46 ms @tab 73 ms @tab 804 s @tab 78 s @tab 25 s | ||
3357 | @end multitable | ||
3358 | |||
3359 | The benchmarks show that UDP and TCP are, as expected, both significantly | ||
3360 | faster than any of the SMTP services. Among the SMTP implementations, | ||
3361 | there can be significant differences depending on the SMTP configuration. | ||
3362 | Filtering with an external tool like procmail that needs to re-parse its | ||
3363 | configuration for each mail can be very expensive. Applying spam filters can | ||
3364 | also significantly impact the performance of the underlying SMTP | ||
3365 | implementation. The microbenchmark shows that SMTP can be a viable solution | ||
3366 | for initiating peer-to-peer sessions: a couple of seconds to connect to a peer | ||
3367 | are probably not even going to be noticed by users. The next benchmark | ||
3368 | measures the possible throughput for a transport. Throughput can be measured | ||
3369 | by sending multiple messages in parallel and measuring packet loss. Note that | ||
3370 | not only UDP but also the TCP transport can actually lose messages since the | ||
3371 | TCP implementation drops messages if the @code{write} to the socket would | ||
3372 | block. While the SMTP protocol never drops messages itself, it is often so | ||
3373 | slow that only a fraction of the messages can be sent and received in the | ||
3374 | given time bounds. For this benchmark we report the message loss after | ||
3375 | allowing t time for sending m messages. If messages were not sent (or | ||
3376 | received) after an overall timeout of t, they were considered lost. The | ||
3377 | benchmark was performed using two Xeon 2 GHz machines running RedHat 8.0 with | ||
3378 | sendmail. The machines were connected with a direct 100 Mbit Ethernet | ||
3379 | connection. Figures udp1200, tcp1200 and smtp-MTUs show that the throughput | ||
3380 | for messages of size 1,200 octets is 2,343 kbps, 3,310 kbps and 6 kbps for | ||
3381 | UDP, TCP and SMTP respectively. The high per-message overhead of SMTP can be | ||
3382 | reduced by increasing the MTU; for example, an MTU of 12,000 octets improves | ||
3383 | the throughput to 13 kbps, as figure smtp-MTUs shows. Our research paper has | ||
3384 | some more details on the benchmarking results. | ||
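
To relate message loss and throughput to each other, the arithmetic can be written down as a short sketch. The helper names below are hypothetical, and the sketch assumes that 1 kbps means 1,000 bits per second, which the text does not state explicitly.

```python
def message_loss(sent, received):
    """Fraction of the m messages that did not arrive within the timeout t."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 1.0 - received / sent


def throughput_kbps(received, size_octets, seconds):
    """Payload throughput in kbps (assumption: 1 kbps = 1000 bit/s)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return received * size_octets * 8 / seconds / 1000.0


def messages_per_second(kbps, size_octets):
    """Invert throughput_kbps: how many messages per second a rate implies."""
    return kbps * 1000.0 / (size_octets * 8)
```

Under this assumption, the reported 2,343 kbps for 1,200-octet messages over UDP corresponds to roughly 244 messages per second, while 6 kbps over SMTP corresponds to less than one message per second.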
3385 | |||
3386 | @node Bluetooth plugin | ||
3387 | @section Bluetooth plugin | ||
3388 | @c %**end of header | ||
3389 | |||
3390 | This section describes the new Bluetooth transport plugin for GNUnet. The plugin | ||
3391 | is still in the testing stage, so do not expect it to work perfectly. If you | ||
3392 | have any questions or problems, ask on the IRC channel. | ||
3393 | @itemize @bullet | ||
3394 | @item What do I need to use the Bluetooth plugin transport? | ||
3395 | @item How does it work? | ||
3396 | @item What possible errors should I be aware of? | ||
3397 | @item How do I configure my peer? | ||
3398 | @item How can I test it? | ||
3399 | @end itemize | ||
3400 | |||
3401 | |||
3402 | |||
3403 | @menu | ||
3404 | * What do I need to use the Bluetooth plugin transport?:: | ||
3405 | * How does it work2?:: | ||
3406 | * What possible errors should I be aware of?:: | ||
3407 | * How do I configure my peer2?:: | ||
3408 | * How can I test it?:: | ||
3409 | * The implementation of the Bluetooth transport plugin:: | ||
3410 | @end menu | ||
3411 | |||
3412 | @node What do I need to use the Bluetooth plugin transport? | ||
3413 | @subsection What do I need to use the Bluetooth plugin transport? | ||
3414 | @c %**end of header | ||
3415 | |||
3416 | If you are a Linux user and you want to use the Bluetooth transport plugin you | ||
3417 | should install the BlueZ development libraries (if they aren't already | ||
3418 | installed). For instructions about how to install the libraries you should | ||
3419 | check out the BlueZ site (@uref{http://www.bluez.org/, http://www.bluez.org}). | ||
3420 | If you don't know if you have the necessary libraries, don't worry, just run | ||
3421 | the GNUnet configure script and you will be able to see a notification at the | ||
3422 | end which will warn you if you don't have the necessary libraries. | ||
3423 | |||
3424 | If you are a Windows user you should have installed | ||
3425 | @emph{MinGW}/@emph{MSys2} with the latest updates (especially the | ||
3426 | @emph{ws2bth} header). If this is your first build of GNUnet on Windows you | ||
3427 | should check out the SBuild repository. It will semi-automatically assemble a | ||
3428 | @emph{MinGW}/@emph{MSys2} installation with a lot of extra packages which are | ||
3429 | needed for the GNUnet build, which will ease your work. Finally, you just | ||
3430 | have to be sure that you have the correct drivers for your Bluetooth device | ||
3431 | installed and that your device is on and in discoverable mode. The Windows | ||
3432 | Bluetooth stack supports only the RFCOMM protocol, so we cannot turn on your | ||
3433 | device programmatically! | ||
3434 | |||
3435 | @node How does it work2? | ||
3436 | @subsection How does it work? | ||
3437 | @c %**end of header | ||
3438 | |||
3439 | The Bluetooth transport plugin uses virtually the same code as the WLAN plugin | ||
3440 | and only the helper binary is different. The helper takes a single argument, | ||
3441 | which represents the interface name and is specified in the configuration | ||
3442 | file. Here are the basic steps that are followed by the helper binary used on | ||
3443 | Linux: | ||
3444 | |||
3445 | @itemize @bullet | ||
3446 | @item it verifies that the name corresponds to a Bluetooth interface name | ||
3447 | @item it verifies that the interface is up (if it is not, it tries to bring it up) | ||
3448 | @item it tries to enable the page and inquiry scan in order to make the device | ||
3449 | discoverable and to accept incoming connection requests | ||
3450 | @emph{The above operations require root access so you should start the | ||
3451 | transport plugin with root privileges.} | ||
3452 | @item it finds an available port number, registers an SDP service that other | ||
3453 | devices use to find out on which port the server is listening, and switches | ||
3454 | the socket to listening mode | ||
3455 | @item it sends a HELLO message with its address | ||
3456 | @item finally it forwards traffic from the reading sockets to STDOUT and | ||
3457 | from STDIN to the writing socket | ||
3458 | @end itemize | ||
3459 | |||
3460 | Once in a while the device performs an inquiry scan to discover nearby | ||
3461 | devices and randomly sends them HELLO messages for peer discovery. | ||
3462 | |||
3463 | @node What possible errors should I be aware of? | ||
3464 | @subsection What possible errors should I be aware of? | ||
3465 | @c %**end of header | ||
3466 | |||
3467 | @emph{This section is intended for Linux users.} | ||
3468 | |||
3469 | There are many ways in which things could go wrong, but I will try to | ||
3470 | present some tools that you could use for debugging and some scenarios. | ||
3471 | @itemize @bullet | ||
3472 | |||
3473 | @item @code{bluetoothd -n -d}: use this command to enable logging in the | ||
3474 | foreground and to print the logging messages | ||
3475 | |||
3476 | @item @code{hciconfig}: can be used to configure the Bluetooth devices. If you | ||
3477 | run it without any arguments it will print information about the state of the | ||
3478 | interfaces. So if you receive an error that the device couldn't be brought up | ||
3479 | you should try to bring it up manually and see if that works (use @code{hciconfig | ||
3480 | -a hciX up}). If you can't and the Bluetooth address has the form | ||
3481 | 00:00:00:00:00:00 it means that there is something wrong with the D-Bus daemon | ||
3482 | or with the Bluetooth daemon. Use the @code{bluetoothd} tool to see the logs. | ||
3483 | |||
3484 | @item @code{sdptool}: can be used to control and interrogate SDP servers. If you | ||
3485 | encounter problems regarding the SDP server (like the SDP server is down) you | ||
3486 | should check if the D-Bus daemon is running correctly and if the | ||
3487 | Bluetooth daemon started correctly (use the @code{bluetoothd} tool). Also, sometimes | ||
3488 | the SDP service could work but somehow the device couldn't register its | ||
3489 | service. Use @code{sdptool browse [dev-address]} to see if the service is | ||
3490 | registered. There should be a service with the name of the interface and GNUnet | ||
3491 | as provider. | ||
3492 | |||
3493 | @item @code{hcitool}: another useful tool which can be used to configure the | ||
3494 | device and to send some particular commands to it. | ||
3495 | |||
3496 | @item @code{hcidump}: can be used for low-level debugging | ||
3497 | @end itemize | ||
3498 | |||
3499 | @node How do I configure my peer2? | ||
3500 | @subsection How do I configure my peer? | ||
3501 | @c %**end of header | ||
3502 | |||
3503 | On Linux, you just have to be sure that the interface name corresponds to the | ||
3504 | one that you want to use. Use the @code{hciconfig} tool to check that. By | ||
3505 | default it is set to hci0 but you can change it. | ||
3506 | |||
3507 | A basic configuration looks like this: | ||
3508 | @example | ||
[transport-bluetooth]
# Name of the interface (typically hciX)
INTERFACE = hci0
# Real hardware, no testing
TESTMODE = 0
TESTING_IGNORE_KEYS = ACCEPT_FROM;
3514 | @end example | ||
3515 | |||
3516 | |||
3517 | In order to use the Bluetooth transport plugin when the transport service is | ||
3518 | started, you must add the plugin name to the default transport service plugins | ||
3519 | list. For example: | ||
3520 | @example | ||
[transport]
...
PLUGINS = dns bluetooth
...
3522 | @end example | ||
3523 | |||
3524 | If you want to use only the Bluetooth plugin, set @emph{PLUGINS = bluetooth}. | ||
3525 | |||
3526 | On Windows, you cannot specify which device to use. The only thing that you | ||
3527 | should do is to add @emph{bluetooth} to the plugins list of the transport | ||
3528 | service. | ||
3529 | |||
3530 | @node How can I test it? | ||
3531 | @subsection How can I test it? | ||
3532 | @c %**end of header | ||
3533 | |||
3534 | If you have two Bluetooth devices on the same Linux machine, you | ||
3535 | must: | ||
3536 | @itemize @bullet | ||
3537 | |||
3538 | @item create two different configuration files (one which will use the first | ||
3539 | interface (@emph{hci0}) and the other which will use the second interface | ||
3540 | (@emph{hci1})). Let's name them @emph{peer1.conf} and @emph{peer2.conf}. | ||
3541 | |||
3542 | @item run @emph{gnunet-peerinfo -c peerX.conf -s} in order to generate the | ||
3543 | peers' private keys. The @strong{X} must be replaced with 1 or 2. | ||
3544 | |||
3545 | @item run @emph{gnunet-arm -c peerX.conf -s -i=transport} in order to start the | ||
3546 | transport service. (Make sure that you have "bluetooth" on the transport | ||
3547 | plugins list if the Bluetooth transport service doesn't start.) | ||
3548 | |||
3549 | @item run @emph{gnunet-peerinfo -c peer1.conf -s} to get the first peer's ID. | ||
3550 | If you already know your peer ID (you saved it from the first command), this | ||
3551 | can be skipped. | ||
3552 | |||
3553 | @item run @emph{gnunet-transport -c peer2.conf -p=PEER1_ID -s} to start sending | ||
3554 | data for benchmarking to the other peer. | ||
3555 | @end itemize | ||
3556 | |||
3557 | |||
3558 | This scenario will try to connect the second peer to the first one and then | ||
3559 | start sending data for benchmarking. | ||
3560 | |||
3561 | On Windows you cannot test the plugin functionality using two Bluetooth devices | ||
3562 | on the same machine because after you install the drivers there will be | ||
3563 | conflicts between the Bluetooth stacks. (At least that is what happened on | ||
3564 | my machine: I wasn't able to use the BlueSoleil stack and the WIDCOMM one at | ||
3565 | the same time.) | ||
3566 | |||
3567 | If you have two different machines and your configuration files are good you | ||
3568 | can use the same scenario presented at the beginning of this section. | ||
3569 | |||
3570 | Another way to test the plugin functionality is to create your own application | ||
3571 | which will use the GNUnet framework with the Bluetooth transport service. | ||
3572 | |||
3573 | @node The implementation of the Bluetooth transport plugin | ||
3574 | @subsection The implementation of the Bluetooth transport plugin | ||
3575 | @c %**end of header | ||
3576 | |||
3577 | This section describes the implementation of the Bluetooth transport plugin. | ||
3578 | |||
3579 | First I want to remind you that the Bluetooth transport plugin uses virtually | ||
3580 | the same code as the WLAN plugin and only the helper binary is different. Also | ||
3581 | the scope of the helper binary from the Bluetooth transport plugin is the same | ||
3582 | as the one used for the WLAN transport plugin: it accesses the interface and | ||
3583 | then it forwards traffic in both directions between the Bluetooth interface | ||
3584 | and stdin/stdout of the process involved. | ||
3585 | |||
3586 | The Bluetooth transport plugin can be used on both Linux and Windows | ||
3587 | platforms. | ||
3588 | |||
3589 | @itemize @bullet | ||
3590 | @item Linux functionality | ||
3591 | @item Windows functionality | ||
3592 | @item Pending Features | ||
3593 | @end itemize | ||
3594 | |||
3595 | |||
3596 | |||
3597 | @menu | ||
3598 | * Linux functionality:: | ||
3599 | * THE INITIALIZATION:: | ||
3600 | * THE LOOP:: | ||
3601 | * Details about the broadcast implementation:: | ||
3602 | * Windows functionality:: | ||
3603 | * Pending features:: | ||
3604 | @end menu | ||
3605 | |||
3606 | @node Linux functionality | ||
3607 | @subsubsection Linux functionality | ||
3608 | @c %**end of header | ||
3609 | |||
3610 | In order to implement the plugin functionality on Linux I used the BlueZ | ||
3611 | stack. For the communication with the other devices I used the RFCOMM | ||
3612 | protocol. Also I used the HCI protocol to gain some control over the device. | ||
3613 | The helper binary takes a single argument (the name of the Bluetooth | ||
3614 | interface) and runs in two stages: | ||
3615 | |||
3616 | @c %** 'THE INITIALIZATION' should be in bigger letters or stand out, not | ||
3617 | @c %** starting a new section? | ||
3618 | @node THE INITIALIZATION | ||
3619 | @subsubsection THE INITIALIZATION | ||
3620 | |||
3621 | @itemize @bullet | ||
3622 | @item first, it checks if we have root privileges (@emph{Remember that we need | ||
3623 | to have root privileges in order to be able to bring the interface up if it is | ||
3624 | down or to change its state.}). | ||
3625 | |||
3626 | @item second, it verifies if the interface with the given name exists. | ||
3627 | |||
3628 | @strong{If the interface with that name exists and it is a Bluetooth | ||
3629 | interface:} | ||
3630 | |||
3631 | @item it creates an RFCOMM socket which will be used for listening and calls the | ||
3632 | @emph{open_device} method | ||
3633 | |||
3634 | In the @emph{open_device} method, the helper: | ||
3635 | @itemize @bullet | ||
3636 | @item creates an HCI socket used to send control events to the device | ||
3637 | @item searches for the device ID using the interface name | ||
3638 | @item saves the device MAC address | ||
3639 | @item checks if the interface is down and tries to bring it UP | ||
3640 | @item checks if the interface is in discoverable mode and tries to make it | ||
3641 | discoverable | ||
3642 | @item closes the HCI socket and binds the RFCOMM one | ||
3643 | @item switches the RFCOMM socket to listening mode | ||
3644 | @item registers the SDP service (the service will be used by the other devices | ||
3645 | to get the port on which this device is listening on) | ||
3646 | @end itemize | ||
3647 | |||
3648 | @item drops the root privileges | ||
3649 | |||
3650 | @strong{If the interface is not a Bluetooth interface the helper exits with a | ||
3651 | suitable error} | ||
3652 | @end itemize | ||
3653 | |||
3654 | @c %** Same as for @node entry above | ||
3655 | @node THE LOOP | ||
3656 | @subsubsection THE LOOP | ||
3657 | |||
3658 | The helper binary uses a list where it saves all the connected neighbour | ||
3659 | devices (@emph{neighbours.devices}) and two buffers (@emph{write_pout} and | ||
3660 | @emph{write_std}). The first message which is sent is a control message with | ||
3661 | the device's MAC address in order to announce the peer's presence to the | ||
3662 | neighbours. Here is a short description of what happens in the main loop: | ||
3663 | |||
3664 | @itemize @bullet | ||
3665 | @item Every time it receives something from STDIN it processes the | ||
3666 | data and saves the message in the first buffer (@emph{write_pout}). When it has | ||
3667 | something in the buffer, it gets the destination address from the buffer, | ||
3668 | searches for the destination address in the list (if there is no connection with | ||
3669 | that device, it creates a new one and saves it to the list) and sends the | ||
3670 | message. | ||
3671 | @item Every time it receives something on the listening socket it accepts | ||
3672 | the connection and saves the socket in a list of reading sockets. | ||
3673 | @item Every time it receives something from a reading socket it parses the | ||
3674 | message, verifies the CRC and saves it in the @emph{write_std} buffer in order | ||
3675 | to be sent later to STDOUT. | ||
3676 | @end itemize | ||
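
The "parse the message and verify the CRC" step can be illustrated with a small sketch. The frame layout used here (a 4-byte length, the payload, then a 4-byte CRC-32) is purely hypothetical and chosen for illustration; the helper's actual message format is not described in this section, and @code{frame} and @code{parse_frame} are made-up names.

```python
import struct
import zlib


def frame(payload):
    """Wrap a payload as: 4-byte big-endian length, payload, 4-byte CRC-32."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return struct.pack("!I", len(payload)) + payload + struct.pack("!I", crc)


def parse_frame(buf):
    """Try to take one frame from the front of buf.

    Returns (payload, remaining bytes), or (None, buf) if the frame is still
    incomplete. Raises ValueError if the CRC does not match, in which case
    the message must not be forwarded to STDOUT.
    """
    if len(buf) < 4:
        return None, buf
    (length,) = struct.unpack("!I", buf[:4])
    end = 4 + length + 4
    if len(buf) < end:
        return None, buf
    payload = buf[4:4 + length]
    (crc,) = struct.unpack("!I", buf[4 + length:end])
    if zlib.crc32(payload) & 0xFFFFFFFF != crc:
        raise ValueError("CRC mismatch")
    return payload, buf[end:]
```

A reader loop would append received bytes to a per-socket buffer and call @code{parse_frame} until it returns @code{None}, copying every verified payload into the @emph{write_std} buffer.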
3677 | |||
3678 | So in the main loop we use the select function to wait until one of the file | ||
3679 | descriptors saved in one of the two file descriptor sets is ready to use. | ||
3680 | The first set (@emph{rfds}) represents the reading set and it could contain the | ||
3681 | list with the reading sockets, the STDIN file descriptor or the listening | ||
3682 | socket. The second set (@emph{wfds}) is the writing set and it could contain | ||
3683 | the sending socket or the STDOUT file descriptor. After the select function | ||
3684 | returns, we check which file descriptor is ready to use and we do what is | ||
3685 | required for that kind of event. @emph{For example:} if it is the | ||
3686 | listening socket then we accept a new connection and save the socket in the | ||
3687 | reading list; if it is the STDOUT file descriptor, then we write to STDOUT the | ||
3688 | message from the @emph{write_std} buffer. | ||
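
One iteration of such a select-based loop can be sketched as follows. This is a simplified illustration and not the helper's code: ordinary stream sockets stand in for the RFCOMM sockets and for STDOUT, the STDIN and @emph{write_pout} branch is left out, and the function name @code{loop_once} is hypothetical.

```python
import select
import socket


def loop_once(listener, readers, out, write_std, timeout=1.0):
    """Run one iteration of the loop.

    listener:  listening socket (stand-in for the RFCOMM listening socket)
    readers:   list of accepted sockets, extended on new connections
    out:       writable socket standing in for STDOUT
    write_std: bytearray holding data that still has to go to `out`
    """
    rfds = [listener] + readers
    # Only wait for writability while there is something to write,
    # otherwise select would return immediately every time.
    wfds = [out] if write_std else []
    readable, writable, _ = select.select(rfds, wfds, [], timeout)
    for fd in readable:
        if fd is listener:
            conn, _addr = listener.accept()   # new incoming connection
            readers.append(conn)
        else:
            data = fd.recv(4096)
            if data:
                write_std.extend(data)        # queue for later delivery
            else:
                readers.remove(fd)            # peer closed the connection
                fd.close()
    if out in writable:
        sent = out.send(bytes(write_std))
        del write_std[:sent]                  # keep whatever was not written
```

Keeping the pending data in a buffer and only adding the output descriptor to @emph{wfds} when the buffer is non-empty avoids busy-looping while still never blocking on a write.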
3689 | |||
3690 | To find out on which port a device is listening we connect to the local SDP | ||
3691 | server and search for the registered service of that device. | ||
3692 | |||
3693 | @emph{You should be aware of the fact that if the device fails to connect to | ||
3694 | another one when trying to send a message it will attempt one more time. If it | ||
3695 | fails again, then it skips the message.} | ||
3696 | @emph{Also you should know that the | ||
3697 | Bluetooth transport plugin has support for @strong{broadcast messages}.} | ||
3698 | |||
3699 | @node Details about the broadcast implementation | ||
3700 | @subsubsection Details about the broadcast implementation | ||
3701 | @c %**end of header | ||
3702 | |||
3703 | First I want to point out that the broadcast functionality for the CONTROL | ||
3704 | messages is not implemented in a conventional way. Since the inquiry scan | ||
3705 | takes too long and it would take some time to send a message to all the | ||
3706 | discoverable devices, I decided to tackle the problem in a different way. Here | ||
3707 | is how I did it: | ||
3708 | |||
3709 | @itemize @bullet | ||
3710 | @item The first time I have to broadcast a message I make an | ||
3711 | inquiry scan and save all the devices' addresses to a vector. | ||
3712 | @item After the inquiry scan ends I take the first address from the list and I | ||
3713 | try to connect to it. If it fails, I try to connect to the next one. If it | ||
3714 | succeeds, I save the socket to a list and send the message to the device. | ||
3715 | @item When I have to broadcast another message, first I search the list for | ||
3716 | a new device to which I'm not connected. If there is no new device on the list | ||
3717 | I go to the beginning of the list and send the message to the old devices. | ||
3718 | After 5 cycles I make a new inquiry scan to check out if there are new | ||
3719 | discoverable devices and save them to the list. If there are no new | ||
3720 | discoverable devices I reset the cycling counter and go again through the old | ||
3721 | list and send messages to the devices saved in it. | ||
3722 | @end itemize | ||
3723 | |||
3724 | @strong{Therefore}: | ||
3725 | |||
3726 | @itemize @bullet | ||
3727 | @item every time I have a broadcast message I look in the list for a | ||
3728 | new device and send the message to it | ||
3729 | @item if I have reached the end of the list 5 times and I'm connected to all the | ||
3730 | devices from the list I make a new inquiry scan. @emph{The number of the list's | ||
3731 | cycles after an inquiry scan could be increased by redefining the MAX_LOOPS | ||
3732 | variable} | ||
3733 | @item when there are no new devices I send messages to the old ones. | ||
3734 | @end itemize | ||
3735 | |||
3736 | This way, the broadcast control messages will reach the devices, but with some delay. | ||
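
The scheduling idea can be sketched in a few lines. This is a simplified model under stated assumptions, not the helper's implementation: connection handling and failed connection attempts are left out, one device is chosen per broadcast message, and the class name @code{BroadcastList} and the @code{inquiry_scan} callback are hypothetical. Only @code{MAX_LOOPS} and its value of 5 are taken from the text.

```python
MAX_LOOPS = 5  # passes over the list between two inquiry scans (from the text)


class BroadcastList:
    def __init__(self, inquiry_scan):
        self.inquiry_scan = inquiry_scan  # callable returning discoverable addresses
        self.devices = []                 # every address found so far
        self.pos = 0                      # index of the next device to serve
        self.loops = 0                    # how often the end of the list was reached
        self.scanned = False

    def _scan(self):
        for addr in self.inquiry_scan():
            if addr not in self.devices:
                self.devices.append(addr)
        self.scanned = True

    def next_target(self):
        """Return the device that gets the next broadcast message, or None."""
        if not self.scanned:
            self._scan()                  # first broadcast: initial inquiry scan
        if self.pos >= len(self.devices): # reached the end of the list
            self.pos = 0                  # go back to the old devices
            self.loops += 1
            if self.loops >= MAX_LOOPS:
                self.loops = 0            # reset the cycle counter
                known = len(self.devices)
                self._scan()
                if len(self.devices) > known:
                    self.pos = known      # serve newly found devices first
        if not self.devices:
            return None
        addr = self.devices[self.pos]
        self.pos += 1
        return addr
```

With two known devices and a third one that only shows up in the second scan, the first ten broadcasts alternate between the two old devices, and the eleventh goes to the new one.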
3737 | |||
3738 | @emph{NOTICE:} When I have to send a message to a certain device, I first check | ||
3739 | the broadcast list to see if we are connected to that device. If not, we try | ||
3740 | to connect to it and in case of success we save the address and the socket in | ||
3741 | the list. If we are already connected to that device we simply use the socket. | ||
3742 | |||
3743 | @node Windows functionality | ||
3744 | @subsubsection Windows functionality | ||
3745 | @c %**end of header | ||
3746 | |||
3747 | For Windows I decided to use the Microsoft Bluetooth stack which has the | ||
3748 | advantage of coming standard since Windows XP SP2. The main disadvantage is | ||
3749 | that it only supports the RFCOMM protocol so we will not be able to have | ||
3750 | low-level control over the Bluetooth device. Therefore it is the user's | ||
3751 | responsibility to check if the device is up and in discoverable mode. Also | ||
3752 | there are no tools which could be used for debugging in order to read the data | ||
3753 | coming from and going to a Bluetooth device, which obviously hindered my work. | ||
3754 | Another thing that slowed down the implementation of the plugin (besides that | ||
3755 | I wasn't too familiar with the win32 API) was that there were some bugs in | ||
3756 | MinGW regarding Bluetooth. Now they are solved but you should keep in mind | ||
3757 | that you should have the latest updates (especially the @emph{ws2bth} header). | ||
3758 | |||
3759 | Besides the fact that it uses Windows Sockets, the Windows implementation | ||
3760 | follows the same principles as the Linux one: | ||
3761 | |||
3762 | @itemize @bullet | ||
3763 | @item | ||
3764 | It has an initialization part where it initializes Windows Sockets, creates an | ||
3765 | RFCOMM socket which will be bound and switched to listening mode and | ||
3766 | registers an SDP service. | ||
3767 | In the Microsoft Bluetooth API there are two ways to work with SDP: | ||
3768 | @itemize @bullet | ||
3769 | @item an easy way which works with very simple service records | ||
3770 | @item a hard way which is useful when you need to update or to delete the | ||
3771 | record | ||
3772 | @end itemize | ||
3773 | @end itemize | ||
3774 | |||
3775 | Since I only needed the SDP service to find out on which port the device is | ||
3776 | listening and that did not change, I decided to use the easy way. In order | ||
3777 | to register the service I used the @emph{WSASetService} function and I | ||
3778 | generated the @emph{Universally Unique Identifier} with the @emph{guidgen.exe} | ||
3779 | Windows tool. | ||
3780 | |||
3781 | In the loop section the only difference from the Linux implementation is that | ||
3782 | I used the GNUNET_NETWORK library for functions like @emph{accept}, | ||
3783 | @emph{bind}, @emph{connect} or @emph{select}. I decided to use the | ||
3784 | GNUNET_NETWORK library because I also needed to interact with the STDIN and | ||
3785 | STDOUT handles and on Windows the select function is only defined for sockets, | ||
3786 | and it will not work for arbitrary file handles. | ||
3787 | |||
3788 | Another difference between the Linux and Windows implementations is that in Linux | ||
3789 | the Bluetooth address is represented in 48 bits while in Windows it is | ||
3790 | represented in 64 bits. Therefore I had to make some changes to the | ||
3791 | @emph{plugin_transport_wlan} header. | ||
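
The size difference can be illustrated with a conversion between the textual form of an address and a 64-bit integer. The sketch assumes that the 48-bit address occupies the low 48 bits of the 64-bit value and that the upper 16 bits are zero; the function names are hypothetical and the actual changes to the header are not shown here.

```python
def mac_to_u64(mac):
    """Convert 'XX:XX:XX:XX:XX:XX' to an integer that fits in 64 bits.

    Assumption: the 48-bit address is stored in the low 48 bits.
    """
    parts = mac.split(":")
    if len(parts) != 6:
        raise ValueError("expected six colon-separated octets")
    value = 0
    for part in parts:
        octet = int(part, 16)
        if not 0 <= octet <= 0xFF:
            raise ValueError("octet out of range")
        value = (value << 8) | octet
    return value


def u64_to_mac(value):
    """Convert a 64-bit value with zero upper 16 bits back to text form."""
    if not 0 <= value < (1 << 48):
        raise ValueError("upper 16 bits must be zero")
    return ":".join("%02X" % ((value >> shift) & 0xFF)
                    for shift in range(40, -8, -8))
```

Any code that copies addresses between the two representations has to account for the two extra bytes, which is why a shared header that assumes a fixed 6-byte address needs to be adapted.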
3792 | |||
3793 | Also, currently on Windows the Bluetooth plugin doesn't have support for | ||
3794 | broadcast messages. When it receives a broadcast message it will skip it. | ||
3795 | |||
3796 | @node Pending features | ||
3797 | @subsubsection Pending features | ||
3798 | @c %**end of header | ||
3799 | |||
3800 | @itemize @bullet | ||
3801 | @item Implement the broadcast functionality on Windows @emph{(currently in | ||
3802 | progress)} | ||
3803 | @item Implement a testcase for the helper: @emph{The testcase consists of a | ||
3804 | program which emulates the plugin and uses the helper. It will simulate | ||
3805 | connections, disconnections and data transfers.} | ||
3806 | @end itemize | ||
3807 | |||
3808 | If you have a new idea about a feature of the plugin or suggestions about how | ||
3809 | I could improve the implementation you are welcome to comment or to contact | ||
3810 | me. | ||
3811 | |||
3812 | @node WLAN plugin | ||
3813 | @section WLAN plugin | ||
3814 | @c %**end of header | ||
3815 | |||
3816 | This section documents how the wlan transport plugin works. Parts which are not | ||
3817 | implemented yet or could be better implemented are described at the end. | ||
3818 | |||
3819 | @node The ATS Subsystem | ||
3820 | @section The ATS Subsystem | ||
3821 | @c %**end of header | ||
3822 | |||
3823 | ATS stands for "automatic transport selection", and the function of ATS in | ||
3824 | GNUnet is to decide which address (and thus transport plugin) should be used | ||
3825 | for two peers to communicate, and what bandwidth limits should be imposed on | ||
3826 | such an individual connection. To help ATS make an informed decision, | ||
3827 | higher-level services inform the ATS service about their requirements and the | ||
3828 | quality of the service rendered. The ATS service also interacts with the | ||
3829 | transport service to be apprised of working addresses and to communicate its | ||
3830 | resource allocation decisions. Finally, the ATS service's operation can be | ||
3831 | observed using a monitoring API. | ||
3832 | |||
3833 | The main logic of the ATS service only collects the available addresses, their | ||
3834 | performance characteristics and the applications' requirements, but does not | ||
3835 | make the actual allocation decision. This last critical step is left to an ATS | ||
3836 | plugin, as we have implemented (currently three) different allocation | ||
3837 | strategies which differ significantly in their performance and maturity, and it | ||
3838 | is still unclear if any particular plugin is generally superior. | ||
3839 | |||
3840 | @node GNUnet's CORE Subsystem | ||
3841 | @section GNUnet's CORE Subsystem | ||
3842 | @c %**end of header | ||
3843 | |||
3844 | The CORE subsystem in GNUnet is responsible for securing link-layer | ||
3845 | communications between nodes in the GNUnet overlay network. CORE builds on the | ||
3846 | TRANSPORT subsystem which provides for the actual, insecure, unreliable | ||
3847 | link-layer communication (for example, via UDP or WLAN), and then adds | ||
3848 | fundamental security to the connections: | ||
3849 | |||
3850 | @itemize @bullet | ||
3851 | @item confidentiality with so-called perfect forward secrecy; we use | ||
3852 | @uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman, | ||
3853 | ECDHE} powered by @uref{http://cr.yp.to/ecdh.html, Curve25519} for the key | ||
3854 | exchange and then use symmetric encryption, encrypting with both | ||
3855 | @uref{http://en.wikipedia.org/wiki/Rijndael, AES-256} and | ||
3856 | @uref{http://en.wikipedia.org/wiki/Twofish, Twofish} | ||
3857 | @item @uref{http://en.wikipedia.org/wiki/Authentication, authentication} is | ||
3858 | achieved by signing the ephemeral keys using @uref{http://ed25519.cr.yp.to/, | ||
3859 | Ed25519}, a deterministic variant of @uref{http://en.wikipedia.org/wiki/ECDSA, | ||
3860 | ECDSA} | ||
3861 | @item integrity protection (using @uref{http://en.wikipedia.org/wiki/SHA-2, | ||
3862 | SHA-512} to do @uref{http://en.wikipedia.org/wiki/Authenticated_encryption, | ||
3863 | encrypt-then-MAC}) | ||
3864 | @item @uref{http://en.wikipedia.org/wiki/Replay_attack, replay} protection | ||
3865 | (using nonces, timestamps, challenge-response, message counters and ephemeral | ||
3866 | keys) | ||
3867 | @item liveness (keep-alive messages, timeout) | ||
3868 | @end itemize | ||
3869 | |||
3870 | @menu | ||
3871 | * Limitations:: | ||
3872 | * When is a peer "connected"?:: | ||
3873 | * libgnunetcore:: | ||
3874 | * The CORE Client-Service Protocol:: | ||
3875 | * The CORE Peer-to-Peer Protocol:: | ||
3876 | @end menu | ||
3877 | |||
3878 | @node Limitations | ||
3879 | @subsection Limitations | ||
3880 | @c %**end of header | ||
3881 | |||
3882 | CORE does not perform @uref{http://en.wikipedia.org/wiki/Routing, routing}; | ||
3883 | using CORE it is only possible to communicate with peers that happen to | ||
3884 | already be "directly" connected with each other. CORE also does not have an | ||
3885 | API to allow applications to establish such "direct" connections --- for this, | ||
3886 | applications can ask TRANSPORT, but TRANSPORT might not be able to establish a | ||
3887 | "direct" connection. The TOPOLOGY subsystem is responsible for trying to keep | ||
3888 | a few "direct" connections open at all times. Applications that need to talk | ||
3889 | to particular peers should use the CADET subsystem, as it can establish | ||
3890 | arbitrary "indirect" connections. | ||
3891 | |||
3892 | Because CORE does not perform routing, CORE must only be used directly by | ||
3893 | applications that either perform their own routing logic (such as anonymous | ||
3894 | file-sharing) or that do not require routing, for example because they are | ||
3895 | based on flooding the network. CORE communication is unreliable and delivery | ||
3896 | is possibly out-of-order. Applications that require reliable communication | ||
3897 | should use the CADET service. Each application can only queue one message per | ||
3898 | target peer with the CORE service at any time; messages cannot be larger than | ||
3899 | approximately 63 kilobytes. If messages are small, CORE may group multiple | ||
3900 | messages (possibly from different applications) prior to encryption. If | ||
3901 | permitted by the application (using the @uref{http://baus.net/on-tcp_cork/, | ||
3902 | cork} option), CORE may delay transmissions to facilitate grouping of multiple | ||
3903 | small messages. If cork is not enabled, CORE will transmit the message as soon | ||
3904 | as TRANSPORT allows it (TRANSPORT is responsible for limiting bandwidth and | ||
3905 | congestion control). CORE does not allow flow control; applications are | ||
3906 | expected to process messages at line-speed. If flow control is needed, | ||
3907 | applications should use the CADET service. | ||
3908 | |||
3909 | @node When is a peer "connected"? | ||
3910 | @subsection When is a peer "connected"? | ||
3911 | @c %**end of header | ||
3912 | |||
3913 | In addition to the security features mentioned above, CORE also provides one | ||
3914 | additional key feature to applications using it, and that is a limited form of | ||
3915 | protocol-compatibility checking. CORE distinguishes between TRANSPORT-level | ||
3916 | connections (which enable communication with other peers) and | ||
3917 | application-level connections. Applications using the CORE API will | ||
3918 | (typically) learn about application-level connections from CORE, and not about | ||
3919 | TRANSPORT-level connections. When a typical application uses CORE, it will | ||
3920 | specify a set of message types (from @code{gnunet_protocols.h}) that it | ||
3921 | understands. CORE will then notify the application about a connection to another | ||
3922 | peer if and only if applications on that peer registered an intersecting | ||
3923 | set of message types with their CORE service. Thus, it is quite possible that | ||
3924 | CORE only exposes a subset of the established direct connections to a | ||
3925 | particular application --- and different applications running above CORE might | ||
3926 | see different sets of connections at the same time. | ||
3927 | |||
3928 | Applications that do not register a handler for any message type are a special | ||
3929 | case. CORE assumes that these applications merely want to monitor connections | ||
3930 | (or "all" messages via other callbacks) and will notify those applications | ||
3931 | about all connections. This is used, for example, by the @code{gnunet-core} | ||
3932 | command-line tool to display the active connections. Note that it is also | ||
3933 | possible that the TRANSPORT service has more active connections than the CORE | ||
3934 | service, as the CORE service first has to perform a key exchange with | ||
3935 | connecting peers before exchanging information about supported message types | ||
3936 | and notifying applications about the new connection. | ||
3937 | |||
3938 | @node libgnunetcore | ||
3939 | @subsection libgnunetcore | ||
3940 | @c %**end of header | ||
3941 | |||
3942 | The CORE API (defined in @code{gnunet_core_service.h}) is the basic messaging | ||
3943 | API used by P2P applications built using GNUnet. It provides applications the | ||
3944 | ability to exchange encrypted messages with the peer's "directly" | ||
3945 | connected neighbours. | ||
3946 | |||
3947 | As CORE connections are generally "direct" connections, applications must not | ||
3948 | assume that they can connect to arbitrary peers this way, as "direct" | ||
3949 | connections may not always be possible. Applications using CORE are notified | ||
3950 | about which peers are connected. Creating new "direct" connections must be | ||
3951 | done using the TRANSPORT API. | ||
3952 | |||
3953 | The CORE API provides unreliable, out-of-order delivery. While the | ||
3954 | implementation tries to ensure timely, in-order delivery, both message losses | ||
3955 | and reordering are not detected and must be tolerated by the application. Most | ||
3956 | importantly, CORE will NOT perform retransmission if messages could not be | ||
3957 | delivered. | ||
3958 | |||
3959 | Note that CORE allows applications to queue one message per connected peer. | ||
3960 | The rate at which each connection operates is influenced by the preferences | ||
3961 | expressed by local applications as well as restrictions imposed by the other | ||
3962 | peer. Local applications can express their preferences for particular | ||
3963 | connections using the "performance" API of the ATS service. | ||
3964 | |||
3965 | Applications that require more sophisticated transmission capabilities, such as | ||
3966 | TCP-like behavior, or that need to send messages to arbitrary remote peers | ||
3967 | should use the CADET API. | ||
3968 | |||
3969 | The typical use of the CORE API is to connect to the CORE service using | ||
3970 | @code{GNUNET_CORE_connect}, process events from the CORE service (such as | ||
3971 | peers connecting, peers disconnecting and incoming messages) and send messages | ||
3972 | to connected peers using @code{GNUNET_CORE_notify_transmit_ready}. Note that | ||
3973 | applications must cancel pending transmission requests if they receive a | ||
3974 | disconnect event for a peer that had a transmission pending; furthermore, | ||
3975 | queueing more than one transmission request per peer per application using the | ||
3976 | service is not permitted. | ||
3977 | |||
3978 | The CORE API also allows applications to monitor all communications of the | ||
3979 | peer prior to encryption (for outgoing messages) or after decryption (for | ||
3980 | incoming messages). This can be useful for debugging, diagnostics or to | ||
3981 | establish the presence of cover traffic (for anonymity). As monitoring | ||
3982 | applications are often not interested in the payload, the monitoring callbacks | ||
3983 | can be configured to only provide the message headers (including the message | ||
3984 | type and size) instead of copying the full data stream to the monitoring | ||
3985 | client. | ||
3986 | |||
3987 | The init callback of the @code{GNUNET_CORE_connect} function is called with | ||
3988 | the hash of the public key of the peer. This public key is used to identify | ||
3989 | the peer globally in the GNUnet network. Applications are encouraged to check | ||
3990 | that the provided hash matches the hash that they are using (as theoretically | ||
3991 | the application may be using a different configuration file with a different | ||
3992 | private key, which would result in hard to find bugs). | ||
3993 | |||
3994 | As with most service APIs, the CORE API isolates applications from crashes of | ||
3995 | the CORE service. If the CORE service crashes, the application will see | ||
3996 | disconnect events for all existing connections. Once the connections are | ||
3997 | re-established, the applications will receive matching connect events. | ||
3998 | |||
3999 | @node The CORE Client-Service Protocol | ||
4000 | @subsection The CORE Client-Service Protocol | ||
4001 | @c %**end of header | ||
4002 | |||
4003 | This section describes the protocol between an application using the CORE | ||
4004 | service (the client) and the CORE service process itself. | ||
4005 | |||
4006 | |||
4007 | @menu | ||
4008 | * Setup2:: | ||
4009 | * Notifications:: | ||
4010 | * Sending:: | ||
4011 | @end menu | ||
4012 | |||
4013 | @node Setup2 | ||
4014 | @subsubsection Setup2 | ||
4015 | @c %**end of header | ||
4016 | |||
4017 | When a client connects to the CORE service, it first sends a | ||
4018 | @code{InitMessage} which specifies options for the connection and a set of | ||
4019 | message type values which are supported by the application. The options | ||
4020 | bitmask specifies which events the client would like to be notified about. The | ||
4021 | options include: | ||
4022 | |||
4023 | @table @asis | ||
4024 | @item GNUNET_CORE_OPTION_NOTHING No notifications | ||
4025 | @item GNUNET_CORE_OPTION_STATUS_CHANGE Peers connecting and disconnecting | ||
4026 | @item GNUNET_CORE_OPTION_FULL_INBOUND All inbound messages (after decryption) with | ||
4027 | full payload | ||
4028 | @item GNUNET_CORE_OPTION_HDR_INBOUND Just the @code{MessageHeader} | ||
4029 | of all inbound messages | ||
4030 | @item GNUNET_CORE_OPTION_FULL_OUTBOUND All outbound | ||
4031 | messages (prior to encryption) with full payload | ||
4032 | @item GNUNET_CORE_OPTION_HDR_OUTBOUND Just the @code{MessageHeader} of all outbound | ||
4033 | messages | ||
4034 | @end table | ||
4035 | |||
4036 | Typical applications will only monitor for connection status changes. | ||
4037 | |||
4038 | The CORE service responds to the @code{InitMessage} with an | ||
4039 | @code{InitReplyMessage} which contains the peer's identity. Afterwards, both | ||
4040 | CORE and the client can send messages. | ||
4041 | |||
4042 | @node Notifications | ||
4043 | @subsubsection Notifications | ||
4044 | @c %**end of header | ||
4045 | |||
4046 | The CORE will send @code{ConnectNotifyMessage}s and | ||
4047 | @code{DisconnectNotifyMessage}s whenever peers connect or disconnect from the | ||
4048 | CORE (assuming their type maps overlap with the message types registered by | ||
4049 | the client). When the CORE receives a message that matches the set of message | ||
4050 | types specified during the @code{InitMessage} (or if monitoring is enabled | ||
4051 | for inbound messages in the options), it sends a @code{NotifyTrafficMessage} | ||
4052 | with the peer identity of the sender and the decrypted payload. The same | ||
4053 | message format (except with @code{GNUNET_MESSAGE_TYPE_CORE_NOTIFY_OUTBOUND} | ||
4054 | for the message type) is used to notify clients monitoring outbound messages; | ||
4055 | here, the peer identity given is that of the receiver. | ||
4056 | |||
4057 | @node Sending | ||
4058 | @subsubsection Sending | ||
4059 | @c %**end of header | ||
4060 | |||
4061 | When a client wants to transmit a message, it first requests a transmission | ||
4062 | slot by sending a @code{SendMessageRequest} which specifies the priority, | ||
4063 | deadline and size of the message. Note that these values may be ignored by | ||
4064 | CORE. When CORE is ready for the message, it answers with a | ||
4065 | @code{SendMessageReady} response. The client can then transmit the payload | ||
4066 | with a @code{SendMessage} message. Note that the actual message size in the | ||
4067 | @code{SendMessage} is allowed to be smaller than the size in the original | ||
4068 | request. A client may at any time send a fresh @code{SendMessageRequest}, | ||
4069 | which then supersedes the previous @code{SendMessageRequest}, which is then no | ||
4070 | longer valid. The client can tell which @code{SendMessageRequest} the CORE | ||
4071 | service's @code{SendMessageReady} message is for as all of these messages | ||
4072 | contain a "unique" request ID (based on a counter incremented by the client | ||
4073 | for each request). | ||
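
The request-ID bookkeeping described above can be sketched as follows. This is
only an illustration and not the code used by @code{libgnunetcore}: the
@code{struct SendState} type and both helper functions are hypothetical names,
and the 16-bit width of the request ID is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical client-side state for one target peer. */
struct SendState {
  uint16_t next_id;     /* counter incremented by the client per request */
  uint16_t pending_id;  /* ID carried by the latest SendMessageRequest */
  int has_pending;      /* non-zero while a request is outstanding */
};

/* Issue a fresh request; it supersedes any earlier, unanswered one. */
static uint16_t send_request(struct SendState *s)
{
  assert(s != NULL);
  s->pending_id = s->next_id++;
  s->has_pending = 1;
  return s->pending_id;
}

/* Return 1 if a SendMessageReady carrying `id` answers the latest request,
   0 if it refers to a superseded or unknown request. */
static int send_ready_matches(struct SendState *s, uint16_t id)
{
  assert(s != NULL);
  if (!s->has_pending || id != s->pending_id)
    return 0;
  s->has_pending = 0;  /* the client may now transmit its SendMessage */
  return 1;
}
```

A client that issues two requests in a row would ignore a
@code{SendMessageReady} for the first ID and transmit only when the second ID
is confirmed.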
4074 | |||
4075 | @node The CORE Peer-to-Peer Protocol | ||
4076 | @subsection The CORE Peer-to-Peer Protocol | ||
4077 | @c %**end of header | ||
4078 | |||
4079 | |||
4080 | @menu | ||
4081 | * Creating the EphemeralKeyMessage:: | ||
4082 | * Establishing a connection:: | ||
4083 | * Encryption and Decryption:: | ||
4084 | * Type maps:: | ||
4085 | @end menu | ||
4086 | |||
4087 | @node Creating the EphemeralKeyMessage | ||
4088 | @subsubsection Creating the EphemeralKeyMessage | ||
4089 | @c %**end of header | ||
4090 | |||
4091 | When the CORE service starts, each peer creates a fresh ephemeral (ECC) | ||
4092 | public-private key pair and signs the corresponding @code{EphemeralKeyMessage} | ||
4093 | with its long-term key (which we usually call the peer's identity; the hash of | ||
4094 | the public long-term key is what results in a @code{struct | ||
4095 | GNUNET_PeerIdentity} in all GNUnet APIs). The ephemeral key is ONLY used for an | ||
4096 | @uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman, | ||
4097 | ECDHE} exchange by the CORE service to establish symmetric session keys. A | ||
4098 | peer will use the same @code{EphemeralKeyMessage} for all peers for | ||
4099 | @code{REKEY_FREQUENCY}, which is usually 12 hours. After that time, it will | ||
4100 | create a fresh ephemeral key (forgetting the old one) and broadcast the new | ||
4101 | @code{EphemeralKeyMessage} to all connected peers, resulting in fresh | ||
4102 | symmetric session keys. Note that peers independently decide on when to | ||
4103 | discard ephemeral keys; it is not a protocol violation to discard keys more | ||
4104 | often. Ephemeral keys are also never stored to disk; restarting a peer will | ||
4105 | thus always create a fresh ephemeral key. The use of ephemeral keys is what | ||
4106 | provides @uref{http://en.wikipedia.org/wiki/Forward_secrecy, forward secrecy}. | ||
4107 | |||
4108 | Just before transmission, the @code{EphemeralKeyMessage} is patched to reflect | ||
4109 | the current sender_status, which specifies the current state of the connection | ||
4110 | from the point of view of the sender. The possible values are: | ||
4111 | |||
4112 | @table @asis | ||
4113 | @item KX_STATE_DOWN Initial value, never used on the network | ||
4114 | @item KX_STATE_KEY_SENT We sent our ephemeral key, do not know the key of the other | ||
4115 | peer | ||
4116 | @item KX_STATE_KEY_RECEIVED This peer has received a valid ephemeral key | ||
4117 | of the other peer, but we are waiting for the other peer to confirm its | ||
4118 | authenticity (ability to decode) via challenge-response. | ||
4119 | @item KX_STATE_UP The | ||
4120 | connection is fully up from the point of view of the sender (now performing | ||
4121 | keep-alives) | ||
4122 | @item KX_STATE_REKEY_SENT The sender has initiated a rekeying | ||
4123 | operation; the other peer has so far failed to confirm a working connection | ||
4124 | using the new ephemeral key | ||
4125 | @end table | ||
4126 | |||
4127 | @node Establishing a connection | ||
4128 | @subsubsection Establishing a connection | ||
4129 | @c %**end of header | ||
4130 | |||
4131 | Peers begin their interaction by sending an @code{EphemeralKeyMessage} to the | ||
4132 | other peer once the TRANSPORT service notifies the CORE service about the | ||
4133 | connection. When a peer receives an @code{EphemeralKeyMessage} with a status | ||
4134 | indicating that the sender does not have the receiver's ephemeral key, it | ||
4135 | sends its own @code{EphemeralKeyMessage} in response. Additionally, if | ||
4136 | the receiver has not yet confirmed the authenticity of the sender, it also | ||
4137 | sends an (encrypted) @code{PingMessage} with a challenge (and the identity of | ||
4138 | the target) to the other peer. Peers receiving a @code{PingMessage} respond | ||
4139 | with an (encrypted) @code{PongMessage} which includes the challenge. Peers | ||
4140 | receiving a @code{PongMessage} check the challenge, and if it matches set the | ||
4141 | connection to @code{KX_STATE_UP}. | ||
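
The initial handshake described above can be sketched as a small state
machine. This is a simplified illustration under stated assumptions, not the
logic of @code{gnunet-service-core_kx.c}: the numeric values of the enum, the
@code{KX_REPLY_*} flags and both function names are hypothetical, signature
verification and encryption are omitted, and rekeying is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* State names follow the text; the numeric values are an assumption. */
enum KxState {
  KX_STATE_DOWN,
  KX_STATE_KEY_SENT,
  KX_STATE_KEY_RECEIVED,
  KX_STATE_UP,
  KX_STATE_REKEY_SENT
};

/* Hypothetical flags describing which messages to send in response. */
#define KX_REPLY_KEY  1  /* send our own EphemeralKeyMessage */
#define KX_REPLY_PING 2  /* send an encrypted PingMessage with a challenge */

/* Handle a (valid) EphemeralKeyMessage and return the replies to send. */
static int kx_on_ephemeral_key(enum KxState *mine, enum KxState sender_status)
{
  int replies = 0;

  assert(mine != NULL);
  /* The sender does not know our ephemeral key yet: send it. */
  if (sender_status == KX_STATE_KEY_SENT)
    replies |= KX_REPLY_KEY;
  /* We have not confirmed the sender yet: challenge it. */
  if (*mine != KX_STATE_UP) {
    *mine = KX_STATE_KEY_RECEIVED;
    replies |= KX_REPLY_PING;
  }
  return replies;
}

/* Handle a PongMessage; returns 1 if the connection is now up. */
static int kx_on_pong(enum KxState *mine, uint32_t sent, uint32_t echoed)
{
  assert(mine != NULL);
  if (*mine != KX_STATE_KEY_RECEIVED || sent != echoed)
    return 0;  /* unexpected pong or wrong challenge: ignore */
  *mine = KX_STATE_UP;
  return 1;
}
```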
4142 | |||
4143 | @node Encryption and Decryption | ||
4144 | @subsubsection Encryption and Decryption | ||
4145 | @c %**end of header | ||
4146 | |||
4147 | All functions related to the key exchange and encryption/decryption of | ||
4148 | messages can be found in @code{gnunet-service-core_kx.c} (except for the | ||
4149 | cryptographic primitives, which are in @code{util/crypto*.c}). Given the key | ||
4150 | material from ECDHE, a | ||
4151 | @uref{http://en.wikipedia.org/wiki/Key_derivation_function, key derivation | ||
4152 | function} is used to derive two pairs of encryption and decryption keys for | ||
4153 | AES-256 and Twofish, as well as initialization vectors and authentication keys | ||
4154 | (for @uref{http://en.wikipedia.org/wiki/HMAC, HMAC}). The HMAC is computed | ||
4155 | over the encrypted payload. Encrypted messages include an iv_seed and the HMAC | ||
4156 | in the header. | ||
4157 | |||
4158 | Each encrypted message in the CORE service includes a sequence number and a | ||
4159 | timestamp in the encrypted payload. The CORE service remembers the largest | ||
4160 | observed sequence number and a bit-mask which represents which of the previous | ||
4161 | 32 sequence numbers were already used. Messages with sequence numbers lower | ||
4162 | than the largest observed sequence number minus 32 are discarded. Messages | ||
4163 | whose timestamp is off by more than @code{REKEY_TOLERANCE} (5 minutes) are | ||
4164 | also discarded. This of course means that system clocks need to be reasonably | ||
4165 | synchronized for peers to be able to communicate. Additionally, as the | ||
4166 | ephemeral key changes every 12h, a peer would not even be able to decrypt | ||
4167 | messages older than 12h. | ||
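
The sequence-number check described above is a sliding replay window. The
following sketch shows one way to implement such a window; it is not taken
from the CORE sources. The @code{struct ReplayState} type and the
@code{replay_accept} function are hypothetical names, sequence numbers are
assumed to start at 1, and the timestamp check is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-session replay state. */
struct ReplayState {
  uint32_t last_seq;   /* largest sequence number accepted so far */
  uint32_t seen_mask;  /* bit i set => (last_seq - 1 - i) was already seen */
};

/* Return 1 if `seq` is fresh (and record it), 0 if it is a duplicate or
   lies more than 32 positions behind the largest observed number. */
static int replay_accept(struct ReplayState *rs, uint32_t seq)
{
  uint32_t back;
  uint32_t bit;

  assert(rs != NULL);
  if (seq > rs->last_seq) {
    uint32_t shift = seq - rs->last_seq;

    if (shift > 32) {
      rs->seen_mask = 0;  /* everything previously seen left the window */
    } else {
      /* Shifting a 32-bit value by 32 is undefined, so handle it apart. */
      rs->seen_mask = (shift == 32) ? 0 : (rs->seen_mask << shift);
      rs->seen_mask |= (uint32_t) 1 << (shift - 1);  /* old last_seq */
    }
    rs->last_seq = seq;
    return 1;
  }
  if (seq == rs->last_seq)
    return 0;  /* exact duplicate of the newest message */
  back = rs->last_seq - seq;
  if (back > 32)
    return 0;  /* too old: outside the window */
  bit = (uint32_t) 1 << (back - 1);
  if (rs->seen_mask & bit)
    return 0;  /* already seen */
  rs->seen_mask |= bit;
  return 1;
}
```

With this layout, out-of-order delivery within the last 32 sequence numbers is
tolerated, while each sequence number is accepted at most once.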
4168 | |||
4169 | @node Type maps | ||
4170 | @subsubsection Type maps | ||
4171 | @c %**end of header | ||
4172 | |||
4173 | Once an encrypted connection has been established, peers begin to exchange | ||
4174 | type maps. Type maps are used to allow the CORE service to determine which | ||
4175 | (encrypted) connections should be shown to which applications. A type map is | ||
4176 | an array of 65536 bits representing the different types of messages understood | ||
4177 | by applications using the CORE service. Each CORE service maintains this map, | ||
4178 | simply by setting the respective bit for each message type supported by any of | ||
4179 | the applications using the CORE service. Note that bits for message types | ||
4180 | embedded in higher-level protocols (such as CADET) will not be included in | ||
4181 | these type maps. | ||
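
A type map as described above can be sketched as a plain bit array. The names
below (@code{struct TypeMap} and the @code{type_map_*} helpers) are
hypothetical and chosen for illustration; the sketch only shows how 65536 bits
can be stored and how two maps can be tested for a common message type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TYPE_MAP_BITS  65536u                 /* one bit per message type */
#define TYPE_MAP_WORDS (TYPE_MAP_BITS / 32u)  /* 2048 words, 8192 bytes */

/* Hypothetical in-memory representation of a type map. */
struct TypeMap {
  uint32_t bits[TYPE_MAP_WORDS];
};

/* Reset the map so that no message type is marked as supported. */
static void type_map_clear(struct TypeMap *tm)
{
  assert(tm != NULL);
  memset(tm, 0, sizeof *tm);
}

/* Mark a message type as supported by some local application. */
static void type_map_set(struct TypeMap *tm, uint16_t type)
{
  assert(tm != NULL);
  tm->bits[type / 32u] |= (uint32_t) 1 << (type % 32u);
}

/* Return 1 if both maps have at least one message type in common. */
static int type_map_overlaps(const struct TypeMap *a, const struct TypeMap *b)
{
  size_t i;

  assert(a != NULL && b != NULL);
  for (i = 0; i < TYPE_MAP_WORDS; i++)
    if (a->bits[i] & b->bits[i])
      return 1;
  return 0;
}
```

An overlap test of this kind corresponds to the decision whether a connection
should be shown to an application; the special case of applications that
register no message types at all is not modeled here.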
4182 | |||
4183 | Typically, the type map of a peer will be sparse. Thus, the CORE service | ||
4184 | attempts to compress its type map using @code{gzip}-style compression | ||
4185 | ("deflate") prior to transmission. However, if the compression fails to | ||
4186 | compact the map, the map may also be transmitted without compression | ||
4187 | (resulting in @code{GNUNET_MESSAGE_TYPE_CORE_COMPRESSED_TYPE_MAP} or | ||
4188 | @code{GNUNET_MESSAGE_TYPE_CORE_BINARY_TYPE_MAP} messages respectively). Upon | ||
4189 | receiving a type map, the respective CORE service notifies applications about | ||
4190 | the connection to the other peer if they support any message type indicated in | ||
4191 | the type map (or no message type at all). If the CORE service experiences a | ||
4192 | connect or disconnect event from an application, it updates its type map | ||
4193 | (setting or unsetting the respective bits) and notifies its neighbours about | ||
4194 | the change. The CORE services of the neighbours then in turn generate connect | ||
4195 | and disconnect events for the peer that sent the type map for their respective | ||
4196 | applications. As CORE messages may be lost, the CORE service confirms | ||
4197 | receiving a type map by sending back a | ||
4198 | @code{GNUNET_MESSAGE_TYPE_CORE_CONFIRM_TYPE_MAP}. If such a confirmation (with | ||
4199 | the correct hash of the type map) is not received, the sender will retransmit | ||
4200 | the type map (with exponential back-off). | ||
4201 | |||
4202 | @node GNUnet's CADET subsystem | ||
4203 | @section GNUnet's CADET subsystem | ||
4204 | |||
4205 | The CADET subsystem in GNUnet is responsible for secure end-to-end | ||
4206 | communications between nodes in the GNUnet overlay network. CADET builds on the | ||
4207 | CORE subsystem which provides for the link-layer communication and then adds | ||
4208 | routing, forwarding and additional security to the connections. CADET offers | ||
4209 | the same cryptographic services as CORE, but on an end-to-end level. This is | ||
4210 | done so peers retransmitting traffic on behalf of other peers cannot access the | ||
4211 | payload data. | ||
4212 | |||
4213 | @itemize @bullet | ||
4214 | @item CADET provides confidentiality with so-called perfect forward secrecy; we | ||
4215 | use ECDHE powered by Curve25519 for the key exchange and then use symmetric | ||
4216 | encryption, encrypting with both AES-256 and Twofish | ||
4217 | @item authentication is achieved by signing the ephemeral keys using Ed25519, a | ||
4218 | deterministic variant of ECDSA | ||
4219 | @item integrity protection (using SHA-512 to do encrypt-then-MAC, although only | ||
4220 | 256 bits are sent to reduce overhead) | ||
4221 | @item replay protection (using nonces, timestamps, challenge-response, message | ||
4222 | counters and ephemeral keys) | ||
4223 | @item liveness (keep-alive messages, timeout) | ||
4224 | @end itemize | ||
4225 | |||
4226 | In addition to the CORE-like security benefits, CADET offers other properties | ||
4227 | that make it a more universal service than CORE. | ||
4228 | |||
4229 | @itemize @bullet | ||
4230 | @item CADET can establish channels to arbitrary peers in GNUnet. If a peer is | ||
4231 | not immediately reachable, CADET will find a path through the network and ask | ||
4232 | other peers to retransmit the traffic on its behalf. | ||
4233 | @item CADET offers (optional) reliability mechanisms. In a reliable channel | ||
4234 | traffic is guaranteed to arrive complete, unchanged and in-order. | ||
4235 | @item CADET takes care of flow and congestion control mechanisms, not allowing | ||
4236 | the sender to send more traffic than the receiver or the network is able to | ||
4237 | process. | ||
4238 | @end itemize | ||
4239 | |||
4240 | @menu | ||
4241 | * libgnunetcadet:: | ||
4242 | @end menu | ||
4243 | |||
4244 | @node libgnunetcadet | ||
4245 | @subsection libgnunetcadet | ||
4246 | |||
4247 | |||
4248 | The CADET API (defined in @code{gnunet_cadet_service.h}) is the messaging API | ||
4249 | used by P2P applications built using GNUnet. It provides applications the | ||
4250 | ability to send and receive encrypted messages to any peer participating in | ||
4251 | GNUnet. The API is heavily based on the CORE API. | ||
4252 | |||
4253 | CADET delivers messages to other peers in "channels". A channel is a permanent | ||
4254 | connection defined by a destination peer (identified by its public key) and a | ||
4255 | port number. Internally, CADET tunnels all channels towards a destination peer | ||
4256 | using one session key and relays the data on multiple "connections", | ||
4257 | independent from the channels. | ||
4258 | |||
4259 | Each channel has optional parameters, the most important being the reliability | ||
4260 | flag. If a message gets lost at the TRANSPORT/CORE level and the channel was | ||
4261 | created as reliable, CADET will retransmit the lost message and deliver it | ||
4262 | in order to the destination application. | ||
4263 | |||
4264 | To communicate with other peers using CADET, it is necessary to first connect | ||
4265 | to the service using @code{GNUNET_CADET_connect}. This function takes several | ||
4266 | parameters in form of callbacks, to allow the client to react to various | ||
4267 | events, like incoming channels or channels that terminate, as well as specify a | ||
4268 | list of ports the client wishes to listen to (at the moment it is not possible | ||
4269 | to start listening on further ports once connected, but nothing prevents a | ||
4270 | client from connecting to CADET several times, even once per listening | ||
4271 | port). The function returns a handle which has to be used for any further | ||
4272 | interaction with the service. | ||
4273 | |||
4274 | To connect to a remote peer a client has to call the | ||
4275 | @code{GNUNET_CADET_channel_create} function. The most important parameters | ||
4276 | given are the remote peer's identity (its public key) and a port, which | ||
4277 | specifies which application on the remote peer to connect to, similar to | ||
4278 | TCP/UDP ports. CADET will then find the peer in the GNUnet network and | ||
4279 | establish the proper low-level connections and do the necessary key exchanges | ||
4280 | to ensure an authenticated, secure and verified communication. Similar to | ||
4281 | @code{GNUNET_CADET_connect}, @code{GNUNET_CADET_channel_create} returns a handle | ||
4282 | to interact with the created channel. | ||
4283 | |||
4284 | For every message the client wants to send to the remote application, | ||
4285 | @code{GNUNET_CADET_notify_transmit_ready} must be called, indicating the | ||
4286 | channel on which the message should be sent and the size of the message (but | ||
4287 | not the message itself!). Once CADET is ready to send the message, the provided | ||
4288 | callback will fire, and the message contents are provided to this callback. | ||
4289 | |||
4290 | Please note that CADET does not provide an explicit notification of when a | ||
4291 | channel is connected. In loosely connected networks, like big wireless mesh | ||
4292 | networks, this can take several seconds, even minutes in the worst case. To be | ||
4293 | alerted when a channel is online, a client can call | ||
4294 | @code{GNUNET_CADET_notify_transmit_ready} immediately after | ||
4295 | @code{GNUNET_CADET_channel_create}. When the callback is activated, it means | ||
4296 | that the channel is online. The callback may give 0 bytes to CADET if no | ||
4297 | message is to be sent. | ||
4298 | |||
4299 | If a requested transmission is no longer needed before its callback fires, | ||
4300 | it can be cancelled with | ||
4301 | @code{GNUNET_CADET_notify_transmit_ready_cancel}, which uses the handle given | ||
4302 | back by @code{GNUNET_CADET_notify_transmit_ready}. As in the case of CORE, only | ||
4303 | one message can be requested at a time: a client must not call | ||
4304 | @code{GNUNET_CADET_notify_transmit_ready} again until the callback is called or | ||
4305 | the request is cancelled. | ||
4306 | |||
4307 | When a channel is no longer needed, a client can call | ||
4308 | @code{GNUNET_CADET_channel_destroy} to get rid of it. Note that CADET will try | ||
4309 | to transmit all pending traffic before notifying the remote peer of the | ||
4310 | destruction of the channel, including retransmitting lost messages if the | ||
4311 | channel was reliable. | ||
4312 | |||
4313 | Incoming channels, channels being closed by the remote peer, and traffic on any | ||
4314 | incoming or outgoing channels are given to the client when CADET executes the | ||
4315 | callbacks given to it at the time of @code{GNUNET_CADET_connect}. | ||
4316 | |||
4317 | Finally, when an application no longer wants to use CADET, it should call | ||
4318 | @code{GNUNET_CADET_disconnect}, but first all channels and pending | ||
4319 | transmissions must be closed (otherwise CADET will complain). | ||
4320 | |||
4321 | @node GNUnet's NSE subsystem | ||
4322 | @section GNUnet's NSE subsystem | ||
4323 | |||
4324 | |||
4325 | NSE stands for Network Size Estimation. The NSE subsystem provides other | ||
4326 | subsystems and users with a rough estimate of the number of peers currently | ||
4327 | participating in the GNUnet overlay. The computed value is not a precise number | ||
4328 | as producing a precise number in a decentralized, efficient and secure way is | ||
4329 | impossible. While NSE's estimate is inherently imprecise, NSE also gives the | ||
4330 | expected range. For a peer that has been running in a stable network for a | ||
4331 | while, the real network size will typically (99.7% of the time) be in the range | ||
4332 | of [2/3 estimate, 3/2 estimate]. We will now give an overview of the algorithm | ||
4333 | used to calculate the estimate; all of the details can be found in this | ||
4334 | technical report. | ||
4335 | |||
4336 | @menu | ||
4337 | * Motivation:: | ||
4338 | * Principle:: | ||
4339 | * libgnunetnse:: | ||
4340 | * The NSE Client-Service Protocol:: | ||
4341 | * The NSE Peer-to-Peer Protocol:: | ||
4342 | @end menu | ||
4343 | |||
4344 | @node Motivation | ||
4345 | @subsection Motivation | ||
4346 | |||
4347 | |||
4348 | Some subsystems, like DHT, need to know the size of the GNUnet network to | ||
4349 | optimize some parameters of their own protocol. The decentralized nature of | ||
4350 | GNUnet makes efficiently and securely counting the exact number of peers | ||
4351 | infeasible. Although there are several decentralized algorithms to count the | ||
4352 | number of peers in a system, so far there is none to do so securely. Other | ||
4353 | protocols may allow any malicious peer to manipulate the final result or to | ||
4354 | take advantage of the system to perform DoS (Denial of Service) attacks against | ||
4355 | the network. GNUnet's NSE protocol avoids these drawbacks. | ||
4356 | |||
4357 | |||
4358 | |||
4359 | @menu | ||
4360 | * Security:: | ||
4361 | @end menu | ||
4362 | |||
4363 | @node Security | ||
4364 | @subsubsection Security | ||
4365 | |||
4366 | |||
4367 | The NSE subsystem is designed to be resilient against these attacks. It uses | ||
4368 | @uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proofs of work} to | ||
4369 | prevent one peer from impersonating a large number of participants, which would | ||
4370 | otherwise allow an adversary to artificially inflate the estimate. The DoS | ||
4371 | protection comes from the time-based nature of the protocol: the estimates are | ||
4372 | calculated periodically and out-of-time traffic is either ignored or stored for | ||
4373 | later retransmission by benign peers. In particular, peers cannot trigger | ||
4374 | global network communication at will. | ||
4375 | |||
4376 | @node Principle | ||
4377 | @subsection Principle | ||
4378 | |||
4379 | |||
4380 | The algorithm calculates the estimate by finding the globally closest peer ID | ||
4381 | to a random, time-based value. | ||
4382 | |||
4383 | The idea is that the closer the ID is to the random value, the more "densely | ||
4384 | packed" the ID space is, and therefore, more peers are in the network. | ||
4385 | |||
4386 | |||
4387 | |||
4388 | @menu | ||
4389 | * Example:: | ||
4390 | * Algorithm:: | ||
4391 | * Target value:: | ||
4392 | * Timing:: | ||
4393 | * Controlled Flooding:: | ||
4394 | * Calculating the estimate:: | ||
4395 | @end menu | ||
4396 | |||
4397 | @node Example | ||
4398 | @subsubsection Example | ||
4399 | |||
4400 | |||
4401 | Suppose all peers have IDs between 0 and 100 (our ID space), and the random | ||
4402 | value is 42. If the closest peer has the ID 70 we can imagine that the average | ||
4403 | "distance" between peers is around 30 and therefore there are around 3 peers in | ||
4404 | the whole ID space. On the other hand, if the closest peer has the ID 44, we | ||
4405 | can imagine that the space is rather packed with peers, maybe as many as 50 of | ||
4406 | them. Naturally, we could have been rather unlucky, and there is only one peer | ||
4407 | which happens to have the ID 44. Thus, the current estimate is calculated as the | ||
4408 | average over multiple rounds, and not just a single sample. | ||
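
The arithmetic behind this toy example can be sketched as follows. This is a minimal illustration only, assuming a linear ID space of size 100 and the naive rule "number of peers is roughly the size of the ID space divided by the distance to the closest peer". The function name @code{naive_estimate} is hypothetical, and this is not the formula used by the NSE service.

```python
def naive_estimate(target, closest_id, space=100):
    """Guess the number of peers from a single sample (toy model only)."""
    distance = abs(closest_id - target)
    if distance == 0:
        # A perfect hit gives no usable spacing; treat the space as full.
        return space
    return space / distance


print(naive_estimate(42, 70))  # about 3.57, i.e. "around 3" peers
print(naive_estimate(42, 44))  # 50.0
```

A single lucky sample (one peer that happens to have the ID 44) yields the same value as a genuinely dense network, which is why several rounds are averaged.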
4409 | |||
4410 | @node Algorithm | ||
4411 | @subsubsection Algorithm | ||
4412 | |||
4413 | |||
4414 | Given that example, one can imagine that the job of the subsystem is to | ||
4415 | efficiently communicate the ID of the closest peer to the target value to all | ||
4416 | the other peers, who will calculate the estimate from it. | ||
4417 | |||
4418 | @node Target value | ||
4419 | @subsubsection Target value | ||
4420 | |||
4421 | @c %**end of header | ||
4422 | |||
4423 | The target value itself is generated by hashing the current time, rounded down | ||
4424 | to an agreed value. If the rounding amount is 1h (default) and the time is | ||
4425 | 12:34:56, the time to hash would be 12:00:00. The process is repeated once per | ||
4426 | rounding amount (in this example, every hour). Every repetition is | ||
4427 | called a round. | ||
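
The rounding and hashing step can be sketched as follows. This is an illustration under stated assumptions: the timestamp is encoded as a 64-bit big-endian number of seconds and hashed with SHA-512. The exact encoding used by the service may differ, and the names @code{round_start} and @code{target_value} are hypothetical.

```python
import hashlib
import struct
from datetime import datetime, timezone

ROUND_SECONDS = 3600  # default rounding amount of one hour


def round_start(timestamp, round_seconds=ROUND_SECONDS):
    """Round a Unix timestamp (in seconds) down to the start of its round."""
    return timestamp - (timestamp % round_seconds)


def target_value(timestamp, round_seconds=ROUND_SECONDS):
    """Hash the rounded timestamp; every peer in the same round gets the same value."""
    start = round_start(timestamp, round_seconds)
    # Assumption: 64-bit big-endian seconds, hashed with SHA-512.
    return hashlib.sha512(struct.pack(">Q", start)).digest()


t1 = int(datetime(2024, 1, 1, 12, 34, 56, tzinfo=timezone.utc).timestamp())
t2 = int(datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc).timestamp())
print(target_value(t1) == target_value(t2))  # True: both fall into the 12:00 round
```

Because the input to the hash depends only on the agreed rounding amount and the clock, peers need no communication to agree on the target of a round.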
4428 | |||
4429 | @node Timing | ||
4430 | @subsubsection Timing | ||
4431 | @c %**end of header | ||
4432 | |||
4433 | The NSE subsystem has some timing control to avoid everybody broadcasting its | ||
4434 | ID all at once. Once each peer has the target random value, it compares its own | ||
4435 | ID to the target and calculates the hypothetical size of the network if that | ||
4436 | peer were to be the closest. Then it compares the hypothetical size with the | ||
4437 | estimate from the previous rounds. For each value there is an associated point | ||
4438 | in the period, let's call it "broadcast time". If its own hypothetical estimate | ||
4439 | is the same as the previous global estimate, its "broadcast time" will be in | ||
4440 | the middle of the round. If it is bigger it will be earlier and if it is smaller | ||
4441 | (the most likely case) it will be later. This ensures that the peers closest | ||
4442 | to the target value start broadcasting their ID first. | ||
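
One possible mapping from estimates to a "broadcast time" with the properties described above is sketched here. The arctangent shape and the function name @code{broadcast_delay} are assumptions chosen for illustration; any monotonic function that yields the middle of the round for equal estimates would satisfy the description.

```python
import math

ROUND_SECONDS = 3600.0  # default round length of one hour


def broadcast_delay(own_estimate, previous_estimate, round_seconds=ROUND_SECONDS):
    """Seconds after the start of the round at which a peer starts broadcasting."""
    diff = own_estimate - previous_estimate
    # atan() maps any difference into (-pi/2, pi/2), so the delay always
    # stays strictly inside the round: 0 < delay < round_seconds.
    return round_seconds / 2.0 - (round_seconds / math.pi) * math.atan(diff)


print(broadcast_delay(10, 10))  # 1800.0, the middle of the round
```

A peer whose hypothetical estimate exceeds the previous one gets a delay below half a round, and a peer with a smaller estimate gets a larger delay, so the most promising candidates speak first.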
4443 | |||
4444 | @node Controlled Flooding | ||
4445 | @subsubsection Controlled Flooding | ||
4446 | |||
4447 | @c %**end of header | ||
4448 | |||
4449 | When a peer receives a value, it first verifies that it is closer than the | ||
4450 | closest value it had so far; otherwise it answers the incoming message with a | ||
4451 | message containing the better value. Then it checks a proof of work that must | ||
4452 | be included in the incoming message, to ensure that the other peer's ID is not | ||
4453 | made up (otherwise a malicious peer could claim to have an ID of exactly the | ||
4454 | target value every round). Once validated, it compares the broadcast time of the | ||
4455 | received value with the current time and if it's not too early, sends the | ||
4456 | received value to its neighbors. Otherwise it stores the value until the | ||
4457 | correct broadcast time comes. This prevents unnecessary traffic of sub-optimal | ||
4458 | values, since a better value can come before the broadcast time, rendering the | ||
4459 | previous one obsolete and saving the traffic that would have been used to | ||
4460 | broadcast it to the neighbors. | ||
4461 | |||
4462 | @node Calculating the estimate | ||
4463 | @subsubsection Calculating the estimate | ||
4464 | |||
4465 | @c %**end of header | ||
4466 | |||
4467 | Once the closest ID has been spread across the network each peer gets the exact | ||
4468 | distance between this ID and the target value of the round and calculates the | ||
4469 | estimate with a mathematical formula described in the tech report. The estimate | ||
4470 | generated with this method for a single round is not very precise. Remember the | ||
4471 | case of the example, where the only peer has the ID 44 and we happen to generate | ||
4472 | the target value 42, thinking there are 50 peers in the network. Therefore, the | ||
4473 | NSE subsystem remembers the last 64 estimates and calculates an average over | ||
4474 | them, giving a result which usually has one bit of uncertainty (the real | ||
4475 | size could be half of the estimate or twice as much). Note that the actual | ||
4476 | network size is calculated in powers of two of the raw input, thus one bit of | ||
4477 | uncertainty means a factor of two in the size estimate. | ||
4478 | |||
4479 | @node libgnunetnse | ||
4480 | @subsection libgnunetnse | ||
4481 | |||
4482 | @c %**end of header | ||
4483 | |||
4484 | The NSE subsystem has the simplest API of all services, with only two calls: | ||
4485 | @code{GNUNET_NSE_connect} and @code{GNUNET_NSE_disconnect}. | ||
4486 | |||
4487 | The connect call gets a callback function as a parameter and this function is | ||
4488 | called each time the network agrees on an estimate. This usually is once per | ||
4489 | round, with some exceptions: if the closest peer has a late local clock and | ||
4490 | starts spreading its ID after everyone else agreed on a value, the callback | ||
4491 | might be activated twice in a round, the second value being always bigger than | ||
4492 | the first. The default round time is set to 1 hour. | ||
4493 | |||
4494 | The disconnect call disconnects from the NSE subsystem and the callback is no | ||
4495 | longer called with new estimates. | ||
4496 | |||
4497 | |||
4498 | |||
4499 | @menu | ||
4500 | * Results:: | ||
4501 | * Examples2:: | ||
4502 | @end menu | ||
4503 | |||
4504 | @node Results | ||
4505 | @subsubsection Results | ||
4506 | |||
4507 | @c %**end of header | ||
4508 | |||
4509 | The callback provides two values: the average and the | ||
4510 | @uref{http://en.wikipedia.org/wiki/Standard_deviation, standard deviation} of | ||
4511 | the last 64 rounds. The values provided by the callback function are | ||
4512 | logarithmic, this means that the real estimate numbers can be obtained by | ||
4513 | calculating 2 to the power of the given value (2^average). From a statistics | ||
4514 | point of view this means that: | ||
4515 | |||
4516 | @itemize @bullet | ||
4517 | @item 68% of the time the real size is included in the interval | ||
4518 | [2^(average-stddev), 2^(average+stddev)] | ||
4519 | @item 95% of the time the real size is included in the interval | ||
4520 | [2^(average-2*stddev), 2^(average+2*stddev)] | ||
4521 | @item 99.7% of the time the real size is included in the interval | ||
4522 | [2^(average-3*stddev), 2^(average+3*stddev)] | ||
4523 | @end itemize | ||
4524 | |||
4525 | The expected standard deviation for 64 rounds in a network of stable size is | ||
4526 | 0.2. Thus, we can say that normally: | ||
4527 | |||
4528 | @itemize @bullet | ||
4529 | @item 68% of the time the real size is in the range [-13%, +15%] | ||
4530 | @item 95% of the time the real size is in the range [-24%, +32%] | ||
4531 | @item 99.7% of the time the real size is in the range [-34%, +52%] | ||
4532 | @end itemize | ||
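
The conversion from the logarithmic callback values to peer counts and relative ranges can be checked with a few lines of Python. This is a sketch of the arithmetic only; the function names @code{size_interval} and @code{relative_range} are hypothetical and not part of the NSE API.

```python
def size_interval(average, stddev, sigmas=1):
    """Return (low, estimate, high) in peers from the logarithmic values."""
    low = 2.0 ** (average - sigmas * stddev)
    high = 2.0 ** (average + sigmas * stddev)
    return low, 2.0 ** average, high


def relative_range(stddev, sigmas=1):
    """Return the interval bounds as rounded percentages of the estimate."""
    low = (2.0 ** (-sigmas * stddev) - 1.0) * 100.0
    high = (2.0 ** (sigmas * stddev) - 1.0) * 100.0
    return round(low), round(high)


print(size_interval(10, 1, sigmas=2))  # (256.0, 1024.0, 4096.0)
for k in (1, 2, 3):
    print(k, relative_range(0.2, k))   # 1 (-13, 15), 2 (-24, 32), 3 (-34, 52)
```

The ranges are asymmetric because the uncertainty is symmetric in the exponent, not in the number of peers.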
4533 | |||
4534 | As said in the introduction, we can be quite sure that usually the real size is | ||
4535 | between one third and three times the estimate. This can of course vary with | ||
4536 | network conditions. Thus, applications may want to also consider the provided | ||
4537 | standard deviation value, not only the average (in particular, if the standard | ||
4538 | deviation is very high, the average may be meaningless: the network size is | ||
4539 | changing rapidly). | ||
4540 | |||
4541 | @node Examples2 | ||
4542 | @subsubsection Examples2 | ||
4543 | |||
4544 | @c %**end of header | ||
4545 | |||
4546 | Let's close with a couple of examples. | ||
4547 | |||
4548 | @table @asis | ||
4549 | |||
4550 | @item Average: 10, std dev: 1 Here the estimate would be 2^10 = 1024 peers.@ | ||
4551 | The range in which we can be 95% sure is: [2^8, 2^12] = [256, 4096]. We can be | ||
4552 | very (>99.7%) sure that the network is not a hundred peers and absolutely sure | ||
4553 | that it is not a million peers, but somewhere around a thousand. | ||
4554 | |||
4555 | @item Average: 22, std dev: 0.2 Here the estimate would be 2^22 = 4 Million peers.@ | ||
4556 | The range in which we can be 99.7% sure is: [2^21.4, 2^22.6] = [2.8M, 6.4M]. | ||
4557 | We can be sure that the network size is around four million, with absolutely | ||
4558 | no way of it being 1 million. | ||
4559 | |||
4560 | @end table | ||
4561 | |||
4562 | To put this in perspective, recall that the LHC Higgs boson results | ||
4563 | were announced with "5 sigma" and "6 sigma" certainties. In this case a 5 sigma | ||
4564 | minimum would be 2 million and a 6 sigma minimum, 1.8 million. | ||
4565 | |||
4566 | @node The NSE Client-Service Protocol | ||
4567 | @subsection The NSE Client-Service Protocol | ||
4568 | |||
4569 | @c %**end of header | ||
4570 | |||
4571 | As with the API, the client-service protocol is very simple; it has only 2 | ||
4572 | different messages, defined in @code{src/nse/nse.h}: | ||
4573 | |||
4574 | @itemize @bullet | ||
4575 | @item @code{GNUNET_MESSAGE_TYPE_NSE_START}@ This message has no parameters and | ||
4576 | is sent from the client to the service upon connection. | ||
4577 | @item @code{GNUNET_MESSAGE_TYPE_NSE_ESTIMATE}@ This message is sent from the | ||
4578 | service to the client for every new estimate and upon connection. Contains a | ||
4579 | timestamp for the estimate, the average and the standard deviation for the | ||
4580 | respective round. | ||
4581 | @end itemize | ||
4582 | |||
4583 | When the @code{GNUNET_NSE_disconnect} API call is executed, the client simply | ||
4584 | disconnects from the service, with no message involved. | ||
4585 | |||
4586 | @node The NSE Peer-to-Peer Protocol | ||
4587 | @subsection The NSE Peer-to-Peer Protocol | ||
4588 | |||
4589 | @c %**end of header | ||
4590 | |||
4591 | The NSE subsystem only has one message in the P2P protocol, the | ||
4592 | @code{GNUNET_MESSAGE_TYPE_NSE_P2P_FLOOD} message. | ||
4593 | |||
4594 | The key contents of this message are the timestamp to identify the round (differences | ||
4595 | in system clocks may cause some peers to send messages way too early or way too | ||
4596 | late, so the timestamp allows other peers to identify such messages easily), | ||
4597 | the @uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proof of work} | ||
4598 | used to make it difficult to mount a | ||
4599 | @uref{http://en.wikipedia.org/wiki/Sybil_attack, Sybil attack}, and the public | ||
4600 | key, which is used to verify the signature on the message. | ||
4601 | |||
4602 | Every peer stores a message for the previous, current and next round. The | ||
4603 | messages for the previous and current round are given to peers that connect to | ||
4604 | us. The message for the next round is simply stored until our system clock | ||
4605 | advances to the next round. The message for the current round is what we are | ||
4606 | flooding the network with right now. At the beginning of each round the peer | ||
4607 | does the following: | ||
4608 | |||
4609 | @itemize @bullet | ||
4610 | @item calculates its own distance to the target value | ||
4611 | @item creates, signs and stores the message for the current round (unless it | ||
4612 | has a better message in the "next round" slot which came early in the previous | ||
4613 | round) | ||
4614 | @item calculates, based on the stored round message (own or received) when to | ||
4615 | start flooding it to its neighbors | ||
4616 | @end itemize | ||
4617 | |||
4618 | Upon receiving a message the peer checks the validity of the message (round, | ||
4619 | proof of work, signature). The next action depends on the contents of the | ||
4620 | incoming message: | ||
4621 | |||
4622 | @itemize @bullet | ||
4623 | @item if the message is worse than the current stored message, the peer sends | ||
4624 | the current message back immediately, to stop the other peer from spreading | ||
4625 | suboptimal results | ||
4626 | @item if the message is better than the current stored message, the peer stores | ||
4627 | the new message and calculates the new target time to start spreading it to its | ||
4628 | neighbors (excluding the one the message came from) | ||
4629 | @item if the message is for the previous round, it is compared to the message | ||
4630 | stored in the "previous round slot", which may then be updated | ||
4631 | @item if the message is for the next round, it is compared to the message | ||
4632 | stored in the "next round slot", which again may then be updated | ||
4633 | @end itemize | ||
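
The decision rules above can be summarized in a small sketch. It assumes the message has already passed validation, represents each stored message only by its distance to the target (smaller is better), and numbers rounds with consecutive integers. All names are hypothetical; messages of equal quality and worse messages for the previous or next round are simply ignored here, which is an assumption and not something the description specifies.

```python
SLOT_NAMES = ("previous", "current", "next")


def handle_flood(slots, current_round, msg_round, msg_distance):
    """Decide what to do with a validated flood message.

    slots maps "previous", "current" and "next" to the best known distance
    for that round (None if nothing is stored yet).
    Returns "reply", "store" or "ignore".
    """
    offset = msg_round - current_round
    if offset < -1 or offset > 1:
        return "ignore"  # too old or too far in the future
    slot = SLOT_NAMES[offset + 1]
    stored = slots[slot]
    if stored is None or msg_distance < stored:
        slots[slot] = msg_distance  # better message: remember it
        return "store"
    if slot == "current" and msg_distance > stored:
        return "reply"  # worse message: send our better one back
    return "ignore"


slots = dict(previous=None, current=7, next=None)
print(handle_flood(slots, 100, 100, 9))  # reply
print(handle_flood(slots, 100, 100, 3))  # store
print(handle_flood(slots, 100, 101, 5))  # store (kept for the next round)
```

In a full implementation, a "store" result for the current round would also reschedule the time at which the message is flooded to the neighbors.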
4634 | |||
4635 | Finally, when it comes to sending the stored message for the current round to the | ||
4636 | neighbors there is a random delay added for each neighbor, to avoid traffic | ||
4637 | spikes and minimize cross-messages. | ||
4638 | |||
4639 | @node GNUnet's HOSTLIST subsystem | ||
4640 | @section GNUnet's HOSTLIST subsystem | ||
4641 | |||
4642 | @c %**end of header | ||
4643 | |||
4644 | Peers in the GNUnet overlay network need address information so that they can | ||
4645 | connect with other peers. GNUnet uses so-called HELLO messages to store and | ||
4646 | exchange peer addresses. GNUnet provides several methods for peers to obtain | ||
4647 | this information: | ||
4648 | |||
4649 | @itemize @bullet | ||
4650 | @item out-of-band exchange of HELLO messages (manually, using for example | ||
4651 | gnunet-peerinfo) | ||
4652 | @item HELLO messages shipped with GNUnet (automatic with distribution) | ||
4653 | @item UDP neighbor discovery in LAN (IPv4 broadcast, IPv6 multicast) | ||
4654 | @item topology gossiping (learning from other peers we already connected to), | ||
4655 | and | ||
4656 | @item the HOSTLIST daemon covered in this section, which is particularly | ||
4657 | relevant for bootstrapping new peers. | ||
4658 | @end itemize | ||
4659 | |||
4660 | New peers have no existing connections (and thus cannot learn from gossip among | ||
4661 | peers), may not have other peers in their LAN and might be started with an | ||
4662 | outdated set of HELLO messages from the distribution. In this case, getting new | ||
4663 | peers to connect to the network requires either manual effort or the use of a | ||
4664 | HOSTLIST to obtain HELLOs. | ||
4665 | |||
4666 | @menu | ||
4667 | * HELLOs:: | ||
4668 | * Overview for the HOSTLIST subsystem:: | ||
4669 | * Interacting with the HOSTLIST daemon:: | ||
4670 | * Hostlist security address validation:: | ||
4671 | * The HOSTLIST daemon:: | ||
4672 | * The HOSTLIST server:: | ||
4673 | * The HOSTLIST client:: | ||
4674 | * Usage:: | ||
4675 | @end menu | ||
4676 | |||
4677 | @node HELLOs | ||
4678 | @subsection HELLOs | ||
4679 | |||
4680 | @c %**end of header | ||
4681 | |||
4682 | The basic information peers require to connect to other peers is contained in | ||
4683 | so-called HELLO messages, which you can think of as business cards. Besides the | ||
4684 | identity of the peer (based on the cryptographic public key) a HELLO message | ||
4685 | may contain address information that specifies ways to contact a peer. By | ||
4686 | obtaining HELLO messages, a peer can learn how to contact other peers. | ||
4687 | |||
4688 | @node Overview for the HOSTLIST subsystem | ||
4689 | @subsection Overview for the HOSTLIST subsystem | ||
4690 | |||
4691 | @c %**end of header | ||
4692 | |||
4693 | The HOSTLIST subsystem provides a way to distribute and obtain contact | ||
4694 | information to connect to other peers using a simple HTTP GET request. Its | ||
4695 | implementation is split into three parts: the main file for the daemon itself | ||
4696 | (gnunet-daemon-hostlist.c), the HTTP client used to download peer information | ||
4697 | (hostlist-client.c) and the server component used to provide this information | ||
4698 | to other peers (hostlist-server.c). The server is basically a small HTTP web | ||
4699 | server (based on GNU libmicrohttpd) which provides a list of HELLOs known to | ||
4700 | the local peer for download. The client component is basically an HTTP client | ||
4701 | (based on libcurl) which can download hostlists from one or more websites. The | ||
4702 | hostlist format is a binary blob containing a sequence of HELLO messages. Note | ||
4703 | that any HTTP server can theoretically serve a hostlist; the built-in hostlist | ||
4704 | server simply makes it convenient to offer this service. | ||
4705 | |||
4706 | |||
4707 | @menu | ||
4708 | * Features:: | ||
4709 | * Limitations2:: | ||
4710 | @end menu | ||
4711 | |||
4712 | @node Features | ||
4713 | @subsubsection Features | ||
4714 | |||
4715 | @c %**end of header | ||
4716 | |||
4717 | The HOSTLIST daemon can: | ||
4718 | |||
4719 | @itemize @bullet | ||
4720 | @item provide HELLO messages with validated addresses obtained from PEERINFO to | ||
4721 | download for other peers | ||
4722 | @item download HELLO messages and forward these messages to the TRANSPORT | ||
4723 | subsystem for validation | ||
4724 | @item advertise the URL of this peer's hostlist address to other peers via | ||
4725 | gossip | ||
4726 | @item automatically learn about hostlist servers from the gossip of other peers | ||
4727 | @end itemize | ||
4728 | |||
4729 | @node Limitations2 | ||
4730 | @subsubsection Limitations2 | ||
4731 | |||
4732 | @c %**end of header | ||
4733 | |||
4734 | The HOSTLIST daemon does not: | ||
4735 | |||
4736 | @itemize @bullet | ||
4737 | @item verify the cryptographic information in the HELLO messages | ||
4738 | @item verify the address information in the HELLO messages | ||
4739 | @end itemize | ||
4740 | |||
4741 | @node Interacting with the HOSTLIST daemon | ||
4742 | @subsection Interacting with the HOSTLIST daemon | ||
4743 | |||
4744 | @c %**end of header | ||
4745 | |||
4746 | The HOSTLIST subsystem is currently implemented as a daemon, so there is no | ||
4747 | need for the user to interact with it and therefore there is no command line | ||
4748 | tool and no API to communicate with the daemon. In the future, we can envision | ||
4749 | changing this to allow users to manually trigger the download of a hostlist. | ||
4750 | |||
4751 | Since there is no command line interface to interact with HOSTLIST, the only | ||
4752 | way to interact with the hostlist is to use STATISTICS to obtain or modify | ||
4753 | information about the status of HOSTLIST: | ||
4754 | @example | ||
4755 | $ gnunet-statistics -s hostlist | ||
4756 | @end example | ||
4757 | |||
4758 | In particular, HOSTLIST includes a @strong{persistent} value in statistics that | ||
4759 | specifies when the hostlist server might be queried next. As this value is | ||
4760 | exponentially increasing during runtime, developers may want to reset or | ||
4761 | manually adjust it. Note that HOSTLIST (but not STATISTICS) needs to be | ||
4762 | shut down if changes to this value are to have any effect on the daemon (as | ||
4763 | HOSTLIST does not monitor STATISTICS for changes to the download | ||
4764 | frequency). | ||
4765 | |||
4766 | @node Hostlist security address validation | ||
4767 | @subsection Hostlist security address validation | ||
4768 | |||
4769 | @c %**end of header | ||
4770 | |||
4771 | Since information obtained from other parties cannot be trusted without | ||
4772 | validation, we have to distinguish between @emph{validated} and @emph{not | ||
4773 | validated} addresses. Before using (and so trusting) information from other | ||
4774 | parties, this information has to be double-checked (validated). Address | ||
4775 | validation is not done by HOSTLIST but by the TRANSPORT service. | ||
4776 | |||
4777 | The HOSTLIST component is functionally located between the PEERINFO and the | ||
4778 | TRANSPORT subsystem. When acting as a server, the daemon obtains valid | ||
4779 | (@emph{validated}) peer information (HELLO messages) from the PEERINFO service | ||
4780 | and provides it to other peers. When acting as a client, it contacts the | ||
4781 | HOSTLIST servers specified in the configuration, downloads the (unvalidated) | ||
4782 | list of HELLO messages and forwards this information to the TRANSPORT service | ||
4783 | to validate the addresses. | ||
4784 | |||
4785 | @node The HOSTLIST daemon | ||
4786 | @subsection The HOSTLIST daemon | ||
4787 | |||
4788 | @c %**end of header | ||
4789 | |||
4790 | The hostlist daemon is the main component of the HOSTLIST subsystem. It is | ||
4791 | started by the ARM service and (if configured) starts the HOSTLIST client and | ||
4792 | server components. | ||
4793 | |||
4794 | If the daemon provides a hostlist itself it can advertise its own hostlist to | ||
4795 | other peers. To do so it sends a GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT | ||
4796 | message to other peers when they connect to this peer on the CORE level. This | ||
4797 | hostlist advertisement message contains the URL to access the HOSTLIST HTTP | ||
4798 | server of the sender. The daemon may also subscribe to this type of message | ||
4799 | from the CORE service, and then forward this kind of message to the HOSTLIST | ||
4800 | client. The client then uses all available URLs to download peer information | ||
4801 | when necessary. | ||
4802 | |||
4803 | When starting, the HOSTLIST daemon first connects to the CORE subsystem and if | ||
4804 | hostlist learning is enabled, registers a CORE handler to receive this kind of | ||
4805 | message. Next it starts (if configured) the client and server. It passes | ||
4806 | pointers to CORE connect and disconnect and receive handlers where the client | ||
4807 | and server store their functions, so the daemon can notify them about CORE | ||
4808 | events. | ||
4809 | |||
4810 | To clean up on shutdown, the daemon has a cleaning task, shutting down all | ||
4811 | subsystems and disconnecting from CORE. | ||
4812 | |||
4813 | @node The HOSTLIST server | ||
4814 | @subsection The HOSTLIST server | ||
4815 | |||
4816 | @c %**end of header | ||
4817 | |||
4818 | The server provides a way for other peers to obtain HELLOs. Basically it is a | ||
4819 | small web server other peers can connect to and download a list of HELLOs using | ||
4820 | standard HTTP; it may also advertise the URL of the hostlist to other peers | ||
4821 | connecting on CORE level. | ||
4822 | |||
4823 | |||
4824 | @menu | ||
4825 | * The HTTP Server:: | ||
4826 | * Advertising the URL:: | ||
4827 | @end menu | ||
4828 | |||
4829 | @node The HTTP Server | ||
4830 | @subsubsection The HTTP Server | ||
4831 | |||
4832 | @c %**end of header | ||
4833 | |||
4834 | During startup, the server starts a web server listening on the port specified | ||
4835 | with the HTTPPORT value (default 8080). In addition it connects to the PEERINFO | ||
4836 | service to obtain peer information. The HOSTLIST server uses the | ||
4837 | GNUNET_PEERINFO_iterate function to request HELLO information for all peers and | ||
4838 | adds their information to a new hostlist if they are suitable (expired | ||
4839 | addresses and HELLOs without addresses are both not suitable) and the maximum | ||
4840 | size for a hostlist is not exceeded (MAX_BYTES_PER_HOSTLISTS = 500000). When | ||
4841 | PEERINFO finishes (with a last NULL callback), the server destroys the previous | ||
4842 | hostlist response available for download on the web server and replaces it with | ||
4843 | the updated hostlist. The hostlist format is basically a sequence of HELLO | ||
4844 | messages (as obtained from PEERINFO) without any special tokenization. Since | ||
4845 | each HELLO message contains a size field, the response can easily be split into | ||
4846 | separate HELLO messages by the client. | ||
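
Splitting such a response on the client side can be sketched as follows. The sketch assumes that every message starts with a 4-byte header holding a 16-bit size and a 16-bit type in network byte order, and that the size covers the header itself. The helper names and the type number 17 are placeholders for illustration.

```python
import struct

HEADER = struct.Struct(">HH")  # assumed header: 16-bit size, 16-bit type


def split_messages(blob):
    """Split a hostlist blob into (type, message_bytes) pairs."""
    messages = []
    offset = 0
    while offset + HEADER.size <= len(blob):
        size, msg_type = HEADER.unpack_from(blob, offset)
        if size < HEADER.size or offset + size > len(blob):
            break  # malformed or truncated message: stop parsing
        messages.append((msg_type, blob[offset:offset + size]))
        offset += size
    return messages


def make_message(msg_type, payload):
    """Build a test message whose size field covers header plus payload."""
    return HEADER.pack(HEADER.size + len(payload), msg_type) + payload


blob = make_message(17, b"first") + make_message(17, b"second!")
print([len(m) for _, m in split_messages(blob)])  # [9, 11]
```

Because no separator is needed, a truncated download still yields every complete message that precedes the cut.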
4847 | |||
4848 | A HOSTLIST client connecting to the HOSTLIST server will receive the hostlist | ||
4849 | as an HTTP response and the server will terminate the connection with the | ||
4850 | result code HTTP 200 OK. The connection will be closed immediately if no | ||
4851 | hostlist is available. | ||
4852 | |||
4853 | @node Advertising the URL | ||
4854 | @subsubsection Advertising the URL | ||
4855 | |||
4856 | @c %**end of header | ||
4857 | |||
4858 | The server also advertises the URL to download the hostlist to other peers if | ||
4859 | hostlist advertisement is enabled. When a new peer connects and has hostlist | ||
4860 | learning enabled, the server sends a GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT | ||
4861 | message to this peer using the CORE service. | ||
4862 | |||
4863 | @node The HOSTLIST client | ||
4864 | @subsection The HOSTLIST client | ||
4865 | |||
4866 | @c %**end of header | ||
4867 | |||
4868 | The client provides the functionality to download the list of HELLOs from a set | ||
4869 | of URLs. It performs a standard HTTP request to the URLs configured and learned | ||
4870 | from advertisement messages received from other peers. When a HELLO is | ||
4871 | downloaded, the HOSTLIST client forwards the HELLO to the TRANSPORT service for | ||
4872 | validation. | ||
4873 | |||
4874 | The client supports two modes of operation: download of HELLOs (bootstrapping) | ||
4875 | and learning of URLs. | ||
4876 | |||
4877 | |||
4878 | @menu | ||
4879 | * Bootstrapping:: | ||
4880 | * Learning:: | ||
4881 | @end menu | ||
4882 | |||
4883 | @node Bootstrapping | ||
4884 | @subsubsection Bootstrapping | ||
4885 | |||
4886 | @c %**end of header | ||
4887 | |||
4888 | For bootstrapping, it schedules a task to download the hostlist from the set of | ||
4889 | known URLs. The downloads are only performed if the number of current | ||
4890 | connections is smaller than a minimum number of connections (at the moment 4). | ||
4891 | The interval between downloads increases exponentially; however, the | ||
4892 | exponential growth is limited if it becomes longer than an hour. At that point, | ||
4893 | the interval is capped at (#number of connections * 1h). | ||
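
The download schedule described above can be sketched as follows. The doubling factor, the treatment of zero connections (a cap of one hour) and all names are assumptions made for this illustration; only the connection threshold of 4 and the cap of one hour per connection come from the text.

```python
HOUR = 3600  # seconds
MIN_CONNECTIONS = 4  # downloads only happen below this number of connections


def should_download(connections):
    """Bootstrapping is only needed while the peer is poorly connected."""
    return connections < MIN_CONNECTIONS


def next_delay(delay, connections):
    """Double the delay; beyond one hour, cap it at connections * 1h."""
    doubled = delay * 2
    if doubled <= HOUR:
        return doubled
    # Assumption: with zero connections the cap is one hour rather than zero.
    cap = max(1, connections) * HOUR
    return min(doubled, cap)


print(should_download(2), should_download(4))  # True False
print(next_delay(60, 0))      # 120
print(next_delay(3600, 0))    # 3600
print(next_delay(3600, 3))    # 7200
print(next_delay(10800, 3))   # 10800
```

The effect is that a badly connected peer keeps polling roughly hourly, while a peer with a few connections backs off further.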
4894 | |||
4895 | Once the decision has been taken to download HELLOs, the daemon chooses a | ||
4896 | random URL from the list of known URLs. URLs can be configured in the | ||
4897 | configuration or be learned from advertisement messages. The client uses an HTTP | ||
4898 | client library (libcurl) to initiate the download using the libcurl multi | ||
4899 | interface. Libcurl passes the data to the callback_download function which | ||
4900 | stores the data in a buffer if space is available and the maximum size for a | ||
4901 | hostlist download is not exceeded (MAX_BYTES_PER_HOSTLISTS = 500000). When a | ||
4902 | full HELLO was downloaded, the HOSTLIST client offers this HELLO message to the | ||
4903 | TRANSPORT service for validation. When the download is finished or failed, | ||
4904 | statistical information about the quality of this URL is updated. | ||
4905 | |||
4906 | @node Learning | ||
4907 | @subsubsection Learning | ||
4908 | |||
4909 | @c %**end of header | ||
4910 | |||
4911 | The client also manages hostlist advertisements from other peers. The HOSTLIST | ||
4912 | daemon forwards GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT messages to the | ||
4913 | client subsystem, which extracts the URL from the message. Next, a test of the | ||
4914 | newly obtained URL is performed by triggering a download from the new URL. If | ||
4915 | the URL works correctly, it is added to the list of working URLs. | ||
4916 | |||
4917 | The size of the list of URLs is restricted, so if an additional server is added | ||
4918 | and the list is full, the URL with the worst quality ranking (determined, for | ||
4919 | example, by successful downloads and the number of HELLOs) is discarded. During | ||
4920 | shutdown the list of URLs is saved to a file for persistence and loaded on | ||
4921 | startup. URLs from the configuration file are never discarded. | ||
4922 | |||
4923 | @node Usage | ||
4924 | @subsection Usage | ||
4925 | |||
4926 | @c %**end of header | ||
4927 | |||
4928 | To start HOSTLIST by default, it has to be added to the DEFAULTSERVICES section | ||
4929 | for the ARM services. This is done in the default configuration. | ||
4930 | |||
4931 | For more information on how to configure the HOSTLIST subsystem see the | ||
4932 | installation handbook:@ Configuring the hostlist to bootstrap@ Configuring your | ||
4933 | peer to provide a hostlist | ||
4934 | |||
4935 | @node GNUnet's IDENTITY subsystem | ||
4936 | @section GNUnet's IDENTITY subsystem | ||
4937 | |||
4938 | @c %**end of header | ||
4939 | |||
4940 | Identities of "users" in GNUnet are called egos. Egos can be used as pseudonyms | ||
4941 | (fake names) or be tied to an organization (for example, GNU) or even the | ||
4942 | actual identity of a human. GNUnet users are expected to have many egos. They | ||
4943 | might have one tied to their real identity, some for organizations they manage, | ||
4944 | and more for different domains where they want to operate under a pseudonym. | ||
4945 | |||
4946 | The IDENTITY service allows users to manage their egos. The identity service | ||
4947 | manages the private keys of the egos of the local user; it does not manage identities | ||
4948 | of other users (public keys). Public keys for other users need names to become | ||
4949 | manageable. GNUnet uses the GNU Name System (GNS) to give names to other users | ||
4950 | and manage their public keys securely. This chapter is about the IDENTITY | ||
4951 | service, which is about the management of private keys. | ||
4952 | |||
4953 | On the network, an ego corresponds to an ECDSA key (over Curve25519, using RFC | ||
4954 | 6979, as required by GNS). Thus, users can perform actions under a particular | ||
4955 | ego by using (signing with) a particular private key. Other users can then | ||
4956 | confirm that the action was really performed by that ego by checking the | ||
4957 | signature against the respective public key. | ||
4958 | |||
4959 | The IDENTITY service allows users to associate a human-readable name with each | ||
4960 | ego. This way, users can use names that will remind them of the purpose of a | ||
4961 | particular ego. The IDENTITY service will store the respective private keys and | ||
4962 | allows applications to access key information by name. Users can change the | ||
4963 | name that is locally (!) associated with an ego. Egos can also be deleted, | ||
4964 | which means that the private key will be removed and it thus will not be | ||
4965 | possible to perform actions with that ego in the future. | ||
4966 | |||
4967 | Additionally, the IDENTITY subsystem can associate service functions with egos. | ||
4968 | For example, GNS requires the ego that should be used for the shorten zone. GNS | ||
4969 | will ask IDENTITY for an ego for the "gns-short" service. The IDENTITY service | ||
4970 | has a mapping of such service strings to the name of the ego that the user | ||
4971 | wants to use for this service, for example "my-short-zone-ego". | ||
4972 | |||
4973 | Finally, the IDENTITY API provides access to a special ego, the anonymous ego. | ||
4974 | The anonymous ego is special in that its private key is not really private, but | ||
4975 | fixed and known to everyone. Thus, anyone can perform actions as anonymous. | ||
4976 | This is useful because code then does not have to contain a special | ||
4977 | case to distinguish between anonymous and pseudonymous egos. | ||
4978 | |||
4979 | @menu | ||
4980 | * libgnunetidentity:: | ||
4981 | * The IDENTITY Client-Service Protocol:: | ||
4982 | @end menu | ||
4983 | |||
4984 | @node libgnunetidentity | ||
4985 | @subsection libgnunetidentity | ||
4986 | @c %**end of header | ||
4987 | |||
4988 | |||
4989 | @menu | ||
4990 | * Connecting to the service:: | ||
4991 | * Operations on Egos:: | ||
4992 | * The anonymous Ego:: | ||
4993 | * Convenience API to lookup a single ego:: | ||
4994 | * Associating egos with service functions:: | ||
4995 | @end menu | ||
4996 | |||
4997 | @node Connecting to the service | ||
4998 | @subsubsection Connecting to the service | ||
4999 | |||
5000 | @c %**end of header | ||
5001 | |||
5002 | First, typical clients connect to the identity service using | ||
5003 | @code{GNUNET_IDENTITY_connect}. This function takes a callback as a parameter. | ||
5004 | If the given callback parameter is non-null, it will be invoked to notify the | ||
5005 | application about the current state of the identities in the system. | ||
5006 | |||
5007 | @itemize @bullet | ||
5008 | @item First, it will be invoked on all known egos at the time of the | ||
5009 | connection. For each ego, a handle to the ego and the user's name for the ego | ||
5010 | will be passed to the callback. Furthermore, a @code{void **} context argument | ||
5011 | will be provided which gives the client the opportunity to associate some state | ||
5012 | with the ego. | ||
5013 | @item Second, the callback will be invoked with NULL for the ego, the name and | ||
5014 | the context. This signals that the (initial) iteration over all egos has | ||
5015 | completed. | ||
5016 | @item Then, the callback will be invoked whenever something changes about an | ||
5017 | ego. If an ego is renamed, the callback is invoked with the ego handle of the | ||
5018 | ego that was renamed, and the new name. If an ego is deleted, the callback is | ||
5019 | invoked with the ego handle and a name of NULL. In the deletion case, the | ||
5020 | application should also release resources stored in the context. | ||
5021 | @item When the application destroys the connection to the identity service | ||
5022 | using @code{GNUNET_IDENTITY_disconnect}, the callback is again invoked with the | ||
5023 | ego and a name of NULL (equivalent to deletion of the egos). This should again | ||
5024 | be used to clean up the per-ego context. | ||
5025 | @end itemize | ||
5026 | |||
5027 | The ego handle passed to the callback remains valid until the callback is | ||
5028 | invoked with a name of NULL, so it is safe to store a reference to the ego's | ||
5029 | handle. | ||
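
The callback contract described above can be sketched as a small, self-contained
model. This is not the real API: @code{struct Ego}, @code{struct EgoState},
@code{struct Tracker} and @code{ego_cb} are hypothetical names, and the argument
order (closure, ego, context, name) is an assumption made for illustration. The
sketch only shows how a client might tell apart the three cases (new or renamed
ego, end of the initial iteration, deletion or disconnect) and manage its
per-ego context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the opaque ego handle of the real API. */
struct Ego { int id; };

/* Per-ego state that the application stores in the void ** context. */
struct EgoState { char name[64]; };

/* Bookkeeping for this sketch only. */
struct Tracker { int live; int initial_done; };

/* Models the three cases from the text:
 *   ego != NULL, name != NULL : ego is known, new, or was renamed
 *   ego == NULL               : the initial iteration has completed
 *   ego != NULL, name == NULL : ego was deleted, or we are disconnecting */
static void
ego_cb (void *cls, struct Ego *ego, void **ctx, const char *name)
{
  struct Tracker *t = cls;

  if (NULL == ego)
  {
    t->initial_done = 1;
    return;
  }
  if (NULL == name)
  {
    /* Release whatever we stored for this ego. */
    if (NULL != *ctx)
    {
      free (*ctx);
      *ctx = NULL;
      t->live--;
    }
    return;
  }
  if (NULL == *ctx)
  {
    /* First time we see this ego: allocate our per-ego state. */
    *ctx = calloc (1, sizeof (struct EgoState));
    assert (NULL != *ctx);
    t->live++;
  }
  struct EgoState *s = *ctx;
  strncpy (s->name, name, sizeof (s->name) - 1);
  s->name[sizeof (s->name) - 1] = '\0';
}
```

A client following this pattern never has to remember which egos it has seen:
the context pointer is NULL exactly when no state has been attached yet.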
5030 | |||
5031 | @node Operations on Egos | ||
5032 | @subsubsection Operations on Egos | ||
5033 | |||
5034 | @c %**end of header | ||
5035 | |||
5036 | Given an ego handle, the main operations are to get its associated private key | ||
5037 | using @code{GNUNET_IDENTITY_ego_get_private_key} or its associated public key | ||
5038 | using @code{GNUNET_IDENTITY_ego_get_public_key}. | ||
5039 | |||
5040 | The other operations on egos are pretty straightforward. Using | ||
5041 | @code{GNUNET_IDENTITY_create}, an application can request the creation of an | ||
5042 | ego by specifying the desired name. The operation will fail if that name is | ||
5043 | already in use. Using @code{GNUNET_IDENTITY_rename} the name of an existing ego | ||
5044 | can be changed. Finally, egos can be deleted using | ||
5045 | @code{GNUNET_IDENTITY_delete}. All of these operations will trigger updates to | ||
5046 | the callback given to the @code{GNUNET_IDENTITY_connect} function of all | ||
5047 | applications that are connected with the identity service at the time. | ||
5048 | @code{GNUNET_IDENTITY_cancel} can be used to cancel the operations before the | ||
5049 | respective continuations would be called. It is not guaranteed that the | ||
5050 | operation will not be completed anyway, only the continuation will no longer be | ||
5051 | called. | ||
5052 | |||
5053 | @node The anonymous Ego | ||
5054 | @subsubsection The anonymous Ego | ||
5055 | |||
5056 | @c %**end of header | ||
5057 | |||
5058 | A special way to obtain an ego handle is to call | ||
5059 | @code{GNUNET_IDENTITY_ego_get_anonymous}, which returns an ego for the | ||
5060 | "anonymous" user --- anyone knows and can get the private key for this user, so | ||
5061 | it is suitable for operations that are supposed to be anonymous but require | ||
5062 | signatures (for example, to avoid a special path in the code). The anonymous | ||
5063 | ego is always valid and accessing it does not require a connection to the | ||
5064 | identity service. | ||
5065 | |||
5066 | @node Convenience API to lookup a single ego | ||
5067 | @subsubsection Convenience API to lookup a single ego | ||
5068 | |||
5069 | |||
5070 | As applications commonly just need to look up a single ego, there is a | ||
5071 | convenience API to do just that. Use @code{GNUNET_IDENTITY_ego_lookup} to | ||
5072 | look up a single ego by name. Note that this is the user's name for the ego, not | ||
5073 | the service function. The resulting ego will be returned via a callback and | ||
5074 | will only be valid during that callback. The operation can be cancelled via | ||
5075 | @code{GNUNET_IDENTITY_ego_lookup_cancel} (cancellation is only legal before the | ||
5076 | callback is invoked). | ||
5077 | |||
5078 | @node Associating egos with service functions | ||
5079 | @subsubsection Associating egos with service functions | ||
5080 | |||
5081 | |||
5082 | The @code{GNUNET_IDENTITY_set} function is used to associate a particular ego | ||
5083 | with a service function. The name used by the service and the ego are given as | ||
5084 | arguments. Afterwards, the service can use its name to look up the associated | ||
5085 | ego using @code{GNUNET_IDENTITY_get}. | ||
5086 | |||
5087 | @node The IDENTITY Client-Service Protocol | ||
5088 | @subsection The IDENTITY Client-Service Protocol | ||
5089 | |||
5090 | @c %**end of header | ||
5091 | |||
5092 | A client connecting to the identity service first sends a message with type | ||
5093 | @code{GNUNET_MESSAGE_TYPE_IDENTITY_START} to the service. After that, the | ||
5094 | client will receive information about changes to the egos by receiving messages | ||
5095 | of type @code{GNUNET_MESSAGE_TYPE_IDENTITY_UPDATE}. Those messages contain the | ||
5096 | private key of the ego and the user's name of the ego (or zero bytes for the | ||
5097 | name to indicate that the ego was deleted). A special bit @code{end_of_list} is | ||
5098 | used to indicate the end of the initial iteration over the identity service's | ||
5099 | egos. | ||
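
How a client might interpret an UPDATE message can be sketched as below. The
struct is a simplified, hypothetical view: only the @code{end_of_list} bit is
named in the text above, @code{name_len} is an assumed field name for the
length of the name, and byte-order conversion and the private key are left out
entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the fields that matter for dispatch. */
struct UpdateView
{
  uint16_t name_len;    /* assumed: number of name bytes, 0 if the ego is gone */
  uint16_t end_of_list; /* non-zero marks the end of the initial iteration */
};

enum UpdateKind
{
  UPDATE_END_OF_LIST,
  UPDATE_EGO_DELETED,
  UPDATE_EGO_CHANGED
};

/* Decide what an UPDATE message means, following the rules in the text. */
static enum UpdateKind
classify_update (const struct UpdateView *u)
{
  assert (NULL != u);
  if (0 != u->end_of_list)
    return UPDATE_END_OF_LIST;
  if (0 == u->name_len)
    return UPDATE_EGO_DELETED;
  return UPDATE_EGO_CHANGED;
}
```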
5100 | |||
5101 | The client can trigger changes to the egos by sending CREATE, RENAME or DELETE | ||
5102 | messages. The CREATE message contains the private key and the desired name. The | ||
5103 | RENAME message contains the old name and the new name. The DELETE message only | ||
5104 | needs to include the name of the ego to delete. The service responds to each of | ||
5105 | these messages with a RESULT_CODE message which indicates success or error of | ||
5106 | the operation, and possibly a human-readable error message. | ||
5107 | |||
5108 | Finally, the client can bind the name of a service function to an ego by | ||
5109 | sending a SET_DEFAULT message with the name of the service function and the | ||
5110 | private key of the ego. Such bindings can then be resolved using a GET_DEFAULT | ||
5111 | message, which includes the name of the service function. The identity service | ||
5112 | will respond to a GET_DEFAULT request with a SET_DEFAULT message containing the | ||
5113 | respective information, or with a RESULT_CODE to indicate an error. | ||
5114 | |||
5115 | @node GNUnet's NAMESTORE Subsystem | ||
5116 | @section GNUnet's NAMESTORE Subsystem | ||
5117 | |||
5118 | @c %**end of header | ||
5119 | |||
5120 | The NAMESTORE subsystem provides persistent storage for local GNS zone | ||
5121 | information. All local GNS zone information is managed by NAMESTORE. It | ||
5122 | provides both the functionality to administer local GNS information (e.g. | ||
5123 | delete and add records) as well as to retrieve GNS information (e.g. to list | ||
5124 | name information in a client). NAMESTORE only manages the persistent | ||
5125 | storage of zone information belonging to the user running the service: GNS | ||
5126 | information from other users obtained from the DHT is stored by the NAMECACHE | ||
5127 | subsystem. | ||
5128 | |||
5129 | NAMESTORE uses a plugin-based database backend to store GNS information with | ||
5130 | good performance. sqlite, MySQL and PostgreSQL are the supported database | ||
5131 | backends. NAMESTORE clients interact with the IDENTITY subsystem to obtain | ||
5132 | cryptographic information about zones based on egos as described with the | ||
5133 | IDENTITY subsystem, but internally NAMESTORE refers to zones using the ECDSA | ||
5134 | private key. In addition, it collaborates with the NAMECACHE subsystem and | ||
5135 | stores zone information in the GNS cache whenever local information is | ||
5136 | modified, to increase look-up performance for local information. | ||
5137 | |||
5138 | NAMESTORE provides functionality to look up and store records, to iterate over | ||
5139 | a specific or all zones and to monitor zones for changes. NAMESTORE | ||
5140 | functionality can be accessed using the NAMESTORE API or the NAMESTORE command | ||
5141 | line tool. | ||
5142 | |||
5143 | @menu | ||
5144 | * libgnunetnamestore:: | ||
5145 | @end menu | ||
5146 | |||
5147 | @node libgnunetnamestore | ||
5148 | @subsection libgnunetnamestore | ||
5149 | |||
5150 | @c %**end of header | ||
5151 | |||
5152 | To interact with NAMESTORE, clients first connect to the NAMESTORE service | ||
5153 | using @code{GNUNET_NAMESTORE_connect}, passing a configuration handle. On | ||
5154 | success they obtain a NAMESTORE handle that they can use for operations; NULL | ||
5155 | is returned if the connection failed. | ||
5156 | |||
5157 | To disconnect from NAMESTORE, clients use @code{GNUNET_NAMESTORE_disconnect} | ||
5158 | and specify the handle to disconnect. | ||
5159 | |||
5160 | NAMESTORE internally uses the ECDSA private key to refer to zones. These | ||
5161 | private keys can be obtained from the IDENTITY subsystem. Here @emph{egos} | ||
5162 | can be used to refer to zones, or the default ego assigned to the GNS subsystem | ||
5163 | can be used to obtain the master zone's private key. | ||
5164 | |||
5165 | |||
5166 | @menu | ||
5167 | * Editing Zone Information:: | ||
5168 | * Iterating Zone Information:: | ||
5169 | * Monitoring Zone Information:: | ||
5170 | @end menu | ||
5171 | |||
5172 | @node Editing Zone Information | ||
5173 | @subsubsection Editing Zone Information | ||
5174 | |||
5175 | @c %**end of header | ||
5176 | |||
5177 | NAMESTORE provides functions to look up records stored under a label in a zone | ||
5178 | and to store records under a label in a zone. | ||
5179 | |||
5180 | To store (and delete) records, the client uses the | ||
5181 | @code{GNUNET_NAMESTORE_records_store} function and has to provide the namestore | ||
5182 | handle to use, the private key of the zone, the label to store the records | ||
5183 | under, the records and number of records, plus a callback function. After the | ||
5184 | operation is performed, NAMESTORE will call the provided callback function with | ||
5185 | the result: GNUNET_SYSERR on failure (including timeout/queue drop/failure to | ||
5186 | validate), GNUNET_NO if content was already there or not found, or GNUNET_YES | ||
5187 | (or another positive value) on success, plus an additional error message. | ||
5188 | |||
5189 | Records are deleted by using the store command with 0 records to store. It is | ||
5190 | important to note that records are not merged when records already exist under | ||
5191 | the label. So a client first has to retrieve the existing records, merge them | ||
5192 | with the new records and then store the result. | ||
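
Because a store call replaces everything under a label, the merge has to happen
on the client side. The following is a minimal sketch of such a merge step,
assuming a simplified, hypothetical @code{struct Record} with only a type and a
string value (real records carry more fields). The merged array is what a
client would then hand to a single store call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical record used only for this sketch. */
struct Record
{
  int type;
  const char *value;
};

/* Copy `existing` into `out`, then append every record from `add` that is
 * not already present. Returns the merged count, or -1 if `out` is too
 * small to hold the result. */
static int
merge_records (const struct Record *existing, size_t n_existing,
               const struct Record *add, size_t n_add,
               struct Record *out, size_t out_cap)
{
  size_t n = 0;

  assert (NULL != out);
  for (size_t i = 0; i < n_existing; i++)
  {
    if (n == out_cap)
      return -1;
    out[n++] = existing[i];
  }
  for (size_t i = 0; i < n_add; i++)
  {
    int dup = 0;

    for (size_t j = 0; j < n; j++)
      if ( (out[j].type == add[i].type) &&
           (0 == strcmp (out[j].value, add[i].value)) )
      {
        dup = 1;
        break;
      }
    if (dup)
      continue;
    if (n == out_cap)
      return -1;
    out[n++] = add[i];
  }
  return (int) n;
}
```

Storing the merged set keeps the old records; storing only the new records
would silently drop them, and storing an empty set deletes the label.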
5193 | |||
5194 | To perform a lookup operation, the client uses the | ||
5195 | @code{GNUNET_NAMESTORE_records_lookup} function. Here it has to pass the | ||
5196 | namestore handle, the private key of the zone and the label. It also has to | ||
5197 | provide a callback function which will be called with the result of the lookup | ||
5198 | operation: the zone for the records, the label, and the records including the | ||
5199 | number of records included. | ||
5200 | |||
5201 | A special operation is used to set the preferred nickname for a zone. This | ||
5202 | nickname is stored with the zone and is automatically merged with all labels | ||
5203 | and records stored in a zone. Here the client uses the | ||
5204 | @code{GNUNET_NAMESTORE_set_nick} function and passes the private key of the | ||
5205 | zone, the nickname as a string, plus a callback to be called with the result of | ||
5206 | the operation. | ||
5207 | |||
5208 | @node Iterating Zone Information | ||
5209 | @subsubsection Iterating Zone Information | ||
5210 | |||
5211 | @c %**end of header | ||
5212 | |||
5213 | A client can iterate over all information in a zone or all zones managed by | ||
5214 | NAMESTORE. Here a client uses the @code{GNUNET_NAMESTORE_zone_iteration_start} | ||
5215 | function and passes the namestore handle, the zone to iterate over and a | ||
5216 | callback function to call with the result. If the client wants to iterate over | ||
5217 | all zones, it passes NULL for the zone. A @code{GNUNET_NAMESTORE_ZoneIterator} | ||
5218 | handle is returned to be used to continue iteration. | ||
5219 | |||
5220 | NAMESTORE calls the callback for every result and expects the client to call | ||
5221 | @code{GNUNET_NAMESTORE_zone_iterator_next} to continue to iterate or | ||
5222 | @code{GNUNET_NAMESTORE_zone_iterator_stop} to interrupt the iteration. When | ||
5223 | NAMESTORE reaches the last item, it will call the callback with a NULL value to | ||
5224 | indicate the end of the iteration. | ||
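
The client-driven iteration pattern can be illustrated with a small in-memory
model. Everything here is hypothetical (@code{struct ZoneIterator},
@code{iterator_next}, @code{iterator_stop}, @code{count_cb}) and the model
calls the callback synchronously, whereas a real service would deliver results
asynchronously. The point is only the control flow: one result per callback,
and the client decides whether to ask for the next one or to stop.

```c
#include <assert.h>
#include <stddef.h>

struct ZoneIterator;

/* Called once per label; a NULL label signals the end of the zone. */
typedef void (*RecordCallback) (void *cls,
                                struct ZoneIterator *it,
                                const char *label);

/* Hypothetical in-memory iterator over an array of labels. */
struct ZoneIterator
{
  const char **labels;
  size_t count;
  size_t pos;
  int stopped;
  RecordCallback cb;
  void *cls;
};

/* Deliver the next label, or NULL once all labels have been delivered. */
static void
iterator_next (struct ZoneIterator *it)
{
  assert (NULL != it->cb);
  if (it->stopped)
    return;
  if (it->pos < it->count)
  {
    it->cb (it->cls, it, it->labels[it->pos++]);
    return;
  }
  it->stopped = 1;
  it->cb (it->cls, it, NULL);
}

/* Abort the iteration; no further callbacks are made. */
static void
iterator_stop (struct ZoneIterator *it)
{
  it->stopped = 1;
}

/* Example client state: stop after `limit` labels (0 means no limit). */
struct Counter
{
  size_t seen;
  size_t limit;
  int finished;
};

static void
count_cb (void *cls, struct ZoneIterator *it, const char *label)
{
  struct Counter *c = cls;

  if (NULL == label)
  {
    c->finished = 1; /* end of iteration reached */
    return;
  }
  c->seen++;
  if (c->seen == c->limit)
    iterator_stop (it);
  else
    iterator_next (it);
}
```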
5225 | |||
5226 | @node Monitoring Zone Information | ||
5227 | @subsubsection Monitoring Zone Information | ||
5228 | |||
5229 | @c %**end of header | ||
5230 | |||
5231 | Clients can also monitor zones to be notified about changes. Here the client | ||
5232 | uses the @code{GNUNET_NAMESTORE_zone_monitor_start} function and passes the | ||
5233 | private key of the zone and a callback function to call with updates for a | ||
5234 | zone. The client can ask to first obtain the existing zone information by | ||
5235 | iterating over the zone, and can specify a synchronization callback to be | ||
5236 | called when the client and the namestore are synced. | ||
5237 | |||
5238 | On an update, NAMESTORE will call the callback with the private key of the | ||
5239 | zone, the label and the records and their number. | ||
5240 | |||
5241 | To stop monitoring, the client calls @code{GNUNET_NAMESTORE_zone_monitor_stop} | ||
5242 | and passes the handle obtained from the function to start the monitoring. | ||
5243 | |||
5244 | @node GNUnet's PEERINFO subsystem | ||
5245 | @section GNUnet's PEERINFO subsystem | ||
5246 | |||
5247 | @c %**end of header | ||
5248 | |||
5249 | The PEERINFO subsystem is used to store verified (validated) information about | ||
5250 | known peers in a persistent way. It obtains these addresses for example from | ||
5251 | the TRANSPORT service, which is in charge of address validation. Validation | ||
5252 | means that the information in the HELLO message is checked by connecting to | ||
5253 | the addresses and performing a cryptographic handshake to authenticate the peer | ||
5254 | instance claiming to be reachable at these addresses. PEERINFO does not | ||
5255 | validate the HELLO messages itself but only stores them and gives them to | ||
5256 | interested clients. | ||
5257 | |||
5258 | As future work, we are considering moving from storing just HELLO messages to | ||
5259 | providing a generic persistent per-peer information store. More and more | ||
5260 | subsystems tend to need to store per-peer information in a persistent way. To | ||
5261 | avoid duplicating this functionality, we plan to provide a PEERSTORE service | ||
5262 | providing this functionality. | ||
5263 | |||
5264 | @menu | ||
5265 | * Features2:: | ||
5266 | * Limitations3:: | ||
5267 | * DeveloperPeer Information:: | ||
5268 | * Startup:: | ||
5269 | * Managing Information:: | ||
5270 | * Obtaining Information:: | ||
5271 | * The PEERINFO Client-Service Protocol:: | ||
5272 | * libgnunetpeerinfo:: | ||
5273 | @end menu | ||
5274 | |||
5275 | @node Features2 | ||
5276 | @subsection Features2 | ||
5277 | |||
5278 | @c %**end of header | ||
5279 | |||
5280 | @itemize @bullet | ||
5281 | @item Persistent storage | ||
5282 | @item Client notification mechanism on update | ||
5283 | @item Periodic clean up for expired information | ||
5284 | @item Differentiation between public and friend-only HELLO | ||
5285 | @end itemize | ||
5286 | |||
5287 | @node Limitations3 | ||
5288 | @subsection Limitations3 | ||
5289 | |||
5290 | |||
5291 | @itemize @bullet | ||
5292 | @item Does not perform HELLO validation | ||
5293 | @end itemize | ||
5294 | |||
5295 | @node DeveloperPeer Information | ||
5296 | @subsection DeveloperPeer Information | ||
5297 | |||
5298 | @c %**end of header | ||
5299 | |||
5300 | The PEERINFO subsystem stores this information in the form of HELLO messages, | ||
5301 | which you can think of as business cards. A HELLO message contains the public | ||
5302 | key of a peer and the addresses under which the peer can be reached. Each | ||
5303 | address includes an expiration date describing how long it is valid. TRANSPORT | ||
5304 | updates this information regularly by revalidating the address. If an address | ||
5305 | has expired and is not renewed, it can be removed from the HELLO message. | ||
5306 | |||
5307 | Some peers do not want to have their HELLO messages distributed to other peers, | ||
5308 | especially when GNUnet's friend-to-friend mode is enabled. To prevent this | ||
5309 | undesired distribution, PEERINFO distinguishes between @emph{public} and | ||
5310 | @emph{friend-only} HELLO messages. Public HELLO messages can be freely | ||
5311 | distributed to other (possibly unknown) peers (for example using the hostlist, | ||
5312 | gossiping, broadcasting), whereas friend-only HELLO messages may not be | ||
5313 | distributed to other peers. Friend-only HELLO messages have an additional flag | ||
5314 | @code{friend_only} set internally. For public HELLO messages this flag is not | ||
5315 | set. PEERINFO does not and cannot check if a client is allowed to obtain a | ||
5316 | specific HELLO type. | ||
5317 | |||
5318 | The HELLO messages can be managed using the GNUnet HELLO library. Other GNUnet | ||
5319 | subsystems can obtain this information from PEERINFO and use it for their | ||
5320 | purposes. Clients are for example the HOSTLIST component, which provides this | ||
5321 | information to other peers in the form of a hostlist, or the TRANSPORT | ||
5322 | subsystem, which uses this information to maintain connections to other peers. | ||
5323 | |||
5324 | @node Startup | ||
5325 | @subsection Startup | ||
5326 | |||
5327 | @c %**end of header | ||
5328 | |||
5329 | During startup the PEERINFO service loads persistent HELLOs from disk. First | ||
5330 | PEERINFO parses the directory configured in the HOSTS value of the | ||
5331 | @code{PEERINFO} configuration section, where PEERINFO information is stored. | ||
5332 | For all files found in this directory, valid HELLO messages are extracted. In | ||
5333 | addition it loads HELLO messages shipped with the GNUnet distribution. These | ||
5334 | HELLOs simplify network bootstrapping by providing valid peer information with | ||
5335 | the distribution. The use of these HELLOs can be prevented by setting the | ||
5336 | @code{USE_INCLUDED_HELLOS} option in the @code{PEERINFO} configuration section | ||
5337 | to @code{NO}. Files containing invalid information are removed. | ||
5338 | |||
5339 | @node Managing Information | ||
5340 | @subsection Managing Information | ||
5341 | |||
5342 | @c %**end of header | ||
5343 | |||
5344 | The PEERINFO service stores information about known peers and a single HELLO | ||
5345 | message for every peer. A peer does not need to have a HELLO if no information | ||
5346 | is available. HELLO information from different sources, for example a HELLO | ||
5347 | obtained from a remote HOSTLIST and a second HELLO stored on disk, is combined | ||
5348 | and merged into one single HELLO message per peer which will be given to | ||
5349 | clients. During this merge process the HELLO is immediately written to disk to | ||
5350 | ensure persistence. | ||
5351 | |||
5352 | In addition, PEERINFO periodically scans the directory where information is | ||
5353 | stored for empty HELLO messages with expired TRANSPORT addresses. This | ||
5354 | periodic task scans all files in the directory and recreates the HELLO messages | ||
5355 | it finds. Expired TRANSPORT addresses are removed from the HELLO and if the | ||
5356 | HELLO does not contain any valid addresses, it is discarded and removed from | ||
5357 | disk. | ||
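
The clean-up rule above (drop expired addresses, discard a HELLO that has none
left) can be sketched as follows. This is a simplified model, not PEERINFO's
code: @code{struct Address}, @code{prune_expired} and
@code{hello_should_be_discarded} are hypothetical names, and a HELLO is reduced
to a plain array of addresses with expiration times.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified address entry of a HELLO. */
struct Address
{
  const char *uri;
  time_t expires;
};

/* Remove every address that expired at or before `now`, compacting the
 * array in place. Returns the number of addresses that remain. */
static size_t
prune_expired (struct Address *addrs, size_t n, time_t now)
{
  size_t kept = 0;

  assert ( (NULL != addrs) || (0 == n) );
  for (size_t i = 0; i < n; i++)
    if (addrs[i].expires > now)
      addrs[kept++] = addrs[i];
  return kept;
}

/* Prune the HELLO and report whether it is now empty, in which case the
 * caller would discard it and remove it from disk. */
static int
hello_should_be_discarded (struct Address *addrs, size_t *n, time_t now)
{
  *n = prune_expired (addrs, *n, now);
  return 0 == *n;
}
```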
5358 | |||
5359 | @node Obtaining Information | ||
5360 | @subsection Obtaining Information | ||
5361 | |||
5362 | @c %**end of header | ||
5363 | |||
5364 | When a client requests information from PEERINFO, PEERINFO performs a lookup | ||
5365 | for the respective peer or all peers if desired and transmits this information | ||
5366 | to the client. The client can specify if friend-only HELLOs have to be included | ||
5367 | or not and PEERINFO filters the respective HELLO messages before transmitting | ||
5368 | information. | ||
5369 | |||
5370 | To notify clients about changes to PEERINFO information, PEERINFO maintains a | ||
5371 | list of clients interested in these notifications. Such a notification occurs if | ||
5372 | a HELLO for a peer was updated (due to a merge for example) or a new peer was | ||
5373 | added. | ||
5374 | |||
5375 | @node The PEERINFO Client-Service Protocol | ||
5376 | @subsection The PEERINFO Client-Service Protocol | ||
5377 | |||
5378 | @c %**end of header | ||
5379 | |||
5380 | To connect to and disconnect from the PEERINFO service, PEERINFO utilizes | ||
5381 | the util client/server infrastructure, so no special message types are used | ||
5382 | here. | ||
5383 | |||
5384 | To add information for a peer, the plain HELLO message is transmitted to the | ||
5385 | service without any wrapping. All required information is stored within the | ||
5386 | HELLO message. The PEERINFO service provides a message handler accepting and | ||
5387 | processing these HELLO messages. | ||
5388 | |||
5389 | When obtaining PEERINFO information using the iterate functionality, specific | ||
5390 | messages are used. To obtain information for all peers, a @code{struct | ||
5391 | ListAllPeersMessage} with message type | ||
5392 | @code{GNUNET_MESSAGE_TYPE_PEERINFO_GET_ALL} and a flag | ||
5393 | @code{include_friend_only} to indicate if friend-only HELLO messages should be | ||
5394 | included is transmitted. If information for a specific peer is required, a | ||
5395 | @code{struct ListPeerMessage} with @code{GNUNET_MESSAGE_TYPE_PEERINFO_GET} | ||
5396 | containing the peer identity is used. | ||
5397 | |||
5398 | For both variants the PEERINFO service replies for each HELLO message it wants | ||
5399 | to transmit with a @code{struct InfoMessage} with type | ||
5400 | @code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO} containing the plain HELLO. The final | ||
5401 | message is a @code{struct GNUNET_MessageHeader} with type | ||
5402 | @code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO_END}. If the client receives this | ||
5403 | message, it can proceed with the next request if any is pending. | ||
5404 | |||
5405 | @node libgnunetpeerinfo | ||
5406 | @subsection libgnunetpeerinfo | ||
5407 | |||
5408 | @c %**end of header | ||
5409 | |||
5410 | The PEERINFO API consists mainly of three different functionalities: | ||
5411 | maintaining a connection to the service, adding new information and retrieving | ||
5412 | information from the PEERINFO service. | ||
5413 | |||
5414 | |||
5415 | @menu | ||
5416 | * Connecting to the Service:: | ||
5417 | * Adding Information:: | ||
5418 | * Obtaining Information2:: | ||
5419 | @end menu | ||
5420 | |||
5421 | @node Connecting to the Service | ||
5422 | @subsubsection Connecting to the Service | ||
5423 | |||
5424 | @c %**end of header | ||
5425 | |||
5426 | To connect to the PEERINFO service, the function @code{GNUNET_PEERINFO_connect} | ||
5427 | is used, taking a configuration handle as an argument. To disconnect from | ||
5428 | PEERINFO, the function @code{GNUNET_PEERINFO_disconnect} has to be called with | ||
5429 | the PEERINFO handle returned from the connect function. | ||
5430 | |||
5431 | @node Adding Information | ||
5432 | @subsubsection Adding Information | ||
5433 | |||
5434 | @c %**end of header | ||
5435 | |||
5436 | @code{GNUNET_PEERINFO_add_peer} adds a new peer to the PEERINFO subsystem | ||
5437 | storage. This function takes the PEERINFO handle as an argument, the HELLO | ||
5438 | message to store and a continuation with a closure to be called with the result | ||
5439 | of the operation. @code{GNUNET_PEERINFO_add_peer} returns a handle to this | ||
5440 | operation, which allows cancelling it with the respective cancel function | ||
5441 | @code{GNUNET_PEERINFO_add_peer_cancel}. To retrieve information from PEERINFO | ||
5442 | you can iterate over all information stored with PEERINFO or you can tell | ||
5443 | PEERINFO to notify you when new peer information is available. | ||
5444 | |||
5445 | @node Obtaining Information2 | ||
5446 | @subsubsection Obtaining Information2 | ||
5447 | |||
5448 | @c %**end of header | ||
5449 | |||
5450 | To iterate over information in PEERINFO you use @code{GNUNET_PEERINFO_iterate}. | ||
5451 | This function expects the PEERINFO handle, a flag indicating whether HELLO | ||
5452 | messages intended for friend-only mode should be included, a timeout limiting | ||
5453 | how long the operation may take and a callback with a callback closure to be | ||
5454 | called for the results. If you want to obtain information for a specific peer, | ||
5455 | you can specify the peer identity; if this identity is NULL, information for | ||
5456 | all peers is returned. The function returns a handle that allows cancelling | ||
5457 | the operation using @code{GNUNET_PEERINFO_iterate_cancel}. | ||
5458 | |||
5459 | To get notified when peer information changes, you can use | ||
5460 | @code{GNUNET_PEERINFO_notify}. This function expects a configuration handle and | ||
5461 | a flag indicating whether friend-only HELLO messages should be included. The | ||
5462 | PEERINFO service will call the callback function to notify you about every | ||
5463 | change. The function returns a handle to cancel notifications | ||
5464 | with @code{GNUNET_PEERINFO_notify_cancel}. | ||
5465 | |||
5466 | |||
5467 | @node GNUnet's PEERSTORE subsystem | ||
5468 | @section GNUnet's PEERSTORE subsystem | ||
5469 | |||
5470 | @c %**end of header | ||
5471 | |||
5472 | GNUnet's PEERSTORE subsystem offers persistent per-peer storage for other | ||
5473 | GNUnet subsystems. GNUnet subsystems can use PEERSTORE to persistently store | ||
5474 | and retrieve arbitrary data. Each data record stored with PEERSTORE contains | ||
5475 | the following fields: | ||
5476 | |||
5477 | @itemize @bullet | ||
5478 | @item subsystem: Name of the subsystem responsible for the record. | ||
5479 | @item peerid: Identity of the peer this record is related to. | ||
5480 | @item key: a key string identifying the record. | ||
5481 | @item value: binary record value. | ||
5482 | @item expiry: record expiry date. | ||
5483 | @end itemize | ||
5484 | |||
5485 | @menu | ||
5486 | * Functionality:: | ||
5487 | * Architecture:: | ||
5488 | * libgnunetpeerstore:: | ||
5489 | @end menu | ||
5490 | |||
5491 | @node Functionality | ||
5492 | @subsection Functionality | ||
5493 | |||
5494 | @c %**end of header | ||
5495 | |||
5496 | Subsystems can store any type of value under a (subsystem, peerid, key) | ||
5497 | combination. A "replace" flag set during store operations forces the PEERSTORE | ||
5498 | to replace any old values stored under the same (subsystem, peerid, key) | ||
5499 | combination with the new value. Additionally, an expiry date is set after which | ||
5500 | the record is @emph{possibly} deleted by PEERSTORE. | ||
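
The effect of the "replace" flag can be modelled with a tiny in-memory store.
This is only a sketch under simplifying assumptions: @code{struct Store},
@code{same_slot} and @code{store_record} are hypothetical, peer identities and
values are plain strings, expiry is ignored, and the capacity is fixed. Without
the flag, several values may coexist under one (subsystem, peerid, key)
combination; with it, older values under that combination are dropped first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STORE_CAP 8

/* Hypothetical, simplified record; expiry is left out of this sketch. */
struct Record
{
  const char *subsystem;
  const char *peerid;
  const char *key;
  const char *value;
};

struct Store
{
  struct Record recs[STORE_CAP];
  size_t n;
};

/* Two records share a slot if subsystem, peerid and key all match. */
static int
same_slot (const struct Record *a, const struct Record *b)
{
  return (0 == strcmp (a->subsystem, b->subsystem)) &&
         (0 == strcmp (a->peerid, b->peerid)) &&
         (0 == strcmp (a->key, b->key));
}

/* Add `r` to the store. If `replace` is non-zero, first remove all values
 * stored under the same combination. Returns 0 on success, -1 if full. */
static int
store_record (struct Store *s, struct Record r, int replace)
{
  assert (NULL != s);
  if (replace)
  {
    size_t kept = 0;

    for (size_t i = 0; i < s->n; i++)
      if (! same_slot (&s->recs[i], &r))
        s->recs[kept++] = s->recs[i];
    s->n = kept;
  }
  if (STORE_CAP == s->n)
    return -1;
  s->recs[s->n++] = r;
  return 0;
}
```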
5501 | |||
5502 | Subsystems can iterate over all values stored under any of the following | ||
5503 | combinations of fields: | ||
5504 | |||
5505 | @itemize @bullet | ||
5506 | @item (subsystem) | ||
5507 | @item (subsystem, peerid) | ||
5508 | @item (subsystem, key) | ||
5509 | @item (subsystem, peerid, key) | ||
5510 | @end itemize | ||
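
The four combinations listed above amount to treating a missing peerid or key
as a wildcard while the subsystem is always required. A minimal sketch of that
matching rule, assuming a hypothetical @code{struct Record} in which the peer
identity is a plain string rather than a binary identity:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record key. */
struct Record
{
  const char *subsystem;
  const char *peerid;
  const char *key;
};

/* A NULL peerid or key matches anything; the subsystem must be given. */
static int
record_matches (const struct Record *r,
                const char *subsystem,
                const char *peerid,
                const char *key)
{
  assert (NULL != subsystem);
  if (0 != strcmp (r->subsystem, subsystem))
    return 0;
  if ( (NULL != peerid) && (0 != strcmp (r->peerid, peerid)) )
    return 0;
  if ( (NULL != key) && (0 != strcmp (r->key, key)) )
    return 0;
  return 1;
}

/* Count the records an iteration with the given filter would visit. */
static size_t
count_matches (const struct Record *rs, size_t n,
               const char *subsystem,
               const char *peerid,
               const char *key)
{
  size_t hits = 0;

  for (size_t i = 0; i < n; i++)
    if (record_matches (&rs[i], subsystem, peerid, key))
      hits++;
  return hits;
}
```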
5511 | |||
5512 | Subsystems can also request to be notified about any new values stored under a | ||
5513 | (subsystem, peerid, key) combination by sending a "watch" request to | ||
5514 | PEERSTORE. | ||
5515 | |||
5516 | @node Architecture | ||
5517 | @subsection Architecture | ||
5518 | |||
5519 | @c %**end of header | ||
5520 | |||
5521 | PEERSTORE implements the following components: | ||
5522 | |||
5523 | @itemize @bullet | ||
5524 | @item PEERSTORE service: Handles store, iterate and watch operations. | ||
5525 | @item PEERSTORE API: API to be used by other subsystems to communicate with and | ||
5526 | issue commands to the PEERSTORE service. | ||
5527 | @item PEERSTORE plugins: Handle the persistent storage. At the moment, only an | ||
5528 | "sqlite" plugin is implemented. | ||
5529 | @end itemize | ||
5530 | |||
5531 | @node libgnunetpeerstore | ||
5532 | @subsection libgnunetpeerstore | ||
5533 | |||
5534 | @c %**end of header | ||
5535 | |||
5536 | libgnunetpeerstore is the library containing the PEERSTORE API. Subsystems | ||
5537 | wishing to communicate with the PEERSTORE service use this API to open a | ||
5538 | connection to PEERSTORE. This is done by calling | ||
5539 | @code{GNUNET_PEERSTORE_connect} which returns a handle to the newly created | ||
5540 | connection. This handle has to be used with any further calls to the API. | ||
5541 | |||
5542 | To store a new record, the function @code{GNUNET_PEERSTORE_store} is to be used | ||
5543 | which requires the record fields and a continuation function that will be | ||
5544 | called by the API after the STORE request is sent to the PEERSTORE service. | ||
5545 | Note that calling the continuation function does not mean that the record is | ||
5546 | successfully stored, only that the STORE request has been successfully sent to | ||
5547 | the PEERSTORE service. @code{GNUNET_PEERSTORE_store_cancel} can be called to | ||
5548 | cancel the STORE request only before the continuation function has been called. | ||
5549 | |||
5550 | To iterate over stored records, the function @code{GNUNET_PEERSTORE_iterate} is | ||
5551 | to be used. @emph{peerid} and @emph{key} can be set to NULL. An iterator | ||
5552 | callback function will be called with each matching record found and a NULL | ||
5553 | record at the end to signal the end of result set. | ||
5554 | @code{GNUNET_PEERSTORE_iterate_cancel} can be used to cancel the ITERATE | ||
5555 | request before the iterator callback is called with a NULL record. | ||
5556 | |||
5557 | To be notified about new values stored under a (subsystem, peerid, key) | ||
5558 | combination, the function @code{GNUNET_PEERSTORE_watch} is to be used. This | ||
5559 | will register the watcher with the PEERSTORE service; any new records matching | ||
5560 | the given combination will trigger the callback function passed to | ||
5561 | @code{GNUNET_PEERSTORE_watch}. This continues until | ||
5562 | @code{GNUNET_PEERSTORE_watch_cancel} is called or the connection to the service | ||
5563 | is destroyed. | ||
5564 | |||
5565 | After the connection is no longer needed, the function | ||
5566 | @code{GNUNET_PEERSTORE_disconnect} can be called to disconnect from the | ||
5567 | PEERSTORE service. Any pending ITERATE or WATCH requests will be destroyed. If | ||
5568 | the @code{sync_first} flag is set to @code{GNUNET_YES}, the API will delay the | ||
5569 | disconnection until all pending STORE requests are sent to the PEERSTORE | ||
5570 | service; otherwise, the pending STORE requests will be destroyed as well. | ||
5571 | |||
5572 | @node GNUnet's SET Subsystem | ||
5573 | @section GNUnet's SET Subsystem | ||
5574 | |||
5575 | @c %**end of header | ||
5576 | |||
5577 | The SET service implements efficient set operations between two peers over a | ||
5578 | mesh tunnel. Currently, set union and set intersection are the only supported | ||
5579 | operations. Elements of a set consist of an @emph{element type} and arbitrary | ||
5580 | binary @emph{data}. The size of an element's data is limited to around 62 | ||
5581 | KB. | ||
5582 | |||
5583 | @menu | ||
5584 | * Local Sets:: | ||
5585 | * Set Modifications:: | ||
5586 | * Set Operations:: | ||
5587 | * Result Elements:: | ||
5588 | * libgnunetset:: | ||
5589 | * The SET Client-Service Protocol:: | ||
5590 | * The SET Intersection Peer-to-Peer Protocol:: | ||
5591 | * The SET Union Peer-to-Peer Protocol:: | ||
5592 | @end menu | ||
5593 | |||
5594 | @node Local Sets | ||
5595 | @subsection Local Sets | ||
5596 | |||
5597 | @c %**end of header | ||
5598 | |||
5599 | Sets created by a local client can be modified and reused for multiple | ||
5600 | operations. As each set operation requires potentially expensive special | ||
5601 | auxiliary data to be computed for each element of a set, a set can only | ||
5602 | participate in one type of set operation (i.e. union or intersection). The type | ||
5603 | of a set is determined upon its creation. If the elements of a set are needed | ||
5604 | for an operation of a different type, all of the set's elements must be copied | ||
5605 | to a new set of the appropriate type. | ||
5606 | |||
5607 | @node Set Modifications | ||
5608 | @subsection Set Modifications | ||
5609 | |||
5610 | @c %**end of header | ||
5611 | |||
5612 | Even when set operations are active, one can add elements to and remove them from a | ||
5613 | set. However, these changes will only be visible to operations that have been | ||
5614 | created after the changes have taken place. That is, every set operation only | ||
5615 | sees a snapshot of the set from the time the operation was started. This | ||
5616 | mechanism is @emph{not} implemented by copying the whole set, but by attaching | ||
5617 | @emph{generation information} to each element and operation. | ||
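
One way to picture the generation mechanism is the following sketch. It is an illustration under stated assumptions, not the service's actual data layout: the struct fields, the fixed capacity of 16 and the rule that the generation counter is bumped whenever an operation starts are all choices made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GEN_NEVER 0u /* marker: element has not been removed */
#define MAX_ELEMS 16

/* Hypothetical bookkeeping; the real service's layout may differ. */
struct Element
{
  int id;
  unsigned int gen_added;   /* generation in which the element was added */
  unsigned int gen_removed; /* generation of removal, or GEN_NEVER */
};

struct Set
{
  struct Element elems[MAX_ELEMS];
  size_t count;
  unsigned int current_gen; /* starts at 1, bumped when an operation starts */
};

static void
set_init (struct Set *s)
{
  s->count = 0;
  s->current_gen = 1;
}

static void
set_add (struct Set *s, int id)
{
  assert (s->count < MAX_ELEMS);
  s->elems[s->count++] = (struct Element) { id, s->current_gen, GEN_NEVER };
}

static void
set_remove (struct Set *s, int id)
{
  /* Mark the element instead of deleting it, so older snapshots keep it. */
  for (size_t i = 0; i < s->count; i++)
    if ((s->elems[i].id == id) && (GEN_NEVER == s->elems[i].gen_removed))
      s->elems[i].gen_removed = s->current_gen;
}

/* Starting an operation fixes its snapshot generation. */
static unsigned int
operation_start (struct Set *s)
{
  return s->current_gen++;
}

/* An operation sees elements added up to its generation and not yet
   removed at that generation. */
static bool
operation_sees (const struct Set *s, unsigned int op_gen, int id)
{
  for (size_t i = 0; i < s->count; i++)
  {
    const struct Element *e = &s->elems[i];

    if ((e->id == id) && (e->gen_added <= op_gen) &&
        ((GEN_NEVER == e->gen_removed) || (e->gen_removed > op_gen)))
      return true;
  }
  return false;
}
```

The point of the design is that no element is ever copied: a modification costs one generation stamp, and each operation filters the shared element list by its own generation.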
5618 | |||
5619 | @node Set Operations | ||
5620 | @subsection Set Operations | ||
5621 | |||
5622 | @c %**end of header | ||
5623 | |||
5624 | Set operations can be started in two ways: Either by accepting an operation | ||
5625 | request from a remote peer, or by requesting a set operation from a remote | ||
5626 | peer. Set operations are uniquely identified by the involved @emph{peers}, an | ||
5627 | @emph{application id} and the @emph{operation type}. | ||
5628 | |||
5629 | The client is notified of incoming set operations by @emph{set listeners}. A | ||
5630 | set listener listens for incoming operations of a specific operation type and | ||
5631 | application id. Once notified of an incoming set request, the client can | ||
5632 | accept the set request (providing a local set for the operation) or reject | ||
5633 | it. | ||
5634 | |||
5635 | @node Result Elements | ||
5636 | @subsection Result Elements | ||
5637 | |||
5638 | @c %**end of header | ||
5639 | |||
5640 | The SET service has three @emph{result modes} that determine how an operation's | ||
5641 | result set is delivered to the client: | ||
5642 | |||
5643 | @itemize @bullet | ||
5644 | @item @strong{Full Result Set.} All elements of the set resulting from the set | ||
5645 | operation are returned to the client. | ||
5646 | @item @strong{Added Elements.} Only elements that result from the operation and | ||
5647 | are not already in the local peer's set are returned. Note that for some | ||
5648 | operations (like set intersection) this result mode will never return any | ||
5649 | elements. This can be useful if only the remote peer is actually interested in | ||
5650 | the result of the set operation. | ||
5651 | @item @strong{Removed Elements.} Only elements that are in the local peer's | ||
5652 | initial set but not in the operation's result set are returned. Note that for | ||
5653 | some operations (like set union) this result mode will never return any | ||
5654 | elements. This can be useful if only the remote peer is actually interested in | ||
5655 | the result of the set operation. | ||
5656 | @end itemize | ||
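
The three result modes can be expressed as simple set arithmetic. The sketch below models sets as 32-bit masks (bit @emph{i} set means element @emph{i} is present) purely for brevity; the enum and function names are hypothetical and do not correspond to identifiers in libgnunetset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the three result modes described above. */
enum ResultMode
{
  RESULT_FULL,
  RESULT_ADDED,
  RESULT_REMOVED
};

/* Sets are modeled as bitmasks: bit i set means element i is present.
   'local' is the local peer's initial set, 'result' the operation's
   result set. Returns the elements delivered to the client. */
static uint32_t
reported_elements (uint32_t local, uint32_t result, enum ResultMode mode)
{
  switch (mode)
  {
  case RESULT_FULL:
    return result;          /* everything in the result set */
  case RESULT_ADDED:
    return result & ~local; /* in the result, not in the local set */
  case RESULT_REMOVED:
    return local & ~result; /* in the local set, not in the result */
  }
  assert (0); /* unreachable for valid modes */
  return 0;
}
```

Because a union result always contains the local set and an intersection result is always contained in it, the "removed" mode is empty for union and the "added" mode is empty for intersection, as noted in the list above.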
5657 | |||
5658 | @node libgnunetset | ||
5659 | @subsection libgnunetset | ||
5660 | |||
5661 | @c %**end of header | ||
5662 | |||
5663 | @menu | ||
5664 | * Sets:: | ||
5665 | * Listeners:: | ||
5666 | * Operations:: | ||
5667 | * Supplying a Set:: | ||
5668 | * The Result Callback:: | ||
5669 | @end menu | ||
5670 | |||
5671 | @node Sets | ||
5672 | @subsubsection Sets | ||
5673 | |||
5674 | @c %**end of header | ||
5675 | |||
5676 | New sets are created with @code{GNUNET_SET_create}. Both the local peer's | ||
5677 | configuration (as each set has its own client connection) and the operation | ||
5678 | type must be specified. The set exists until either the client calls | ||
5679 | @code{GNUNET_SET_destroy} or the client's connection to the service is | ||
5680 | disrupted. In the latter case, the client is notified by the return value of | ||
5681 | functions dealing with sets. This return value must always be checked. | ||
5682 | |||
5683 | Elements are added and removed with @code{GNUNET_SET_add_element} and | ||
5684 | @code{GNUNET_SET_remove_element}. | ||
5685 | |||
5686 | @node Listeners | ||
5687 | @subsubsection Listeners | ||
5688 | |||
5689 | @c %**end of header | ||
5690 | |||
5691 | Listeners are created with @code{GNUNET_SET_listen}. Each time a remote | ||
5692 | peer suggests a set operation with an application id and operation type | ||
5693 | matching a listener, the listener's callback is invoked. The client then must | ||
5694 | synchronously call either @code{GNUNET_SET_accept} or @code{GNUNET_SET_reject}. | ||
5695 | Note that the operation will not be started until the client calls | ||
5696 | @code{GNUNET_SET_commit} (see Section "Supplying a Set"). | ||
5697 | |||
5698 | @node Operations | ||
5699 | @subsubsection Operations | ||
5700 | |||
5701 | @c %**end of header | ||
5702 | |||
5703 | Operations to be initiated by the local peer are created with | ||
5704 | @code{GNUNET_SET_prepare}. Note that the operation will not be started until | ||
5705 | the client calls @code{GNUNET_SET_commit} (see Section "Supplying a | ||
5706 | Set"). | ||
5707 | |||
5708 | @node Supplying a Set | ||
5709 | @subsubsection Supplying a Set | ||
5710 | |||
5711 | @c %**end of header | ||
5712 | |||
5713 | To create symmetry between the two ways of starting a set operation (accepting | ||
5714 | and initiating it), the operation handles returned by @code{GNUNET_SET_accept} | ||
5715 | and @code{GNUNET_SET_prepare} do not yet have a set to operate on, thus they | ||
5716 | cannot do any work yet. | ||
5717 | |||
5718 | The client must call @code{GNUNET_SET_commit} to specify a set to use for an | ||
5719 | operation. @code{GNUNET_SET_commit} may only be called once per set | ||
5720 | operation. | ||
5721 | |||
5722 | @node The Result Callback | ||
5723 | @subsubsection The Result Callback | ||
5724 | |||
5725 | @c %**end of header | ||
5726 | |||
5727 | Clients must specify both a result mode and a result callback with | ||
5728 | @code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare}. The result callback | ||
5729 | is invoked with a status indicating either that an element was received, or the operation | ||
5730 | failed or succeeded. The interpretation of the received element depends on the | ||
5731 | result mode. The callback needs to know which result mode it is used in, as the | ||
5732 | arguments do not indicate if an element is part of the full result set, or if | ||
5733 | it is in the difference between the original set and the final set. | ||
5734 | |||
5735 | @node The SET Client-Service Protocol | ||
5736 | @subsection The SET Client-Service Protocol | ||
5737 | |||
5738 | @c %**end of header | ||
5739 | |||
5740 | @menu | ||
5741 | * Creating Sets:: | ||
5742 | * Listeners2:: | ||
5743 | * Initiating Operations:: | ||
5744 | * Modifying Sets:: | ||
5745 | * Results and Operation Status:: | ||
5746 | * Iterating Sets:: | ||
5747 | @end menu | ||
5748 | |||
5749 | @node Creating Sets | ||
5750 | @subsubsection Creating Sets | ||
5751 | |||
5752 | @c %**end of header | ||
5753 | |||
5754 | For each set of a client, there exists a client connection to the service. Sets | ||
5755 | are created by sending the @code{GNUNET_SERVICE_SET_CREATE} message over a new | ||
5756 | client connection. Multiple operations for one set are multiplexed over one | ||
5757 | client connection, using a request id supplied by the client. | ||
5758 | |||
5759 | @node Listeners2 | ||
5760 | @subsubsection Listeners2 | ||
5761 | |||
5762 | @c %**end of header | ||
5763 | |||
5764 | Each listener also requires a separate client connection. By sending the | ||
5765 | @code{GNUNET_SERVICE_SET_LISTEN} message, the client notifies the service of | ||
5766 | the application id and operation type it is interested in. A client rejects an | ||
5767 | incoming request by sending @code{GNUNET_SERVICE_SET_REJECT} on the listener's | ||
5768 | client connection. In contrast, when accepting an incoming request, a | ||
5769 | @code{GNUNET_SERVICE_SET_ACCEPT} message must be sent over the client connection of the set that is | ||
5770 | supplied for the set operation. | ||
5771 | |||
5772 | @node Initiating Operations | ||
5773 | @subsubsection Initiating Operations | ||
5774 | |||
5775 | @c %**end of header | ||
5776 | |||
5777 | Operations with remote peers are initiated by sending a | ||
5778 | @code{GNUNET_SERVICE_SET_EVALUATE} message to the service. The client | ||
5779 | connection over which this message is sent determines the set to use. | ||
5780 | |||
5781 | @node Modifying Sets | ||
5782 | @subsubsection Modifying Sets | ||
5783 | |||
5784 | @c %**end of header | ||
5785 | |||
5786 | Sets are modified with the @code{GNUNET_SERVICE_SET_ADD} and | ||
5787 | @code{GNUNET_SERVICE_SET_REMOVE} messages. | ||
5788 | |||
5789 | |||
5790 | @c %@menu | ||
5791 | @c %* Results and Operation Status:: | ||
5792 | @c %* Iterating Sets:: | ||
5793 | @c %@end menu | ||
5794 | |||
5795 | @node Results and Operation Status | ||
5796 | @subsubsection Results and Operation Status | ||
5797 | @c %**end of header | ||
5798 | |||
5799 | The service notifies the client of result elements and success/failure of a set | ||
5800 | operation with the @code{GNUNET_SERVICE_SET_RESULT} message. | ||
5801 | |||
5802 | @node Iterating Sets | ||
5803 | @subsubsection Iterating Sets | ||
5804 | |||
5805 | @c %**end of header | ||
5806 | |||
5807 | All elements of a set can be requested by sending | ||
5808 | @code{GNUNET_SERVICE_SET_ITER_REQUEST}. The server responds with | ||
5809 | @code{GNUNET_SERVICE_SET_ITER_ELEMENT} and eventually terminates the iteration | ||
5810 | with @code{GNUNET_SERVICE_SET_ITER_DONE}. After each received element, the | ||
5811 | client must send @code{GNUNET_SERVICE_SET_ITER_ACK}. Note that only one set | ||
5812 | iteration may be active for a set at any given time. | ||
5813 | |||
5814 | @node The SET Intersection Peer-to-Peer Protocol | ||
5815 | @subsection The SET Intersection Peer-to-Peer Protocol | ||
5816 | |||
5817 | @c %**end of header | ||
5818 | |||
5819 | The intersection protocol operates over CADET and starts with a | ||
5820 | GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST being sent by the peer initiating | ||
5821 | the operation to the peer listening for inbound requests. It includes the | ||
5822 | number of elements of the initiating peer, which is used to decide which side | ||
5823 | will send a Bloom filter first. | ||
5824 | |||
5825 | The listening peer checks if the operation type and application identifier are | ||
5826 | acceptable for its current state. If not, it responds with a | ||
5827 | GNUNET_MESSAGE_TYPE_SET_RESULT and a status of GNUNET_SET_STATUS_FAILURE (and | ||
5828 | terminates the CADET channel). | ||
5829 | |||
5830 | If the application accepts the request, the listener sends back a | ||
5831 | GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO if it has more elements | ||
5832 | in the set than the client. Otherwise, it immediately starts with the Bloom | ||
5833 | filter exchange. If the initiator receives a | ||
5834 | GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO response, it begins the | ||
5835 | Bloom filter exchange, unless the set size is indicated to be zero, in which | ||
5836 | case the intersection is considered finished after just the initial | ||
5837 | handshake. | ||
5838 | |||
5839 | |||
5840 | @menu | ||
5841 | * The Bloom filter exchange:: | ||
5842 | * Salt:: | ||
5843 | @end menu | ||
5844 | |||
5845 | @node The Bloom filter exchange | ||
5846 | @subsubsection The Bloom filter exchange | ||
5847 | |||
5848 | @c %**end of header | ||
5849 | |||
5850 | In this phase, each peer transmits a Bloom filter over the remaining keys of | ||
5851 | the local set to the other peer using a | ||
5852 | GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_BF message. This message additionally | ||
5853 | includes the number of elements left in the sender's set, as well as the XOR | ||
5854 | over all of the keys in that set. | ||
5855 | |||
5856 | The number of bits 'k' set per element in the Bloom filter is calculated based | ||
5857 | on the relative size of the two sets. Furthermore, the size of the Bloom filter | ||
5858 | is calculated based on 'k' and the number of elements in the set to maximize | ||
5859 | the amount of data filtered per byte transmitted on the wire (while avoiding an | ||
5860 | excessively high number of iterations). | ||
5861 | |||
5862 | The receiver of the message removes all elements from its local set that do not | ||
5863 | pass the Bloom filter test. It then checks if the set size of the sender and | ||
5864 | the XOR over the keys match what is left of its own set. If they do, it sends | ||
5865 | a GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_DONE back to indicate that the | ||
5866 | latest set is the final result. Otherwise, the receiver starts another Bloom | ||
5867 | filter exchange, except this time as the sender. | ||
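
The termination test of the exchange (same element count and same XOR over all keys) can be sketched as follows. The protocol's keys are 512-bit hash codes; this illustration uses 64-bit keys to stay short, and @code{struct SetSummary}, @code{summarize} and @code{intersection_done} are hypothetical names, not GNUnet identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* What each Bloom filter message carries besides the filter itself:
   the number of remaining elements and the XOR over their keys.
   Real keys are 512-bit hashes; 64 bits keep this sketch short. */
struct SetSummary
{
  uint32_t count;
  uint64_t xor_keys;
};

static struct SetSummary
summarize (const uint64_t *keys, size_t n)
{
  struct SetSummary s = { 0, 0 };

  assert ((NULL != keys) || (0 == n));
  for (size_t i = 0; i < n; i++)
  {
    s.count++;
    s.xor_keys ^= keys[i]; /* XOR is order-independent */
  }
  return s;
}

/* The receiver is done when its remaining set matches the sender's
   summary; otherwise it answers with another Bloom filter. */
static bool
intersection_done (struct SetSummary mine, struct SetSummary theirs)
{
  return (mine.count == theirs.count) && (mine.xor_keys == theirs.xor_keys);
}
```

Since XOR is commutative, both peers arrive at the same summary regardless of the order in which they store their elements.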
5868 | |||
5869 | @node Salt | ||
5870 | @subsubsection Salt | ||
5871 | |||
5872 | @c %**end of header | ||
5873 | |||
5874 | Bloom filter operations are probabilistic: with some non-zero probability the | ||
5875 | test may incorrectly say an element is in the set, even though it is not. | ||
5876 | |||
5877 | To mitigate this problem, the intersection protocol iterates exchanging Bloom | ||
5878 | filters using a different random 32-bit salt in each iteration (the salt is | ||
5879 | also included in the message). With different salts, set operations may fail | ||
5880 | for different elements. When the results from the executions are merged, the | ||
5881 | probability of failure approaches zero. | ||
5882 | |||
5883 | The iterations terminate once both peers have established that they have sets | ||
5884 | of the same size, and where the XOR over all keys computes the same 512-bit | ||
5885 | value (leaving a failure probability of @math{2^{-511}}). | ||
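
The effect of the salt can be seen in a toy Bloom filter. This is a sketch under simplifying assumptions: the filter is a single 64-bit word, two bits are set per element, and the mixing function is a generic 64-bit finalizer standing in for GNUnet's hashing. None of the names below are GNUnet identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BF_BITS 64u /* toy size: the whole filter fits in one word */
#define BF_K 2u     /* bits set per element */

/* Map (key, salt, i) to a bit position. The mixer is a generic 64-bit
   finalizer used as a stand-in; it is not GNUnet's hash function. */
static unsigned int
bit_index (uint64_t key, uint32_t salt, unsigned int i)
{
  uint64_t x = key ^ (((uint64_t) salt << 32) | i);

  x ^= x >> 30;
  x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 27;
  x *= 0x94d049bb133111ebULL;
  x ^= x >> 31;
  return (unsigned int) (x % BF_BITS);
}

/* Sender side: build a filter over the remaining keys. */
static uint64_t
bf_build (const uint64_t *keys, size_t n, uint32_t salt)
{
  uint64_t bf = 0;

  assert ((NULL != keys) || (0 == n));
  for (size_t j = 0; j < n; j++)
    for (unsigned int i = 0; i < BF_K; i++)
      bf |= 1ULL << bit_index (keys[j], salt, i);
  return bf;
}

/* True if the key may be in the sender's set (false positives are
   possible, false negatives are not). */
static bool
bf_test (uint64_t bf, uint64_t key, uint32_t salt)
{
  for (unsigned int i = 0; i < BF_K; i++)
    if (0 == ((bf >> bit_index (key, salt, i)) & 1))
      return false;
  return true;
}

/* Receiver side: drop keys that fail the test, keep order, return the
   number of keys left. */
static size_t
bf_filter (uint64_t bf, uint32_t salt, uint64_t *keys, size_t n)
{
  size_t kept = 0;

  for (size_t j = 0; j < n; j++)
    if (bf_test (bf, keys[j], salt))
      keys[kept++] = keys[j];
  return kept;
}
```

A key that survives one round only because of a false positive maps to different bit positions under the next round's salt, so it is unlikely to survive repeatedly; genuine common elements always pass.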
5886 | |||
5887 | @node The SET Union Peer-to-Peer Protocol | ||
5888 | @subsection The SET Union Peer-to-Peer Protocol | ||
5889 | |||
5890 | @c %**end of header | ||
5891 | |||
5892 | The SET union protocol is based on Eppstein's efficient set reconciliation | ||
5893 | without prior context. You should read this paper first if you want to | ||
5894 | understand the protocol. | ||
5895 | |||
5896 | The union protocol operates over CADET and starts with a | ||
5897 | GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST being sent by the peer initiating | ||
5898 | the operation to the peer listening for inbound requests. It includes the | ||
5899 | number of elements of the initiating peer, which is currently not used. | ||
5900 | |||
5901 | The listening peer checks if the operation type and application identifier are | ||
5902 | acceptable for its current state. If not, it responds with a | ||
5903 | GNUNET_MESSAGE_TYPE_SET_RESULT and a status of GNUNET_SET_STATUS_FAILURE (and | ||
5904 | terminates the CADET channel). | ||
5905 | |||
5906 | If the application accepts the request, it sends back a strata estimator using | ||
5907 | a message of type GNUNET_MESSAGE_TYPE_SET_UNION_P2P_SE. The initiator evaluates | ||
5908 | the strata estimator and initiates the exchange of invertible Bloom filters, | ||
5909 | sending a GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF. | ||
5910 | |||
5911 | During the IBF exchange, if the receiver cannot invert the Bloom filter or | ||
5912 | detects a cycle, it sends a larger IBF in response (up to a defined maximum | ||
5913 | limit; if that limit is reached, the operation fails). Elements decoded while | ||
5914 | processing the IBF are transmitted to the other peer using | ||
5915 | GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS, or requested from the other peer using | ||
5916 | GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS messages, depending on the sign | ||
5917 | observed during decoding of the IBF. Peers respond to a | ||
5918 | GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS message with the respective | ||
5919 | element in a GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS message. If the IBF fully | ||
5920 | decodes, the peer responds with a GNUNET_MESSAGE_TYPE_SET_UNION_P2P_DONE | ||
5921 | message instead of another GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF. | ||
5922 | |||
5923 | All Bloom filter operations use a salt to mingle keys before hashing them into | ||
5924 | buckets, such that future iterations have a fresh chance of succeeding if they | ||
5925 | failed due to collisions before. | ||
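
To make the role of the sign during IBF decoding concrete, here is a heavily simplified invertible Bloom filter. It is a sketch only: keys are 64-bit integers, the table has 30 cells split into three partitions, the mixer is a generic stand-in hash, and no salt is applied. All names are hypothetical and the layout is not taken from GNUnet's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define IBF_K 3u      /* cells touched per key */
#define IBF_CELLS 30u /* total cells; a multiple of IBF_K */

struct Cell
{
  int32_t count;
  uint64_t key_sum;  /* XOR of all keys in this cell */
  uint64_t hash_sum; /* XOR of mix(key) for all keys in this cell */
};

struct Ibf
{
  struct Cell cells[IBF_CELLS];
};

/* Stand-in 64-bit mixer; not GNUnet's hash function. */
static uint64_t
mix (uint64_t x)
{
  x ^= x >> 30;
  x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 27;
  x *= 0x94d049bb133111ebULL;
  x ^= x >> 31;
  return x;
}

static void
ibf_init (struct Ibf *ibf)
{
  memset (ibf, 0, sizeof (*ibf));
}

/* Insert (sign = +1) or remove (sign = -1) a key. Each key lands in one
   cell per partition, so its IBF_K cells are always distinct. */
static void
ibf_update (struct Ibf *ibf, uint64_t key, int sign)
{
  const unsigned int part = IBF_CELLS / IBF_K;

  assert ((1 == sign) || (-1 == sign));
  for (unsigned int i = 0; i < IBF_K; i++)
  {
    uint64_t h = mix (key + 0x9e3779b97f4a7c15ULL * (i + 1));
    struct Cell *c = &ibf->cells[i * part + (unsigned int) (h % part)];

    c->count += sign;
    c->key_sum ^= key;
    c->hash_sum ^= mix (key);
  }
}

/* a := a - b, cell by cell; elements present in both sets cancel out. */
static void
ibf_subtract (struct Ibf *a, const struct Ibf *b)
{
  for (unsigned int i = 0; i < IBF_CELLS; i++)
  {
    a->cells[i].count -= b->cells[i].count;
    a->cells[i].key_sum ^= b->cells[i].key_sum;
    a->cells[i].hash_sum ^= b->cells[i].hash_sum;
  }
}

/* Extract one element from a difference IBF. On success, *sign is +1 if
   the key is only in the first set and -1 if only in the second. */
static bool
ibf_decode_one (struct Ibf *ibf, uint64_t *key, int *sign)
{
  for (unsigned int i = 0; i < IBF_CELLS; i++)
  {
    const struct Cell *c = &ibf->cells[i];

    if (((1 == c->count) || (-1 == c->count)) &&
        (mix (c->key_sum) == c->hash_sum))
    {
      *key = c->key_sum;
      *sign = c->count;
      ibf_update (ibf, *key, -*sign); /* peel the element off */
      return true;
    }
  }
  return false; /* empty, or undecodable: a larger IBF would be needed */
}

static bool
ibf_is_empty (const struct Ibf *ibf)
{
  for (unsigned int i = 0; i < IBF_CELLS; i++)
    if ((0 != ibf->cells[i].count) || (0 != ibf->cells[i].key_sum) ||
        (0 != ibf->cells[i].hash_sum))
      return false;
  return true;
}
```

In terms of the protocol text, a decoded key with positive sign corresponds to an element the decoding peer would transmit, and a negative sign to one it would request. A decode that stops before the IBF is empty corresponds to the case where a larger IBF is sent in response.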
5926 | |||
5927 | @node GNUnet's STATISTICS subsystem | ||
5928 | @section GNUnet's STATISTICS subsystem | ||
5929 | |||
5930 | @c %**end of header | ||
5931 | |||
5932 | In GNUnet, the STATISTICS subsystem offers a central place for all subsystems | ||
5933 | to publish unsigned 64-bit integer run-time statistics. Keeping this | ||
5934 | information centrally means that there is a unified way for the user to obtain | ||
5935 | data on all subsystems, and individual subsystems do not have to always include | ||
5936 | a custom data export method for performance metrics and other statistics. For | ||
5937 | example, the TRANSPORT system uses STATISTICS to update information about the | ||
5938 | number of directly connected peers and the bandwidth that has been consumed by | ||
5939 | the various plugins. This information is valuable for diagnosing connectivity | ||
5940 | and performance issues. | ||
5941 | |||
5942 | Following the GNUnet service architecture, the STATISTICS subsystem is divided | ||
5943 | into an API which is exposed through the header | ||
5944 | @strong{gnunet_statistics_service.h} and the STATISTICS service | ||
5945 | @strong{gnunet-service-statistics}. The @strong{gnunet-statistics} command-line | ||
5946 | tool can be used to obtain (and change) information about the values stored by | ||
5947 | the STATISTICS service. The STATISTICS service does not communicate with other | ||
5948 | peers. | ||
5949 | |||
5950 | Data is stored in the STATISTICS service in the form of tuples | ||
5951 | @strong{(subsystem, name, value, persistence)}. The subsystem determines to | ||
5952 | which GNUnet subsystem the data belongs. The name is the label under which the | ||
5953 | value is stored. It uniquely identifies the record from among other records | ||
5954 | belonging to the same subsystem. In some parts of the code, the pair | ||
5955 | @strong{(subsystem, name)} is called a @strong{statistic} as it identifies the | ||
5956 | values stored in the STATISTICS service. The persistence flag determines if the | ||
5957 | record has to be preserved across service restarts. A record is said to be | ||
5958 | persistent if this flag is set for it; if not, the record is treated as a | ||
5959 | non-persistent record and it is lost after service restart. Persistent records | ||
5960 | are written to and read from the file @strong{statistics.data} before shutdown | ||
5961 | and upon startup. The file is located in the HOME directory of the peer. | ||
5962 | |||
5963 | An anomaly of the STATISTICS service is that it does not terminate immediately | ||
5964 | upon receiving a shutdown signal if it has any clients connected to it. It | ||
5965 | waits for all the clients that are not monitors to close their connections | ||
5966 | before terminating itself. This is to prevent the loss of data during peer | ||
5967 | shutdown --- delaying the STATISTICS service shutdown helps other services to | ||
5968 | store important data to STATISTICS during shutdown. | ||
5969 | |||
5970 | @menu | ||
5971 | * libgnunetstatistics:: | ||
5972 | * The STATISTICS Client-Service Protocol:: | ||
5973 | @end menu | ||
5974 | |||
5975 | @node libgnunetstatistics | ||
5976 | @subsection libgnunetstatistics | ||
5977 | |||
5978 | @c %**end of header | ||
5979 | |||
5980 | @strong{libgnunetstatistics} is the library containing the API for the | ||
5981 | STATISTICS subsystem. Any process that needs to use STATISTICS should use this | ||
5982 | API to open a connection to the STATISTICS service. This is done by calling | ||
5983 | the function @code{GNUNET_STATISTICS_create()}. This function takes the | ||
5984 | name of the subsystem which is trying to use STATISTICS and a configuration. All | ||
5985 | values written to STATISTICS with this connection will be placed in the section | ||
5986 | corresponding to the given subsystem's name. The connection to STATISTICS can | ||
5987 | be destroyed with the function @code{GNUNET_STATISTICS_destroy()}. This function | ||
5988 | allows for the connection to be destroyed immediately or upon transferring all | ||
5989 | pending write requests to the service. | ||
5990 | |||
5991 | Note: The STATISTICS subsystem can be disabled by setting @code{DISABLE = YES} | ||
5992 | under the @code{[STATISTICS]} section in the configuration. With such a | ||
5993 | configuration all calls to @code{GNUNET_STATISTICS_create()} return @code{NULL} | ||
5994 | as the STATISTICS subsystem is unavailable and no other functions from the API | ||
5995 | can be used. | ||
5996 | |||
5997 | |||
5998 | @menu | ||
5999 | * Statistics retrieval:: | ||
6000 | * Setting statistics and updating them:: | ||
6001 | * Watches:: | ||
6002 | @end menu | ||
6003 | |||
6004 | @node Statistics retrieval | ||
6005 | @subsubsection Statistics retrieval | ||
6006 | |||
6007 | @c %**end of header | ||
6008 | |||
6009 | Once a connection to the statistics service is obtained, information about any | ||
6010 | other subsystem which uses STATISTICS can be retrieved with the function | ||
6011 | @code{GNUNET_STATISTICS_get()}. This function takes the connection handle, the name of | ||
6012 | the subsystem whose information we are interested in (a @code{NULL} value will | ||
6013 | retrieve information of all available subsystems using STATISTICS), the name of | ||
6014 | the statistic we are interested in (a @code{NULL} value will retrieve all | ||
6015 | available statistics), a continuation callback which is called when all of the | ||
6016 | requested information is retrieved, an iterator callback which is called for | ||
6017 | each parameter in the retrieved information and a closure for the | ||
6018 | aforementioned callbacks. The library then invokes the iterator callback for | ||
6019 | each value matching the request. | ||
6020 | |||
6021 | A call to @code{GNUNET_STATISTICS_get()} is asynchronous and can be canceled with | ||
6022 | the function @code{GNUNET_STATISTICS_get_cancel()}. This is helpful when | ||
6023 | retrieving statistics takes too long and especially when we want to shut down | ||
6024 | and clean up everything. | ||
6025 | |||
6026 | @node Setting statistics and updating them | ||
6027 | @subsubsection Setting statistics and updating them | ||
6028 | |||
6029 | @c %**end of header | ||
6030 | |||
6031 | So far we have seen how to retrieve statistics; here we will learn how we can | ||
6032 | set statistics and update them so that other subsystems can retrieve them. | ||
6033 | |||
6034 | A new statistic can be set using the function @code{GNUNET_STATISTICS_set()}. | ||
6035 | This function takes the name of the statistic and its value and a flag to make | ||
6036 | the statistic persistent. The value of the statistic should be of the type | ||
6037 | @code{uint64_t}. The function does not take the name of the subsystem; it is | ||
6038 | determined from the previous @code{GNUNET_STATISTICS_create()} invocation. If | ||
6039 | the given statistic is already present, its value is overwritten. | ||
6040 | |||
6041 | An existing statistic can be updated, i.e. its value can be increased or | ||
6042 | decreased by an amount with the function @code{GNUNET_STATISTICS_update()}. The | ||
6043 | parameters to this function are similar to @code{GNUNET_STATISTICS_set()}, | ||
6044 | except that it takes the amount to be changed as a type @code{int64_t} instead | ||
6045 | of the value. | ||
6046 | |||
6047 | The library will combine multiple set or update operations into one message if | ||
6048 | the client performs requests at a rate that is faster than the available IPC | ||
6049 | with the STATISTICS service. Thus, the client does not have to worry about | ||
6050 | sending requests too quickly. | ||
6051 | |||
6052 | @node Watches | ||
6053 | @subsubsection Watches | ||
6054 | |||
6055 | @c %**end of header | ||
6056 | |||
6057 | An interesting feature of STATISTICS lies in serving notifications whenever a | ||
6058 | statistic of our interest is modified. This is achieved by registering a watch | ||
6059 | through the function @code{GNUNET_STATISTICS_watch()}. The parameters of this | ||
6060 | function are similar to those of @code{GNUNET_STATISTICS_get()}. Changes to the | ||
6061 | respective statistic's value will then cause the given iterator callback to be | ||
6062 | called. Note: A watch can only be registered for a specific statistic. Hence | ||
6063 | the subsystem name and the parameter name cannot be @code{NULL} in a call to | ||
6064 | @code{GNUNET_STATISTICS_watch()}. | ||
6065 | |||
6066 | A registered watch will keep reporting value changes until | ||
6067 | @code{GNUNET_STATISTICS_watch_cancel()} is called with the same parameters that | ||
6068 | were used for registering the watch. | ||
6069 | |||
6070 | @node The STATISTICS Client-Service Protocol | ||
6071 | @subsection The STATISTICS Client-Service Protocol | ||
6072 | @c %**end of header | ||
6073 | |||
6074 | |||
6075 | @menu | ||
6076 | * Statistics retrieval2:: | ||
6077 | * Setting and updating statistics:: | ||
6078 | * Watching for updates:: | ||
6079 | @end menu | ||
6080 | |||
6081 | @node Statistics retrieval2 | ||
6082 | @subsubsection Statistics retrieval2 | ||
6083 | |||
6084 | @c %**end of header | ||
6085 | |||
6086 | To retrieve statistics, the client transmits a message of type | ||
6087 | @code{GNUNET_MESSAGE_TYPE_STATISTICS_GET} containing the given subsystem name | ||
6088 | and statistic parameter to the STATISTICS service. The service responds with a | ||
6089 | message of type @code{GNUNET_MESSAGE_TYPE_STATISTICS_VALUE} for each of the | ||
6090 | statistics parameters that match the client request. The end of | ||
6091 | the retrieved information is signaled by the service by sending a message of type | ||
6092 | @code{GNUNET_MESSAGE_TYPE_STATISTICS_END}. | ||
6093 | |||
6094 | @node Setting and updating statistics | ||
6095 | @subsubsection Setting and updating statistics | ||
6096 | |||
6097 | @c %**end of header | ||
6098 | |||
6099 | The subsystem name, parameter name, its value and the persistence flag are | ||
6100 | communicated to the service through the message | ||
6101 | @code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}. | ||
6102 | |||
6103 | When the service receives a message of type | ||
6104 | @code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}, it retrieves the subsystem name and | ||
6105 | checks for a statistic parameter matching the name given in the message. | ||
6106 | If a statistic parameter is found, the value is overwritten by the new value | ||
6107 | from the message; if not found then a new statistic parameter is created with | ||
6108 | the given name and value. | ||
6109 | |||
6110 | In addition to just setting an absolute value, it is possible to perform a | ||
6111 | relative update by sending a message of type | ||
6112 | @code{GNUNET_MESSAGE_TYPE_STATISTICS_SET} with an update flag | ||
6113 | (@code{GNUNET_STATISTICS_SETFLAG_RELATIVE}) signifying that the value in the | ||
6114 | message should be treated as an update value. | ||
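
The difference between an absolute SET and a relative update can be modeled with a small in-memory store. This is a sketch, not the service's code: the struct layout, the fixed capacity, the function names and, in particular, the choice to clamp a value at zero when a negative delta exceeds it are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_STATS 32

/* Hypothetical in-memory form of a (subsystem, name, value, persistence)
   tuple; the real service's structures differ. */
struct Stat
{
  char subsystem[32];
  char name[64];
  uint64_t value;
  int persistent;
};

struct Store
{
  struct Stat stats[MAX_STATS];
  size_t count;
};

/* Look up a statistic, creating it with value 0 if it does not exist. */
static struct Stat *
find_or_create (struct Store *st, const char *subsystem, const char *name)
{
  for (size_t i = 0; i < st->count; i++)
    if ((0 == strcmp (st->stats[i].subsystem, subsystem)) &&
        (0 == strcmp (st->stats[i].name, name)))
      return &st->stats[i];
  assert (st->count < MAX_STATS);
  struct Stat *s = &st->stats[st->count++];
  snprintf (s->subsystem, sizeof (s->subsystem), "%s", subsystem);
  snprintf (s->name, sizeof (s->name), "%s", name);
  s->value = 0;
  s->persistent = 0;
  return s;
}

/* Absolute SET: overwrite (or create) the value. Returns the new value. */
static uint64_t
stat_set (struct Store *st, const char *subsystem, const char *name,
          uint64_t value, int persistent)
{
  struct Stat *s = find_or_create (st, subsystem, name);

  s->value = value;
  s->persistent = persistent;
  return s->value;
}

/* Relative SET: apply a signed delta. Clamping at zero on underflow is
   an assumption of this sketch. Returns the new value. */
static uint64_t
stat_update (struct Store *st, const char *subsystem, const char *name,
             int64_t delta, int persistent)
{
  struct Stat *s = find_or_create (st, subsystem, name);

  if (delta < 0)
  {
    uint64_t dec = (uint64_t) (-(delta + 1)) + 1u; /* |delta|, overflow-safe */

    s->value = (s->value > dec) ? s->value - dec : 0;
  }
  else
  {
    s->value += (uint64_t) delta;
  }
  s->persistent = persistent;
  return s->value;
}
```

Keeping the delta signed while the stored value stays unsigned mirrors the API description above, where @code{GNUNET_STATISTICS_set()} takes a @code{uint64_t} and @code{GNUNET_STATISTICS_update()} an @code{int64_t}.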
6115 | |||
6116 | @node Watching for updates | ||
6117 | @subsubsection Watching for updates | ||
6118 | |||
6119 | @c %**end of header | ||
6120 | |||
6121 | @code{GNUNET_STATISTICS_watch()} registers the watch at the service by sending a message of type | ||
6122 | @code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH}. The service then sends | ||
6123 | notifications through messages of type | ||
6124 | @code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH_VALUE} whenever the statistic | ||
6125 | parameter's value is changed. | ||
6126 | |||
6127 | @node GNUnet's Distributed Hash Table (DHT) | ||
6128 | @section GNUnet's Distributed Hash Table (DHT) | ||
6129 | |||
6130 | @c %**end of header | ||
6131 | |||
6132 | GNUnet includes a generic distributed hash table that can be used by developers | ||
6133 | building P2P applications in the framework. This section documents high-level | ||
6134 | features and how developers are expected to use the DHT. We have a research | ||
6135 | paper detailing how the DHT works. Also, Nate's thesis includes a detailed | ||
6136 | description and performance analysis (in chapter 6). | ||
6137 | |||
6138 | Key features of GNUnet's DHT include: | ||
6139 | |||
6140 | @itemize @bullet | ||
6141 | @item stores key-value pairs with values up to (approximately) 63k in size | ||
6142 | @item works with many underlay network topologies (small-world, random graph), | ||
6143 | underlay does not need to be a full mesh / clique | ||
6144 | @item support for extended queries (more than just a simple 'key'), filtering | ||
6145 | duplicate replies within the network (Bloom filter) and content validation (for | ||
6146 | details, please read the subsection on the block library) | ||
6147 | @item can (optionally) return paths taken by the PUT and GET operations to the | ||
6148 | application | ||
6149 | @item provides content replication to handle churn | ||
6150 | @end itemize | ||
6151 | |||
6152 | GNUnet's DHT is randomized and unreliable. Unreliable means that there is no | ||
6153 | strict guarantee that a value stored in the DHT is always found --- values are | ||
6154 | only found with high probability. While this is somewhat true in all P2P DHTs, | ||
6155 | GNUnet developers should be particularly wary of this fact (this will help you | ||
6156 | write secure, fault-tolerant code). Thus, when writing any application using | ||
6157 | the DHT, you should always consider the possibility that a value stored in the | ||
6158 | DHT by you or some other peer might simply not be returned, or returned with a | ||
6159 | significant delay. Your application logic must be written to tolerate this | ||
6160 | (naturally, some loss of performance or quality of service is expected in this | ||
6161 | case). | ||
6162 | |||
6163 | @menu | ||
6164 | * Block library and plugins:: | ||
6165 | * libgnunetdht:: | ||
6166 | * The DHT Client-Service Protocol:: | ||
6167 | * The DHT Peer-to-Peer Protocol:: | ||
6168 | @end menu | ||
6169 | |||
6170 | @node Block library and plugins | ||
6171 | @subsection Block library and plugins | ||
6172 | |||
6173 | @c %**end of header | ||
6174 | |||
6175 | @menu | ||
6176 | * What is a Block?:: | ||
6177 | * The API of libgnunetblock:: | ||
6178 | * Queries:: | ||
6179 | * Sample Code:: | ||
6180 | * Conclusion2:: | ||
6181 | @end menu | ||
6182 | |||
6183 | @node What is a Block? | ||
6184 | @subsubsection What is a Block? | ||
6185 | |||
6186 | @c %**end of header | ||
6187 | |||
6188 | Blocks are small (< 63k) pieces of data stored under a key (struct | ||
6189 | GNUNET_HashCode). Blocks have a type (enum GNUNET_BlockType) which defines | ||
6190 | their data format. Blocks are used in GNUnet as units of static data exchanged | ||
6191 | between peers and stored (or cached) locally. Uses of blocks include | ||
6192 | file-sharing (the files are broken up into blocks), the VPN (DNS information is | ||
6193 | stored in blocks) and the DHT (all information in the DHT and meta-information | ||
6194 | for the maintenance of the DHT are both stored using blocks). The block | ||
6195 | subsystem provides a few common functions that must be available for any type | ||
6196 | of block. | ||
6197 | |||
6198 | @node The API of libgnunetblock | ||
6199 | @subsubsection The API of libgnunetblock | ||
6200 | |||
6201 | @c %**end of header | ||
6202 | |||
6203 | The block library requires for each (family of) block type(s) a block plugin | ||
6204 | (implementing gnunet_block_plugin.h) that provides basic functions that are | ||
6205 | needed by the DHT (and possibly other subsystems) to manage the block. These | ||
6206 | block plugins are typically implemented within their respective subsystems.@ | ||
6207 | The main block library is then used to locate, load and query the appropriate | ||
6208 | block plugin. Which plugin is appropriate is determined by the block type | ||
6209 | (which is just a 32-bit integer). Block plugins contain code that specifies | ||
6210 | which block types are supported by a given plugin. The block library loads all | ||
6211 | block plugins that are installed at the local peer and forwards the application | ||
6212 | request to the respective plugin. | ||
6213 | |||
6214 | The central functions of the block APIs (plugin and main library) are to allow | ||
6215 | the mapping of blocks to their respective key (if possible) and the ability to | ||
6216 | check that a block is well-formed and matches a given request (again, if | ||
6217 | possible). This way, GNUnet can avoid storing invalid blocks, storing blocks | ||
6218 | under the wrong key and forwarding blocks in response to a query that they do | ||
6219 | not answer. | ||
6220 | |||
6221 | One key function of block plugins is that they allow GNUnet to detect duplicate | ||
6222 | replies (via the Bloom filter). All plugins MUST support detecting duplicate | ||
6223 | replies (by adding the current response to the Bloom filter and rejecting it if | ||
6224 | it is encountered again). If a plugin fails to do this, responses may loop in | ||
6225 | the network. | ||
6226 | |||
6227 | @node Queries | ||
6228 | @subsubsection Queries | ||
6229 | @c %**end of header | ||
6230 | |||
6231 | The query format for any block in GNUnet consists of four main components. | ||
6232 | First, the type of the desired block must be specified. Second, the query must | ||
6233 | contain a hash code. The hash code is used for lookups in hash tables and | ||
6234 | databases and need not be unique for the block (however, if possible a unique | ||
6235 | hash should be used as this would be best for performance). Third, an optional | ||
6236 | Bloom filter can be specified to exclude known results; replies that hash to | ||
6237 | the bits set in the Bloom filter are considered invalid. False-positives can be | ||
6238 | eliminated by sending the same query again with a different Bloom filter | ||
6239 | mutator value, which parameterizes the hash function that is used. Finally, an | ||
6240 | optional application-specific "eXtended query" (xquery) can be specified to | ||
6241 | further constrain the results. It is entirely up to the type-specific plugin to | ||
6242 | determine whether or not a given block matches a query (type, hash, Bloom | ||
6243 | filter, and xquery). Naturally, not all xqueries are valid and some types of | ||
6244 | blocks may not support Bloom filters either, so the plugin also needs to check | ||
6245 | if the query is valid in the first place. | ||
6246 | |||
6247 | Depending on the results from the plugin, the DHT will then discard the | ||
6248 | (invalid) query, forward the query, discard the (invalid) reply, cache the | ||
6249 | (valid) reply, and/or forward the (valid and non-duplicate) reply. | ||
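
As a rough illustration of the duplicate-reply test described above, the following C sketch mixes a mutator value into the hash of a block before setting bits in a small Bloom filter. It is not the code used by the block library: the names ReplyFilter, reply_filter_init and reply_filter_test_and_set, the filter size, the number of hash functions and the use of FNV-1a are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical filter size; real filters are sized by the DHT. */
#define FILTER_BITS 128u

struct ReplyFilter {
  uint32_t mutator;               /* parameterizes the hash function */
  uint8_t bits[FILTER_BITS / 8];  /* the Bloom filter itself */
};

/* FNV-1a over the salt, the mutator and the block contents
   (an assumption for this sketch, not GNUnet's hash). */
static uint32_t mix(uint32_t mutator, const void *block, size_t len,
                    uint32_t salt)
{
  const uint8_t *p = block;
  uint32_t h = 2166136261u ^ salt;

  for (int i = 0; i < 4; i++) {
    h ^= (mutator >> (8 * i)) & 0xffu;
    h *= 16777619u;
  }
  for (size_t i = 0; i < len; i++) {
    h ^= p[i];
    h *= 16777619u;
  }
  return h;
}

void reply_filter_init(struct ReplyFilter *f, uint32_t mutator)
{
  assert(f != NULL);
  memset(f, 0, sizeof *f);
  f->mutator = mutator;
}

/* Returns 1 if the block was (probably) seen before, 0 otherwise.
   In both cases the block is recorded in the filter afterwards. */
int reply_filter_test_and_set(struct ReplyFilter *f, const void *block,
                              size_t len)
{
  int seen = 1;

  assert(f != NULL && block != NULL);
  for (uint32_t k = 0; k < 3; k++) {
    uint32_t bit = mix(f->mutator, block, len, k) % FILTER_BITS;
    uint8_t mask = (uint8_t) (1u << (bit % 8));

    if (0 == (f->bits[bit / 8] & mask))
      seen = 0;                   /* at least one bit was not set yet */
    f->bits[bit / 8] |= mask;
  }
  return seen;
}
```

Re-issuing a query with a different mutator changes which bits a block maps to, so a reply that was wrongly filtered as a false positive under one mutator is likely to pass under another.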
6250 | |||
6251 | @node Sample Code | ||
6252 | @subsubsection Sample Code | ||
6253 | |||
6254 | @c %**end of header | ||
6255 | |||
6256 | The source code in @strong{plugin_block_test.c} is a good starting point for | ||
6257 | new block plugins --- it does the minimal work by implementing a plugin that | ||
6258 | performs no validation at all. The respective @strong{Makefile.am} shows how to | ||
6259 | build and install a block plugin. | ||
6260 | |||
6261 | @node Conclusion2 | ||
6262 | @subsubsection Conclusion2 | ||
6263 | |||
6264 | @c %**end of header | ||
6265 | |||
6266 | In conclusion, GNUnet subsystems that want to use the DHT need to define a | ||
6267 | block format and write a plugin to match queries and replies. For testing, the | ||
6268 | "GNUNET_BLOCK_TYPE_TEST" block type can be used; it accepts any query as valid | ||
6269 | and any reply as matching any query. This type is also used for the DHT command | ||
6270 | line tools. However, it should NOT be used for normal applications due to the | ||
6271 | lack of error checking that results from this primitive implementation. | ||
6272 | |||
6273 | @node libgnunetdht | ||
6274 | @subsection libgnunetdht | ||
6275 | |||
6276 | @c %**end of header | ||
6277 | |||
6278 | The DHT API itself is pretty simple and offers the usual GET and PUT functions | ||
6279 | that work as expected. The specified block type refers to the block library | ||
6280 | which allows the DHT to run application-specific logic for data stored in the | ||
6281 | network. | ||
6282 | |||
6283 | |||
6284 | @menu | ||
6285 | * GET:: | ||
6286 | * PUT:: | ||
6287 | * MONITOR:: | ||
6288 | * DHT Routing Options:: | ||
6289 | @end menu | ||
6290 | |||
6291 | @node GET | ||
6292 | @subsubsection GET | ||
6293 | |||
6294 | @c %**end of header | ||
6295 | |||
6296 | When using GET, the main consideration for developers (other than the block | ||
6297 | library) should be that after issuing a GET, the DHT will continuously cause | ||
6298 | (small amounts of) network traffic until the operation is explicitly canceled. | ||
6299 | So GET does not simply send out a single network request once; instead, the | ||
6300 | DHT will continue to search for data. This is needed to achieve good success | ||
6301 | rates and also handles the case where the respective PUT operation happens | ||
6302 | after the GET operation was started. Developers should not cancel an existing | ||
6303 | GET operation and then explicitly re-start it to trigger a new round of | ||
6304 | network requests; this is simply inefficient, especially as the internal | ||
6305 | automated version can be more efficient, for example by filtering results in | ||
6306 | the network that have already been returned. | ||
6307 | |||
6308 | If an application that performs a GET request has a set of replies that it | ||
6309 | already knows and would like to filter, it can call@ | ||
6310 | @code{GNUNET_DHT_get_filter_known_results} with an array of hashes over the | ||
6311 | respective blocks to tell the DHT that these results are not desired (any | ||
6312 | more). This way, the DHT will filter the respective blocks using the block | ||
6313 | library in the network, which may result in a significant reduction in | ||
6314 | bandwidth consumption. | ||
6315 | |||
6316 | @node PUT | ||
6317 | @subsubsection PUT | ||
6318 | |||
6319 | @c %**end of header | ||
6320 | |||
6321 | In contrast to GET operations, developers @strong{must} manually re-run PUT | ||
6322 | operations periodically (if they intend the content to continue to be | ||
6323 | available). Content stored in the DHT expires or might be lost due to churn. | ||
6324 | Furthermore, GNUnet's DHT typically requires multiple rounds of PUT operations | ||
6325 | before a key-value pair is consistently available to all peers (the DHT | ||
6326 | randomizes paths and thus storage locations, and only after multiple rounds of | ||
6327 | PUTs will there be a sufficient number of replicas in large DHTs). An explicit | ||
6328 | PUT operation using the DHT API will only cause network traffic once, so in | ||
6329 | order to ensure basic availability and resistance to churn (and adversaries), | ||
6330 | PUTs must be repeated. While the exact frequency depends on the application, a | ||
6331 | rule of thumb is that there should be at least a dozen PUT operations within | ||
6332 | the content lifetime. Content in the DHT typically expires after one day, so | ||
6333 | DHT PUT operations should be repeated at least every 1-2 hours. | ||
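
The rule of thumb above is simple arithmetic, sketched here in C. The helper names put_repeat_interval_s and put_is_due are hypothetical and not part of the DHT API; an application would normally drive the repeated PUT from its own scheduler.

```c
#include <assert.h>
#include <stdint.h>

/* Interval (in seconds) between PUTs so that at least min_puts
   operations fall within the content lifetime. */
uint64_t put_repeat_interval_s(uint64_t lifetime_s, unsigned int min_puts)
{
  assert(min_puts > 0);
  return lifetime_s / min_puts;
}

/* Returns 1 if the next PUT is due at time now_s, else 0. */
int put_is_due(uint64_t last_put_s, uint64_t now_s, uint64_t interval_s)
{
  assert(now_s >= last_put_s);
  return (now_s - last_put_s) >= interval_s;
}
```

With a lifetime of one day (86400 seconds) and a dozen PUTs, the interval comes out at 7200 seconds, the upper end of the 1-2 hour range given above.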
6334 | |||
6335 | @node MONITOR | ||
6336 | @subsubsection MONITOR | ||
6337 | |||
6338 | @c %**end of header | ||
6339 | |||
6340 | The DHT API also allows applications to monitor messages crossing the local | ||
6341 | DHT service. The types of messages used by the DHT are GET, PUT and RESULT | ||
6342 | messages. Using the monitoring API, applications can choose to monitor these | ||
6343 | requests, possibly limiting themselves to requests for a particular block | ||
6344 | type. | ||
6345 | |||
6346 | The monitoring API is not only useful for diagnostics; it can also be used | ||
6347 | to trigger application operations based on PUT operations. For example, an | ||
6348 | application may use PUTs to distribute work requests to other peers. The | ||
6349 | workers would then monitor for PUTs that give them work, instead of looking | ||
6350 | for work using GET operations. This can be beneficial, especially if the | ||
6351 | workers have no good way to guess the keys under which work would be stored. | ||
6352 | Naturally, additional protocols might be needed to ensure that the desired | ||
6353 | number of workers will process the distributed workload. | ||
6354 | |||
6355 | @node DHT Routing Options | ||
6356 | @subsubsection DHT Routing Options | ||
6357 | |||
6358 | @c %**end of header | ||
6359 | |||
6360 | GET and PUT requests support the following routing options (applications mostly need the first two): | ||
6361 | |||
6362 | @table @asis | ||
6363 | @item GNUNET_DHT_RO_DEMULTIPLEX_EVERYWHERE This option means that all peers | ||
6364 | should process the request, even if their peer ID is not closest to the key. | ||
6365 | For a PUT request, this means that all peers that a request traverses may make | ||
6366 | a copy of the data. Similarly for a GET request, all peers will check their | ||
6367 | local database for a result. Setting this option can thus significantly improve | ||
6368 | caching and reduce bandwidth consumption --- at the expense of a larger DHT | ||
6369 | database. If in doubt, we recommend that this option should be used. | ||
6370 | @item GNUNET_DHT_RO_RECORD_ROUTE This option instructs the DHT to record the path | ||
6371 | that a GET or a PUT request is taking through the overlay network. The | ||
6372 | resulting paths are then returned to the application with the respective | ||
6373 | result. This allows the receiver of a result to construct a path to the | ||
6374 | originator of the data, which might then be used for routing. Naturally, | ||
6375 | setting this option requires additional bandwidth and disk space, so | ||
6376 | applications should only set this if the paths are needed by the application | ||
6377 | logic. | ||
6378 | @item GNUNET_DHT_RO_FIND_PEER This option is an internal option used by | ||
6379 | the DHT's peer discovery mechanism and should not be used by applications. | ||
6380 | @item GNUNET_DHT_RO_BART This option is currently not implemented. It may in | ||
6381 | the future offer performance improvements for clique topologies. | ||
6382 | @end table | ||
6383 | |||
6384 | @node The DHT Client-Service Protocol | ||
6385 | @subsection The DHT Client-Service Protocol | ||
6386 | |||
6387 | @c %**end of header | ||
6388 | |||
6389 | @menu | ||
6390 | * PUTting data into the DHT:: | ||
6391 | * GETting data from the DHT:: | ||
6392 | * Monitoring the DHT:: | ||
6393 | @end menu | ||
6394 | |||
6395 | @node PUTting data into the DHT | ||
6396 | @subsubsection PUTting data into the DHT | ||
6397 | |||
6398 | @c %**end of header | ||
6399 | |||
6400 | To store (PUT) data into the DHT, the client sends a@ @code{struct | ||
6401 | GNUNET_DHT_ClientPutMessage} to the service. This message specifies the block | ||
6402 | type, routing options, the desired replication level, the expiration time, key, | ||
6403 | value and a 64-bit unique ID for the operation. The service responds with a@ | ||
6404 | @code{struct GNUNET_DHT_ClientPutConfirmationMessage} with the same 64-bit | ||
6405 | unique ID. Note that the service sends the confirmation as soon as it has | ||
6406 | locally processed the PUT request. The PUT may still be propagating through the | ||
6407 | network at this time. | ||
6408 | |||
6409 | In the future, we may want to change this to provide (limited) feedback to the | ||
6410 | client, for example if we detect that the PUT operation had no effect because | ||
6411 | the same key-value pair was already stored in the DHT. However, changing this | ||
6412 | would also require additional state and messages in the P2P | ||
6413 | interaction. | ||
6414 | |||
6415 | @node GETting data from the DHT | ||
6416 | @subsubsection GETting data from the DHT | ||
6417 | |||
6418 | @c %**end of header | ||
6419 | |||
6420 | To retrieve (GET) data from the DHT, the client sends a@ @code{struct | ||
6421 | GNUNET_DHT_ClientGetMessage} to the service. The message specifies routing | ||
6422 | options, a replication level (for replicating the GET, not the content), the | ||
6423 | desired block type, the key, the (optional) extended query and a unique 64-bit | ||
6424 | request ID. | ||
6425 | |||
6426 | Additionally, the client may send any number of@ @code{struct | ||
6427 | GNUNET_DHT_ClientGetResultSeenMessage}s to notify the service about results | ||
6428 | that the client is already aware of. These messages consist of the key, the | ||
6429 | unique 64-bit ID of the request, and an arbitrary number of hash codes over the | ||
6430 | blocks that the client is already aware of. As messages are restricted to 64k, | ||
6431 | a client that already knows more than about a thousand blocks may need to send | ||
6432 | several of these messages. Naturally, the client should transmit these messages | ||
6433 | as quickly as possible after the original GET request such that the DHT can | ||
6434 | filter those results in the network early on. However, as these messages are | ||
6435 | sent after the original request, it is conceivable that the DHT service may | ||
6436 | return blocks that match those already known to the client anyway. | ||
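
The "about a thousand" figure follows from the message size limit. The C sketch below works it out under stated assumptions: hash codes of 64 bytes (512 bits), a message limit just below 64k, and a fixed part of 80 bytes per message. The header size and all names are assumptions for this example and should be checked against the actual message definition.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for this sketch, not taken from the GNUnet headers. */
#define SKETCH_MAX_MESSAGE_SIZE 65535u  /* messages stay below 64k */
#define SKETCH_SEEN_HEADER_SIZE 80u     /* fixed part: header, key, ID */
#define SKETCH_HASH_SIZE 64u            /* one 512-bit hash code */

/* Number of hash codes that fit into a single message. */
size_t seen_hashes_per_message(void)
{
  return (SKETCH_MAX_MESSAGE_SIZE - SKETCH_SEEN_HEADER_SIZE)
         / SKETCH_HASH_SIZE;
}

/* Number of messages needed to report known_blocks hash codes. */
size_t seen_messages_needed(size_t known_blocks)
{
  size_t per_msg = seen_hashes_per_message();

  assert(per_msg > 0);
  return (known_blocks + per_msg - 1) / per_msg;  /* round up */
}
```

Under these assumptions one message carries 1022 hash codes, so a client that knows 3000 blocks would send three such messages.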
6437 | |||
6438 | In response to a GET request, the service will send @code{struct | ||
6439 | GNUNET_DHT_ClientResultMessage}s to the client. These messages contain the | ||
6440 | block type, expiration, key, unique ID of the request and of course the value | ||
6441 | (a block). Depending on the options set for the respective operations, the | ||
6442 | replies may also contain the path the GET and/or the PUT took through the | ||
6443 | network. | ||
6444 | |||
6445 | A client can stop receiving replies either by disconnecting or by sending a | ||
6446 | @code{struct GNUNET_DHT_ClientGetStopMessage} which must contain the key and | ||
6447 | the 64-bit unique ID of the original request. Using an explicit "stop" message | ||
6448 | is more common as this allows a client to run many concurrent GET operations | ||
6449 | over the same connection with the DHT service --- and to stop them | ||
6450 | individually. | ||
6451 | |||
6452 | @node Monitoring the DHT | ||
6453 | @subsubsection Monitoring the DHT | ||
6454 | |||
6455 | @c %**end of header | ||
6456 | |||
6457 | To begin monitoring, the client sends a @code{struct | ||
6458 | GNUNET_DHT_MonitorStartStop} message to the DHT service. In this message, flags | ||
6459 | can be set to enable (or disable) monitoring of GET, PUT and RESULT messages | ||
6460 | that pass through a peer. The message can also restrict monitoring to a | ||
6461 | particular block type or a particular key. Once monitoring is enabled, the DHT | ||
6462 | service will notify the client about any matching event using @code{struct | ||
6463 | GNUNET_DHT_MonitorGetMessage}s for GET events, @code{struct | ||
6464 | GNUNET_DHT_MonitorPutMessage} for PUT events and@ @code{struct | ||
6465 | GNUNET_DHT_MonitorGetRespMessage} for RESULTs. Each of these messages contains | ||
6466 | all of the information about the event. | ||
6467 | |||
6468 | @node The DHT Peer-to-Peer Protocol | ||
6469 | @subsection The DHT Peer-to-Peer Protocol | ||
6470 | @c %**end of header | ||
6471 | |||
6472 | |||
6473 | @menu | ||
6474 | * Routing GETs or PUTs:: | ||
6475 | * PUTting data into the DHT2:: | ||
6476 | * GETting data from the DHT2:: | ||
6477 | @end menu | ||
6478 | |||
6479 | @node Routing GETs or PUTs | ||
6480 | @subsubsection Routing GETs or PUTs | ||
6481 | |||
6482 | @c %**end of header | ||
6483 | |||
6484 | When routing GETs or PUTs, the DHT service selects a suitable subset of | ||
6485 | neighbours for forwarding. The exact number of neighbours can be zero or more | ||
6486 | and depends on the hop counter of the query (initially zero) in relation to | ||
6487 | (the log of) the network size estimate, the desired replication level and the | ||
6488 | peer's connectivity. Depending on the hop counter and our network size | ||
6489 | estimate, the selection of the peers may be randomized or based on proximity to the | ||
6490 | key. Furthermore, requests include a set of peers that a request has already | ||
6491 | traversed; those peers are also excluded from the selection. | ||
6492 | |||
6493 | @node PUTting data into the DHT2 | ||
6494 | @subsubsection PUTting data into the DHT2 | ||
6495 | |||
6496 | @c %**end of header | ||
6497 | |||
6498 | To PUT data into the DHT, the service sends a @code{struct PeerPutMessage} of | ||
6499 | type @code{GNUNET_MESSAGE_TYPE_DHT_P2P_PUT} to the respective neighbour. In | ||
6500 | addition to the usual information about the content (type, routing options, | ||
6501 | desired replication level for the content, expiration time, key and value), the | ||
6502 | message contains a fixed-size Bloom filter with information about which peers | ||
6503 | (may) have already seen this request. This Bloom filter is used to ensure that | ||
6504 | DHT messages never loop back to a peer that has already processed the request. | ||
6505 | Additionally, the message includes the current hop counter and, depending on | ||
6506 | the routing options, the message may include the full path that the message has | ||
6507 | taken so far. The Bloom filter should already contain the identity of the | ||
6508 | previous hop; however, the path should not include the identity of the previous | ||
6509 | hop and the receiver should append the identity of the sender to the path, not | ||
6510 | its own identity (this is done to reduce bandwidth). | ||
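
The path rule above (the receiver appends the identity of the sender, not its own) can be sketched in C as follows. The type SketchPeerId, its 32-byte size and the function path_append_sender are hypothetical and only serve to illustrate the step; they are not taken from the DHT service.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical peer identity; the size is an assumption. */
struct SketchPeerId {
  uint8_t bytes[32];
};

/* Called by the receiver of a PUT: append the identity of the peer
   that sent us the message. Returns the new path length, or the old
   length unchanged if the path buffer is already full. */
size_t path_append_sender(struct SketchPeerId *path, size_t path_len,
                          size_t capacity,
                          const struct SketchPeerId *sender)
{
  assert(path != NULL && sender != NULL && path_len <= capacity);
  if (path_len == capacity)
    return path_len;
  memcpy(&path[path_len], sender, sizeof *sender);
  return path_len + 1;
}
```

Because each hop adds its predecessor on receipt, the sender never has to transmit its own identity as part of the path, which is where the bandwidth saving comes from.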
6511 | |||
6512 | @node GETting data from the DHT2 | ||
6513 | @subsubsection GETting data from the DHT2 | ||
6514 | |||
6515 | @c %**end of header | ||
6516 | |||
6517 | A peer can search the DHT by sending @code{struct PeerGetMessage}s of type | ||
6518 | @code{GNUNET_MESSAGE_TYPE_DHT_P2P_GET} to other peers. In addition to the usual | ||
6519 | information about the request (type, routing options, desired replication level | ||
6520 | for the request, the key and the extended query), a GET request also again | ||
6521 | contains a hop counter, a Bloom filter over the peers that have processed the | ||
6522 | request already and depending on the routing options the full path traversed by | ||
6523 | the GET. Finally, a GET request includes a variable-size second Bloom filter | ||
6524 | and a so-called Bloom filter mutator value which together indicate which | ||
6525 | replies the sender has already seen. During the lookup, each block that matches | ||
6526 | the block type, key and extended query is additionally subjected to a test | ||
6527 | against this Bloom filter. The block plugin is expected to take the hash of the | ||
6528 | block and combine it with the mutator value and check if the result is not yet | ||
6529 | in the Bloom filter. The originator of the query will from time to time modify | ||
6530 | the mutator to (eventually) allow false-positives filtered by the Bloom filter | ||
6531 | to be returned. | ||
6532 | |||
6533 | Peers that receive a GET request perform a local lookup (depending on their | ||
6534 | proximity to the key and the query options) and forward the request to other | ||
6535 | peers. They then remember the request (including the Bloom filter for blocking | ||
6536 | duplicate results) and when they obtain a matching, non-filtered response, a | ||
6537 | @code{struct PeerResultMessage} of type@ | ||
6538 | @code{GNUNET_MESSAGE_TYPE_DHT_P2P_RESULT} is forwarded to the previous hop. | ||
6539 | Whenever a result is forwarded, the block plugin is used to update the Bloom | ||
6540 | filter accordingly, to ensure that the same result is never forwarded more than | ||
6541 | once. The DHT service may also cache forwarded results locally if the | ||
6542 | "CACHE_RESULTS" option is set to "YES" in the configuration. | ||
6543 | |||
6544 | @node The GNU Name System (GNS) | ||
6545 | @section The GNU Name System (GNS) | ||
6546 | |||
6547 | @c %**end of header | ||
6548 | |||
6549 | The GNU Name System (GNS) is a decentralized database that enables users to | ||
6550 | securely resolve names to values. Names can be used to identify other users | ||
6551 | (for example, in social networking), or network services (for example, VPN | ||
6552 | services running at a peer in GNUnet, or purely IP-based services on the | ||
6553 | Internet). Users interact with GNS by typing in a hostname that ends in ".gnu" | ||
6554 | or ".zkey". | ||
6555 | |||
6556 | Videos giving an overview of most of GNS and the motivations behind it are | ||
6557 | available here and here. The remainder of this chapter targets developers that | ||
6558 | are familiar with high level concepts of GNS as presented in these talks. | ||
6559 | |||
6560 | GNS-aware applications should use the GNS resolver to obtain the respective | ||
6561 | records that are stored under that name in GNS. Each record consists of a type, | ||
6562 | value, expiration time and flags. | ||
6563 | |||
6564 | The type specifies the format of the value. Types below 65536 correspond to DNS | ||
6565 | record types, larger values are used for GNS-specific records. Applications can | ||
6566 | define new GNS record types by reserving a number and implementing a plugin | ||
6567 | (which mostly needs to convert the binary value representation to a | ||
6568 | human-readable text format and vice-versa). The expiration time specifies how | ||
6569 | long the record is to be valid. The GNS API ensures that applications are only | ||
6570 | given non-expired values. The flags are typically irrelevant for applications, | ||
6571 | as GNS uses them internally to control visibility and validity of records. | ||
6572 | |||
6573 | Records are stored along with a signature. The signature is generated using the | ||
6574 | private key of the authoritative zone. This allows any GNS resolver to verify | ||
6575 | the correctness of a name-value mapping. | ||
6576 | |||
6577 | Internally, GNS uses the NAMECACHE to cache information obtained from other | ||
6578 | users, the NAMESTORE to store information specific to the local users, and the | ||
6579 | DHT to exchange data between users. A plugin API is used to enable applications | ||
6580 | to define new GNS record types. | ||
6581 | |||
6582 | @menu | ||
6583 | * libgnunetgns:: | ||
6584 | * libgnunetgnsrecord:: | ||
6585 | * GNS plugins:: | ||
6586 | * The GNS Client-Service Protocol:: | ||
6587 | * Hijacking the DNS-Traffic using gnunet-service-dns:: | ||
6588 | * Serving DNS lookups via GNS on W32:: | ||
6589 | @end menu | ||
6590 | |||
6591 | @node libgnunetgns | ||
6592 | @subsection libgnunetgns | ||
6593 | |||
6594 | @c %**end of header | ||
6595 | |||
6596 | The GNS API itself is extremely simple. Clients first connect to the GNS service | ||
6597 | using @code{GNUNET_GNS_connect}. They can then perform lookups using | ||
6598 | @code{GNUNET_GNS_lookup} or cancel pending lookups using | ||
6599 | @code{GNUNET_GNS_lookup_cancel}. Once finished, clients disconnect using | ||
6600 | @code{GNUNET_GNS_disconnect}. | ||
6601 | |||
6602 | |||
6603 | @menu | ||
6604 | * Looking up records:: | ||
6605 | * Accessing the records:: | ||
6606 | * Creating records:: | ||
6607 | * Future work:: | ||
6608 | @end menu | ||
6609 | |||
6610 | @node Looking up records | ||
6611 | @subsubsection Looking up records | ||
6612 | |||
6613 | @c %**end of header | ||
6614 | |||
6615 | @code{GNUNET_GNS_lookup} takes a number of arguments: | ||
6616 | |||
6617 | @table @asis | ||
6618 | @item handle This is simply the GNS connection handle from | ||
6619 | @code{GNUNET_GNS_connect}. | ||
6620 | @item name The client needs to specify the name to | ||
6621 | be resolved. This can be any valid DNS or GNS hostname. | ||
6622 | @item zone The client | ||
6623 | needs to specify the public key of the GNS zone against which the resolution | ||
6624 | should be done (the ".gnu" zone). Note that a key must be provided, even if the | ||
6625 | name ends in ".zkey". This should typically be the public key of the | ||
6626 | master-zone of the user. | ||
6627 | @item type This is the desired GNS or DNS record type | ||
6628 | to look for. While all records for the given name will be returned, this can be | ||
6629 | important if the client wants to resolve record types that themselves delegate | ||
6630 | resolution, such as CNAME, PKEY or GNS2DNS. Resolving a record of any of these | ||
6631 | types will only work if the respective record type is specified in the request, | ||
6632 | as the GNS resolver will otherwise follow the delegation and return the records | ||
6633 | from the respective destination, instead of the delegating record. | ||
6634 | @item only_cached This argument should typically be set to @code{GNUNET_NO}. Setting | ||
6635 | it to @code{GNUNET_YES} disables resolution via the overlay network. | ||
6636 | @item shorten_zone_key If GNS encounters new names during resolution, their | ||
6637 | respective zones can automatically be learned and added to the "shorten zone". | ||
6638 | If this is desired, clients must pass the private key of the shorten zone. If | ||
6639 | NULL is passed, shortening is disabled. | ||
6640 | @item proc This argument identifies | ||
6641 | the function to call with the result. It is given proc_cls, the number of | ||
6642 | records found (possibly zero) and the array of the records as arguments. proc | ||
6643 | will only be called once. After proc has been called, the lookup must no | ||
6644 | longer be cancelled. | ||
6645 | @item proc_cls The closure for proc. | ||
6646 | @end table | ||
6647 | |||
6648 | @node Accessing the records | ||
6649 | @subsubsection Accessing the records | ||
6650 | |||
6651 | @c %**end of header | ||
6652 | |||
6653 | The @code{libgnunetgnsrecord} library provides an API to manipulate the GNS | ||
6654 | record array that is given to proc. In particular, it offers functions such as | ||
6655 | converting record values to human-readable strings (and back). However, most | ||
6656 | @code{libgnunetgnsrecord} functions are not interesting to GNS client | ||
6657 | applications. | ||
6658 | |||
6659 | For DNS records, the @code{libgnunetdnsparser} library provides functions for | ||
6660 | parsing (and serializing) common types of DNS records. | ||
6661 | |||
6662 | @node Creating records | ||
6663 | @subsubsection Creating records | ||
6664 | |||
6665 | @c %**end of header | ||
6666 | |||
6667 | Creating GNS records is typically done by building the respective record | ||
6668 | information (possibly with the help of @code{libgnunetgnsrecord} and | ||
6669 | @code{libgnunetdnsparser}) and then using the @code{libgnunetnamestore} to | ||
6670 | publish the information. The GNS API is not involved in this | ||
6671 | operation. | ||
6672 | |||
6673 | @node Future work | ||
6674 | @subsubsection Future work | ||
6675 | |||
6676 | @c %**end of header | ||
6677 | |||
6678 | In the future, we want to expand @code{libgnunetgns} to allow applications to | ||
6679 | observe shortening operations performed during GNS resolution, for example so | ||
6680 | that users can receive visual feedback when this happens. | ||
6681 | |||
6682 | @node libgnunetgnsrecord | ||
6683 | @subsection libgnunetgnsrecord | ||
6684 | |||
6685 | @c %**end of header | ||
6686 | |||
6687 | The @code{libgnunetgnsrecord} library is used to manipulate GNS records (in | ||
6688 | plaintext or in their encrypted format). Applications mostly interact with | ||
6689 | @code{libgnunetgnsrecord} by using the functions to convert GNS record values | ||
6690 | to strings or vice-versa, or to look up a GNS record type number by name (or | ||
6691 | vice-versa). The library also provides various other functions that are mostly | ||
6692 | used internally within GNS, such as converting keys to names, checking for | ||
6693 | expiration, encrypting GNS records to GNS blocks, verifying GNS block | ||
6694 | signatures and decrypting GNS records from GNS blocks. | ||
6695 | |||
6696 | We will now discuss the four commonly used functions of the API.@ | ||
6697 | @code{libgnunetgnsrecord} does not perform these operations itself, but instead | ||
6698 | uses plugins to perform the operation. GNUnet includes plugins to support | ||
6699 | common DNS record types as well as standard GNS record types. | ||
6700 | |||
6701 | |||
6702 | @menu | ||
6703 | * Value handling:: | ||
6704 | * Type handling:: | ||
6705 | @end menu | ||
6706 | |||
6707 | @node Value handling | ||
6708 | @subsubsection Value handling | ||
6709 | |||
6710 | @c %**end of header | ||
6711 | |||
6712 | @code{GNUNET_GNSRECORD_value_to_string} can be used to convert the (binary) | ||
6713 | representation of a GNS record value to a human readable, 0-terminated UTF-8 | ||
6714 | string. NULL is returned if the specified record type is not supported by any | ||
6715 | available plugin. | ||
6716 | |||
6717 | @code{GNUNET_GNSRECORD_string_to_value} can be used to try to convert a human | ||
6718 | readable string to the respective (binary) representation of a GNS record | ||
6719 | value. | ||
6720 | |||
6721 | @node Type handling | ||
6722 | @subsubsection Type handling | ||
6723 | |||
6724 | @c %**end of header | ||
6725 | |||
6726 | @code{GNUNET_GNSRECORD_typename_to_number} can be used to obtain the numeric | ||
6727 | value associated with a given typename. For example, given the typename "A" | ||
6728 | (for DNS A records), the function will return the number 1. A list of common | ||
6729 | DNS record types is | ||
6730 | @uref{http://en.wikipedia.org/wiki/List_of_DNS_record_types, here}. Note that | ||
6731 | not all DNS record types are supported by GNUnet GNSRECORD plugins at this | ||
6732 | time. | ||
6733 | |||
6734 | @code{GNUNET_GNSRECORD_number_to_typename} can be used to obtain the typename | ||
6735 | associated with a given numeric value. For example, given the type number 1, | ||
6736 | the function will return the typename "A". | ||
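The mapping can be pictured as a simple two-way table. The sketch below uses a small subset of the standard DNS type numbers; the function names are hypothetical, and both the case-insensitive match and the @code{None} result for unknown names are assumptions of this sketch rather than documented library behaviour.

```python
# Subset of the standard DNS record type numbers.
DNS_TYPES = {
    "A": 1, "NS": 2, "CNAME": 5, "SOA": 6,
    "PTR": 12, "MX": 15, "TXT": 16, "AAAA": 28,
}
# Reverse table, built once from the forward table.
NUMBER_TO_NAME = {number: name for name, number in DNS_TYPES.items()}


def typename_to_number(typename):
    """Return the type number for a typename, or None if unknown."""
    return DNS_TYPES.get(typename.upper())


def number_to_typename(number):
    """Return the typename for a type number, or None if unknown."""
    return NUMBER_TO_NAME.get(number)
```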
6737 | |||
6738 | @node GNS plugins | ||
6739 | @subsection GNS plugins | ||
6740 | |||
6741 | @c %**end of header | ||
6742 | |||
6743 | Adding a new GNS record type typically involves writing (or extending) a | ||
6744 | GNSRECORD plugin. The plugin needs to implement the | ||
6745 | @code{gnunet_gnsrecord_plugin.h} API which provides basic functions that are | ||
6746 | needed by GNSRECORD to convert typenames and values of the respective record | ||
6747 | type to strings (and back). These gnsrecord plugins are typically implemented | ||
6748 | within their respective subsystems. Examples for such plugins can be found in | ||
6749 | the GNSRECORD, GNS and CONVERSATION subsystems. | ||
6750 | |||
6751 | The @code{libgnunetgnsrecord} library is then used to locate, load and query | ||
6752 | the appropriate gnsrecord plugin. Which plugin is appropriate is determined by | ||
6753 | the record type (which is just a 32-bit integer). The @code{libgnunetgnsrecord} | ||
6754 | library loads all gnsrecord plugins that are installed at the local peer and | ||
6755 | forwards the application request to the plugins. If the record type is not | ||
6756 | supported by the plugin, it should simply return an error code. | ||
6757 | |||
6758 | The central functions of the GNSRECORD APIs (plugin and main library) are the same | ||
6759 | four functions for converting between values and strings, and typenames and | ||
6760 | numbers documented in the previous subsection. | ||
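The dispatch described in this subsection can be sketched as follows. The two plugin classes and the @code{None} return used as the "not supported" signal are hypothetical; the actual plugin API is the C interface declared in @code{gnunet_gnsrecord_plugin.h}.

```python
class ARecordPlugin:
    """Hypothetical plugin that only understands DNS A records (type 1)."""

    def value_to_string(self, record_type, data):
        if record_type != 1 or len(data) != 4:
            return None  # tell the library this plugin cannot help
        return ".".join(str(byte) for byte in data)


class TxtRecordPlugin:
    """Hypothetical plugin that only understands DNS TXT records (type 16)."""

    def value_to_string(self, record_type, data):
        if record_type != 16:
            return None
        return data.decode("utf-8")


def value_to_string(plugins, record_type, data):
    """Forward the request to every loaded plugin until one supports the type."""
    for plugin in plugins:
        text = plugin.value_to_string(record_type, data)
        if text is not None:
            return text
    return None  # no installed plugin supports this record type
```

Because the record type alone selects the plugin, a new record type can be supported by installing another plugin without changing the library.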
6761 | |||
6762 | @node The GNS Client-Service Protocol | ||
6763 | @subsection The GNS Client-Service Protocol | ||
6764 | |||
6765 | @c %**end of header | ||
6766 | |||
6767 | The GNS client-service protocol consists of two simple messages, the | ||
6768 | @code{LOOKUP} message and the @code{LOOKUP_RESULT}. Each @code{LOOKUP} message | ||
6769 | contains a unique 32-bit identifier, which will be included in the | ||
6770 | corresponding response. Thus, clients can send many lookup requests in parallel | ||
6771 | and receive responses out-of-order. A @code{LOOKUP} request also includes the | ||
6772 | public key of the GNS zone, the desired record type and fields specifying | ||
6773 | whether shortening is enabled or networking is disabled. Finally, the | ||
6774 | @code{LOOKUP} message includes the name to be resolved. | ||
6775 | |||
6776 | The response includes the number of records and the records themselves in the | ||
6777 | format created by @code{GNUNET_GNSRECORD_records_serialize}. They can thus be | ||
6778 | deserialized using @code{GNUNET_GNSRECORD_records_deserialize}. | ||
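The role of the 32-bit identifier can be illustrated with a small model of the client side. This is only a sketch: the class, its method names and the dictionary-based messages are hypothetical and say nothing about the real wire format.

```python
import itertools


class LookupClient:
    """Tracks pending LOOKUP requests by their 32-bit identifier."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._pending = {}  # request id -> callback

    def send_lookup(self, name, record_type, callback):
        """Register the callback and return the message that would be sent."""
        request_id = next(self._next_id) & 0xFFFFFFFF
        self._pending[request_id] = callback
        return {"id": request_id, "name": name, "type": record_type}

    def handle_result(self, message):
        """Deliver a LOOKUP_RESULT to the matching request, in any order."""
        callback = self._pending.pop(message["id"], None)
        if callback is None:
            return False  # unknown or already answered request
        callback(message["records"])
        return True
```

Since each response names the request it answers, the service is free to reply in whatever order lookups complete.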
6779 | |||
6780 | @node Hijacking the DNS-Traffic using gnunet-service-dns | ||
6781 | @subsection Hijacking the DNS-Traffic using gnunet-service-dns | ||
6782 | |||
6783 | @c %**end of header | ||
6784 | |||
6785 | This section documents how the gnunet-service-dns (and the gnunet-helper-dns) | ||
6786 | intercepts DNS queries from the local system.@ This is merely one method for | ||
6787 | how we can obtain GNS queries. It is also possible to change @code{resolv.conf} | ||
6788 | to point to a machine running @code{gnunet-dns2gns} or to modify libc's name | ||
6789 | system switch (NSS) configuration to include a GNS resolution plugin. The | ||
6790 | method described in this chapter is more of a last-ditch catch-all approach. | ||
6791 | |||
6792 | @code{gnunet-service-dns} enables intercepting DNS traffic using policy based | ||
6793 | routing. We MARK every outgoing DNS-packet if it was not sent by our | ||
6794 | application. Using a second routing table in the Linux kernel these marked | ||
6795 | packets are then routed through our virtual network interface and can thus be | ||
6796 | captured unchanged. | ||
6797 | |||
6798 | Our application then reads the query and decides how to handle it: A query to | ||
6799 | an address ending in ".gnu" or ".zkey" is hijacked by @code{gnunet-service-gns} | ||
6800 | and resolved internally using GNS. In the future, a reverse query for an | ||
6801 | address of the configured virtual network could be answered with records kept | ||
6802 | about previous forward queries. Queries that are not hijacked by some | ||
6803 | application using the DNS service will be sent to the original recipient. The | ||
6804 | answer to the query will always be sent back through the virtual interface with | ||
6805 | the original nameserver as source address. | ||
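The decision of whether a query is hijacked can be sketched as a suffix test. The function name is hypothetical, and ignoring a trailing dot and letter case are assumptions of this sketch, not documented behaviour of @code{gnunet-service-gns}.

```python
GNS_SUFFIXES = (".gnu", ".zkey")


def is_hijacked_by_gns(name):
    """Return True if a queried name would be resolved via GNS."""
    normalized = name.rstrip(".").lower()
    return normalized.endswith(GNS_SUFFIXES)
```

Queries for which this test fails are forwarded to the original recipient, as described above.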
6806 | |||
6807 | |||
6808 | @menu | ||
6809 | * Network Setup Details:: | ||
6810 | @end menu | ||
6811 | |||
6812 | @node Network Setup Details | ||
6813 | @subsubsection Network Setup Details | ||
6814 | |||
6815 | @c %**end of header | ||
6816 | |||
6817 | The DNS interceptor adds the following rules to the Linux kernel: | ||
6818 | @example | ||
6819 | iptables -t mangle -I OUTPUT 1 -p udp --sport $LOCALPORT --dport 53 -j ACCEPT | ||
6820 | iptables -t mangle -I OUTPUT 2 -p udp --dport 53 -j MARK --set-mark 3 | ||
6821 | ip rule add fwmark 3 table2 | ||
6822 | ip route add default via $VIRTUALDNS table2 | ||
6823 | @end example | ||
6824 | Line 1 makes sure that all packets coming from a port our application opened | ||
6825 | beforehand (@code{$LOCALPORT}) will be routed normally. Line 2 marks every | ||
6826 | other packet to a DNS-Server with mark 3 (chosen arbitrarily). The third line | ||
6827 | adds a routing policy based on this mark 3 via the routing table. | ||
6828 | |||
6829 | @node Serving DNS lookups via GNS on W32 | ||
6830 | @subsection Serving DNS lookups via GNS on W32 | ||
6831 | |||
6832 | @c %**end of header | ||
6833 | |||
6834 | This section documents how libw32nsp (and gnunet-gns-helper-service-w32) | ||
6835 | resolve DNS queries on the local system. This only applies to GNUnet | ||
6836 | running on W32. | ||
6837 | |||
6838 | W32 has a concept of "Namespaces" and "Namespace providers". These are used to | ||
6839 | present various name systems to applications in a generic way. Namespaces | ||
6840 | include DNS, mDNS, NLA and others. For each namespace any number of providers | ||
6841 | could be registered, and they are queried in an order of priority (which is | ||
6842 | adjustable). | ||
6843 | |||
6844 | Applications can resolve names by using the WSALookupService*() family of | ||
6845 | functions. | ||
6846 | |||
6847 | However, these are WSA-only facilities. Common BSD socket functions for | ||
6848 | namespace resolutions are gethostbyname and getaddrinfo (among others). These | ||
6849 | functions are implemented internally (by default - by mswsock, which also | ||
6850 | implements the default DNS provider) as wrappers around WSALookupService*() | ||
6851 | functions (see "Sample Code for a Service Provider" on MSDN). | ||
6852 | |||
6853 | On W32 GNUnet builds a libw32nsp - a namespace provider, which can then be | ||
6854 | installed into the system by using w32nsp-install (and uninstalled by | ||
6855 | w32nsp-uninstall), as described in "Installation Handbook". | ||
6856 | |||
6857 | libw32nsp is very simple and has almost no dependencies. As a response to | ||
6858 | NSPLookupServiceBegin(), it only checks that the provider GUID passed to it by | ||
6859 | the caller matches the GNUnet DNS Provider GUID, checks that the name being | ||
6860 | resolved ends in ".gnu" or ".zkey", then connects to | ||
6861 | gnunet-gns-helper-service-w32 at 127.0.0.1:5353 (hardcoded) and sends the name | ||
6862 | resolution request there, returning the connected socket to the caller. | ||
6863 | |||
6864 | When the caller invokes NSPLookupServiceNext(), libw32nsp reads a completely | ||
6865 | formed reply from that socket, unmarshalls it, then gives it back to the | ||
6866 | caller. | ||
6867 | |||
6868 | At the moment gnunet-gns-helper-service-w32 only ever gives one reply, and | ||
6869 | subsequent calls to NSPLookupServiceNext() will fail with WSA_NODATA (the | ||
6870 | first call to NSPLookupServiceNext() might also fail if GNS failed to find | ||
6871 | the name, or if there was an error connecting to it). | ||
6872 | |||
6873 | gnunet-gns-helper-service-w32 does most of the processing: | ||
6874 | |||
6875 | @itemize @bullet | ||
6876 | @item Maintains a connection to GNS. | ||
6877 | @item Reads GNS config and loads appropriate keys. | ||
6878 | @item Checks service GUID and decides on the type of record to look up, | ||
6879 | refusing to make a lookup outright when unsupported service GUID is passed. | ||
6880 | @item Launches the lookup | ||
6881 | @end itemize | ||
6882 | |||
6883 | When the lookup result arrives, gnunet-gns-helper-service-w32 forms a complete | ||
6884 | reply (including filling a WSAQUERYSETW structure and, possibly, a binary blob | ||
6885 | with a hostent structure for a gethostbyname() client), marshalls it, and sends | ||
6886 | it back to libw32nsp. If no records were found, it sends an empty header. | ||
6887 | |||
6888 | This works for most normal applications that use gethostbyname() or | ||
6889 | getaddrinfo() to resolve names, but fails to do anything with applications that | ||
6890 | use alternative means of resolving names (such as sending queries to a DNS | ||
6891 | server directly by themselves). This includes some well-known utilities, | ||
6892 | like "ping" and "nslookup". | ||
6893 | |||
6894 | @node The GNS Namecache | ||
6895 | @section The GNS Namecache | ||
6896 | |||
6897 | @c %**end of header | ||
6898 | |||
6899 | The NAMECACHE subsystem is responsible for caching (encrypted) resolution | ||
6900 | results of the GNU Name System (GNS). GNS makes zone information available to | ||
6901 | other users via the DHT. However, as accessing the DHT for every lookup is | ||
6902 | expensive (and as the DHT's local cache is lost whenever the peer is | ||
6903 | restarted), GNS uses the NAMECACHE as a more persistent cache for DHT lookups. | ||
6904 | Thus, instead of always looking up every name in the DHT, GNS first checks if | ||
6905 | the result is already available locally in the NAMECACHE. Only if there is no | ||
6906 | result in the NAMECACHE does GNS query the DHT. The NAMECACHE stores data in the | ||
6907 | same (encrypted) format as the DHT. It thus makes no sense to iterate over all | ||
6908 | items in the NAMECACHE --- the NAMECACHE does not have a way to provide the | ||
6909 | keys required to decrypt the entries. | ||
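The lookup order described above amounts to a cache-first strategy. In the sketch below the NAMECACHE is modelled as a plain dictionary and the DHT as a caller-supplied function; both, and the function name, are hypothetical simplifications.

```python
def resolve_block(query, namecache, dht_get):
    """Return (block, source), consulting the NAMECACHE before the DHT.

    namecache: dict mapping query hashes to (still encrypted) blocks.
    dht_get:   function performing the expensive DHT lookup, or None on miss.
    """
    block = namecache.get(query)
    if block is not None:
        return block, "namecache"
    block = dht_get(query)
    if block is not None:
        # Remember the requested block so the next lookup stays local.
        namecache[query] = block
    return block, "dht"
```

The blocks stay encrypted throughout; only a caller that knows the queried name and zone can decrypt them.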
6910 | |||
6911 | Blocks in the NAMECACHE share the same expiration mechanism as blocks in the | ||
6912 | DHT --- the block expires whenever any of the records in the (encrypted) block | ||
6913 | expires. The expiration time of the block is the only information stored in | ||
6914 | plaintext. The NAMECACHE service internally performs all of the required work | ||
6915 | to expire blocks; clients do not have to worry about this. Also, given that | ||
6916 | NAMECACHE stores only GNS blocks that local users requested, there is no | ||
6917 | configuration option to limit the size of the NAMECACHE. It is assumed to be | ||
6918 | always small enough (a few MB) to fit on the drive. | ||
6919 | |||
6920 | The NAMECACHE supports the use of different database backends via a plugin API. | ||
6921 | |||
6922 | @menu | ||
6923 | * libgnunetnamecache:: | ||
6924 | * The NAMECACHE Client-Service Protocol:: | ||
6925 | * The NAMECACHE Plugin API:: | ||
6926 | @end menu | ||
6927 | |||
6928 | @node libgnunetnamecache | ||
6929 | @subsection libgnunetnamecache | ||
6930 | |||
6931 | @c %**end of header | ||
6932 | |||
6933 | The NAMECACHE API consists of five simple functions. First, there is | ||
6934 | @code{GNUNET_NAMECACHE_connect} to connect to the NAMECACHE service. This | ||
6935 | returns the handle required for all other operations on the NAMECACHE. Using | ||
6936 | @code{GNUNET_NAMECACHE_block_cache} clients can insert a block into the cache. | ||
6937 | @code{GNUNET_NAMECACHE_lookup_block} can be used to look up blocks that were | ||
6938 | stored in the NAMECACHE. Both operations can be cancelled using | ||
6939 | @code{GNUNET_NAMECACHE_cancel}. Note that cancelling a | ||
6940 | @code{GNUNET_NAMECACHE_block_cache} operation can result in the block being | ||
6941 | stored in the NAMECACHE --- or not. Cancellation primarily ensures that the | ||
6942 | continuation function with the result of the operation will no longer be | ||
6943 | invoked. Finally, @code{GNUNET_NAMECACHE_disconnect} closes the connection to | ||
6944 | the NAMECACHE. | ||
6945 | |||
6946 | The maximum size of a block that can be stored in the NAMECACHE is | ||
6947 | @code{GNUNET_NAMECACHE_MAX_VALUE_SIZE}, which is defined to be 63 kB. | ||
6948 | |||
6949 | @node The NAMECACHE Client-Service Protocol | ||
6950 | @subsection The NAMECACHE Client-Service Protocol | ||
6951 | |||
6952 | @c %**end of header | ||
6953 | |||
6954 | All messages in the NAMECACHE IPC protocol start with the @code{struct | ||
6955 | GNUNET_NAMECACHE_Header} which adds a request ID (32-bit integer) to the | ||
6956 | standard message header. The request ID is used to match requests with the | ||
6957 | respective responses from the NAMECACHE, as they are allowed to happen | ||
6958 | out-of-order. | ||
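As a rough picture of such a header, the sketch below packs a size, a message type and a request ID in network byte order. The exact field layout (16-bit size, 16-bit type, 32-bit request ID) and the message type number used in the usage example are assumptions of this sketch; the authoritative definition is the C struct itself.

```python
import struct

# Assumed layout: 16-bit size, 16-bit type, 32-bit request ID, big-endian.
HEADER = struct.Struct("!HHI")


def pack_message(msg_type, request_id, payload=b""):
    """Prefix a payload with a header that carries the request ID."""
    size = HEADER.size + len(payload)
    return HEADER.pack(size, msg_type, request_id) + payload


def unpack_message(message):
    """Split a message into (size, type, request ID, payload)."""
    size, msg_type, request_id = HEADER.unpack_from(message)
    return size, msg_type, request_id, message[HEADER.size:size]
```

A client would keep the request ID it sent and compare it with the ID in each response to find the operation that the response belongs to.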
6959 | |||
6960 | |||
6961 | @menu | ||
6962 | * Lookup:: | ||
6963 | * Store:: | ||
6964 | @end menu | ||
6965 | |||
6966 | @node Lookup | ||
6967 | @subsubsection Lookup | ||
6968 | |||
6969 | @c %**end of header | ||
6970 | |||
6971 | The @code{struct LookupBlockMessage} is used to look up a block stored in the | ||
6972 | cache. It contains the query hash. The NAMECACHE always responds with a | ||
6973 | @code{struct LookupBlockResponseMessage}. If the NAMECACHE has no response, it | ||
6974 | sets the expiration time in the response to zero. Otherwise, the response is | ||
6975 | expected to contain the expiration time, the ECDSA signature, the derived key | ||
6976 | and the (variable-size) encrypted data of the block. | ||
6977 | |||
6978 | @node Store | ||
6979 | @subsubsection Store | ||
6980 | |||
6981 | @c %**end of header | ||
6982 | |||
6983 | The @code{struct BlockCacheMessage} is used to cache a block in the NAMECACHE. | ||
6984 | It has the same structure as the @code{struct LookupBlockResponseMessage}. The | ||
6985 | service responds with a @code{struct BlockCacheResponseMessage} which contains | ||
6986 | the result of the operation (success or failure). In the future, we might want | ||
6987 | to make it possible to provide an error message as well. | ||
6988 | |||
6989 | @node The NAMECACHE Plugin API | ||
6990 | @subsection The NAMECACHE Plugin API | ||
6991 | @c %**end of header | ||
6992 | |||
6993 | The NAMECACHE plugin API consists of two functions, @code{cache_block} to store | ||
6994 | a block in the database, and @code{lookup_block} to look up a block in the | ||
6995 | database. | ||
6996 | |||
6997 | |||
6998 | @menu | ||
6999 | * Lookup2:: | ||
7000 | * Store2:: | ||
7001 | @end menu | ||
7002 | |||
7003 | @node Lookup2 | ||
7004 | @subsubsection Lookup2 | ||
7005 | |||
7006 | @c %**end of header | ||
7007 | |||
7008 | The @code{lookup_block} function is expected to return at most one block to the | ||
7009 | iterator, and return @code{GNUNET_NO} if there were no non-expired results. If | ||
7010 | there are multiple non-expired results in the cache, the lookup is supposed to | ||
7011 | return the result with the largest expiration time. | ||
7012 | |||
7013 | @node Store2 | ||
7014 | @subsubsection Store2 | ||
7015 | |||
7016 | @c %**end of header | ||
7017 | |||
7018 | The @code{cache_block} function is expected to try to store the block in the | ||
7019 | database, and return @code{GNUNET_SYSERR} if this was not possible for any | ||
7020 | reason. Furthermore, @code{cache_block} is expected to implicitly perform cache | ||
7021 | maintenance and purge blocks from the cache that have expired. Note that | ||
7022 | @code{cache_block} might encounter the case where the database already has | ||
7023 | another block stored under the same key. In this case, the plugin must ensure | ||
7024 | that the block with the larger expiration time is preserved. Obviously, this | ||
7025 | can be done either by simply adding new blocks and selecting for the most recent | ||
7026 | expiration time during lookup, or by checking which block is more recent during | ||
7027 | the store operation. | ||
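The following in-memory sketch shows one way a plugin could meet both requirements, using the second strategy (comparing expiration times during the store operation). The class, its method signatures and the explicit @code{now} parameter are hypothetical; a real plugin implements the C plugin API and reports @code{GNUNET_NO} or @code{GNUNET_SYSERR} instead of the Python values used here.

```python
class MemoryNamecachePlugin:
    """Toy NAMECACHE backend keeping at most one block per query."""

    def __init__(self):
        self._blocks = {}  # query -> (expiration, data)

    def cache_block(self, query, expiration, data, now):
        """Store a block, purging expired entries as a side effect."""
        self._blocks = {
            q: block for q, block in self._blocks.items() if block[0] > now
        }
        existing = self._blocks.get(query)
        # Keep whichever block has the larger expiration time.
        if existing is None or expiration > existing[0]:
            self._blocks[query] = (expiration, data)
        return True

    def lookup_block(self, query, now):
        """Return the non-expired block for a query, or None."""
        block = self._blocks.get(query)
        if block is None or block[0] <= now:
            return None
        return block
```

Resolving the conflict at store time keeps lookups trivial, at the cost of one extra read per store.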
7028 | |||
7029 | @node The REVOCATION Subsystem | ||
7030 | @section The REVOCATION Subsystem | ||
7031 | @c %**end of header | ||
7032 | |||
7033 | The REVOCATION subsystem is responsible for key revocation of Egos. If a user | ||
7034 | learns that his private key has been compromised, or has lost it, he can use the | ||
7035 | REVOCATION system to inform all of the other users that this private key is no | ||
7036 | longer valid. The subsystem thus includes ways to query for the validity of | ||
7037 | keys and to propagate revocation messages. | ||
7038 | |||
7039 | @menu | ||
7040 | * Dissemination:: | ||
7041 | * Revocation Message Design Requirements:: | ||
7042 | * libgnunetrevocation:: | ||
7043 | * The REVOCATION Client-Service Protocol:: | ||
7044 | * The REVOCATION Peer-to-Peer Protocol:: | ||
7045 | @end menu | ||
7046 | |||
7047 | @node Dissemination | ||
7048 | @subsection Dissemination | ||
7049 | |||
7050 | @c %**end of header | ||
7051 | |||
7052 | When a revocation is performed, the revocation is first of all disseminated by | ||
7053 | flooding the overlay network. The goal is to reach every peer, so that when a | ||
7054 | peer needs to check if a key has been revoked, this will be purely a local | ||
7055 | operation where the peer looks at his local revocation list. Flooding the | ||
7056 | network is also the most robust form of key revocation --- an adversary would | ||
7057 | have to control a separator of the overlay graph to restrict the propagation of | ||
7058 | the revocation message. Flooding is also very easy to implement --- peers that | ||
7059 | receive a revocation message for a key that they have never seen before simply | ||
7060 | pass the message to all of their neighbours. | ||
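The flooding rule can be simulated on a small overlay graph. This sketch is a model only: the adjacency dictionary, the function name and the message counter are hypothetical, and validation of the revocation message is left out.

```python
from collections import deque


def flood(neighbours, origin, revocation, known):
    """Flood a revocation through the overlay and count the messages sent.

    neighbours: dict mapping each peer to a list of its neighbours.
    known:      dict mapping each peer to the set of revocations it has seen.
    """
    messages = 0
    known[origin].add(revocation)
    queue = deque([origin])
    while queue:
        peer = queue.popleft()
        for neighbour in neighbours[peer]:
            messages += 1
            if revocation not in known[neighbour]:
                # First sighting: remember it and pass it on later.
                known[neighbour].add(revocation)
                queue.append(neighbour)
    return messages
```

In this model every peer that learns the revocation sends it once over each of its links, so the number of messages grows with the number of edges reachable from the origin.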
7061 | |||
7062 | Flooding can only distribute the revocation message to peers that are online. | ||
7063 | In order to notify peers that join the network later, the revocation service | ||
7064 | performs efficient set reconciliation over the sets of known revocation | ||
7065 | messages whenever two peers (that both support REVOCATION dissemination) | ||
7066 | connect. The SET service is used to perform this operation | ||
7067 | efficiently. | ||
7068 | |||
7069 | @node Revocation Message Design Requirements | ||
7070 | @subsection Revocation Message Design Requirements | ||
7071 | |||
7072 | @c %**end of header | ||
7073 | |||
7074 | However, flooding is also quite costly, creating O(|E|) messages on a network | ||
7075 | with |E| edges. Thus, revocation messages are required to contain a | ||
7076 | proof-of-work, the result of an expensive computation (which, however, is cheap | ||
7077 | to verify). Only peers that have expended the CPU time necessary to provide | ||
7078 | this proof will be able to flood the network with the revocation message. This | ||
7079 | ensures that an attacker cannot simply flood the network with millions of | ||
7080 | revocation messages. The proof-of-work required by GNUnet is set to take days | ||
7081 | on a typical PC to compute; if the ability to quickly revoke a key is needed, | ||
7082 | users have the option to pre-compute revocation messages to store off-line and | ||
7083 | use instantly once their key has been compromised. | ||
7084 | |||
7085 | Revocation messages must also be signed by the private key that is being | ||
7086 | revoked. Thus, they can only be created while the private key is in the | ||
7087 | possession of the respective user. This is another reason to create a | ||
7088 | revocation message ahead of time and store it in a secure location. | ||
7089 | |||
7090 | @node libgnunetrevocation | ||
7091 | @subsection libgnunetrevocation | ||
7092 | |||
7093 | @c %**end of header | ||
7094 | |||
7095 | The REVOCATION API consists of two parts, to query and to issue | ||
7096 | revocations. | ||
7097 | |||
7098 | |||
7099 | @menu | ||
7100 | * Querying for revoked keys:: | ||
7101 | * Preparing revocations:: | ||
7102 | * Issuing revocations:: | ||
7103 | @end menu | ||
7104 | |||
7105 | @node Querying for revoked keys | ||
7106 | @subsubsection Querying for revoked keys | ||
7107 | |||
7108 | @c %**end of header | ||
7109 | |||
7110 | @code{GNUNET_REVOCATION_query} is used to check if a given ECDSA public key has | ||
7111 | been revoked. The given callback will be invoked with the result of the check. | ||
7112 | The query can be cancelled using @code{GNUNET_REVOCATION_query_cancel} on the | ||
7113 | return value. | ||
7114 | |||
7115 | @node Preparing revocations | ||
7116 | @subsubsection Preparing revocations | ||
7117 | |||
7118 | @c %**end of header | ||
7119 | |||
7120 | It is often desirable to create a revocation record ahead-of-time and store it | ||
7121 | in an off-line location to be used later in an emergency. This is particularly | ||
7122 | true for GNUnet revocations, where performing the revocation operation itself | ||
7123 | is computationally expensive and thus is likely to take some time. Thus, if | ||
7124 | users want the ability to perform revocations quickly in an emergency, they | ||
7125 | must pre-compute the revocation message. The revocation API enables this with | ||
7126 | two functions that are used to compute the revocation message, but not trigger | ||
7127 | the actual revocation operation. | ||
7128 | |||
7129 | @code{GNUNET_REVOCATION_check_pow} should be used to calculate the | ||
7130 | proof-of-work required in the revocation message. This function takes the | ||
7131 | public key, the required number of bits for the proof of work (which in GNUnet | ||
7132 | is a network-wide constant) and finally a proof-of-work number as arguments. | ||
7133 | The function then checks if the given proof-of-work number is a valid proof of | ||
7134 | work for the given public key. Clients preparing a revocation are expected to | ||
7135 | call this function repeatedly (typically with a monotonically increasing | ||
7136 | sequence of numbers of the proof-of-work number) until a given number satisfies | ||
7137 | the check. That number should then be saved for later use in the revocation | ||
7138 | operation. | ||
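The search loop described above can be sketched as follows. Note the assumptions: SHA-512 over the proof-of-work number followed by the public key is only a stand-in for the hash that GNUnet actually uses, the leading-zero-bits criterion is how this sketch interprets "required number of bits", and all function names are hypothetical.

```python
import hashlib


def leading_zero_bits(digest):
    """Count the zero bits at the start of a byte string."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def check_pow(public_key, pow_number, required_bits):
    """Check whether pow_number is a valid proof of work for the key."""
    data = pow_number.to_bytes(8, "big") + public_key
    digest = hashlib.sha512(data).digest()  # stand-in hash, an assumption
    return leading_zero_bits(digest) >= required_bits


def find_pow(public_key, required_bits, start=0):
    """Try increasing numbers until one satisfies the check."""
    pow_number = start
    while not check_pow(public_key, pow_number, required_bits):
        pow_number += 1
    return pow_number
```

Each additional required bit doubles the expected number of attempts, while verifying a given number still costs a single hash computation.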
7139 | |||
7140 | @code{GNUNET_REVOCATION_sign_revocation} is used to generate the signature that | ||
7141 | is required in a revocation message. It takes the private key that (possibly in | ||
7142 | the future) is to be revoked and returns the signature. The signature can again | ||
7143 | be saved to disk for later use, which will then allow performing a revocation | ||
7144 | even without access to the private key. | ||
7145 | |||
7146 | @node Issuing revocations | ||
7147 | @subsubsection Issuing revocations | ||
7148 | |||
7149 | |||
7150 | Given an ECDSA public key, the signature from @code{GNUNET_REVOCATION_sign_revocation} and | ||
7151 | the proof-of-work, @code{GNUNET_REVOCATION_revoke} can be used to perform the | ||
7152 | actual revocation. The given callback is called upon completion of the | ||
7153 | operation. @code{GNUNET_REVOCATION_revoke_cancel} can be used to stop the | ||
7154 | library from calling the continuation; however, in that case it is undefined | ||
7155 | whether or not the revocation operation will be executed. | ||
7156 | |||
7157 | @node The REVOCATION Client-Service Protocol | ||
7158 | @subsection The REVOCATION Client-Service Protocol | ||
7159 | |||
7160 | |||
7161 | The REVOCATION protocol consists of four simple messages. | ||
7162 | |||
7163 | A @code{QueryMessage} containing a public ECDSA key is used to check if a | ||
7164 | particular key has been revoked. The service responds with a | ||
7165 | @code{QueryResponseMessage} which simply contains a bit that says if the given | ||
7166 | public key is still valid, or if it has been revoked. | ||
7167 | |||
7168 | The second possible interaction is for a client to revoke a key by passing a | ||
7169 | @code{RevokeMessage} to the service. The @code{RevokeMessage} contains the | ||
7170 | ECDSA public key to be revoked, a signature by the corresponding private key | ||
7171 | and the proof-of-work. The service responds with a | ||
7172 | @code{RevocationResponseMessage} which can be used to indicate that the | ||
7173 | @code{RevokeMessage} was invalid (e.g. proof of work incorrect), or otherwise | ||
7174 | indicates that the revocation has been processed successfully. | ||
7175 | |||
7176 | @node The REVOCATION Peer-to-Peer Protocol | ||
7177 | @subsection The REVOCATION Peer-to-Peer Protocol | ||
7178 | |||
7179 | @c %**end of header | ||
7180 | |||
7181 | Revocation uses two disjoint ways to spread revocation information among peers. | ||
7182 | First of all, P2P gossip exchanged via CORE-level neighbours is used to quickly | ||
7183 | spread revocations to all connected peers. Second, whenever two peers (that | ||
7184 | both support revocations) connect, the SET service is used to compute the union | ||
7185 | of the respective revocation sets. | ||
7186 | |||
7187 | In both cases, the exchanged messages are @code{RevokeMessage}s which contain | ||
7188 | the public key that is being revoked, a matching ECDSA signature, and a | ||
7189 | proof-of-work. Whenever a peer learns about a new revocation this way, it first | ||
7190 | validates the signature and the proof-of-work, then stores it to disk | ||
7191 | (typically to a file $GNUNET_DATA_HOME/revocation.dat) and finally spreads the | ||
7192 | information to all directly connected neighbours. | ||
7193 | |||
7194 | For computing the union using the SET service, the peer with the smaller hashed | ||
7195 | peer identity will connect (as a "client" in the two-party set protocol) to the | ||
7196 | other peer after one second (to reduce traffic spikes on connect) and initiate | ||
7197 | the computation of the set union. All revocation services use a common hash to | ||
7198 | identify the SET operation over revocation sets. | ||
7199 | |||
7200 | The current implementation accepts revocation set union operations from all | ||
7201 | peers at any time; however, well-behaved peers should only initiate this | ||
7202 | operation once after establishing a connection to a peer with a larger hashed | ||
7203 | peer identity. | ||
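The rule for deciding which peer initiates the set union can be sketched as a comparison of hashed identities. SHA-512 and the byte-wise comparison are assumptions of this sketch, and the function names are hypothetical; the plain Python set union merely stands in for the result that the SET service computes efficiently.

```python
import hashlib


def hashed_identity(peer_identity):
    """Hash a peer identity (stand-in hash function, an assumption)."""
    return hashlib.sha512(peer_identity).digest()


def initiates_set_union(my_identity, other_identity):
    """True if this peer has the smaller hashed identity and acts as client."""
    return hashed_identity(my_identity) < hashed_identity(other_identity)


def union_of_revocations(mine, theirs):
    """Result both peers should hold once the set union has completed."""
    return set(mine) | set(theirs)
```

Because the comparison is strict, exactly one of two distinct peers initiates the operation on a given connection.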
7204 | |||
7205 | @node GNUnet's File-sharing (FS) Subsystem | ||
7206 | @section GNUnet's File-sharing (FS) Subsystem | ||
7207 | |||
7208 | @c %**end of header | ||
7209 | |||
7210 | This chapter describes the details of how the file-sharing service works. As | ||
7211 | with all services, it is split into an API (libgnunetfs), the service process | ||
7212 | (gnunet-service-fs) and user interface(s). The file-sharing service uses the | ||
7213 | datastore service to store blocks and the DHT (and indirectly datacache) for | ||
7214 | lookups for non-anonymous file-sharing.@ Furthermore, the file-sharing service | ||
7215 | uses the block library (and the block fs plugin) for validation of DHT | ||
7216 | operations. | ||
7217 | |||
7218 | In contrast to many other services, libgnunetfs is rather complex since the | ||
7219 | client library includes a large number of high-level abstractions; this is | ||
7220 | necessary since the FS service itself largely only operates on the block level. | ||
7221 | The FS library is responsible for providing a file-based abstraction to | ||
7222 | applications, including directories, meta data, keyword search, verification, | ||
7223 | and so on. | ||
7224 | |||
7225 | The method used by GNUnet to break large files into blocks and to use keyword | ||
7226 | search is called the "Encoding for Censorship Resistant Sharing" (ECRS). ECRS | ||
7227 | is largely implemented in the FS library; block validation is also reflected in | ||
7228 | the block FS plugin and the FS service. ECRS on-demand encoding is implemented | ||
7229 | in the FS service. | ||
7230 | |||
7231 | NOTE: The documentation in this chapter is quite incomplete. | ||
7232 | |||
7233 | @menu | ||
7234 | * Encoding for Censorship-Resistant Sharing (ECRS):: | ||
7235 | * File-sharing persistence directory structure:: | ||
7236 | @end menu | ||
7237 | |||
7238 | @node Encoding for Censorship-Resistant Sharing (ECRS) | ||
7239 | @subsection Encoding for Censorship-Resistant Sharing (ECRS) | ||
7240 | |||
7241 | @c %**end of header | ||
7242 | |||
7243 | When GNUnet shares files, it uses a content encoding that is called ECRS, the | ||
7244 | Encoding for Censorship-Resistant Sharing. Most of ECRS is described in the | ||
7245 | (so far unpublished) research paper attached to this page. ECRS obsoletes the | ||
7246 | previous ESED and ESED II encodings which were used in GNUnet before version | ||
7247 | 0.7.0. The rest of this page assumes that the reader is familiar with the | ||
7248 | attached paper. What follows is a description of some minor extensions that | ||
7249 | GNUnet makes over what is described in the paper. The reason why these | ||
7250 | extensions are not in the paper is that we felt that they were obvious or | ||
7251 | trivial extensions to the original scheme and thus did not warrant space in | ||
7252 | the research report. | ||
7253 | |||
7254 | |||
7255 | @menu | ||
7256 | * Namespace Advertisements:: | ||
7257 | * KSBlocks:: | ||
7258 | @end menu | ||
7259 | |||
7260 | @node Namespace Advertisements | ||
7261 | @subsubsection Namespace Advertisements | ||
7262 | |||
7263 | @c %**end of header | ||
7264 | |||
7265 | An @code{SBlock} with identifier ``all zeros'' is a signed | ||
7266 | advertisement for a namespace. This special @code{SBlock} contains metadata | ||
7267 | describing the content of the namespace. Instead of the name of the identifier | ||
7268 | for a potential update, it contains the identifier for the root of the | ||
7269 | namespace. The URI should always be empty. The @code{SBlock} is signed with | ||
7270 | the content provider's RSA private key (just like any other SBlock). Peers | ||
7271 | can search for @code{SBlock}s in order to find out more about a namespace. | ||
7272 | |||
7273 | @node KSBlocks | ||
7274 | @subsubsection KSBlocks | ||
7275 | |||
7276 | @c %**end of header | ||
7277 | |||
7278 | GNUnet implements @code{KSBlocks} which are @code{KBlocks} that, instead of | ||
7279 | encrypting a CHK and metadata, encrypt an @code{SBlock} instead. In other | ||
7280 | words, @code{KSBlocks} enable GNUnet to find @code{SBlocks} using the global | ||
7281 | keyword search. Usually the encrypted @code{SBlock} is a namespace | ||
7282 | advertisement. The rationale behind @code{KSBlock}s and @code{SBlock}s is to | ||
7283 | enable peers to discover namespaces via keyword searches and to associate | ||
7284 | useful information with namespaces. When GNUnet finds @code{KSBlocks} during a | ||
7285 | normal keyword search, it adds the information to an internal list of | ||
7286 | discovered namespaces. Users looking for interesting namespaces can then | ||
7287 | inspect this list, reducing the need for out-of-band discovery of namespaces. | ||
7288 | Naturally, namespaces (or more specifically, namespace advertisements) can | ||
7289 | also be referenced from directories, but @code{KSBlock}s should make it easier | ||
7290 | to advertise namespaces for the owner of the pseudonym since they eliminate | ||
7291 | the need to first create a directory. | ||
7292 | |||
7293 | Collections are also advertised using @code{KSBlock}s. | ||
7294 | |||
7295 | @table @asis | ||
7296 | @item Attachment Size | ||
7297 | @item ecrs.pdf 270.68 KB | ||
7298 | @item https://gnunet.org/sites/default/files/ecrs.pdf | ||
7299 | @end table | ||
7300 | |||
7301 | @node File-sharing persistence directory structure | ||
7302 | @subsection File-sharing persistence directory structure | ||
7303 | |||
7304 | @c %**end of header | ||
7305 | |||
7306 | This section documents how the file-sharing library implements persistence of | ||
7307 | file-sharing operations and specifically the resulting directory structure. | ||
7308 | This code is only active if the @code{GNUNET_FS_FLAGS_PERSISTENCE} flag was set | ||
7309 | when calling @code{GNUNET_FS_start}. In this case, the file-sharing library | ||
7310 | will try hard to ensure that all major operations (searching, downloading, | ||
7311 | publishing, unindexing) are persistent, that is, can live longer than the | ||
7312 | process itself. More specifically, an operation is supposed to live until it is | ||
7313 | explicitly stopped. | ||
7314 | |||
7315 | If @code{GNUNET_FS_stop} is called before an operation has been stopped, a | ||
7316 | @code{SUSPEND} event is generated; the next time the process calls | ||
7317 | @code{GNUNET_FS_start}, a @code{RESUME} event is generated. | ||
7318 | Additionally, even if an application crashes (segfault, SIGKILL, system crash) | ||
7319 | and hence @code{GNUNET_FS_stop} is never called and no @code{SUSPEND} events | ||
7320 | are generated, operations are still resumed (with @code{RESUME} events). This | ||
7321 | is implemented by constantly writing the current state of the file-sharing | ||
7322 | operations to disk. Specifically, the current state is always written to disk | ||
7323 | whenever anything significant changes (the exception is block-wise progress in | ||
7324 | publishing and unindexing, since those operations would be slowed down | ||
7325 | significantly and can be resumed cheaply even without detailed accounting). | ||
7326 | Note that if the process crashes (or is killed) during a serialization | ||
7327 | operation, FS does not guarantee that this specific operation is recoverable | ||
7328 | (no strict transactional semantics, again for performance reasons). However, | ||
7329 | all other unrelated operations should resume nicely. | ||
7330 | |||
7331 | Since we need to serialize the state continuously and want to recover as much | ||
7332 | as possible even after crashing during a serialization operation, we do not use | ||
7333 | one large file for serialization. Instead, several directories are used for the | ||
7334 | various operations. When @code{GNUNET_FS_start} executes, the master | ||
7335 | directories are scanned for files describing operations to resume. Sometimes, | ||
7336 | these operations can refer to related operations in child directories which may | ||
7337 | also be resumed at this point. Note that corrupted files are cleaned up | ||
7338 | automatically. However, dangling files in child directories (those that are not | ||
7339 | referenced by files from the master directories) are not automatically removed. | ||
7340 | |||
7341 | Persistence data is kept in a directory that begins with the "STATE_DIR" prefix | ||
7342 | from the configuration file (by default, "$SERVICEHOME/persistence/") followed | ||
7343 | by the name of the client as given to @code{GNUNET_FS_start} (for example, | ||
7344 | "gnunet-gtk") followed by the actual name of the master or child directory. | ||
7345 | |||
7346 | The names for the master directories follow the names of the operations: | ||
7347 | |||
7348 | @itemize @bullet | ||
7349 | @item "search" | ||
7350 | @item "download" | ||
7351 | @item "publish" | ||
7352 | @item "unindex" | ||
7353 | @end itemize | ||
7354 | |||
7355 | Each master directory contains one file, with a randomly chosen name, for each | ||
7356 | active top-level (master) operation. Note that a download that is associated | ||
7357 | with a search result is not a top-level operation. | ||
7358 | |||
7359 | In contrast to the master directories, the child directories are only consulted | ||
7360 | when another operation refers to them. For each search, a subdirectory (named | ||
7361 | after the master search synchronization file) contains the search results. | ||
7362 | Search results can have an associated download, which is then stored in the | ||
7363 | general "download-child" directory. Downloads can be recursive, in which case | ||
7364 | children are stored in subdirectories mirroring the structure of the recursive | ||
7365 | download (either starting in the master "download" directory or in the | ||
7366 | "download-child" directory depending on how the download was initiated). For | ||
7367 | publishing operations, the "publish-file" directory contains information about | ||
7368 | the individual files and directories that are part of the publication. However, | ||
7369 | this directory structure is flat and does not mirror the structure of the | ||
7370 | publishing operation. Note that unindex operations cannot have associated child | ||
7371 | operations. | ||
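
As a rough illustration of the layout described above, the following sketch
shows the directories for a client named "gnunet-gtk" with the default
"STATE_DIR" prefix. It assumes that the prefix, the client name and the
directory name are joined as ordinary path components. The comments only
summarize the text above, and no file names are shown since those are chosen
at random.

@example
$SERVICEHOME/persistence/    # default STATE_DIR prefix
  gnunet-gtk/                # client name given to GNUNET_FS_start
    search/                  # master: top-level searches
    download/                # master: top-level downloads
    publish/                 # master: top-level publishing operations
    unindex/                 # master: unindex operations (no children)
    download-child/          # child: downloads started from search results
    publish-file/            # child: flat list of published files/directories
@end example

Only the four master directories are scanned when @code{GNUNET_FS_start}
executes; the two child directories are read only when a master entry refers
to them.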
7372 | |||
7373 | @node GNUnet's REGEX Subsystem | ||
7374 | @section GNUnet's REGEX Subsystem | ||
7375 | |||
7376 | @c %**end of header | ||
7377 | |||
7378 | Using the REGEX subsystem, you can discover peers that offer a particular | ||
7379 | service using regular expressions. The peers that offer a service specify it | ||
7380 | using a regular expression. Peers that want to patronize a service search | ||
7381 | using a string. The REGEX subsystem will then use the DHT to return a set of | ||
7382 | matching offerers to the patrons. | ||
7383 | |||
7384 | For the technical details, see Max's defense talk and Max's Master's | ||
7385 | thesis. An additional publication is under preparation and available to team | ||
7386 | members (in Git). | ||
7387 | |||
7388 | @menu | ||
7389 | * How to run the regex profiler:: | ||
7390 | @end menu | ||
7391 | |||
7392 | @node How to run the regex profiler | ||
7393 | @subsection How to run the regex profiler | ||
7394 | |||
7395 | @c %**end of header | ||
7396 | |||
7397 | The gnunet-regex-profiler can be used to profile the usage of mesh/regex for a | ||
7398 | given set of regular expressions and strings. Mesh/regex allows you to announce | ||
7399 | your peer ID under a certain regex and search for peers matching a particular | ||
7400 | regex using a string. See https://gnunet.org/szengel2012ms for a full | ||
7401 | introduction. | ||
7402 | |||
7403 | The regex profiler uses GNUnet testbed, so all requirements of testbed | ||
7404 | also apply to the regex profiler (for example, you need | ||
7405 | password-less SSH login to the machines listed in your hosts file). | ||
7406 | |||
7407 | @strong{Configuration} | ||
7408 | |||
7409 | An appropriate configuration file is needed. For an example | ||
7410 | configuration, see contrib/regex_profiler_infiniband.conf in SVN HEAD. | ||
7411 | The following paragraphs highlight the important details of such a | ||
7412 | configuration. | ||
7413 | |||
7414 | The regular expressions are announced by the | ||
7415 | gnunet-daemon-regexprofiler, so you have to make sure it is started by | ||
7416 | adding it to the AUTOSTART set of ARM: | ||
7417 | @example | ||
7418 | [regexprofiler] | ||
7419 | AUTOSTART = YES | ||
7420 | @end example | ||
7421 | |||
7422 | Furthermore you have to specify the location of the binary: | ||
7423 | @example | ||
7424 | [regexprofiler] | ||
7425 | # Location of the gnunet-daemon-regexprofiler binary. | ||
7426 | BINARY = /home/szengel/gnunet/src/mesh/.libs/gnunet-daemon-regexprofiler | ||
7427 | # Regex prefix that will be applied to all regular expressions and | ||
7428 | # search strings. | ||
7429 | REGEX_PREFIX = "GNVPN-0001-PAD" | ||
7430 | @end example | ||
7431 | |||
7432 | When running the profiler with a large scale deployment, you probably want to | ||
7433 | reduce the workload of each peer. Use the following options to do this: | ||
7434 | @example | ||
7435 | [dht] | ||
7436 | # Force network size estimation | ||
7437 | FORCE_NSE = 1 | ||
7438 | |||
7439 | [dhtcache] | ||
7440 | DATABASE = heap | ||
7441 | # Disable RC-file for Bloom filter? (for benchmarking with limited IO | ||
7442 | # availability) | ||
7443 | DISABLE_BF_RC = YES | ||
7444 | # Disable Bloom filter entirely | ||
7445 | DISABLE_BF = YES | ||
7446 | |||
7447 | [nse] | ||
7448 | # Minimize proof-of-work CPU consumption by NSE | ||
7449 | WORKBITS = 1 | ||
7450 | @end example | ||
7451 | |||
7452 | |||
7453 | @strong{Options} | ||
7454 | |||
7455 | To run the profiler, specify the options and input data on the command line: | ||
7456 | @example | ||
7457 | gnunet-regex-profiler -c config-file -d log-file -n num-links \ | ||
7458 |     -p path-compression-length -s search-delay -t matching-timeout \ | ||
7459 |     -a num-search-strings hosts-file policy-dir search-strings-file | ||
7460 | @end example | ||
7461 | |||
7462 | @code{config-file}: the configuration file created earlier.@* | ||
7463 | @code{log-file}: the file to which statistics output is written.@* | ||
7464 | @code{num-links}: the number of random links between started peers.@* | ||
7465 | @code{path-compression-length}: the maximum path compression length in the DFA.@* | ||
7466 | @code{search-delay}: the time to wait after the peers finish linking before matching strings.@* | ||
7467 | @code{matching-timeout}: the timeout after which the search is cancelled.@* | ||
7468 | @code{num-search-strings}: the number of strings in the search-strings-file. | ||
7469 | |||
7470 | The @code{hosts-file} should contain a list of hosts for the testbed, one per | ||
7471 | line, in the following format: @code{user@@host_ip:port}. | ||
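
For illustration, a hosts file for three machines might look like the
following sketch. The user name, the addresses (taken from a range reserved
for documentation) and the port are hypothetical placeholders, not values
required by the profiler.

@example
gnunet@@192.0.2.10:22
gnunet@@192.0.2.11:22
gnunet@@192.0.2.12:22
@end example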
7472 | |||
7473 | The @code{policy-dir} is a folder of text files, each containing one or more | ||
7474 | regular expressions. A peer is started for each file in that folder and the | ||
7475 | regular expressions in the corresponding file are announced by this peer. | ||
7476 | |||
7477 | The @code{search-strings-file} is a text file containing search strings, one | ||
7478 | per line. | ||
7479 | |||
7480 | You can create regular expressions and search strings for every AS in the | ||
7481 | Internet using the attached scripts. You need one of the | ||
7482 | @uref{http://data.caida.org/datasets/routing/routeviews-prefix2as/, CAIDA | ||
7483 | routeviews prefix2as} data files for this. Run @code{create_regex.py <filename> | ||
7484 | <output path>} to create the regular expressions and @code{create_strings.py | ||
7485 | <input path> <outfile>} to create a search strings file from the previously | ||
7486 | created regular expressions. | ||
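
The two steps might be run as in the following sketch. The data file name,
the output directory and the output file name are hypothetical placeholders,
and calling the scripts through @code{python} is an assumption about how they
are started.

@example
# Step 1: create the regular expressions from a prefix2as data file.
python create_regex.py routeviews.pfx2as policies/
# Step 2: create a search strings file from those regular expressions.
python create_strings.py policies/ search-strings.txt
@end example

The resulting directory and file can then be passed to the profiler as
@code{policy-dir} and @code{search-strings-file}.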