Diffstat (limited to 'doc/chapters/developer.texi')
-rw-r--r-- doc/chapters/developer.texi 7486
1 files changed, 7486 insertions, 0 deletions
diff --git a/doc/chapters/developer.texi b/doc/chapters/developer.texi
new file mode 100644
index 000000000..ce6b16087
--- /dev/null
+++ b/doc/chapters/developer.texi
@@ -0,0 +1,7486 @@
1@c ***************************************************************************
2@node GNUnet Developer Handbook
3@chapter GNUnet Developer Handbook
4
5This book is intended to be an introduction for programmers that want to
6extend the GNUnet framework. GNUnet is more than a simple peer-to-peer
7application. For developers, GNUnet is:
8
9@itemize @bullet
10@item Free software under the GNU General Public License, with a community
11that believes in the GNU philosophy
12@item
13A set of standards, including coding conventions and architectural rules
14@item
15A set of layered protocols, specifying both the communication between peers and
16the communication between components of a single peer
17@item
18A set of libraries with well-defined APIs suitable for writing extensions
19@end itemize
20
21In particular, the architecture specifies that a peer consists of many
22processes communicating via protocols. Processes can be written in almost
23any language. C and Java APIs exist for accessing existing services and for
24writing extensions. It is possible to write extensions in other languages by
25implementing the necessary IPC protocols.
26
27GNUnet can be extended and improved along many possible dimensions, and anyone
28interested in free software and freedom-enhancing networking is welcome to
29join the effort. This developer handbook attempts to provide an initial
30introduction to some of the key design choices and central components of the
31system. This manual is far from complete, and we welcome informed
32contributions, be it in the form of new chapters or insightful comments.
33
34However, the website is experiencing a constant onslaught of sophisticated
35link-spam entered manually by exploited workers solving puzzles and
36customizing text. To limit this commercial defacement, we are strictly
37moderating comments and have disallowed "normal" users from posting new
38content. However, this is really only intended to keep the spam at bay. If
39you are a real user or aspiring developer, please drop us a note (IRC, e-mail,
40contact form) with your user profile ID number included. We will then relax
41these restrictions on your account. We're sorry for this inconvenience;
42however, few people would want to read this site if 99% of it was
43advertisements for bogus websites.
44
45
46
47@c ***************************************************************************
48
49
50
51
52
53
54
55
56@menu
57* Developer Introduction::
58* Code overview::
59* System Architecture::
60* Subsystem stability::
61* Naming conventions and coding style guide::
62* Build-system::
63* Developing extensions for GNUnet using the gnunet-ext template::
64* Writing testcases::
65* GNUnet's TESTING library::
66* Performance regression analysis with Gauger::
67* GNUnet's TESTBED Subsystem::
68* libgnunetutil::
69* The Automatic Restart Manager (ARM)::
70* GNUnet's TRANSPORT Subsystem::
71* NAT library::
72* Distance-Vector plugin::
73* SMTP plugin::
74* Bluetooth plugin::
75* WLAN plugin::
76* The ATS Subsystem::
77* GNUnet's CORE Subsystem::
78* GNUnet's CADET subsystem::
79* GNUnet's NSE subsystem::
80* GNUnet's HOSTLIST subsystem::
81* GNUnet's IDENTITY subsystem::
82* GNUnet's NAMESTORE Subsystem::
83* GNUnet's PEERINFO subsystem::
84* GNUnet's PEERSTORE subsystem::
85* GNUnet's SET Subsystem::
86* GNUnet's STATISTICS subsystem::
87* GNUnet's Distributed Hash Table (DHT)::
88* The GNU Name System (GNS)::
89* The GNS Namecache::
90* The REVOCATION Subsystem::
91* GNUnet's File-sharing (FS) Subsystem::
92* GNUnet's REGEX Subsystem::
93@end menu
94
95@node Developer Introduction
96@section Developer Introduction
97
98This developer handbook is intended as a first introduction to GNUnet for new
99developers who want to extend the GNUnet framework. After the introduction,
100each of the GNUnet subsystems (directories in the src/ tree) is (supposed to
101be) covered in its own chapter. In addition to this documentation, GNUnet
102developers should be aware of the services available to them on the GNUnet
103server.
104
105New developers can have a look at the GNUnet tutorials for C and Java available
106in the src/ directory of the repository or under the following links:
107
108@itemize @bullet
109@item GNUnet C tutorial
110@item GNUnet Java tutorial
111@end itemize
112
113In addition to this book, the GNUnet server contains various resources for
114GNUnet developers. They are all conveniently reachable via the "Developer"
115entry in the navigation menu. Some additional tools (such as static analysis
116reports) require special developer access to perform certain operations. If
117you feel you need access, you should contact
118@uref{http://grothoff.org/christian/, Christian Grothoff}, GNUnet's maintainer.
119
120The public subsystems on the GNUnet server that help developers are:
121
122@itemize @bullet
123@item The version control system keeps our code and enables distributed
124development. Only developers with write access can commit code; everyone else
125is encouraged to submit patches to the
126@uref{http://mail.gnu.org/mailman/listinfo/gnunet-developers, developer
127mailing list}.
128@item The GNUnet bug tracking system is used to track feature requests, open bug
129reports and their resolutions. Anyone can report bugs; only developers can
130claim to have fixed them.
131@item A buildbot is used to check GNUnet builds automatically on a range of
132platforms. Builds are triggered automatically after 30 minutes of no changes to
133Git.
134@item The current quality of our automated test suite is assessed using code
135coverage analysis. This analysis is run daily; however, the webpage is only
136updated if all automated tests pass at that time. Testcases that improve our
137code coverage are always welcome.
138@item We try to automatically find bugs using a static analysis scan. This scan
139is run daily; however, the webpage is only updated if all automated tests pass
140at the time. Note that not everything that is flagged by the analysis is a bug;
141sometimes even good code can be marked as possibly problematic. Nevertheless,
142developers are encouraged to at least be aware of all issues in their code that
143are listed.
144@item We use Gauger for automatic performance regression visualization. Details
145on how to use Gauger are here.
146@item We use @uref{http://junit.org/, junit} to automatically test gnunet-java.
147Automatically generated, current reports on the test suite are here.
148@item We use Cobertura to generate test coverage reports for gnunet-java.
149Current reports on test coverage are here.
150@end itemize
151
152
153
154@c ***************************************************************************
155@menu
156* Project overview::
157@end menu
158
159@node Project overview
160@subsection Project overview
161
162The GNUnet project currently consists of several sub-projects. This section
163gives an initial overview of the various sub-projects. Note
164that this description also lists projects that are far from complete, including
165even those that do not yet contain a single line of code.
166
167GNUnet sub-projects in order of likely relevance are currently:
168
169@table @asis
170
171@item svn/gnunet Core of the P2P framework, including file-sharing, VPN and
172chat applications; this is what the developer handbook covers mostly
173@item svn/gnunet-gtk/ Gtk+-based user interfaces, including gnunet-fs-gtk
174(file-sharing), gnunet-statistics-gtk (statistics over time),
175gnunet-peerinfo-gtk (information about current connections and known peers),
176gnunet-chat-gtk (chat GUI) and gnunet-setup (setup tool for "everything")
177@item svn/gnunet-fuse/ Mounting directories shared via GNUnet's file-sharing on Linux
178@item svn/gnunet-update/ Installation and update tool
179@item svn/gnunet-ext/
180Template for starting 'external' GNUnet projects
181@item svn/gnunet-java/ Java
182APIs for writing GNUnet services and applications
183@item svn/gnunet-www/ Code
184and media helping drive the GNUnet website
185@item svn/eclectic/ Code to run
186GNUnet nodes on testbeds for research, development, testing and evaluation
187@item svn/gnunet-qt/ qt-based GNUnet GUI (dead?)
188@item svn/gnunet-cocoa/
189cocoa-based GNUnet GUI (dead?)
190
191@end table
192
193We are also working on various supporting libraries and tools:
194
195@table @asis
196@item svn/Extractor/ GNU libextractor (meta data extraction)
197@item svn/libmicrohttpd/ GNU libmicrohttpd (embedded HTTP(S) server library)
198@item svn/gauger/ Tool for performance regression analysis
199@item svn/monkey/ Tool for automated debugging of distributed systems
200@item svn/libmwmodem/ Library for accessing satellite connection quality reports
201@end table
202
203Finally, there are various external projects (see links for a list of those
204that have a public website) which build on top of the GNUnet framework.
205
206@c ***************************************************************************
207@node Code overview
208@section Code overview
209
210This section gives a brief overview of the GNUnet source code. Specifically, we
211sketch the function of each of the subdirectories in the @code{gnunet/src/}
212directory. The order given is roughly bottom-up (in terms of the layers of the
213system).
214@table @asis
215
216@item util/ --- libgnunetutil Library with general utility functions; all
217GNUnet binaries link against this library. It covers everything from memory
218allocation and data structures to cryptography and inter-process
219communication. The goal is to provide an OS-independent interface and more
220'secure' or convenient implementations of commonly used primitives. The API is
221spread over more than a dozen headers; developers should study those closely to
222avoid duplicating existing functions.
223@item hello/ --- libgnunethello HELLO messages are used to
224describe under which addresses a peer can be reached (for example, protocol,
225IP, port). This library manages parsing and generating of HELLO messages.
226@item block/ --- libgnunetblock The DHT and other components of GNUnet store
227information in units called 'blocks'. Each block has a type and the type
228defines a particular format and how that binary format is to be linked to a
229hash code (the key for the DHT and for databases). The block library is a
230wrapper around block plugins which provide the necessary functions for each
231block type.
232@item statistics/ The statistics service enables associating
233values (of type uint64_t) with a component name and a string. The main uses are
234debugging (counting events), performance tracking and user entertainment (what
235did my peer do today?).
236@item arm/ The automatic-restart-manager (ARM) service
237is the GNUnet master service. Its role is to start gnunet-services, to restart
238them when they crash and finally to shut down the system when requested.
239@item peerinfo/ The peerinfo service keeps track of which peers are known to
240the local peer and also tracks the validated addresses of each of those peers
241(in the form of a HELLO message). The peer is not necessarily
242connected to all peers known to the peerinfo service. Peerinfo provides
243persistent storage for peer identities --- peers are not forgotten just because
244of a system restart.
245@item datacache/ --- libgnunetdatacache The datacache
246library provides (temporary) block storage for the DHT. Existing plugins can
247store blocks in Sqlite, Postgres or MySQL databases. All data stored in the
248cache is lost when the peer is stopped or restarted (datacache uses temporary
249tables).
250@item datastore/ The datastore service stores file-sharing blocks in
251databases for extended periods of time. In contrast to the datacache, data is
252not lost when peers restart. However, quota restrictions may still cause old,
253expired or low-priority data to be eventually discarded. Existing plugins can
254store blocks in Sqlite, Postgres or MySQL databases.
255@item template/ Template
256for writing a new service. Does nothing.
257@item ats/ The automatic transport
258selection (ATS) service is responsible for deciding which address (i.e. which
259transport plugin) should be used for communication with other peers, and at
260what bandwidth.
261@item nat/ --- libgnunetnat Library that provides basic
262functions for NAT traversal. The library supports NAT traversal with manual
263hole-punching by the user, UPnP and ICMP-based autonomous NAT traversal. The
264library also includes an API for testing if the current configuration works and
265the @code{gnunet-nat-server} which provides an external service to test the
266local configuration.
267@item fragmentation/ --- libgnunetfragmentation Some
268transports (UDP and WLAN, mostly) have restrictions on the maximum transfer
269unit (MTU) for packets. The fragmentation library can be used to break larger
270packets into chunks of at most 1k and transmit the resulting fragments
271reliably (with acknowledgement, retransmission, timeouts, etc.).
272@item transport/ The transport service is responsible for managing the basic P2P
273communication. It uses plugins to support P2P communication over TCP, UDP,
274HTTP, HTTPS and other protocols. The transport service validates peer addresses,
275enforces bandwidth restrictions, limits the total number of connections and
276enforces connectivity restrictions (e.g., friends-only).
277@item peerinfo-tool/
278This directory contains the gnunet-peerinfo binary which can be used to inspect
279the peers and HELLOs known to the peerinfo service.
280@item core/ The core
281service is responsible for establishing encrypted, authenticated connections
282with other peers, encrypting and decrypting messages and forwarding messages to
283higher-level services that are interested in them.
284@item testing/ ---
285libgnunettesting The testing library allows starting (and stopping) peers for
286writing testcases.
287It also supports automatic generation of configurations for
288peers ensuring that the ports and paths are disjoint. libgnunettesting is also
289the foundation for the testbed service.
290@item testbed/ The testbed service is
291used for creating small or large scale deployments of GNUnet peers for
292evaluation of protocols. It facilitates peer deployments on multiple hosts (for
293example, in a cluster) and establishing various network topologies (both
294underlay and overlay).
295@item nse/ The network size estimation (NSE) service
296implements a protocol for (securely) estimating the current size of the P2P
297network.
298@item dht/ The distributed hash table (DHT) service provides a
299distributed implementation of a hash table to store blocks under hash keys in
300the P2P network.
301@item hostlist/ The hostlist service allows learning about
302other peers in the network by downloading HELLO messages from an HTTP server,
303can be configured to run such an HTTP server and also implements a P2P protocol
304to advertise and automatically learn about other peers that offer a public
305hostlist server.
306@item topology/ The topology service is responsible for
307maintaining the mesh topology. It tries to maintain connections to friends
308(depending on the configuration) and also tries to ensure that the peer has a
309decent number of active connections at all times. If necessary, new connections
310are added. All peers should run the topology service, otherwise they may end up
311not being connected to any other peer (unless some other service ensures that
312core establishes the required connections). The topology service also tells the
313transport service which connections are permitted (for friend-to-friend
314networking).
315@item fs/ The file-sharing (FS) service implements GNUnet's
316file-sharing application. Both anonymous file-sharing (using gap) and
317non-anonymous file-sharing (using dht) are supported.
318@item cadet/ The CADET
319service provides a general-purpose routing abstraction to create end-to-end
320encrypted tunnels in mesh networks. We wrote a paper documenting key aspects of
321the design.
322@item tun/ --- libgnunettun Library for building IPv4, IPv6
323packets and creating checksums for UDP, TCP and ICMP packets. The header
324defines C structs for common Internet packet formats and in particular structs
325for interacting with TUN (virtual network) interfaces.
326@item mysql/ ---
327libgnunetmysql Library for creating and executing prepared MySQL statements and
328to manage the connection to the MySQL database. Essentially a lightweight
329wrapper for the interaction between GNUnet components and libmysqlclient.
330@item dns/ Service that allows intercepting and modifying DNS requests of the
331local machine. Currently used for IPv4-IPv6 protocol translation (DNS-ALG) as
332implemented by "pt/" and for the GNUnet naming system. The service can also be
333configured to offer an exit service for DNS traffic.
334@item vpn/ The virtual
335public network (VPN) service provides a virtual tunnel interface (VTUN) for IP
336routing over GNUnet. Needs some other peers to run an "exit" service to work.
337Can be activated using the "gnunet-vpn" tool or integrated with DNS using the
338"pt" daemon.
339@item exit/ Daemon to allow traffic from the VPN to exit this
340peer to the Internet or to specific IP-based services of the local peer.
341Currently, an exit service can only be restricted to IPv4 or IPv6, not to
342specific ports and/or IP address ranges. If this is not acceptable, additional
343firewall rules must be added manually. exit currently only works for normal
344UDP, TCP and ICMP traffic; DNS queries need to leave the system via a DNS
345service.
346@item pt/ Protocol translation daemon. This daemon enables 4-to-6,
3476-to-4, 4-over-6 or 6-over-4 transitions for the local system. It essentially
348uses "DNS" to intercept DNS replies and then maps results to those offered by
349the VPN, which then sends them using mesh to some daemon offering an
350appropriate exit service.
351@item identity/ Management of egos (alter egos) of a
352user; identities are essentially named ECC private keys and used for zones in
353the GNU name system and for namespaces in file-sharing, but might find other
354uses later.
355@item revocation/ Key revocation service, can be used to revoke the
356private key of an identity if it has been compromised
357@item namecache/ Cache
358for resolution results for the GNU name system; data is encrypted and can be
359shared among users, loss of the data should ideally only result in a
360performance degradation (persistence not required)
361@item namestore/ Database
362for the GNU name system with per-user private information, persistence required
363@item gns/ GNU name system, a GNU approach to DNS and PKI.
364@item dv/ A plugin
365for distance-vector (DV)-based routing. DV consists of a service and a
366transport plugin to provide peers with the illusion of a direct P2P connection
367for connections that use multiple (typically up to 3) hops in the actual
368underlay network.
369@item regex/ Service for the (distributed) evaluation of
370regular expressions.
371@item scalarproduct/ The scalar product service offers an
372API to perform a secure multiparty computation which calculates a scalar
373product between two peers without exposing the private input vectors of the
374peers to each other.
375@item consensus/ The consensus service will allow a set
376of peers to agree on a set of values via a distributed set union computation.
377@item rest/ The REST API allows access to GNUnet services using RESTful
378interaction. The services provide plugins that can be exposed by the REST server.
379@item experimentation/ The experimentation daemon coordinates distributed
380experimentation to evaluate transport and ATS properties.
381@end table
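
The chunking step described for the fragmentation library can be illustrated
with a small standalone sketch. It is not the libgnunetfragmentation API: the
names @code{MAX_FRAGMENT_SIZE}, @code{struct fragment},
@code{fragment_count} and @code{fragment_message} are hypothetical, and the
1024-byte limit is only an assumption derived from the "at most 1k" wording
above. Acknowledgement, retransmission and timeouts are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed chunk limit ("at most 1k" in the text); hypothetical name. */
#define MAX_FRAGMENT_SIZE 1024

/* Hypothetical descriptor: one fragment is a window into the message. */
struct fragment
{
  size_t offset; /* start of the fragment within the original message */
  size_t length; /* number of bytes in this fragment */
};

/* Number of fragments needed for a message of msg_len bytes. */
size_t
fragment_count (size_t msg_len)
{
  return (msg_len + MAX_FRAGMENT_SIZE - 1) / MAX_FRAGMENT_SIZE;
}

/* Fill out[] with the fragment windows for a message of msg_len bytes.
   Returns the number of fragments written, or 0 if out[] is too small
   (or if the message is empty). */
size_t
fragment_message (size_t msg_len, struct fragment *out, size_t max_out)
{
  size_t needed = fragment_count (msg_len);
  size_t offset = 0;
  size_t i;

  if (needed > max_out)
    return 0;
  for (i = 0; i < needed; i++)
  {
    size_t remaining = msg_len - offset;

    out[i].offset = offset;
    out[i].length = (remaining < MAX_FRAGMENT_SIZE)
                    ? remaining
                    : MAX_FRAGMENT_SIZE;
    offset += out[i].length;
  }
  return needed;
}
```

Under these assumptions a 2500-byte message yields three fragments: two full
1024-byte chunks followed by a final chunk holding the remaining bytes.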
382
383@c ***************************************************************************
384@node System Architecture
385@section System Architecture
386
387GNUnet developers like legos. The blocks are indestructible, can be stacked
388together to construct complex buildings and it is generally easy to swap one
389block for a different one that has the same shape. GNUnet's architecture is
390based on legos.
391
392
393
394This chapter documents the GNUnet lego system, also known as GNUnet's system
395architecture.
396
397The most common GNUnet component is a service. Services offer an API (or
398several, depending on what you count as "an API") which is implemented as a
399library. The library communicates with the main process of the service using a
400service-specific network protocol. The main process of the service typically
401doesn't fully provide everything that is needed --- it has holes to be filled
402by APIs to other services.
403
404User interfaces and daemons are a special kind of component in GNUnet. Like
405services, they have holes to be filled by APIs of other services. Unlike
406services, daemons do not implement their own network protocol and they have no
407API.
408
409The GNUnet system provides a range of services, daemons and user interfaces,
410which are then combined into a layered GNUnet instance (also known as a peer).
411
412Note that while it is generally possible to swap one service for another
413compatible service, there is often only one implementation. However, during
414development we often have a "new" version of a service in parallel with an
415"old" version. While the "new" version is not working, developers working on
416other parts of the service can continue their development by simply using the
417"old" service. Alternative design ideas can also be easily investigated by
418swapping out individual components. This is typically achieved by simply
419changing the name of the "BINARY" in the respective configuration section.
420
421Key properties of GNUnet services are that they must be separate processes and
422that they must protect themselves by applying tight error checking against the
423network protocol they implement (thereby achieving a certain degree of
424robustness).
425
426On the other hand, the APIs are implemented to tolerate failures of the
427service, isolating their host process from errors by the service. If the
428service process crashes, other services and daemons around it should not also
429fail, but instead wait for the service process to be restarted by ARM.
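
One way a service access library could tolerate a crashed service is to track
the connection state and retry with a growing delay until ARM has restarted
the service. The following standalone sketch only illustrates that idea. The
handle type, the function names and the doubling backoff policy with its
constants are all assumptions made for this example; they are not taken from
the GNUnet APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical backoff bounds, chosen only for illustration. */
#define BACKOFF_START_MS 100
#define BACKOFF_MAX_MS 8000

/* Hypothetical client-side state kept by a service access library. */
struct client_handle
{
  bool connected;          /* is the service currently reachable? */
  unsigned int backoff_ms; /* delay before the next reconnect attempt */
};

void
client_init (struct client_handle *h)
{
  h->connected = false;
  h->backoff_ms = BACKOFF_START_MS;
}

/* Called when the connection drops or a reconnect attempt fails.
   Returns how long to wait before trying again and doubles the delay
   for the following attempt, up to BACKOFF_MAX_MS. */
unsigned int
client_on_disconnect (struct client_handle *h)
{
  unsigned int delay = h->backoff_ms;

  h->connected = false;
  h->backoff_ms *= 2;
  if (h->backoff_ms > BACKOFF_MAX_MS)
    h->backoff_ms = BACKOFF_MAX_MS;
  return delay;
}

/* Called once the service answers again; resets the backoff. */
void
client_on_connect (struct client_handle *h)
{
  h->connected = true;
  h->backoff_ms = BACKOFF_START_MS;
}
```

The host process never treats a lost connection as fatal in this sketch. It
records the failure, waits for the returned delay and tries again, which
matches the isolation goal described above.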
430
431
432@c ***************************************************************************
433@node Subsystem stability
434@section Subsystem stability
435
436This page documents the current stability of the various GNUnet subsystems.
437Stability here describes the expected degree of compatibility with future
438versions of GNUnet. For each subsystem we distinguish between compatibility on
439the P2P network level (communication protocol between peers), the IPC level
440(communication between the service and the service library) and the API level
441(stability of the API). P2P compatibility is relevant in terms of which
442applications are likely going to be able to communicate with future versions of
443the network. IPC communication is relevant for the implementation of language
444bindings that re-implement the IPC messages. Finally, API compatibility is
445relevant to developers that hope to be able to avoid changes to applications
446built on top of the APIs of the framework.
447
448The following table summarizes our current view of the stability of the
449respective protocols or APIs:
450
451@multitable @columnfractions .20 .20 .20 .20
452@headitem Subsystem @tab P2P @tab IPC @tab C API
453@item util @tab n/a @tab n/a @tab stable
454@item arm @tab n/a @tab stable @tab stable
455@item ats @tab n/a @tab unstable @tab testing
456@item block @tab n/a @tab n/a @tab stable
457@item cadet @tab testing @tab testing @tab testing
458@item consensus @tab experimental @tab experimental @tab experimental
459@item core @tab stable @tab stable @tab stable
460@item datacache @tab n/a @tab n/a @tab stable
461@item datastore @tab n/a @tab stable @tab stable
462@item dht @tab stable @tab stable @tab stable
463@item dns @tab stable @tab stable @tab stable
464@item dv @tab testing @tab testing @tab n/a
465@item exit @tab testing @tab n/a @tab n/a
466@item fragmentation @tab stable @tab n/a @tab stable
467@item fs @tab stable @tab stable @tab stable
468@item gns @tab stable @tab stable @tab stable
469@item hello @tab n/a @tab n/a @tab testing
470@item hostlist @tab stable @tab stable @tab n/a
471@item identity @tab stable @tab stable @tab n/a
472@item multicast @tab experimental @tab experimental @tab experimental
473@item mysql @tab stable @tab n/a @tab stable
474@item namestore @tab n/a @tab stable @tab stable
475@item nat @tab n/a @tab n/a @tab stable
476@item nse @tab stable @tab stable @tab stable
477@item peerinfo @tab n/a @tab stable @tab stable
478@item psyc @tab experimental @tab experimental @tab experimental
479@item pt @tab n/a @tab n/a @tab n/a
480@item regex @tab stable @tab stable @tab stable
481@item revocation @tab stable @tab stable @tab stable
482@item social @tab experimental @tab experimental @tab experimental
483@item statistics @tab n/a @tab stable @tab stable
484@item testbed @tab n/a @tab testing @tab testing
485@item testing @tab n/a @tab n/a @tab testing
486@item topology @tab n/a @tab n/a @tab n/a
487@item transport @tab stable @tab stable @tab stable
488@item tun @tab n/a @tab n/a @tab stable
489@item vpn @tab testing @tab n/a @tab n/a
490@end multitable
491
492Here is a rough explanation of the values:
493
494@table @samp
495@item stable
496No incompatible changes are planned at this time; for IPC/APIs, if
497there are incompatible changes, they will be minor and might only require
498minimal changes to existing code; for P2P, changes will be avoided if at all
499possible for the 0.10.x-series
500
501@item testing
502No incompatible changes are
503planned at this time, but the code is still known to be in flux; so while we
504have no concrete plans, our expectation is that there will still be minor
505modifications; for P2P, changes will likely be extensions that should not break
506existing code
507
508@item unstable
509Changes are planned and will happen; however, they
510will not be totally radical and the result should still resemble what is there
511now; nevertheless, anticipated changes will break protocol/API compatibility
512
513@item experimental
514Changes are planned and the result may look nothing like
515what the API/protocol looks like today
516
517@item unknown
518Someone should think about where this subsystem is headed
519
520@item n/a
521This subsystem does not have an API/IPC-protocol/P2P-protocol
522@end table
523
524@c ***************************************************************************
525@node Naming conventions and coding style guide
526@section Naming conventions and coding style guide
527
528Here you can find some rules to help you write code for GNUnet.
529
530
531
532@c ***************************************************************************
533@menu
534* Naming conventions::
535* Coding style::
536@end menu
537
538@node Naming conventions
539@subsection Naming conventions
540
541
542@c ***************************************************************************
543@menu
544* include files::
545* binaries::
546* logging::
547* configuration::
548* exported symbols::
549* private (library-internal) symbols (including structs and macros)::
550* testcases::
551* performance tests::
552* src/ directories::
553@end menu
554
555@node include files
556@subsubsection include files
557
558@itemize @bullet
559@item _lib: library without need for a process
560@item _service: library that needs a service process
561@item _plugin: plugin definition
562@item _protocol: structs used in network protocol
563@item exceptions:
564@itemize @bullet
565@item gnunet_config.h --- generated
566@item platform.h --- first included
567@item plibc.h --- external library
568@item gnunet_common.h --- fundamental routines
569@item gnunet_directories.h --- generated
570@item gettext.h --- external library
571@end itemize
572@end itemize
573
574@c ***************************************************************************
575@node binaries
576@subsubsection binaries
577
578@itemize @bullet
579@item gnunet-service-xxx: service process (has listen socket)
580@item gnunet-daemon-xxx: daemon process (no listen socket)
581@item gnunet-helper-xxx[-yyy]: SUID helper for module xxx
582@item gnunet-yyy: command-line tool for end-users
583@item libgnunet_plugin_xxx_yyy.so: plugin for API xxx
584@item libgnunetxxx.so: library for API xxx
585@end itemize
586
587@c ***************************************************************************
588@node logging
589@subsubsection logging
590
591@itemize @bullet
592@item services and daemons use their directory name in GNUNET_log_setup (e.g.,
593'core') and log using plain 'GNUNET_log'.
594@item command-line tools use their full name in GNUNET_log_setup (e.g.,
595'gnunet-publish') and log using plain 'GNUNET_log'.
596@item service access libraries log using 'GNUNET_log_from' and use
597'DIRNAME-api' for the component (e.g., 'core-api')
598@item pure libraries (without associated service) use 'GNUNET_log_from' with
599the component set to their library name (without lib or '.so'), which should
600also be their directory name (e.g., 'nat')
601@item plugins should use 'GNUNET_log_from' with the directory name and the
602plugin name combined to produce the component name (e.g., 'transport-tcp').
603@item logging should be unified per-file by defining a LOG macro with the
604appropriate arguments, along these lines:
605@code{#define LOG(kind,...) GNUNET_log_from (kind, "example-api", __VA_ARGS__)}
606@end itemize
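
The per-file LOG macro convention can be tried out without the GNUnet headers
by substituting a stand-in for @code{GNUNET_log_from}. In the sketch below,
@code{log_from_stub} and @code{last_log} are hypothetical helpers written for
this example, and the log kind is simplified to a string. Only the shape of
the macro, with the component name fixed once per file, follows the
convention described above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Most recent formatted log line; kept so callers can inspect it. */
static char last_log[256];

/* Hypothetical stand-in for GNUNET_log_from: formats the message and
   prefixes it with the kind and the component name. */
void
log_from_stub (const char *kind, const char *component, const char *fmt, ...)
{
  char msg[192];
  va_list ap;

  va_start (ap, fmt);
  vsnprintf (msg, sizeof (msg), fmt, ap);
  va_end (ap);
  snprintf (last_log, sizeof (last_log), "%s %s: %s", kind, component, msg);
  fprintf (stderr, "%s\n", last_log);
}

/* Per-file macro: every log call in this file uses the same component,
   here the 'DIRNAME-api' form used by service access libraries. */
#define LOG(kind, ...) log_from_stub (kind, "example-api", __VA_ARGS__)
```

A call such as @code{LOG ("DEBUG", "connected to %s", "core")} then needs no
component argument at the call site, so the component name cannot drift
between log statements in the same file.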
607
608@c ***************************************************************************
609@node configuration
610@subsubsection configuration
611
612@itemize @bullet
613@item paths (that are substituted in all filenames) are in PATHS (have as few
614as possible)
615@item all options for a particular module (src/MODULE) are under [MODULE]
616@item options for a plugin of a module are under [MODULE-PLUGINNAME]
617@end itemize
618
619@c ***************************************************************************
620@node exported symbols
621@subsubsection exported symbols
622
623@itemize @bullet
624@item must start with "GNUNET_modulename_" and be defined in "modulename.c"
625@item exceptions: those defined in gnunet_common.h
626@end itemize
627
628@c ***************************************************************************
629@node private (library-internal) symbols (including structs and macros)
630@subsubsection private (library-internal) symbols (including structs and macros)
631
632@itemize @bullet
633@item must NOT start with any prefix
634@item must not be exported in a way that linkers could use them or other
635libraries might see them via headers; they must be either declared/defined in
636C source files or in headers that are in the respective directory under
637src/modulename/ and NEVER be declared in src/include/.
638@end itemize
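
The split between exported and private symbols might look as follows for a
hypothetical module named "example". The function
@code{GNUNET_EXAMPLE_clamp_priority}, the helper @code{is_valid} and the macro
@code{MAX_PRIORITY} are invented for this sketch, and writing the module name
in upper case is an assumption about how the prefix rule is applied. In a real
tree the declaration would sit in a header under src/include/ and the rest in
a C file under src/example/.

```c
#include <assert.h>

/* Exported declaration; would live in a public header (hypothetical). */
int
GNUNET_EXAMPLE_clamp_priority (int priority);

/* Private macro: no prefix, defined only in the C source file. */
#define MAX_PRIORITY 10

/* Private helper: no prefix, and 'static' so that the linker does not
   export it to other libraries. */
static int
is_valid (int priority)
{
  return (0 <= priority) && (priority <= MAX_PRIORITY);
}

/* Exported function: carries the module prefix. */
int
GNUNET_EXAMPLE_clamp_priority (int priority)
{
  if (is_valid (priority))
    return priority;
  return (priority < 0) ? 0 : MAX_PRIORITY;
}
```

Only the prefixed function is visible to other modules; the helper and the
macro can be renamed freely without affecting any code outside the file.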
639
640@node testcases
641@subsubsection testcases
642
643@itemize @bullet
644@item must be called "test_module-under-test_case-description.c"
645@item "case-description" may be omitted if there is only one test
646@end itemize
647
648@c ***************************************************************************
649@node performance tests
650@subsubsection performance tests
651
652@itemize @bullet
653@item must be called "perf_module-under-test_case-description.c"
@item "case-description" may be omitted if there is only one performance test
655@item Must only be run if HAVE_BENCHMARKS is satisfied
656@end itemize
657
658@c ***************************************************************************
659@node src/ directories
660@subsubsection src/ directories
661
662@itemize @bullet
@item gnunet-NAME: end-user applications (e.g., gnunet-search, gnunet-arm)
@item gnunet-service-NAME: service processes with accessor library (e.g.,
gnunet-service-arm)
@item libgnunetNAME: accessor library (_service.h-header) or standalone library
(_lib.h-header)
@item gnunet-daemon-NAME: daemon process without accessor library (e.g.,
gnunet-daemon-hostlist) and no GNUnet management port
@item libgnunet_plugin_DIR_NAME: loadable plugins (e.g.,
libgnunet_plugin_transport_tcp)
672@end itemize
673
674@c ***************************************************************************
675@node Coding style
676@subsection Coding style
677
678@itemize @bullet
679@item GNU guidelines generally apply
680@item Indentation is done with spaces, two per level, no tabs
681@item C99 struct initialization is fine
@item declare only one variable per line, so

@example
int i;
int j;
@end example
687
688instead of
689
690@example
691int i,j;
692@end example
693
694This helps keep diffs small and forces developers to think precisely about the
695type of every variable. Note that @code{char *} is different from @code{const
696char*} and @code{int} is different from @code{unsigned int} or @code{uint32_t}.
697Each variable type should be chosen with care.
698
699@item While @code{goto} should generally be avoided, having a @code{goto} to
700the end of a function to a block of clean up statements (free, close, etc.) can
701be acceptable.
702
703@item Conditions should be written with constants on the left (to avoid
704accidental assignment) and with the 'true' target being either the 'error' case
705or the significantly simpler continuation. For example:@
706
@example
if (0 != stat ("filename", &sbuf)) @{
  error();
@} else @{
  /* handle normal case here */
@}
@end example
712
713
714instead of
@example
if (stat ("filename", &sbuf) == 0) @{
  /* handle normal case here */
@} else @{
  error();
@}
@end example
720
721
722If possible, the error clause should be terminated with a 'return' (or 'goto'
723to some cleanup routine) and in this case, the 'else' clause should be omitted:
@example
if (0 != stat ("filename", &sbuf)) @{ error(); return; @}
/* handle normal case here */
@end example
728
729
730This serves to avoid deep nesting. The 'constants on the left' rule applies to
all constants (including @code{GNUNET_SCHEDULER_NO_TASK}, NULL, and enums).
732With the two above rules (constants on left, errors in 'true' branch), there is
733only one way to write most branches correctly.
734
735@item Combined assignments and tests are allowed if they do not hinder code
736clarity. For example, one can write:@
737
738@example
739if (NULL == (value = lookup_function())) @{ error(); return; @}
740@end example
741
742
743@item Use @code{break} and @code{continue} wherever possible to avoid deep(er)
744nesting. Thus, we would write:@
745
@example
next = head;
while (NULL != (pos = next))
@{
  next = pos->next;
  if (! should_free (pos))
    continue;
  GNUNET_CONTAINER_DLL_remove (head, tail, pos);
  GNUNET_free (pos);
@}
@end example
751
752
753instead of
@example
next = head;
while (NULL != (pos = next))
@{
  next = pos->next;
  if (should_free (pos))
  @{
    /* unnecessary nesting! */
    GNUNET_CONTAINER_DLL_remove (head, tail, pos);
    GNUNET_free (pos);
  @}
@}
@end example
760
761
762@item We primarily use @code{for} and @code{while} loops. A @code{while} loop
763is used if the method for advancing in the loop is not a straightforward
764increment operation. In particular, we use:@
765
@example
next = head;
while (NULL != (pos = next))
@{
  next = pos->next;
  if (! should_free (pos))
    continue;
  GNUNET_CONTAINER_DLL_remove (head, tail, pos);
  GNUNET_free (pos);
@}
@end example
777
778
779to free entries in a list (as the iteration changes the structure of the list
due to the free; the equivalent @code{for} loop no longer follows the
781simple @code{for} paradigm of @code{for(INIT;TEST;INC)}). However, for loops
782that do follow the simple @code{for} paradigm we do use @code{for}, even if it
783involves linked lists:
784@example
785/* simple iteration over a linked list */
786for (pos = head; NULL != pos; pos = pos->next)
787@{
788 use (pos);
789@}
790@end example
791
792
793@item The first argument to all higher-order functions in GNUnet must be
794declared to be of type @code{void *} and is reserved for a closure. We do not
795use inner functions, as trampolines would conflict with setups that use
non-executable stacks. The first statement in a higher-order function, which,
unusually, should be part of the variable declarations, should assign the
@code{cls} argument to the precise expected type. For example:
@example
int
callback (void *cls, char *args)
@{
  struct Foo *foo = cls;
  int other_variables;

  /* rest of function */
@}
@end example
806
807
808@item It is good practice to write complex @code{if} expressions instead of
809using deeply nested @code{if} statements. However, except for addition and
810multiplication, all operators should use parens. This is fine:@
811
812@example
813if ( (1 == foo) || ((0 == bar) && (x != y)) )
814 return x;
815@end example
816
817
818However, this is not:
819@example
820if (1 == foo)
821 return x;
822if (0 == bar && x != y)
823 return x;
824@end example
825
826
Note that splitting the @code{if} statement above is debatable as the
828@code{return x} is a very trivial statement. However, once the logic after the
829branch becomes more complicated (and is still identical), the "or" formulation
830should be used for sure.
831
832@item There should be two empty lines between the end of the function and the
833comments describing the following function. There should be a single empty line
834after the initial variable declarations of a function. If a function has no
835local variables, there should be no initial empty line. If a long function
836consists of several complex steps, those steps might be separated by an empty
837line (possibly followed by a comment describing the following step). The code
838should not contain empty lines in arbitrary places; if in doubt, it is likely
839better to NOT have an empty line (this way, more code will fit on the screen).
840@end itemize
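
Several of the conventions above can be seen together in one small sketch:
one declaration per line, constants on the left, the error case in the 'true'
branch, @code{continue} to avoid nesting, a @code{goto} to a cleanup block,
and a closure as the first argument of a callback. The list type, the callback
and the function names are hypothetical and exist only for this illustration;
none of them is part of GNUnet.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list type, used only for this illustration. */
struct Entry
{
  struct Entry *next;
  int value;
};

/* Callback type: the closure always comes first.
   Returns 1 to accept, 0 to reject, -1 on error. */
typedef int (*EntryFilter) (void *cls, const struct Entry *e);

/* Example callback: the closure is a pointer to a threshold. */
int
above_threshold (void *cls, const struct Entry *e)
{
  const int *threshold = cls;

  return (e->value > *threshold) ? 1 : 0;
}

/* Copy the accepted values into a newly allocated array (caller frees).
   Returns the number of values copied, or -1 on error. */
int
collect (const struct Entry *head, EntryFilter filter, void *filter_cls,
         int **result)
{
  const struct Entry *pos;
  int *values;
  int count;
  int verdict;
  int ret;

  if ( (NULL == filter) || (NULL == result) )
    return -1;
  count = 0;
  for (pos = head; NULL != pos; pos = pos->next)
    count++;
  values = malloc (sizeof (int) * (size_t) (count + 1));
  if (NULL == values)
    return -1;
  ret = -1;
  count = 0;
  for (pos = head; NULL != pos; pos = pos->next)
  {
    verdict = filter (filter_cls, pos);
    if (0 > verdict)
      goto cleanup;
    if (0 == verdict)
      continue;
    values[count++] = pos->value;
  }
  *result = values;
  values = NULL; /* ownership moved to the caller */
  ret = count;
cleanup:
  free (values);
  return ret;
}
```

Because the success path sets @code{values} to NULL before reaching the
cleanup label, the same cleanup block serves both the error and the success
case.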
841
842@c ***************************************************************************
843@node Build-system
844@section Build-system
845
If you have code that is likely not to compile, or build rules that you do not
want to trigger for most developers, use "if HAVE_EXPERIMENTAL" in your
Makefile.am. It is then OK to (temporarily) add non-compiling (or
known-to-not-port) code.
850
If you want to compile all testcases but NOT run them, run configure with the
@code{--enable-test-suppression} option.

If you want to run all testcases, including those that take a while, run
configure with the @code{--enable-expensive-testcases} option.

If you want to compile and run benchmarks, run configure with the
@code{--enable-benchmarks} option.

If you want to obtain code coverage results, run configure with the
@code{--enable-coverage} option and run the coverage.sh script in contrib/.
862
863@c ***************************************************************************
864@node Developing extensions for GNUnet using the gnunet-ext template
865@section Developing extensions for GNUnet using the gnunet-ext template
866
867
For developers who want to write extensions for GNUnet, we provide the
gnunet-ext template as an easy-to-use skeleton.
870
871gnunet-ext contains the build environment and template files for the
872development of GNUnet services, command line tools, APIs and tests.
873
874First of all you have to obtain gnunet-ext from SVN:
875
876@code{svn co https://gnunet.org/svn/gnunet-ext}
877
The next step is to bootstrap and configure it. For configure you have to
provide the path containing GNUnet with @code{--with-gnunet=/path/to/gnunet}
and the prefix where you want to install the extension using
@code{--prefix=/path/to/install}:

@example
./bootstrap
./configure --prefix=/path/to/install --with-gnunet=/path/to/gnunet
@end example
883
When your GNUnet installation is not included in the default linker search
path, you have to add @code{/path/to/gnunet/lib} to the file
@code{/etc/ld.so.conf} and run @code{ldconfig}, or add it to the environment
variable @code{LD_LIBRARY_PATH} by using
888
889@code{export LD_LIBRARY_PATH=/path/to/gnunet/lib}
890
891@c ***************************************************************************
892@node Writing testcases
893@section Writing testcases
894
895Ideally, any non-trivial GNUnet code should be covered by automated testcases.
896Testcases should reside in the same place as the code that is being tested. The
897name of source files implementing tests should begin with "test_" followed by
898the name of the file that contains the code that is being tested.
899
900Testcases in GNUnet should be integrated with the autotools build system. This
901way, developers and anyone building binary packages will be able to run all
902testcases simply by running @code{make check}. The final testcases shipped with
903the distribution should output at most some brief progress information and not
904display debug messages by default. The success or failure of a testcase must be
905indicated by returning zero (success) or non-zero (failure) from the main
906method of the testcase. The integration with the autotools is relatively
907straightforward and only requires modifications to the @code{Makefile.am} in
908the directory containing the testcase. For a testcase testing the code in
909@code{foo.c} the @code{Makefile.am} would contain the following lines:
@example
check_PROGRAMS = test_foo
TESTS = $(check_PROGRAMS)
test_foo_SOURCES = test_foo.c
test_foo_LDADD = $(top_builddir)/src/util/libgnunetutil.la
@end example
914
915Naturally, other libraries used by the testcase may be specified in the
916@code{LDADD} directive as necessary.
917
918Often testcases depend on additional input files, such as a configuration file.
919These support files have to be listed using the EXTRA_DIST directive in order
920to ensure that they are included in the distribution. Example:
921@example
922EXTRA_DIST = test_foo_data.conf
923@end example
924
925
926Executing @code{make check} will run all testcases in the current directory and
927all subdirectories. Testcases can be compiled individually by running
928@code{make test_foo} and then invoked directly using @code{./test_foo}. Note
929that due to the use of plugins in GNUnet, it is typically necessary to run
930@code{make install} before running any testcases. Thus the canonical command
931@code{make check install} has to be changed to @code{make install check} for
932GNUnet.
933
934@c ***************************************************************************
935@node GNUnet's TESTING library
936@section GNUnet's TESTING library
937
The TESTING library is used for writing testcases which involve starting a
single or multiple peers. While peers can also be started by testcases using
the ARM subsystem, the TESTING library provides an elegant way to do this.
The configurations of the peers are auto-generated from a given template to
have non-conflicting port numbers, ensuring that peers' services do not run
into bind errors. This is achieved by testing ports' availability by binding a
listening socket to them before allocating them to services in the generated
configurations.
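
The idea of probing a port by binding to it can be sketched with plain POSIX
sockets. This is only an illustration under stated assumptions, not the code
TESTING uses: @code{port_is_free} and @code{find_free_port} are hypothetical
names, the sketch probes TCP on the IPv4 loopback address only, and it only
binds without listening.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Check whether a TCP port on the loopback interface can currently be
   bound.  Returns 1 if it is free, 0 if it is in use or the check fails. */
int
port_is_free (uint16_t port)
{
  struct sockaddr_in addr;
  int sock;
  int ok;

  sock = socket (AF_INET, SOCK_STREAM, 0);
  if (-1 == sock)
    return 0;
  memset (&addr, 0, sizeof (addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = htons (port);
  ok = (0 == bind (sock, (const struct sockaddr *) &addr, sizeof (addr)));
  close (sock);
  return ok ? 1 : 0;
}

/* Return the first free port in [low, high], or 0 if none is free. */
uint16_t
find_free_port (uint16_t low, uint16_t high)
{
  uint32_t port; /* wider than uint16_t so that high == 65535 terminates */

  for (port = low; port <= high; port++)
    if (1 == port_is_free ((uint16_t) port))
      return (uint16_t) port;
  return 0;
}
```

A probe like this is inherently racy: another process may take the port
between the check and its later use, so a failed bind at service startup still
has to be handled.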
946
Another advantage of using TESTING is that it shortens the testcase startup
time, as the hostkeys for peers are copied from a pre-computed set of hostkeys
instead of being generated at peer startup, which may take a considerable
amount of time when starting multiple peers or on an embedded processor.
952
953TESTING also allows for certain services to be shared among peers. This feature
954is invaluable when testing with multiple peers as it helps to reduce the number
of services run per peer and hence the total number of processes run per
956testcase.
957
The TESTING library only handles creating, starting and stopping peers.
Features useful for testcases such as connecting peers in a topology are not
available in TESTING but are available in the TESTBED subsystem. Furthermore,
TESTING only creates peers on the localhost; by using TESTBED, testcases can
benefit from creating peers across multiple hosts.
963
964@menu
965* API::
966* Finer control over peer stop::
967* Helper functions::
968* Testing with multiple processes::
969@end menu
970
971@c ***************************************************************************
972@node API
973@subsection API
974
TESTING abstracts a group of peers as a TESTING system. All peers in a system
share a common hostname, and no two services of these peers use the same port
or UNIX domain socket path.
978
A TESTING system can be created with the function
980@code{GNUNET_TESTING_system_create()} which returns a handle to the system.
981This function takes a directory path which is used for generating the
982configurations of peers, an IP address from which connections to the peers'
983services should be allowed, the hostname to be used in peers' configuration,
984and an array of shared service specifications of type @code{struct
985GNUNET_TESTING_SharedService}.
986
987The shared service specification must specify the name of the service to share,
988the configuration pertaining to that shared service and the maximum number of
989peers that are allowed to share a single instance of the shared service.
990
A TESTING system created with @code{GNUNET_TESTING_system_create()} chooses
ports from the default range 12000 - 56000 while auto-generating
configurations for peers. This range can be customised with the function
@code{GNUNET_TESTING_system_create_with_portrange()}. This function is similar
to @code{GNUNET_TESTING_system_create()} except that it takes two additional
parameters: the start and end of the port range to use.
997
A TESTING system is destroyed with the function
@code{GNUNET_TESTING_system_destroy()}. This function takes the handle of the
system and a flag to remove the files created in the directory used to generate
configurations.
1002
A peer is created with the function @code{GNUNET_TESTING_peer_configure()}.
This function takes the system handle, a configuration template from which the
configuration for the peer is auto-generated, and the index of the hostkey to
copy for the peer. When successful, this function returns a handle to the peer
which can be used to start and stop it and to obtain the identity of the peer.
If unsuccessful, a NULL pointer is returned along with an error message. This
function adjusts the generated configuration so that it has non-conflicting
ports and paths.
1011
1012Peers can be started and stopped by calling the functions
1013@code{GNUNET_TESTING_peer_start()} and @code{GNUNET_TESTING_peer_stop()}
1014respectively. A peer can be destroyed by calling the function
@code{GNUNET_TESTING_peer_destroy()}. When a peer is destroyed, the ports and
paths allocated in its configuration are reclaimed for usage in new
peers.
1018
1019@c ***************************************************************************
1020@node Finer control over peer stop
1021@subsection Finer control over peer stop
1022
1023Using @code{GNUNET_TESTING_peer_stop()} is normally fine for testcases.
However, calling this function for each peer is inefficient when trying to
shut down multiple peers, as this function sends the termination signal to the
given peer process and waits for it to terminate. It would be faster in this
case to send the termination signals to the peers first and then wait on them.
This is accomplished by the function @code{GNUNET_TESTING_peer_kill()}, which
sends a termination signal to the peer, and the function
@code{GNUNET_TESTING_peer_wait()}, which waits on the peer.
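
The "signal everyone first, then wait" pattern can be illustrated with plain
POSIX process handling. The sketch below is an analogy only and does not use
the TESTING API: the child processes stand in for peers, and
@code{spawn_idle_child} and @code{stop_all} are hypothetical names.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start a child that idles until it is signalled (stand-in for a peer).
   Returns the child's PID in the parent, or -1 on error. */
pid_t
spawn_idle_child (void)
{
  pid_t pid;

  pid = fork ();
  if (0 == pid)
  {
    /* child: make sure SIGTERM terminates us, then sleep forever */
    signal (SIGTERM, SIG_DFL);
    for (;;)
      pause ();
  }
  return pid;
}

/* Stop all children: first signal every one of them, then wait for each.
   Returns the number of children successfully reaped. */
unsigned int
stop_all (const pid_t *pids, unsigned int count)
{
  unsigned int i;
  unsigned int reaped;
  int status;

  /* Phase 1: signal everyone, so that all children shut down concurrently.
     PIDs <= 0 are skipped; kill() would treat them as process groups. */
  for (i = 0; i < count; i++)
    if (0 < pids[i])
      (void) kill (pids[i], SIGTERM);
  /* Phase 2: collect the exit status of each child. */
  reaped = 0;
  for (i = 0; i < count; i++)
    if ( (0 < pids[i]) && (pids[i] == waitpid (pids[i], &status, 0)) )
      reaped++;
  return reaped;
}
```

Compared to signalling and waiting for one process at a time, the total
shutdown time is bounded by the slowest child rather than by the sum over all
children.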
1031
1032Further finer control can be achieved by choosing to stop a peer asynchronously
1033with the function @code{GNUNET_TESTING_peer_stop_async()}. This function takes
1034a callback parameter and a closure for it in addition to the handle to the peer
1035to stop. The callback function is called with the given closure when the peer
1036is stopped. Using this function eliminates blocking while waiting for the peer
1037to terminate.
1038
1039An asynchronous peer stop can be cancelled by calling the function
1040@code{GNUNET_TESTING_peer_stop_async_cancel()}. Note that calling this function
1041does not prevent the peer from terminating if the termination signal has
already been sent to it. It does, however, cancel the callback to be called
1043when the peer is stopped.
1044
1045@c ***************************************************************************
1046@node Helper functions
1047@subsection Helper functions
1048
1049Most of the testcases can benefit from an abstraction which configures a peer
1050and starts it. This is provided by the function
1051@code{GNUNET_TESTING_peer_run()}. This function takes the testing directory
1052pathname, a configuration template, a callback and its closure. This function
1053creates a peer in the given testing directory by using the configuration
1054template, starts the peer and calls the given callback with the given closure.
1055
1056The function @code{GNUNET_TESTING_peer_run()} starts the ARM service of the
1057peer which starts the rest of the configured services. A similar function
1058@code{GNUNET_TESTING_service_run} can be used to just start a single service of
1059a peer. In this case, the peer's ARM service is not started; instead, only the
1060given service is run.
1061
1062@c ***************************************************************************
1063@node Testing with multiple processes
1064@subsection Testing with multiple processes
1065
When testing GNUnet, the splitting of the code into services and clients
often complicates testing. The solution to this is to have the testcase fork
@code{gnunet-service-arm}, ask it to start the required server and daemon
processes and then execute appropriate client actions (to test the client APIs
or the core module or both). If necessary, multiple ARM services can be forked
using different ports (!) to simulate a network. However, most of the time only
one ARM process is needed. Note that on exit, the testcase should shut down ARM
with a @code{TERM} signal (to give it the chance to cleanly stop its child
processes).
1075
1076The following code illustrates spawning and killing an ARM process from a
1077testcase:
@example
static void
run (void *cls, char *const *args, const char *cfgfile,
     const struct GNUNET_CONFIGURATION_Handle *cfg)
@{
  struct GNUNET_OS_Process *arm_pid;

  arm_pid = GNUNET_OS_start_process (NULL, NULL, "gnunet-service-arm",
                                     "gnunet-service-arm", "-c", cfgfile, NULL);
  /* do real test work here */
  if (0 != GNUNET_OS_process_kill (arm_pid, SIGTERM))
    GNUNET_log_strerror (GNUNET_ERROR_TYPE_WARNING, "kill");
  GNUNET_assert (GNUNET_OK == GNUNET_OS_process_wait (arm_pid));
  GNUNET_OS_process_close (arm_pid);
@}

GNUNET_PROGRAM_run (argc, argv, "NAME-OF-TEST", "nohelp", options, &run, cls);
@end example
1090
1091
1092An alternative way that works well to test plugins is to implement a
1093mock-version of the environment that the plugin expects and then to simply load
1094the plugin directly.
1095
1096@c ***************************************************************************
1097@node Performance regression analysis with Gauger
1098@section Performance regression analysis with Gauger
1099
1100To help avoid performance regressions, GNUnet uses Gauger. Gauger is a simple
1101logging tool that allows remote hosts to send performance data to a central
1102server, where this data can be analyzed and visualized. Gauger shows graphs of
the repository revisions and the performance data recorded for each revision, so
1104sudden performance peaks or drops can be identified and linked to a specific
1105revision number.
1106
1107In the case of GNUnet, the buildbots log the performance data obtained during
the tests after each build. The data can be accessed on GNUnet's Gauger page.
1109
The menu on the left lets you select either the results of just one buildbot
(under "Hosts") or review the data from all hosts for a given test result
(under "Metrics"). If the absolute values of the results differ widely, for
instance between arm and amd64 machines, the option "Normalize" on a metric
view can help to get an idea about the performance evolution across all hosts.
1115
Using Gauger in GNUnet and having the performance of a module tracked over time
is very easy. First, of course, the testcase must generate some consistent
metric that makes sense to have logged. Highly volatile or randomness-dependent
metrics are probably not ideal candidates for meaningful regression detection.
1120
1121To start logging any value, just include @code{gauger.h} in your testcase code.
1122Then, use the macro @code{GAUGER()} to make the buildbots log whatever value is
1123of interest for you to @code{gnunet.org}'s Gauger server. No setup is necessary
as most buildbots already have everything in place and new metrics are created
1125on demand. To delete a metric, you need to contact a member of the GNUnet
1126development team (a file will need to be removed manually from the respective
1127directory).
1128
1129The code in the test should look like this:
@example
[other includes]
#include <gauger.h>

int main (int argc, char *argv[]) @{

  [run test, generate data]
  GAUGER("YOUR_MODULE", "METRIC_NAME", (float)value, "UNIT");
@}
@end example
1139
1140
1141Where:
1142@table @asis
1143
@item @strong{YOUR_MODULE}
is a category in the gauger page and should be the name of the module or
subsystem like "Core" or "DHT"
@item @strong{METRIC_NAME}
is the name of the metric being collected and should be concise and
descriptive, like "PUT operations in sqlite-datastore".
@item @strong{value}
is the value of the metric that is logged for this run.
@item @strong{UNIT}
is the unit in which the value is measured, for instance "kb/s" or
"kb of RAM/node".
1153@end table
1154
1155If you wish to use Gauger for your own project, you can grab a copy of the
1156latest stable release or check out Gauger's Subversion repository.
1157
1158@c ***************************************************************************
1159@node GNUnet's TESTBED Subsystem
1160@section GNUnet's TESTBED Subsystem
1161
1162The TESTBED subsystem facilitates testing and measuring of multi-peer
1163deployments on a single host or over multiple hosts.
1164
1165The architecture of the testbed module is divided into the following:
1166@itemize @bullet
1167
@item Testbed API: An API which is used by the testing driver programs. It
provides functions for creating, destroying, starting, stopping peers,
etc.
1171
1172@item Testbed service (controller): A service which is started through the
1173Testbed API. This service handles operations to create, destroy, start, stop
1174peers, connect them, modify their configurations.
1175
1176@item Testbed helper: When a controller has to be started on a host, the
1177testbed API starts the testbed helper on that host which in turn starts the
1178controller. The testbed helper receives a configuration for the controller
1179through its stdin and changes it to ensure the controller doesn't run into any
1180port conflict on that host.
1181@end itemize
1182
1183
1184The testbed service (controller) is different from the other GNUnet services in
1185that it is not started by ARM and is not supposed to be run as a daemon. It is
1186started by the testbed API through a testbed helper. In a typical scenario
1187involving multiple hosts, a controller is started on each host. Controllers
1188take up the actual task of creating peers, starting and stopping them on the
hosts they run on.
1190
While running deployments on a single localhost, the testbed API starts the
testbed helper directly as a child process. When running deployments on remote
hosts, the testbed API starts testbed helpers on each remote host through a
remote shell. By default, the testbed API uses SSH as the remote shell. This
can be changed by setting the environment variable
@code{GNUNET_TESTBED_RSH_CMD} to the required remote shell program. This
variable can also contain parameters which are to be passed to the remote
shell program. For example:

@example
export GNUNET_TESTBED_RSH_CMD="ssh -o BatchMode=yes -o NoHostAuthenticationForLocalhost=yes %h"
@end example

The above command string also allows for substitutions through placemarks
which begin with a `%'. At present the following substitutions are supported:
1202@itemize @bullet
1203@item
1204%h: hostname
1205@item
1206%u: username
1207@item
1208%p: port
1209@end itemize
1210
Note that a substitution placemark is replaced only when the corresponding
field is available, and only once. Specifying @code{%u@@%h} does not work
either. If you want to use username substitutions for SSH, use the argument
@code{-l} before the username substitution, for example:
@code{ssh -l %u -p %p %h}
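
The substitution rule (each placemark replaced at most once, and only when the
corresponding field is available) can be sketched as below. This is a
hypothetical illustration, not the testbed implementation:
@code{expand_rsh_cmd} is a made-up name, an unavailable field is modelled as a
NULL argument, and leaving an unreplaced placemark in the output verbatim is
an assumption of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Expand %h, %u and %p in 'tmpl' into 'out'.  Each placemark is replaced at
   most once, and only if the corresponding value is non-NULL; everything
   else is copied unchanged.  Returns 0 on success, -1 if 'out' is too
   small. */
int
expand_rsh_cmd (const char *tmpl, const char *host, const char *user,
                const char *port, char *out, size_t out_size)
{
  const char keys[3] = { 'h', 'u', 'p' };
  const char *values[3];
  int used[3] = { 0, 0, 0 };
  size_t off;
  size_t len;
  unsigned int k;

  values[0] = host;
  values[1] = user;
  values[2] = port;
  off = 0;
  while ('\0' != *tmpl)
  {
    /* find a placemark at the current position that may still be used */
    for (k = 0; k < 3; k++)
      if ( ('%' == tmpl[0]) && (keys[k] == tmpl[1]) &&
           (NULL != values[k]) && (0 == used[k]) )
        break;
    if (3 == k)
    {
      /* ordinary character (or a placemark we must not replace) */
      if (off + 1 >= out_size)
        return -1;
      out[off++] = *tmpl++;
      continue;
    }
    len = strlen (values[k]);
    if (off + len >= out_size)
      return -1;
    memcpy (&out[off], values[k], len);
    off += len;
    used[k] = 1;
    tmpl += 2;
  }
  if (off >= out_size)
    return -1;
  out[off] = '\0';
  return 0;
}
```

For instance, expanding @code{ssh -l %u -p %p %h} with host @code{10.0.0.1},
user @code{alice} and port @code{22} yields
@code{ssh -l alice -p 22 10.0.0.1} in this sketch.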
1215
The testbed API and the helper communicate through the helper's stdin and
stdout. As the helper is started through a remote shell on remote hosts, any
output messages from the remote shell interfere with the communication and
result in a failure while starting the helper. For this reason, it is
suggested to use flags to make the remote shells produce no output messages and
to have password-less logins. For the default remote shell, SSH, the default
options are "-o BatchMode=yes -o NoHostAuthenticationForLocalhost=yes".
Password-less logins should be ensured by using SSH keys.
1224
Since the testbed API executes the remote shell as a non-interactive shell,
certain scripts like .bashrc and .profile may not be executed. If this is the
case, the testbed API can be forced to execute an interactive shell by setting
the environment variable `GNUNET_TESTBED_RSH_CMD_SUFFIX' to a shell program.
An example could be:

@example
export GNUNET_TESTBED_RSH_CMD_SUFFIX="sh -lc"
@end example

The testbed API will then execute the remote shell program as:

@example
$GNUNET_TESTBED_RSH_CMD -p $port $dest $GNUNET_TESTBED_RSH_CMD_SUFFIX gnunet-helper-testbed
@end example
1233
1234On some systems, problems may arise while starting testbed helpers if GNUnet is
1235installed into a custom location since the helper may not be found in the
1236standard path. This can be addressed by setting the variable
1237`HELPER_BINARY_PATH' to the path of the testbed helper. Testbed API will then
1238use this path to start helper binaries both locally and remotely.
1239
The testbed API can be accessed by including the "gnunet_testbed_service.h"
file and linking with -lgnunettestbed.
1242
1243
1244
1245@c ***************************************************************************
1246@menu
1247* Supported Topologies::
1248* Hosts file format::
1249* Topology file format::
1250* Testbed Barriers::
1251* Automatic large-scale deployment of GNUnet in the PlanetLab testbed::
1252* TESTBED Caveats::
1253@end menu
1254
1255@node Supported Topologies
1256@subsection Supported Topologies
1257
1258While testing multi-peer deployments, it is often needed that the peers are
1259connected in some topology. This requirement is addressed by the function
1260@code{GNUNET_TESTBED_overlay_connect()} which connects any given two peers in
1261the testbed.
1262
1263The API also provides a helper function
1264@code{GNUNET_TESTBED_overlay_configure_topology()} to connect a given set of
1265peers in any of the following supported topologies:
1266@itemize @bullet
1267
1268@item @code{GNUNET_TESTBED_TOPOLOGY_CLIQUE}: All peers are connected with each
1269other
1270
1271@item @code{GNUNET_TESTBED_TOPOLOGY_LINE}: Peers are connected to form a line
1272
1273@item @code{GNUNET_TESTBED_TOPOLOGY_RING}: Peers are connected to form a ring
1274topology
1275
@item @code{GNUNET_TESTBED_TOPOLOGY_2D_TORUS}: Peers are connected to form a 2
dimensional torus topology. The number of peers need not be a perfect square;
in that case the resulting torus may not have uniform poloidal and toroidal
lengths
1280
1281@item @code{GNUNET_TESTBED_TOPOLOGY_ERDOS_RENYI}: Topology is generated to form
1282a random graph. The number of links to be present should be given
1283
@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD}: Peers are connected to form a
2D Torus with some random links among them. The number of random links is to
be given

@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD_RING}: Peers are connected to
form a ring with some random links among them. The number of random links is
to be given
1291
@item @code{GNUNET_TESTBED_TOPOLOGY_SCALE_FREE}: Connects peers in a topology
where peer connectivity follows a power law: new peers are connected with high
probability to well connected peers. See Emergence of Scaling in Random
Networks. Science 286, 509-512, 1999.
1296
1297@item @code{GNUNET_TESTBED_TOPOLOGY_FROM_FILE}: The topology information is
1298loaded from a file. The path to the file has to be given. See Topology file
1299format for the format of this file.
1300
1301@item @code{GNUNET_TESTBED_TOPOLOGY_NONE}: No topology
1302@end itemize
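
To make the three simplest shapes from the list above concrete, the following
sketch computes the links of a line, a ring and a clique over peers numbered
from 0. It is an illustration only and not how TESTBED generates topologies;
@code{DemoTopology}, @code{DemoLink} and @code{make_links} are hypothetical
names.

```c
#include <assert.h>
#include <stdlib.h>

enum DemoTopology { DEMO_LINE, DEMO_RING, DEMO_CLIQUE };

/* An undirected link between peers 'a' and 'b'. */
struct DemoLink
{
  unsigned int a;
  unsigned int b;
};

/* Compute the links for 'num_peers' peers in the given topology.  Stores a
   newly allocated array in '*links' (caller frees) and returns the number
   of links; returns 0 with '*links' set to NULL if there are none. */
unsigned int
make_links (enum DemoTopology topo, unsigned int num_peers,
            struct DemoLink **links)
{
  struct DemoLink *out;
  unsigned int count;
  unsigned int i;
  unsigned int j;
  unsigned int n;

  *links = NULL;
  if (2 > num_peers)
    return 0;
  switch (topo)
  {
  case DEMO_LINE:
    count = num_peers - 1;
    break;
  case DEMO_RING:
    /* with two peers the closing link would duplicate the only link */
    count = (2 == num_peers) ? 1 : num_peers;
    break;
  case DEMO_CLIQUE:
    count = (num_peers * (num_peers - 1)) / 2;
    break;
  default:
    return 0;
  }
  out = malloc (sizeof (struct DemoLink) * count);
  if (NULL == out)
    return 0;
  n = 0;
  if (DEMO_CLIQUE == topo)
  {
    /* every unordered pair of peers */
    for (i = 0; i < num_peers; i++)
      for (j = i + 1; j < num_peers; j++)
      {
        out[n].a = i;
        out[n].b = j;
        n++;
      }
  }
  else
  {
    /* chain of neighbours, shared by line and ring */
    for (i = 0; i + 1 < num_peers; i++)
    {
      out[n].a = i;
      out[n].b = i + 1;
      n++;
    }
    if ( (DEMO_RING == topo) && (2 < num_peers) )
    {
      /* close the ring */
      out[n].a = num_peers - 1;
      out[n].b = 0;
      n++;
    }
  }
  *links = out;
  return n;
}
```

For four peers this gives 3 links for the line, 4 for the ring and 6 for the
clique, which shows how quickly the clique grows compared to the other two.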
1303
1304
1305The above supported topologies can be specified respectively by setting the
1306variable @code{OVERLAY_TOPOLOGY} to the following values in the configuration
1307passed to Testbed API functions @code{GNUNET_TESTBED_test_run()} and
1308@code{GNUNET_TESTBED_run()}:
1309@itemize @bullet
1310@item @code{CLIQUE}
@item @code{LINE}
@item @code{RING}
1313@item @code{2D_TORUS}
1314@item @code{RANDOM}
1315@item @code{SMALL_WORLD}
1316@item @code{SMALL_WORLD_RING}
1317@item @code{SCALE_FREE}
1318@item @code{FROM_FILE}
1319@item @code{NONE}
1320@end itemize
1321
1322
1323Topologies @code{RANDOM}, @code{SMALL_WORLD} and @code{SMALL_WORLD_RING}
1324require the option @code{OVERLAY_RANDOM_LINKS} to be set to the number of
1325random links to be generated in the configuration. The option will be ignored
1326for the rest of the topologies.
1327
Topology @code{SCALE_FREE} requires the option @code{SCALE_FREE_TOPOLOGY_CAP}
to be set to the maximum number of peers which can connect to a peer, and
@code{SCALE_FREE_TOPOLOGY_M} to be set to the minimum number of peers a peer
should be connected to.
1332
1333Similarly, the topology @code{FROM_FILE} requires the option
1334@code{OVERLAY_TOPOLOGY_FILE} to contain the path of the file containing the
1335topology information. This option is ignored for the rest of the topologies.
1336See Topology file format for the format of this file.
1337
1338@c ***************************************************************************
1339@node Hosts file format
1340@subsection Hosts file format
1341
The testbed API offers the function
@code{GNUNET_TESTBED_hosts_load_from_file()} to load, from a given file,
details about the hosts which testbed can use for deploying peers. This
function is useful for keeping host data separate from the code instead of
hard-coding it.

Another helper function from the testbed API, @code{GNUNET_TESTBED_run()},
also takes a hosts file name as a parameter. It uses the above function to
populate the hosts data structures and start controllers to deploy peers.
1350
1351These functions require the hosts file to be of the following format:
1352@itemize @bullet
1353@item Each line is interpreted to have details about a host
@item Host details should include the username to use for logging into the
host, the hostname of the host and the port number to use for the remote shell
program. All three values should be given.
1357@item These details should be given in the following format:
1358@code{<username>@@<hostname>:<port>}
1359@end itemize
1360
Note that using canonical hostnames may cause problems while resolving the IP
addresses (See this bug). Hence it is advised to provide the hosts' numerical
IP addresses as hostnames whenever possible.
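
To make the line format concrete, here is a minimal sketch of a parser for it.
It is not the code used by @code{GNUNET_TESTBED_hosts_load_from_file()}; the
function name @code{parse_hosts} is hypothetical, and skipping blank lines is
an assumption that the text above does not state.

```python
def parse_hosts(text):
    """Parse lines of the form <username>@<hostname>:<port>.

    Returns a list of (username, hostname, port) tuples.
    """
    hosts = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are skipped
        user, at, rest = line.partition("@")
        host, colon, port = rest.rpartition(":")
        # All three values must be present, and the port must be numeric.
        if not (at and colon and user and host and port.isdecimal()):
            raise ValueError(
                "line %d: expected <username>@<hostname>:<port>" % lineno)
        hosts.append((user, host, int(port)))
    return hosts
```

The sketch rejects a line as soon as one of the three values is missing,
which mirrors the requirement that all of them be given.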
1364
1365@c ***************************************************************************
1366@node Topology file format
1367@subsection Topology file format
1368
1369A topology file describes how peers are to be connected. It should adhere to
1370the following format for testbed to parse it correctly.
1371
Each line should begin with the target peer id. This should be followed by a
colon (`:') and the origin peer ids separated by `|'. All spaces except for
newline characters are ignored. The API will then try to connect each origin
peer to the target peer.
1376
For example, the following file will result in 5 overlay connections: [2->1],
[3->1], [4->3], [0->3], [2->0]

@example
1:2|3
3:4| 0
0: 2
@end example
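
The format can be illustrated with a short parser that turns the file contents
into a list of (origin, target) pairs. This is a sketch under the rules stated
above, not the parser used by testbed; the name @code{parse_topology} is
hypothetical.

```python
def parse_topology(text):
    """Return the overlay connections as (origin, target) pairs, in file order."""
    links = []
    for raw in text.splitlines():
        # All whitespace within a line is ignored.
        line = "".join(raw.split())
        if not line:
            continue
        target, colon, origins = line.partition(":")
        if not colon:
            raise ValueError("missing ':' in line %r" % raw)
        for origin in origins.split("|"):
            links.append((int(origin), int(target)))
    return links
```

Applied to the example file, the sketch yields the five connections listed
above, in the same order.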
1379
1380@c ***************************************************************************
1381@node Testbed Barriers
1382@subsection Testbed Barriers
1383
The testbed subsystem's barriers API facilitates coordination among the peers
run by the testbed and the experiment driver. The concept is similar to the
barrier synchronisation mechanism found in parallel programming or
multi-threading paradigms: a peer waits at a barrier upon reaching it until
the barrier is reached by a predefined number of peers. This predefined number
of peers required to cross a barrier is also called the quorum. We say a peer
has reached a barrier if the peer is waiting for the barrier to be crossed.
Similarly, a barrier is said to be reached if the required quorum of peers
has reached the barrier. A barrier which is reached is deemed crossed after
all the peers waiting on it are notified.
1394
1395The barriers API provides the following functions:
1396@itemize @bullet
@item @strong{@code{GNUNET_TESTBED_barrier_init()}:} function to initialise a
barrier in the experiment
1399@item @strong{@code{GNUNET_TESTBED_barrier_cancel()}:} function to cancel a
1400barrier which has been initialised before
1401@item @strong{@code{GNUNET_TESTBED_barrier_wait()}:} function to signal barrier
1402service that the caller has reached a barrier and is waiting for it to be
1403crossed
1404@item @strong{@code{GNUNET_TESTBED_barrier_wait_cancel()}:} function to stop
1405waiting for a barrier to be crossed
1406@end itemize
1407
1408
1409Among the above functions, the first two, namely
1410@code{GNUNET_TESTBED_barrier_init()} and @code{GNUNET_TESTBED_barrier_cancel()}
1411are used by experiment drivers. All barriers should be initialised by the
1412experiment driver by calling @code{GNUNET_TESTBED_barrier_init()}. This
1413function takes a name to identify the barrier, the quorum required for the
1414barrier to be crossed and a notification callback for notifying the experiment
1415driver when the barrier is crossed. @code{GNUNET_TESTBED_barrier_cancel()}
1416cancels an initialised barrier and frees the resources allocated for it. This
function can be called on an initialised barrier before it is crossed.
1418
1419The remaining two functions @code{GNUNET_TESTBED_barrier_wait()} and
1420@code{GNUNET_TESTBED_barrier_wait_cancel()} are used in the peer's processes.
1421@code{GNUNET_TESTBED_barrier_wait()} connects to the local barrier service
1422running on the same host the peer is running on and registers that the caller
1423has reached the barrier and is waiting for the barrier to be crossed. Note that
1424this function can only be used by peers which are started by testbed as this
1425function tries to access the local barrier service which is part of the testbed
1426controller service. Calling @code{GNUNET_TESTBED_barrier_wait()} on an
1427uninitialised barrier results in failure.
1428@code{GNUNET_TESTBED_barrier_wait_cancel()} cancels the notification registered
1429by @code{GNUNET_TESTBED_barrier_wait()}.
1430
1431
1432@c ***************************************************************************
1433@menu
1434* Implementation::
1435@end menu
1436
1437@node Implementation
1438@subsubsection Implementation
1439
Since barriers involve coordination between the experiment driver and peers,
the barrier service in the testbed controller is split into two components.
The first component responds to the messages generated by the barrier API used
by the experiment driver (functions @code{GNUNET_TESTBED_barrier_init()} and
@code{GNUNET_TESTBED_barrier_cancel()}) and the second component to the
messages generated by the barrier API used by peers (functions
@code{GNUNET_TESTBED_barrier_wait()} and
@code{GNUNET_TESTBED_barrier_wait_cancel()}).
1448
Calling @code{GNUNET_TESTBED_barrier_init()} sends a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_INIT} message to the master
controller. The master controller then registers a barrier and calls
@code{GNUNET_TESTBED_barrier_init()} for each of its subcontrollers. In this
way barrier initialisation is propagated through the controller hierarchy.
While propagating initialisation, any errors at a subcontroller, such as a
timeout during further propagation, are reported up the hierarchy back to the
experiment driver.

Similar to @code{GNUNET_TESTBED_barrier_init()},
@code{GNUNET_TESTBED_barrier_cancel()} propagates a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_CANCEL} message which causes
controllers to remove an initialised barrier.
1462
The second component is implemented as a separate service in the binary
`gnunet-service-testbed', which already contains the testbed controller
service. Although this deviates from the GNUnet process architecture of having
one service per binary, it is needed in this case as this component needs
access to barrier data created by the first component. This component responds
to @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages from local peers
when they call @code{GNUNET_TESTBED_barrier_wait()}. Upon receiving a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} message, the service checks
whether the requested barrier has been initialised. If it has not, an error
status is sent to the local peer in a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message and the connection
from the peer is terminated. If the barrier has been initialised, the
barrier's counter for reached peers is incremented and a notification is
registered to notify the peer when the barrier is reached. The connection from
the peer is left open.
1478
When enough peers to attain the quorum have sent
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages, the controller sends
a @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to its parent,
informing it that the barrier is crossed. If the controller has started
further subcontrollers, it delays this message until it receives a similar
notification from each of those subcontrollers. Finally, the barriers API at
the experiment driver receives the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message when the barrier is
reached at all the controllers.
1487
The barriers API at the experiment driver responds to the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message by echoing it back to
the master controller and notifying the experiment driver through the
notification callback that a barrier has been crossed. The echoed
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message is propagated by the
master controller to the controller hierarchy. This propagation triggers the
notifications registered by peers at each of the controllers in the hierarchy.
Note the difference between this downward propagation of the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message and its upward
propagation: the upward propagation is needed to ensure that the barrier is
reached by all the controllers, and the downward propagation signals that the
barrier is crossed.
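
The two propagation directions can be traced with a small model of a
controller hierarchy. This is only a sketch of the message flow described
above: the @code{Controller} class, the per-controller @code{local_quorum}
and the @code{trace} list are assumptions made for illustration and do not
correspond to testbed data structures.

```python
class Controller:
    """Toy controller that records upward and downward STATUS propagation."""

    def __init__(self, name, local_quorum, trace, children=()):
        self.name = name
        self.local_quorum = local_quorum  # assumption: quorum per controller
        self.trace = trace
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self
        self.local_waiting = 0
        self.children_reached = set()
        self.reported = False

    def peer_wait(self):
        """A local peer sent a BARRIER_WAIT message."""
        self.local_waiting += 1
        self._maybe_report()

    def _child_reached(self, child):
        self.children_reached.add(child.name)
        self._maybe_report()

    def _maybe_report(self):
        if self.reported or self.local_waiting < self.local_quorum:
            return
        if len(self.children_reached) < len(self.children):
            return  # delay until every subcontroller has reported
        self.reported = True
        self.trace.append(("up", self.name))
        if self.parent is not None:
            self.parent._child_reached(self)
        else:
            # Master controller: the driver is notified and echoes STATUS.
            self.trace.append(("driver", "crossed"))
            self.release()

    def release(self):
        """Downward propagation: notify local peers, then subcontrollers."""
        self.trace.append(("down", self.name))
        for child in self.children:
            child.release()
```

In a hierarchy with one master and two subcontrollers, no @code{down} entry
appears in the trace until all three controllers have produced their
@code{up} entry.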
1500
1501@c ***************************************************************************
1502@node Automatic large-scale deployment of GNUnet in the PlanetLab testbed
1503@subsection Automatic large-scale deployment of GNUnet in the PlanetLab testbed
1504
PlanetLab is a testbed for computer networking and distributed systems
research. It was established in 2002 and as of June 2010 was composed of 1090
nodes at 507 sites worldwide.

To simplify large-scale deployment of GNUnet, we created a set of automation
tools. We provide a set of scripts you can use to deploy GNUnet on a set of
nodes and manage your installation.

Please also check @uref{https://gnunet.org/installation-fedora8-svn} and
@uref{https://gnunet.org/installation-fedora12-svn} for detailed
instructions on how to install GNUnet on a PlanetLab node.
1516
1517
1518@c ***************************************************************************
1519@menu
1520* PlanetLab Automation for Fedora8 nodes::
1521* Install buildslave on PlanetLab nodes running fedora core 8::
1522* Setup a new PlanetLab testbed using GPLMT::
1523* Why do i get an ssh error when using the regex profiler?::
1524@end menu
1525
1526@node PlanetLab Automation for Fedora8 nodes
1527@subsubsection PlanetLab Automation for Fedora8 nodes
1528
1529@c ***************************************************************************
1530@node Install buildslave on PlanetLab nodes running fedora core 8
1531@subsubsection Install buildslave on PlanetLab nodes running fedora core 8
1532@c ** Actually this is a subsubsubsection, but must be fixed differently
1533@c ** as subsubsection is the lowest.
1534
Since most of the PlanetLab nodes are running the very old Fedora Core 8
image, installing the buildslave software is quite painful. For our PlanetLab
testbed we figured out how best to install the buildslave software.

Install Distribute for Python:

@example
curl http://python-distribute.org/distribute_setup.py | sudo python
@end example

Install zope.interface <= 3.8.0 (4.0 and 4.0.1 will not work):

@example
wget http://pypi.python.org/packages/source/z/zope.interface/zope.interface-3.8.0.tar.gz
tar xvfz zope.interface-3.8.0.tar.gz
cd zope.interface-3.8.0
sudo python setup.py install
@end example

Install the buildslave software (0.8.6 was the latest version):

@example
wget http://buildbot.googlecode.com/files/buildbot-slave-0.8.6p1.tar.gz
tar xvfz buildbot-slave-0.8.6p1.tar.gz
cd buildslave-0.8.6p1
sudo python setup.py install
@end example

The setup will download the matching twisted package and install it. It will
also try to install the latest version of zope.interface, which will fail to
install. Buildslave will work anyway since version 3.8.0 was installed before.
1556
1557@c ***************************************************************************
1558@node Setup a new PlanetLab testbed using GPLMT
1559@subsubsection Setup a new PlanetLab testbed using GPLMT
1560
1561@itemize @bullet
@item Get a new slice and assign nodes

Ask your PlanetLab PI to give you a new slice and assign the nodes you need.
@item Install a buildmaster

You can stick to the buildbot documentation:
@uref{http://buildbot.net/buildbot/docs/current/manual/installation.html}
@item Install the buildslave software on all nodes

To install the buildslave on all nodes assigned to your slice you can use the
tasklist @code{install_buildslave_fc8.xml} provided with GPLMT:

@example
./gplmt.py -c contrib/tumple_gnunet.conf -t contrib/tasklists/install_buildslave_fc8.xml -a -p <planetlab password>
@end example
1573
1574@item Create the buildmaster configuration and the slave setup commands
1575
The master and the slaves need to have credentials, and the master has to
have all nodes configured. This can be done with the
@code{create_buildbot_configuration.py} script in the @code{scripts}
directory.

This script takes a list of nodes, retrieved directly from PlanetLab or read
from a file, and a configuration template, and creates:

@itemize @minus
@item a tasklist which can be executed with gplmt to set up the slaves
@item a @code{master.cfg} file containing the PlanetLab nodes
@end itemize

A configuration template is included in the <contrib>. Most importantly, the
script replaces the following tags in the template:

@example
%GPLMT_BUILDER_DEFINITION
GPLMT_BUILDER_SUMMARY
GPLMT_SLAVES
%GPLMT_SCHEDULER_BUILDERS
@end example

Create the configuration for all nodes assigned to a slice:

@example
./create_buildbot_configuration.py -u <planetlab username> -p <planetlab password> -s <slice> -m <buildmaster+port> -t <template>
@end example

Create the configuration for some nodes listed in a file:

@example
./create_buildbot_configuration.py -f <node_file> -m <buildmaster+port> -t <template>
@end example
1597
@item Copy the @code{master.cfg} to the buildmaster and start it

Use @code{buildbot start <basedir>} to start the server.
@item Set up the buildslaves
1601@end itemize
1602
1603@c ***************************************************************************
1604@node Why do i get an ssh error when using the regex profiler?
1605@subsubsection Why do i get an ssh error when using the regex profiler?
1606
Why do I get an ssh error "Permission denied (publickey,password)." when using
the regex profiler although passwordless ssh to localhost works using publickey
and ssh-agent?

You have to generate a public/private-key pair with no password:

@example
ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_localhost
@end example

and then add the following to your ~/.ssh/config file:

@example
Host 127.0.0.1
IdentityFile ~/.ssh/id_localhost
@end example

Now make sure your hosts file looks like this:

@example
[USERNAME]@@127.0.0.1:22
[USERNAME]@@127.0.0.1:22
@end example

You can test your setup by running @code{ssh 127.0.0.1} in a terminal and then
running it again in the opened session. If you were not asked for a password
on either login, then you should be good to go.
1625
1626@c ***************************************************************************
1627@node TESTBED Caveats
1628@subsection TESTBED Caveats
1629
1630This section documents a few caveats when using the GNUnet testbed
1631subsystem.
1632
1633
1634@c ***************************************************************************
1635@menu
1636* CORE must be started::
1637* ATS must want the connections::
1638@end menu
1639
1640@node CORE must be started
1641@subsubsection CORE must be started
1642
A simple issue is #3993: Your configuration MUST somehow ensure that for each
peer the CORE service is started when the peer is set up, otherwise TESTBED may
fail to connect peers when the topology is initialized, as TESTBED will start
some CORE services but not necessarily all (but it relies on all of them
running). The easiest way is to set @code{FORCESTART = YES} in the
@code{[core]} section of the configuration file. Alternatively, having any
service that directly or indirectly depends on CORE being started with
@code{FORCESTART} will also do. This issue largely arises if users try to
over-optimize by not starting any services with @code{FORCESTART}.
1652
1653@c ***************************************************************************
1654@node ATS must want the connections
1655@subsubsection ATS must want the connections
1656
1657When TESTBED sets up connections, it only offers the respective HELLO
1658information to the TRANSPORT service. It is then up to the ATS service to
1659@strong{decide} to use the connection. The ATS service will typically eagerly
1660establish any connection if the number of total connections is low (relative to
1661bandwidth). Details may further depend on the specific ATS backend that was
1662configured. If ATS decides to NOT establish a connection (even though TESTBED
1663provided the required information), then that connection will count as failed
1664for TESTBED. Note that you can configure TESTBED to tolerate a certain number
1665of connection failures (see '-e' option of gnunet-testbed-profiler). This issue
1666largely arises for dense overlay topologies, especially if you try to create
1667cliques with more than 20 peers.
1668
1669@c ***************************************************************************
1670@node libgnunetutil
1671@section libgnunetutil
1672
1673libgnunetutil is the fundamental library that all GNUnet code builds upon.
1674Ideally, this library should contain most of the platform dependent code
1675(except for user interfaces and really special needs that only few applications
1676have). It is also supposed to offer basic services that most if not all GNUnet
1677binaries require. The code of libgnunetutil is in the src/util/ directory. The
1678public interface to the library is in the gnunet_util.h header. The functions
1679provided by libgnunetutil fall roughly into the following categories (in
1680roughly the order of importance for new developers):
1681@itemize @bullet
@item logging (common_logging.c)
@item memory allocation (common_allocation.c)
@item endianness conversion (common_endian.c)
@item internationalization (common_gettext.c)
@item string manipulation (string.c)
@item file access (disk.c)
@item buffered disk IO (bio.c)
@item time manipulation (time.c)
@item configuration parsing (configuration.c)
@item command-line handling (getopt*.c)
@item cryptography (crypto_*.c)
@item data structures (container_*.c)
@item CPS-style scheduling (scheduler.c)
@item program initialization (program.c)
@item networking (network.c, client.c, server*.c, service.c)
@item message queueing (mq.c)
@item bandwidth calculations (bandwidth.c)
@item other OS-related functions (os*.c, plugin.c, signal.c)
@item pseudonym management (pseudonym.c)
1701@end itemize
1702
1703It should be noted that only developers that fully understand this entire API
1704will be able to write good GNUnet code.
1705
1706Ideally, porting GNUnet should only require porting the gnunetutil library.
1707More testcases for the gnunetutil APIs are therefore a great way to make
1708porting of GNUnet easier.
1709
1710@menu
1711* Logging::
1712* Interprocess communication API (IPC)::
1713* Cryptography API::
1714* Message Queue API::
1715* Service API::
1716* Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps::
1717* The CONTAINER_MDLL API::
1718@end menu
1719
1720@c ***************************************************************************
1721@node Logging
1722@subsection Logging
1723
1724GNUnet is able to log its activity, mostly for the purposes of debugging the
1725program at various levels.
1726
1727@code{gnunet_common.h} defines several @strong{log levels}:
1728@table @asis
1729
@item ERROR
for errors (really problematic situations, often leading to crashes)
@item WARNING
for warnings (troubling situations that might have negative consequences,
although not fatal)
@item INFO
for various information. Used somewhat rarely, as GNUnet statistics is used to
hold and display most of the information that users might find interesting.
@item DEBUG
for debugging. Does not produce much output on normal builds, but when extra
logging is enabled at compile time, a staggering amount of data is output
under this log level.
1741@end table
1742
1743
Normal builds of GNUnet (configured with @code{--enable-logging[=yes]}) are
supposed to log nothing under DEBUG level. The @code{--enable-logging=verbose}
configure option can be used to create a build with all logging enabled.
However, such a build will produce large amounts of log data, which is
inconvenient when one tries to hunt down a specific problem.
1749
1750To mitigate this problem, GNUnet provides facilities to apply a filter to
1751reduce the logs:
1752@table @asis
1753
@item Logging by default
When no log levels are configured in any other way (see below), GNUnet will
default to the WARNING log level. This mostly applies to GNUnet command line
utilities, services and daemons; tests will always set the log level to
WARNING or, if @code{--enable-logging=verbose} was passed to configure, to
DEBUG. The default level is suggested for normal operation.
@item The -L option
Most GNUnet executables accept an "-L loglevel" or "--log=loglevel" option. If
used, it makes the process set a global log level to "loglevel". Thus it is
possible to run some processes with -L DEBUG, for example, and others with -L
ERROR to enable specific settings to diagnose problems with a particular
process.
@item Configuration files.
Because GNUnet service and daemon processes are usually launched by
gnunet-arm, it is not possible to pass different custom command line options
directly to every one of them. The options passed to @code{gnunet-arm} only
affect gnunet-arm and not the rest of GNUnet. However, one can specify a
configuration key "OPTIONS" in the section that corresponds to a service or a
daemon, and put a value of "-L loglevel" there. This will make the respective
service or daemon set its log level to "loglevel" (as the value of OPTIONS
will be passed as a command-line argument).
1773
1774To specify the same log level for all services without creating separate
1775"OPTIONS" entries in the configuration for each one, the user can specify a
1776config key "GLOBAL_POSTFIX" in the [arm] section of the configuration file. The
1777value of GLOBAL_POSTFIX will be appended to all command lines used by the ARM
1778service to run other services. It can contain any option valid for all GNUnet
1779commands, thus in particular the "-L loglevel" option. The ARM service itself
1780is, however, unaffected by GLOBAL_POSTFIX; to set log level for it, one has to
1781specify "OPTIONS" key in the [arm] section.
1782@item Environment variables.
1783Setting global per-process log levels with "-L loglevel" does not offer
1784sufficient log filtering granularity, as one service will call interface
1785libraries and supporting libraries of other GNUnet services, potentially
1786producing lots of debug log messages from these libraries. Also, changing the
1787config file is not always convenient (especially when running the GNUnet test
1788suite).@ To fix that, and to allow GNUnet to use different log filtering at
1789runtime without re-compiling the whole source tree, the log calls were changed
1790to be configurable at run time. To configure them one has to define environment
1791variables "GNUNET_FORCE_LOGFILE", "GNUNET_LOG" and/or "GNUNET_FORCE_LOG":
1792@itemize @bullet
1793
1794@item "GNUNET_LOG" only affects the logging when no global log level is
1795configured by any other means (that is, the process does not explicitly set its
1796own log level, there are no "-L loglevel" options on command line or in
1797configuration files), and can be used to override the default WARNING log
1798level.
1799
1800@item "GNUNET_FORCE_LOG" will completely override any other log configuration
1801options given.
1802
1803@item "GNUNET_FORCE_LOGFILE" will completely override the location of the file
1804to log messages to. It should contain a relative or absolute file name. Setting
1805GNUNET_FORCE_LOGFILE is equivalent to passing "--log-file=logfile" or "-l
1806logfile" option (see below). It supports "[]" format in file names, but not
1807"@{@}" (see below).
1808@end itemize
1809
1810
1811Because environment variables are inherited by child processes when they are
1812launched, starting or re-starting the ARM service with these variables will
1813propagate them to all other services.
1814
"GNUNET_LOG" and "GNUNET_FORCE_LOG" variables must contain a specially
formatted @strong{logging definition} string, which looks like this:

@example
[component];[file];[function];[from_line[-to_line]];loglevel@emph{[/component...]}
@end example

That is, a logging definition consists of definition entries, separated by
slashes ('/'). If only one entry is present, there is no need to add a slash
to its end (although it is not forbidden either).

All definition fields (component, file, function, lines and loglevel) are
mandatory, but (except for the loglevel) they can be empty. An empty field
means "match anything". Note that even if fields are empty, the semicolon
(';') separators must be present.

The loglevel field must contain one of the log level names (ERROR, WARNING,
INFO or DEBUG).

The lines field might contain one non-negative number, in which case it
matches only one line, or a range "from_line-to_line", in which case it
matches any line in the interval [from_line;to_line] (that is, including both
start and end line).

GNUnet mostly defaults the component name to the name of the service that is
implemented in a process ('transport', 'core', 'peerinfo', etc), but logging
calls can specify custom component names using @code{GNUNET_log_from}.

File name and function name are provided by the compiler (@code{__FILE__} and
@code{__FUNCTION__} built-ins).
1834
Component, file and function fields are interpreted as non-extended regular
expressions (GNU libc regex functions are used). Matching is case-sensitive.
If a field is empty, its contents are automatically replaced with a ".*"
regular expression, which matches anything. Matching is done in the default
way, which means that the expression matches as long as it is contained
anywhere in the string. Thus "GNUNET_" will match both "GNUNET_foo" and
"BAR_GNUNET_BAZ". Use '^' and/or '$' to make sure that the expression matches
at the start and/or at the end of the string.

The semicolon (';') can't be escaped, and GNUnet will not use it in component
names (it can't be used in function names and file names anyway).
1845
1846@end table
1847
1848
Every logging call in GNUnet code will be (at run time) matched against the
log definitions passed to the process. If the fields of a log definition match
the call arguments, then the call log level is compared to the log level of
that definition. If the call log level is less than or equal to the definition
log level, the call is allowed to proceed. Otherwise the logging call is
forbidden, and nothing is logged. If no definitions matched at all, GNUnet
will use the global log level or (if a global log level is not specified) will
default to WARNING (that is, it will allow the call to proceed if its level
is less than or equal to the global log level or to WARNING).
1858
That is, definitions are evaluated from left to right, and the first matching
definition is used to allow or deny the logging call. Thus it is advised to
place narrow definitions at the beginning of the logdef string, and generic
definitions at the end.
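
The evaluation rules can be sketched as follows. This is an illustration of
the first-match logic, not GNUnet's implementation: the function names
@code{parse_logdef} and @code{is_allowed} are hypothetical, and Python's
@code{re} module stands in for the non-extended regular expressions of GNU
libc, whose syntax differs in places (for example for grouping and
alternation).

```python
import re

# Lower number = more severe; a call passes if its level <= the allowed level.
LEVELS = {"ERROR": 0, "WARNING": 1, "INFO": 2, "DEBUG": 3}


def parse_logdef(logdef):
    """Split a logging definition into entries, in left-to-right order."""
    entries = []
    for entry in logdef.split("/"):
        if not entry:
            continue  # a trailing slash is allowed
        fields = entry.split(";")
        if len(fields) != 5 or fields[4] not in LEVELS:
            raise ValueError("malformed definition entry: %r" % entry)
        component, file, function, lines, level = fields
        if lines:
            first, _, last = lines.partition("-")
            line_range = (int(first), int(last or first))
        else:
            line_range = None  # empty field matches any line
        entries.append((re.compile(component or ".*"),
                        re.compile(file or ".*"),
                        re.compile(function or ".*"),
                        line_range, LEVELS[level]))
    return entries


def is_allowed(entries, component, file, function, line, level,
               global_level="WARNING"):
    """Apply the first matching entry; fall back to the global level."""
    for comp_re, file_re, func_re, line_range, max_level in entries:
        if (comp_re.search(component) and file_re.search(file)
                and func_re.search(function)
                and (line_range is None
                     or line_range[0] <= line <= line_range[1])):
            return LEVELS[level] <= max_level
    return LEVELS[level] <= LEVELS[global_level]
```

Because only the first matching entry counts, swapping the order of a narrow
and a generic entry changes the outcome, which is why narrow definitions
should come first.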
1863
Whether a call is allowed or not is only decided the first time this particular
call is made. The evaluation result is then cached, so that any attempts to
make the same call later will be allowed or disallowed right away. Because of
that, runtime log level evaluation should not significantly affect the process
performance. Log definition parsing is only done once, at the first call to
@code{GNUNET_log_setup ()} made by the process (which is usually done soon
after it starts).
1871
1872At the moment of writing there is no way to specify logging definitions from
1873configuration files, only via environment variables.
1874
1875At the moment GNUnet will stop processing a log definition when it encounters
1876an error in definition formatting or an error in regular expression syntax, and
1877will not report the failure in any way.
1878
1879
1880@c ***************************************************************************
1881@menu
1882* Examples::
1883* Log files::
1884* Updated behavior of GNUNET_log::
1885@end menu
1886
1887@node Examples
1888@subsubsection Examples
1889
1890@table @asis
1891
@item @code{GNUNET_FORCE_LOG=";;;;DEBUG" gnunet-arm -s}
Start GNUnet process tree, running all processes with DEBUG level (one should
be careful with it, as log files will grow at an alarming rate!)
@item @code{GNUNET_FORCE_LOG="core;;;;DEBUG" gnunet-arm -s}
Start GNUnet process tree, running the core service under DEBUG level
(everything else will use configured or default level).
@item @code{GNUNET_FORCE_LOG=";gnunet-service-transport_validation.c;;;DEBUG" gnunet-arm -s}
Start GNUnet process tree, allowing any logging calls from
gnunet-service-transport_validation.c (everything else will use configured or
default level).
@item @code{GNUNET_FORCE_LOG="fs;gnunet-service-fs_push.c;;;DEBUG" gnunet-arm -s}
Start GNUnet process tree, allowing any logging calls made by the fs
component from gnunet-service-fs_push.c (everything else will use configured
or default level).
@item @code{GNUNET_FORCE_LOG=";;GNUNET_NETWORK_socket_select;;DEBUG" gnunet-arm -s}
Start GNUnet process tree, allowing any logging calls from the
GNUNET_NETWORK_socket_select function (everything else will use configured or
default level).
@item @code{GNUNET_FORCE_LOG="transport.*;;.*send.*;;DEBUG/;;;;WARNING" gnunet-arm -s}
Start GNUnet process tree, allowing any logging calls from the components
that have "transport" in their names, and are made from functions that have
"send" in their names. Everything else will be allowed to be logged only if it
has WARNING level.
1915@end table
1916
1917
On Windows, one can use batch files to run GNUnet processes with special
environment variables, without affecting the whole system. Such a batch file
will look like this:

@example
set GNUNET_FORCE_LOG=;;do_transmit;;DEBUG
gnunet-arm -s
@end example

(note the absence of double quotes in the environment variable definition, as
opposed to earlier examples, which use the shell).

Another limitation: on Windows, GNUNET_FORCE_LOGFILE @strong{MUST} be set in
order for GNUNET_FORCE_LOG to work.
1925
1926
1927@c ***************************************************************************
1928@node Log files
1929@subsubsection Log files
1930
GNUnet can be told to log everything into a file instead of stderr (which is
the default) using the "--log-file=logfile" or "-l logfile" option. This option
can be passed via the command line, or from the "OPTIONS" and "GLOBAL_POSTFIX"
configuration keys (see above). The file name passed with this option is
subject to GNUnet filename expansion. If specified in "GLOBAL_POSTFIX", it is
also subject to ARM service filename expansion. In particular, it may contain
a "@{@}" (left and right curly brace) sequence, which will be replaced by ARM
with the name of the service. This is used to keep logs from more than one
service separate, while only specifying one template containing "@{@}" in
GLOBAL_POSTFIX.
1941
As part of a secondary file name expansion, the first occurrence of the "[]"
sequence ("left square brace" followed by "right square brace") in the file
name will be replaced with the process identifier of the process when it
initializes its logging subsystem. As a result, all processes will log into
different files. This is convenient for isolating messages of a particular
process, and prevents I/O races when multiple processes try to write into the
file at the same time. This expansion is done independently of the "@{@}"
expansion that the ARM service does (see above).
1950
1951The log file name that is specified via "-l" can contain format characters
1952from the 'strftime' function family. For example, "%Y" will be replaced with
1953the current year. Using "basename-%Y-%m-%d.log" would include the current
1954year, month and day in the log file. If a GNUnet process runs for long enough
1955to need more than one log file, it will eventually clean up old log files.
1956Currently, only the last three log files (plus the current log file) are
1957preserved. So once the fifth log file goes into use (so after 4 days if you
1958use "%Y-%m-%d" as above), the first log file will be automatically deleted.
1959Note that if your log file name only contains "%Y", then log files would be
1960kept for 4 years and the logs from the first year would be deleted once year 5
1961begins. If you do not use any date-related string format codes, logs would
1962never be automatically deleted by GNUnet.
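
The two process-side expansions described above can be sketched in a few lines
of C. This is only an illustration under stated assumptions, not the code
GNUnet uses: the function name @code{expand_log_name} is hypothetical, the
process identifier and the time are passed in explicitly so that the result is
reproducible, and the order of the two steps (first "[]", then the
@code{strftime} conversions) is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Expand a log file name template (hypothetical helper):
   1. the first "[]" is replaced with the decimal process identifier pid;
   2. strftime() conversions such as %Y, %m and %d are filled in from when.
   Returns 0 on success, -1 if the result does not fit into out. */
static int
expand_log_name (const char *tmpl, long pid, const struct tm *when,
                 char *out, size_t out_len)
{
  char with_pid[512];
  const char *brackets = strstr (tmpl, "[]");
  int n;

  if (NULL == brackets)
    n = snprintf (with_pid, sizeof with_pid, "%s", tmpl);
  else
    n = snprintf (with_pid, sizeof with_pid, "%.*s%ld%s",
                  (int) (brackets - tmpl), tmpl, pid, brackets + 2);
  if ((n < 0) || ((size_t) n >= sizeof with_pid))
    return -1;
  /* strftime() returns 0 when the output buffer is too small. */
  if (0 == strftime (out, out_len, with_pid, when))
    return -1;
  return 0;
}
```

In a real process one would pass the result of @code{getpid} and the current
local time. Because the day is part of the expanded name, a long-running
process that re-evaluates the template ends up with one file per day, which is
what makes the automatic cleanup of old log files possible.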
1963
1964
1965@c ***************************************************************************
1966
1967@node Updated behavior of GNUNET_log
1968@subsubsection Updated behavior of GNUNET_log
1969
1970It's currently quite common to see constructions like this all over the code:
@example
#if MESH_DEBUG
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG, "MESH: client disconnected\n");
#endif
@end example
1975
1976The reason for the #if is not to avoid displaying the message when disabled
1977(GNUNET_ERROR_TYPE takes care of that), but to avoid the compiler including it
1978in the binary at all, when compiling GNUnet for platforms with restricted
1979storage space / memory (MIPS routers, ARM plug computers / dev boards, etc).
1980
This presents several problems: the code gets ugly and hard to write, and it is
very easy to forget to include the #if guards, creating inconsistent code. A
new change in GNUNET_log aims to solve these problems.

@strong{This change requires running @code{./configure} with at least
@code{--enable-logging=verbose} to see debug messages.}
1987
1988Here is an example of code with dense debug statements:
@example
switch (restrict_topology)
@{
case GNUNET_TESTING_TOPOLOGY_CLIQUE:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but clique topology\n"));
#endif
  unblacklisted_connections =
    create_clique (pg, &remove_connections, BLACKLIST, GNUNET_NO);
  break;
case GNUNET_TESTING_TOPOLOGY_SMALL_WORLD_RING:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but small world (ring) topology\n"));
#endif
  unblacklisted_connections =
    create_small_world_ring (pg, &remove_connections, BLACKLIST);
  break;
@end example
2000
2001
2002Pretty hard to follow, huh?
2003
From now on, it is not necessary to include the #if / #endif statements to
achieve the same behavior. The GNUNET_log and GNUNET_log_from macros take care
of it for you, depending on the configure option:
2007@itemize @bullet
2008@item If @code{--enable-logging} is set to @code{no}, the binary will contain
2009no log messages at all.
2010@item If @code{--enable-logging} is set to @code{yes}, the binary will contain
2011no DEBUG messages, and therefore running with -L DEBUG will have no effect.
2012Other messages (ERROR, WARNING, INFO, etc) will be included.
@item If @code{--enable-logging} is set to @code{verbose} or
@code{veryverbose}, the binary will contain DEBUG messages (still, it will be
necessary to run with -L DEBUG or set the DEBUG config option to show them).
2016@end itemize
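
How a logging macro can make the #if guards unnecessary is sketched below. This
is a simplified illustration, not the definition of @code{GNUNET_log}: the
names @code{DEMO_EXTRA_LOGGING}, @code{DEMO_LOG} and @code{demo_emit} are
hypothetical, and @code{DEMO_EXTRA_LOGGING} merely stands in for whatever the
configure option defines. The idea is that the condition inside the macro
consists only of compile-time constants, so an optimizing compiler can
typically drop a disabled call together with its message string.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the setting chosen at configure time:
   0 = DEBUG calls are compiled out, 1 = DEBUG calls stay in the binary. */
#ifndef DEMO_EXTRA_LOGGING
#define DEMO_EXTRA_LOGGING 0
#endif

enum demo_level
{
  DEMO_ERROR = 1,
  DEMO_WARNING = 2,
  DEMO_INFO = 4,
  DEMO_DEBUG = 8
};

/* Number of messages that actually reached the logging backend. */
static int demo_messages_emitted;

static void
demo_emit (enum demo_level level, const char *text)
{
  demo_messages_emitted++;
  fprintf (stderr, "[%d] %s", (int) level, text);
}

/* With a constant level the whole condition is a compile-time constant,
   so no #if / #endif is needed at the call site. */
#define DEMO_LOG(level, text)                               \
  do {                                                      \
    if (DEMO_EXTRA_LOGGING || (DEMO_DEBUG != (level)))      \
      demo_emit ((level), (text));                          \
  } while (0)
```

With @code{DEMO_EXTRA_LOGGING} set to 0, a call such as
@code{DEMO_LOG (DEMO_DEBUG, "client disconnected\n")} does nothing, while
warnings and errors are still emitted. Whether a message is then actually
shown at run time remains a separate decision, as with -L DEBUG.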
2017
2018
2019If you are a developer:
2020@itemize @bullet
2021@item please make sure that you @code{./configure
2022--enable-logging=@{verbose,veryverbose@}}, so you can see DEBUG messages.
@item please remove the @code{#if} statements around @code{GNUNET_log
(GNUNET_ERROR_TYPE_DEBUG, ...)} lines, to improve the readability of your code.
2025@end itemize
2026
Since activating DEBUG now automatically makes it VERBOSE and activates
@strong{all} debug messages by default, you probably want to use the logging
functionality described at @uref{https://gnunet.org/logging} to filter only
the relevant messages. A suitable configuration could be:
@example
$ export GNUNET_FORCE_LOG="^YOUR_SUBSYSTEM$;;;;DEBUG/;;;;WARNING"
@end example

This will behave almost like enabling DEBUG in that subsystem before the
change. Of course, you can adapt it to your particular needs; this is only a
quick example.
2034
2035@c ***************************************************************************
2036@node Interprocess communication API (IPC)
2037@subsection Interprocess communication API (IPC)
2038
In GNUnet, a variety of new message types might be defined and used in
interprocess communication. In this tutorial we use the @code{struct
AddressLookupMessage} as an example to introduce how to construct our own
message type in GNUnet and how to implement the message communication between
service and client. (Here, a client uses the @code{struct
AddressLookupMessage} as a request to ask the server to return the address of
any other peer connecting to the service.)
2046
2047
2048@c ***************************************************************************
2049@menu
2050* Define new message types::
2051* Define message struct::
2052* Client: Establish connection::
2053* Client: Initialize request message::
2054* Client: Send request and receive response::
2055* Server: Startup service::
2056* Server: Add new handles for specified messages::
2057* Server: Process request message::
2058* Server: Response to client::
2059* Server: Notification of clients::
2060* Conversion between Network Byte Order (Big Endian) and Host Byte Order::
2061@end menu
2062
2063@node Define new message types
2064@subsubsection Define new message types
2065
2066First of all, you should define the new message type in
2067@code{gnunet_protocols.h}:
2068@example
// Request to look up the addresses of peers in the server.
#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP 29
// Response to the address lookup request.
#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY 30
2073@end example
2074
2075@c ***************************************************************************
2076@node Define message struct
2077@subsubsection Define message struct
2078
2079After the type definition, the specified message structure should also be
2080described in the header file, e.g. transport.h in our case.
@example
GNUNET_NETWORK_STRUCT_BEGIN

struct AddressLookupMessage
@{
  struct GNUNET_MessageHeader header;
  int32_t numeric_only GNUNET_PACKED;
  struct GNUNET_TIME_AbsoluteNBO timeout;
  uint32_t addrlen GNUNET_PACKED;
  /* followed by 'addrlen' bytes of the actual address, then
     followed by the 0-terminated name of the transport */
@};

GNUNET_NETWORK_STRUCT_END
@end example
2091
2092
Please note @code{GNUNET_NETWORK_STRUCT_BEGIN} and @code{GNUNET_PACKED}, which
both ensure correct alignment when sending structs over the network.
2095
2098
2099@c ***************************************************************************
2100@node Client: Establish connection
2101@subsubsection Client: Establish connection
2102@c %**end of header
2103
2104
First, on the client side, the underlying API is used to create a new
connection to a service; in our example, we connect to the transport service.
@example
struct GNUNET_CLIENT_Connection *client;

client = GNUNET_CLIENT_connect ("transport", cfg);
@end example
2112
2113@c ***************************************************************************
2114@node Client: Initialize request message
2115@subsubsection Client: Initialize request message
2116@c %**end of header
2117
When the connection is ready, we initialize the message. In this step, all the
fields of the message should be properly initialized, namely the size, the
type, and some extra user-defined data, such as the timeout, the address and
the name of the transport.
@example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
             + addressLen + strlen (nameTrans) + 1;
char *addrbuf;
char *tbuf;

/* allocate memory for the fixed part plus the variable-size tail */
msg = GNUNET_malloc (len);
msg->header.size = htons (len);
msg->header.type = htons (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP);
msg->timeout = GNUNET_TIME_absolute_hton (abs_timeout);
msg->addrlen = htonl (addressLen);
/* the address directly follows the struct ... */
addrbuf = (char *) &msg[1];
memcpy (addrbuf, address, addressLen);
/* ... and is followed by the 0-terminated name of the transport */
tbuf = &addrbuf[addressLen];
memcpy (tbuf, nameTrans, strlen (nameTrans) + 1);
@end example
2131
Note that the functions @code{htonl}, @code{htons} and
@code{GNUNET_TIME_absolute_hton} are applied here to convert values from host
byte order into network byte order (big endian). For the usage of the two byte
orders and the corresponding conversion functions, please refer to the section
on the conversion between network byte order and host byte order.
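
To see the resulting wire layout without a GNUnet installation, the
construction can be replayed with plain C types. The sketch below is a
self-contained approximation under several assumptions: the @code{demo_}
structs are simplified stand-ins for @code{struct GNUNET_MessageHeader} and
@code{struct AddressLookupMessage}, the timeout is reduced to a plain 64-bit
field that is left at zero, packing relies on the GCC/Clang
@code{__attribute__ ((packed))} extension, and @code{demo_build_lookup} is a
hypothetical helper.

```c
#include <arpa/inet.h>   /* htons, htonl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the message header. */
struct demo_header
{
  uint16_t size;          /* total message size, network byte order */
  uint16_t type;          /* message type, network byte order */
} __attribute__ ((packed));

/* Simplified stand-in for the lookup message. */
struct demo_lookup_message
{
  struct demo_header header;
  int32_t numeric_only;
  uint64_t timeout;       /* placeholder for the NBO timestamp */
  uint32_t addrlen;       /* network byte order */
  /* followed by addrlen bytes of address,
     then the 0-terminated name of the transport */
} __attribute__ ((packed));

#define DEMO_TYPE_ADDRESS_LOOKUP 29

/* Build a lookup request; the caller releases it with free().
   Returns NULL if the message would not fit into the 16-bit size field
   or if memory is exhausted. */
static struct demo_lookup_message *
demo_build_lookup (const void *address, uint32_t address_len,
                   const char *transport)
{
  size_t name_len = strlen (transport) + 1;
  size_t len;
  struct demo_lookup_message *msg;
  char *addrbuf;

  if ((address_len > UINT16_MAX) || (name_len > UINT16_MAX))
    return NULL;
  len = sizeof (struct demo_lookup_message) + address_len + name_len;
  if (len > UINT16_MAX)
    return NULL;
  msg = calloc (1, len);
  if (NULL == msg)
    return NULL;
  msg->header.size = htons ((uint16_t) len);
  msg->header.type = htons (DEMO_TYPE_ADDRESS_LOOKUP);
  msg->addrlen = htonl (address_len);
  addrbuf = (char *) &msg[1];                 /* first byte after the struct */
  memcpy (addrbuf, address, address_len);
  memcpy (addrbuf + address_len, transport, name_len);
  return msg;
}
```

The explicit check against the 16-bit limit is an addition of this sketch: the
size field of the header cannot describe a message larger than 65535 bytes, so
an oversized request is rejected instead of being silently truncated.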
2137
2138@c ***************************************************************************
2139@node Client: Send request and receive response
2140@subsubsection Client: Send request and receive response
2141@c %**end of header
2142
2143FIXME: This is very outdated, see the tutorial for the
2144current API!
2145
Next, the client sends the constructed message as a request to the service
and waits for the response from the service. To accomplish this goal, there are
a number of API calls that can be used. In this example,
@code{GNUNET_CLIENT_transmit_and_get_response} is chosen as the most
appropriate function to use.
@example
GNUNET_CLIENT_transmit_and_get_response (client, &msg->header, timeout,
                                         GNUNET_YES,
                                         &address_response_processor,
                                         arp_ctx);
@end example

The argument @code{address_response_processor} is a function of type
@code{GNUNET_CLIENT_MessageHandler}, which is used to process the reply
message from the service.
2160
2161@node Server: Startup service
2162@subsubsection Server: Startup service
2163
To be able to receive the request message, the service first runs the standard
GNUnet service startup sequence using @code{GNUNET_SERVICE_run}, as follows:
@example
int
main (int argc, char **argv)
@{
  return (GNUNET_OK ==
          GNUNET_SERVICE_run (argc, argv, "transport",
                              GNUNET_SERVICE_OPTION_NONE, &run, NULL)) ? 0 : 1;
@}
@end example
2171
2172@c ***************************************************************************
2173@node Server: Add new handles for specified messages
2174@subsubsection Server: Add new handles for specified messages
2175@c %**end of header
2176
In the function above, the argument @code{run} is used to initialize the
transport service, and is defined like this:
@example
static void
run (void *cls,
     struct GNUNET_SERVER_Handle *serv,
     const struct GNUNET_CONFIGURATION_Handle *cfg)
@{
  GNUNET_SERVER_add_handlers (serv, handlers);
@}
@end example
2184
2185
2186Here, @code{GNUNET_SERVER_add_handlers} must be called in the run function to
2187add new handlers in the service. The parameter @code{handlers} is a list of
2188@code{struct GNUNET_SERVER_MessageHandler} to tell the service which function
2189should be called when a particular type of message is received, and should be
2190defined in this way:
@example
static struct GNUNET_SERVER_MessageHandler handlers[] = @{
  @{&handle_start, NULL, GNUNET_MESSAGE_TYPE_TRANSPORT_START, 0@},
  @{&handle_send, NULL, GNUNET_MESSAGE_TYPE_TRANSPORT_SEND, 0@},
  @{&handle_try_connect, NULL, GNUNET_MESSAGE_TYPE_TRANSPORT_TRY_CONNECT,
   sizeof (struct TryConnectMessage)@},
  @{&handle_address_lookup, NULL,
   GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP, 0@},
  @{NULL, NULL, 0, 0@}
@};
@end example
2199
2200
As shown, the first member of each entry is a callback function, which is
called to process messages of the type given as the third member. The second
member is the closure for the callback function, which is set to @code{NULL}
in most cases, and the last member is the expected size of messages of this
type. Usually we set it to 0 to accept messages of variable size; in special
cases, the exact size of the specified message can be set instead. In
addition, the array is terminated by the entry @code{@{NULL, NULL, 0, 0@}}.
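
The way such a terminated table can be consulted is sketched below. This is
not the dispatch code of the GNUnet server; it is a minimal model in which
@code{struct demo_handler}, @code{demo_dispatch} and @code{count_call} are
hypothetical names, the callback signature is simplified, and the return
values are chosen for this illustration only. It shows the two rules stated
above: the table ends at the all-zero entry, and an expected size of 0 accepts
any size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified callback: closure, message type, message size. */
typedef void (*demo_callback) (void *cls, uint16_t type, uint16_t size);

struct demo_handler
{
  demo_callback callback;    /* function to call for this type */
  void *callback_cls;        /* closure passed to the callback */
  uint16_t type;             /* message type handled by this entry */
  uint16_t expected_size;    /* exact size, or 0 for variable size */
};

static int demo_calls;

static void
count_call (void *cls, uint16_t type, uint16_t size)
{
  (void) cls;
  (void) type;
  (void) size;
  demo_calls++;
}

/* Types 29 and 30 mirror the lookup and reply types defined earlier. */
static const struct demo_handler demo_handlers[] = {
  { &count_call, NULL, 29, 0 },   /* variable size */
  { &count_call, NULL, 30, 8 },   /* must be exactly 8 bytes */
  { NULL, NULL, 0, 0 }            /* terminator */
};

/* Returns 1 if a handler was called, 0 if no handler is registered for
   the type, and -1 if the size does not match the expected size. */
static int
demo_dispatch (const struct demo_handler *handlers,
               uint16_t type, uint16_t size)
{
  for (const struct demo_handler *h = handlers; NULL != h->callback; h++)
  {
    if (h->type != type)
      continue;
    if ((0 != h->expected_size) && (h->expected_size != size))
      return -1;
    h->callback (h->callback_cls, type, size);
    return 1;
  }
  return 0;
}
```

Handlers registered with size 0 still have to validate the size themselves,
which is why the request handler in the next section begins with a size check.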
2209
2210@c ***************************************************************************
2211@node Server: Process request message
2212@subsubsection Server: Process request message
2213@c %**end of header
2214
After the initialization of the transport service, the request message can be
processed. Before handling the main message data, the validity of the message
should be checked, e.g., whether the size of the message is correct.
@example
size = ntohs (message->size);
if (size < sizeof (struct AddressLookupMessage))
@{
  GNUNET_break_op (0);
  GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
  return;
@}
@end example
2223
2224
Note that, in contrast to the construction of the request message in the
client, the server should use the functions @code{ntohl} and @code{ntohs}
when extracting data from the message, so that data in network byte order
(big endian) is converted back into host byte order. For more details, please
refer to the section on the conversion between network byte order and host
byte order.
2230
Moreover, in this example the name of the transport stored in the message is a
0-terminated string, so we should also check whether the name of the transport
in the received message is 0-terminated:
@example
nameTransport = (const char *) &address[addressLen];
if (nameTransport[size - sizeof (struct AddressLookupMessage)
                  - addressLen - 1] != '\0')
@{
  GNUNET_break_op (0);
  GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
  return;
@}
@end example
2242
Here, @code{GNUNET_SERVER_receive_done} should be called to tell the service
that the request is done and that it can receive the next message. The argument
@code{GNUNET_SYSERR} here indicates that the service didn't understand the
request message, and the processing of this request is terminated.

In contrast, when the argument is equal to @code{GNUNET_OK}, the service
continues to process request messages from this client.
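
The checks from this section can be collected into one self-contained
validation routine. The following sketch again uses simplified @code{demo_}
stand-ins for the GNUnet structs (packed with the GCC/Clang attribute) and a
hypothetical function @code{demo_check_lookup}. Compared to the fragments
above it adds one assumption of its own: the address length taken from the
message is verified against the remaining payload before it is used as an
index.

```c
#include <arpa/inet.h>   /* ntohs, ntohl (POSIX) */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct demo_header
{
  uint16_t size;
  uint16_t type;
} __attribute__ ((packed));

struct demo_lookup_message
{
  struct demo_header header;
  int32_t numeric_only;
  uint64_t timeout;
  uint32_t addrlen;
} __attribute__ ((packed));

/* Validate a raw lookup message of buf_len bytes.
   On success returns 0 and stores pointers into buf;
   on any inconsistency returns -1. */
static int
demo_check_lookup (const void *buf, size_t buf_len,
                   const char **address, uint32_t *address_len,
                   const char **transport)
{
  struct demo_lookup_message fixed;
  uint16_t size;
  uint32_t alen;
  const char *payload;
  size_t payload_len;

  if (buf_len < sizeof fixed)
    return -1;
  memcpy (&fixed, buf, sizeof fixed);   /* copy: buf may be unaligned */
  size = ntohs (fixed.header.size);
  if ((size < sizeof fixed) || (size > buf_len))
    return -1;
  alen = ntohl (fixed.addrlen);
  payload = (const char *) buf + sizeof fixed;
  payload_len = size - sizeof fixed;
  /* the payload must hold the address plus at least the terminating 0 */
  if (alen >= payload_len)
    return -1;
  if ('\0' != payload[payload_len - 1])
    return -1;
  *address = payload;
  *address_len = alen;
  *transport = payload + alen;
  return 0;
}
```

In a real handler, every @code{return -1} path corresponds to the
@code{GNUNET_break_op (0)} and @code{GNUNET_SERVER_receive_done (client,
GNUNET_SYSERR)} sequence shown above.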
2250
2251@c ***************************************************************************
2252@node Server: Response to client
2253@subsubsection Server: Response to client
2254@c %**end of header
2255
Once the processing of the current request is done, the server should send the
response to the client. A new @code{struct AddressLookupMessage} is
produced by the server in a similar way as the client did and sent to the
client, but here the type should be
@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY} rather than the
@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP} used by the client.
@example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
             + addressLen + strlen (nameTrans) + 1;

msg = GNUNET_malloc (len);
msg->header.size = htons (len);
msg->header.type = htons (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);

// ...

struct GNUNET_SERVER_TransmitContext *tc;

tc = GNUNET_SERVER_transmit_context_create (client);
GNUNET_SERVER_transmit_context_append_data (tc, NULL, 0,
                                            GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
GNUNET_SERVER_transmit_context_run (tc, rtimeout);
@end example
2276
2277
Note that there are also a number of other APIs provided to the service to
send messages.
2280
2281@c ***************************************************************************
2282@node Server: Notification of clients
2283@subsubsection Server: Notification of clients
2284@c %**end of header
2285
Often a service needs to (repeatedly) transmit notifications to a client or a
group of clients. In these cases, the client has typically registered once for
a set of events and then needs to receive a message whenever such an event
happens (until the client disconnects). The use of a notification context can
help manage message queues to clients and handle disconnects. Notification
contexts can be used to send individualized messages to a particular client or
to broadcast messages to a group of clients. An individualized notification
might look like this:
@example
GNUNET_SERVER_notification_context_unicast (nc, client, msg, GNUNET_YES);
@end example
2298
2299
Note that after processing the original registration message for notifications,
the server code still typically needs to call
@code{GNUNET_SERVER_receive_done} so that the client can transmit further
messages to the server.
2304
2305@c ***************************************************************************
2306@node Conversion between Network Byte Order (Big Endian) and Host Byte Order
2307@subsubsection Conversion between Network Byte Order (Big Endian) and Host Byte Order
2308@c %** subsub? it's a referenced page on the ipc document.
2309@c %**end of header
2310
Network byte order is big endian. Host byte order is whatever order the CPU of
the host uses; on many common platforms (for example x86) it is little endian.
What is the difference between the two?

Consider an integer that occupies 4 bytes in RAM. In big endian (network byte
order), the most significant byte is stored at the lowest address and the
least significant byte at the highest address. In little endian, it is exactly
the opposite: the least significant byte is stored at the lowest address and
the most significant byte at the highest address.

Hosts communicate by exchanging data packets over the network, and the sender
and the receiver may use different host byte orders. In order to preserve the
meaning of the data during transmission, multi-byte values must be converted
to network byte order before sending and converted back to host byte order
after receiving.
2328
There are ten convenient functions to realize the conversion of byte order in
GNUnet, as follows:
@table @asis

@item @code{uint16_t htons(uint16_t hostshort)}
Convert a short int from host byte order to network byte order.
@item @code{uint32_t htonl(uint32_t hostlong)}
Convert a long int from host byte order to network byte order.
@item @code{uint16_t ntohs(uint16_t netshort)}
Convert a short int from network byte order to host byte order.
@item @code{uint32_t ntohl(uint32_t netlong)}
Convert a long int from network byte order to host byte order.
@item @code{unsigned long long GNUNET_ntohll (unsigned long long netlonglong)}
Convert a long long int from network byte order to host byte order.
@item @code{unsigned long long GNUNET_htonll (unsigned long long hostlonglong)}
Convert a long long int from host byte order to network byte order.
@item @code{struct GNUNET_TIME_RelativeNBO GNUNET_TIME_relative_hton (struct GNUNET_TIME_Relative a)}
Convert relative time to network byte order.
@item @code{struct GNUNET_TIME_Relative GNUNET_TIME_relative_ntoh (struct GNUNET_TIME_RelativeNBO a)}
Convert relative time from network byte order.
@item @code{struct GNUNET_TIME_AbsoluteNBO GNUNET_TIME_absolute_hton (struct GNUNET_TIME_Absolute a)}
Convert absolute time to network byte order.
@item @code{struct GNUNET_TIME_Absolute GNUNET_TIME_absolute_ntoh (struct GNUNET_TIME_AbsoluteNBO a)}
Convert absolute time from network byte order.
@end table
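
What a 64-bit conversion has to do can be shown without knowing the byte order
of the host. The sketch below is an illustration with hypothetical names
(@code{demo_htonll} and @code{demo_ntohll}); it is not the implementation of
@code{GNUNET_htonll}. It builds the big-endian byte sequence explicitly with
shifts, so the same code is correct on little-endian and big-endian hosts.

```c
#include <arpa/inet.h>   /* htonl, ntohl (POSIX), used for comparison */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Host to network byte order for 64-bit values: the most significant
   byte is placed at the lowest address. */
static uint64_t
demo_htonll (uint64_t host)
{
  unsigned char bytes[8];
  uint64_t net;

  for (int i = 0; i < 8; i++)
    bytes[i] = (unsigned char) (host >> (56 - 8 * i));
  memcpy (&net, bytes, sizeof net);
  return net;
}

/* Network to host byte order: read the bytes from the lowest address
   upwards and accumulate them, most significant byte first. */
static uint64_t
demo_ntohll (uint64_t net)
{
  unsigned char bytes[8];
  uint64_t host = 0;

  memcpy (bytes, &net, sizeof bytes);
  for (int i = 0; i < 8; i++)
    host = (host << 8) | bytes[i];
  return host;
}
```

After @code{demo_htonll (0x0102030405060708)}, the byte at the lowest address
of the result is 0x01 on every host, and applying @code{demo_ntohll} to the
result gives back the original value.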
2359
2360@c ***************************************************************************
2361
2362@node Cryptography API
2363@subsection Cryptography API
2364@c %**end of header
2365
The gnunetutil APIs provide the cryptographic primitives used in GNUnet.
GNUnet uses 2048 bit RSA keys for the session key exchange and for signing
messages by peers and most other public-key operations. Most researchers in
cryptography consider 2048 bit RSA keys as secure and practically unbreakable
for a long time. The API provides functions to create a fresh key pair, read a
private key from a file (or create a new file if the file does not exist),
encrypt, decrypt, sign, verify and extract the public key into a format
suitable for network transmission.

For the encryption of files and the actual data exchanged between peers GNUnet
uses 256-bit AES encryption. Fresh session keys are negotiated for every new
connection. Again, there is no published technique to break this cipher in any
realistic amount of time. The API provides functions for generation of keys,
validation of keys (important for checking that decryptions using RSA
succeeded), encryption and decryption.
2381
2382GNUnet uses SHA-512 for computing one-way hash codes. The API provides
2383functions to compute a hash over a block in memory or over a file on disk.
2384
The crypto API also provides functions for randomizing a block of memory,
obtaining a single random number and for generating a permutation of the
numbers 0 to n-1. Random number generation distinguishes between WEAK and
STRONG random number quality; WEAK random numbers are pseudo-random whereas
STRONG random numbers use entropy gathered from the operating system.

Finally, the crypto API provides a means to deterministically generate a
1024-bit RSA key from a hash code. These functions should most likely not be
used by most applications; most importantly,
GNUNET_CRYPTO_rsa_key_create_from_hash does not create an RSA-key that should
be considered secure for traditional applications of RSA.
2396
2397@c ***************************************************************************
2398@node Message Queue API
2399@subsection Message Queue API
2400@c %**end of header
2401
@strong{Introduction}

Often, applications need to queue messages that are to be sent to other GNUnet
peers, clients or services. As all of GNUnet's message-based communication
APIs, by design, do not allow messages to be queued, it is common to implement
custom message queues manually when they are needed. However, writing very
similar code in multiple places is tedious and leads to code duplication.

MQ (for Message Queue) is an API that provides the functionality to implement
and use message queues. We intend to eventually replace all of the custom
message queue implementations in GNUnet with MQ.

@strong{Basic Concepts}

The two most important entities in MQ are queues and envelopes.

Every queue is backed by a specific implementation (e.g. for mesh, stream,
connection, server client, etc.) that will actually deliver the queued
messages. For convenience, some queues also allow specifying a list of message
handlers. The message queue will then also wait for incoming messages and
dispatch them appropriately.

An envelope holds the memory for a message, as well as metadata (Where is
the envelope queued? What should happen after it has been sent?). Any envelope
can only be queued in one message queue.
2425
@strong{Creating Queues}

The following is a list of currently available message queues. Note that to
avoid layering issues, message queues for higher level APIs are not part of
@code{libgnunetutil}, but the respective API itself provides the queue
implementation.
@table @asis

@item @code{GNUNET_MQ_queue_for_connection_client}
Transmits queued messages over a @code{GNUNET_CLIENT_Connection} handle. Also
supports receiving with message handlers.

@item @code{GNUNET_MQ_queue_for_server_client}
Transmits queued messages over a @code{GNUNET_SERVER_Client} handle. Does not
support incoming message handlers.

@item @code{GNUNET_MESH_mq_create}
Transmits queued messages over a @code{GNUNET_MESH_Tunnel} handle. Does not
support incoming message handlers.

@item @code{GNUNET_MQ_queue_for_callbacks}
This is the most general implementation. Instead of delivering and receiving
messages with one of GNUnet's communication APIs, implementation callbacks are
called. Refer to "Implementing Queues" for a more detailed explanation.
@end table
2448
2449
@strong{Allocating Envelopes}

A GNUnet message (as defined by the GNUNET_MessageHeader) has three parts: the
size, the type, and the body.

MQ provides macros to conveniently allocate an envelope containing a message,
automatically setting the size and type fields of the message.

Consider the following simple message, with the body consisting of a single
number value.
@example
struct NumberMessage
@{
  /** Type: GNUNET_MESSAGE_TYPE_EXAMPLE_1 */
  struct GNUNET_MessageHeader header;
  uint32_t number GNUNET_PACKED;
@};
@end example

An envelope containing an instance of the NumberMessage can be constructed like
this:
@example
struct GNUNET_MQ_Envelope *ev;
struct NumberMessage *msg;

ev = GNUNET_MQ_msg (msg, GNUNET_MESSAGE_TYPE_EXAMPLE_1);
msg->number = htonl (42);
@end example
2470
2471
2472In the above code, @code{GNUNET_MQ_msg} is a macro. The return value is the
2473newly allocated envelope. The first argument must be a pointer to some
2474@code{struct} containing a @code{struct GNUNET_MessageHeader header} field,
2475while the second argument is the desired message type, in host byte order.
2476
2477The @code{msg} pointer now points to an allocated message, where the message
2478type and the message size are already set. The message's size is inferred from
2479the type of the @code{msg} pointer: It will be set to 'sizeof(*msg)', properly
2480converted to network byte order.
2481
If the message body's size is dynamic, the macro @code{GNUNET_MQ_msg_extra}
can be used to allocate an envelope whose message has additional space
allocated after the @code{msg} structure.
2485
2486If no structure has been defined for the message,
2487@code{GNUNET_MQ_msg_header_extra} can be used to allocate additional space
2488after the message header. The first argument then must be a pointer to a
2489@code{GNUNET_MessageHeader}.
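
How a macro can infer the message size from the type of the @code{msg} pointer
is sketched below. This is a toy model, not the MQ implementation: all
@code{demo_} names are hypothetical, the envelope carries no queue metadata,
the message type 1001 used in the usage note is made up, and unlike
@code{GNUNET_MQ_msg} the macro here takes the envelope variable as an extra
argument instead of returning the envelope.

```c
#include <arpa/inet.h>   /* htons (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the message header; both fields in network byte order. */
struct demo_header
{
  uint16_t size;
  uint16_t type;
};

/* Stand-in for the NumberMessage from the text. */
struct demo_number_message
{
  struct demo_header header;
  uint32_t number;
};

/* Toy envelope: it only owns the memory of its message. */
struct demo_envelope
{
  struct demo_header *mh;
};

/* Allocate an envelope with a zeroed message of the given size and fill
   in the header.  Stores the message in *msg; returns NULL (and leaves
   *msg untouched) if memory is exhausted. */
static struct demo_envelope *
demo_msg_alloc (void **msg, uint16_t size, uint16_t type)
{
  struct demo_envelope *ev = malloc (sizeof *ev);

  if (NULL == ev)
    return NULL;
  ev->mh = calloc (1, size);
  if (NULL == ev->mh)
  {
    free (ev);
    return NULL;
  }
  ev->mh->size = htons (size);
  ev->mh->type = htons (type);
  *msg = ev->mh;
  return ev;
}

/* The size is taken from the pointed-to type of mvar, so the caller
   never writes it down; type is given in host byte order. */
#define DEMO_MQ_MSG(ev, mvar, type)                                        \
  do {                                                                     \
    void *demo_raw_ = NULL;                                                \
    (ev) = demo_msg_alloc (&demo_raw_, (uint16_t) sizeof *(mvar), (type)); \
    (mvar) = demo_raw_;                                                    \
  } while (0)

/* Release an envelope that was never sent. */
static void
demo_discard (struct demo_envelope *ev)
{
  free (ev->mh);
  free (ev);
}
```

A caller declares a @code{struct demo_envelope *ev} and a
@code{struct demo_number_message *msg}, invokes
@code{DEMO_MQ_MSG (ev, msg, 1001)}, and then only has to fill in the body, for
example with @code{msg->number = htonl (42)}. Size and type are already set in
network byte order.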
2490
@strong{Envelope Properties}

A few functions in MQ allow setting additional properties on envelopes:
@table @asis

@item @code{GNUNET_MQ_notify_sent}
Allows specifying a function that will be called once the envelope's message
has been sent irrevocably. An envelope can be canceled precisely up to the
point where the notify sent callback has been called.

@item @code{GNUNET_MQ_disable_corking}
No corking will be used when sending the message. Not every queue supports
this flag; by default, envelopes are sent with corking.

@end table
2504
2505
@strong{Sending Envelopes}

Once an envelope has been constructed, it can be queued for sending with
@code{GNUNET_MQ_send}.

Note that in order to avoid memory leaks, an envelope must either be sent (the
queue will free it) or destroyed explicitly with @code{GNUNET_MQ_discard}.

@strong{Canceling Envelopes}

An envelope queued with @code{GNUNET_MQ_send} can be canceled with
@code{GNUNET_MQ_cancel}. Note that after the notify sent callback has been
called, canceling a message results in undefined behavior. Thus it is unsafe
to cancel an envelope that does not have a notify sent callback. When
canceling an envelope, it is not necessary to call @code{GNUNET_MQ_discard},
and the envelope can't be sent again.

@strong{Implementing Queues}

@code{TODO}
2520
2521@c ***************************************************************************
2522@node Service API
2523@subsection Service API
2524@c %**end of header
2525
Most GNUnet code lives in the form of services. Services are processes that
offer an API for other components of the system to build on. Those other
components can be command-line tools for users, graphical user interfaces or
other services. Services provide their API using an IPC protocol. For this,
each service must listen on either a TCP port or a UNIX domain socket; to do
so, the service implementation uses the server API. This use of the server API
is exposed directly to the users of the service API. Thus, when using the
service API, one is usually also using large parts of the server API. The
service API provides various convenience functions, such as parsing
command-line arguments and the configuration file, which are not found in the
server API. The dual to the service/server API is the client API, which can be
used to access services.
2539The most common way to start a service is to use the GNUNET_SERVICE_run
2540function from the program's main function. GNUNET_SERVICE_run will then parse
2541the command line and configuration files and, based on the options found there,
2542start the server. It will then give back control to the main program, passing
2543the server and the configuration to the GNUNET_SERVICE_Main callback.
2544GNUNET_SERVICE_run will also take care of starting the scheduler loop. If this
2545is inappropriate (for example, because the scheduler loop is already running),
2546GNUNET_SERVICE_start and related functions provide an alternative to
2547GNUNET_SERVICE_run.
2548
2549When starting a service, the service_name option is used to determine which
2550sections in the configuration file should be used to configure the service. A
2551typical value here is the name of the src/ sub-directory, for example
2552"statistics". The same string would also be given to GNUNET_CLIENT_connect to
2553access the service.
2554
Once a service has been initialized, the program should use the
GNUNET_SERVICE_Main callback to register message handlers using
GNUNET_SERVER_add_handlers. The service will already have registered a handler
for the "TEST" message.
2559
2560The option bitfield (enum GNUNET_SERVICE_Options) determines how a service
2561should behave during shutdown. There are three key strategies:
@table @asis

@item instant (GNUNET_SERVICE_OPTION_NONE)
Upon receiving the shutdown signal from the scheduler, the service immediately
terminates the server, closing all existing connections with clients.

@item manual (GNUNET_SERVICE_OPTION_MANUAL_SHUTDOWN)
The service does nothing by itself during shutdown. The main program will need
to take the appropriate action by calling GNUNET_SERVER_destroy or
GNUNET_SERVICE_stop (depending on how the service was initialized) to
terminate the service. This method is used by gnunet-service-arm and is rather
uncommon.

@item soft (GNUNET_SERVICE_OPTION_SOFT_SHUTDOWN)
Upon receiving the shutdown signal from the scheduler, the service immediately
tells the server to stop listening for incoming clients. Requests from normal
existing clients are still processed and the server/service terminates once
all normal clients have disconnected. Clients that are not expected to ever
disconnect (such as clients that monitor performance values) can be marked as
'monitor' clients using GNUNET_SERVER_client_mark_monitor. Those clients will
continue to be processed until all 'normal' clients have disconnected. Then,
the server will terminate, closing the monitor connections. This mode is for
example used by 'statistics', allowing existing 'normal' clients to set
(possibly persistent) statistic values before terminating.
@end table
2586
2587@c ***************************************************************************
2588@node Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
2589@subsection Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
2590@c %**end of header
2591
2592A commonly used data structure in GNUnet is a (multi-)hash map. It is most
2593often used to map a peer identity to some data structure, but also to map
2594arbitrary keys to values (for example to track requests in the distributed hash
2595table or in file-sharing). As it is commonly used, the hash map is actually
2596sometimes responsible for a large share of GNUnet's overall memory consumption
2597(for some processes, 30% is not uncommon). The following text documents some
2598API quirks (and their implications for applications) that were recently
2599introduced to minimize the footprint of the hash map.
2600
2601
2602@c ***************************************************************************
2603@menu
2604* Analysis::
2605* Solution::
2606* Migration::
2607* Conclusion::
2608* Availability::
2609@end menu
2610
2611@node Analysis
2612@subsubsection Analysis
2613@c %**end of header
2614
2615The main reason for the "excessive" memory consumption by the hash map is that
2616GNUnet uses 512-bit cryptographic hash codes --- and the (multi-)hash map also
2617uses the same 512-bit 'struct GNUNET_HashCode'. As a result, storing just the
2618keys requires 64 bytes of memory for each key. As some applications like to
2619keep a large number of entries in the hash map (after all, that's what maps
2620are good for), 64 bytes per hash is significant: keeping a pointer to the
2621value and having a linked list for collisions consume between 8 and 16 bytes,
2622and 'malloc' may add about the same overhead per allocation, putting us in the
262316 to 32 bytes per entry ballpark. Adding a 64-byte key then raises the
2624overall memory requirement for the hash map by a factor of three to five.
2625
2626To make things "worse", most of the time storing the key in the hash map is
2627not required: it is typically already in memory elsewhere! In most cases, the
2628values stored in the hash map are some application-specific struct that @emph{also}
2629contains the hash. Here is a simplified example:
@example
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};

// ...
val = GNUNET_malloc (sizeof (struct MyValue));
val->key = key;
val->my_data = 42;
GNUNET_CONTAINER_multihashmap_put (map, &key, val, ...);
@end example
2638
2639
2640This is a common pattern as later the entries might need to be removed, and at
2641that time it is convenient to have the key immediately at hand:
@example
GNUNET_CONTAINER_multihashmap_remove (map, &val->key, val);
@end example
2645
2646
2647Note that here we end up with two times 64 bytes for the key, plus maybe 64
2648bytes total for the rest of the 'struct MyValue' and the map entry in the hash
2649map. The resulting redundant storage of the key increases overall memory
2650consumption per entry from the "optimal" 128 bytes to 192 bytes. This is not
2651just an extreme example: overheads in practice are actually sometimes close to
2652those highlighted in this example. This is especially true for maps with a
2653significant number of entries, as there we tend to really try to keep the
2654entries small.
2655@c ***************************************************************************
2656@node Solution
2657@subsubsection Solution
2658@c %**end of header
2659
2660The solution that has now been implemented is to @strong{optionally} allow the
2661hash map to not make a (deep) copy of the hash but instead have a pointer to
2662the hash/key in the entry. This reduces the memory consumption for the key
2663from 64 bytes to 4 to 8 bytes. However, it can also only work if the key is
2664actually stored in the entry (which is the case most of the time) and if the
2665entry does not modify the key (which, in all of the code I'm aware of, has
2666always been the case if the key is stored in the entry). Finally, when the client
2667stores an entry in the hash map, it @strong{must} provide a pointer to the key
2668within the entry, not just a pointer to a transient location of the key. If
2669the client code does not meet these requirements, the result is a dangling
2670pointer and undefined behavior of the (multi-)hash map API.
2671@c ***************************************************************************
2672@node Migration
2673@subsubsection Migration
2674@c %**end of header
2675
2676To use the new feature, first check that the values contain the respective key
2677(and never modify it). Then, all calls to
2678@code{GNUNET_CONTAINER_multihashmap_put} on the respective map must be audited
2679and most likely changed to pass a pointer into the value's struct. For the
2680initial example, the new code would look like this:
@example
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};

// ...
val = GNUNET_malloc (sizeof (struct MyValue));
val->key = key;
val->my_data = 42;
GNUNET_CONTAINER_multihashmap_put (map, &val->key, val, ...);
@end example
2689
2690
2691Note that @code{&key} was changed to @code{&val->key} in the argument to the
2692@code{put} call. This is critical as often @code{key} is on the stack or in
2693some other transient data structure and thus having the hash map keep a pointer
2694to @code{key} would not work. Only the key inside of @code{val} has the same
2695lifetime as the entry in the map (this must of course be checked as well).
2696Naturally, @code{val->key} must be initialized before the @code{put} call. Once
2697all @code{put} calls have been converted and double-checked, you can change the
2698call to create the hash map from
@example
map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_NO);
@end example
2703
2704to
2705
@example
map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_YES);
@end example
2709
2710If everything was done correctly, you now use about 60 bytes less memory per
2711entry in @code{map}. However, if now (or in the future) any call to @code{put}
2712does not ensure that the given key is valid until the entry is removed from the
2713map, undefined behavior is likely to be observed.
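
The size of this saving can be checked with a small, self-contained sketch.
The two entry layouts below are hypothetical simplifications and not the actual
internal structures of the multi hash map; @code{struct HashCode} is a stand-in
for the 512-bit @code{struct GNUNET_HashCode}:
@example
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the 512-bit hash code (64 bytes). */
struct HashCode
@{
  uint32_t bits[16];
@};

/* Hypothetical entry that keeps its own copy of the key. */
struct CopyEntry
@{
  struct HashCode key;
  void *value;
  struct CopyEntry *next;
@};

/* Hypothetical entry that only points at the key stored in the value. */
struct PointerEntry
@{
  const struct HashCode *key;
  void *value;
  struct PointerEntry *next;
@};

/* Bytes saved per entry by storing a pointer instead of a copy. */
static size_t
saved_bytes_per_entry (void)
@{
  return sizeof (struct CopyEntry) - sizeof (struct PointerEntry);
@}
@end example

With 8-byte pointers this typically evaluates to 56 bytes, and with 4-byte
pointers to 60 bytes, which is in line with the estimate of about 60 bytes per
entry.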
2714@c ***************************************************************************
2715@node Conclusion
2716@subsubsection Conclusion
2717@c %**end of header
2718
2719The new optimization is often applicable and can result in a reduction in
2720memory consumption of up to 30% in practice. However, it makes the code less
2721robust as additional invariants are imposed on the multi hash map client. Thus,
2722applications should refrain from enabling the new mode unless the resulting
2723memory savings are deemed significant enough. In particular, it should
2724generally not be used in new code (wait at least until benchmarks exist).
2725@c ***************************************************************************
2726@node Availability
2727@subsubsection Availability
2728@c %**end of header
2729
2730The new multi hash map code was committed in SVN 24319 (will be in GNUnet
27310.9.4). Various subsystems (transport, core, dht, file-sharing) were
2732previously audited and modified to take advantage of the new capability. In
2733particular, memory consumption of the file-sharing service is expected to drop
2734by 20-30% due to this change.
2735
2736@c ***************************************************************************
2737@node The CONTAINER_MDLL API
2738@subsection The CONTAINER_MDLL API
2739@c %**end of header
2740
2741This text documents the GNUNET_CONTAINER_MDLL API. The GNUNET_CONTAINER_MDLL
2742API is similar to the GNUNET_CONTAINER_DLL API in that it provides operations
2743for the construction and manipulation of doubly-linked lists. The key
2744difference to the (simpler) DLL-API is that the MDLL-version allows a single
2745element (instance of a "struct") to be in multiple linked lists at the same
2746time.
2747
2748Like the DLL API, the MDLL API stores (most of) the data structures for the
2749doubly-linked list with the respective elements; only the 'head' and 'tail'
2750pointers are stored "elsewhere" --- and the application needs to provide the
2751locations of head and tail to each of the calls in the MDLL API. The key
2752difference for the MDLL API is that the "next" and "previous" pointers in the
2753struct can no longer be simply called "next" and "prev" --- after all, the
2754element may be in multiple doubly-linked lists, so we cannot just have one
2755"next" and one "prev" pointer!
2756
2757The solution is to have multiple fields that must have a name of the format
2758"next_XX" and "prev_XX" where "XX" is the name of one of the doubly-linked
2759lists. Here is a simple example:
@example
struct MyMultiListElement @{
  struct MyMultiListElement *next_ALIST;
  struct MyMultiListElement *prev_ALIST;
  struct MyMultiListElement *next_BLIST;
  struct MyMultiListElement *prev_BLIST;
  void *data;
@};
@end example
2766
2767
2768Note that by convention, we use all-uppercase letters for the list names. In
2769addition, the program needs to have a location for the head and tail pointers
2770for both lists, for example:
@example
static struct MyMultiListElement *head_ALIST;
static struct MyMultiListElement *tail_ALIST;
static struct MyMultiListElement *head_BLIST;
static struct MyMultiListElement *tail_BLIST;
@end example
2776
2777
2778Using the MDLL-macros, we can now insert an element into the ALIST:
@example
GNUNET_CONTAINER_MDLL_insert (ALIST, head_ALIST, tail_ALIST, element);
@end example
2782
2783
2784Passing "ALIST" as the first argument to MDLL specifies which of the next/prev
2785fields in the 'struct MyMultiListElement' should be used. The extra "ALIST"
2786argument and the "_ALIST" in the names of the next/prev-members are the only
2787differences between the MDLL and DLL-API. Like the DLL-API, the MDLL-API offers
2788macros for inserting (at head, at tail, after a given element) and removing
2789elements from the list. Iterating over the list should be done by directly
2790accessing the "next_XX" and/or "prev_XX" members.
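
The token-pasting idea behind the list name argument can be illustrated with a
self-contained sketch. The macro @code{MY_MDLL_insert} and the helpers
@code{insert_into_both} and @code{count_ALIST} are hypothetical stand-ins
written for this example; they are not the GNUnet macros themselves:
@example
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an MDLL insert macro: insert 'element'
   at the head of the list named 'mdll'. */
#define MY_MDLL_insert(mdll, head, tail, element) do @{ \
    assert (NULL == (element)->prev_ ## mdll); \
    assert (NULL == (element)->next_ ## mdll); \
    (element)->next_ ## mdll = (head); \
    (element)->prev_ ## mdll = NULL; \
    if (NULL == (tail)) \
      (tail) = (element); \
    else \
      (head)->prev_ ## mdll = (element); \
    (head) = (element); \
  @} while (0)

struct MyMultiListElement
@{
  struct MyMultiListElement *next_ALIST;
  struct MyMultiListElement *prev_ALIST;
  struct MyMultiListElement *next_BLIST;
  struct MyMultiListElement *prev_BLIST;
  void *data;
@};

static struct MyMultiListElement *head_ALIST;
static struct MyMultiListElement *tail_ALIST;
static struct MyMultiListElement *head_BLIST;
static struct MyMultiListElement *tail_BLIST;

/* Put 'e' into both lists; the same struct is linked twice. */
static void
insert_into_both (struct MyMultiListElement *e)
@{
  MY_MDLL_insert (ALIST, head_ALIST, tail_ALIST, e);
  MY_MDLL_insert (BLIST, head_BLIST, tail_BLIST, e);
@}

/* Count the elements of ALIST by following the next_ALIST pointers. */
static unsigned int
count_ALIST (void)
@{
  unsigned int n = 0;
  const struct MyMultiListElement *pos = head_ALIST;

  while (NULL != pos)
  @{
    n++;
    pos = pos->next_ALIST;
  @}
  return n;
@}
@end example

After @code{insert_into_both}, the same element is reachable from
@code{head_ALIST} and from @code{head_BLIST}, while each list only follows its
own pair of pointers.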
2791
2792@c ***************************************************************************
2793@node The Automatic Restart Manager (ARM)
2794@section The Automatic Restart Manager (ARM)
2795@c %**end of header
2796
2797GNUnet's Automatic Restart Manager (ARM) is the GNUnet service responsible for
2798system initialization and service babysitting. ARM starts and halts services,
2799detects configuration changes and restarts services impacted by the changes as
2800needed. It is also responsible for restarting services in case of crashes and is
2801planned to incorporate automatic debugging for diagnosing service crashes,
2802providing developers with insights into crash reasons. The purpose of this document
2803is to give GNUnet developers an idea of how ARM works and how to interact
2804with it.
2805
2806@menu
2807* Basic functionality::
2808* Key configuration options::
2809* Availability2::
2810* Reliability::
2811@end menu
2812
2813@c ***************************************************************************
2814@node Basic functionality
2815@subsection Basic functionality
2816@c %**end of header
2817
2818@itemize @bullet
2819@item ARM source code can be found under "src/arm". Service processes are
2820managed by the functions in "gnunet-service-arm.c", which is controlled with
2821"gnunet-arm.c" (the main function in that file is ARM's entry point).
2822
2823@item The functions responsible for communicating with ARM and for starting and
2824stopping services (including the ARM service itself) are provided by the ARM API
2825in "arm_api.c". The function GNUNET_ARM_connect() returns to the caller an ARM
2826handle after setting it to the caller's context (configuration and scheduler in
2827use). This handle can be used afterwards by the caller to communicate with ARM.
2828The functions GNUNET_ARM_start_service() and GNUNET_ARM_stop_service() are used for
2829starting and stopping services, respectively.
2830
2831@item A typical example of using these basic ARM services can be found in the file
2832test_arm_api.c. The test case connects to ARM, starts it, then uses it to start
2833the "resolver" service, stops "resolver" and finally stops ARM itself.
2834@end itemize
2835
2836@c ***************************************************************************
2837@node Key configuration options
2838@subsection Key configuration options
2839@c %**end of header
2840
2841Configurations for ARM and services should be available in a .conf file (as an
2842example, see test_arm_api_data.conf). When running ARM, the configuration file
2843to use should be passed to the command:@*
2844@code{$ gnunet-arm -s -c configuration_to_use.conf}@*
2845If no configuration is passed, the default configuration file will be used (see
2846GNUNET_PREFIX/share/gnunet/defaults.conf, which is created from contrib/defaults.conf).
2847Each service has a section that starts with the service name in square brackets,
2848for example "[arm]". The following options configure how ARM configures or interacts
2849with the various services:
2850
2851@table @asis
2852
2853@item PORT
2854Port number on which the service is listening for incoming TCP connections. ARM will
2855start the service should it notice a request at this port.
2856
2857@item HOSTNAME
2858Specifies on which host the service is deployed. Note that ARM can only start services
2859that are running on the local system (but will not check that the hostname matches the
2860local machine name). This option is used by the @code{gnunet_client_lib.h}
2861implementation to determine which system to connect to. The default is "localhost".
2862
2863@item BINARY
2864The name of the service binary file.
2865@item OPTIONS
2866Options to be passed to the service.
2867@item PREFIX
2868A command to prepend to the actual command, for example to run a service with "valgrind" or "gdb".
2869
2870@item DEBUG
2871Run in debug mode (very verbose output).
2872@item AUTOSTART
2873ARM will listen to the UNIX domain socket and/or TCP port of the service and start the service on demand.
2874
2875@item FORCESTART
2876ARM will always start this service when the peer is started.
2877
2878@item ACCEPT_FROM
2879IPv4 addresses the service accepts connections from.
2880@item ACCEPT_FROM6
2881IPv6 addresses the service accepts connections from.
2882@end table
2883
2884
2885Options that impact the operation of ARM overall are in the "[arm]" section.
2886ARM is a normal service and has (except for AUTOSTART) all of the options that
2887other services do. In addition, ARM has the following options:
2888@table @asis
2889
2890@item GLOBAL_PREFIX
2891Command to be prepended to all services that are going to run.
2892
2893@item GLOBAL_POSTFIX
2894Global option that will be supplied to all the services that are going to run.
2895
2896@end table
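
As an illustration, a service section using some of the per-service options
described in this section might look like the following sketch. The section
name @code{myservice}, the binary name and all values are hypothetical; only the
option names are taken from the text:
@example
[myservice]
PORT = 2099
HOSTNAME = localhost
BINARY = gnunet-service-myservice
AUTOSTART = YES
ACCEPT_FROM = 127.0.0.1;
ACCEPT_FROM6 = ::1;
PREFIX = valgrind
@end example

With such a section, ARM would listen on the configured port and start the
binary on demand, running it under the given prefix command.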
2897
2898@c ***************************************************************************
2899@node Availability2
2900@subsection Availability
2901@c %**end of header
2902
2903As mentioned before, one of the features provided by ARM is starting services
2904on demand. Consider the example of a service, the "client", that wants to connect
2905to another service, the "server". The "client" asks ARM to run the "server".
2906ARM starts the "server". The "server" starts listening for incoming connections.
2907The "client" then establishes a connection with the "server", and the two start
2908to communicate. One problem with that scheme is that it is slow: the "client"
2909service wants to communicate with the "server" service at once and is not
2910willing to wait for it to be started and to be listening for incoming
2911connections before its request is served. One solution would be for ARM to
2912start all services as default services. That would solve the problem, yet it
2913is not quite practical, as some of the services started this way may never be
2914used, or may only be used after a relatively long time.
2915The approach followed by ARM to solve this problem is as follows:
2916@itemize @bullet
2917
2918
2919@item For each service having a PORT field in the configuration file and that
2920is not one of the default services (a service that accepts incoming
2921connections from clients), ARM creates listening sockets for all addresses
2922associated with that service.
2923
2924@item The "client" will immediately establish a connection with the "server".
2925
2926@item ARM --- pretending to be the "server" --- will listen on the respective
2927port and notice the incoming connection from the "client", but will not accept
2928it itself.
2929
2930@item Instead, once there is an incoming connection, ARM will start the "server",
2931passing on the listen sockets (now, the service is started and can do its
2932work).
2933
2934@item Other client services can now connect directly to the "server".
2935@end itemize
2936
2937@c ***************************************************************************
2938@node Reliability
2939@subsection Reliability
2940
2941One of the features provided by ARM is the automatic restart of crashed
2942services. ARM needs to know which of the running services died. The function
2943"gnunet-service-arm.c/maint_child_death()" is responsible for that. The
2944function is scheduled to run upon receiving a SIGCHLD signal. It then
2945iterates over ARM's list of running services and determines which services have
2946died (crashed). ARM restarts all crashed services. Now consider
2947the case of a service with a serious problem that causes it to crash each time
2948it is started by ARM. If ARM kept blindly restarting such a service, we would
2949get the pattern start-crash-restart-crash-restart-crash and so
2950forth, which is of course not practical. For that reason, ARM schedules the
2951service to be restarted after waiting for some delay that grows exponentially
2952with each crash/restart of that service. To clarify the idea, consider the
2953following example:
2954@itemize @bullet
2955
2956
2957@item Service S crashed.
2958
2959@item ARM receives the SIGCHLD and inspects its list of services to find the
2960dead one(s).
2961
2962@item ARM finds S dead and schedules it for restarting after the "backoff" time,
2963which is initially set to 1ms. ARM then doubles the backoff time corresponding
2964to S (now backoff(S) = 2ms).
2965
2966@item Because there is a severe problem with S, it crashes again.
2967
2968@item Again ARM receives the SIGCHLD and detects that it is S that has
2969crashed. ARM schedules it for restarting, but after its new backoff time (which
2970became 2ms), and doubles its backoff time (now backoff(S) = 4ms).
2971
2972@item And so on, until backoff(S) reaches a certain threshold
2973(EXPONENTIAL_BACKOFF_THRESHOLD is set to half an hour). After reaching it,
2974backoff(S) remains at half an hour, so ARM will not spend much time
2975trying to restart a problematic service.
2976@end itemize
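
The doubling scheme from the example can be sketched in a few lines of
self-contained C. The names @code{BACKOFF_THRESHOLD_MS}, @code{next_backoff_ms}
and @code{crashes_until_cap} are hypothetical and chosen for this illustration;
the sketch only mirrors the description above and is not ARM's actual code:
@example
#include <stdint.h>

/* Cap for the restart delay: half an hour, in milliseconds. */
#define BACKOFF_THRESHOLD_MS (30ULL * 60ULL * 1000ULL)

/* Return the delay to use after the next crash: double the current
   delay, but never exceed the threshold. */
static uint64_t
next_backoff_ms (uint64_t current_ms)
@{
  if (0 == current_ms)
    return 1; /* avoid getting stuck at zero */
  if (current_ms >= BACKOFF_THRESHOLD_MS / 2)
    return BACKOFF_THRESHOLD_MS;
  return current_ms * 2;
@}

/* Number of consecutive crashes until the delay stops growing. */
static unsigned int
crashes_until_cap (uint64_t initial_ms)
@{
  unsigned int n = 0;
  uint64_t delay = initial_ms;

  while (delay < BACKOFF_THRESHOLD_MS)
  @{
    delay = next_backoff_ms (delay);
    n++;
  @}
  return n;
@}
@end example

Starting from 1ms, the delay in this sketch reaches the half-hour cap after 21
consecutive crashes, so a permanently broken service quickly stops consuming
noticeable resources.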
2977
2978@c ***************************************************************************
2979@node GNUnet's TRANSPORT Subsystem
2980@section GNUnet's TRANSPORT Subsystem
2981@c %**end of header
2982
2983This chapter documents how the GNUnet transport subsystem works. The GNUnet
2984transport subsystem consists of three main components: the transport API (the
2985interface used by the rest of the system to access the transport service), the
2986transport service itself (most of the interesting functionality, such as choosing
2987transports, happens here) and the transport plugins. A transport plugin is a
2988concrete implementation for how two GNUnet peers communicate; many plugins
2989exist, for example for communication via TCP, UDP, HTTP, HTTPS and others.
2990Finally, the transport subsystem uses supporting code, especially the NAT/UPnP
2991library to help with tasks such as NAT traversal.
2992
2993Key tasks of the transport service include:
2994@itemize @bullet
2995
2996
2997@item Create our HELLO message, notify clients and neighbours if our HELLO
2998changes (using NAT library as necessary)
2999
3000@item Validate HELLOs from other peers (send PING), allow other peers to
3001validate our HELLO's addresses (send PONG)
3002
3003@item Upon request, establish connections to other peers (using address
3004selection from ATS subsystem) and maintain them (again using PINGs and PONGs)
3005as long as desired
3006
3007@item Accept incoming connections, give ATS service the opportunity to switch
3008communication channels
3009
3010@item Notify clients about peers that have connected to us or that have been
3011disconnected from us
3012
3013@item If a (stateful) connection goes down unexpectedly (without explicit
3014DISCONNECT), quickly attempt to recover (without notifying clients) but do
3015notify clients quickly if reconnecting fails
3016
3017@item Send (payload) messages arriving from clients to other peers via
3018transport plugins and receive messages from other peers, forwarding those to
3019clients
3020
3021@item Enforce inbound traffic limits (using flow-control if it is applicable);
3022outbound traffic limits are enforced by CORE, not by us (!)
3023
3024@item Enforce restrictions on P2P connection as specified by the blacklist
3025configuration and blacklisting clients
3026@end itemize
3027
3028
3029Note that the term "clients" in the list above really refers to the GNUnet-CORE
3030service, as CORE is typically the only client of the transport service.
3031
3032@menu
3033* Address validation protocol::
3034@end menu
3035
3036@node Address validation protocol
3037@subsection Address validation protocol
3038@c %**end of header
3039
3040This section documents how the GNUnet transport service validates connections
3041with other peers. It is a high-level description of the protocol necessary to
3042understand the details of the implementation. It should be noted that when we
3043talk about PING and PONG messages in this section, we refer to transport-level
3044PING and PONG messages, which are different from core-level PING and PONG
3045messages (both in implementation and function).
3046
3047The goal of transport-level address validation is to minimize the chances of a
3048successful man-in-the-middle attack against GNUnet peers on the transport
3049level. Such an attack would not allow the adversary to decrypt the P2P
3050transmissions, but a successful attacker could at least measure traffic volumes
3051and latencies (raising the adversary's capabilities to those of a global passive
3052adversary in the worst case). The scenario we are concerned about is an
3053attacker, Mallory, giving a HELLO to Alice that claims to be for Bob, but
3054contains Mallory's IP address instead of Bob's (for some transport). Mallory
3055would then forward the traffic to Bob (by initiating a connection to Bob and
3056claiming to be Alice). As a further complication, the scheme has to work even
3057if, say, Alice is behind a NAT without traversal support and hence has no address
3058of her own (and thus Alice must always initiate the connection to Bob).
3059
3060An additional constraint is that HELLO messages do not contain a cryptographic
3061signature since other peers must be able to edit (i.e. remove) addresses from
3062the HELLO at any time (this was not true in GNUnet 0.8.x). A basic
3063@strong{assumption} is that each peer knows the set of possible network
3064addresses that it @strong{might} be reachable under (so for example, the
3065external IP address of the NAT plus the LAN address(es) with the respective
3066ports).
3067
3068The solution is the following. If Alice wants to validate that a given address
3069for Bob is valid (i.e. is actually established @strong{directly} with the
3070intended target), she sends a PING message over that connection to Bob. Note
3071that in this case, Alice initiated the connection so only she knows which
3072address was used for sure (Alice may be behind NAT, so whatever address Bob
3073sees may not be an address Alice knows she has). Bob checks that the address
3074given in the PING is actually one of his addresses (does not belong to
3075Mallory), and if it is, sends back a PONG (with a signature that says that Bob
3076owns/uses the address from the PING). Alice checks the signature and is happy
3077if it is valid and the address in the PONG is the address she used. This is
3078similar to the 0.8.x protocol where the HELLO contained a signature from Bob
3079for each address used by Bob. Here, the purpose code for the signature is
3080@code{GNUNET_SIGNATURE_PURPOSE_TRANSPORT_PONG_OWN}. After this, Alice will
3081remember Bob's address and consider the address valid for a while (12h in the
3082current implementation). Note that after this exchange, Alice only considers
3083Bob's address to be valid, the connection itself is not considered
3084'established'. In particular, Alice may have many addresses for Bob that she
3085considers valid.
3086
3087The PONG message is protected with a nonce/challenge against replay attacks
3088and uses an expiration time for the signature (but those are almost
3089implementation details).
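
The checks performed by both sides can be summarized in a self-contained
sketch. All types and functions below are hypothetical simplifications made up
for this illustration: the real wire formats differ, and the
@code{signature_ok} flag merely stands in for verifying Bob's signature:
@example
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified messages. */
struct Ping
@{
  uint32_t challenge;   /* nonce chosen by Alice */
  const char *address;  /* address Alice used to reach Bob */
@};

struct Pong
@{
  uint32_t challenge;   /* copied from the PING */
  const char *address;  /* address Bob confirms as his own */
  uint64_t expiration;  /* end of the signature's lifetime */
  bool signature_ok;    /* stand-in for the signature check */
@};

/* Bob: answer only if the address in the PING is one of his own. */
static bool
bob_owns_address (const char *const *own, size_t own_len,
                  const struct Ping *ping)
@{
  for (size_t i = 0; i < own_len; i++)
    if (0 == strcmp (own[i], ping->address))
      return true;
  return false;
@}

/* Alice: accept the PONG only if signature, challenge, address and
   expiration all check out. */
static bool
alice_accepts (const struct Ping *sent, const struct Pong *pong,
               uint64_t now)
@{
  return pong->signature_ok
         && (pong->challenge == sent->challenge)
         && (0 == strcmp (pong->address, sent->address))
         && (pong->expiration > now);
@}
@end example

In the attack scenario, the HELLO forged by Mallory leads Alice to send a PING
that names Mallory's address; Bob does not own that address and therefore never
signs a PONG for it.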
3090
3091@node NAT library
3092@section NAT library
3093@c %**end of header
3094
3095The goal of the GNUnet NAT library is to provide a general-purpose API for NAT
3096traversal @strong{without} third-party support. So protocols that involve
3097contacting a third peer to help establish a connection between two peers are
3098outside of the scope of this API. That does not mean that GNUnet doesn't
3099support involving a third peer (we can do this with the distance-vector
3100transport or using application-level protocols), it just means that the NAT API
3101is not concerned with this possibility. The API is written so that it will work
3102for IPv6-NAT in the future as well as current IPv4-NAT. Furthermore, the NAT
3103API is always used, even for peers that are not behind NAT --- in that case,
3104the mapping provided is simply the identity.
3105
3106NAT traversal is initiated by calling @code{GNUNET_NAT_register}. Given a set
3107of addresses that the peer has locally bound to (TCP or UDP), the NAT library
3108will return (via callback) a (possibly longer) list of addresses the peer
3109@strong{might} be reachable under. Internally, depending on the configuration,
3110the NAT library will try to punch a hole (using UPnP) or just "know" that the
3111NAT was manually punched and generate the respective external IP address (the
3112one that should be globally visible) based on the given information.
3113
3114The NAT library also supports ICMP-based NAT traversal. Here, the other peer
3115can request connection-reversal by this peer (in this special case, the peer is
3116even allowed to configure a port number of zero). If the NAT library detects a
3117connection-reversal request, it returns the respective target address to the
3118client as well. It should be noted that connection-reversal is currently only
3119intended for TCP, so other plugins @strong{must} pass @code{NULL} for the
3120reversal callback. Naturally, the NAT library also supports requesting
3121connection reversal from a remote peer (@code{GNUNET_NAT_run_client}).
3122
3123Once initialized, the NAT handle can be used to test if a given address is
3124possibly a valid address for this peer (@code{GNUNET_NAT_test_address}). This
3125is used for validating our addresses when generating PONGs.
3126
3127Finally, the NAT library contains an API to test if our NAT configuration is
3128correct. Using @code{GNUNET_NAT_test_start} @strong{before} binding to the
3129respective port, the NAT library can be used to test if the configuration
3130works. The test function acts as a local client, initializes the NAT traversal
3131and then contacts a @code{gnunet-nat-server} (running by default on
3132@code{gnunet.org}) and asks for a connection to be established. This way, it is
3133easy to test if the current NAT configuration is valid.
3134
3135@node Distance-Vector plugin
3136@section Distance-Vector plugin
3137@c %**end of header
3138
3139The Distance Vector (DV) transport is a transport mechanism that allows peers
3140to act as relays for each other, thereby connecting peers that would otherwise
3141be unable to connect. This gives a larger connection set to applications that
3142may work better with more peers to choose from (for example, File Sharing
3143and/or DHT).
3144
3145The Distance Vector transport essentially has two functions. The first is
3146"gossiping" connection information about more distant peers to directly
3147connected peers. The second is taking messages intended for non-directly
3148connected peers and encapsulating them in a DV wrapper that contains the
3149required information for routing the message through forwarding peers. Via
3150gossiping, optimal routes through the known DV neighborhood are discovered and
3151utilized and the message encapsulation provides some benefits in addition to
3152simply getting the message from the correct source to the proper destination.
3153
3154The gossiping function of DV provides an up to date routing table of peers that
3155are available up to some number of hops. We call this a fisheye view of the
3156network (like a fish, nearby objects are known while more distant ones are
3157unknown). Gossip messages are sent only to directly connected peers, but they
3158are sent about other known peers within the "fisheye distance". Whenever two
3159peers connect, they immediately gossip to each other about their appropriate
3160other neighbors. They also gossip about the newly connected peer to previously
3161connected neighbors. In order to keep the routing tables up to date, disconnect
3162notifications are propagated as gossip as well (because disconnects may not be
3163sent/received, timeouts are also used to remove stagnant routing table entries).
3164
3165Routing of messages via DV is straightforward. When the DV transport is
3166notified of a message destined for a non-direct neighbor, the appropriate
3167forwarding peer is selected, and the base message is encapsulated in a DV
3168message which contains information about the initial peer and the intended
3169recipient. At each forwarding hop, the initial peer is validated (the
3170forwarding peer ensures that it has the initial peer in its neighborhood,
3171otherwise the message is dropped). Next the base message is re-encapsulated in
3172a new DV message for the next hop in the forwarding chain (or delivered to the
3173current peer, if it has arrived at the destination).
3174
3175Assume a three peer network with peers Alice, Bob and Carol. Assume that Alice
3176<-> Bob and Bob <-> Carol are direct (e.g. over TCP or UDP transports)
3177connections, but that Alice cannot directly connect to Carol. This may be the
3178case due to NAT or firewall restrictions, or perhaps based on one of the peers
3179respective configurations. If the Distance Vector transport is enabled on all
3180three peers, it will automatically discover (from the gossip protocol) that
3181Alice and Carol can connect via Bob and provide a "virtual" Alice <-> Carol
connection. Routing between Alice and Carol happens as follows: Alice creates a
3183message destined for Carol and notifies the DV transport about it. The DV
3184transport at Alice looks up Carol in the routing table and finds that the
3185message must be sent through Bob for Carol. The message is encapsulated setting
3186Alice as the initiator and Carol as the destination and sent to Bob. Bob
receives the message, verifies that both Alice and Carol are known to Bob, and
3188re-wraps the message in a new DV message for Carol. The DV transport at Carol
3189receives this message, unwraps the original message, and delivers it to Carol
3190as though it came directly from Alice.
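
The Alice, Bob and Carol walk-through can be modelled in a few lines. The
sketch below is written in Python and is only an illustration under
assumptions: @code{DVMessage}, @code{Peer} and their fields are hypothetical
names, and real DV messages carry more than these three fields.

```python
from dataclasses import dataclass


@dataclass
class DVMessage:
    initiator: str      # peer that created the base message
    destination: str    # intended final recipient
    payload: bytes      # the encapsulated base message


class Peer:
    def __init__(self, name):
        self.name = name
        self.links = {}    # directly connected peers: name -> Peer
        self.routes = {}   # known peers: destination name -> next hop name
        self.inbox = []    # (initiator, payload) pairs delivered locally

    def connect(self, other):
        """Create a direct (e.g. TCP or UDP) link in both directions."""
        self.links[other.name] = other
        other.links[self.name] = self
        self.routes[other.name] = other.name
        other.routes[self.name] = self.name

    def send(self, destination, payload):
        return self._forward(DVMessage(self.name, destination, payload))

    def _forward(self, msg):
        next_hop = self.routes.get(msg.destination)
        if next_hop is None:
            return False                       # no route: drop
        # Re-encapsulate in a fresh DV message for the next hop.
        fresh = DVMessage(msg.initiator, msg.destination, msg.payload)
        return self.links[next_hop].receive(fresh)

    def receive(self, msg):
        if msg.destination == self.name:
            # Arrived: unwrap and deliver as if it came from the initiator.
            self.inbox.append((msg.initiator, msg.payload))
            return True
        if msg.initiator not in self.routes:
            return False                       # unknown initiator: drop
        return self._forward(msg)
```

After connecting Alice to Bob and Bob to Carol, setting
@code{alice.routes["carol"] = "bob"} stands in for what gossip would have
taught Alice; @code{alice.send("carol", ...)} then ends up in Carol's inbox
with Alice recorded as the sender.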
3191
3192@node SMTP plugin
3193@section SMTP plugin
3194@c %**end of header
3195
3196This page describes the new SMTP transport plugin for GNUnet as it exists in
3197the 0.7.x and 0.8.x branch. SMTP support is currently not available in GNUnet
31980.9.x. This page also describes the transport layer abstraction (as it existed
3199in 0.7.x and 0.8.x) in more detail and gives some benchmarking results. The
performance results presented are quite old and may be outdated at this point.
3201@itemize @bullet
3202@item Why use SMTP for a peer-to-peer transport?
@item How does it work?
3204@item How do I configure my peer?
3205@item How do I test if it works?
3206@item How fast is it?
3207@item Is there any additional documentation?
3208@end itemize
3209
3210
3211@menu
3212* Why use SMTP for a peer-to-peer transport?::
3213* How does it work?::
3214* How do I configure my peer?::
3215* How do I test if it works?::
3216* How fast is it?::
3217@end menu
3218
3219@node Why use SMTP for a peer-to-peer transport?
3220@subsection Why use SMTP for a peer-to-peer transport?
3221@c %**end of header
3222
3223There are many reasons why one would not want to use SMTP:
3224@itemize @bullet
@item SMTP uses more bandwidth than TCP, UDP or HTTP.
3226@item SMTP has a much higher latency.
3227@item SMTP requires significantly more computation (encoding and decoding time)
3228for the peers.
3229@item SMTP is significantly more complicated to configure.
3230@item SMTP may be abused by tricking GNUnet into sending mail to@
3231non-participating third parties.
3232@end itemize
3233
3234So why would anybody want to use SMTP?
3235@itemize @bullet
3236@item SMTP can be used to contact peers behind NAT boxes (in virtual private
3237networks).
3238@item SMTP can be used to circumvent policies that limit or prohibit
peer-to-peer traffic by masquerading as "legitimate" traffic.
3240@item SMTP uses E-mail addresses which are independent of a specific IP, which
3241can be useful to address peers that use dynamic IP addresses.
3242@item SMTP can be used to initiate a connection (e.g. initial address exchange)
3243and peers can then negotiate the use of a more efficient protocol (e.g. TCP)
3244for the actual communication.
3245@end itemize
3246
3247In summary, SMTP can for example be used to send a message to a peer behind a
3248NAT box that has a dynamic IP to tell the peer to establish a TCP connection
3249to a peer outside of the private network. Even an extraordinary overhead for
3250this first message would be irrelevant in this type of situation.
3251
3252@node How does it work?
3253@subsection How does it work?
3254@c %**end of header
3255
3256When a GNUnet peer needs to send a message to another GNUnet peer that has
3257advertised (only) an SMTP transport address, GNUnet base64-encodes the message
3258and sends it in an E-mail to the advertised address. The advertisement
3259contains a filter which is placed in the E-mail header, such that the
receiving host can filter the tagged E-mails and forward them to the GNUnet peer
3261process. The filter can be specified individually by each peer and be changed
3262over time. This makes it impossible to censor GNUnet E-mail messages by
3263searching for a generic filter.
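
The encoding step can be illustrated with Python's standard @code{email}
package. This is a sketch under assumptions, not the plugin's actual wire
format: the function names are hypothetical, and the plugin itself is written
in C.

```python
import base64
from email.message import EmailMessage


def wrap_for_smtp(payload, to_addr, from_addr, filter_line):
    """Build an E-mail carrying `payload`, tagged with the peer's filter.

    `filter_line` is the advertised filter, e.g. "X-mailer: GNUnet".
    """
    name, _, value = filter_line.partition(":")
    mail = EmailMessage()
    mail["From"] = from_addr
    mail["To"] = to_addr
    mail[name.strip()] = value.strip()        # the filter becomes a header
    # base64 turns the binary message into mail-safe ASCII lines.
    mail.set_content(base64.encodebytes(payload).decode("ascii"))
    return mail


def unwrap_from_smtp(mail, filter_line):
    """Return the decoded payload, or None if the filter does not match."""
    name, _, value = filter_line.partition(":")
    if mail.get(name.strip()) != value.strip():
        return None                           # not tagged as GNUnet traffic
    return base64.b64decode(mail.get_content())
```

Because each peer picks its own header name and value, the receiving side
only has to know its own filter line.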
3264
3265@node How do I configure my peer?
3266@subsection How do I configure my peer?
3267@c %**end of header
3268
3269First, you need to configure @code{procmail} to filter your inbound E-mail for
3270GNUnet traffic. The GNUnet messages must be delivered into a pipe, for example
3271@code{/tmp/gnunet.smtp}. You also need to define a filter that is used by
3272procmail to detect GNUnet messages. You are free to choose whichever filter
3273you like, but you should make sure that it does not occur in your other
3274E-mail. In our example, we will use @code{X-mailer: GNUnet}. The
3275@code{~/.procmailrc} configuration file then looks like this:
3276@example
3277:0:
3278* ^X-mailer: GNUnet
3279/tmp/gnunet.smtp
3280# where do you want your other e-mail delivered to (default: /var/spool/mail/)
:0:
/var/spool/mail/
3282@end example
3283
3284After adding this file, first make sure that your regular E-mail still works
3285(e.g. by sending an E-mail to yourself). Then edit the GNUnet configuration.
3286In the section @code{SMTP} you need to specify your E-mail address under
3287@code{EMAIL}, your mail server (for outgoing mail) under @code{SERVER}, the
3288filter (X-mailer: GNUnet in the example) under @code{FILTER} and the name of
3289the pipe under @code{PIPE}.@ The completed section could then look like this:
3290@example
EMAIL = me@@mail.gnu.org
MTU = 65000
SERVER = mail.gnu.org:25
FILTER = "X-mailer: GNUnet"
PIPE = /tmp/gnunet.smtp
3293@end example
3294
3295Finally, you need to add @code{smtp} to the list of @code{TRANSPORTS} in the
3296@code{GNUNETD} section. GNUnet peers will use the E-mail address that you
3297specified to contact your peer until the advertisement times out. Thus, if you
3298are not sure if everything works properly or if you are not planning to be
3299online for a long time, you may want to configure this timeout to be short,
3300e.g. just one hour. For this, set @code{HELLOEXPIRES} to @code{1} in the
3301@code{GNUNETD} section.
3302
This should be it, but you will probably want to test it first.
3304@node How do I test if it works?
3305@subsection How do I test if it works?
3306@c %**end of header
3307
3308Any transport can be subjected to some rudimentary tests using the
3309@code{gnunet-transport-check} tool. The tool sends a message to the local node
3310via the transport and checks that a valid message is received. While this test
does not involve other peers and cannot check if firewalls or other network
obstacles prohibit proper operation, this is a great testcase for the SMTP
transport since it tests nearly all of the functionality.
3314
3315@code{gnunet-transport-check} should only be used without running
3316@code{gnunetd} at the same time. By default, @code{gnunet-transport-check}
3317tests all transports that are specified in the configuration file. But you can
3318specifically test SMTP by giving the option @code{--transport=smtp}.
3319
3320Note that this test always checks if a transport can receive and send. While
3321you can configure most transports to only receive or only send messages, this
3322test will only work if you have configured the transport to send and receive
3323messages.
3324
3325@node How fast is it?
3326@subsection How fast is it?
3327@c %**end of header
3328
3329We have measured the performance of the UDP, TCP and SMTP transport layer
directly and when used from an application using the GNUnet core. Measuring
just the transport layer gives a better view of the actual overhead of the
3332protocol, whereas evaluating the transport from the application puts the
3333overhead into perspective from a practical point of view.
3334
3335The loopback measurements of the SMTP transport were performed on three
3336different machines spanning a range of modern SMTP configurations. We used a
3337PIII-800 running RedHat 7.3 with the Purdue Computer Science configuration
which includes filters for spam. We also used a Xeon 2 GHz with a vanilla
3339RedHat 8.0 sendmail configuration. Furthermore, we used qmail on a PIII-1000
3340running Sorcerer GNU Linux (SGL). The numbers for UDP and TCP are provided
3341using the SGL configuration. The qmail benchmark uses qmail's internal
filtering whereas the sendmail benchmarks rely on procmail to filter and
3343deliver the mail. We used the transport layer to send a message of b bytes
3344(excluding transport protocol headers) directly to the local machine. This
3345way, network latency and packet loss on the wire have no impact on the
3346timings. n messages were sent sequentially over the transport layer, sending
3347message i+1 after the i-th message was received. All messages were sent over
3348the same connection and the time to establish the connection was not taken
into account since this overhead is minuscule in practice, as long as a
connection is used for a significant number of messages.
3351
3352@multitable @columnfractions .20 .15 .15 .15 .15 .15
3353@headitem Transport @tab UDP @tab TCP @tab SMTP (Purdue sendmail) @tab SMTP (RH 8.0) @tab SMTP (SGL qmail)
3354@item 11 bytes @tab 31 ms @tab 55 ms @tab 781 s @tab 77 s @tab 24 s
3355@item 407 bytes @tab 37 ms @tab 62 ms @tab 789 s @tab 78 s @tab 25 s
3356@item 1,221 bytes @tab 46 ms @tab 73 ms @tab 804 s @tab 78 s @tab 25 s
3357@end multitable
3358
3359The benchmarks show that UDP and TCP are, as expected, both significantly
3360faster compared with any of the SMTP services. Among the SMTP implementations,
3361there can be significant differences depending on the SMTP configuration.
3362Filtering with an external tool like procmail that needs to re-parse its
3363configuration for each mail can be very expensive. Applying spam filters can
3364also significantly impact the performance of the underlying SMTP
implementation. The microbenchmark shows that SMTP can be a viable solution
for initiating peer-to-peer sessions: a couple of seconds to connect to a peer
are probably not even going to be noticed by users.

The next benchmark measures the possible throughput for a transport.
Throughput can be measured by sending multiple messages in parallel and
measuring packet loss. Note that not only UDP but also the TCP transport can
actually lose messages since the TCP implementation drops messages if the
@code{write} to the socket would block. While the SMTP protocol never drops
messages itself, it is often so slow that only a fraction of the messages can
be sent and received in the given time bounds. For this benchmark we report
the message loss after allowing t time for sending m messages. If messages
were not sent (or received) after an overall timeout of t, they were
considered lost. The benchmark was performed using two Xeon 2 GHz machines
running RedHat 8.0 with sendmail. The machines were connected with a direct
100 Mbit Ethernet connection.

Figures udp1200, tcp1200 and smtp-MTUs show that the throughput for messages
of size 1,200 octets is 2,343 kbps, 3,310 kbps and 6 kbps for UDP, TCP and
SMTP respectively. The high per-message overhead of SMTP can be mitigated by
increasing the MTU; for example, an MTU of 12,000 octets improves the
throughput to 13 kbps, as figure smtp-MTUs shows. Our research paper has some
more details on the benchmarking results.
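
To put the throughput figures into per-message terms, the following Python
sketch converts them into message rates. It assumes that "kbps" means 1,000
bits per second and that the quoted throughput counts payload octets only;
neither assumption is stated in the benchmark description.

```python
def messages_per_second(throughput_kbps, message_octets):
    """Messages delivered per second at the given payload throughput."""
    return throughput_kbps * 1000 / (message_octets * 8)


def seconds_per_message(throughput_kbps, message_octets):
    """Average time spent per message at the given payload throughput."""
    return 1.0 / messages_per_second(throughput_kbps, message_octets)
```

Under these assumptions, 6 kbps at 1,200 octets corresponds to 0.625 messages
per second (1.6 s per message), while 13 kbps at 12,000 octets corresponds to
roughly 7.4 s per message: the larger MTU spreads the per-message cost over
ten times as much payload.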
3385
3386@node Bluetooth plugin
3387@section Bluetooth plugin
3388@c %**end of header
3389
3390This page describes the new Bluetooth transport plugin for GNUnet. The plugin
3391is still in the testing stage so don't expect it to work perfectly. If you
3392have any questions or problems just post them here or ask on the IRC channel.
3393@itemize @bullet
3394@item What do I need to use the Bluetooth plugin transport?
@item How does it work?
3396@item What possible errors should I be aware of?
3397@item How do I configure my peer?
3398@item How can I test it?
3399@end itemize
3400
3401
3402
3403@menu
3404* What do I need to use the Bluetooth plugin transport?::
3405* How does it work2?::
3406* What possible errors should I be aware of?::
3407* How do I configure my peer2?::
3408* How can I test it?::
3409* The implementation of the Bluetooth transport plugin::
3410@end menu
3411
3412@node What do I need to use the Bluetooth plugin transport?
3413@subsection What do I need to use the Bluetooth plugin transport?
3414@c %**end of header
3415
3416If you are a Linux user and you want to use the Bluetooth transport plugin you
3417should install the BlueZ development libraries (if they aren't already
3418installed). For instructions about how to install the libraries you should
3419check out the BlueZ site (@uref{http://www.bluez.org/, http://www.bluez.org}).
If you don't know if you have the necessary libraries, don't worry, just run
3421the GNUnet configure script and you will be able to see a notification at the
3422end which will warn you if you don't have the necessary libraries.
3423
3424If you are a Windows user you should have installed the
3425@emph{MinGW}/@emph{MSys2} with the latest updates (especially the
3426@emph{ws2bth} header). If this is your first build of GNUnet on Windows you
should check out the SBuild repository. It will semi-automatically assemble a
@emph{MinGW}/@emph{MSys2} installation with a lot of extra packages which are
needed for the GNUnet build. So this will ease your work!

Finally, you just have to be sure that you have the correct drivers for your
Bluetooth device installed and that your device is on and in a discoverable
mode. The Windows Bluetooth Stack supports only the RFCOMM protocol so we
cannot turn on your device programmatically!
3434
3435@node How does it work2?
@subsection How does it work?
3437@c %**end of header
3438
3439The Bluetooth transport plugin uses virtually the same code as the WLAN plugin
3440and only the helper binary is different. The helper takes a single argument,
3441which represents the interface name and is specified in the configuration
3442file. Here are the basic steps that are followed by the helper binary used on
3443Linux:
3444
3445@itemize @bullet
@item it verifies that the name corresponds to a Bluetooth interface name
@item it verifies that the interface is up (if it is not, it tries to bring it
up)
@item it tries to enable the page and inquiry scan in order to make the device
discoverable and to accept incoming connection requests
@emph{The above operations require root access so you should start the
transport plugin with root privileges.}
@item it finds an available port number and registers an SDP service which
other devices use to find out on which port the server is listening, then
switches the socket to listening mode
@item it sends a HELLO message with its address
@item finally it forwards traffic from the reading sockets to STDOUT and
from STDIN to the writing socket
3458@end itemize
3459
Once in a while the device will make an inquiry scan to discover the nearby
devices and will randomly send them HELLO messages for peer discovery.
3462
3463@node What possible errors should I be aware of?
3464@subsection What possible errors should I be aware of?
3465@c %**end of header
3466
@emph{This section is intended for Linux users.}

There are many ways in which things could go wrong, so I will present some
tools that you can use for debugging, along with some scenarios.
3471@itemize @bullet
3472
3473@item @code{bluetoothd -n -d} : use this command to enable logging in the
3474foreground and to print the logging messages
3475
@item @code{hciconfig}: can be used to configure the Bluetooth devices. If you
run it without any arguments it will print information about the state of the
interfaces. So if you receive an error that the device couldn't be brought up
you should try to bring it up manually and see if it works (use @code{hciconfig
-a hciX up}). If you can't and the Bluetooth address has the form
00:00:00:00:00:00 it means that there is something wrong with the D-Bus daemon
or with the Bluetooth daemon. Use the @code{bluetoothd} tool to see the logs.
3483
@item @code{sdptool} can be used to control and interrogate SDP servers. If you
encounter problems regarding the SDP server (like the SDP server is down) you
should check if the D-Bus daemon is running correctly and if the Bluetooth
daemon started correctly (use the @code{bluetoothd} tool). Also, sometimes
the SDP service could work but somehow the device couldn't register its
service. Use @code{sdptool browse [dev-address]} to see if the service is
registered. There should be a service with the name of the interface and GNUnet
as provider.
3492
3493@item @code{hcitool} : another useful tool which can be used to configure the
3494device and to send some particular commands to it.
3495
3496@item @code{hcidump} : could be used for low level debugging
3497@end itemize
3498
3499@node How do I configure my peer2?
@subsection How do I configure my peer?
3501@c %**end of header
3502
3503On Linux, you just have to be sure that the interface name corresponds to the
3504one that you want to use. Use the @code{hciconfig} tool to check that. By
3505default it is set to hci0 but you can change it.
3506
3507A basic configuration looks like this:
3508@example
3509[transport-bluetooth]
3510# Name of the interface (typically hciX)
3511INTERFACE = hci0
3512# Real hardware, no testing
TESTMODE = 0
TESTING_IGNORE_KEYS = ACCEPT_FROM;
3514@end example
3515
3516
3517In order to use the Bluetooth transport plugin when the transport service is
3518started, you must add the plugin name to the default transport service plugins
3519list. For example:
3520@example
[transport]
...
PLUGINS = dns bluetooth
...
3522@end example
3523
If you want to use only the Bluetooth plugin, set @emph{PLUGINS = bluetooth}.
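
A quick way to double-check such a configuration is to parse it and look at
the two settings that matter here. The Python sketch below uses the standard
@code{configparser} module purely as a stand-in; GNUnet has its own
configuration parser, and @code{bluetooth_enabled} is a hypothetical helper
name.

```python
import configparser


def bluetooth_enabled(config_text):
    """Return (plugin listed?, interface name) for an INI-style config."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # PLUGINS is a whitespace-separated list of transport plugin names.
    plugins = parser.get("transport", "PLUGINS", fallback="").split()
    interface = parser.get("transport-bluetooth", "INTERFACE", fallback=None)
    return "bluetooth" in plugins, interface
```

For the configuration shown above (without the elided lines), this returns
@code{(True, "hci0")}.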
3525
3526On Windows, you cannot specify which device to use. The only thing that you
should do is to add @emph{bluetooth} to the plugins list of the transport
3528service.
3529
3530@node How can I test it?
3531@subsection How can I test it?
3532@c %**end of header
3533
If you have two Bluetooth devices on the same Linux machine, you must:
@itemize @bullet

@item create two different configuration files (one which will use the first
interface (@emph{hci0}) and the other which will use the second interface
(@emph{hci1})). Let's name them @emph{peer1.conf} and @emph{peer2.conf}.

@item run @emph{gnunet-peerinfo -c peerX.conf -s} in order to generate the
peers' private keys. @strong{X} must be replaced with 1 or 2.
3544
3545@item run @emph{gnunet-arm -c peerX.conf -s -i=transport} in order to start the
3546transport service. (Make sure that you have "bluetooth" on the transport
3547plugins list if the Bluetooth transport service doesn't start.)
3548
3549@item run @emph{gnunet-peerinfo -c peer1.conf -s} to get the first peer's ID.
3550If you already know your peer ID (you saved it from the first command), this
3551can be skipped.
3552
3553@item run @emph{gnunet-transport -c peer2.conf -p=PEER1_ID -s} to start sending
3554data for benchmarking to the other peer.
3555@end itemize
3556
3557
3558This scenario will try to connect the second peer to the first one and then
3559start sending data for benchmarking.
3560
On Windows you cannot test the plugin functionality using two Bluetooth devices
from the same machine because, after you install the drivers, conflicts will
occur between the Bluetooth stacks. (At least that is what happened on my
machine: I wasn't able to use the Bluesoleil stack and the WIDCOMM one at
the same time.)
3566
3567If you have two different machines and your configuration files are good you
can use the same scenario presented at the beginning of this section.
3569
3570Another way to test the plugin functionality is to create your own application
3571which will use the GNUnet framework with the Bluetooth transport service.
3572
3573@node The implementation of the Bluetooth transport plugin
3574@subsection The implementation of the Bluetooth transport plugin
3575@c %**end of header
3576
3577This page describes the implementation of the Bluetooth transport plugin.
3578
3579First I want to remind you that the Bluetooth transport plugin uses virtually
3580the same code as the WLAN plugin and only the helper binary is different. Also
3581the scope of the helper binary from the Bluetooth transport plugin is the same
as the one used for the wlan transport plugin: it accesses the interface and
3583then it forwards traffic in both directions between the Bluetooth interface
3584and stdin/stdout of the process involved.
3585
The Bluetooth plugin transport can be used on both Linux and Windows
platforms.
3588
3589@itemize @bullet
3590@item Linux functionality
3591@item Windows functionality
3592@item Pending Features
3593@end itemize
3594
3595
3596
3597@menu
3598* Linux functionality::
3599* THE INITIALIZATION::
3600* THE LOOP::
3601* Details about the broadcast implementation::
3602* Windows functionality::
3603* Pending features::
3604@end menu
3605
3606@node Linux functionality
3607@subsubsection Linux functionality
3608@c %**end of header
3609
3610In order to implement the plugin functionality on Linux I used the BlueZ
3611stack. For the communication with the other devices I used the RFCOMM
3612protocol. Also I used the HCI protocol to gain some control over the device.
3613The helper binary takes a single argument (the name of the Bluetooth
3614interface) and is separated in two stages:
3615
3616@c %** 'THE INITIALIZATION' should be in bigger letters or stand out, not
3617@c %** starting a new section?
3618@node THE INITIALIZATION
3619@subsubsection THE INITIALIZATION
3620
3621@itemize @bullet
@item first, it checks if we have root privileges (@emph{Remember that we need
to have root privileges in order to be able to bring the interface up if it is
down or to change its state.}).
3625
3626@item second, it verifies if the interface with the given name exists.
3627
3628@strong{If the interface with that name exists and it is a Bluetooth
3629interface:}
3630
@item it creates an RFCOMM socket which will be used for listening and calls
the @emph{open_device} method
3633
In the @emph{open_device} method, it:
3635@itemize @bullet
@item creates an HCI socket used to send control events to the device
3637@item searches for the device ID using the interface name
3638@item saves the device MAC address
3639@item checks if the interface is down and tries to bring it UP
3640@item checks if the interface is in discoverable mode and tries to make it
3641discoverable
3642@item closes the HCI socket and binds the RFCOMM one
@item switches the RFCOMM socket to listening mode
@item registers the SDP service (the service will be used by the other devices
to get the port on which this device is listening)
3646@end itemize
3647
@item drops the root privileges
3649
3650@strong{If the interface is not a Bluetooth interface the helper exits with a
3651suitable error}
3652@end itemize
3653
3654@c %** Same as for @node entry above
3655@node THE LOOP
3656@subsubsection THE LOOP
3657
3658The helper binary uses a list where it saves all the connected neighbour
3659devices (@emph{neighbours.devices}) and two buffers (@emph{write_pout} and
@emph{write_std}). The first message which is sent is a control message with
the device's MAC address in order to announce the peer's presence to the
neighbours. Here is a short description of what happens in the main loop:
3663
3664@itemize @bullet
@item Every time it receives something from STDIN it processes the
data and saves the message in the first buffer (@emph{write_pout}). When it has
something in the buffer, it gets the destination address from the buffer,
searches for the destination address in the list (if there is no connection
with that device, it creates a new one and saves it to the list) and sends the
message.
@item Every time it receives something on the listening socket it accepts
the connection and saves the socket in a list with the reading sockets.
@item Every time it receives something from a reading socket it parses the
message, verifies the CRC and saves it in the @emph{write_std} buffer in order
to be sent later to STDOUT.
3676@end itemize
3677
So in the main loop we use the select function to wait until one of the file
descriptors saved in one of the two file descriptor sets is ready to use.
3680The first set (@emph{rfds}) represents the reading set and it could contain the
3681list with the reading sockets, the STDIN file descriptor or the listening
3682socket. The second set (@emph{wfds}) is the writing set and it could contain
3683the sending socket or the STDOUT file descriptor. After the select function
returns, we check which file descriptor is ready to use and we do what is
appropriate for that kind of event. @emph{For example:} if it is the
3686listening socket then we accept a new connection and save the socket in the
3687reading list; if it is the STDOUT file descriptor, then we write to STDOUT the
3688message from the @emph{write_std} buffer.
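
One iteration of such a loop can be sketched in Python with the standard
@code{select} module. This is an illustration only: the helper itself is
written in C, the function name @code{pump_once} and its parameters are
assumptions, plain sockets stand in for STDIN, STDOUT and the RFCOMM sockets,
and message parsing and CRC checks are left out.

```python
import select
import socket


def pump_once(stdin_fd, stdout_fd, listener, readers,
              write_pout, write_std, timeout=0.1):
    """Run one select() round of the forwarding loop.

    readers    -- list of accepted sockets we read from (may grow/shrink)
    write_pout -- buffer of messages from STDIN waiting to go to peers
    write_std  -- buffer of messages from peers waiting to go to STDOUT
    """
    rfds = [stdin_fd, *readers]
    if listener is not None:
        rfds.append(listener)
    # Only ask for writability when there is something to write.
    wfds = [stdout_fd] if write_std else []
    readable, writable, _ = select.select(rfds, wfds, [], timeout)

    for sock in readable:
        if sock is listener:
            conn, _addr = listener.accept()   # new incoming connection
            readers.append(conn)
        elif sock is stdin_fd:
            data = sock.recv(4096)
            if data:
                write_pout.append(data)       # to be sent to a peer later
        else:
            data = sock.recv(4096)
            if data:
                write_std.append(data)        # to be written to STDOUT
            else:
                readers.remove(sock)          # peer closed the connection
                sock.close()

    for sock in writable:
        if write_std:
            sock.sendall(write_std.pop(0))
```

A message received from a peer is therefore written to STDOUT one round after
it was read, which mirrors the buffering through @emph{write_std} described
above.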
3689
To find out on which port a device is listening we connect to the local SDP
server and search for the registered service for that device.
3692
3693@emph{You should be aware of the fact that if the device fails to connect to
3694another one when trying to send a message it will attempt one more time. If it
3695fails again, then it skips the message.}
@emph{Also you should know that the
Bluetooth transport plugin has support for @strong{broadcast messages}.}
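
The "try once more, then skip" behaviour for failed connections can be
written down compactly. The following Python sketch is hypothetical (the
helper is written in C, and @code{send_with_retry} and its @code{connect}
callback are names invented for this example).

```python
def send_with_retry(connect, message, attempts=2):
    """Try to connect and send; give up after `attempts` failed connects.

    `connect` is a callable returning an object with a send() method and
    raising OSError when the connection cannot be established.
    Returns True if the message was sent, False if it was skipped.
    """
    for _ in range(attempts):
        try:
            conn = connect()
        except OSError:
            continue            # connection failed: try again
        conn.send(message)
        return True
    return False                # both attempts failed: skip the message
```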
3698
3699@node Details about the broadcast implementation
3700@subsubsection Details about the broadcast implementation
3701@c %**end of header
3702
3703First I want to point out that the broadcast functionality for the CONTROL
messages is not implemented in a conventional way. Since the inquiry scan
takes too long and it would take some time to send a message to all the
3706discoverable devices I decided to tackle the problem in a different way. Here
3707is how I did it:
3708
3709@itemize @bullet
@item The first time I have to broadcast a message I make an
inquiry scan and save all the devices' addresses to a vector.
@item After the inquiry scan ends I take the first address from the list and I
try to connect to it. If it fails, I try to connect to the next one. If it
succeeds, I save the socket to a list and send the message to the device.
@item When I have to broadcast another message, I first search the list for
a new device which I'm not connected to. If there is no new device on the list
I go to the beginning of the list and send the message to the old devices.
After 5 cycles I make a new inquiry scan to check if there are new
discoverable devices and save them to the list. If there are no new
discoverable devices I reset the cycling counter and go again through the old
list and send messages to the devices saved in it.
3722@end itemize
3723
3724@strong{Therefore}:
3725
3726@itemize @bullet
@item every time I have a broadcast message I look in the list for a
new device and send the message to it
@item if I have reached the end of the list 5 times and I'm connected to all
the devices from the list I make a new inquiry scan. @emph{The number of the
list's cycles after an inquiry scan could be increased by redefining the
MAX_LOOPS variable}
@item when there are no new devices I send messages to the old ones.
3734@end itemize
3735
3736Doing so, the broadcast control messages will reach the devices but with delay.
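
One possible reading of this scheme is a round-robin walk over the list of
discovered devices, one device per broadcast message. The Python sketch below
implements that simplified reading; it is not a transcription of the helper's
C code, and apart from @code{MAX_LOOPS} every name in it
(@code{Broadcaster}, @code{inquiry_scan}, @code{connect}) is hypothetical.

```python
MAX_LOOPS = 5   # passes over the device list before a new inquiry scan


class Broadcaster:
    def __init__(self, inquiry_scan, connect, max_loops=MAX_LOOPS):
        self.inquiry_scan = inquiry_scan  # callable -> iterable of addresses
        self.connect = connect            # callable(addr) -> conn, or OSError
        self.max_loops = max_loops
        self.devices = []                 # addresses found so far, in order
        self.sockets = {}                 # address -> open connection
        self.pos = 0                      # next list position to try
        self.loops = 0                    # completed passes since last scan

    def _scan(self):
        for addr in self.inquiry_scan():
            if addr not in self.devices:
                self.devices.append(addr)
        self.loops = 0                    # reset the cycling counter

    def broadcast(self, message):
        """Send `message` to one device; return its address or None."""
        if not self.devices or self.loops >= self.max_loops:
            self._scan()
        for _ in range(len(self.devices)):
            addr = self.devices[self.pos]
            self.pos = (self.pos + 1) % len(self.devices)
            if self.pos == 0:
                self.loops += 1           # reached the end of the list
            conn = self.sockets.get(addr)
            if conn is None:
                try:
                    conn = self.connect(addr)
                except OSError:
                    continue              # unreachable: try the next one
                self.sockets[addr] = conn
            conn.send(message)
            return addr
        return None
```

Each call reaches a single device, so a control message needs several calls
to reach every neighbour; this is the delay mentioned above.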
3737
@emph{NOTICE:} When I have to send a message to a certain device I first check
the broadcast list to see if we are connected to that device. If not we try
3740to connect to it and in case of success we save the address and the socket on
3741the list. If we are already connected to that device we simply use the socket.
3742
3743@node Windows functionality
3744@subsubsection Windows functionality
3745@c %**end of header
3746
For Windows I decided to use the Microsoft Bluetooth stack which has the
advantage of being included as standard since Windows XP SP2. The main
disadvantage is that it only supports the RFCOMM protocol so we will not be
able to have low-level control over the Bluetooth device. Therefore it is the
user's responsibility to check if the device is up and in discoverable mode.
Also there are no tools which could be used for debugging in order to read the
data coming from and going to a Bluetooth device, which obviously hindered my
work. Another thing that slowed down the implementation of the plugin (besides
that I wasn't very familiar with the win32 API) was that there were some bugs
in MinGW regarding Bluetooth. Now they are solved but you should keep in mind
that you should have the latest updates (especially the @emph{ws2bth} header).
3758
Besides the fact that it uses the Windows Sockets, the Windows implementation
3760follows the same principles as the Linux one:
3761
3762@itemize @bullet
3763@item
It has an initialization part where it initializes the Windows Sockets, creates
an RFCOMM socket which will be bound and switched to listening mode and
registers an SDP service.
3767In the Microsoft Bluetooth API there are two ways to work with the SDP:
3768@itemize @bullet
3769@item an easy way which works with very simple service records
3770@item a hard way which is useful when you need to update or to delete the
3771record
3772@end itemize
3773@end itemize
3774
Since I only needed the SDP service to find out on which port the device is
listening and that did not change, I decided to use the easy way. In order
to register the service I used the @emph{WSASetService} function and I
generated the @emph{Universally Unique Identifier} with the Windows
@emph{guidgen.exe} tool.
3780
3781In the loop section the only difference from the Linux implementation is that
3782I used the GNUNET_NETWORK library for functions like @emph{accept},
3783@emph{bind}, @emph{connect} or @emph{select}. I decided to use the
3784GNUNET_NETWORK library because I also needed to interact with the STDIN and
3785STDOUT handles and on Windows the select function is only defined for sockets,
3786and it will not work for arbitrary file handles.
3787
Another difference between the Linux and Windows implementations is that in
Linux the Bluetooth address is represented in 48 bits while in Windows it is
represented in 64 bits. Therefore I had to make some changes to the
@emph{plugin_transport_wlan} header.
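
To illustrate the width difference: a 48-bit address fits into the low six
octets of a 64-bit integer, with the two upper octets left at zero. The
Python sketch below shows that mapping for the textual form of an address.
The function names are hypothetical, and the sketch says nothing about how
the plugin header actually lays out its address structure.

```python
def mac_to_bth_addr(mac):
    """Convert "AA:BB:CC:DD:EE:FF" into an integer below 2**48."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("a Bluetooth address has exactly six octets")
    return int.from_bytes(octets, "big")


def bth_addr_to_mac(value):
    """Convert an integer below 2**48 back into colon-separated hex."""
    if not 0 <= value < 1 << 48:
        raise ValueError("address does not fit into 48 bits")
    return ":".join(f"{octet:02X}" for octet in value.to_bytes(6, "big"))
```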
3792
3793Also, currently on Windows the Bluetooth plugin doesn't have support for
3794broadcast messages. When it receives a broadcast message it will skip it.
3795
3796@node Pending features
3797@subsubsection Pending features
3798@c %**end of header
3799
3800@itemize @bullet
3801@item Implement the broadcast functionality on Windows @emph{(currently working
3802on)}
@item Implement a testcase for the helper: @emph{The testcase consists of a
program which emulates the plugin and uses the helper. It will simulate
connections, disconnections and data transfers.}
3806@end itemize
3807
3808If you have a new idea about a feature of the plugin or suggestions about how
3809I could improve the implementation you are welcome to comment or to contact
3810me.
3811
3812@node WLAN plugin
3813@section WLAN plugin
3814@c %**end of header
3815
3816This section documents how the wlan transport plugin works. Parts which are not
3817implemented yet or could be better implemented are described at the end.
3818
3819@node The ATS Subsystem
3820@section The ATS Subsystem
3821@c %**end of header
3822
3823ATS stands for "automatic transport selection", and the function of ATS in
3824GNUnet is to decide which address (and thus transport plugin) should be used
3825for two peers to communicate, and what bandwidth limits should be imposed on
3826such an individual connection. To help ATS make an informed decision,
3827higher-level services inform the ATS service about their requirements and the
3828quality of the service rendered. The ATS service also interacts with the
3829transport service to be apprised of working addresses and to communicate its
3830resource allocation decisions. Finally, the ATS service's operation can be
3831observed using a monitoring API.
3832
3833The main logic of the ATS service only collects the available addresses, their
3834performance characteristics and the applications' requirements, but does not
3835make the actual allocation decision. This last critical step is left to an ATS
3836plugin, as we have implemented (currently three) different allocation
3837strategies which differ significantly in their performance and maturity, and it
3838is still unclear if any particular plugin is generally superior.
3839
3840@node GNUnet's CORE Subsystem
3841@section GNUnet's CORE Subsystem
3842@c %**end of header
3843
3844The CORE subsystem in GNUnet is responsible for securing link-layer
3845communications between nodes in the GNUnet overlay network. CORE builds on the
3846TRANSPORT subsystem which provides for the actual, insecure, unreliable
3847link-layer communication (for example, via UDP or WLAN), and then adds
3848fundamental security to the connections:
3849
3850@itemize @bullet
3851@item confidentiality with so-called perfect forward secrecy; we use
3852@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman,
3853ECDHE} powered by @uref{http://cr.yp.to/ecdh.html, Curve25519} for the key
3854exchange and then use symmetric encryption, encrypting with both
3855@uref{http://en.wikipedia.org/wiki/Rijndael, AES-256} and
3856@uref{http://en.wikipedia.org/wiki/Twofish, Twofish}
3857@item @uref{http://en.wikipedia.org/wiki/Authentication, authentication} is
3858achieved by signing the ephemeral keys using @uref{http://ed25519.cr.yp.to/,
3859Ed25519}, a deterministic variant of @uref{http://en.wikipedia.org/wiki/ECDSA,
3860ECDSA}
3861@item integrity protection (using @uref{http://en.wikipedia.org/wiki/SHA-2,
3862SHA-512} to do @uref{http://en.wikipedia.org/wiki/Authenticated_encryption,
3863encrypt-then-MAC})
3864@item @uref{http://en.wikipedia.org/wiki/Replay_attack, replay} protection
3865(using nonces, timestamps, challenge-response, message counters and ephemeral
3866keys)
3867@item liveness (keep-alive messages, timeout)
3868@end itemize
3869
3870@menu
3871* Limitations::
3872* When is a peer "connected"?::
3873* libgnunetcore::
3874* The CORE Client-Service Protocol::
3875* The CORE Peer-to-Peer Protocol::
3876@end menu
3877
3878@node Limitations
3879@subsection Limitations
3880@c %**end of header
3881
3882CORE does not perform @uref{http://en.wikipedia.org/wiki/Routing, routing};
3883using CORE it is only possible to communicate with peers that happen to
3884already be "directly" connected with each other. CORE also does not have an
3885API to allow applications to establish such "direct" connections --- for this,
3886applications can ask TRANSPORT, but TRANSPORT might not be able to establish a
3887"direct" connection. The TOPOLOGY subsystem is responsible for trying to keep
3888a few "direct" connections open at all times. Applications that need to talk
3889to particular peers should use the CADET subsystem, as it can establish
3890arbitrary "indirect" connections.
3891
3892Because CORE does not perform routing, CORE must only be used directly by
3893applications that either perform their own routing logic (such as anonymous
3894file-sharing) or that do not require routing, for example because they are
3895based on flooding the network. CORE communication is unreliable and delivery
3896is possibly out-of-order. Applications that require reliable communication
3897should use the CADET service. Each application can only queue one message per
3898target peer with the CORE service at any time; messages cannot be larger than
3899approximately 63 kilobytes. If messages are small, CORE may group multiple
3900messages (possibly from different applications) prior to encryption. If
3901permitted by the application (using the @uref{http://baus.net/on-tcp_cork/,
3902cork} option), CORE may delay transmissions to facilitate grouping of multiple
3903small messages. If cork is not enabled, CORE will transmit the message as soon
3904as TRANSPORT allows it (TRANSPORT is responsible for limiting bandwidth and
3905congestion control). CORE does not allow flow control; applications are
3906expected to process messages at line-speed. If flow control is needed,
3907applications should use the CADET service.
3908
3909@node When is a peer "connected"?
3910@subsection When is a peer "connected"?
3911@c %**end of header
3912
3913In addition to the security features mentioned above, CORE also provides one
3914additional key feature to applications using it, and that is a limited form of
3915protocol-compatibility checking. CORE distinguishes between TRANSPORT-level
3916connections (which enable communication with other peers) and
3917application-level connections. Applications using the CORE API will
3918(typically) learn about application-level connections from CORE, and not about
3919TRANSPORT-level connections. When a typical application uses CORE, it will
3920specify a set of message types (from @code{gnunet_protocols.h}) that it
3921understands. CORE will then notify the application about connections it has
3922with other peers if and only if applications on those peers registered an
3923intersecting set of message types with their CORE service. Thus, it is possible that
3924CORE only exposes a subset of the established direct connections to a
3925particular application --- and different applications running above CORE might
3926see different sets of connections at the same time.
3927
3928Applications that do not register a handler for any message type are a special
3929case. CORE assumes that these applications merely want to monitor connections
3930(or "all" messages via other callbacks) and will notify those applications
3931about all connections. This is used, for example, by the @code{gnunet-core}
3932command-line tool to display the active connections. Note that it is also
3933possible that the TRANSPORT service has more active connections than the CORE
3934service, as the CORE service first has to perform a key exchange with
3935connecting peers before exchanging information about supported message types
3936and notifying applications about the new connection.
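
The notification rule described above can be summarised in a few lines of
Python. This is only a sketch of the rule as stated in this section, not code
from the CORE service; the function name @code{should_notify} and the use of
plain sets of message type numbers are assumptions made for illustration.

```python
def should_notify(app_types, remote_types):
    """Decide whether CORE reports a remote peer to a local application.

    app_types:    message types the local application registered (a set)
    remote_types: message types registered by applications on the remote peer
    """
    if not app_types:
        # No handlers registered: the application is treated as a monitor
        # and is told about every connection.
        return True
    # Otherwise at least one message type must be understood on both sides.
    return bool(set(app_types) & set(remote_types))
```

With this rule, two applications on the same peer can see different sets of
connections, because each one is compared against the remote peer's types
separately.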
3937
3938@node libgnunetcore
3939@subsection libgnunetcore
3940@c %**end of header
3941
3942The CORE API (defined in @code{gnunet_core_service.h}) is the basic messaging
3943API used by P2P applications built using GNUnet. It allows applications to
3944send encrypted messages to, and receive them from, the peer's "directly"
3945connected neighbours.
3946
3947As CORE connections are generally "direct" connections, applications must not
3948assume that they can connect to arbitrary peers this way, as "direct"
3949connections may not always be possible. Applications using CORE are notified
3950about which peers are connected. Creating new "direct" connections must be
3951done using the TRANSPORT API.
3952
3953The CORE API provides unreliable, out-of-order delivery. While the
3954implementation tries to ensure timely, in-order delivery, neither message loss
3955nor reordering is detected, and both must be tolerated by the application. Most
3956importantly, CORE will NOT perform retransmission if messages could not be
3957delivered.
3958
3959Note that CORE allows applications to queue one message per connected peer.
3960The rate at which each connection operates is influenced by the preferences
3961expressed by local applications as well as restrictions imposed by the other
3962peer. Local applications can express their preferences for particular
3963connections using the "performance" API of the ATS service.
3964
3965Applications that require more sophisticated transmission capabilities such as
3966TCP-like behavior, or that need to send messages to arbitrary remote peers,
3967should use the CADET API.
3968
3969The typical use of the CORE API is to connect to the CORE service using
3970@code{GNUNET_CORE_connect}, process events from the CORE service (such as
3971peers connecting, peers disconnecting and incoming messages) and send messages
3972to connected peers using @code{GNUNET_CORE_notify_transmit_ready}. Note that
3973applications must cancel pending transmission requests if they receive a
3974disconnect event for a peer that had a transmission pending; furthermore,
3975queueing more than one transmission request per peer per application using the
3976service is not permitted.
3977
3978The CORE API also allows applications to monitor all communications of the
3979peer prior to encryption (for outgoing messages) or after decryption (for
3980incoming messages). This can be useful for debugging, diagnostics or to
3981establish the presence of cover traffic (for anonymity). As monitoring
3982applications are often not interested in the payload, the monitoring callbacks
3983can be configured to only provide the message headers (including the message
3984type and size) instead of copying the full data stream to the monitoring
3985client.
3986
3987The init callback of the @code{GNUNET_CORE_connect} function is called with
3988the hash of the public key of the peer. This public key is used to identify
3989the peer globally in the GNUnet network. Applications are encouraged to check
3990that the provided hash matches the hash that they are using (as theoretically
3991the application may be using a different configuration file with a different
3992private key, which would result in hard-to-find bugs).
3993
3994As with most service APIs, the CORE API isolates applications from crashes of
3995the CORE service. If the CORE service crashes, the application will see
3996disconnect events for all existing connections. Once the connections are
3997re-established, the applications will receive matching connect events.
3998
3999@node The CORE Client-Service Protocol
4000@subsection The CORE Client-Service Protocol
4001@c %**end of header
4002
4003This section describes the protocol between an application using the CORE
4004service (the client) and the CORE service process itself.
4005
4006
4007@menu
4008* Setup2::
4009* Notifications::
4010* Sending::
4011@end menu
4012
4013@node Setup2
4014@subsubsection Setup2
4015@c %**end of header
4016
4017When a client connects to the CORE service, it first sends an
4018@code{InitMessage} which specifies options for the connection and a set of
4019message type values which are supported by the application. The options
4020bitmask specifies which events the client would like to be notified about. The
4021options include:
4022
4023@table @asis
4024@item GNUNET_CORE_OPTION_NOTHING No notifications
4025@item GNUNET_CORE_OPTION_STATUS_CHANGE Peers connecting and disconnecting
4026@item GNUNET_CORE_OPTION_FULL_INBOUND All inbound messages (after decryption) with
4027full payload
4028@item GNUNET_CORE_OPTION_HDR_INBOUND Just the @code{MessageHeader}
4029of all inbound messages
4030@item GNUNET_CORE_OPTION_FULL_OUTBOUND All outbound
4031messages (prior to encryption) with full payload
4032@item GNUNET_CORE_OPTION_HDR_OUTBOUND Just the @code{MessageHeader} of all outbound
4033messages
4034@end table
4035
4036Typical applications will only monitor for connection status changes.
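
To illustrate how such an options bitmask is combined and tested, here is a
small Python sketch. The option names follow the table above, but the numeric
values are made up for this example and are NOT the values defined in
@code{gnunet_core_service.h}; the helper @code{wants} is likewise hypothetical.

```python
from enum import IntFlag


class CoreOption(IntFlag):
    # Illustrative bit values only; the real constants live in the C headers.
    NOTHING = 0
    STATUS_CHANGE = 1
    FULL_INBOUND = 2
    HDR_INBOUND = 4
    FULL_OUTBOUND = 8
    HDR_OUTBOUND = 16


def wants(options, event):
    """Return True if the client asked to be notified about `event`."""
    return bool(options & event)


# A typical application only watches peers connecting and disconnecting.
typical = CoreOption.STATUS_CHANGE

# A monitoring tool might additionally ask for the headers of all traffic.
monitor = (CoreOption.STATUS_CHANGE
           | CoreOption.HDR_INBOUND
           | CoreOption.HDR_OUTBOUND)
```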
4037
4038The CORE service responds to the @code{InitMessage} with an
4039@code{InitReplyMessage} which contains the peer's identity. Afterwards, both
4040CORE and the client can send messages.
4041
4042@node Notifications
4043@subsubsection Notifications
4044@c %**end of header
4045
4046CORE will send @code{ConnectNotifyMessage}s and
4047@code{DisconnectNotifyMessage}s whenever peers connect to or disconnect from
4048CORE (assuming their type maps overlap with the message types registered by
4049the client). When CORE receives a message that matches the set of message
4050types specified during the @code{InitMessage} (or if monitoring is enabled
4051for inbound messages in the options), it sends a @code{NotifyTrafficMessage}
4052with the peer identity of the sender and the decrypted payload. The same
4053message format (except with @code{GNUNET_MESSAGE_TYPE_CORE_NOTIFY_OUTBOUND}
4054for the message type) is used to notify clients monitoring outbound messages;
4055here, the peer identity given is that of the receiver.
4056
4057@node Sending
4058@subsubsection Sending
4059@c %**end of header
4060
4061When a client wants to transmit a message, it first requests a transmission
4062slot by sending a @code{SendMessageRequest} which specifies the priority,
4063deadline and size of the message. Note that these values may be ignored by
4064CORE. When CORE is ready for the message, it answers with a
4065@code{SendMessageReady} response. The client can then transmit the payload
4066with a @code{SendMessage} message. Note that the actual message size in the
4067@code{SendMessage} is allowed to be smaller than the size in the original
4068request. A client may at any time send a fresh @code{SendMessageRequest},
4069which then supersedes the previous @code{SendMessageRequest}, which is then no
4070longer valid. The client can tell which @code{SendMessageRequest} the CORE
4071service's @code{SendMessageReady} message is for as all of these messages
4072contain a "unique" request ID (based on a counter incremented by the client
4073for each request).
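
The request/ready/send exchange can be modelled on the client side as follows.
This Python sketch only mirrors the rules given in this section (one
outstanding request, a fresh request supersedes the old one, the payload may be
smaller than announced); the class name @code{CoreClientSendState} and its
methods are hypothetical and do not correspond to functions in the GNUnet
sources.

```python
class CoreClientSendState:
    """Tracks the single outstanding SendMessageRequest of one client."""

    def __init__(self):
        self._next_id = 0      # counter incremented for each request
        self._pending = None   # (request_id, announced_size) or None

    def request(self, size):
        """Issue a SendMessageRequest; it supersedes any earlier one."""
        self._next_id += 1
        self._pending = (self._next_id, size)
        return self._next_id

    def on_ready(self, request_id, payload):
        """Handle a SendMessageReady.

        Returns the payload to put into the SendMessage, or None if the
        SendMessageReady refers to a request that is no longer valid.
        """
        if self._pending is None or self._pending[0] != request_id:
            return None
        _, announced_size = self._pending
        if len(payload) > announced_size:
            raise ValueError("payload larger than the announced size")
        self._pending = None
        return payload
```

A client that changes its mind simply calls @code{request} again; a late
@code{SendMessageReady} for the old request ID is then ignored.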
4074
4075@node The CORE Peer-to-Peer Protocol
4076@subsection The CORE Peer-to-Peer Protocol
4077@c %**end of header
4078
4079
4080@menu
4081* Creating the EphemeralKeyMessage::
4082* Establishing a connection::
4083* Encryption and Decryption::
4084* Type maps::
4085@end menu
4086
4087@node Creating the EphemeralKeyMessage
4088@subsubsection Creating the EphemeralKeyMessage
4089@c %**end of header
4090
4091When the CORE service starts, each peer creates a fresh ephemeral (ECC)
4092public-private key pair and signs the corresponding @code{EphemeralKeyMessage}
4093with its long-term key (which we usually call the peer's identity; the hash of
4094the public long-term key is what results in a @code{struct
4095GNUNET_PeerIdentity} in all GNUnet APIs). The ephemeral key is ONLY used for an
4096@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman,
4097ECDHE} exchange by the CORE service to establish symmetric session keys. A
4098peer will use the same @code{EphemeralKeyMessage} for all peers for
4099@code{REKEY_FREQUENCY}, which is usually 12 hours. After that time, it will
4100create a fresh ephemeral key (forgetting the old one) and broadcast the new
4101@code{EphemeralKeyMessage} to all connected peers, resulting in fresh
4102symmetric session keys. Note that peers independently decide on when to
4103discard ephemeral keys; it is not a protocol violation to discard keys more
4104often. Ephemeral keys are also never stored to disk; restarting a peer will
4105thus always create a fresh ephemeral key. The use of ephemeral keys is what
4106provides @uref{http://en.wikipedia.org/wiki/Forward_secrecy, forward secrecy}.
4107
4108Just before transmission, the @code{EphemeralKeyMessage} is patched to reflect
4109the current sender_status, which specifies the current state of the connection
4110from the point of view of the sender. The possible values are:
4111
4112@table @asis
4113@item KX_STATE_DOWN Initial value, never used on the network
4114@item KX_STATE_KEY_SENT We sent our ephemeral key, do not know the key of the other
4115peer
4116@item KX_STATE_KEY_RECEIVED This peer has received a valid ephemeral key
4117of the other peer, but we are waiting for the other peer to confirm its
4118authenticity (ability to decode) via challenge-response.
4119@item KX_STATE_UP The
4120connection is fully up from the point of view of the sender (now performing
4121keep-alives)
4122@item KX_STATE_REKEY_SENT The sender has initiated a rekeying
4123operation; the other peer has so far failed to confirm a working connection
4124using the new ephemeral key
4125@end table
4126
4127@node Establishing a connection
4128@subsubsection Establishing a connection
4129@c %**end of header
4130
4131Peers begin their interaction by sending an @code{EphemeralKeyMessage} to the
4132other peer once the TRANSPORT service notifies the CORE service about the
4133connection. When a peer receives an @code{EphemeralKeyMessage} with a status
4134indicating that the sender does not have the receiver's ephemeral key, it
4135sends its own @code{EphemeralKeyMessage} in response. Additionally, if
4136the receiver has not yet confirmed the authenticity of the sender, it also
4137sends an (encrypted) @code{PingMessage} with a challenge (and the identity of
4138the target) to the other peer. Peers receiving a @code{PingMessage} respond
4139with an (encrypted) @code{PongMessage} which includes the challenge. Peers
4140receiving a @code{PongMessage} check the challenge, and if it matches set the
4141connection to @code{KX_STATE_UP}.
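
The message flow above can be followed with a toy model. The Python sketch
below leaves out all cryptography and rekeying and only tracks the state
transitions; the class @code{KxPeer}, the tuple-based message format and the
helper @code{run_handshake} are assumptions made for this illustration, not
structures from @code{gnunet-service-core_kx.c}.

```python
import secrets
from enum import Enum


class KxState(Enum):
    DOWN = "KX_STATE_DOWN"
    KEY_SENT = "KX_STATE_KEY_SENT"
    KEY_RECEIVED = "KX_STATE_KEY_RECEIVED"
    UP = "KX_STATE_UP"


class KxPeer:
    def __init__(self):
        self.state = KxState.DOWN
        self.challenge = secrets.randbits(32)  # value we expect back in a PONG

    def on_transport_connect(self):
        """TRANSPORT reported a connection: announce our ephemeral key."""
        self.state = KxState.KEY_SENT
        return [("EPHEMERAL_KEY", self.state)]

    def on_message(self, kind, value):
        """Process one message; return the list of replies to send."""
        out = []
        if kind == "EPHEMERAL_KEY":
            if self.state is not KxState.UP:
                self.state = KxState.KEY_RECEIVED
            if value is KxState.KEY_SENT:
                # The sender does not have our key yet: send it in response.
                out.append(("EPHEMERAL_KEY", self.state))
            if self.state is not KxState.UP:
                # Sender not yet confirmed: challenge it.
                out.append(("PING", self.challenge))
        elif kind == "PING":
            out.append(("PONG", value))       # echo the challenge
        elif kind == "PONG":
            if value == self.challenge:
                self.state = KxState.UP
        return out


def run_handshake(a, b):
    """Deliver messages between two peers until none are left."""
    queue = [(b, m) for m in a.on_transport_connect()]
    queue += [(a, m) for m in b.on_transport_connect()]
    while queue:
        dest, (kind, value) = queue.pop(0)
        sender = a if dest is b else b
        queue += [(sender, m) for m in dest.on_message(kind, value)]
```

Running @code{run_handshake} on two fresh @code{KxPeer} objects leaves both in
@code{KX_STATE_UP}; a @code{PongMessage} carrying the wrong challenge leaves
the state unchanged.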
4142
4143@node Encryption and Decryption
4144@subsubsection Encryption and Decryption
4145@c %**end of header
4146
4147All functions related to the key exchange and encryption/decryption of
4148messages can be found in @code{gnunet-service-core_kx.c} (except for the
4149cryptographic primitives, which are in @code{util/crypto*.c}). Given the key
4150material from ECDHE, a
4151@uref{http://en.wikipedia.org/wiki/Key_derivation_function, key derivation
4152function} is used to derive two pairs of encryption and decryption keys for
4153AES-256 and Twofish, as well as initialization vectors and authentication keys
4154(for @uref{http://en.wikipedia.org/wiki/HMAC, HMAC}). The HMAC is computed
4155over the encrypted payload. Encrypted messages include an iv_seed and the HMAC
4156in the header.
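
The encrypt-then-MAC step can be sketched with Python's standard library.
The example treats the ciphertext as opaque bytes (the AES-256 and Twofish
layers are omitted) and derives keys with a single labelled HMAC call, which is
a simplification chosen for this sketch and not the key derivation function
used by GNUnet. The function names and the 64-byte tag placement are likewise
assumptions.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha512().digest_size  # 64 bytes


def derive_key(shared_secret, label):
    """Derive an independent key per purpose from the ECDHE key material."""
    return hmac.new(shared_secret, label, hashlib.sha512).digest()


def protect(auth_key, ciphertext):
    """Compute the HMAC over the *encrypted* payload and prepend it."""
    tag = hmac.new(auth_key, ciphertext, hashlib.sha512).digest()
    return tag + ciphertext


def verify(auth_key, message):
    """Check the HMAC before any decryption; return ciphertext or None."""
    tag, ciphertext = message[:TAG_LEN], message[TAG_LEN:]
    expected = hmac.new(auth_key, ciphertext, hashlib.sha512).digest()
    if hmac.compare_digest(tag, expected):
        return ciphertext
    return None
```

Because the tag covers the ciphertext, a receiver can reject a modified
message without running the decryption code at all.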
4157
4158Each encrypted message in the CORE service includes a sequence number and a
4159timestamp in the encrypted payload. The CORE service remembers the largest
4160observed sequence number and a bit-mask which represents which of the previous
416132 sequence numbers were already used. Messages with sequence numbers lower
4162than the largest observed sequence number minus 32 are discarded. Messages
4163with a timestamp that is more than @code{REKEY_TOLERANCE} off (5 minutes) are
4164also discarded. This of course means that system clocks need to be reasonably
4165synchronized for peers to be able to communicate. Additionally, as the
4166ephemeral key changes every 12h, a peer would not even be able to decrypt
4167messages older than 12h.
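
The sequence number check can be written as a sliding window. The following
Python sketch implements the rule described above (largest observed sequence
number plus a 32-bit mask for the numbers just below it); the class name
@code{ReplayWindow} and the bit layout of the mask are assumptions, and the
timestamp check is left out.

```python
WINDOW = 32  # how many sequence numbers below the largest are tracked


class ReplayWindow:
    def __init__(self):
        self.largest = None  # largest sequence number seen so far
        self.mask = 0        # bit i set: number (largest - 1 - i) was seen

    def accept(self, seq):
        """Return True if `seq` is fresh, False if it must be discarded."""
        if self.largest is None:
            self.largest = seq
            return True
        if seq > self.largest:
            shift = seq - self.largest
            # Slide the window and record the previous largest as seen.
            self.mask = (self.mask << shift) | (1 << (shift - 1))
            self.mask &= (1 << WINDOW) - 1
            self.largest = seq
            return True
        if seq == self.largest:
            return False                      # exact duplicate
        offset = self.largest - seq - 1
        if offset >= WINDOW:
            return False                      # too old to be tracked
        if self.mask & (1 << offset):
            return False                      # already used
        self.mask |= 1 << offset
        return True
```

Messages may thus arrive out of order within the window, but each sequence
number is accepted at most once.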
4168
4169@node Type maps
4170@subsubsection Type maps
4171@c %**end of header
4172
4173Once an encrypted connection has been established, peers begin to exchange
4174type maps. Type maps are used to allow the CORE service to determine which
4175(encrypted) connections should be shown to which applications. A type map is
4176an array of 65536 bits representing the different types of messages understood
4177by applications using the CORE service. Each CORE service maintains this map,
4178simply by setting the respective bit for each message type supported by any of
4179the applications using the CORE service. Note that bits for message types
4180embedded in higher-level protocols (such as CADET) will not be included in
4181these type maps.
4182
4183Typically, the type map of a peer will be sparse. Thus, the CORE service
4184attempts to compress its type map using @code{gzip}-style compression
4185("deflate") prior to transmission. However, if the compression fails to
4186compact the map, the map may also be transmitted without compression
4187(resulting in @code{GNUNET_MESSAGE_TYPE_CORE_COMPRESSED_TYPE_MAP} or
4188@code{GNUNET_MESSAGE_TYPE_CORE_BINARY_TYPE_MAP} messages respectively). Upon
4189receiving a type map, the respective CORE service notifies applications about
4190the connection to the other peer if they support any message type indicated in
4191the type map (or no message type at all). If the CORE service experiences a
4192connect or disconnect event from an application, it updates its type map
4193(setting or unsetting the respective bits) and notifies its neighbours about
4194the change. The CORE services of the neighbours then in turn generate connect
4195and disconnect events for the peer that sent the type map for their respective
4196applications. As CORE messages may be lost, the CORE service confirms
4197receiving a type map by sending back a
4198@code{GNUNET_MESSAGE_TYPE_CORE_CONFIRM_TYPE_MAP}. If such a confirmation (with
4199the correct hash of the type map) is not received, the sender will retransmit
4200the type map (with exponential back-off).
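
A type map and its optional compression can be sketched as follows. The
Python example uses @code{zlib} for the "deflate" step and falls back to the
raw map when compression does not help; the bit ordering inside each byte, the
string tags @code{"COMPRESSED"} and @code{"BINARY"}, and the function names are
assumptions for illustration and do not describe the wire format.

```python
import zlib

TYPE_MAP_BITS = 65536  # one bit per possible message type


def make_type_map(message_types):
    """Build the 65536-bit map with one bit set per supported type."""
    bits = bytearray(TYPE_MAP_BITS // 8)
    for t in message_types:
        bits[t // 8] |= 1 << (t % 8)
    return bytes(bits)


def encode_type_map(type_map):
    """Compress the map if that makes it smaller, else send it as is."""
    compressed = zlib.compress(type_map, 9)
    if len(compressed) < len(type_map):
        return ("COMPRESSED", compressed)
    return ("BINARY", type_map)


def decode_type_map(kind, body):
    """Recover the raw map from either encoding."""
    if kind == "COMPRESSED":
        return zlib.decompress(body)
    return body
```

For a sparse map with only a handful of bits set, the compressed form is a
small fraction of the 8192 bytes of the raw map.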
4201
4202@node GNUnet's CADET subsystem
4203@section GNUnet's CADET subsystem
4204
4205The CADET subsystem in GNUnet is responsible for secure end-to-end
4206communications between nodes in the GNUnet overlay network. CADET builds on the
4207CORE subsystem which provides for the link-layer communication and then adds
4208routing, forwarding and additional security to the connections. CADET offers
4209the same cryptographic services as CORE, but on an end-to-end level. This is
4210done so peers retransmitting traffic on behalf of other peers cannot access the
4211payload data.
4212
4213@itemize @bullet
4214@item CADET provides confidentiality with so-called perfect forward secrecy; we
4215use ECDHE powered by Curve25519 for the key exchange and then use symmetric
4216encryption, encrypting with both AES-256 and Twofish
4217@item authentication is achieved by signing the ephemeral keys using Ed25519, a
4218deterministic variant of ECDSA
4219@item integrity protection (using SHA-512 to do encrypt-then-MAC, although only
4220256 bits are sent to reduce overhead)
4221@item replay protection (using nonces, timestamps, challenge-response, message
4222counters and ephemeral keys)
4223@item liveness (keep-alive messages, timeout)
4224@end itemize
4225
4226In addition to the CORE-like security benefits, CADET offers other properties
4227that make it a more universal service than CORE.
4228
4229@itemize @bullet
4230@item CADET can establish channels to arbitrary peers in GNUnet. If a peer is
4231not immediately reachable, CADET will find a path through the network and ask
4232other peers to retransmit the traffic on its behalf.
4233@item CADET offers (optional) reliability mechanisms. In a reliable channel
4234traffic is guaranteed to arrive complete, unchanged and in-order.
4235@item CADET takes care of flow and congestion control mechanisms, not allowing
4236the sender to send more traffic than the receiver or the network are able to
4237process.
4238@end itemize
4239
4240@menu
4241* libgnunetcadet::
4242@end menu
4243
4244@node libgnunetcadet
4245@subsection libgnunetcadet
4246
4247
4248The CADET API (defined in @code{gnunet_cadet_service.h}) is the messaging API
4249used by P2P applications built using GNUnet. It provides applications the
4250ability to send encrypted messages to, and receive them from, any peer
4251participating in GNUnet. The API is heavily based on the CORE API.
4252
4253CADET delivers messages to other peers in "channels". A channel is a permanent
4254connection defined by a destination peer (identified by its public key) and a
4255port number. Internally, CADET tunnels all channels towards a destination peer
4256using one session key and relays the data on multiple "connections",
4257independent from the channels.
4258
4259Each channel has optional parameters, the most important being the reliability
4260flag. Should a message get lost on TRANSPORT/CORE level, and the channel was
4261created as reliable, CADET will retransmit the lost message and deliver it
4262in order to the destination application.
4263
4264To communicate with other peers using CADET, it is necessary to first connect
4265to the service using @code{GNUNET_CADET_connect}. This function takes several
4266parameters in form of callbacks, to allow the client to react to various
4267events, like incoming channels or channels that terminate, as well as specify a
4268list of ports the client wishes to listen to (at the moment it is not possible
4269to start listening on further ports once connected, but nothing prevents a
4270client from connecting to CADET several times, even once per listening
4271port). The function returns a handle which has to be used for any further
4272interaction with the service.
4273
4274To connect to a remote peer a client has to call the
4275@code{GNUNET_CADET_channel_create} function. The most important parameters
4276given are the remote peer's identity (its public key) and a port, which
4277specifies which application on the remote peer to connect to, similar to
4278TCP/UDP ports. CADET will then find the peer in the GNUnet network and
4279establish the proper low-level connections and do the necessary key exchanges
4280to ensure an authenticated, secure and verified communication. Similar to
4281@code{GNUNET_CADET_connect}, @code{GNUNET_CADET_channel_create} returns a
4282handle to interact with the created channel.
4283
4284For every message the client wants to send to the remote application,
4285@code{GNUNET_CADET_notify_transmit_ready} must be called, indicating the
4286channel on which the message should be sent and the size of the message (but
4287not the message itself!). Once CADET is ready to send the message, the provided
4288callback will fire, and the message contents are provided to this callback.
4289
4290Please note that CADET does not provide an explicit notification of when a
4291channel is connected. In loosely connected networks, like big wireless mesh
4292networks, this can take several seconds, even minutes in the worst case. To be
4293alerted when a channel is online, a client can call
4294@code{GNUNET_CADET_notify_transmit_ready} immediately after
4295@code{GNUNET_CADET_channel_create}. When the callback is activated, it means
4296that the channel is online. The callback can give 0 bytes to CADET if no
4297message is to be sent; this is OK.
4298
4299If a transmission was requested but is no longer needed before the callback
4300fires, it can be cancelled with
4301@code{GNUNET_CADET_notify_transmit_ready_cancel}, which uses the handle given
4302back by @code{GNUNET_CADET_notify_transmit_ready}. As in the case of CORE, only
4303one message can be requested at a time: a client must not call
4304@code{GNUNET_CADET_notify_transmit_ready} again until the callback is called or
4305the request is cancelled.
4306
4307When a channel is no longer needed, a client can call
4308@code{GNUNET_CADET_channel_destroy} to get rid of it. Note that CADET will try
4309to transmit all pending traffic before notifying the remote peer of the
4310destruction of the channel, including retransmitting lost messages if the
4311channel was reliable.
4312
4313Incoming channels, channels being closed by the remote peer, and traffic on any
4314incoming or outgoing channels are given to the client when CADET executes the
4315callbacks given to it at the time of @code{GNUNET_CADET_connect}.
4316
4317Finally, when an application no longer wants to use CADET, it should call
4318@code{GNUNET_CADET_disconnect}, but first all channels and pending
4319transmissions must be closed (otherwise CADET will complain).
4320
4321@node GNUnet's NSE subsystem
4322@section GNUnet's NSE subsystem
4323
4324
4325NSE stands for Network Size Estimation. The NSE subsystem provides other
4326subsystems and users with a rough estimate of the number of peers currently
4327participating in the GNUnet overlay. The computed value is not a precise number
4328as producing a precise number in a decentralized, efficient and secure way is
4329impossible. While NSE's estimate is inherently imprecise, NSE also gives the
4330expected range. For a peer that has been running in a stable network for a
4331while, the real network size will typically (99.7% of the time) be in the range
4332of [2/3 estimate, 3/2 estimate]. We will now give an overview of the algorithm
4333used to calculate the estimate; all of the details can be found in this
4334technical report.
4335
4336@menu
4337* Motivation::
4338* Principle::
4339* libgnunetnse::
4340* The NSE Client-Service Protocol::
4341* The NSE Peer-to-Peer Protocol::
4342@end menu
4343
4344@node Motivation
4345@subsection Motivation
4346
4347
4348Some subsystems, like DHT, need to know the size of the GNUnet network to
4349optimize some parameters of their own protocol. The decentralized nature of
4350GNUnet makes efficiently and securely counting the exact number of peers
4351infeasible. Although there are several decentralized algorithms to count the
4352number of peers in a system, so far there is none to do so securely. Other
4353protocols may allow any malicious peer to manipulate the final result or to
4354take advantage of the system to perform DoS (Denial of Service) attacks against
4355the network. GNUnet's NSE protocol avoids these drawbacks.
4356
4357
4358
4359@menu
4360* Security::
4361@end menu
4362
4363@node Security
4364@subsubsection Security
4365
4366
4367The NSE subsystem is designed to be resilient against these attacks. It uses
4368@uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proofs of work} to
4369prevent one peer from impersonating a large number of participants, which would
4370otherwise allow an adversary to artificially inflate the estimate. The DoS
4371protection comes from the time-based nature of the protocol: the estimates are
4372calculated periodically and out-of-time traffic is either ignored or stored for
4373later retransmission by benign peers. In particular, peers cannot trigger
4374global network communication at will.
4375
4376@node Principle
4377@subsection Principle
4378
4379
4380The algorithm calculates the estimate by finding the globally closest peer ID
4381to a random, time-based value.
4382
4383The idea is that the closer the ID is to the random value, the more "densely
4384packed" the ID space is, and therefore the more peers are in the network.
4385
4386
4387
4388@menu
4389* Example::
4390* Algorithm::
4391* Target value::
4392* Timing::
4393* Controlled Flooding::
4394* Calculating the estimate::
4395@end menu
4396
4397@node Example
4398@subsubsection Example
4399
4400
4401Suppose all peers have IDs between 0 and 100 (our ID space), and the random
4402value is 42. If the closest peer has the ID 70 we can imagine that the average
4403"distance" between peers is around 30 and therefore there are around 3 peers in
4404the whole ID space. On the other hand, if the closest peer has the ID 44, we
4405can imagine that the space is rather packed with peers, maybe as many as 50 of
4406them. Naturally, we could have been rather unlucky, and there is only one peer,
4407which happens to have the ID 44. Thus, the current estimate is calculated as the
4408average over multiple rounds, and not just a single sample.
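
The arithmetic of this example can be reproduced directly. The Python sketch
below is a toy model of the example only: it divides the size of the ID space
by the distance to the closest ID and averages several rounds. The function
names are hypothetical, and the estimator actually used by the NSE service may
be computed quite differently.

```python
def naive_estimate(id_space, target, closest_id):
    """Estimate the peer count from one round of the toy example."""
    distance = abs(closest_id - target)
    # Guard against a perfect hit, which would otherwise divide by zero.
    return id_space / max(distance, 1)


def averaged_estimate(id_space, rounds):
    """Average the per-round estimates; `rounds` is a list of
    (target, closest_id) pairs."""
    estimates = [naive_estimate(id_space, t, c) for t, c in rounds]
    return sum(estimates) / len(estimates)
```

With an ID space of 100 and a target of 42, a closest ID of 70 gives roughly
3.6 peers and a closest ID of 44 gives 50, matching the figures in the text;
averaging over many rounds dampens the effect of a single lucky or unlucky
sample.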
4409
4410@node Algorithm
4411@subsubsection Algorithm
4412
4413
4414Given that example, one can imagine that the job of the subsystem is to
4415efficiently communicate the ID of the closest peer to the target value to all
4416the other peers, who will calculate the estimate from it.
4417
4418@node Target value
4419@subsubsection Target value
4420
4421@c %**end of header
4422
The target value itself is generated by hashing the current time, rounded down
to an agreed value. If the rounding amount is 1h (default) and the time is
12:34:56, the time to hash would be 12:00:00. The process is repeated once per
rounding amount (in this example, every hour). Every repetition is
called a round.
4428
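The rounding and hashing step might look as follows. This is a minimal sketch
under stated assumptions: the use of SHA-512 and the 64-bit big-endian encoding
of the timestamp are choices made for this illustration, and the helper names
@code{round_start} and @code{target_value} are hypothetical.

@example
import hashlib
import struct

ROUND_SECONDS = 3600  # default rounding amount of one hour

def round_start(now, interval=ROUND_SECONDS):
    # Round the timestamp down to the start of the current round.
    return now - (now % interval)

def target_value(now, interval=ROUND_SECONDS):
    # Hash the rounded time (assumed encoding: 64-bit big-endian).
    # Peers with roughly synchronized clocks get the same digest.
    encoded = struct.pack(">Q", round_start(now, interval))
    return hashlib.sha512(encoded).digest()

noon = 12 * 3600
now = noon + 34 * 60 + 56  # 12:34:56 as seconds since midnight
print(round_start(now) == noon)                  # True
print(target_value(now) == target_value(noon))   # True
@end example

Any two peers whose clocks fall into the same round derive the same target
without exchanging a single message.
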
4429@node Timing
4430@subsubsection Timing
4431@c %**end of header
4432
The NSE subsystem has some timing control to avoid every peer broadcasting its
ID all at once. Once each peer has the target random value, it compares its own
ID to the target and calculates the hypothetical size of the network if that
peer were to be the closest. Then it compares the hypothetical size with the
estimate from the previous rounds. For each value there is an associated point
in the period, let's call it "broadcast time". If its own hypothetical estimate
is the same as the previous global estimate, its "broadcast time" will be in
the middle of the round. If it is bigger, it will be earlier, and if it is
smaller (the most likely case), it will be later. This ensures that the peers
closest to the target value start broadcasting their IDs first.
4443
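One way to obtain a "broadcast time" with these properties is a smooth,
monotone mapping such as an arctangent curve. The sketch below uses that curve
purely as an illustration; the function name @code{broadcast_offset} and the
choice of curve are assumptions of this example, not a statement about the
exact formula used by the service.

@example
import math

def broadcast_offset(hypothetical, previous, interval=3600.0):
    # Offset into the round (in seconds) at which flooding starts.
    # Both size arguments are on the same (logarithmic) scale.
    # atan maps any difference into (-pi/2, pi/2), so the result
    # always stays strictly inside the round.
    shift = math.atan(hypothetical - previous) / math.pi
    return interval / 2.0 - interval * shift

print(broadcast_offset(10.0, 10.0))           # 1800.0, middle of round
print(broadcast_offset(14.0, 10.0) < 1800.0)  # True: bigger is earlier
print(broadcast_offset(6.0, 10.0) > 1800.0)   # True: smaller is later
@end example
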
4444@node Controlled Flooding
4445@subsubsection Controlled Flooding
4446
4447@c %**end of header
4448
When a peer receives a value, it first verifies that the value is closer than
the closest value it had so far; otherwise it answers the incoming message with
a message containing the better value. Then it checks a proof of work that must
be included in the incoming message, to ensure that the other peer's ID is not
made up (otherwise a malicious peer could claim to have an ID of exactly the
target value every round). Once validated, it compares the broadcast time of the
received value with the current time and, if it is not too early, sends the
received value to its neighbors. Otherwise it stores the value until the
correct broadcast time comes. This prevents unnecessary traffic of sub-optimal
values, since a better value can come before the broadcast time, rendering the
previous one obsolete and saving the traffic that would have been used to
broadcast it to the neighbors.
4461
4462@node Calculating the estimate
4463@subsubsection Calculating the estimate
4464
4465@c %**end of header
4466
Once the closest ID has been spread across the network, each peer gets the exact
distance between this ID and the target value of the round and calculates the
estimate with a mathematical formula described in the tech report. The estimate
generated with this method for a single round is not very precise. Remember the
case of the example, where the only peer has the ID 44 and we happen to generate
the target value 42, thinking there are 50 peers in the network. Therefore, the
NSE subsystem remembers the last 64 estimates and calculates an average over
them, giving a result which usually has one bit of uncertainty (the real
size could be half of the estimate or twice as much). Note that the actual
network size is calculated in powers of two of the raw input, thus one bit of
uncertainty means a factor of two in the size estimate.
4478
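Keeping the last 64 per-round values and averaging them can be sketched with a
bounded queue. The class name @code{EstimateHistory} is hypothetical, and the
plain mean and population standard deviation used here are simplifying
assumptions; the service may weight or compute these values differently.

@example
from collections import deque
import statistics

class EstimateHistory:
    def __init__(self, size=64):
        # Only the most recent per-round estimates are kept.
        self.rounds = deque(maxlen=size)

    def add(self, log2_estimate):
        # Appending to a full deque silently drops the oldest round.
        self.rounds.append(log2_estimate)

    def summary(self):
        # Mean and spread of the stored (base-2 logarithmic) values.
        mean = statistics.mean(self.rounds)
        spread = 0.0
        if len(self.rounds) > 1:
            spread = statistics.pstdev(self.rounds)
        return mean, spread

history = EstimateHistory(size=3)
for value in (1.0, 2.0, 3.0, 4.0):
    history.add(value)
print(history.summary()[0])  # 3.0, the mean of the last three rounds
@end example
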
4479@node libgnunetnse
4480@subsection libgnunetnse
4481
4482@c %**end of header
4483
4484The NSE subsystem has the simplest API of all services, with only two calls:
4485@code{GNUNET_NSE_connect} and @code{GNUNET_NSE_disconnect}.
4486
The connect call gets a callback function as a parameter, and this function is
called each time the network agrees on an estimate. This usually is once per
round, with some exceptions: if the closest peer has a late local clock and
starts spreading its ID after everyone else agreed on a value, the callback
might be activated twice in a round, the second value always being bigger than
the first. The default round time is set to 1 hour.
4493
4494The disconnect call disconnects from the NSE subsystem and the callback is no
4495longer called with new estimates.
4496
4497
4498
4499@menu
4500* Results::
4501* Examples2::
4502@end menu
4503
4504@node Results
4505@subsubsection Results
4506
4507@c %**end of header
4508
The callback provides two values: the average and the
@uref{http://en.wikipedia.org/wiki/Standard_deviation, standard deviation} of
the last 64 rounds. The values provided by the callback function are
logarithmic; this means that the real estimate numbers can be obtained by
calculating 2 to the power of the given value (2^average). From a statistics
point of view this means that:

@itemize @bullet
@item 68% of the time the real size is included in the interval
[2^(average-stddev), 2^(average+stddev)]
@item 95% of the time the real size is included in the interval
[2^(average-2*stddev), 2^(average+2*stddev)]
@item 99.7% of the time the real size is included in the interval
[2^(average-3*stddev), 2^(average+3*stddev)]
@end itemize
4524
The expected standard deviation for 64 rounds in a network of stable size is
0.2. Thus, we can say that normally:

@itemize @bullet
@item 68% of the time the real size is within [-13%, +15%] of the estimate
@item 95% of the time the real size is within [-24%, +32%] of the estimate
@item 99.7% of the time the real size is within [-34%, +52%] of the estimate
@end itemize
4533
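The percentages above follow directly from the logarithmic scale and can be
reproduced with a short calculation. The helper names @code{size_interval} and
@code{relative_range} are made up for this sketch.

@example
def size_interval(average, stddev, sigmas=1):
    # The callback values are base-2 logarithms, so the interval is
    # symmetric in the exponent, not in the number of peers.
    low = 2.0 ** (average - sigmas * stddev)
    high = 2.0 ** (average + sigmas * stddev)
    return low, high

def relative_range(stddev, sigmas=1):
    # Interval bounds as percentages relative to the estimate itself.
    low, high = size_interval(0.0, stddev, sigmas)
    return round((low - 1.0) * 100), round((high - 1.0) * 100)

for sigmas in (1, 2, 3):
    print(sigmas, relative_range(0.2, sigmas))
# prints 1 (-13, 15), then 2 (-24, 32), then 3 (-34, 52)
@end example

Because the interval is symmetric only in the exponent, the upper bound is
always further away from the estimate than the lower bound.
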
As said in the introduction, we can be quite sure that usually the real size is
between one third and three times the estimate. This can of course vary with
network conditions. Thus, applications may want to also consider the provided
standard deviation value, not only the average (in particular, if the standard
deviation is very high, the average may be meaningless: the network size is
changing rapidly).
4540
4541@node Examples2
@subsubsection Examples
4543
4544@c %**end of header
4545
Let's close with a couple of examples.

@table @asis

@item Average: 10, std dev: 1
Here the estimate would be 2^10 = 1024 peers. The range in which we can be 95%
sure is: [2^8, 2^12] = [256, 4096]. We can be very (>99.7%) sure that the
network is not a hundred peers and absolutely sure that it is not a million
peers, but somewhere around a thousand.

@item Average: 22, std dev: 0.2
Here the estimate would be 2^22 = 4 million peers. The range in which we can be
99.7% sure is: [2^21.4, 2^22.6] = [2.8M, 6.4M]. We can be sure that the network
size is around four million, with absolutely no way of it being 1 million.

@end table

To put this in perspective: some readers may remember that the LHC Higgs boson
results were announced with "5 sigma" and "6 sigma" certainties. For the second
example above, a 5 sigma minimum would be about 2 million (2^21) and a 6 sigma
minimum about 1.8 million (2^20.8).
4565
4566@node The NSE Client-Service Protocol
4567@subsection The NSE Client-Service Protocol
4568
4569@c %**end of header
4570
As with the API, the client-service protocol is very simple: it has only two
different messages, defined in @code{src/nse/nse.h}:

@itemize @bullet
@item @code{GNUNET_MESSAGE_TYPE_NSE_START}@ This message has no parameters and
is sent from the client to the service upon connection.
@item @code{GNUNET_MESSAGE_TYPE_NSE_ESTIMATE}@ This message is sent from the
service to the client for every new estimate and upon connection. It contains a
timestamp for the estimate, the average and the standard deviation for the
respective round.
@end itemize
4582
4583When the @code{GNUNET_NSE_disconnect} API call is executed, the client simply
4584disconnects from the service, with no message involved.
4585
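To make the contents of the @code{GNUNET_MESSAGE_TYPE_NSE_ESTIMATE} message
more concrete, the following sketch packs and unpacks such a message. The
field order, the field widths, the byte order and the numeric type code are
all assumptions made for this illustration; @code{src/nse/nse.h} remains the
authoritative definition.

@example
import struct

# Assumed layout: size, type, reserved, timestamp, average, stddev.
ESTIMATE_FORMAT = ">HHIQdd"
HYPOTHETICAL_TYPE_ESTIMATE = 1  # placeholder, not the real type code

def pack_estimate(timestamp_us, average, stddev):
    # The size field covers the whole message including the header.
    size = struct.calcsize(ESTIMATE_FORMAT)
    return struct.pack(ESTIMATE_FORMAT, size,
                       HYPOTHETICAL_TYPE_ESTIMATE, 0,
                       timestamp_us, average, stddev)

def unpack_estimate(data):
    fields = struct.unpack(ESTIMATE_FORMAT, data)
    size, mtype, _, timestamp_us, average, stddev = fields
    if size != len(data) or mtype != HYPOTHETICAL_TYPE_ESTIMATE:
        raise ValueError("not a well-formed estimate message")
    return timestamp_us, average, stddev

message = pack_estimate(123456, 10.0, 0.2)
print(unpack_estimate(message))  # (123456, 10.0, 0.2)
@end example
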
4586@node The NSE Peer-to-Peer Protocol
4587@subsection The NSE Peer-to-Peer Protocol
4588
4589@c %**end of header
4590
4591The NSE subsystem only has one message in the P2P protocol, the
4592@code{GNUNET_MESSAGE_TYPE_NSE_P2P_FLOOD} message.
4593
The key contents of this message are the timestamp to identify the round
(differences in system clocks may cause some peers to send messages way too
early or way too late, so the timestamp allows other peers to identify such
messages easily),
the @uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proof of work}
used to make it difficult to mount a
@uref{http://en.wikipedia.org/wiki/Sybil_attack, Sybil attack}, and the public
key, which is used to verify the signature on the message.
4601
4602Every peer stores a message for the previous, current and next round. The
4603messages for the previous and current round are given to peers that connect to
4604us. The message for the next round is simply stored until our system clock
4605advances to the next round. The message for the current round is what we are
4606flooding the network with right now. At the beginning of each round the peer
4607does the following:
4608
@itemize @bullet
@item calculates its own distance to the target value
@item creates, signs and stores the message for the current round (unless it
has a better message in the "next round" slot which came early in the previous
round)
@item calculates, based on the stored round message (own or received), when to
start flooding it to its neighbors
@end itemize
4617
4618Upon receiving a message the peer checks the validity of the message (round,
4619proof of work, signature). The next action depends on the contents of the
4620incoming message:
4621
4622@itemize @bullet
4623@item if the message is worse than the current stored message, the peer sends
4624the current message back immediately, to stop the other peer from spreading
4625suboptimal results
4626@item if the message is better than the current stored message, the peer stores
4627the new message and calculates the new target time to start spreading it to its
4628neighbors (excluding the one the message came from)
4629@item if the message is for the previous round, it is compared to the message
4630stored in the "previous round slot", which may then be updated
4631@item if the message is for the next round, it is compared to the message
4632stored in the "next round slot", which again may then be updated
4633@end itemize
4634
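The decision logic in the list above can be modelled as a small state machine.
In this sketch a message is reduced to its round number and its distance to
the target (smaller is better); the class name @code{FloodState} and the
returned action strings are hypothetical, and the validity checks are assumed
to have happened already.

@example
class FloodState:
    def __init__(self, current_round):
        self.current_round = current_round
        # Best known distance for the previous, current and next round.
        self.best = dict()

    def handle(self, msg_round, distance):
        # Only the previous, current and next round have a slot.
        if abs(msg_round - self.current_round) > 1:
            return "drop"
        known = self.best.get(msg_round)
        if known is not None and distance >= known:
            # A worse message for the current round is answered with
            # our stored one; duplicates and other rounds are ignored.
            if distance > known and msg_round == self.current_round:
                return "reply-with-stored"
            return "ignore"
        # The incoming message is better: update the matching slot.
        self.best[msg_round] = distance
        if msg_round == self.current_round:
            return "schedule-flood"
        return "stored"

state = FloodState(current_round=5)
print(state.handle(5, 40))  # schedule-flood
print(state.handle(5, 50))  # reply-with-stored
print(state.handle(6, 10))  # stored
@end example
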
Finally, when the time comes to send the stored message for the current round
to the neighbors, a random delay is added for each neighbor, to avoid traffic
spikes and minimize cross-messages.
4638
4639@node GNUnet's HOSTLIST subsystem
4640@section GNUnet's HOSTLIST subsystem
4641
4642@c %**end of header
4643
Peers in the GNUnet overlay network need address information so that they can
connect with other peers. GNUnet uses so-called HELLO messages to store and
exchange peer addresses. GNUnet provides several methods for peers to obtain
this information:
4648
4649@itemize @bullet
4650@item out-of-band exchange of HELLO messages (manually, using for example
4651gnunet-peerinfo)
4652@item HELLO messages shipped with GNUnet (automatic with distribution)
4653@item UDP neighbor discovery in LAN (IPv4 broadcast, IPv6 multicast)
4654@item topology gossiping (learning from other peers we already connected to),
4655and
4656@item the HOSTLIST daemon covered in this section, which is particularly
4657relevant for bootstrapping new peers.
4658@end itemize
4659
4660New peers have no existing connections (and thus cannot learn from gossip among
4661peers), may not have other peers in their LAN and might be started with an
4662outdated set of HELLO messages from the distribution. In this case, getting new
4663peers to connect to the network requires either manual effort or the use of a
4664HOSTLIST to obtain HELLOs.
4665
4666@menu
4667* HELLOs::
4668* Overview for the HOSTLIST subsystem::
4669* Interacting with the HOSTLIST daemon::
4670* Hostlist security address validation::
4671* The HOSTLIST daemon::
4672* The HOSTLIST server::
4673* The HOSTLIST client::
4674* Usage::
4675@end menu
4676
4677@node HELLOs
4678@subsection HELLOs
4679
4680@c %**end of header
4681
The basic information peers require to connect to other peers is contained in
so-called HELLO messages, which you can think of as business cards. Besides the
identity of the peer (based on the cryptographic public key), a HELLO message
may contain address information that specifies ways to contact a peer. By
obtaining HELLO messages, a peer can learn how to contact other peers.
4687
4688@node Overview for the HOSTLIST subsystem
4689@subsection Overview for the HOSTLIST subsystem
4690
4691@c %**end of header
4692
The HOSTLIST subsystem provides a way to distribute and obtain contact
information to connect to other peers using a simple HTTP GET request. Its
implementation is split into three parts: the main file for the daemon itself
(gnunet-daemon-hostlist.c), the HTTP client used to download peer information
(hostlist-client.c) and the server component used to provide this information
to other peers (hostlist-server.c). The server is basically a small HTTP web
server (based on GNU libmicrohttpd) which provides a list of HELLOs known to
the local peer for download. The client component is basically an HTTP client
(based on libcurl) which can download hostlists from one or more websites. The
hostlist format is a binary blob containing a sequence of HELLO messages. Note
that any HTTP server can theoretically serve a hostlist; the built-in hostlist
server simply makes it convenient to offer this service.
4705
4706
4707@menu
4708* Features::
4709* Limitations2::
4710@end menu
4711
4712@node Features
4713@subsubsection Features
4714
4715@c %**end of header
4716
4717The HOSTLIST daemon can:
4718
@itemize @bullet
@item provide HELLO messages with validated addresses obtained from PEERINFO
for other peers to download
@item download HELLO messages and forward these messages to the TRANSPORT
subsystem for validation
@item advertise the URL of this peer's hostlist to other peers via
gossip
@item automatically learn about hostlist servers from the gossip of other peers
@end itemize
4728
4729@node Limitations2
@subsubsection Limitations
4731
4732@c %**end of header
4733
4734The HOSTLIST daemon does not:
4735
4736@itemize @bullet
4737@item verify the cryptographic information in the HELLO messages
4738@item verify the address information in the HELLO messages
4739@end itemize
4740
4741@node Interacting with the HOSTLIST daemon
4742@subsection Interacting with the HOSTLIST daemon
4743
4744@c %**end of header
4745
4746The HOSTLIST subsystem is currently implemented as a daemon, so there is no
4747need for the user to interact with it and therefore there is no command line
4748tool and no API to communicate with the daemon. In the future, we can envision
4749changing this to allow users to manually trigger the download of a hostlist.
4750
4751Since there is no command line interface to interact with HOSTLIST, the only
4752way to interact with the hostlist is to use STATISTICS to obtain or modify
4753information about the status of HOSTLIST:
4754@example
4755$ gnunet-statistics -s hostlist
4756@end example
4757
In particular, HOSTLIST includes a @strong{persistent} value in statistics that
specifies when the hostlist server might be queried next. As this value
increases exponentially during runtime, developers may want to reset or
manually adjust it. Note that HOSTLIST (but not STATISTICS) needs to be
shut down if changes to this value are to have any effect on the daemon (as
HOSTLIST does not monitor STATISTICS for changes to the download
frequency).
4765
4766@node Hostlist security address validation
4767@subsection Hostlist security address validation
4768
4769@c %**end of header
4770
4771Since information obtained from other parties cannot be trusted without
4772validation, we have to distinguish between @emph{validated} and @emph{not
4773validated} addresses. Before using (and so trusting) information from other
4774parties, this information has to be double-checked (validated). Address
4775validation is not done by HOSTLIST but by the TRANSPORT service.
4776
The HOSTLIST component is functionally located between the PEERINFO and the
TRANSPORT subsystems. When acting as a server, the daemon obtains valid
(@emph{validated}) peer information (HELLO messages) from the PEERINFO service
and provides it to other peers. When acting as a client, it contacts the
HOSTLIST servers specified in the configuration, downloads the (unvalidated)
list of HELLO messages and forwards this information to the TRANSPORT service
to validate the addresses.
4784
4785@node The HOSTLIST daemon
4786@subsection The HOSTLIST daemon
4787
4788@c %**end of header
4789
4790The hostlist daemon is the main component of the HOSTLIST subsystem. It is
4791started by the ARM service and (if configured) starts the HOSTLIST client and
4792server components.
4793
If the daemon provides a hostlist itself, it can advertise its own hostlist to
other peers. To do so, it sends a GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT
message to other peers when they connect to this peer on the CORE level. This
hostlist advertisement message contains the URL to access the HOSTLIST HTTP
server of the sender. The daemon may also subscribe to this type of message
from the CORE service, and then forward this kind of message to the HOSTLIST
client. The client then uses all available URLs to download peer information
when necessary.

When starting, the HOSTLIST daemon first connects to the CORE subsystem and, if
hostlist learning is enabled, registers a CORE handler to receive this kind of
message. Next it starts (if configured) the client and server. It passes
pointers to the CORE connect, disconnect and receive handlers, where the client
and server store their functions, so the daemon can notify them about CORE
events.
4809
4810To clean up on shutdown, the daemon has a cleaning task, shutting down all
4811subsystems and disconnecting from CORE.
4812
4813@node The HOSTLIST server
4814@subsection The HOSTLIST server
4815
4816@c %**end of header
4817
4818The server provides a way for other peers to obtain HELLOs. Basically it is a
4819small web server other peers can connect to and download a list of HELLOs using
4820standard HTTP; it may also advertise the URL of the hostlist to other peers
4821connecting on CORE level.
4822
4823
4824@menu
4825* The HTTP Server::
4826* Advertising the URL::
4827@end menu
4828
4829@node The HTTP Server
4830@subsubsection The HTTP Server
4831
4832@c %**end of header
4833
4834During startup, the server starts a web server listening on the port specified
4835with the HTTPPORT value (default 8080). In addition it connects to the PEERINFO
4836service to obtain peer information. The HOSTLIST server uses the
4837GNUNET_PEERINFO_iterate function to request HELLO information for all peers and
4838adds their information to a new hostlist if they are suitable (expired
4839addresses and HELLOs without addresses are both not suitable) and the maximum
4840size for a hostlist is not exceeded (MAX_BYTES_PER_HOSTLISTS = 500000). When
4841PEERINFO finishes (with a last NULL callback), the server destroys the previous
4842hostlist response available for download on the web server and replaces it with
4843the updated hostlist. The hostlist format is basically a sequence of HELLO
4844messages (as obtained from PEERINFO) without any special tokenization. Since
4845each HELLO message contains a size field, the response can easily be split into
4846separate HELLO messages by the client.
4847
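Both directions of this format can be sketched in a few lines: the server side
concatenates suitable HELLOs up to the size limit, and the client side walks
the blob using the size field. The 4-byte header layout (16-bit size followed
by a 16-bit type, both big-endian) is an assumption of this sketch, and the
function names are hypothetical.

@example
import struct

MAX_BYTES_PER_HOSTLISTS = 500000
HEADER = struct.Struct(">HH")  # assumed: 16-bit size, 16-bit type

def build_hostlist(hellos, limit=MAX_BYTES_PER_HOSTLISTS):
    # Concatenate HELLOs, skipping any that would exceed the limit.
    blob = b""
    for hello in hellos:
        if len(blob) + len(hello) > limit:
            continue
        blob += hello
    return blob

def split_hostlist(blob):
    # Walk the blob using the size field at the start of each message.
    messages = []
    offset = 0
    while offset + HEADER.size <= len(blob):
        size, _ = HEADER.unpack_from(blob, offset)
        if size < HEADER.size or offset + size > len(blob):
            break  # truncated or malformed trailing data
        messages.append(blob[offset:offset + size])
        offset += size
    return messages
@end example

No separators are needed between messages because each message announces its
own length, which is what "without any special tokenization" amounts to.
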
A HOSTLIST client connecting to the HOSTLIST server will receive the hostlist
as an HTTP response and the server will terminate the connection with the
result code HTTP 200 OK. The connection will be closed immediately if no
hostlist is available.
4852
4853@node Advertising the URL
4854@subsubsection Advertising the URL
4855
4856@c %**end of header
4857
4858The server also advertises the URL to download the hostlist to other peers if
4859hostlist advertisement is enabled. When a new peer connects and has hostlist
4860learning enabled, the server sends a GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT
4861message to this peer using the CORE service.
4862
4863@node The HOSTLIST client
4864@subsection The HOSTLIST client
4865
4866@c %**end of header
4867
4868The client provides the functionality to download the list of HELLOs from a set
4869of URLs. It performs a standard HTTP request to the URLs configured and learned
4870from advertisement messages received from other peers. When a HELLO is
4871downloaded, the HOSTLIST client forwards the HELLO to the TRANSPORT service for
4872validation.
4873
4874The client supports two modes of operation: download of HELLOs (bootstrapping)
4875and learning of URLs.
4876
4877
4878@menu
4879* Bootstrapping::
4880* Learning::
4881@end menu
4882
4883@node Bootstrapping
4884@subsubsection Bootstrapping
4885
4886@c %**end of header
4887
For bootstrapping, the client schedules a task to download the hostlist from
the set of known URLs. The downloads are only performed if the number of
current connections is smaller than a minimum number of connections (at the
moment 4). The interval between downloads increases exponentially; however, the
exponential growth is limited once the interval becomes longer than an hour. At
that point, the interval is capped at (number of connections * 1h).
4894
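The download schedule described above can be sketched as follows. The starting
interval of one second and the lower bound of one hour on the cap (so that a
peer without connections does not end up with a cap of zero) are assumptions
of this sketch, and the function names are hypothetical.

@example
HOUR = 3600.0
MIN_CONNECTIONS = 4

def should_download(connections):
    # Bootstrapping is only needed while the peer is poorly connected.
    return connections < MIN_CONNECTIONS

def next_interval(current, connections):
    # Double the interval, starting from an assumed one second.
    doubled = current * 2.0 if current > 0 else 1.0
    # Cap at connections * 1h, but never below one hour (assumption).
    cap = max(1, connections) * HOUR
    return min(doubled, cap)

interval = 0.0
for _ in range(15):
    interval = next_interval(interval, connections=3)
print(interval)  # 10800.0, that is three hours for three connections
@end example
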
Once the decision has been taken to download HELLOs, the daemon chooses a
random URL from the list of known URLs. URLs can be configured in the
configuration or be learned from advertisement messages. The client uses an
HTTP client library (libcurl) to initiate the download using the libcurl multi
interface. Libcurl passes the data to the callback_download function, which
stores the data in a buffer if space is available and the maximum size for a
hostlist download is not exceeded (MAX_BYTES_PER_HOSTLISTS = 500000). When a
full HELLO has been downloaded, the HOSTLIST client offers this HELLO message
to the TRANSPORT service for validation. When the download has finished or
failed, statistical information about the quality of this URL is updated.
4905
4906@node Learning
4907@subsubsection Learning
4908
4909@c %**end of header
4910
4911The client also manages hostlist advertisements from other peers. The HOSTLIST
4912daemon forwards GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT messages to the
4913client subsystem, which extracts the URL from the message. Next, a test of the
4914newly obtained URL is performed by triggering a download from the new URL. If
4915the URL works correctly, it is added to the list of working URLs.
4916
The size of the list of URLs is restricted, so if an additional server is added
and the list is full, the URL with the worst quality ranking (determined, for
example, through the number of successful downloads and the number of HELLOs
obtained) is discarded. During shutdown the list of URLs is saved to a file for
persistence and loaded on startup. URLs from the configuration file are never
discarded.
4922
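A bounded list of learned URLs with quality-based eviction could be modelled
like this. The class name @code{HostlistUrls}, the scoring rule and the
decision to apply the size limit only to learned URLs are assumptions made for
this sketch.

@example
class HostlistUrls:
    def __init__(self, configured, capacity):
        # URLs from the configuration file are kept unconditionally.
        self.configured = list(configured)
        self.capacity = capacity
        self.learned = dict()  # maps URL to quality score

    def learn(self, url, quality=0):
        if url in self.configured or url in self.learned:
            return
        if len(self.learned) >= self.capacity:
            # Evict the learned URL with the worst quality ranking.
            worst = min(self.learned, key=self.learned.get)
            del self.learned[worst]
        self.learned[url] = quality

    def record_download(self, url, hellos):
        # Hypothetical scoring: one point per download plus the
        # number of HELLOs obtained from it.
        if url in self.learned:
            self.learned[url] += 1 + hellos

urls = HostlistUrls(["http://a.example/"], capacity=2)
urls.learn("http://b.example/")
urls.learn("http://c.example/")
urls.record_download("http://b.example/", hellos=3)
urls.learn("http://d.example/")
print(sorted(urls.learned))  # b and d remain, c was evicted
@end example
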
4923@node Usage
4924@subsection Usage
4925
4926@c %**end of header
4927
4928To start HOSTLIST by default, it has to be added to the DEFAULTSERVICES section
4929for the ARM services. This is done in the default configuration.
4930
For more information on how to configure the HOSTLIST subsystem, see the
sections "Configuring the hostlist to bootstrap" and "Configuring your peer to
provide a hostlist" in the installation handbook.
4934
4935@node GNUnet's IDENTITY subsystem
4936@section GNUnet's IDENTITY subsystem
4937
4938@c %**end of header
4939
4940Identities of "users" in GNUnet are called egos. Egos can be used as pseudonyms
4941(fake names) or be tied to an organization (for example, GNU) or even the
4942actual identity of a human. GNUnet users are expected to have many egos. They
4943might have one tied to their real identity, some for organizations they manage,
4944and more for different domains where they want to operate under a pseudonym.
4945
The IDENTITY service allows users to manage their egos. The identity service
manages the private keys of the egos of the local user; it does not manage
identities of other users (public keys). Public keys for other users need names
to become manageable. GNUnet uses the GNU Name System (GNS) to give names to
other users and manage their public keys securely. This chapter is about the
IDENTITY service, which is about the management of private keys.
4952
4953On the network, an ego corresponds to an ECDSA key (over Curve25519, using RFC
49546979, as required by GNS). Thus, users can perform actions under a particular
4955ego by using (signing with) a particular private key. Other users can then
4956confirm that the action was really performed by that ego by checking the
4957signature against the respective public key.
4958
4959The IDENTITY service allows users to associate a human-readable name with each
4960ego. This way, users can use names that will remind them of the purpose of a
4961particular ego. The IDENTITY service will store the respective private keys and
4962allows applications to access key information by name. Users can change the
4963name that is locally (!) associated with an ego. Egos can also be deleted,
4964which means that the private key will be removed and it thus will not be
4965possible to perform actions with that ego in the future.
4966
4967Additionally, the IDENTITY subsystem can associate service functions with egos.
4968For example, GNS requires the ego that should be used for the shorten zone. GNS
4969will ask IDENTITY for an ego for the "gns-short" service. The IDENTITY service
4970has a mapping of such service strings to the name of the ego that the user
4971wants to use for this service, for example "my-short-zone-ego".
4972
4973Finally, the IDENTITY API provides access to a special ego, the anonymous ego.
4974The anonymous ego is special in that its private key is not really private, but
4975fixed and known to everyone. Thus, anyone can perform actions as anonymous.
4976This can be useful as with this trick, code does not have to contain a special
4977case to distinguish between anonymous and pseudonymous egos.
4978
4979@menu
4980* libgnunetidentity::
4981* The IDENTITY Client-Service Protocol::
4982@end menu
4983
4984@node libgnunetidentity
4985@subsection libgnunetidentity
4986@c %**end of header
4987
4988
4989@menu
4990* Connecting to the service::
4991* Operations on Egos::
4992* The anonymous Ego::
4993* Convenience API to lookup a single ego::
4994* Associating egos with service functions::
4995@end menu
4996
4997@node Connecting to the service
4998@subsubsection Connecting to the service
4999
5000@c %**end of header
5001
5002First, typical clients connect to the identity service using
5003@code{GNUNET_IDENTITY_connect}. This function takes a callback as a parameter.
5004If the given callback parameter is non-null, it will be invoked to notify the
5005application about the current state of the identities in the system.
5006
5007@itemize @bullet
5008@item First, it will be invoked on all known egos at the time of the
5009connection. For each ego, a handle to the ego and the user's name for the ego
5010will be passed to the callback. Furthermore, a @code{void **} context argument
5011will be provided which gives the client the opportunity to associate some state
5012with the ego.
5013@item Second, the callback will be invoked with NULL for the ego, the name and
5014the context. This signals that the (initial) iteration over all egos has
5015completed.
5016@item Then, the callback will be invoked whenever something changes about an
5017ego. If an ego is renamed, the callback is invoked with the ego handle of the
5018ego that was renamed, and the new name. If an ego is deleted, the callback is
5019invoked with the ego handle and a name of NULL. In the deletion case, the
5020application should also release resources stored in the context.
5021@item When the application destroys the connection to the identity service
5022using @code{GNUNET_IDENTITY_disconnect}, the callback is again invoked with the
5023ego and a name of NULL (equivalent to deletion of the egos). This should again
5024be used to clean up the per-ego context.
5025@end itemize
5026
5027The ego handle passed to the callback remains valid until the callback is
5028invoked with a name of NULL, so it is safe to store a reference to the ego's
5029handle.
5030
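The callback life cycle described above can be modelled in a few lines. This is
a Python model of the behaviour, not the C API: ego handles are represented by
arbitrary hashable values, the per-ego context is folded into a dictionary, and
the class name @code{EgoTracker} is hypothetical.

@example
class EgoTracker:
    def __init__(self):
        self.names = dict()  # maps ego handle to current name
        self.initial_done = False

    def callback(self, ego, name):
        if ego is None:
            # NULL ego and name: the initial iteration is complete.
            self.initial_done = True
        elif name is None:
            # Deletion or disconnect: release the per-ego state.
            self.names.pop(ego, None)
        else:
            # Newly announced ego or rename: keep the latest name.
            self.names[ego] = name

tracker = EgoTracker()
tracker.callback("ego-1", "alice")  # initial iteration
tracker.callback(None, None)        # end of initial iteration
tracker.callback("ego-1", "bob")    # rename
print(tracker.names["ego-1"])       # bob
tracker.callback("ego-1", None)     # deletion
print(len(tracker.names))           # 0
@end example
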
5031@node Operations on Egos
5032@subsubsection Operations on Egos
5033
5034@c %**end of header
5035
5036Given an ego handle, the main operations are to get its associated private key
5037using @code{GNUNET_IDENTITY_ego_get_private_key} or its associated public key
5038using @code{GNUNET_IDENTITY_ego_get_public_key}.
5039
5040The other operations on egos are pretty straightforward. Using
5041@code{GNUNET_IDENTITY_create}, an application can request the creation of an
5042ego by specifying the desired name. The operation will fail if that name is
5043already in use. Using @code{GNUNET_IDENTITY_rename} the name of an existing ego
5044can be changed. Finally, egos can be deleted using
5045@code{GNUNET_IDENTITY_delete}. All of these operations will trigger updates to
5046the callback given to the @code{GNUNET_IDENTITY_connect} function of all
5047applications that are connected with the identity service at the time.
5048@code{GNUNET_IDENTITY_cancel} can be used to cancel the operations before the
5049respective continuations would be called. It is not guaranteed that the
5050operation will not be completed anyway, only the continuation will no longer be
5051called.
5052
5053@node The anonymous Ego
5054@subsubsection The anonymous Ego
5055
5056@c %**end of header
5057
5058A special way to obtain an ego handle is to call
5059@code{GNUNET_IDENTITY_ego_get_anonymous}, which returns an ego for the
5060"anonymous" user --- anyone knows and can get the private key for this user, so
5061it is suitable for operations that are supposed to be anonymous but require
5062signatures (for example, to avoid a special path in the code). The anonymous
5063ego is always valid and accessing it does not require a connection to the
5064identity service.
5065
5066@node Convenience API to lookup a single ego
5067@subsubsection Convenience API to lookup a single ego
5068
5069
As applications commonly simply have to look up a single ego, there is a
convenience API to do just that. Use @code{GNUNET_IDENTITY_ego_lookup} to
look up a single ego by name. Note that this is the user's name for the ego, not
the service function. The resulting ego will be returned via a callback and
will only be valid during that callback. The operation can be cancelled via
@code{GNUNET_IDENTITY_ego_lookup_cancel} (cancellation is only legal before the
callback is invoked).
5077
5078@node Associating egos with service functions
5079@subsubsection Associating egos with service functions
5080
5081
The @code{GNUNET_IDENTITY_set} function is used to associate a particular ego
with a service function. The name used by the service and the ego are given as
arguments. Afterwards, the service can use its name to look up the associated
ego using @code{GNUNET_IDENTITY_get}.
5086
5087@node The IDENTITY Client-Service Protocol
5088@subsection The IDENTITY Client-Service Protocol
5089
5090@c %**end of header
5091
5092A client connecting to the identity service first sends a message with type
5093@code{GNUNET_MESSAGE_TYPE_IDENTITY_START} to the service. After that, the
5094client will receive information about changes to the egos by receiving messages
5095of type @code{GNUNET_MESSAGE_TYPE_IDENTITY_UPDATE}. Those messages contain the
5096private key of the ego and the user's name of the ego (or zero bytes for the
5097name to indicate that the ego was deleted). A special bit @code{end_of_list} is
5098used to indicate the end of the initial iteration over the identity service's
5099egos.
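
The way a client might fold UPDATE messages into a local table of egos can be
sketched as follows. This is only an illustration: @code{struct Update},
@code{struct EgoTable} and @code{apply_update} are hypothetical names, the
private key is reduced to a string, and the sketch assumes that the message
carrying @code{end_of_list} does not itself describe an ego. The real wire
format differs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_EGOS 8

/* Hypothetical decoded form of an UPDATE message; the wire format differs. */
struct Update {
  const char *private_key; /* stand-in for the binary private key */
  const char *name;        /* "" signals that the ego was deleted */
  int end_of_list;         /* set on the message ending the initial iteration */
};

/* Hypothetical client-side view of the egos known so far. */
struct EgoTable {
  const char *keys[MAX_EGOS];
  const char *names[MAX_EGOS];
  size_t n;
  int synced; /* becomes 1 once end_of_list was seen */
};

/* Apply one update: mark the table as synced, delete, add or rename an ego. */
void apply_update(struct EgoTable *t, const struct Update *u) {
  size_t i;

  assert(NULL != t && NULL != u);
  if (u->end_of_list) { /* assumption: this message carries no ego */
    t->synced = 1;
    return;
  }
  for (i = 0; i < t->n; i++)
    if (0 == strcmp(t->keys[i], u->private_key))
      break;
  if ('\0' == u->name[0]) { /* empty name: the ego was deleted */
    if (i < t->n) {
      t->keys[i] = t->keys[t->n - 1];
      t->names[i] = t->names[t->n - 1];
      t->n--;
    }
    return;
  }
  if (i == t->n) { /* previously unknown ego */
    if (MAX_EGOS == t->n)
      return; /* table full: ignored in this sketch */
    t->keys[t->n++] = u->private_key;
  }
  t->names[i] = u->name; /* also covers renames of a known ego */
}
```

A client would call @code{apply_update} once per received message and treat
its table as complete once @code{synced} is set.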
5100
5101The client can trigger changes to the egos by sending CREATE, RENAME or DELETE
5102messages. The CREATE message contains the private key and the desired name. The
5103RENAME message contains the old name and the new name. The DELETE message only
5104needs to include the name of the ego to delete. The service responds to each of
5105these messages with a RESULT_CODE message which indicates success or error of
5106the operation, and possibly a human-readable error message.
5107
5108Finally, the client can bind the name of a service function to an ego by
5109sending a SET_DEFAULT message with the name of the service function and the
5110private key of the ego. Such bindings can then be resolved using a GET_DEFAULT
5111message, which includes the name of the service function. The identity service
5112will respond to a GET_DEFAULT request with a SET_DEFAULT message containing the
5113respective information, or with a RESULT_CODE to indicate an error.
5114
5115@node GNUnet's NAMESTORE Subsystem
5116@section GNUnet's NAMESTORE Subsystem
5117
5118@c %**end of header
5119
5120The NAMESTORE subsystem provides persistent storage for local GNS zone
5121information. All local GNS zone information is managed by NAMESTORE. It
5122provides the functionality both to administer local GNS information (e.g.
5123delete and add records) and to retrieve GNS information (e.g. to list
5124name information in a client). NAMESTORE only manages the persistent
5125storage of zone information belonging to the user running the service: GNS
5126information from other users obtained from the DHT is stored by the NAMECACHE
5127subsystem.
5128
5129NAMESTORE uses a plugin-based database backend to store GNS information with
5130good performance. The supported database backends are sqlite, MySQL and
5131PostgreSQL. NAMESTORE clients interact with the IDENTITY subsystem to obtain
5132cryptographic information about zones based on egos, as described for the
5133IDENTITY subsystem, but internally NAMESTORE refers to zones using the ECDSA
5134private key. In addition, it collaborates with the NAMECACHE subsystem and
5135stores zone information in the GNS cache when local information is modified, to
5136increase look-up performance for local information.
5137
5138NAMESTORE provides functionality to look-up and store records, to iterate over
5139a specific or all zones and to monitor zones for changes. NAMESTORE
5140functionality can be accessed using the NAMESTORE API or the NAMESTORE command
5141line tool.
5142
5143@menu
5144* libgnunetnamestore::
5145@end menu
5146
5147@node libgnunetnamestore
5148@subsection libgnunetnamestore
5149
5150@c %**end of header
5151
5152To interact with NAMESTORE, clients first connect to the NAMESTORE service using
5153the @code{GNUNET_NAMESTORE_connect} function, passing a configuration handle. As
5154a result they obtain a NAMESTORE handle that they can use for operations, or
5155NULL if the connection failed.
5156
5157To disconnect from NAMESTORE, clients use @code{GNUNET_NAMESTORE_disconnect}
5158and specify the handle to disconnect.
5159
5160NAMESTORE internally uses the ECDSA private key to refer to zones. These
5161private keys can be obtained from the IDENTITY subsystem. Here @emph{egos}
5162can be used to refer to zones, or the default ego assigned to the GNS subsystem
5163can be used to obtain the master zone's private key.
5164
5165
5166@menu
5167* Editing Zone Information::
5168* Iterating Zone Information::
5169* Monitoring Zone Information::
5170@end menu
5171
5172@node Editing Zone Information
5173@subsubsection Editing Zone Information
5174
5175@c %**end of header
5176
5177NAMESTORE provides functions to look up records stored under a label in a zone
5178and to store records under a label in a zone.
5179
5180To store (and delete) records, the client uses the
5181@code{GNUNET_NAMESTORE_records_store} function and has to provide the namestore
5182handle to use, the private key of the zone, the label to store the records
5183under, the records and the number of records, plus a callback function. After
5184the operation is performed, NAMESTORE will call the provided callback function
5185with the result: GNUNET_SYSERR on failure (including timeout/queue drop/failure
5186to validate), GNUNET_NO if content was already there or not found, or GNUNET_YES
5187(or another positive value) on success, plus an additional error message.
5188
5189Records are deleted by using the store command with 0 records to store. It is
5190important to note that records are not merged when records already exist under
5191the label. A client therefore first has to retrieve the existing records, merge
5192them with the new records and then store the result.
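
Because a store operation replaces everything under a label, the merge step is
the client's job. A minimal sketch of that step, assuming a simplified record
type, is given here. @code{struct NsRecord} and @code{merge_records} are
hypothetical and not part of the NAMESTORE API; real GNS records carry more
fields than shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record: real GNS records carry more fields. */
struct NsRecord {
  int type;          /* numeric record type */
  const char *value; /* record payload as text, for illustration only */
};

/* Returns 1 if rd is already present in set[0..n). */
static int contains(const struct NsRecord *set, size_t n,
                    const struct NsRecord *rd) {
  for (size_t i = 0; i < n; i++)
    if (set[i].type == rd->type && 0 == strcmp(set[i].value, rd->value))
      return 1;
  return 0;
}

/* Copy the existing records to out, then append those records from added
   that are not yet present. Returns the number of records in out, which is
   what the client would pass to the store operation. */
size_t merge_records(const struct NsRecord *existing, size_t n_existing,
                     const struct NsRecord *added, size_t n_added,
                     struct NsRecord *out, size_t cap) {
  size_t n = 0;

  assert(NULL != out);
  assert(cap >= n_existing + n_added); /* caller provides enough room */
  for (size_t i = 0; i < n_existing; i++)
    out[n++] = existing[i];
  for (size_t i = 0; i < n_added; i++)
    if (!contains(out, n, &added[i]))
      out[n++] = added[i];
  return n;
}
```

The client would look up the current records for the label, call a helper like
this one, and store the merged array under the same label.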
5193
5194To perform a lookup operation, the client uses the
5195@code{GNUNET_NAMESTORE_records_lookup} function. Here the client has to pass the
5196namestore handle, the private key of the zone and the label. It also has to
5197provide a callback function which will be called with the result of the lookup
5198operation: the zone for the records, the label, and the records including the
5199number of records included.
5200
5201A special operation is used to set the preferred nickname for a zone. This
5202nickname is stored with the zone and is automatically merged with all labels
5203and records stored in a zone. Here the client uses the
5204@code{GNUNET_NAMESTORE_set_nick} function and passes the private key of the
5205zone, the nickname as a string, plus a callback to be called with the result of the
5206operation.
5207
5208@node Iterating Zone Information
5209@subsubsection Iterating Zone Information
5210
5211@c %**end of header
5212
5213A client can iterate over all information in a zone or all zones managed by
5214NAMESTORE. Here a client uses the @code{GNUNET_NAMESTORE_zone_iteration_start}
5215function and passes the namestore handle, the zone to iterate over and a
5216callback function to call with the result. If the client wants to iterate over
5217all zones, it passes NULL for the zone. A @code{GNUNET_NAMESTORE_ZoneIterator}
5218handle is returned to be used to continue iteration.
5219
5220NAMESTORE calls the callback for every result and expects the client to call
5221@code{GNUNET_NAMESTORE_zone_iterator_next} to continue to iterate or
5222@code{GNUNET_NAMESTORE_zone_iteration_stop} to interrupt the iteration. When
5223NAMESTORE reaches the last item, it will call the callback with a NULL value to
5224indicate the end of the iteration.
5225
5226@node Monitoring Zone Information
5227@subsubsection Monitoring Zone Information
5228
5229@c %**end of header
5230
5231Clients can also monitor zones to be notified about changes. Here the client
5232uses the @code{GNUNET_NAMESTORE_zone_monitor_start} function and passes the
5233private key of the zone and a callback function to call with updates for a
5234zone. The client can specify to obtain zone information first by iterating over
5235the zone and specify a synchronization callback to be called when the client
5236and the namestore are synced.
5237
5238On an update, NAMESTORE will call the callback with the private key of the
5239zone, the label and the records and their number.
5240
5241To stop monitoring, the client calls @code{GNUNET_NAMESTORE_zone_monitor_stop}
5242and passes the handle obtained from the function to start the monitoring.
5243
5244@node GNUnet's PEERINFO subsystem
5245@section GNUnet's PEERINFO subsystem
5246
5247@c %**end of header
5248
5249The PEERINFO subsystem is used to store verified (validated) information about
5250known peers in a persistent way. It obtains these addresses for example from the
5251TRANSPORT service, which is in charge of address validation. Validation means
5252that the information in the HELLO message is checked by connecting to the
5253addresses and performing a cryptographic handshake to authenticate the peer
5254instance claiming to be reachable at these addresses. PEERINFO does not
5255validate the HELLO messages itself but only stores them and gives them to
5256interested clients.
5257
5258As future work, we are considering moving from storing just HELLO messages to
5259providing a generic persistent per-peer information store. More and more
5260subsystems need to store per-peer information in a persistent way. To avoid
5261duplicating this functionality, we plan to provide a PEERSTORE service
5262offering it.
5263
5264@menu
5265* Features2::
5266* Limitations3::
5267* DeveloperPeer Information::
5268* Startup::
5269* Managing Information::
5270* Obtaining Information::
5271* The PEERINFO Client-Service Protocol::
5272* libgnunetpeerinfo::
5273@end menu
5274
5275@node Features2
5276@subsection Features
5277
5278@c %**end of header
5279
5280@itemize @bullet
5281@item Persistent storage
5282@item Client notification mechanism on update
5283@item Periodic clean up for expired information
5284@item Differentiation between public and friend-only HELLO
5285@end itemize
5286
5287@node Limitations3
5288@subsection Limitations
5289
5290
5291@itemize @bullet
5292@item Does not perform HELLO validation
5293@end itemize
5294
5295@node DeveloperPeer Information
5296@subsection Peer Information
5297
5298@c %**end of header
5299
5300The PEERINFO subsystem stores this information in the form of HELLO messages,
5301which you can think of as business cards. They contain the public key
5302of a peer and the addresses a peer can be reached under. The addresses include
5303an expiration date describing how long they are valid. This information is
5304updated regularly by the TRANSPORT service by revalidating the address. If an
5305address is expired and not renewed, it can be removed from the HELLO message.
5306
5307Some peers do not want to have their HELLO messages distributed to other peers,
5308especially when GNUnet's friend-to-friend mode is enabled. To prevent this
5309undesired distribution, PEERINFO distinguishes between @emph{public} and
5310@emph{friend-only} HELLO messages. Public HELLO messages can be freely
5311distributed to other (possibly unknown) peers (for example using the hostlist,
5312gossiping, broadcasting), whereas friend-only HELLO messages may not be
5313distributed to other peers. Friend-only HELLO messages have an additional flag
5314@code{friend_only} set internally. For public HELLO messages this flag is not
5315set. PEERINFO does not and cannot check if a client is allowed to obtain a
5316specific HELLO type.
5317
5318The HELLO messages can be managed using the GNUnet HELLO library. Other GNUnet
5319systems can obtain this information from PEERINFO and use it for their
5320purposes. Clients are for example the HOSTLIST component, which provides this
5321information to other peers in the form of a hostlist, or the TRANSPORT subsystem,
5322which uses this information to maintain connections to other peers.
5323
5324@node Startup
5325@subsection Startup
5326
5327@c %**end of header
5328
5329During startup the PEERINFO service loads persistent HELLOs from disk. First
5330PEERINFO parses the directory configured in the HOSTS value of the
5331@code{PEERINFO} configuration section to store PEERINFO information. From all
5332files found in this directory, valid HELLO messages are extracted. In addition
5333it loads HELLO messages shipped with the GNUnet distribution. These HELLOs are
5334used to simplify network bootstrapping by providing valid peer information with
5335the distribution. The use of these HELLOs can be prevented by setting the
5336@code{USE_INCLUDED_HELLOS} option in the @code{PEERINFO} configuration section
5337to @code{NO}. Files containing invalid information are removed.
5338
5339@node Managing Information
5340@subsection Managing Information
5341
5342@c %**end of header
5343
5344The PEERINFO service stores information about known peers and a single HELLO
5345message for every peer. A peer does not need to have a HELLO if no information
5346is available. HELLO information from different sources, for example a HELLO
5347obtained from a remote HOSTLIST and a second HELLO stored on disk, is combined
5348and merged into one single HELLO message per peer which will be given to
5349clients. During this merge process the HELLO is immediately written to disk to
5350ensure persistence.
5351
5352In addition, PEERINFO periodically scans the directory where information is
5353stored for empty HELLO messages with expired TRANSPORT addresses. This
5354periodic task scans all files in the directory and recreates the HELLO messages
5355it finds. Expired TRANSPORT addresses are removed from the HELLO and if the
5356HELLO does not contain any valid addresses, it is discarded and removed from
5357disk.
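
The merge and clean-up steps can be illustrated with a small in-memory model.
This is a sketch under stated assumptions, not the PEERINFO implementation:
@code{struct Addr}, @code{struct Hello}, @code{hello_merge} and
@code{hello_expire} are hypothetical, addresses are plain strings, and the rule
that an address present in both HELLOs keeps the later expiration time is an
assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ADDRS 8

/* Hypothetical stand-in for one address entry of a HELLO. */
struct Addr {
  char uri[64];        /* textual address, for illustration only */
  uint64_t expiration; /* absolute expiration time in seconds */
};

/* Hypothetical stand-in for a HELLO: a small list of addresses. */
struct Hello {
  struct Addr addrs[MAX_ADDRS];
  size_t n;
};

/* Merge src into dst. Assumption: an address present in both keeps the later
   expiration; addresses new to dst are appended while there is room. */
void hello_merge(struct Hello *dst, const struct Hello *src) {
  assert(NULL != dst && NULL != src);
  for (size_t i = 0; i < src->n; i++) {
    size_t j;

    for (j = 0; j < dst->n; j++)
      if (0 == strcmp(dst->addrs[j].uri, src->addrs[i].uri))
        break;
    if (j < dst->n) {
      if (src->addrs[i].expiration > dst->addrs[j].expiration)
        dst->addrs[j].expiration = src->addrs[i].expiration;
    } else if (dst->n < MAX_ADDRS) {
      dst->addrs[dst->n++] = src->addrs[i];
    }
  }
}

/* Drop addresses that expired at or before now. Returns the number of
   addresses left; 0 means the HELLO would be discarded. */
size_t hello_expire(struct Hello *h, uint64_t now) {
  size_t kept = 0;

  assert(NULL != h);
  for (size_t i = 0; i < h->n; i++)
    if (h->addrs[i].expiration > now)
      h->addrs[kept++] = h->addrs[i];
  h->n = kept;
  return kept;
}
```

In this model, the periodic task corresponds to calling @code{hello_expire}
on every stored HELLO and deleting the file when the function returns 0.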
5358
5359@node Obtaining Information
5360@subsection Obtaining Information
5361
5362@c %**end of header
5363
5364When a client requests information from PEERINFO, PEERINFO performs a lookup
5365for the respective peer or all peers if desired and transmits this information
5366to the client. The client can specify if friend-only HELLOs have to be included
5367or not and PEERINFO filters the respective HELLO messages before transmitting
5368information.
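
The filtering step amounts to skipping friend-only entries unless the client
asked for them. A minimal sketch, with a hypothetical @code{struct HelloEntry}
that reduces a HELLO to a peer name and its friend-only flag:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of a stored HELLO. */
struct HelloEntry {
  const char *peer; /* textual peer identity, for illustration only */
  int friend_only;  /* non-zero for friend-only HELLOs */
};

/* Collect pointers to the entries a client may receive. Friend-only entries
   are skipped unless include_friend_only is non-zero. out must have room for
   n pointers. Returns the number of selected entries. */
size_t select_hellos(const struct HelloEntry *all, size_t n,
                     int include_friend_only,
                     const struct HelloEntry **out) {
  size_t kept = 0;

  assert(NULL != out);
  for (size_t i = 0; i < n; i++) {
    if (all[i].friend_only && !include_friend_only)
      continue;
    out[kept++] = &all[i];
  }
  return kept;
}
```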
5369
5370To notify clients about changes to PEERINFO information, PEERINFO maintains a
5371list of clients interested in these notifications. Such a notification occurs if
5372a HELLO for a peer was updated (due to a merge for example) or a new peer was
5373added.
5374
5375@node The PEERINFO Client-Service Protocol
5376@subsection The PEERINFO Client-Service Protocol
5377
5378@c %**end of header
5379
5380To connect to and disconnect from the PEERINFO service, PEERINFO utilizes
5381the util client/server infrastructure, so no special message types are used
5382here.
5383
5384To add information for a peer, the plain HELLO message is transmitted to the
5385service without any wrapping. All required information is stored within the
5386HELLO message. The PEERINFO service provides a message handler accepting and
5387processing these HELLO messages.
5388
5389When obtaining PEERINFO information using the iterate functionality, specific
5390messages are used. To obtain information for all peers, a @code{struct
5391ListAllPeersMessage} with message type
5392@code{GNUNET_MESSAGE_TYPE_PEERINFO_GET_ALL} and a flag @code{include_friend_only}
5393to indicate if friend-only HELLO messages should be included is transmitted. If
5394information for a specific peer is required, a @code{struct ListPeerMessage}
5395with type @code{GNUNET_MESSAGE_TYPE_PEERINFO_GET} containing the peer identity
5396is used.
5397
5398For both variants, the PEERINFO service replies for each HELLO message it wants
5399to transmit with a @code{struct InfoMessage} with type
5400@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO} containing the plain HELLO. The final
5401message is a @code{struct GNUNET_MessageHeader} with type
5402@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO_END}. If the client receives this
5403message, it can proceed with the next request if any is pending.
5404
5405@node libgnunetpeerinfo
5406@subsection libgnunetpeerinfo
5407
5408@c %**end of header
5409
5410The PEERINFO API consists mainly of three different functionalities:
5411maintaining a connection to the service, adding new information and retrieving
5412information from the PEERINFO service.
5413
5414
5415@menu
5416* Connecting to the Service::
5417* Adding Information::
5418* Obtaining Information2::
5419@end menu
5420
5421@node Connecting to the Service
5422@subsubsection Connecting to the Service
5423
5424@c %**end of header
5425
5426To connect to the PEERINFO service, the function @code{GNUNET_PEERINFO_connect}
5427is used, taking a configuration handle as an argument. To disconnect from
5428PEERINFO, the function @code{GNUNET_PEERINFO_disconnect} has to be called with
5429the PEERINFO handle returned from the connect function.
5430
5431@node Adding Information
5432@subsubsection Adding Information
5433
5434@c %**end of header
5435
5436@code{GNUNET_PEERINFO_add_peer} adds a new peer to the PEERINFO subsystem
5437storage. This function takes the PEERINFO handle as an argument, the HELLO
5438message to store and a continuation with a closure to be called with the result
5439of the operation. @code{GNUNET_PEERINFO_add_peer} returns a handle to this
5440operation, which allows cancelling the operation with the respective cancel
5441function @code{GNUNET_PEERINFO_add_peer_cancel}. To retrieve information from
5442PEERINFO, you can iterate over all information stored with PEERINFO or you can
5443tell PEERINFO to notify you when new peer information is available.
5444
5445@node Obtaining Information2
5446@subsubsection Obtaining Information
5447
5448@c %**end of header
5449
5450To iterate over information in PEERINFO you use @code{GNUNET_PEERINFO_iterate}.
5451This function expects the PEERINFO handle, a flag indicating whether HELLO
5452messages intended for friend-only mode should be included, a timeout for how
5453long the operation may take and a callback with a callback closure to be called
5454for the results. If you want to obtain information for a specific peer, you can
5455specify the peer identity; if this identity is NULL, information for all peers
5456is returned. The function returns a handle that allows cancelling the operation
5457using @code{GNUNET_PEERINFO_iterate_cancel}.
5458
5459To get notified when peer information changes, you can use
5460@code{GNUNET_PEERINFO_notify}. This function expects a configuration handle and
5461a flag indicating whether friend-only HELLO messages should be included. The
5462PEERINFO service will notify you about every change by calling the callback
5463function. The function returns a handle to cancel notifications
5464with @code{GNUNET_PEERINFO_notify_cancel}.
5465
5466
5467@node GNUnet's PEERSTORE subsystem
5468@section GNUnet's PEERSTORE subsystem
5469
5470@c %**end of header
5471
5472GNUnet's PEERSTORE subsystem offers persistent per-peer storage for other
5473GNUnet subsystems. GNUnet subsystems can use PEERSTORE to persistently store
5474and retrieve arbitrary data. Each data record stored with PEERSTORE contains
5475the following fields:
5476
5477@itemize @bullet
5478@item subsystem: Name of the subsystem responsible for the record.
5479@item peerid: Identity of the peer this record is related to.
5480@item key: a key string identifying the record.
5481@item value: binary record value.
5482@item expiry: record expiry date.
5483@end itemize
5484
5485@menu
5486* Functionality::
5487* Architecture::
5488* libgnunetpeerstore::
5489@end menu
5490
5491@node Functionality
5492@subsection Functionality
5493
5494@c %**end of header
5495
5496Subsystems can store any type of value under a (subsystem, peerid, key)
5497combination. A "replace" flag set during store operations forces the PEERSTORE
5498to replace any old values stored under the same (subsystem, peerid, key)
5499combination with the new value. Additionally, an expiry date is set after which
5500the record is @emph{possibly} deleted by PEERSTORE.
5501
5502Subsystems can iterate over all values stored under any of the following
5503combinations of fields:
5504
5505@itemize @bullet
5506@item (subsystem)
5507@item (subsystem, peerid)
5508@item (subsystem, key)
5509@item (subsystem, peerid, key)
5510@end itemize
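
The four combinations listed above boil down to one matching rule in which the
subsystem is mandatory and a missing peerid or key acts as a wildcard. A
minimal sketch over an in-memory array, assuming string-valued fields;
@code{struct PsRecord}, @code{record_matches} and @code{count_matches} are
hypothetical names, not part of the PEERSTORE API:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the identifying fields of a record. */
struct PsRecord {
  const char *subsystem;
  const char *peerid; /* textual peer identity, for illustration only */
  const char *key;
};

/* Returns 1 if rec matches the query. subsystem is mandatory; a NULL peerid
   or key matches any value. */
int record_matches(const struct PsRecord *rec, const char *subsystem,
                   const char *peerid, const char *key) {
  assert(NULL != rec && NULL != subsystem);
  if (0 != strcmp(rec->subsystem, subsystem))
    return 0;
  if (NULL != peerid && 0 != strcmp(rec->peerid, peerid))
    return 0;
  if (NULL != key && 0 != strcmp(rec->key, key))
    return 0;
  return 1;
}

/* Counts the records that an iteration with this query would visit. */
size_t count_matches(const struct PsRecord *recs, size_t n,
                     const char *subsystem, const char *peerid,
                     const char *key) {
  size_t hits = 0;

  for (size_t i = 0; i < n; i++)
    hits += (size_t) record_matches(&recs[i], subsystem, peerid, key);
  return hits;
}
```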
5511
5512Subsystems can also request to be notified about any new values stored under a
5513(subsystem, peerid, key) combination by sending a "watch" request to
5514PEERSTORE.
5515
5516@node Architecture
5517@subsection Architecture
5518
5519@c %**end of header
5520
5521PEERSTORE implements the following components:
5522
5523@itemize @bullet
5524@item PEERSTORE service: Handles store, iterate and watch operations.
5525@item PEERSTORE API: API to be used by other subsystems to communicate and
5526issue commands to the PEERSTORE service.
5527@item PEERSTORE plugins: Handle the persistent storage. At the moment, only an
5528"sqlite" plugin is implemented.
5529@end itemize
5530
5531@node libgnunetpeerstore
5532@subsection libgnunetpeerstore
5533
5534@c %**end of header
5535
5536libgnunetpeerstore is the library containing the PEERSTORE API. Subsystems
5537wishing to communicate with the PEERSTORE service use this API to open a
5538connection to PEERSTORE. This is done by calling
5539@code{GNUNET_PEERSTORE_connect} which returns a handle to the newly created
5540connection. This handle has to be used with any further calls to the API.
5541
5542To store a new record, the function @code{GNUNET_PEERSTORE_store} is to be used
5543which requires the record fields and a continuation function that will be
5544called by the API after the STORE request is sent to the PEERSTORE service.
5545Note that calling the continuation function does not mean that the record is
5546successfully stored, only that the STORE request has been successfully sent to
5547the PEERSTORE service. @code{GNUNET_PEERSTORE_store_cancel} can be called to
5548cancel the STORE request only before the continuation function has been called.
5549
5550To iterate over stored records, the function @code{GNUNET_PEERSTORE_iterate} is
5551to be used. @emph{peerid} and @emph{key} can be set to NULL. An iterator
5552callback function will be called with each matching record found and a NULL
5553record at the end to signal the end of result set.
5554@code{GNUNET_PEERSTORE_iterate_cancel} can be used to cancel the ITERATE
5555request before the iterator callback is called with a NULL record.
5556
5557To be notified of new values stored under a (subsystem, peerid, key)
5558combination, the function @code{GNUNET_PEERSTORE_watch} is to be used. This
5559will register the watcher with the PEERSTORE service; any new records matching
5560the given combination will trigger the callback function passed to
5561@code{GNUNET_PEERSTORE_watch}. This continues until
5562@code{GNUNET_PEERSTORE_watch_cancel} is called or the connection to the service
5563is destroyed.
5564
5565After the connection is no longer needed, the function
5566@code{GNUNET_PEERSTORE_disconnect} can be called to disconnect from the
5567PEERSTORE service. Any pending ITERATE or WATCH requests will be destroyed. If
5568the @code{sync_first} flag is set to @code{GNUNET_YES}, the API will delay the
5569disconnection until all pending STORE requests are sent to the PEERSTORE
5570service, otherwise, the pending STORE requests will be destroyed as well.
5571
5572@node GNUnet's SET Subsystem
5573@section GNUnet's SET Subsystem
5574
5575@c %**end of header
5576
5577The SET service implements efficient set operations between two peers over a
5578mesh tunnel. Currently, set union and set intersection are the only supported
5579operations. Elements of a set consist of an @emph{element type} and arbitrary
5580binary @emph{data}. The size of an element's data is limited to around 62
5581KB.
5582
5583@menu
5584* Local Sets::
5585* Set Modifications::
5586* Set Operations::
5587* Result Elements::
5588* libgnunetset::
5589* The SET Client-Service Protocol::
5590* The SET Intersection Peer-to-Peer Protocol::
5591* The SET Union Peer-to-Peer Protocol::
5592@end menu
5593
5594@node Local Sets
5595@subsection Local Sets
5596
5597@c %**end of header
5598
5599Sets created by a local client can be modified and reused for multiple
5600operations. As each set operation requires potentially expensive special
5601auxiliary data to be computed for each element of a set, a set can only
5602participate in one type of set operation (i.e. union or intersection). The type
5603of a set is determined upon its creation. If the elements of a set are needed
5604for an operation of a different type, all of the set's elements must be copied
5605to a new set of appropriate type.
5606
5607@node Set Modifications
5608@subsection Set Modifications
5609
5610@c %**end of header
5611
5612Even when set operations are active, one can add to and remove elements from a
5613set. However, these changes will only be visible to operations that have been
5614created after the changes have taken place. That is, every set operation only
5615sees a snapshot of the set from the time the operation was started. This
5616mechanism is @emph{not} implemented by copying the whole set, but by attaching
5617@emph{generation information} to each element and operation.
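
One way to picture generation information is to stamp every element with the
generation in which it was added and, if applicable, removed, and to let each
operation remember the generation in which it was started. The sketch below
follows that idea. It is an illustration only: the type and function names are
hypothetical, elements are reduced to integers, and the actual bookkeeping in
the SET service may differ.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ELEMS 16
#define NEVER_REMOVED 0xFFFFFFFFu

struct Element {
  int value;                /* payload, reduced to an int here */
  unsigned int gen_added;   /* generation in which the element was added */
  unsigned int gen_removed; /* generation of removal, or NEVER_REMOVED */
};

struct Set {
  struct Element elems[MAX_ELEMS];
  size_t n;
  unsigned int current_gen; /* advanced whenever an operation starts */
};

void set_init(struct Set *s) {
  s->n = 0;
  s->current_gen = 0;
}

void set_add(struct Set *s, int value) {
  assert(s->n < MAX_ELEMS);
  s->elems[s->n].value = value;
  s->elems[s->n].gen_added = s->current_gen;
  s->elems[s->n].gen_removed = NEVER_REMOVED;
  s->n++;
}

/* Removal only stamps the element, so older operations still see it. */
void set_remove(struct Set *s, int value) {
  for (size_t i = 0; i < s->n; i++)
    if (s->elems[i].value == value &&
        NEVER_REMOVED == s->elems[i].gen_removed)
      s->elems[i].gen_removed = s->current_gen;
}

/* Starting an operation pins the current generation and opens a new one for
   all later modifications. */
unsigned int operation_start(struct Set *s) {
  return s->current_gen++;
}

/* An element belongs to an operation's snapshot if it was added no later
   than the operation's generation and not removed by then. */
int element_visible(const struct Element *e, unsigned int op_gen) {
  return e->gen_added <= op_gen && op_gen < e->gen_removed;
}
```

With this scheme no element is ever copied for a snapshot; an operation simply
skips the elements for which @code{element_visible} returns 0.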
5618
5619@node Set Operations
5620@subsection Set Operations
5621
5622@c %**end of header
5623
5624Set operations can be started in two ways: Either by accepting an operation
5625request from a remote peer, or by requesting a set operation from a remote
5626peer. Set operations are uniquely identified by the involved @emph{peers}, an
5627@emph{application id} and the @emph{operation type}.
5628
5629The client is notified of incoming set operations by @emph{set listeners}. A
5630set listener listens for incoming operations of a specific operation type and
5631application id. Once notified of an incoming set request, the client can
5632accept the set request (providing a local set for the operation) or reject
5633it.
5634
5635@node Result Elements
5636@subsection Result Elements
5637
5638@c %**end of header
5639
5640The SET service has three @emph{result modes} that determine how an operation's
5641result set is delivered to the client:
5642
5643@itemize @bullet
5644@item @strong{Full Result Set.} All elements of the set resulting from the set
5645operation are returned to the client.
5646@item @strong{Added Elements.} Only elements that result from the operation and
5647are not already in the local peer's set are returned. Note that for some
5648operations (like set intersection) this result mode will never return any
5649elements. This can be useful if only the remote peer is actually interested in
5650the result of the set operation.
5651@item @strong{Removed Elements.} Only elements that are in the local peer's
5652initial set but not in the operation's result set are returned. Note that for
5653some operations (like set union) this result mode will never return any
5654elements. This can be useful if only the remote peer is actually interested in
5655the result of the set operation.
5656@end itemize
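
The three result modes can be compared by computing, for a given local set and
operation result, which elements each mode would deliver. The sketch below does
this for sets of integers. @code{enum ResultMode} and @code{deliver} are
hypothetical names chosen for the example and are not the identifiers used by
libgnunetset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three result modes described above. */
enum ResultMode { RESULT_FULL, RESULT_ADDED, RESULT_REMOVED };

/* Returns 1 if x occurs in set[0..n). */
static int contains(const int *set, size_t n, int x) {
  for (size_t i = 0; i < n; i++)
    if (set[i] == x)
      return 1;
  return 0;
}

/* Writes the elements that would be delivered under mode to out and returns
   their number. out must have room for n_local + n_result elements. */
size_t deliver(enum ResultMode mode, const int *local, size_t n_local,
               const int *result, size_t n_result, int *out) {
  size_t n = 0;

  assert(NULL != out);
  switch (mode) {
  case RESULT_FULL: /* everything in the result set */
    for (size_t i = 0; i < n_result; i++)
      out[n++] = result[i];
    break;
  case RESULT_ADDED: /* in the result, but not in the local set */
    for (size_t i = 0; i < n_result; i++)
      if (!contains(local, n_local, result[i]))
        out[n++] = result[i];
    break;
  case RESULT_REMOVED: /* in the local set, but not in the result */
    for (size_t i = 0; i < n_local; i++)
      if (!contains(result, n_result, local[i]))
        out[n++] = local[i];
    break;
  }
  return n;
}
```

For a union, the result is a superset of the local set, so
@code{RESULT_REMOVED} delivers nothing; for an intersection, the result is a
subset of the local set, so @code{RESULT_ADDED} delivers nothing.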
5657
5658@node libgnunetset
5659@subsection libgnunetset
5660
5661@c %**end of header
5662
5663@menu
5664* Sets::
5665* Listeners::
5666* Operations::
5667* Supplying a Set::
5668* The Result Callback::
5669@end menu
5670
5671@node Sets
5672@subsubsection Sets
5673
5674@c %**end of header
5675
5676New sets are created with @code{GNUNET_SET_create}. Both the local peer's
5677configuration (as each set has its own client connection) and the operation
5678type must be specified. The set exists until either the client calls
5679@code{GNUNET_SET_destroy} or the client's connection to the service is
5680disrupted. In the latter case, the client is notified by the return value of
5681functions dealing with sets. This return value must always be checked.
5682
5683Elements are added and removed with @code{GNUNET_SET_add_element} and
5684@code{GNUNET_SET_remove_element}.
5685
5686@node Listeners
5687@subsubsection Listeners
5688
5689@c %**end of header
5690
5691Listeners are created with @code{GNUNET_SET_listen}. Each time a remote
5692peer suggests a set operation with an application id and operation type
5693matching a listener, the listener's callback is invoked. The client then must
5694synchronously call either @code{GNUNET_SET_accept} or @code{GNUNET_SET_reject}.
5695Note that the operation will not be started until the client calls
5696@code{GNUNET_SET_commit} (see Section "Supplying a Set").
5697
5698@node Operations
5699@subsubsection Operations
5700
5701@c %**end of header
5702
5703Operations to be initiated by the local peer are created with
5704@code{GNUNET_SET_prepare}. Note that the operation will not be started until
5705the client calls @code{GNUNET_SET_commit} (see Section "Supplying a
5706Set").
5707
5708@node Supplying a Set
5709@subsubsection Supplying a Set
5710
5711@c %**end of header
5712
5713To create symmetry between the two ways of starting a set operation (accepting
5714and initiating it), the operation handles returned by @code{GNUNET_SET_accept}
5715and @code{GNUNET_SET_prepare} do not yet have a set to operate on, thus they
5716cannot do any work yet.
5717
5718The client must call @code{GNUNET_SET_commit} to specify a set to use for an
5719operation. @code{GNUNET_SET_commit} may only be called once per set
5720operation.
5721
5722@node The Result Callback
5723@subsubsection The Result Callback
5724
5725@c %**end of header
5726
5727Clients must specify both a result mode and a result callback with
5728@code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare}. The result callback is
5729called with a status indicating either that an element was received, or the operation
5730failed or succeeded. The interpretation of the received element depends on the
5731result mode. The callback needs to know which result mode it is used in, as the
5732arguments do not indicate if an element is part of the full result set, or if
5733it is in the difference between the original set and the final set.
5734
5735@node The SET Client-Service Protocol
5736@subsection The SET Client-Service Protocol
5737
5738@c %**end of header
5739
5740@menu
5741* Creating Sets::
5742* Listeners2::
5743* Initiating Operations::
5744* Modifying Sets::
5745* Results and Operation Status::
5746* Iterating Sets::
5747@end menu
5748
5749@node Creating Sets
5750@subsubsection Creating Sets
5751
5752@c %**end of header
5753
5754For each set of a client, there exists a client connection to the service. Sets
5755are created by sending the @code{GNUNET_SERVICE_SET_CREATE} message over a new
5756client connection. Multiple operations for one set are multiplexed over one
5757client connection, using a request id supplied by the client.
5758
5759@node Listeners2
5760@subsubsection Listeners
5761
5762@c %**end of header
5763
5764Each listener also requires a separate client connection. By sending the
5765@code{GNUNET_SERVICE_SET_LISTEN} message, the client notifies the service of
5766the application id and operation type it is interested in. A client rejects an
5767incoming request by sending @code{GNUNET_SERVICE_SET_REJECT} on the listener's
5768client connection. In contrast, when accepting an incoming request, a
5769@code{GNUNET_SERVICE_SET_ACCEPT} message must be sent over the client connection
5770of the set that is supplied for the set operation.
5771
5772@node Initiating Operations
5773@subsubsection Initiating Operations
5774
5775@c %**end of header
5776
5777Operations with remote peers are initiated by sending a
5778@code{GNUNET_SERVICE_SET_EVALUATE} message to the service. The client
5779connection that this message is sent by determines the set to use.
5780
5781@node Modifying Sets
5782@subsubsection Modifying Sets
5783
5784@c %**end of header
5785
5786Sets are modified with the @code{GNUNET_SERVICE_SET_ADD} and
5787@code{GNUNET_SERVICE_SET_REMOVE} messages.
5788
5789
5790@c %@menu
5791@c %* Results and Operation Status::
5792@c %* Iterating Sets::
5793@c %@end menu
5794
5795@node Results and Operation Status
5796@subsubsection Results and Operation Status
5797@c %**end of header
5798
5799The service notifies the client of result elements and success/failure of a set
5800operation with the @code{GNUNET_SERVICE_SET_RESULT} message.
5801
5802@node Iterating Sets
5803@subsubsection Iterating Sets
5804
5805@c %**end of header
5806
5807All elements of a set can be requested by sending
5808@code{GNUNET_SERVICE_SET_ITER_REQUEST}. The server responds with
5809@code{GNUNET_SERVICE_SET_ITER_ELEMENT} and eventually terminates the iteration
5810with @code{GNUNET_SERVICE_SET_ITER_DONE}. After each received element, the
5811client must send @code{GNUNET_SERVICE_SET_ITER_ACK}. Note that only one set
5812iteration may be active for a set at any given time.
5813
5814@node The SET Intersection Peer-to-Peer Protocol
5815@subsection The SET Intersection Peer-to-Peer Protocol
5816
5817@c %**end of header
5818
5819The intersection protocol operates over CADET and starts with a
5820GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST being sent by the peer initiating
5821the operation to the peer listening for inbound requests. It includes the
5822number of elements of the initiating peer, which is used to decide which side
5823will send a Bloom filter first.
5824
5825The listening peer checks if the operation type and application identifier are
5826acceptable for its current state. If not, it responds with a
5827GNUNET_MESSAGE_TYPE_SET_RESULT and a status of GNUNET_SET_STATUS_FAILURE (and
5828terminates the CADET channel).
5829
If the application accepts the request, the listener sends back a
GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO if it has more elements
in the set than the initiating peer. Otherwise, it immediately starts with the
Bloom filter exchange. If the initiator receives a
GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO response, it begins the
Bloom filter exchange, unless the set size is indicated to be zero, in which
case the intersection is considered finished after just the initial
handshake.
5838
5839
5840@menu
5841* The Bloom filter exchange::
5842* Salt::
5843@end menu
5844
5845@node The Bloom filter exchange
5846@subsubsection The Bloom filter exchange
5847
5848@c %**end of header
5849
5850In this phase, each peer transmits a Bloom filter over the remaining keys of
5851the local set to the other peer using a
5852GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_BF message. This message additionally
5853includes the number of elements left in the sender's set, as well as the XOR
5854over all of the keys in that set.
5855
5856The number of bits 'k' set per element in the Bloom filter is calculated based
5857on the relative size of the two sets. Furthermore, the size of the Bloom filter
5858is calculated based on 'k' and the number of elements in the set to maximize
5859the amount of data filtered per byte transmitted on the wire (while avoiding an
5860excessively high number of iterations).
5861
The receiver of the message removes all elements from its local set that do not
pass the Bloom filter test. It then checks if the set size of the sender and
the XOR over the keys match what is left of its own set. If they do, it sends
a GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_DONE back to indicate that the
latest set is the final result. Otherwise, the receiver starts another Bloom
filter exchange, except this time as the sender.
5868
5869@node Salt
5870@subsubsection Salt
5871
5872@c %**end of header
5873
Bloom filter operations are probabilistic: with some non-zero probability the
test may incorrectly say an element is in the set, even though it is not.

To mitigate this problem, the intersection protocol repeats the Bloom filter
exchange using a different random 32-bit salt in each iteration (the salt is
also included in the message). With different salts, the test yields false
positives for different elements. Since an element must pass the test in every
iteration to remain in the result, the probability that a false positive
survives shrinks towards zero with each additional iteration.
5882
5883The iterations terminate once both peers have established that they have sets
5884of the same size, and where the XOR over all keys computes the same 512-bit
value (leaving a failure probability of @math{2^{-511}}).
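
The interplay of Bloom filter exchange, salt and termination check can be
illustrated with a small self-contained simulation. It is a sketch, not the
service's implementation: keys are 64-bit integers instead of 512-bit hashes,
the filter size and 'k' are fixed, the salt is simply the iteration number
instead of a random value, and all names (@code{intersect()},
@code{bf_build()} and so on) are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BF_BITS 1024u  /* filter size in bits (fixed for brevity) */
#define BF_K 4u        /* bits set per element */

struct bloom { uint8_t bits[BF_BITS / 8]; };

/* 64-bit mixer (splitmix64 finalizer); stands in for a real hash. */
static uint64_t
mix (uint64_t x)
{
  x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 27; x *= 0x94d049bb133111ebULL;
  return x ^ (x >> 31);
}

/* Bit position of the i-th probe for a key under the given salt. */
static uint32_t
bf_pos (uint64_t key, uint32_t salt, uint32_t i)
{
  return (uint32_t) (mix (key ^ mix (((uint64_t) salt << 32) | i)) % BF_BITS);
}

static void
bf_build (struct bloom *bf, const uint64_t *set, size_t n, uint32_t salt)
{
  memset (bf, 0, sizeof (*bf));
  for (size_t e = 0; e < n; e++)
    for (uint32_t i = 0; i < BF_K; i++)
    {
      uint32_t p = bf_pos (set[e], salt, i);
      bf->bits[p / 8] |= (uint8_t) (1u << (p % 8));
    }
}

static int
bf_test (const struct bloom *bf, uint64_t key, uint32_t salt)
{
  for (uint32_t i = 0; i < BF_K; i++)
  {
    uint32_t p = bf_pos (key, salt, i);
    if (0 == (bf->bits[p / 8] & (1u << (p % 8))))
      return 0;
  }
  return 1;
}

static uint64_t
xor_keys (const uint64_t *set, size_t n)
{
  uint64_t x = 0;
  for (size_t i = 0; i < n; i++)
    x ^= set[i];
  return x;
}

/*
 * Simulate the exchange between two in-memory sets, filtering both arrays
 * in place.  Returns the size of the intersection, or -1 if the peers did
 * not agree within 32 iterations.
 */
int
intersect (uint64_t *a, size_t *na, uint64_t *b, size_t *nb)
{
  uint64_t *set[2] = { a, b };
  size_t *len[2] = { na, nb };

  for (uint32_t iter = 0; iter < 32; iter++)
  {
    unsigned int s = iter % 2;  /* sender of this iteration's filter */
    unsigned int r = 1 - s;     /* receiver */
    struct bloom bf;
    size_t kept = 0;

    bf_build (&bf, set[s], *len[s], iter);  /* salt = iteration number */
    for (size_t i = 0; i < *len[r]; i++)
      if (bf_test (&bf, set[r][i], iter))
        set[r][kept++] = set[r][i];
    *len[r] = kept;
    /* Same size and same XOR over all keys: the receiver would send DONE. */
    if ( (*len[r] == *len[s]) &&
         (xor_keys (set[r], *len[r]) == xor_keys (set[s], *len[s])) )
      return (int) kept;
  }
  return -1;
}
```

Calling @code{intersect()} on the sets @{10, 20, 30, 40, 50@} and
@{40, 50, 60, 70@} leaves @{40, 50@} in both arrays. A false positive in one
iteration only delays termination, because the size and XOR comparison fails
and the next iteration uses a different salt.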
5886
5887@node The SET Union Peer-to-Peer Protocol
5888@subsection The SET Union Peer-to-Peer Protocol
5889
5890@c %**end of header
5891
5892The SET union protocol is based on Eppstein's efficient set reconciliation
5893without prior context. You should read this paper first if you want to
5894understand the protocol.
5895
5896The union protocol operates over CADET and starts with a
5897GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST being sent by the peer initiating
5898the operation to the peer listening for inbound requests. It includes the
5899number of elements of the initiating peer, which is currently not used.
5900
5901The listening peer checks if the operation type and application identifier are
5902acceptable for its current state. If not, it responds with a
5903GNUNET_MESSAGE_TYPE_SET_RESULT and a status of GNUNET_SET_STATUS_FAILURE (and
5904terminates the CADET channel).
5905
5906If the application accepts the request, it sends back a strata estimator using
5907a message of type GNUNET_MESSAGE_TYPE_SET_UNION_P2P_SE. The initiator evaluates
5908the strata estimator and initiates the exchange of invertible Bloom filters,
5909sending a GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF.
5910
5911During the IBF exchange, if the receiver cannot invert the Bloom filter or
5912detects a cycle, it sends a larger IBF in response (up to a defined maximum
5913limit; if that limit is reached, the operation fails). Elements decoded while
5914processing the IBF are transmitted to the other peer using
5915GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS, or requested from the other peer using
5916GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS messages, depending on the sign
5917observed during decoding of the IBF. Peers respond to a
5918GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS message with the respective
5919element in a GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS message. If the IBF fully
5920decodes, the peer responds with a GNUNET_MESSAGE_TYPE_SET_UNION_P2P_DONE
5921message instead of another GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF.
5922
All Bloom filter operations use a salt to mingle keys before hashing them into
5924buckets, such that future iterations have a fresh chance of succeeding if they
5925failed due to collisions before.
5926
5927@node GNUnet's STATISTICS subsystem
5928@section GNUnet's STATISTICS subsystem
5929
5930@c %**end of header
5931
5932In GNUnet, the STATISTICS subsystem offers a central place for all subsystems
5933to publish unsigned 64-bit integer run-time statistics. Keeping this
5934information centrally means that there is a unified way for the user to obtain
5935data on all subsystems, and individual subsystems do not have to always include
5936a custom data export method for performance metrics and other statistics. For
5937example, the TRANSPORT system uses STATISTICS to update information about the
5938number of directly connected peers and the bandwidth that has been consumed by
5939the various plugins. This information is valuable for diagnosing connectivity
5940and performance issues.
5941
5942Following the GNUnet service architecture, the STATISTICS subsystem is divided
5943into an API which is exposed through the header
5944@strong{gnunet_statistics_service.h} and the STATISTICS service
5945@strong{gnunet-service-statistics}. The @strong{gnunet-statistics} command-line
5946tool can be used to obtain (and change) information about the values stored by
5947the STATISTICS service. The STATISTICS service does not communicate with other
5948peers.
5949
Data is stored in the STATISTICS service in the form of tuples
@strong{(subsystem, name, value, persistence)}. The subsystem identifies the
GNUnet subsystem to which the data belongs. The name is the key under which
the value is stored; it uniquely identifies the record among the records
belonging to the same subsystem. In some parts of the code, the pair
@strong{(subsystem, name)} is called a @strong{statistic} as it identifies the
values stored in the STATISTICS service. The persistence flag determines if the
record has to be preserved across service restarts. A record is said to be
persistent if this flag is set for it; if not, the record is treated as a
non-persistent record and it is lost after service restart. Persistent records
are written to the file @strong{statistics.data} before shutdown and read from
it upon startup. The file is located in the HOME directory of the peer.
5962
5963An anomaly of the STATISTICS service is that it does not terminate immediately
5964upon receiving a shutdown signal if it has any clients connected to it. It
5965waits for all the clients that are not monitors to close their connections
5966before terminating itself. This is to prevent the loss of data during peer
5967shutdown --- delaying the STATISTICS service shutdown helps other services to
5968store important data to STATISTICS during shutdown.
5969
5970@menu
5971* libgnunetstatistics::
5972* The STATISTICS Client-Service Protocol::
5973@end menu
5974
5975@node libgnunetstatistics
5976@subsection libgnunetstatistics
5977
5978@c %**end of header
5979
@strong{libgnunetstatistics} is the library containing the API for the
STATISTICS subsystem. Any process that needs to use STATISTICS should use this
API to open a connection to the STATISTICS service. This is done by calling
the function @code{GNUNET_STATISTICS_create()}. This function takes the name
of the subsystem that is trying to use STATISTICS and a configuration. All
values written to STATISTICS with this connection will be placed in the section
corresponding to the given subsystem's name. The connection to STATISTICS can
be destroyed with the function @code{GNUNET_STATISTICS_destroy()}. This
function allows for the connection to be destroyed immediately or upon
transferring all pending write requests to the service.
5990
Note: The STATISTICS subsystem can be disabled by setting @code{DISABLE = YES}
5992under the @code{[STATISTICS]} section in the configuration. With such a
5993configuration all calls to @code{GNUNET_STATISTICS_create()} return @code{NULL}
5994as the STATISTICS subsystem is unavailable and no other functions from the API
5995can be used.
5996
5997
5998@menu
5999* Statistics retrieval::
6000* Setting statistics and updating them::
6001* Watches::
6002@end menu
6003
6004@node Statistics retrieval
6005@subsubsection Statistics retrieval
6006
6007@c %**end of header
6008
Once a connection to the statistics service is obtained, information about any
other system which uses statistics can be retrieved with the function
@code{GNUNET_STATISTICS_get()}. This function takes the connection handle, the
name of the subsystem whose information we are interested in (a @code{NULL}
value will retrieve information of all available subsystems using STATISTICS),
the name of the statistic we are interested in (a @code{NULL} value will
retrieve all available statistics), a continuation callback which is called
when all of the requested information has been retrieved, an iterator callback
which is called for each parameter in the retrieved information and a closure
for the aforementioned callbacks. The library then invokes the iterator
callback for each value matching the request.

A call to @code{GNUNET_STATISTICS_get()} is asynchronous and can be canceled
with the function @code{GNUNET_STATISTICS_get_cancel()}. This is helpful when
retrieving statistics takes too long and especially when we want to shut down
and clean up everything.
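
The wildcard behaviour of the subsystem and name arguments can be modelled in
a few lines of plain C. The sketch below does not use libgnunetstatistics; the
record type, the function names and the statistic names in the usage note are
hypothetical and only show which records the iterator callback would be
invoked for.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory view of one stored statistic. */
struct stat_record
{
  const char *subsystem;
  const char *name;
  uint64_t value;
};

/* A NULL argument acts as a wildcard, as described for the get call. */
static int
record_matches (const struct stat_record *r,
                const char *subsystem,
                const char *name)
{
  return ( (NULL == subsystem) || (0 == strcmp (subsystem, r->subsystem)) ) &&
         ( (NULL == name) || (0 == strcmp (name, r->name)) );
}

/* Number of records for which the iterator callback would be invoked. */
size_t
count_matches (const struct stat_record *recs, size_t n,
               const char *subsystem, const char *name)
{
  size_t hits = 0;

  for (size_t i = 0; i < n; i++)
    if (record_matches (&recs[i], subsystem, name))
      hits++;
  return hits;
}
```

With three records, two for a subsystem called "transport" and one for "dht",
passing @code{NULL} for both arguments matches all three, while passing
"transport" and @code{NULL} matches two.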
6025
6026@node Setting statistics and updating them
6027@subsubsection Setting statistics and updating them
6028
6029@c %**end of header
6030
So far we have seen how to retrieve statistics. This section describes how to
set statistics and update them so that other subsystems can retrieve them.
6033
6034A new statistic can be set using the function @code{GNUNET_STATISTICS_set()}.
6035This function takes the name of the statistic and its value and a flag to make
6036the statistic persistent. The value of the statistic should be of the type
6037@code{uint64_t}. The function does not take the name of the subsystem; it is
6038determined from the previous @code{GNUNET_STATISTICS_create()} invocation. If
6039the given statistic is already present, its value is overwritten.
6040
An existing statistic can be updated, i.e., its value can be increased or
decreased by an amount, with the function @code{GNUNET_STATISTICS_update()}.
The parameters to this function are similar to @code{GNUNET_STATISTICS_set()},
except that it takes the amount to be changed as a type @code{int64_t} instead
of the value.
6046
The library will combine multiple set or update operations into one message if
the client issues requests faster than they can be transmitted over IPC to
the STATISTICS service. Thus, the client does not have to worry about
sending requests too quickly.
6051
6052@node Watches
6053@subsubsection Watches
6054
6055@c %**end of header
6056
An interesting feature of STATISTICS lies in serving notifications whenever a
6058statistic of our interest is modified. This is achieved by registering a watch
6059through the function @code{GNUNET_STATISTICS_watch()}. The parameters of this
6060function are similar to those of @code{GNUNET_STATISTICS_get()}. Changes to the
6061respective statistic's value will then cause the given iterator callback to be
6062called. Note: A watch can only be registered for a specific statistic. Hence
6063the subsystem name and the parameter name cannot be @code{NULL} in a call to
6064@code{GNUNET_STATISTICS_watch()}.
6065
A registered watch will keep reporting value changes until
@code{GNUNET_STATISTICS_watch_cancel()} is called with the same parameters that
were used for registering the watch.
6069
6070@node The STATISTICS Client-Service Protocol
6071@subsection The STATISTICS Client-Service Protocol
6072@c %**end of header
6073
6074
6075@menu
6076* Statistics retrieval2::
6077* Setting and updating statistics::
6078* Watching for updates::
6079@end menu
6080
6081@node Statistics retrieval2
@subsubsection Statistics retrieval
6083
6084@c %**end of header
6085
To retrieve statistics, the client transmits a message of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_GET} containing the given subsystem name
and statistic parameter to the STATISTICS service. The service responds with a
message of type @code{GNUNET_MESSAGE_TYPE_STATISTICS_VALUE} for each of the
statistics parameters that match the client request. The end of the
retrieved information is signaled by the service by sending a message of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_END}.
6093
6094@node Setting and updating statistics
6095@subsubsection Setting and updating statistics
6096
6097@c %**end of header
6098
6099The subsystem name, parameter name, its value and the persistence flag are
6100communicated to the service through the message
6101@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}.
6102
6103When the service receives a message of type
6104@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}, it retrieves the subsystem name and
checks for a statistic parameter matching the name given in the message.
6106If a statistic parameter is found, the value is overwritten by the new value
6107from the message; if not found then a new statistic parameter is created with
6108the given name and value.
6109
6110In addition to just setting an absolute value, it is possible to perform a
6111relative update by sending a message of type
6112@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET} with an update flag
6113(@code{GNUNET_STATISTICS_SETFLAG_RELATIVE}) signifying that the value in the
6114message should be treated as an update value.
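
The difference between an absolute set and a relative update can be modelled
with a small in-memory store. This is a sketch under assumptions, not the
service's code: the types and the function @code{apply_set()} are
hypothetical, a relative value is interpreted as a signed 64-bit delta, a
decrease below zero is clamped to zero, and a relative update of an unknown
statistic starts from zero.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_STATS 16

struct stat_entry
{
  const char *subsystem;
  const char *name;
  uint64_t value;
  int persistent;
};

struct stat_store
{
  struct stat_entry entries[MAX_STATS];
  size_t count;
};

/*
 * Apply one SET message to the store.  If relative is non-zero, value is
 * treated as a signed delta (models the RELATIVE flag).  Returns 0 on
 * success and -1 if the store is full.
 */
int
apply_set (struct stat_store *s, const char *subsystem, const char *name,
           uint64_t value, int relative, int persistent)
{
  struct stat_entry *e = NULL;

  for (size_t i = 0; i < s->count; i++)
    if ( (0 == strcmp (s->entries[i].subsystem, subsystem)) &&
         (0 == strcmp (s->entries[i].name, name)) )
      e = &s->entries[i];
  if (NULL == e)
  {
    if (MAX_STATS == s->count)
      return -1;
    e = &s->entries[s->count++];  /* not found: create a new statistic */
    e->subsystem = subsystem;
    e->name = name;
    e->value = 0;
  }
  e->persistent = persistent;
  if (! relative)
  {
    e->value = value;             /* absolute set overwrites */
  }
  else if (value > (uint64_t) INT64_MAX)
  {
    /* Negative delta: take its magnitude in unsigned arithmetic. */
    uint64_t dec = (uint64_t) 0 - value;

    e->value = (e->value < dec) ? 0 : e->value - dec;
  }
  else
  {
    e->value += value;            /* positive delta */
  }
  return 0;
}
```

Setting a statistic to 5 and then applying relative values of 2 and -3
(passed as @code{(uint64_t) -3}) leaves it at 4; a later absolute set
replaces the value regardless of its history.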
6115
6116@node Watching for updates
6117@subsubsection Watching for updates
6118
6119@c %**end of header
6120
The function @code{GNUNET_STATISTICS_watch()} registers the watch at the
service by sending a message of type
6122@code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH}. The service then sends
6123notifications through messages of type
6124@code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH_VALUE} whenever the statistic
6125parameter's value is changed.
6126
6127@node GNUnet's Distributed Hash Table (DHT)
6128@section GNUnet's Distributed Hash Table (DHT)
6129
6130@c %**end of header
6131
6132GNUnet includes a generic distributed hash table that can be used by developers
6133building P2P applications in the framework. This section documents high-level
6134features and how developers are expected to use the DHT. We have a research
6135paper detailing how the DHT works. Also, Nate's thesis includes a detailed
6136description and performance analysis (in chapter 6).
6137
6138Key features of GNUnet's DHT include:
6139
6140@itemize @bullet
6141@item stores key-value pairs with values up to (approximately) 63k in size
6142@item works with many underlay network topologies (small-world, random graph),
6143underlay does not need to be a full mesh / clique
6144@item support for extended queries (more than just a simple 'key'), filtering
6145duplicate replies within the network (bloomfilter) and content validation (for
6146details, please read the subsection on the block library)
6147@item can (optionally) return paths taken by the PUT and GET operations to the
6148application
6149@item provides content replication to handle churn
6150@end itemize
6151
6152GNUnet's DHT is randomized and unreliable. Unreliable means that there is no
6153strict guarantee that a value stored in the DHT is always found --- values are
6154only found with high probability. While this is somewhat true in all P2P DHTs,
6155GNUnet developers should be particularly wary of this fact (this will help you
6156write secure, fault-tolerant code). Thus, when writing any application using
6157the DHT, you should always consider the possibility that a value stored in the
6158DHT by you or some other peer might simply not be returned, or returned with a
6159significant delay. Your application logic must be written to tolerate this
6160(naturally, some loss of performance or quality of service is expected in this
6161case).
6162
6163@menu
6164* Block library and plugins::
6165* libgnunetdht::
6166* The DHT Client-Service Protocol::
6167* The DHT Peer-to-Peer Protocol::
6168@end menu
6169
6170@node Block library and plugins
6171@subsection Block library and plugins
6172
6173@c %**end of header
6174
6175@menu
6176* What is a Block?::
6177* The API of libgnunetblock::
6178* Queries::
6179* Sample Code::
6180* Conclusion2::
6181@end menu
6182
6183@node What is a Block?
6184@subsubsection What is a Block?
6185
6186@c %**end of header
6187
6188Blocks are small (< 63k) pieces of data stored under a key (struct
6189GNUNET_HashCode). Blocks have a type (enum GNUNET_BlockType) which defines
6190their data format. Blocks are used in GNUnet as units of static data exchanged
6191between peers and stored (or cached) locally. Uses of blocks include
6192file-sharing (the files are broken up into blocks), the VPN (DNS information is
6193stored in blocks) and the DHT (all information in the DHT and meta-information
6194for the maintenance of the DHT are both stored using blocks). The block
6195subsystem provides a few common functions that must be available for any type
6196of block.
6197
6198@node The API of libgnunetblock
6199@subsubsection The API of libgnunetblock
6200
6201@c %**end of header
6202
6203The block library requires for each (family of) block type(s) a block plugin
6204(implementing gnunet_block_plugin.h) that provides basic functions that are
6205needed by the DHT (and possibly other subsystems) to manage the block. These
block plugins are typically implemented within their respective subsystems.
6207The main block library is then used to locate, load and query the appropriate
6208block plugin. Which plugin is appropriate is determined by the block type
6209(which is just a 32-bit integer). Block plugins contain code that specifies
6210which block types are supported by a given plugin. The block library loads all
6211block plugins that are installed at the local peer and forwards the application
6212request to the respective plugin.
6213
6214The central functions of the block APIs (plugin and main library) are to allow
6215the mapping of blocks to their respective key (if possible) and the ability to
6216check that a block is well-formed and matches a given request (again, if
6217possible). This way, GNUnet can avoid storing invalid blocks, storing blocks
6218under the wrong key and forwarding blocks in response to a query that they do
6219not answer.
6220
One key function of block plugins is that they allow GNUnet to detect duplicate
6222replies (via the Bloom filter). All plugins MUST support detecting duplicate
6223replies (by adding the current response to the Bloom filter and rejecting it if
6224it is encountered again). If a plugin fails to do this, responses may loop in
6225the network.
6226
6227@node Queries
6228@subsubsection Queries
6229@c %**end of header
6230
6231The query format for any block in GNUnet consists of four main components.
6232First, the type of the desired block must be specified. Second, the query must
6233contain a hash code. The hash code is used for lookups in hash tables and
databases and need not be unique for the block (however, if possible a unique
6235hash should be used as this would be best for performance). Third, an optional
6236Bloom filter can be specified to exclude known results; replies that hash to
6237the bits set in the Bloom filter are considered invalid. False-positives can be
6238eliminated by sending the same query again with a different Bloom filter
6239mutator value, which parameterizes the hash function that is used. Finally, an
6240optional application-specific "eXtended query" (xquery) can be specified to
6241further constrain the results. It is entirely up to the type-specific plugin to
6242determine whether or not a given block matches a query (type, hash, Bloom
filter, and xquery). Naturally, not all xqueries are valid and some types of
6244blocks may not support Bloom filters either, so the plugin also needs to check
6245if the query is valid in the first place.
6246
6247Depending on the results from the plugin, the DHT will then discard the
6248(invalid) query, forward the query, discard the (invalid) reply, cache the
6249(valid) reply, and/or forward the (valid and non-duplicate) reply.
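
How a plugin can use the Bloom filter and the mutator to reject duplicate
replies is sketched below. The code is an illustration only: it does not use
libgnunetblock, replies are represented by a 64-bit hash instead of a 512-bit
hash code, and the names @code{reply_filter}, @code{filter_init()},
@code{evaluate_reply()} and the result values are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define FILTER_BYTES 32u  /* 256-bit filter, small for illustration */
#define FILTER_K 3u       /* bits set per reply */

enum eval_result { EVAL_OK_NEW, EVAL_OK_DUPLICATE };

struct reply_filter
{
  uint8_t bits[FILTER_BYTES];
  uint32_t mutator;       /* parameterizes the hash function */
};

/* 64-bit mixer (splitmix64 finalizer); stands in for a real hash. */
static uint64_t
mix64 (uint64_t x)
{
  x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 27; x *= 0x94d049bb133111ebULL;
  return x ^ (x >> 31);
}

void
filter_init (struct reply_filter *f, uint32_t mutator)
{
  memset (f->bits, 0, sizeof (f->bits));
  f->mutator = mutator;
}

/*
 * Test a reply against the filter.  A reply whose bits are all set is
 * reported as a duplicate; otherwise its bits are added so that the same
 * reply is rejected when it is encountered again.
 */
enum eval_result
evaluate_reply (struct reply_filter *f, uint64_t reply_hash)
{
  uint32_t pos[FILTER_K];
  int seen = 1;

  for (uint32_t i = 0; i < FILTER_K; i++)
  {
    uint64_t h = mix64 (reply_hash ^ mix64 (((uint64_t) f->mutator << 32) | i));

    pos[i] = (uint32_t) (h % (FILTER_BYTES * 8u));
    if (0 == (f->bits[pos[i] / 8] & (1u << (pos[i] % 8))))
      seen = 0;
  }
  if (seen)
    return EVAL_OK_DUPLICATE;
  for (uint32_t i = 0; i < FILTER_K; i++)
    f->bits[pos[i] / 8] |= (uint8_t) (1u << (pos[i] % 8));
  return EVAL_OK_NEW;
}
```

The first evaluation of a reply against an empty filter yields
@code{EVAL_OK_NEW}, the second one @code{EVAL_OK_DUPLICATE}. Re-initializing
the filter with a different mutator corresponds to sending the query again
with a fresh Bloom filter.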
6250
6251@node Sample Code
6252@subsubsection Sample Code
6253
6254@c %**end of header
6255
6256The source code in @strong{plugin_block_test.c} is a good starting point for
6257new block plugins --- it does the minimal work by implementing a plugin that
6258performs no validation at all. The respective @strong{Makefile.am} shows how to
6259build and install a block plugin.
6260
6261@node Conclusion2
@subsubsection Conclusion
6263
6264@c %**end of header
6265
6266In conclusion, GNUnet subsystems that want to use the DHT need to define a
6267block format and write a plugin to match queries and replies. For testing, the
@code{GNUNET_BLOCK_TYPE_TEST} block type can be used; it accepts any query as valid
6269and any reply as matching any query. This type is also used for the DHT command
6270line tools. However, it should NOT be used for normal applications due to the
6271lack of error checking that results from this primitive implementation.
6272
6273@node libgnunetdht
6274@subsection libgnunetdht
6275
6276@c %**end of header
6277
6278The DHT API itself is pretty simple and offers the usual GET and PUT functions
6279that work as expected. The specified block type refers to the block library
6280which allows the DHT to run application-specific logic for data stored in the
6281network.
6282
6283
6284@menu
6285* GET::
6286* PUT::
6287* MONITOR::
6288* DHT Routing Options::
6289@end menu
6290
6291@node GET
6292@subsubsection GET
6293
6294@c %**end of header
6295
6296When using GET, the main consideration for developers (other than the block
6297library) should be that after issuing a GET, the DHT will continuously cause
6298(small amounts of) network traffic until the operation is explicitly canceled.
6299So GET does not simply send out a single network request once; instead, the
6300DHT will continue to search for data. This is needed to achieve good success
6301rates and also handles the case where the respective PUT operation happens
6302after the GET operation was started. Developers should not cancel an existing
6303GET operation and then explicitly re-start it to trigger a new round of
6304network requests; this is simply inefficient, especially as the internal
6305automated version can be more efficient, for example by filtering results in
6306the network that have already been returned.
6307
6308If an application that performs a GET request has a set of replies that it
already knows and would like to filter, it can call
6310@code{GNUNET_DHT_get_filter_known_results} with an array of hashes over the
6311respective blocks to tell the DHT that these results are not desired (any
6312more). This way, the DHT will filter the respective blocks using the block
6313library in the network, which may result in a significant reduction in
6314bandwidth consumption.
6315
6316@node PUT
6317@subsubsection PUT
6318
6319@c %**end of header
6320
6321In contrast to GET operations, developers @strong{must} manually re-run PUT
6322operations periodically (if they intend the content to continue to be
6323available). Content stored in the DHT expires or might be lost due to churn.
6324Furthermore, GNUnet's DHT typically requires multiple rounds of PUT operations
6325before a key-value pair is consistently available to all peers (the DHT
6326randomizes paths and thus storage locations, and only after multiple rounds of
PUTs will there be a sufficient number of replicas in large DHTs). An explicit
6328PUT operation using the DHT API will only cause network traffic once, so in
6329order to ensure basic availability and resistance to churn (and adversaries),
6330PUTs must be repeated. While the exact frequency depends on the application, a
6331rule of thumb is that there should be at least a dozen PUT operations within
6332the content lifetime. Content in the DHT typically expires after one day, so
6333DHT PUT operations should be repeated at least every 1-2 hours.
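
The rule of thumb amounts to simple arithmetic: dividing the content lifetime
by the desired number of PUT operations gives an upper bound on the interval
between two PUTs. The helper below is a hypothetical illustration and not part
of the DHT API.

```c
#include <stdint.h>

/*
 * Longest interval (in seconds) between two PUTs such that at least
 * min_puts PUT operations fall within the content lifetime.
 */
uint64_t
put_interval_seconds (uint64_t lifetime_seconds, uint64_t min_puts)
{
  if (0 == min_puts)
    return lifetime_seconds;  /* no repetition requested */
  return lifetime_seconds / min_puts;
}
```

For a lifetime of one day (86400 seconds) and a dozen PUTs, this yields 7200
seconds, i.e. the two-hour upper end of the interval recommended above.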
6334
6335@node MONITOR
6336@subsubsection MONITOR
6337
6338@c %**end of header
6339
6340The DHT API also allows applications to monitor messages crossing the local
6341DHT service. The types of messages used by the DHT are GET, PUT and RESULT
6342messages. Using the monitoring API, applications can choose to monitor these
6343requests, possibly limiting themselves to requests for a particular block
6344type.
6345
The monitoring API is not only useful for diagnostics; it can also be used
6347to trigger application operations based on PUT operations. For example, an
6348application may use PUTs to distribute work requests to other peers. The
6349workers would then monitor for PUTs that give them work, instead of looking
6350for work using GET operations. This can be beneficial, especially if the
6351workers have no good way to guess the keys under which work would be stored.
6352Naturally, additional protocols might be needed to ensure that the desired
6353number of workers will process the distributed workload.
6354
6355@node DHT Routing Options
6356@subsubsection DHT Routing Options
6357
6358@c %**end of header
6359
The following options exist for GET and PUT requests; the first two are the
ones that matter to applications:
6361
6362@table @asis
@item GNUNET_DHT_RO_DEMULTIPLEX_EVERYWHERE
This option means that all peers
6364should process the request, even if their peer ID is not closest to the key.
6365For a PUT request, this means that all peers that a request traverses may make
6366a copy of the data. Similarly for a GET request, all peers will check their
6367local database for a result. Setting this option can thus significantly improve
6368caching and reduce bandwidth consumption --- at the expense of a larger DHT
6369database. If in doubt, we recommend that this option should be used.
@item GNUNET_DHT_RO_RECORD_ROUTE
This option instructs the DHT to record the path
6371that a GET or a PUT request is taking through the overlay network. The
6372resulting paths are then returned to the application with the respective
6373result. This allows the receiver of a result to construct a path to the
6374originator of the data, which might then be used for routing. Naturally,
6375setting this option requires additional bandwidth and disk space, so
6376applications should only set this if the paths are needed by the application
6377logic.
@item GNUNET_DHT_RO_FIND_PEER
This option is an internal option used by
6379the DHT's peer discovery mechanism and should not be used by applications.
@item GNUNET_DHT_RO_BART
This option is currently not implemented. It may in
6381the future offer performance improvements for clique topologies.
6382@end table
6383
6384@node The DHT Client-Service Protocol
6385@subsection The DHT Client-Service Protocol
6386
6387@c %**end of header
6388
6389@menu
6390* PUTting data into the DHT::
6391* GETting data from the DHT::
6392* Monitoring the DHT::
6393@end menu
6394
6395@node PUTting data into the DHT
6396@subsubsection PUTting data into the DHT
6397
6398@c %**end of header
6399
To store (PUT) data into the DHT, the client sends a @code{struct
6401GNUNET_DHT_ClientPutMessage} to the service. This message specifies the block
6402type, routing options, the desired replication level, the expiration time, key,
value and a 64-bit unique ID for the operation. The service responds with a
6404@code{struct GNUNET_DHT_ClientPutConfirmationMessage} with the same 64-bit
6405unique ID. Note that the service sends the confirmation as soon as it has
6406locally processed the PUT request. The PUT may still be propagating through the
6407network at this time.
6408
6409In the future, we may want to change this to provide (limited) feedback to the
6410client, for example if we detect that the PUT operation had no effect because
6411the same key-value pair was already stored in the DHT. However, changing this
6412would also require additional state and messages in the P2P
6413interaction.
6414
6415@node GETting data from the DHT
6416@subsubsection GETting data from the DHT
6417
6418@c %**end of header
6419
To retrieve (GET) data from the DHT, the client sends a @code{struct
GNUNET_DHT_ClientGetMessage} to the service. The message specifies routing
options, a replication level (for replicating the GET, not the content), the
desired block type, the key, the (optional) extended query and a unique 64-bit
request ID.
6425
Additionally, the client may send any number of @code{struct
GNUNET_DHT_ClientGetResultSeenMessage}s to notify the service about results
that the client is already aware of. These messages consist of the key, the
unique 64-bit ID of the request, and an arbitrary number of hash codes over the
blocks that the client is already aware of. As messages are restricted to 64k,
a client that already knows more than about a thousand blocks may need to send
several of these messages. Naturally, the client should transmit these messages
as quickly as possible after the original GET request such that the DHT can
filter those results in the network early on. However, as these messages are
sent after the original request, it is conceivable that the DHT service may
return blocks that match those already known to the client anyway.
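
The figure of about a thousand blocks per message follows from the message
size limit and the size of a hash code. The calculation below is a sketch: the
512-bit (64-byte) hash size matches the hash codes used elsewhere in this
chapter, while the exact size limit of 65535 bytes and the fixed 80-byte part
of the message are assumptions made for this example, as are the function
names.

```c
#include <stddef.h>

#define MAX_MESSAGE_SIZE 65535u  /* assumed limit for one message ("64k") */
#define HASH_SIZE 64u            /* one 512-bit hash code */
#define SEEN_FIXED_SIZE 80u      /* assumed: header, key and request ID */

/* Number of hash codes that fit into a single message. */
size_t
hashes_per_message (void)
{
  return (MAX_MESSAGE_SIZE - SEEN_FIXED_SIZE) / HASH_SIZE;
}

/* Number of messages needed to report the given number of known blocks. */
size_t
messages_needed (size_t known_blocks)
{
  size_t per = hashes_per_message ();

  return (known_blocks + per - 1) / per;  /* round up */
}
```

Under these assumptions 1022 hash codes fit into one message, so a client
that knows 3000 blocks would need three messages.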
6437
6438In response to a GET request, the service will send @code{struct
6439GNUNET_DHT_ClientResultMessage}s to the client. These messages contain the
6440block type, expiration, key, unique ID of the request and of course the value
6441(a block). Depending on the options set for the respective operations, the
6442replies may also contain the path the GET and/or the PUT took through the
6443network.
6444
6445A client can stop receiving replies either by disconnecting or by sending a
6446@code{struct GNUNET_DHT_ClientGetStopMessage} which must contain the key and
6447the 64-bit unique ID of the original request. Using an explicit "stop" message
6448is more common as this allows a client to run many concurrent GET operations
6449over the same connection with the DHT service --- and to stop them
6450individually.
6451
6452@node Monitoring the DHT
6453@subsubsection Monitoring the DHT
6454
6455@c %**end of header
6456
6457To begin monitoring, the client sends a @code{struct
6458GNUNET_DHT_MonitorStartStop} message to the DHT service. In this message, flags
6459can be set to enable (or disable) monitoring of GET, PUT and RESULT messages
6460that pass through a peer. The message can also restrict monitoring to a
6461particular block type or a particular key. Once monitoring is enabled, the DHT
service will notify the client about any matching event using @code{struct
GNUNET_DHT_MonitorGetMessage}s for GET events, @code{struct
GNUNET_DHT_MonitorPutMessage}s for PUT events and @code{struct
GNUNET_DHT_MonitorGetRespMessage}s for RESULTs. Each of these messages contains
all of the information about the event.
6467
6468@node The DHT Peer-to-Peer Protocol
6469@subsection The DHT Peer-to-Peer Protocol
6470@c %**end of header
6471
6472
6473@menu
6474* Routing GETs or PUTs::
6475* PUTting data into the DHT2::
6476* GETting data from the DHT2::
6477@end menu
6478
6479@node Routing GETs or PUTs
6480@subsubsection Routing GETs or PUTs
6481
6482@c %**end of header
6483
When routing GETs or PUTs, the DHT service selects a suitable subset of
neighbours for forwarding. The exact number of neighbours can be zero or more
and depends on the hop counter of the query (initially zero) in relation to the
logarithm of the network size estimate, the desired replication level and the
peer's connectivity. Depending on the hop counter and our network size
estimate, the selection of the peers may be randomized or by proximity to the
key. Furthermore, requests include a set of peers that a request has already
traversed; those peers are excluded from the selection.
6492
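The proximity-based case can be illustrated with a minimal sketch. It assumes
64-bit identifiers and plain XOR distance purely for brevity; the function
name is hypothetical, and the real selection also takes the hop counter,
replication level and randomization into account.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the peer closest to `key` under XOR distance,
   skipping peers the request has already traversed; -1 if none remain. */
static int closest_peer(const uint64_t *peers, size_t n_peers,
                        const uint64_t *excluded, size_t n_excluded,
                        uint64_t key)
{
  int best = -1;
  uint64_t best_dist = 0;

  for (size_t i = 0; i < n_peers; i++) {
    int skip = 0;

    for (size_t j = 0; j < n_excluded; j++)
      if (peers[i] == excluded[j]) {
        skip = 1;
        break;
      }
    if (skip)
      continue;
    uint64_t dist = peers[i] ^ key;
    if ((best < 0) || (dist < best_dist)) {
      best = (int) i;
      best_dist = dist;
    }
  }
  return best;
}
```

With peers 0x10, 0x22 and 0x2F and key 0x20, the sketch picks 0x22; once 0x22
is in the excluded set, it falls back to 0x2F.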
6493@node PUTting data into the DHT2
@subsubsection PUTting data into the DHT
6495
6496@c %**end of header
6497
6498To PUT data into the DHT, the service sends a @code{struct PeerPutMessage} of
6499type @code{GNUNET_MESSAGE_TYPE_DHT_P2P_PUT} to the respective neighbour. In
6500addition to the usual information about the content (type, routing options,
6501desired replication level for the content, expiration time, key and value), the
6502message contains a fixed-size Bloom filter with information about which peers
6503(may) have already seen this request. This Bloom filter is used to ensure that
6504DHT messages never loop back to a peer that has already processed the request.
6505Additionally, the message includes the current hop counter and, depending on
6506the routing options, the message may include the full path that the message has
6507taken so far. The Bloom filter should already contain the identity of the
6508previous hop; however, the path should not include the identity of the previous
6509hop and the receiver should append the identity of the sender to the path, not
6510its own identity (this is done to reduce bandwidth).
6511
6512@node GETting data from the DHT2
@subsubsection GETting data from the DHT
6514
6515@c %**end of header
6516
A peer can search the DHT by sending @code{struct PeerGetMessage}s of type
@code{GNUNET_MESSAGE_TYPE_DHT_P2P_GET} to other peers. In addition to the usual
information about the request (type, routing options, desired replication level
for the request, the key and the extended query), a GET request also
contains a hop counter, a Bloom filter over the peers that have already
processed the request and, depending on the routing options, the full path
traversed by the GET. Finally, a GET request includes a variable-size second
Bloom filter and a so-called Bloom filter mutator value which together indicate
which replies the sender has already seen. During the lookup, each block that
matches the block type, key and extended query is additionally subjected to a
test against this Bloom filter. The block plugin is expected to take the hash
of the block, combine it with the mutator value and check that the result is
not yet in the Bloom filter. The originator of the query will from time to time
modify the mutator to (eventually) allow false positives filtered by the Bloom
filter to be returned.
6532
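The interplay between the reply Bloom filter and the mutator can be sketched
as follows. This is a toy model under stated assumptions: a fixed 256-bit
filter, 64-bit block hashes, XOR as the way of combining hash and mutator, and
a non-cryptographic mixing function. None of these names or parameters come
from the GNUnet block plugin API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FILTER_BITS 256u /* illustrative size; the real filter is variable-size */
#define FILTER_HASHES 3u /* bits set per entry, chosen arbitrarily */

struct ReplyFilter {
  uint8_t bits[FILTER_BITS / 8];
  uint32_t mutator;
};

/* splitmix64 finalizer, standing in for a cryptographic hash. */
static uint64_t mix64(uint64_t x)
{
  x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
  x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
  return x ^ (x >> 31);
}

/* Position of the i-th bit for a block hash combined with the mutator. */
static uint32_t bit_index(const struct ReplyFilter *f, uint64_t block_hash,
                          uint32_t i)
{
  uint64_t combined = block_hash ^ (uint64_t) f->mutator;

  return (uint32_t) (mix64(combined + i * 0x9e3779b97f4a7c15ULL) % FILTER_BITS);
}

/* Clears the filter and installs a new mutator value. */
static void filter_reset(struct ReplyFilter *f, uint32_t mutator)
{
  memset(f->bits, 0, sizeof(f->bits));
  f->mutator = mutator;
}

/* Records a block as already seen. */
static void filter_add(struct ReplyFilter *f, uint64_t block_hash)
{
  for (uint32_t i = 0; i < FILTER_HASHES; i++) {
    uint32_t b = bit_index(f, block_hash, i);

    f->bits[b / 8] |= (uint8_t) (1u << (b % 8));
  }
}

/* Returns 1 if the block may have been seen, 0 if it is definitely new. */
static int filter_maybe_seen(const struct ReplyFilter *f, uint64_t block_hash)
{
  for (uint32_t i = 0; i < FILTER_HASHES; i++) {
    uint32_t b = bit_index(f, block_hash, i);

    if (0 == (f->bits[b / 8] & (1u << (b % 8))))
      return 0;
  }
  return 1;
}
```

Because the mutator changes which bits a given block maps to, a block that
collided with known results under one mutator is unlikely to collide again
after the originator rebuilds the filter with a different one.
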
Peers that receive a GET request perform a local lookup (depending on their
proximity to the key and the query options) and forward the request to other
peers. They then remember the request (including the Bloom filter for blocking
duplicate results), and when they obtain a matching, non-filtered response, a
@code{struct PeerResultMessage} of type
@code{GNUNET_MESSAGE_TYPE_DHT_P2P_RESULT} is forwarded to the previous hop.
Whenever a result is forwarded, the block plugin is used to update the Bloom
filter accordingly, to ensure that the same result is never forwarded more than
once. The DHT service may also cache forwarded results locally if the
"CACHE_RESULTS" option is set to "YES" in the configuration.
6543
6544@node The GNU Name System (GNS)
6545@section The GNU Name System (GNS)
6546
6547@c %**end of header
6548
6549The GNU Name System (GNS) is a decentralized database that enables users to
6550securely resolve names to values. Names can be used to identify other users
6551(for example, in social networking), or network services (for example, VPN
6552services running at a peer in GNUnet, or purely IP-based services on the
6553Internet). Users interact with GNS by typing in a hostname that ends in ".gnu"
6554or ".zkey".
6555
Videos giving an overview of most of GNS and the motivations behind it are
available online. The remainder of this chapter targets developers who
are familiar with the high-level concepts of GNS as presented in these talks.
6559
6560GNS-aware applications should use the GNS resolver to obtain the respective
6561records that are stored under that name in GNS. Each record consists of a type,
6562value, expiration time and flags.
6563
The type specifies the format of the value. Types below 65536 correspond to DNS
record types; larger values are used for GNS-specific records. Applications can
define new GNS record types by reserving a number and implementing a plugin
(which mostly needs to convert the binary value representation to a
human-readable text format and vice versa). The expiration time specifies how
long the record is to be valid. The GNS API ensures that applications are only
given non-expired values. The flags are typically irrelevant for applications,
as GNS uses them internally to control visibility and validity of records.
6572
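As a rough illustration of the record model described above, the following
sketch uses a hypothetical record structure (not the one from the GNUnet
headers) to show the DNS versus GNS type split and the filtering of expired
records.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory record; field names are illustrative only. */
struct Record {
  uint32_t type;       /* format of the value */
  uint64_t expiration; /* absolute expiration time, unit chosen by the caller */
  uint32_t flags;      /* used internally by GNS */
  const void *value;
  size_t value_size;
};

/* Types below 65536 are DNS record types; larger ones are GNS-specific. */
static int is_dns_type(uint32_t type)
{
  return type < 65536u;
}

/* Moves non-expired records to the front; returns how many remain. */
static size_t keep_unexpired(struct Record *rd, size_t count, uint64_t now)
{
  size_t kept = 0;

  for (size_t i = 0; i < count; i++)
    if (rd[i].expiration > now)
      rd[kept++] = rd[i];
  return kept;
}
```
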
6573Records are stored along with a signature. The signature is generated using the
6574private key of the authoritative zone. This allows any GNS resolver to verify
6575the correctness of a name-value mapping.
6576
6577Internally, GNS uses the NAMECACHE to cache information obtained from other
6578users, the NAMESTORE to store information specific to the local users, and the
6579DHT to exchange data between users. A plugin API is used to enable applications
6580to define new GNS record types.
6581
6582@menu
6583* libgnunetgns::
6584* libgnunetgnsrecord::
6585* GNS plugins::
6586* The GNS Client-Service Protocol::
6587* Hijacking the DNS-Traffic using gnunet-service-dns::
6588* Serving DNS lookups via GNS on W32::
6589@end menu
6590
6591@node libgnunetgns
6592@subsection libgnunetgns
6593
6594@c %**end of header
6595
The GNS API itself is extremely simple. Clients first connect to the GNS service
6597using @code{GNUNET_GNS_connect}. They can then perform lookups using
6598@code{GNUNET_GNS_lookup} or cancel pending lookups using
6599@code{GNUNET_GNS_lookup_cancel}. Once finished, clients disconnect using
6600@code{GNUNET_GNS_disconnect}.
6601
6602
6603@menu
6604* Looking up records::
6605* Accessing the records::
6606* Creating records::
6607* Future work::
6608@end menu
6609
6610@node Looking up records
6611@subsubsection Looking up records
6612
6613@c %**end of header
6614
6615@code{GNUNET_GNS_lookup} takes a number of arguments:
6616
@table @asis
@item handle
This is simply the GNS connection handle from @code{GNUNET_GNS_connect}.
@item name
The client needs to specify the name to be resolved. This can be any valid DNS
or GNS hostname.
@item zone
The client needs to specify the public key of the GNS zone against which the
resolution should be done (the ".gnu" zone). Note that a key must be provided,
even if the name ends in ".zkey". This should typically be the public key of
the master-zone of the user.
@item type
This is the desired GNS or DNS record type to look for. While all records for
the given name will be returned, this can be important if the client wants to
resolve record types that themselves delegate resolution, such as CNAME, PKEY
or GNS2DNS. Resolving a record of any of these types will only work if the
respective record type is specified in the request, as the GNS resolver will
otherwise follow the delegation and return the records from the respective
destination, instead of the delegating record.
@item only_cached
This argument should typically be set to @code{GNUNET_NO}. Setting it to
@code{GNUNET_YES} disables resolution via the overlay network.
@item shorten_zone_key
If GNS encounters new names during resolution, their respective zones can
automatically be learned and added to the "shorten zone". If this is desired,
clients must pass the private key of the shorten zone. If NULL is passed,
shortening is disabled.
@item proc
This argument identifies the function to call with the result. It is given
@code{proc_cls}, the number of records found (possibly zero) and the array of
the records as arguments. @code{proc} will only be called once. After
@code{proc} has been called, the lookup must no longer be cancelled.
@item proc_cls
The closure for @code{proc}.
@end table
6647
6648@node Accessing the records
6649@subsubsection Accessing the records
6650
6651@c %**end of header
6652
6653The @code{libgnunetgnsrecord} library provides an API to manipulate the GNS
6654record array that is given to proc. In particular, it offers functions such as
6655converting record values to human-readable strings (and back). However, most
6656@code{libgnunetgnsrecord} functions are not interesting to GNS client
6657applications.
6658
6659For DNS records, the @code{libgnunetdnsparser} library provides functions for
6660parsing (and serializing) common types of DNS records.
6661
6662@node Creating records
6663@subsubsection Creating records
6664
6665@c %**end of header
6666
6667Creating GNS records is typically done by building the respective record
6668information (possibly with the help of @code{libgnunetgnsrecord} and
6669@code{libgnunetdnsparser}) and then using the @code{libgnunetnamestore} to
6670publish the information. The GNS API is not involved in this
6671operation.
6672
6673@node Future work
6674@subsubsection Future work
6675
6676@c %**end of header
6677
6678In the future, we want to expand @code{libgnunetgns} to allow applications to
6679observe shortening operations performed during GNS resolution, for example so
6680that users can receive visual feedback when this happens.
6681
6682@node libgnunetgnsrecord
6683@subsection libgnunetgnsrecord
6684
6685@c %**end of header
6686
6687The @code{libgnunetgnsrecord} library is used to manipulate GNS records (in
6688plaintext or in their encrypted format). Applications mostly interact with
6689@code{libgnunetgnsrecord} by using the functions to convert GNS record values
6690to strings or vice-versa, or to lookup a GNS record type number by name (or
6691vice-versa). The library also provides various other functions that are mostly
6692used internally within GNS, such as converting keys to names, checking for
6693expiration, encrypting GNS records to GNS blocks, verifying GNS block
6694signatures and decrypting GNS records from GNS blocks.
6695
We will now discuss the four commonly used functions of the API.
@code{libgnunetgnsrecord} does not perform these operations itself, but instead
uses plugins to perform the operation. GNUnet includes plugins to support
common DNS record types as well as standard GNS record types.
6700
6701
6702@menu
6703* Value handling::
6704* Type handling::
6705@end menu
6706
6707@node Value handling
6708@subsubsection Value handling
6709
6710@c %**end of header
6711
6712@code{GNUNET_GNSRECORD_value_to_string} can be used to convert the (binary)
6713representation of a GNS record value to a human readable, 0-terminated UTF-8
6714string. NULL is returned if the specified record type is not supported by any
6715available plugin.
6716
6717@code{GNUNET_GNSRECORD_string_to_value} can be used to try to convert a human
6718readable string to the respective (binary) representation of a GNS record
6719value.
6720
6721@node Type handling
6722@subsubsection Type handling
6723
6724@c %**end of header
6725
@code{GNUNET_GNSRECORD_typename_to_number} can be used to obtain the numeric
value associated with a given typename. For example, given the typename "A"
(for DNS A records), the function will return the number 1. A list of common
DNS record types is available
@uref{http://en.wikipedia.org/wiki/List_of_DNS_record_types, here}. Note that
not all DNS record types are supported by GNUnet GNSRECORD plugins at this
time.
6733
6734@code{GNUNET_GNSRECORD_number_to_typename} can be used to obtain the typename
6735associated with a given numeric value. For example, given the type number 1,
6736the function will return the typename "A".
6737
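The two conversions amount to a table lookup in each direction. The sketch
below is a standalone illustration with hypothetical function names; it only
knows a handful of standard DNS type numbers, uses case-sensitive matching,
and signals an unknown typename with UINT32_MAX, which is an assumption of
this example rather than a statement about the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct TypeName {
  const char *name;
  uint32_t number;
};

/* A few well-known DNS record types and their assigned numbers. */
static const struct TypeName type_table[] = {
  { "A", 1 },    { "NS", 2 },  { "CNAME", 5 }, { "SOA", 6 },
  { "PTR", 12 }, { "MX", 15 }, { "TXT", 16 },  { "AAAA", 28 },
};

#define TYPE_COUNT (sizeof(type_table) / sizeof(type_table[0]))

/* Returns the number for a typename, or UINT32_MAX if it is unknown. */
static uint32_t typename_to_number(const char *name)
{
  for (size_t i = 0; i < TYPE_COUNT; i++)
    if (0 == strcmp(name, type_table[i].name))
      return type_table[i].number;
  return UINT32_MAX;
}

/* Returns the typename for a number, or NULL if it is unknown. */
static const char *number_to_typename(uint32_t number)
{
  for (size_t i = 0; i < TYPE_COUNT; i++)
    if (number == type_table[i].number)
      return type_table[i].name;
  return NULL;
}
```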
6738@node GNS plugins
6739@subsection GNS plugins
6740
6741@c %**end of header
6742
6743Adding a new GNS record type typically involves writing (or extending) a
6744GNSRECORD plugin. The plugin needs to implement the
6745@code{gnunet_gnsrecord_plugin.h} API which provides basic functions that are
6746needed by GNSRECORD to convert typenames and values of the respective record
6747type to strings (and back). These gnsrecord plugins are typically implemented
6748within their respective subsystems. Examples for such plugins can be found in
6749the GNSRECORD, GNS and CONVERSATION subsystems.
6750
The @code{libgnunetgnsrecord} library is then used to locate, load and query
the appropriate gnsrecord plugin. Which plugin is appropriate is determined by
the record type (which is just a 32-bit integer). The @code{libgnunetgnsrecord}
library loads all gnsrecord plugins that are installed at the local peer and
forwards the application request to the plugins. If the record type is not
supported by the plugin, it should simply return an error code.

The central functions of the gnsrecord APIs (plugin and main library) are the
same four functions for converting between values and strings, and typenames
and numbers documented in the previous subsection.
6761
6762@node The GNS Client-Service Protocol
6763@subsection The GNS Client-Service Protocol
6764
6765@c %**end of header
6766
6767The GNS client-service protocol consists of two simple messages, the
6768@code{LOOKUP} message and the @code{LOOKUP_RESULT}. Each @code{LOOKUP} message
6769contains a unique 32-bit identifier, which will be included in the
6770corresponding response. Thus, clients can send many lookup requests in parallel
6771and receive responses out-of-order. A @code{LOOKUP} request also includes the
6772public key of the GNS zone, the desired record type and fields specifying
6773whether shortening is enabled or networking is disabled. Finally, the
6774@code{LOOKUP} message includes the name to be resolved.
6775
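Matching out-of-order responses to requests by identifier can be sketched as
below. The structure and function names are hypothetical, the table size is
arbitrary, and the sketch ignores identifier wrap-around; it is only meant to
illustrate why a per-request ID is sufficient for parallel lookups.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PENDING 8

struct PendingLookup {
  int in_use;
  uint32_t id;
  const char *name;
};

struct LookupClient {
  struct PendingLookup pending[MAX_PENDING];
  uint32_t next_id;
};

static void client_init(struct LookupClient *c)
{
  memset(c, 0, sizeof(*c));
}

/* Registers a lookup and stores its ID in *id.
   Returns 0 on success, -1 if the table is full. */
static int lookup_start(struct LookupClient *c, const char *name, uint32_t *id)
{
  for (size_t i = 0; i < MAX_PENDING; i++) {
    if (c->pending[i].in_use)
      continue;
    c->pending[i].in_use = 1;
    c->pending[i].id = c->next_id++;
    c->pending[i].name = name;
    *id = c->pending[i].id;
    return 0;
  }
  return -1;
}

/* Matches a response to its request and releases the slot.
   Returns the requested name, or NULL for an unknown ID. */
static const char *lookup_finish(struct LookupClient *c, uint32_t id)
{
  for (size_t i = 0; i < MAX_PENDING; i++) {
    if (c->pending[i].in_use && (c->pending[i].id == id)) {
      c->pending[i].in_use = 0;
      return c->pending[i].name;
    }
  }
  return NULL;
}
```
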
6776The response includes the number of records and the records themselves in the
6777format created by @code{GNUNET_GNSRECORD_records_serialize}. They can thus be
6778deserialized using @code{GNUNET_GNSRECORD_records_deserialize}.
6779
6780@node Hijacking the DNS-Traffic using gnunet-service-dns
6781@subsection Hijacking the DNS-Traffic using gnunet-service-dns
6782
6783@c %**end of header
6784
This section documents how gnunet-service-dns (and gnunet-helper-dns)
intercept DNS queries from the local system. This is merely one method for
obtaining GNS queries. It is also possible to change @code{resolv.conf}
to point to a machine running @code{gnunet-dns2gns} or to modify libc's name
system switch (NSS) configuration to include a GNS resolution plugin. The
method described in this section is more of a last-ditch catch-all approach.
6791
6792@code{gnunet-service-dns} enables intercepting DNS traffic using policy based
6793routing. We MARK every outgoing DNS-packet if it was not sent by our
6794application. Using a second routing table in the Linux kernel these marked
6795packets are then routed through our virtual network interface and can thus be
6796captured unchanged.
6797
6798Our application then reads the query and decides how to handle it: A query to
6799an address ending in ".gnu" or ".zkey" is hijacked by @code{gnunet-service-gns}
6800and resolved internally using GNS. In the future, a reverse query for an
6801address of the configured virtual network could be answered with records kept
6802about previous forward queries. Queries that are not hijacked by some
6803application using the DNS service will be sent to the original recipient. The
6804answer to the query will always be sent back through the virtual interface with
6805the original nameserver as source address.
6806
6807
6808@menu
6809* Network Setup Details::
6810@end menu
6811
6812@node Network Setup Details
6813@subsubsection Network Setup Details
6814
6815@c %**end of header
6816
6817The DNS interceptor adds the following rules to the Linux kernel:
@example
iptables -t mangle -I OUTPUT 1 -p udp --sport $LOCALPORT --dport 53 -j ACCEPT
iptables -t mangle -I OUTPUT 2 -p udp --dport 53 -j MARK --set-mark 3
ip rule add fwmark 3 table2
ip route add default via $VIRTUALDNS table2
@end example

Line 1 makes sure that all packets coming from a port our application opened
beforehand (@code{$LOCALPORT}) will be routed normally. Line 2 marks every
other packet to a DNS-Server with mark 3 (chosen arbitrarily). Lines 3 and 4
add a routing policy based on this mark: marked packets are routed via the
second routing table, whose default route points at @code{$VIRTUALDNS}.
6828
6829@node Serving DNS lookups via GNS on W32
6830@subsection Serving DNS lookups via GNS on W32
6831
6832@c %**end of header
6833
This section documents how libw32nsp (and gnunet-gns-helper-service-w32)
resolve DNS queries on the local system. This only applies to GNUnet
running on W32.
6837
6838W32 has a concept of "Namespaces" and "Namespace providers". These are used to
6839present various name systems to applications in a generic way. Namespaces
6840include DNS, mDNS, NLA and others. For each namespace any number of providers
6841could be registered, and they are queried in an order of priority (which is
6842adjustable).
6843
Applications can resolve names by using the WSALookupService*() family of
functions.
6846
6847However, these are WSA-only facilities. Common BSD socket functions for
6848namespace resolutions are gethostbyname and getaddrinfo (among others). These
6849functions are implemented internally (by default - by mswsock, which also
6850implements the default DNS provider) as wrappers around WSALookupService*()
6851functions (see "Sample Code for a Service Provider" on MSDN).
6852
6853On W32 GNUnet builds a libw32nsp - a namespace provider, which can then be
6854installed into the system by using w32nsp-install (and uninstalled by
6855w32nsp-uninstall), as described in "Installation Handbook".
6856
6857libw32nsp is very simple and has almost no dependencies. As a response to
6858NSPLookupServiceBegin(), it only checks that the provider GUID passed to it by
6859the caller matches GNUnet DNS Provider GUID, checks that name being resolved
6860ends in ".gnu" or ".zkey", then connects to gnunet-gns-helper-service-w32 at
6861127.0.0.1:5353 (hardcoded) and sends the name resolution request there,
6862returning the connected socket to the caller.
6863
6864When the caller invokes NSPLookupServiceNext(), libw32nsp reads a completely
6865formed reply from that socket, unmarshalls it, then gives it back to the
6866caller.
6867
At the moment gnunet-gns-helper-service-w32 only ever gives one reply, and
subsequent calls to NSPLookupServiceNext() will fail with WSA_NODATA (the
first call to NSPLookupServiceNext() might also fail if GNS failed to find the
name, or if there was an error connecting to it).
6872
6873gnunet-gns-helper-service-w32 does most of the processing:
6874
6875@itemize @bullet
6876@item Maintains a connection to GNS.
6877@item Reads GNS config and loads appropriate keys.
6878@item Checks service GUID and decides on the type of record to look up,
6879refusing to make a lookup outright when unsupported service GUID is passed.
6880@item Launches the lookup
6881@end itemize
6882
6883When lookup result arrives, gnunet-gns-helper-service-w32 forms a complete
6884reply (including filling a WSAQUERYSETW structure and, possibly, a binary blob
6885with a hostent structure for gethostbyname() client), marshalls it, and sends
6886it back to libw32nsp. If no records were found, it sends an empty header.
6887
This works for most normal applications that use gethostbyname() or
getaddrinfo() to resolve names, but does nothing for applications that
use alternative means of resolving names (such as sending queries to a DNS
server directly by themselves). This includes some well-known utilities,
like "ping" and "nslookup".
6893
6894@node The GNS Namecache
6895@section The GNS Namecache
6896
6897@c %**end of header
6898
6899The NAMECACHE subsystem is responsible for caching (encrypted) resolution
6900results of the GNU Name System (GNS). GNS makes zone information available to
6901other users via the DHT. However, as accessing the DHT for every lookup is
6902expensive (and as the DHT's local cache is lost whenever the peer is
6903restarted), GNS uses the NAMECACHE as a more persistent cache for DHT lookups.
Thus, instead of always looking up every name in the DHT, GNS first checks if
the result is already available locally in the NAMECACHE. Only if there is no
result in the NAMECACHE does GNS query the DHT. The NAMECACHE stores data in
the same (encrypted) format as the DHT. It thus makes no sense to iterate over
all items in the NAMECACHE, as the NAMECACHE does not have a way to provide the
keys required to decrypt the entries.
6910
Blocks in the NAMECACHE share the same expiration mechanism as blocks in the
DHT: a block expires whenever any of the records in the (encrypted) block
expires. The expiration time of the block is the only information stored in
plaintext. The NAMECACHE service internally performs all of the required work
to expire blocks; clients do not have to worry about this. Also, given that
NAMECACHE stores only GNS blocks that local users requested, there is no
configuration option to limit the size of the NAMECACHE. It is assumed to be
always small enough (a few MB) to fit on the drive.
6919
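Since a block expires as soon as any record in it expires, its expiration time
is the minimum over the expiration times of its records. A minimal sketch,
with hypothetical function names and absolute timestamps as plain integers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The block expires when the first of its records expires. */
static uint64_t block_expiration(const uint64_t *record_expirations,
                                 size_t count)
{
  assert(count > 0);
  uint64_t earliest = record_expirations[0];

  for (size_t i = 1; i < count; i++)
    if (record_expirations[i] < earliest)
      earliest = record_expirations[i];
  return earliest;
}

/* Assumption of this sketch: a block counts as expired once `now`
   reaches its expiration time. */
static int block_is_expired(uint64_t expiration, uint64_t now)
{
  return expiration <= now;
}
```
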
6920The NAMECACHE supports the use of different database backends via a plugin API.
6921
6922@menu
6923* libgnunetnamecache::
6924* The NAMECACHE Client-Service Protocol::
6925* The NAMECACHE Plugin API::
6926@end menu
6927
6928@node libgnunetnamecache
6929@subsection libgnunetnamecache
6930
6931@c %**end of header
6932
6933The NAMECACHE API consists of five simple functions. First, there is
6934@code{GNUNET_NAMECACHE_connect} to connect to the NAMECACHE service. This
6935returns the handle required for all other operations on the NAMECACHE. Using
6936@code{GNUNET_NAMECACHE_block_cache} clients can insert a block into the cache.
@code{GNUNET_NAMECACHE_lookup_block} can be used to look up blocks that were
6938stored in the NAMECACHE. Both operations can be cancelled using
6939@code{GNUNET_NAMECACHE_cancel}. Note that cancelling a
6940@code{GNUNET_NAMECACHE_block_cache} operation can result in the block being
6941stored in the NAMECACHE --- or not. Cancellation primarily ensures that the
6942continuation function with the result of the operation will no longer be
6943invoked. Finally, @code{GNUNET_NAMECACHE_disconnect} closes the connection to
6944the NAMECACHE.
6945
6946The maximum size of a block that can be stored in the NAMECACHE is
6947@code{GNUNET_NAMECACHE_MAX_VALUE_SIZE}, which is defined to be 63 kB.
6948
6949@node The NAMECACHE Client-Service Protocol
6950@subsection The NAMECACHE Client-Service Protocol
6951
6952@c %**end of header
6953
6954All messages in the NAMECACHE IPC protocol start with the @code{struct
6955GNUNET_NAMECACHE_Header} which adds a request ID (32-bit integer) to the
6956standard message header. The request ID is used to match requests with the
6957respective responses from the NAMECACHE, as they are allowed to happen
6958out-of-order.
6959
6960
6961@menu
6962* Lookup::
6963* Store::
6964@end menu
6965
6966@node Lookup
6967@subsubsection Lookup
6968
6969@c %**end of header
6970
The @code{struct LookupBlockMessage} is used to look up a block stored in the
6972cache. It contains the query hash. The NAMECACHE always responds with a
6973@code{struct LookupBlockResponseMessage}. If the NAMECACHE has no response, it
6974sets the expiration time in the response to zero. Otherwise, the response is
6975expected to contain the expiration time, the ECDSA signature, the derived key
6976and the (variable-size) encrypted data of the block.
6977
6978@node Store
6979@subsubsection Store
6980
6981@c %**end of header
6982
6983The @code{struct BlockCacheMessage} is used to cache a block in the NAMECACHE.
6984It has the same structure as the @code{struct LookupBlockResponseMessage}. The
6985service responds with a @code{struct BlockCacheResponseMessage} which contains
6986the result of the operation (success or failure). In the future, we might want
6987to make it possible to provide an error message as well.
6988
6989@node The NAMECACHE Plugin API
6990@subsection The NAMECACHE Plugin API
6991@c %**end of header
6992
The NAMECACHE plugin API consists of two functions, @code{cache_block} to store
a block in the database, and @code{lookup_block} to look up a block in the
database.
6996
6997
6998@menu
6999* Lookup2::
7000* Store2::
7001@end menu
7002
7003@node Lookup2
@subsubsection Lookup
7005
7006@c %**end of header
7007
7008The @code{lookup_block} function is expected to return at most one block to the
7009iterator, and return @code{GNUNET_NO} if there were no non-expired results. If
7010there are multiple non-expired results in the cache, the lookup is supposed to
7011return the result with the largest expiration time.
7012
7013@node Store2
@subsubsection Store
7015
7016@c %**end of header
7017
The @code{cache_block} function is expected to try to store the block in the
database, and return @code{GNUNET_SYSERR} if this was not possible for any
reason. Furthermore, @code{cache_block} is expected to implicitly perform cache
maintenance and purge blocks from the cache that have expired. Note that
@code{cache_block} might encounter the case where the database already has
another block stored under the same key. In this case, the plugin must ensure
that the block with the larger expiration time is preserved. Obviously, this
can be done either by simply adding new blocks and selecting for the most
recent expiration time during lookup, or by checking which block is more recent
during the store operation.
7028
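The second strategy (checking during the store operation) can be sketched with
a small in-memory table. This is not a real NAMECACHE plugin: the structures
are hypothetical, the block payload is omitted, keys are 64-bit integers
instead of hashes, and plain integer return codes stand in for the GNUnet
constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS 16

struct CachedBlock {
  int used;
  uint64_t key;        /* stands in for the query hash */
  uint64_t expiration; /* absolute expiration time */
};

struct Cache {
  struct CachedBlock slots[CACHE_SLOTS];
};

static void cache_init(struct Cache *c)
{
  memset(c, 0, sizeof(*c));
}

/* Stores a block, purging expired entries on the way.
   Returns 0 on success, -1 if no slot is available. */
static int cache_block(struct Cache *c, uint64_t key, uint64_t expiration,
                       uint64_t now)
{
  struct CachedBlock *free_slot = NULL;
  struct CachedBlock *same_key = NULL;

  for (size_t i = 0; i < CACHE_SLOTS; i++) {
    struct CachedBlock *b = &c->slots[i];

    if (b->used && (b->expiration <= now))
      b->used = 0; /* implicit cache maintenance */
    if (!b->used) {
      if (NULL == free_slot)
        free_slot = b;
    } else if (b->key == key) {
      same_key = b;
    }
  }
  if (NULL != same_key) {
    /* Keep whichever block expires later; a real plugin would also
       replace the stored payload here. */
    if (expiration > same_key->expiration)
      same_key->expiration = expiration;
    return 0;
  }
  if (NULL == free_slot)
    return -1;
  free_slot->used = 1;
  free_slot->key = key;
  free_slot->expiration = expiration;
  return 0;
}

/* Returns 1 and sets *expiration if a non-expired block exists, else 0. */
static int lookup_block(const struct Cache *c, uint64_t key, uint64_t now,
                        uint64_t *expiration)
{
  for (size_t i = 0; i < CACHE_SLOTS; i++) {
    const struct CachedBlock *b = &c->slots[i];

    if (b->used && (b->key == key) && (b->expiration > now)) {
      *expiration = b->expiration;
      return 1;
    }
  }
  return 0;
}
```

Because at most one block per key is kept, the lookup never has to choose
between several candidates.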
7029@node The REVOCATION Subsystem
7030@section The REVOCATION Subsystem
7031@c %**end of header
7032
The REVOCATION subsystem is responsible for key revocation of Egos. If a user
learns that their private key has been compromised, or has lost it, they can
use the REVOCATION system to inform all of the other users that this private
key is no longer valid. The subsystem thus includes ways to query for the
validity of keys and to propagate revocation messages.
7038
7039@menu
7040* Dissemination::
7041* Revocation Message Design Requirements::
7042* libgnunetrevocation::
7043* The REVOCATION Client-Service Protocol::
7044* The REVOCATION Peer-to-Peer Protocol::
7045@end menu
7046
7047@node Dissemination
7048@subsection Dissemination
7049
7050@c %**end of header
7051
7052When a revocation is performed, the revocation is first of all disseminated by
7053flooding the overlay network. The goal is to reach every peer, so that when a
7054peer needs to check if a key has been revoked, this will be purely a local
operation where the peer looks at its local revocation list. Flooding the
7056network is also the most robust form of key revocation --- an adversary would
7057have to control a separator of the overlay graph to restrict the propagation of
7058the revocation message. Flooding is also very easy to implement --- peers that
7059receive a revocation message for a key that they have never seen before simply
7060pass the message to all of their neighbours.
7061
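The flooding rule can be illustrated on a small static graph. The sketch below
assumes an adjacency-matrix network of at most eight peers and hypothetical
names; it counts every transmission, including those to neighbours that have
already seen the message, which is where the per-edge cost comes from.

```c
#include <assert.h>
#include <string.h>

#define MAX_PEERS 8

struct Network {
  int n;                         /* number of peers in use */
  int adj[MAX_PEERS][MAX_PEERS]; /* adj[a][b] != 0 if a and b are neighbours */
};

static void network_init(struct Network *net, int n)
{
  assert((n > 0) && (n <= MAX_PEERS));
  memset(net, 0, sizeof(*net));
  net->n = n;
}

static void add_edge(struct Network *net, int a, int b)
{
  net->adj[a][b] = 1;
  net->adj[b][a] = 1;
}

/* Floods a message from `origin`. seen[i] is set to 1 for every peer
   that receives it. Returns the number of messages transmitted. */
static int flood(const struct Network *net, int origin, int *seen)
{
  int queue[MAX_PEERS];
  int head = 0;
  int tail = 0;
  int messages = 0;

  for (int i = 0; i < net->n; i++)
    seen[i] = 0;
  seen[origin] = 1;
  queue[tail++] = origin;
  while (head < tail) {
    int p = queue[head++];

    /* A peer seeing the message for the first time passes it
       to all of its neighbours. */
    for (int q = 0; q < net->n; q++) {
      if (!net->adj[p][q])
        continue;
      messages++;
      if (!seen[q]) {
        seen[q] = 1;
        queue[tail++] = q;
      }
    }
  }
  return messages;
}
```

On a line of three peers (two edges) plus one disconnected peer, the sketch
transmits four messages, two per edge, and the disconnected peer is never
reached.
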
7062Flooding can only distribute the revocation message to peers that are online.
7063In order to notify peers that join the network later, the revocation service
7064performs efficient set reconciliation over the sets of known revocation
7065messages whenever two peers (that both support REVOCATION dissemination)
7066connect. The SET service is used to perform this operation
7067efficiently.
7068
7069@node Revocation Message Design Requirements
7070@subsection Revocation Message Design Requirements
7071
7072@c %**end of header
7073
7074However, flooding is also quite costly, creating O(|E|) messages on a network
7075with |E| edges. Thus, revocation messages are required to contain a
7076proof-of-work, the result of an expensive computation (which, however, is cheap
7077to verify). Only peers that have expended the CPU time necessary to provide
7078this proof will be able to flood the network with the revocation message. This
7079ensures that an attacker cannot simply flood the network with millions of
revocation messages. The proof-of-work required by GNUnet is set to take days
on a typical PC to compute; if the ability to quickly revoke a key is needed,
users have the option to pre-compute revocation messages to store off-line and
use instantly after their key has been compromised.
7084
7085Revocation messages must also be signed by the private key that is being
7086revoked. Thus, they can only be created while the private key is in the
7087possession of the respective user. This is another reason to create a
7088revocation message ahead of time and store it in a secure location.
7089
7090@node libgnunetrevocation
7091@subsection libgnunetrevocation
7092
7093@c %**end of header
7094
7095The REVOCATION API consists of two parts, to query and to issue
7096revocations.
7097
7098
7099@menu
7100* Querying for revoked keys::
7101* Preparing revocations::
7102* Issuing revocations::
7103@end menu
7104
7105@node Querying for revoked keys
7106@subsubsection Querying for revoked keys
7107
7108@c %**end of header
7109
7110@code{GNUNET_REVOCATION_query} is used to check if a given ECDSA public key has
7111been revoked. The given callback will be invoked with the result of the check.
7112The query can be cancelled using @code{GNUNET_REVOCATION_query_cancel} on the
7113return value.
7114
7115@node Preparing revocations
7116@subsubsection Preparing revocations
7117
7118@c %**end of header
7119
It is often desirable to create a revocation record ahead of time and store it
in an off-line location to be used later in an emergency. This is particularly
true for GNUnet revocations, where performing the revocation operation itself
is computationally expensive and thus likely to take some time. Users who want
to be able to revoke quickly in an emergency must therefore pre-compute the
revocation message. The revocation API enables this with two functions that
compute the parts of the revocation message without triggering the actual
revocation operation.
7128
@code{GNUNET_REVOCATION_check_pow} should be used to calculate the
proof-of-work required in the revocation message. This function takes the
public key, the required number of bits for the proof of work (which in GNUnet
is a network-wide constant) and finally a proof-of-work number as arguments.
The function then checks if the given proof-of-work number is a valid proof of
work for the given public key. Clients preparing a revocation are expected to
call this function repeatedly (typically with a monotonically increasing
sequence of proof-of-work numbers) until a number satisfies the check. That
number should then be saved for later use in the revocation operation.
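
The search loop described above can be sketched as follows. This is only an
illustration in Python, not the library's implementation: the function names
are hypothetical, and SHA-512 over the number and the key is an assumption
standing in for whatever hash the real proof-of-work check uses.

```python
import hashlib


def count_leading_zero_bits(digest):
    """Count the zero bits at the start of a byte string."""
    count = 0
    for byte in digest:
        if byte == 0:
            count += 8
            continue
        # bit_length() is the position of the highest set bit.
        count += 8 - byte.bit_length()
        break
    return count


def check_pow(public_key, pow_number, matching_bits):
    """Return True if pow_number is a valid proof of work for public_key.

    Assumption: SHA-512 over (number || key) stands in for the real hash.
    """
    data = pow_number.to_bytes(8, "big") + public_key
    digest = hashlib.sha512(data).digest()
    return count_leading_zero_bits(digest) >= matching_bits


def find_pow(public_key, matching_bits, start=0):
    """Try a monotonically increasing sequence of numbers until one passes."""
    pow_number = start
    while not check_pow(public_key, pow_number, matching_bits):
        pow_number += 1
    return pow_number
```

Each additional required bit doubles the expected number of iterations, which
is why a sufficiently large network-wide constant makes the search take days.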
7139
7140@code{GNUNET_REVOCATION_sign_revocation} is used to generate the signature that
7141is required in a revocation message. It takes the private key that (possibly in
7142the future) is to be revoked and returns the signature. The signature can again
7143be saved to disk for later use, which will then allow performing a revocation
7144even without access to the private key.
7145
7146@node Issuing revocations
7147@subsubsection Issuing revocations
7148
7149
Given an ECDSA public key, the signature from
@code{GNUNET_REVOCATION_sign_revocation} and the proof-of-work,
@code{GNUNET_REVOCATION_revoke} can be used to perform the actual revocation.
The given callback is called upon completion of the operation.
@code{GNUNET_REVOCATION_revoke_cancel} can be used to stop the library from
calling the continuation; however, in that case it is undefined whether or not
the revocation operation will be executed.
7156
7157@node The REVOCATION Client-Service Protocol
7158@subsection The REVOCATION Client-Service Protocol
7159
7160
7161The REVOCATION protocol consists of four simple messages.
7162
7163A @code{QueryMessage} containing a public ECDSA key is used to check if a
7164particular key has been revoked. The service responds with a
7165@code{QueryResponseMessage} which simply contains a bit that says if the given
7166public key is still valid, or if it has been revoked.
7167
The second possible interaction is for a client to revoke a key by passing a
@code{RevokeMessage} to the service. The @code{RevokeMessage} contains the
ECDSA public key to be revoked, a signature by the corresponding private key
and the proof-of-work. The service responds with a
@code{RevocationResponseMessage} which either indicates that the
@code{RevokeMessage} was invalid (for example, because the proof of work was
incorrect) or that the revocation has been processed successfully.
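
The two request/response pairs can be modelled with a toy in-memory service.
This is a sketch under stated assumptions: the message names come from the
text above, but the field names, the @code{ToyRevocationService} class and the
caller-supplied @code{verify} function are hypothetical and do not reflect the
actual wire format.

```python
from typing import NamedTuple


class QueryMessage(NamedTuple):
    public_key: bytes


class QueryResponseMessage(NamedTuple):
    is_valid: bool  # True: key still valid, False: key has been revoked


class RevokeMessage(NamedTuple):
    public_key: bytes
    signature: bytes
    proof_of_work: int


class RevocationResponseMessage(NamedTuple):
    is_valid: bool  # True: revocation processed, False: message was invalid


class ToyRevocationService:
    """Hypothetical stand-in for the service side of the protocol."""

    def __init__(self, verify):
        # verify(msg) stands in for the signature and proof-of-work checks.
        self.verify = verify
        self.revoked = set()

    def handle(self, msg):
        if isinstance(msg, QueryMessage):
            return QueryResponseMessage(msg.public_key not in self.revoked)
        if isinstance(msg, RevokeMessage):
            if not self.verify(msg):
                return RevocationResponseMessage(False)
            self.revoked.add(msg.public_key)
            return RevocationResponseMessage(True)
        raise TypeError("unknown message type")
```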
7175
7176@node The REVOCATION Peer-to-Peer Protocol
7177@subsection The REVOCATION Peer-to-Peer Protocol
7178
7179@c %**end of header
7180
7181Revocation uses two disjoint ways to spread revocation information among peers.
7182First of all, P2P gossip exchanged via CORE-level neighbours is used to quickly
7183spread revocations to all connected peers. Second, whenever two peers (that
7184both support revocations) connect, the SET service is used to compute the union
7185of the respective revocation sets.
7186
In both cases, the exchanged messages are @code{RevokeMessage}s which contain
the public key that is being revoked, a matching ECDSA signature, and a
proof-of-work. Whenever a peer learns about a new revocation this way, it first
validates the signature and the proof-of-work, then stores it to disk
(typically to a file @file{$GNUNET_DATA_HOME/revocation.dat}) and finally
spreads the information to all directly connected neighbours.
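
The validate, store and forward steps can be sketched with a toy peer. The
@code{ToyPeer} class and its @code{verify} callback are hypothetical; an
in-memory set stands in for the on-disk revocation file, and direct method
calls stand in for CORE-level messages.

```python
class ToyPeer:
    """Hypothetical peer that floods new, valid revocations to neighbours."""

    def __init__(self, name, verify):
        self.name = name
        self.verify = verify   # stand-in for signature and proof-of-work checks
        self.neighbours = []
        self.known = set()     # stand-in for the revocation file on disk

    def receive(self, revocation):
        """Process one revocation; return True if it was new and valid."""
        if revocation in self.known:
            return False       # already seen, so the flood stops here
        if not self.verify(revocation):
            return False       # invalid revocations are never stored or forwarded
        self.known.add(revocation)
        for peer in self.neighbours:
            peer.receive(revocation)
        return True
```

Only forwarding revocations that are both new and valid is what keeps the
flood finite and prevents peers from amplifying bogus messages.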
7193
7194For computing the union using the SET service, the peer with the smaller hashed
7195peer identity will connect (as a "client" in the two-party set protocol) to the
7196other peer after one second (to reduce traffic spikes on connect) and initiate
7197the computation of the set union. All revocation services use a common hash to
7198identify the SET operation over revocation sets.
7199
7200The current implementation accepts revocation set union operations from all
7201peers at any time; however, well-behaved peers should only initiate this
7202operation once after establishing a connection to a peer with a larger hashed
7203peer identity.
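
The rule for deciding which peer initiates the set union can be sketched as a
simple comparison. The function names are hypothetical, and SHA-512 is an
assumption standing in for the hash that is applied to peer identities.

```python
import hashlib


def hashed_identity(peer_identity):
    # Assumption: SHA-512 stands in for the hash applied to peer identities.
    return hashlib.sha512(peer_identity).digest()


def initiates_set_union(my_identity, other_identity):
    """Return True if this peer should act as the "client" of the set union."""
    return hashed_identity(my_identity) < hashed_identity(other_identity)
```

Because the comparison is strict, exactly one of two distinct peers initiates
the operation, so a well-behaved pair runs the union once per connection.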
7204
7205@node GNUnet's File-sharing (FS) Subsystem
7206@section GNUnet's File-sharing (FS) Subsystem
7207
7208@c %**end of header
7209
This chapter describes the details of how the file-sharing service works. As
with all services, it is split into an API (libgnunetfs), the service process
(gnunet-service-fs) and user interface(s). The file-sharing service uses the
datastore service to store blocks and the DHT (and indirectly datacache) for
lookups for non-anonymous file-sharing. Furthermore, the file-sharing service
uses the block library (and the block fs plugin) for validation of DHT
operations.
7217
In contrast to many other services, libgnunetfs is rather complex since the
client library includes a large number of high-level abstractions; this is
necessary since the FS service itself largely only operates on the block level.
The FS library is responsible for providing a file-based abstraction to
applications, including directories, metadata, keyword search, verification,
and so on.

The method used by GNUnet to break large files into blocks and to use keyword
search is called the "Encoding for Censorship Resistant Sharing" (ECRS). ECRS
is largely implemented in the FS library; block validation is also reflected in
the block FS plugin and the FS service. ECRS on-demand encoding is implemented
in the FS service.
7230
7231NOTE: The documentation in this chapter is quite incomplete.
7232
7233@menu
7234* Encoding for Censorship-Resistant Sharing (ECRS)::
7235* File-sharing persistence directory structure::
7236@end menu
7237
7238@node Encoding for Censorship-Resistant Sharing (ECRS)
7239@subsection Encoding for Censorship-Resistant Sharing (ECRS)
7240
7241@c %**end of header
7242
When GNUnet shares files, it uses a content encoding that is called ECRS, the
Encoding for Censorship-Resistant Sharing. Most of ECRS is described in the
(so far unpublished) research paper referenced at the end of this section.
ECRS obsoletes the previous ESED and ESED II encodings which were used in
GNUnet before version 0.7.0. The rest of this section assumes that the reader
is familiar with that paper. What follows is a description of some minor
extensions that GNUnet makes over what is described in the paper. These
extensions are not in the paper because we felt that they were obvious or
trivial extensions to the original scheme and thus did not warrant space in
the research report.
7253
7254
7255@menu
7256* Namespace Advertisements::
7257* KSBlocks::
7258@end menu
7259
7260@node Namespace Advertisements
7261@subsubsection Namespace Advertisements
7262
7263@c %**end of header
7264
An @code{SBlock} with identifier ``all zeros'' is a signed
advertisement for a namespace. This special @code{SBlock} contains metadata
describing the content of the namespace. Instead of the name of the identifier
for a potential update, it contains the identifier for the root of the
namespace. The URI should always be empty. The @code{SBlock} is signed with
the content provider's RSA private key (just like any other @code{SBlock}).
Peers can search for @code{SBlock}s in order to find out more about a
namespace.
7272
7273@node KSBlocks
7274@subsubsection KSBlocks
7275
7276@c %**end of header
7277
GNUnet implements @code{KSBlocks}, which are @code{KBlocks} that encrypt an
@code{SBlock} instead of a CHK and metadata. In other words, @code{KSBlocks}
enable GNUnet to find @code{SBlocks} using the global keyword search. Usually
the encrypted @code{SBlock} is a namespace advertisement. The rationale behind
@code{KSBlock}s and @code{SBlock}s is to enable peers to discover namespaces
via keyword searches and to associate useful information with namespaces. When
GNUnet finds @code{KSBlocks} during a normal keyword search, it adds the
information to an internal list of discovered namespaces. Users looking for
interesting namespaces can then inspect this list, reducing the need for
out-of-band discovery of namespaces. Naturally, namespaces (or more
specifically, namespace advertisements) can also be referenced from
directories, but @code{KSBlock}s should make it easier for the owner of the
pseudonym to advertise namespaces since they eliminate the need to first
create a directory.
7292
7293Collections are also advertised using @code{KSBlock}s.
7294
7295@table @asis
7296@item Attachment Size
7297@item ecrs.pdf 270.68 KB
7298@item https://gnunet.org/sites/default/files/ecrs.pdf
7299@end table
7300
7301@node File-sharing persistence directory structure
7302@subsection File-sharing persistence directory structure
7303
7304@c %**end of header
7305
7306This section documents how the file-sharing library implements persistence of
7307file-sharing operations and specifically the resulting directory structure.
7308This code is only active if the @code{GNUNET_FS_FLAGS_PERSISTENCE} flag was set
7309when calling @code{GNUNET_FS_start}. In this case, the file-sharing library
7310will try hard to ensure that all major operations (searching, downloading,
7311publishing, unindexing) are persistent, that is, can live longer than the
7312process itself. More specifically, an operation is supposed to live until it is
7313explicitly stopped.
7314
If @code{GNUNET_FS_stop} is called before an operation has been stopped, a
@code{SUSPEND} event is generated and then when the process calls
@code{GNUNET_FS_start} next time, a @code{RESUME} event is generated.
Additionally, even if an application crashes (segfault, SIGKILL, system crash)
and hence @code{GNUNET_FS_stop} is never called and no @code{SUSPEND} events
are generated, operations are still resumed (with @code{RESUME} events). This
is implemented by constantly writing the current state of the file-sharing
operations to disk. Specifically, the current state is written to disk
whenever anything significant changes (the exception is block-wise progress in
publishing and unindexing, since those operations would be slowed down
significantly and can be resumed cheaply even without detailed accounting).
Note that if the process crashes (or is killed) during a serialization
operation, FS does not guarantee that this specific operation is recoverable
(no strict transactional semantics, again for performance reasons). However,
all other unrelated operations should resume nicely.
7330
7331Since we need to serialize the state continuously and want to recover as much
7332as possible even after crashing during a serialization operation, we do not use
7333one large file for serialization. Instead, several directories are used for the
7334various operations. When @code{GNUNET_FS_start} executes, the master
7335directories are scanned for files describing operations to resume. Sometimes,
7336these operations can refer to related operations in child directories which may
7337also be resumed at this point. Note that corrupted files are cleaned up
7338automatically. However, dangling files in child directories (those that are not
7339referenced by files from the master directories) are not automatically removed.
7340
7341Persistence data is kept in a directory that begins with the "STATE_DIR" prefix
7342from the configuration file (by default, "$SERVICEHOME/persistence/") followed
7343by the name of the client as given to @code{GNUNET_FS_start} (for example,
7344"gnunet-gtk") followed by the actual name of the master or child directory.
7345
7346The names for the master directories follow the names of the operations:
7347
7348@itemize @bullet
7349@item "search"
7350@item "download"
7351@item "publish"
7352@item "unindex"
7353@end itemize
7354
7355Each of the master directories contains names (chosen at random) for each
7356active top-level (master) operation. Note that a download that is associated
7357with a search result is not a top-level operation.
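
The path composition and the scan of the master directories can be sketched as
follows. This is an illustration only: the helper names are hypothetical, and
the sketch merely lists candidate files without parsing or validating them.

```python
import os

# Master directory names as listed above.
MASTER_DIRECTORIES = ("search", "download", "publish", "unindex")


def persistence_dir(state_dir, client_name, directory):
    """Build STATE_DIR / client name / master-or-child directory name."""
    return os.path.join(state_dir, client_name, directory)


def list_master_operations(state_dir, client_name):
    """Return (operation kind, file name) pairs found in the master directories."""
    found = []
    for kind in MASTER_DIRECTORIES:
        path = persistence_dir(state_dir, client_name, kind)
        if not os.path.isdir(path):
            continue
        for name in sorted(os.listdir(path)):
            # Only plain files describe top-level operations to resume.
            if os.path.isfile(os.path.join(path, name)):
                found.append((kind, name))
    return found
```

For example, @code{persistence_dir(state_dir, "gnunet-gtk", "search")} yields
the directory that holds one file per active top-level search of the
gnunet-gtk client.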
7358
7359In contrast to the master directories, the child directories are only consulted
7360when another operation refers to them. For each search, a subdirectory (named
7361after the master search synchronization file) contains the search results.
7362Search results can have an associated download, which is then stored in the
7363general "download-child" directory. Downloads can be recursive, in which case
7364children are stored in subdirectories mirroring the structure of the recursive
7365download (either starting in the master "download" directory or in the
7366"download-child" directory depending on how the download was initiated). For
7367publishing operations, the "publish-file" directory contains information about
7368the individual files and directories that are part of the publication. However,
7369this directory structure is flat and does not mirror the structure of the
7370publishing operation. Note that unindex operations cannot have associated child
7371operations.
7372
7373@node GNUnet's REGEX Subsystem
7374@section GNUnet's REGEX Subsystem
7375
7376@c %**end of header
7377
Using the REGEX subsystem, you can discover peers that offer a particular
service using regular expressions. The peers that offer a service specify it
using a regular expression. Peers that want to patronize a service search
using a string. The REGEX subsystem will then use the DHT to return a set of
matching offerers to the patrons.
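
The announce and search roles can be illustrated with a toy, purely local
directory. This sketch uses Python's @code{re} module and an in-memory list;
the class name is hypothetical, and requiring the regular expression to match
the whole search string is an assumption. The real subsystem distributes the
matching across the DHT rather than scanning a list.

```python
import re


class ToyRegexDirectory:
    """Hypothetical local stand-in for announcing and searching via REGEX."""

    def __init__(self):
        self.announcements = []  # list of (peer id, compiled regex)

    def announce(self, peer_id, regex):
        """An offerer announces its service under a regular expression."""
        self.announcements.append((peer_id, re.compile(regex)))

    def search(self, string):
        """A patron searches with a string; return the matching offerers."""
        matches = set(
            peer for peer, rx in self.announcements if rx.fullmatch(string)
        )
        return sorted(matches)
```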
7383
For the technical details, we have Max's defense talk and Max's Master's
thesis. An additional publication is under preparation and available to team
members (in Git).
7387
7388@menu
7389* How to run the regex profiler::
7390@end menu
7391
7392@node How to run the regex profiler
7393@subsection How to run the regex profiler
7394
7395@c %**end of header
7396
7397The gnunet-regex-profiler can be used to profile the usage of mesh/regex for a
7398given set of regular expressions and strings. Mesh/regex allows you to announce
7399your peer ID under a certain regex and search for peers matching a particular
7400regex using a string. See https://gnunet.org/szengel2012ms for a full
7401introduction.
7402
The regex profiler uses the GNUnet testbed, so all requirements of testbed
also apply to the regex profiler (for example, you need password-less SSH
login to the machines listed in your hosts file).
7406
7407@strong{Configuration}
7408
7409Moreover, an appropriate configuration file is needed. Generally you can refer
7410to SVN HEAD: contrib/regex_profiler_infiniband.conf for an example
7411configuration. In the following paragraph the important details are
7412highlighted.
7413
Announcing of the regular expressions is done by the
gnunet-daemon-regexprofiler, therefore you have to make sure it is started, by
adding it to the AUTOSTART set of ARM:
@example
[regexprofiler]
AUTOSTART = YES
@end example
7421
7422Furthermore you have to specify the location of the binary:
7423@example
7424[regexprofiler]
7425# Location of the gnunet-daemon-regexprofiler binary.
7426BINARY = /home/szengel/gnunet/src/mesh/.libs/gnunet-daemon-regexprofiler
7427# Regex prefix that will be applied to all regular expressions and
7428# search string.
7429REGEX_PREFIX = "GNVPN-0001-PAD"
7430@end example
7431
When running the profiler with a large scale deployment, you probably want to
reduce the workload of each peer. Use the following options to do this.
@example
[dht]
# Force network size estimation
FORCE_NSE = 1

[dhtcache]
DATABASE = heap
# Disable RC-file for Bloom filter? (for benchmarking with limited IO
# availability)
DISABLE_BF_RC = YES
# Disable Bloom filter entirely
DISABLE_BF = YES

[nse]
# Minimize proof-of-work CPU consumption by NSE
WORKBITS = 1
@end example
7451
7452
7453@strong{Options}
7454
To finally run the profiler, some options and the input data need to be
specified on the command line.
@example
gnunet-regex-profiler -c config-file -d log-file -n num-links \
  -p path-compression-length -s search-delay -t matching-timeout \
  -a num-search-strings hosts-file policy-dir search-strings-file
@end example
7461
@table @code
@item config-file
The configuration file created earlier.
@item log-file
The file to which statistics output is written.
@item num-links
The number of random links between started peers.
@item path-compression-length
The maximum path compression length in the DFA.
@item search-delay
The time to wait between the peers finishing linking and starting to match
strings.
@item matching-timeout
The timeout after which to cancel the searching.
@item num-search-strings
The number of strings in the search-strings-file.
@end table
7469
The @code{hosts-file} should contain a list of hosts for the testbed, one per
line, in the following format: @code{user@@host_ip:port}.
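
A hosts file in this format can be parsed as sketched below. The @code{Host}
tuple and the @code{parse_hosts} helper are hypothetical names used only for
illustration; skipping blank lines and rejecting malformed ones are
assumptions, not documented behaviour of the testbed.

```python
from typing import NamedTuple


class Host(NamedTuple):
    user: str
    address: str
    port: int


def parse_hosts(text):
    """Parse lines of the form user@host_ip:port into Host tuples."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        user, at, rest = line.partition("@")
        address, colon, port = rest.rpartition(":")
        if not (user and at and address and colon and port.isdigit()):
            raise ValueError("malformed host line: " + line)
        hosts.append(Host(user, address, int(port)))
    return hosts
```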
7472
7473The @code{policy-dir} is a folder containing text files containing one or more
7474regular expressions. A peer is started for each file in that folder and the
7475regular expressions in the corresponding file are announced by this peer.
7476
7477The @code{search-strings-file} is a text file containing search strings, one in
7478each line.
7479
You can create regular expressions and search strings for every AS in the
Internet using the attached scripts. You need one of the
@uref{http://data.caida.org/datasets/routing/routeviews-prefix2as/, CAIDA
routeviews prefix2as} data files for this. Run @code{create_regex.py <filename>
<output path>} to create the regular expressions and @code{create_strings.py
<input path> <outfile>} to create a search strings file from the previously
created regular expressions.