Diffstat (limited to 'doc/documentation/chapters/developer.texi')
-rw-r--r-- doc/documentation/chapters/developer.texi 8304
1 files changed, 8304 insertions, 0 deletions
diff --git a/doc/documentation/chapters/developer.texi b/doc/documentation/chapters/developer.texi
new file mode 100644
index 000000000..996474359
--- /dev/null
+++ b/doc/documentation/chapters/developer.texi
@@ -0,0 +1,8304 @@
1@c ***********************************************************************
2@node GNUnet Developer Handbook
3@chapter GNUnet Developer Handbook
4
5This book is intended to be an introduction for programmers that want to
6extend the GNUnet framework. GNUnet is more than a simple peer-to-peer
7application. For developers, GNUnet is:
8
9@itemize @bullet
10@item Free software under the GNU General Public License, with a community
11that believes in the GNU philosophy
12@item A set of standards, including coding conventions and
13architectural rules
14@item A set of layered protocols, both specifying the communication
15between peers as well as the communication between components
16of a single peer
17@item A set of libraries with well-defined APIs suitable for
18writing extensions
19@end itemize
20
21In particular, the architecture specifies that a peer consists of many
22processes communicating via protocols. Processes can be written in almost
23any language. C and Java APIs exist for accessing existing services and
24for writing extensions. It is possible to write extensions in other
25languages by implementing the necessary IPC protocols.
26
27GNUnet can be extended and improved along many possible dimensions, and
28anyone interested in Free Software and Freedom-enhancing Networking is
29welcome to join the effort. This Developer Handbook attempts to provide
30an initial introduction to some of the key design choices and central
31components of the system. This part of the GNUnet documentation
32is far from complete, and we welcome informed contributions,
33be it in the form of new chapters or insightful comments.
34
35@menu
36* Developer Introduction::
37* Code overview::
38* System Architecture::
39* Subsystem stability::
40* Naming conventions and coding style guide::
41* Build-system::
42* Developing extensions for GNUnet using the gnunet-ext template::
43* Writing testcases::
44* GNUnet's TESTING library::
45* Performance regression analysis with Gauger::
46* GNUnet's TESTBED Subsystem::
47* libgnunetutil::
48* The Automatic Restart Manager (ARM)::
49* GNUnet's TRANSPORT Subsystem::
50* NAT library::
51* Distance-Vector plugin::
52* SMTP plugin::
53* Bluetooth plugin::
54* WLAN plugin::
55* The ATS Subsystem::
56* GNUnet's CORE Subsystem::
57* GNUnet's CADET subsystem::
58* GNUnet's NSE subsystem::
59* GNUnet's HOSTLIST subsystem::
60* GNUnet's IDENTITY subsystem::
61* GNUnet's NAMESTORE Subsystem::
62* GNUnet's PEERINFO subsystem::
63* GNUnet's PEERSTORE subsystem::
64* GNUnet's SET Subsystem::
65* GNUnet's STATISTICS subsystem::
66* GNUnet's Distributed Hash Table (DHT)::
67* The GNU Name System (GNS)::
68* The GNS Namecache::
69* The REVOCATION Subsystem::
70* GNUnet's File-sharing (FS) Subsystem::
71* GNUnet's REGEX Subsystem::
72@end menu
73
74@node Developer Introduction
75@section Developer Introduction
76
77This Developer Handbook is intended as a first introduction to GNUnet for
78new developers that want to extend the GNUnet framework. After the
79introduction, each of the GNUnet subsystems (directories in the
80@file{src/} tree) is (supposed to be) covered in its own chapter. In
81addition to this documentation, GNUnet developers should be aware of the
82services available to them on the GNUnet server.
83
84New developers can have a look at the GNUnet tutorials for C and Java
85available in the @file{src/} directory of the repository or under the
86following links:
87
88@c ** FIXME: Link to files in source, not online.
89@c ** FIXME: Where is the Java tutorial?
90@itemize @bullet
91@item @uref{https://gnunet.org/git/gnunet.git/plain/doc/gnunet-c-tutorial.pdf, GNUnet C tutorial}
92@item GNUnet Java tutorial
93@end itemize
94
95In addition to this book, the GNUnet server contains various resources for
96GNUnet developers. They are all conveniently reachable via the "Developer"
97entry in the navigation menu. Some additional tools (such as static
98analysis reports) require special developer access to perform certain
99operations. If you feel you need access, you should contact
100@uref{http://grothoff.org/christian/, Christian Grothoff},
101GNUnet's maintainer.
102
103The public subsystems on the GNUnet server that help developers are:
104
105@itemize @bullet
106
107@item The version control system (Git) keeps our code and enables
108distributed development.
109Only developers with write access can commit code; everyone else is
110encouraged to submit patches to the
111@uref{https://lists.gnu.org/mailman/listinfo/gnunet-developers, GNUnet-developers mailing list}.
113
114@item The GNUnet bugtracking system (Mantis) is used to track
115feature requests, open bug reports and their resolutions.
116Anyone can report bugs, but only developers can claim to have fixed them.
117
118@item A site installation of the CI system "Buildbot" is used to check
119GNUnet builds automatically on a range of platforms.
120Builds are triggered automatically after 30 minutes of no changes to Git.
121
122@item The current quality of our automated test suite is assessed using
123code coverage analysis. This analysis is run daily; however, the web page
124is only updated if all automated tests pass at that time. Testcases that
125improve our code coverage are always welcome.
126
127@item We try to automatically find bugs using a static analysis scan.
128This scan is run daily; however, the web page is only updated if all
129automated tests pass at the time. Note that not everything that is
130flagged by the analysis is a bug; sometimes even good code can be marked
131as possibly problematic. Nevertheless, developers are encouraged to at
132least be aware of all issues in their code that are listed.
133
134@item We use Gauger for automatic performance regression visualization.
135Details on how to use Gauger are in @ref{Performance regression analysis with Gauger}.
136
137@item We use @uref{http://junit.org/, junit} to automatically test
138@command{gnunet-java}.
139Automatically generated, current reports on the test suite are here.
140
141@item We use Cobertura to generate test coverage reports for gnunet-java.
142Current reports on test coverage are here.
143
144@end itemize
145
146
147
148@c ***********************************************************************
149@menu
150* Project overview::
151@end menu
152
153@node Project overview
154@subsection Project overview
155
156The GNUnet project consists at this point of several sub-projects. This
157section is supposed to give an initial overview of the various
158sub-projects. Note that this description also lists projects that are far
159from complete, including even those that have literally not a single line
160of code in them yet.
161
162GNUnet sub-projects in order of likely relevance are currently:
163
164@table @asis
165
166@item gnunet
167Core of the P2P framework, including file-sharing, VPN and
168chat applications; this is what the developer handbook covers mostly
169@item gnunet-gtk
170Gtk+-based user interfaces, including gnunet-fs-gtk (file-sharing),
171gnunet-statistics-gtk (statistics over time), gnunet-peerinfo-gtk
172(information about current connections and known peers),
173gnunet-chat-gtk (chat GUI) and gnunet-setup (setup tool for "everything")
174@item gnunet-fuse
175Mounting directories shared via GNUnet's file-sharing
176on Linux
177@item gnunet-update
178Installation and update tool
179@item gnunet-ext
180Template for starting 'external' GNUnet projects
181@item gnunet-java
182Java APIs for writing GNUnet services and applications
183@c ** FIXME: Point to new website repository once we have it:
184@c ** @item svn/gnunet-www/ Code and media helping drive the GNUnet
185@c website
186@item eclectic
187Code to run GNUnet nodes on testbeds for research, development,
188testing and evaluation
189@c ** FIXME: Solve the status and location of gnunet-qt
190@item gnunet-qt
191Qt-based GNUnet GUI (dead?)
192@item gnunet-cocoa
193Cocoa-based GNUnet GUI (dead?)
194
195@end table
196
197We are also working on various supporting libraries and tools:
198@c ** FIXME: What about gauger, and what about libmwmodem?
199
200@table @asis
201@item libextractor
202GNU libextractor (meta data extraction)
203@item libmicrohttpd
204GNU libmicrohttpd (embedded HTTP(S) server library)
205@item gauger
206Tool for performance regression analysis
207@item monkey
208Tool for automated debugging of distributed systems
209@item libmwmodem
210Library for accessing satellite connection quality
211reports
212@item libgnurl
213gnURL (feature restricted variant of cURL/libcurl)
214@end table
215
216Finally, there are various external projects (see links for a list of
217those that have a public website) which build on top of the GNUnet
218framework.
219
220@c ***********************************************************************
221@node Code overview
222@section Code overview
223
224This section gives a brief overview of the GNUnet source code.
225Specifically, we sketch the function of each of the subdirectories in
226the @file{gnunet/src/} directory. The order given is roughly bottom-up
227(in terms of the layers of the system).
228
229@table @asis
230@item @file{util/} --- libgnunetutil
231Library with general utility functions; all
232GNUnet binaries link against this library. It covers everything from memory
233allocation and data structures to cryptography and inter-process
234communication. The goal is to provide an OS-independent interface and
235more 'secure' or convenient implementations of commonly used primitives.
236The API is spread over more than a dozen headers; developers should study
237those closely to avoid duplicating existing functions.
238@item @file{hello/} --- libgnunethello
239HELLO messages are used to
240describe under which addresses a peer can be reached (for example,
241protocol, IP, port). This library manages parsing and generating of HELLO
242messages.
243@item @file{block/} --- libgnunetblock
244The DHT and other components of GNUnet
245store information in units called 'blocks'. Each block has a type and the
246type defines a particular format and how that binary format is to be
247linked to a hash code (the key for the DHT and for databases). The block
248library is a wrapper around block plugins which provide the necessary
249functions for each block type.
250@item @file{statistics/}
251The statistics service enables associating
252values (of type uint64_t) with a component name and a string. The main
253uses are debugging (counting events), performance tracking and user
254entertainment (what did my peer do today?).
255@item @file{arm/} --- Automatic Restart Manager (ARM)
256The automatic-restart-manager (ARM) service
257is the GNUnet master service. Its role is to start gnunet-services, to
258restart them when they crash and finally to shut down the system when
259requested.
260@item @file{peerinfo/}
261The peerinfo service keeps track of which peers are known
262to the local peer and also tracks the validated addresses of each of
263those peers (in the form of a HELLO message). The peer is not
264necessarily connected to all peers known to the peerinfo service.
265Peerinfo provides persistent storage for peer identities --- peers are
266not forgotten just because of a system restart.
267@item @file{datacache/} --- libgnunetdatacache
268The datacache library provides (temporary) block storage for the DHT.
269Existing plugins can store blocks in SQLite, PostgreSQL or MySQL databases.
270All data stored in the cache is lost when the peer is stopped or
271restarted (datacache uses temporary tables).
272@item @file{datastore/}
273The datastore service stores file-sharing blocks in
274databases for extended periods of time. In contrast to the datacache, data
275is not lost when peers restart. However, quota restrictions may still
276cause old, expired or low-priority data to be eventually discarded.
277Existing plugins can store blocks in SQLite, PostgreSQL or MySQL databases.
278@item @file{template/}
279Template for writing a new service. Does nothing.
280@item @file{ats/} --- Automatic Transport Selection
281The automatic transport
282selection (ATS) service is responsible for deciding which address (i.e.
283which transport plugin) should be used for communication with other peers,
284and at what bandwidth.
285@item @file{nat/} --- libgnunetnat
286Library that provides basic functions for NAT traversal.
287The library supports NAT traversal with
288manual hole-punching by the user, UPnP and ICMP-based autonomous NAT
289traversal. The library also includes an API for testing if the current
290configuration works and the @code{gnunet-nat-server} which provides an
291external service to test the local configuration.
292@item @file{fragmentation/} --- libgnunetfragmentation
293Some transports (UDP and WLAN, mostly) have restrictions on the maximum
294transfer unit (MTU) for packets. The fragmentation library can be used to
295break larger packets into chunks of at most 1k and transmit the resulting
296fragments reliably (with acknowledgement, retransmission, timeouts,
297etc.).
298@item @file{transport/}
299The transport service is responsible for managing the
300basic P2P communication. It uses plugins to support P2P communication
301over TCP, UDP, HTTP, HTTPS and other protocols. The transport service
302validates peer addresses, enforces bandwidth restrictions, limits the
303total number of connections and enforces connectivity restrictions (i.e.
304friends-only).
305@item @file{peerinfo-tool/}
306This directory contains the gnunet-peerinfo binary which can be used to
307inspect the peers and HELLOs known to the peerinfo service.
308@item @file{core/}
309The core service is responsible for establishing encrypted, authenticated
310connections with other peers, encrypting and decrypting messages and
311forwarding messages to higher-level services that are interested in them.
312@item @file{testing/} --- libgnunettesting
313The testing library allows starting (and stopping) peers
314for writing testcases.
315It also supports automatic generation of configurations for peers
316ensuring that the ports and paths are disjoint. libgnunettesting is also
317the foundation for the testbed service.
318@item @file{testbed/}
319The testbed service is used for creating small or large scale deployments
320of GNUnet peers for evaluation of protocols.
321It facilitates peer deployments on multiple
322hosts (for example, in a cluster) and establishing various network
323topologies (both underlay and overlay).
324@item @file{nse/} --- Network Size Estimation
325The network size estimation (NSE) service
326implements a protocol for (securely) estimating the current size of the
327P2P network.
328@item @file{dht/} --- distributed hash table
329The distributed hash table (DHT) service provides a
330distributed implementation of a hash table to store blocks under hash
331keys in the P2P network.
332@item @file{hostlist/}
333The hostlist service allows learning about
334other peers in the network by downloading HELLO messages from an HTTP
335server, can be configured to run such an HTTP server and also implements
336a P2P protocol to advertise and automatically learn about other peers
337that offer a public hostlist server.
338@item @file{topology/}
339The topology service is responsible for
340maintaining the mesh topology. It tries to maintain connections to friends
341(depending on the configuration) and also tries to ensure that the peer
342has a decent number of active connections at all times. If necessary, new
343connections are added. All peers should run the topology service,
344otherwise they may end up not being connected to any other peer (unless
345some other service ensures that core establishes the required
346connections). The topology service also tells the transport service which
347connections are permitted (for friend-to-friend networking).
348@item @file{fs/} --- file-sharing
349The file-sharing (FS) service implements GNUnet's
350file-sharing application. Both anonymous file-sharing (using gap) and
351non-anonymous file-sharing (using dht) are supported.
352@item @file{cadet/}
353The CADET service provides a general-purpose routing abstraction to create
354end-to-end encrypted tunnels in mesh networks. We wrote a paper
355documenting key aspects of the design.
356@item @file{tun/} --- libgnunettun
357Library for building IPv4, IPv6 packets and creating
358checksums for UDP, TCP and ICMP packets. The header
359defines C structs for common Internet packet formats and in particular
360structs for interacting with TUN (virtual network) interfaces.
361@item @file{mysql/} --- libgnunetmysql
362Library for creating and executing prepared MySQL
363statements and for managing the connection to the MySQL database.
364Essentially a lightweight wrapper for the interaction between GNUnet
365components and libmysqlclient.
366@item @file{dns/}
367Service that allows intercepting and modifying DNS requests of
368the local machine. Currently used for IPv4-IPv6 protocol translation
369(DNS-ALG) as implemented by "pt/" and for the GNUnet naming system. The
370service can also be configured to offer an exit service for DNS traffic.
371@item @file{vpn/}
372The virtual public network (VPN) service provides a virtual
373tunnel interface (VTUN) for IP routing over GNUnet.
374Needs some other peers to run an "exit" service to work.
375Can be activated using the "gnunet-vpn" tool or integrated with DNS using
376the "pt" daemon.
377@item @file{exit/}
378Daemon to allow traffic from the VPN to exit this
379peer to the Internet or to specific IP-based services of the local peer.
380Currently, an exit service can only be restricted to IPv4 or IPv6, not to
381specific ports and/or IP address ranges. If this is not acceptable,
382additional firewall rules must be added manually. exit currently only
383works for normal UDP, TCP and ICMP traffic; DNS queries need to leave the
384system via a DNS service.
385@item @file{pt/}
386Protocol translation daemon. This daemon enables 4-to-6,
3876-to-4, 4-over-6 or 6-over-4 transitions for the local system. It
388essentially uses "DNS" to intercept DNS replies and then maps results to
389those offered by the VPN, which then sends them using mesh to some daemon
390offering an appropriate exit service.
391@item @file{identity/}
392Management of egos (alter egos) of a user; identities are
393essentially named ECC private keys and used for zones in the GNU name
394system and for namespaces in file-sharing, but might find other uses later
395@item @file{revocation/}
396Key revocation service, can be used to revoke the
397private key of an identity if it has been compromised
398@item @file{namecache/}
399Cache for resolution results for the GNU name system;
400data is encrypted and can be shared among users,
401loss of the data should ideally only result in a
402performance degradation (persistence not required)
403@item @file{namestore/}
404Database for the GNU name system with per-user private information,
405persistence required
406@item @file{gns/}
407GNU name system, a GNU approach to DNS and PKI.
408@item @file{dv/}
409A plugin for distance-vector (DV)-based routing.
410DV consists of a service and a transport plugin to provide peers
411with the illusion of a direct P2P connection for connections
412that use multiple (typically up to 3) hops in the actual underlay network.
413@item @file{regex/}
414Service for the (distributed) evaluation of regular expressions.
415@item @file{scalarproduct/}
416The scalar product service offers an API to perform a secure multiparty
417computation which calculates a scalar product between two peers
418without exposing the private input vectors of the peers to each other.
419@item @file{consensus/}
420The consensus service will allow a set of peers to agree
421on a set of values via a distributed set union computation.
422@item @file{rest/}
423The rest API allows access to GNUnet services using RESTful interaction.
424The services provide plugins that can be exposed by the rest server.
425@item @file{experimentation/}
426The experimentation daemon coordinates distributed
427experimentation to evaluate transport and ATS properties.
428@end table
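
As a rough illustration of the @file{fragmentation/} entry above, the
following sketch computes how a message would be cut into chunks of at
most 1k. It does not use the libgnunetfragmentation API. The names
@code{FRAGMENT_MAX}, @code{fragment_count} and @code{fragment_span}, and
the reading of "1k" as 1024 bytes, are assumptions made for this example.
Acknowledgement, retransmission and timeouts are left out.

```c
#include <stddef.h>

/* Assumption: "1k" is read as 1024 bytes of payload per fragment. */
#define FRAGMENT_MAX 1024

/* Number of fragments needed for a message of msg_len bytes. */
size_t
fragment_count (size_t msg_len)
{
  return msg_len / FRAGMENT_MAX + ((msg_len % FRAGMENT_MAX) != 0 ? 1 : 0);
}

/* Compute offset and length of fragment number idx (counting from 0).
   Returns 0 on success, -1 if idx is out of range. */
int
fragment_span (size_t msg_len, size_t idx, size_t *off, size_t *len)
{
  size_t remaining;

  if (idx >= fragment_count (msg_len))
    return -1;
  *off = idx * FRAGMENT_MAX;
  remaining = msg_len - *off;
  *len = (remaining < FRAGMENT_MAX) ? remaining : FRAGMENT_MAX;
  return 0;
}
```

A sender would iterate over all fragment indices and transmit the bytes
in each span; only the last fragment may be shorter than the maximum.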
429
430@c ***********************************************************************
431@node System Architecture
432@section System Architecture
433
434GNUnet developers like LEGOs. The blocks are indestructible, can be
435stacked together to construct complex buildings and it is generally easy
436to swap one block for a different one that has the same shape. GNUnet's
437architecture is based on LEGOs:
438
439@c images here
440
441This chapter documents the GNUnet LEGO system, also known as GNUnet's
442system architecture.
443
444The most common GNUnet component is a service. Services offer an API (or
445several, depending on what you count as "an API") which is implemented as
446a library. The library communicates with the main process of the service
447using a service-specific network protocol. The main process of the service
448typically doesn't fully provide everything that is needed --- it has holes
449to be filled by APIs to other services.
450
451User interfaces and daemons are a special kind of component in GNUnet.
452Like services, they have holes to be filled by APIs of other services.
453Unlike services, daemons do not implement their own network protocol and
454they have no API.
455
456The GNUnet system provides a range of services, daemons and user
457interfaces, which are then combined into a layered GNUnet instance (also
458known as a peer).
459
460Note that while it is generally possible to swap one service for another
461compatible service, there is often only one implementation. However,
462during development we often have a "new" version of a service in parallel
463with an "old" version. While the "new" version is not working, developers
464working on other parts of the service can continue their development by
465simply using the "old" service. Alternative design ideas can also be
466easily investigated by swapping out individual components. This is
467typically achieved by simply changing the name of the "BINARY" in the
468respective configuration section.
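
For instance, switching one service to an alternative implementation
might look like the configuration fragment below. The binary name
@code{gnunet-service-dht-new} is hypothetical and only illustrates the
idea; the section name follows the convention that options for a module
live in a section named after it.

```ini
[dht]
# Hypothetical replacement binary; the regular service binary
# would be gnunet-service-dht.
BINARY = gnunet-service-dht-new
```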
469
470Key properties of GNUnet services are that they must be separate
471processes and that they must protect themselves by applying tight error
472checking against the network protocol they implement (thereby achieving a
473certain degree of robustness).
474
475On the other hand, the APIs are implemented to tolerate failures of the
476service, isolating their host process from errors by the service. If the
477service process crashes, other services and daemons around it should not
478also fail, but instead wait for the service process to be restarted by
479ARM.
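
The waiting behaviour described above is commonly realized as a
reconnect loop with a growing delay. The sketch below shows only the
delay computation in plain C. It is not GNUnet's actual implementation;
the names @code{next_backoff_ms}, @code{BACKOFF_START_MS} and
@code{BACKOFF_MAX_MS}, as well as the chosen constants, are assumptions
made for illustration.

```c
#include <stdint.h>

/* Hypothetical constants: first retry after 100 ms, cap at 30 s. */
#define BACKOFF_START_MS 100
#define BACKOFF_MAX_MS 30000

/* Given the previous delay (0 if this is the first failure), return
   the delay to wait before the next reconnect attempt.  The delay
   doubles on every failure until it reaches the cap. */
uint32_t
next_backoff_ms (uint32_t previous_ms)
{
  if (0 == previous_ms)
    return BACKOFF_START_MS;
  if (previous_ms >= BACKOFF_MAX_MS / 2)
    return BACKOFF_MAX_MS;
  return previous_ms * 2;
}
```

A client library following this pattern would reset the delay to 0 once
a reconnect succeeds, so that a later crash is again retried quickly.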
480
481
482@c ***********************************************************************
483@node Subsystem stability
484@section Subsystem stability
485
486This section documents the current stability of the various GNUnet
487subsystems. Stability here describes the expected degree of compatibility
488with future versions of GNUnet. For each subsystem we distinguish between
489compatibility on the P2P network level (communication protocol between
490peers), the IPC level (communication between the service and the service
491library) and the API level (stability of the API). P2P compatibility is
492relevant in terms of which applications are likely going to be able to
493communicate with future versions of the network. IPC communication is
494relevant for the implementation of language bindings that re-implement the
495IPC messages. Finally, API compatibility is relevant to developers that
496hope to be able to avoid changes to applications built on top of the APIs
497of the framework.
498
499The following table summarizes our current view of the stability of the
500respective protocols or APIs:
501
502@multitable @columnfractions .20 .20 .20 .20
503@headitem Subsystem @tab P2P @tab IPC @tab C API
504@item util @tab n/a @tab n/a @tab stable
505@item arm @tab n/a @tab stable @tab stable
506@item ats @tab n/a @tab unstable @tab testing
507@item block @tab n/a @tab n/a @tab stable
508@item cadet @tab testing @tab testing @tab testing
509@item consensus @tab experimental @tab experimental @tab experimental
510@item core @tab stable @tab stable @tab stable
511@item datacache @tab n/a @tab n/a @tab stable
512@item datastore @tab n/a @tab stable @tab stable
513@item dht @tab stable @tab stable @tab stable
514@item dns @tab stable @tab stable @tab stable
515@item dv @tab testing @tab testing @tab n/a
516@item exit @tab testing @tab n/a @tab n/a
517@item fragmentation @tab stable @tab n/a @tab stable
518@item fs @tab stable @tab stable @tab stable
519@item gns @tab stable @tab stable @tab stable
520@item hello @tab n/a @tab n/a @tab testing
521@item hostlist @tab stable @tab stable @tab n/a
522@item identity @tab stable @tab stable @tab n/a
523@item multicast @tab experimental @tab experimental @tab experimental
524@item mysql @tab stable @tab n/a @tab stable
525@item namestore @tab n/a @tab stable @tab stable
526@item nat @tab n/a @tab n/a @tab stable
527@item nse @tab stable @tab stable @tab stable
528@item peerinfo @tab n/a @tab stable @tab stable
529@item psyc @tab experimental @tab experimental @tab experimental
530@item pt @tab n/a @tab n/a @tab n/a
531@item regex @tab stable @tab stable @tab stable
532@item revocation @tab stable @tab stable @tab stable
533@item social @tab experimental @tab experimental @tab experimental
534@item statistics @tab n/a @tab stable @tab stable
535@item testbed @tab n/a @tab testing @tab testing
536@item testing @tab n/a @tab n/a @tab testing
537@item topology @tab n/a @tab n/a @tab n/a
538@item transport @tab stable @tab stable @tab stable
539@item tun @tab n/a @tab n/a @tab stable
540@item vpn @tab testing @tab n/a @tab n/a
541@end multitable
542
543Here is a rough explanation of the values:
544
545@table @samp
546@item stable
547No incompatible changes are planned at this time; for IPC/APIs, if
548there are incompatible changes, they will be minor and might only require
549minimal changes to existing code; for P2P, changes will be avoided if at
550all possible for the 0.10.x-series
551
552@item testing
553No incompatible changes are
554planned at this time, but the code is still known to be in flux; so while
555we have no concrete plans, our expectation is that there will still be
556minor modifications; for P2P, changes will likely be extensions that
557should not break existing code
558
559@item unstable
560Changes are planned and will happen; however, they
561will not be totally radical and the result should still resemble what is
562there now; nevertheless, anticipated changes will break protocol/API
563compatibility
564
565@item experimental
566Changes are planned and the result may look nothing like
567what the API/protocol looks like today
568
569@item unknown
570Someone should think about where this subsystem is headed
571
572@item n/a
573This subsystem does not have an API/IPC-protocol/P2P-protocol
574@end table
575
576@c ***********************************************************************
577@node Naming conventions and coding style guide
578@section Naming conventions and coding style guide
579
580Here you can find some rules to help you write code for GNUnet.
581
582@c ***********************************************************************
583@menu
584* Naming conventions::
585* Coding style::
586@end menu
587
588@node Naming conventions
589@subsection Naming conventions
590
591
592@c ***********************************************************************
593@menu
594* include files::
595* binaries::
596* logging::
597* configuration::
598* exported symbols::
599* private (library-internal) symbols (including structs and macros)::
600* testcases::
601* performance tests::
602* src/ directories::
603@end menu
604
605@node include files
606@subsubsection include files
607
608@itemize @bullet
609@item _lib: library without need for a process
610@item _service: library that needs a service process
611@item _plugin: plugin definition
612@item _protocol: structs used in network protocol
613@item exceptions:
614@itemize @bullet
615@item gnunet_config.h --- generated
616@item platform.h --- first included
617@item plibc.h --- external library
618@item gnunet_common.h --- fundamental routines
619@item gnunet_directories.h --- generated
620@item gettext.h --- external library
621@end itemize
622@end itemize
623
624@c ***********************************************************************
625@node binaries
626@subsubsection binaries
627
628@itemize @bullet
629@item gnunet-service-xxx: service process (has listen socket)
630@item gnunet-daemon-xxx: daemon process (no listen socket)
631@item gnunet-helper-xxx[-yyy]: SUID helper for module xxx
632@item gnunet-yyy: command-line tool for end-users
633@item libgnunet_plugin_xxx_yyy.so: plugin for API xxx
634@item libgnunetxxx.so: library for API xxx
635@end itemize
636
637@c ***********************************************************************
638@node logging
639@subsubsection logging
640
641@itemize @bullet
642@item services and daemons use their directory name in
643@code{GNUNET_log_setup} (e.g., 'core') and log using
644plain 'GNUNET_log'.
645@item command-line tools use their full name in
646@code{GNUNET_log_setup} (e.g., 'gnunet-publish') and log using
647plain 'GNUNET_log'.
648@item service access libraries log using
649'@code{GNUNET_log_from}' and use '@code{DIRNAME-api}' for the
650component (e.g., 'core-api')
651@item pure libraries (without associated service) use
652'@code{GNUNET_log_from}' with the component set to their
653library name (without lib or '@file{.so}'),
654which should also be their directory name (e.g., '@file{nat}')
655@item plugins should use '@code{GNUNET_log_from}'
656with the directory name and the plugin name combined to produce
657the component name (e.g., 'transport-tcp').
658@item logging should be unified per-file by defining a
659@code{LOG} macro with the appropriate arguments,
660along these lines:
661
662@example
663#define LOG(kind,...) \
664  GNUNET_log_from (kind, "example-api", __VA_ARGS__)
665@end example
666
667@end itemize
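
To show how such a per-file @code{LOG} macro is used, here is a sketch
that compiles without the GNUnet headers. The function
@code{stub_log_from} is a stand-in for @code{GNUNET_log_from} invented
for this example (it only formats the message into a buffer), and
@code{example_report_retry} is a hypothetical API function.

```c
#include <stdarg.h>
#include <stdio.h>

/* Stand-in for GNUNET_log_from so that the sketch compiles without
   GNUnet headers; it formats "component: message" into a buffer. */
static char last_log_line[256];

static void
stub_log_from (int kind, const char *component, const char *fmt, ...)
{
  va_list ap;
  int used;

  (void) kind; /* a real logger would filter on the log level */
  used = snprintf (last_log_line, sizeof (last_log_line), "%s: ", component);
  if ((used < 0) || ((size_t) used >= sizeof (last_log_line)))
    return;
  va_start (ap, fmt);
  vsnprintf (last_log_line + used, sizeof (last_log_line) - (size_t) used,
             fmt, ap);
  va_end (ap);
}

/* Per-file LOG macro: fixes the component name once for the file. */
#define LOG(kind, ...) stub_log_from (kind, "example-api", __VA_ARGS__)

/* Hypothetical API function that logs through the per-file macro. */
void
example_report_retry (unsigned int attempt)
{
  LOG (1, "retrying, attempt %u", attempt);
}
```

Because the component name appears only in the macro definition, every
log statement in the file stays consistent with the naming rules above.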
668
669@c ***********************************************************************
670@node configuration
671@subsubsection configuration
672
673@itemize @bullet
674@item paths (that are substituted in all filenames) are in PATHS
675(have as few as possible)
676@item all options for a particular module (@file{src/MODULE})
677are under @code{[MODULE]}
678@item options for a plugin of a module
679are under @code{[MODULE-PLUGINNAME]}
680@end itemize
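
A configuration fragment following these rules could look as shown
below. The option names and values are given for illustration only and
should be checked against the defaults shipped with GNUnet.

```ini
[PATHS]
# Substituted into file names elsewhere in the configuration.
GNUNET_HOME = /var/lib/gnunet

[transport]
# Options of the module in src/transport/.
PLUGINS = tcp

[transport-tcp]
# Options of the "tcp" plugin of the transport module.
PORT = 2086
```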
681
682@c ***********************************************************************
683@node exported symbols
684@subsubsection exported symbols
685
686@itemize @bullet
687@item must start with "@code{GNUNET_modulename_}" and be defined in
688"@file{modulename.c}"
689@item exceptions: those defined in @file{gnunet_common.h}
690@end itemize
691
692@c ***********************************************************************
693@node private (library-internal) symbols (including structs and macros)
694@subsubsection private (library-internal) symbols (including structs and macros)
695
696@itemize @bullet
697@item must NOT start with any prefix
698@item must not be exported in a way that linkers could use them or other
699libraries might see them via headers; they must be either
700declared/defined in C source files or in headers that are in the
701respective directory under @file{src/modulename/} and NEVER be declared
702in @file{src/include/}.
703@end itemize
704
705@node testcases
706@subsubsection testcases
707
708@itemize @bullet
709@item must be called "@file{test_module-under-test_case-description.c}"
@item "case-description" may be omitted if there is only one test
711@end itemize
712
713@c ***********************************************************************
714@node performance tests
715@subsubsection performance tests
716
717@itemize @bullet
718@item must be called "@file{perf_module-under-test_case-description.c}"
@item "case-description" may be omitted if there is only one performance
test
@item must only be run if @code{HAVE_BENCHMARKS} is satisfied
722@end itemize
723
724@c ***********************************************************************
725@node src/ directories
726@subsubsection src/ directories
727
728@itemize @bullet
@item gnunet-NAME: end-user applications (e.g., gnunet-search, gnunet-arm)
@item gnunet-service-NAME: service processes with accessor library (e.g.,
gnunet-service-arm)
@item libgnunetNAME: accessor library (_service.h-header) or standalone
library (_lib.h-header)
@item gnunet-daemon-NAME: daemon process without accessor library (e.g.,
gnunet-daemon-hostlist) and no GNUnet management port
@item libgnunet_plugin_DIR_NAME: loadable plugins (e.g.,
libgnunet_plugin_transport_tcp)
738@end itemize
739
740@cindex Coding style
741@node Coding style
742@subsection Coding style
743
744@itemize @bullet
745@item We follow the GNU Coding Standards (@pxref{Top, The GNU Coding Standards,, standards, The GNU Coding Standards});
746@item Indentation is done with spaces, two per level, no tabs;
747@item C99 struct initialization is fine;
748@item declare only one variable per line, for example:
749
750@noindent
751instead of
752
753@example
754int i,j;
755@end example
756
757@noindent
758write:
759
760@example
761int i;
762int j;
763@end example
764
765@c TODO: include actual example from a file in source
766
767@noindent
768This helps keep diffs small and forces developers to think precisely about
769the type of every variable.
770Note that @code{char *} is different from @code{const char*} and
771@code{int} is different from @code{unsigned int} or @code{uint32_t}.
772Each variable type should be chosen with care.
773
774@item While @code{goto} should generally be avoided, having a
775@code{goto} to the end of a function to a block of clean up
776statements (free, close, etc.) can be acceptable.
777
778@item Conditions should be written with constants on the left (to avoid
779accidental assignment) and with the 'true' target being either the
780'error' case or the significantly simpler continuation. For example:
781
@example
if (0 != stat ("filename", &sbuf)) @{
  error();
@}
else @{
  /* handle normal case here */
@}
@end example
790
791@noindent
792instead of
793
@example
if (stat ("filename", &sbuf) == 0) @{
  /* handle normal case here */
@} else @{
  error();
@}
@end example
801
802@noindent
803If possible, the error clause should be terminated with a 'return' (or
804'goto' to some cleanup routine) and in this case, the 'else' clause
805should be omitted:
806
@example
if (0 != stat ("filename", &sbuf)) @{
  error();
  return;
@}
/* handle normal case here */
@end example
814
This serves to avoid deep nesting. The 'constants on the left' rule
applies to all constants (including @code{GNUNET_SCHEDULER_NO_TASK},
NULL, and enums). With the two above rules (constants on left, errors in
'true' branch), there is only one way to write most branches correctly.
819
820@item Combined assignments and tests are allowed if they do not hinder
821code clarity. For example, one can write:
822
@example
if (NULL == (value = lookup_function())) @{
  error();
  return;
@}
@end example
829
830@item Use @code{break} and @code{continue} wherever possible to avoid
831deep(er) nesting. Thus, we would write:
832
@example
next = head;
while (NULL != (pos = next)) @{
  next = pos->next;
  if (! should_free (pos))
    continue;
  GNUNET_CONTAINER_DLL_remove (head, tail, pos);
  GNUNET_free (pos);
@}
@end example
843
844instead of
845
@example
next = head;
while (NULL != (pos = next)) @{
  next = pos->next;
  if (should_free (pos)) @{
    /* unnecessary nesting! */
    GNUNET_CONTAINER_DLL_remove (head, tail, pos);
    GNUNET_free (pos);
  @}
@}
@end example
856
857@item We primarily use @code{for} and @code{while} loops.
858A @code{while} loop is used if the method for advancing in the loop is
859not a straightforward increment operation. In particular, we use:
860
@example
next = head;
while (NULL != (pos = next))
@{
  next = pos->next;
  if (! should_free (pos))
    continue;
  GNUNET_CONTAINER_DLL_remove (head, tail, pos);
  GNUNET_free (pos);
@}
@end example
872
to free entries in a list (as the iteration changes the structure of the
list due to the free; the equivalent @code{for} loop no longer
follows the simple @code{for} paradigm of @code{for(INIT;TEST;INC)}).
However, for loops that do follow the simple @code{for} paradigm we do
use @code{for}, even if it involves linked lists:
878
@example
/* simple iteration over a linked list */
for (pos = head;
     NULL != pos;
     pos = pos->next)
@{
  use (pos);
@}
@end example
888
889
890@item The first argument to all higher-order functions in GNUnet must be
891declared to be of type @code{void *} and is reserved for a closure. We do
892not use inner functions, as trampolines would conflict with setups that
893use non-executable stacks.
894The first statement in a higher-order function, which unusually should
895be part of the variable declarations, should assign the
896@code{cls} argument to the precise expected type. For example:
897
@example
int callback (void *cls, char *args) @{
  struct Foo *foo = cls;
  int other_variables;

  /* rest of function */
@}
@end example
906
907
908@item It is good practice to write complex @code{if} expressions instead
909of using deeply nested @code{if} statements. However, except for addition
910and multiplication, all operators should use parens. This is fine:
911
@example
if ( (1 == foo) || ((0 == bar) && (x != y)) )
  return x;
@end example
916
917
918However, this is not:
919
@example
if (1 == foo)
  return x;
if (0 == bar && x != y)
  return x;
@end example
926
927@noindent
Note that splitting the @code{if} statement above is debatable as the
929@code{return x} is a very trivial statement. However, once the logic after
930the branch becomes more complicated (and is still identical), the "or"
931formulation should be used for sure.
932
933@item There should be two empty lines between the end of the function and
934the comments describing the following function. There should be a single
935empty line after the initial variable declarations of a function. If a
936function has no local variables, there should be no initial empty line. If
937a long function consists of several complex steps, those steps might be
938separated by an empty line (possibly followed by a comment describing the
939following step). The code should not contain empty lines in arbitrary
940places; if in doubt, it is likely better to NOT have an empty line (this
941way, more code will fit on the screen).
942@end itemize
943
944@c ***********************************************************************
945@node Build-system
946@section Build-system
947
If you have code that is likely not to compile, or build rules that you
do not want to trigger for most developers, wrap them in
@code{if HAVE_EXPERIMENTAL} in your @file{Makefile.am}.
Then it is OK to (temporarily) add non-compiling (or known-to-not-port)
code.
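
@noindent
A minimal sketch of such a guard in a @file{Makefile.am} is shown below.
The names @code{test_foo}, @code{test_foo_experimental} and
@code{EXPERIMENTAL_TESTS} are hypothetical and only illustrate the use of
the Automake conditional:

@example
if HAVE_EXPERIMENTAL
 EXPERIMENTAL_TESTS = test_foo_experimental
endif

check_PROGRAMS = \
 test_foo \
 $(EXPERIMENTAL_TESTS)
@end example

@noindent
With this layout the experimental test is only built and run when the
conditional is enabled, so it cannot break the build for everyone else.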
953
954If you want to compile all testcases but NOT run them, run configure with
955the @code{--enable-test-suppression} option.
956
957If you want to run all testcases, including those that take a while, run
958configure with the @code{--enable-expensive-testcases} option.
959
960If you want to compile and run benchmarks, run configure with the
961@code{--enable-benchmarks} option.
962
963If you want to obtain code coverage results, run configure with the
964@code{--enable-coverage} option and run the @file{coverage.sh} script in
965the @file{contrib/} directory.
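
@noindent
For example, a build that also compiles the benchmarks and runs the
expensive testcases could be set up as sketched below. The combination of
options is only an illustration, and the installation step is included on
the assumption that plugins must be installed before the tests can run:

@example
./configure --enable-benchmarks --enable-expensive-testcases
make
make install
make check
@end example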
966
967@cindex gnunet-ext
968@node Developing extensions for GNUnet using the gnunet-ext template
969@section Developing extensions for GNUnet using the gnunet-ext template
970
For developers who want to write extensions for GNUnet we provide the
gnunet-ext template as an easy-to-use skeleton.
973
974gnunet-ext contains the build environment and template files for the
975development of GNUnet services, command line tools, APIs and tests.
976
977First of all you have to obtain gnunet-ext from git:
978
979@example
980git clone https://gnunet.org/git/gnunet-ext.git
981@end example
982
The next step is to bootstrap and configure it. For configure you have to
provide the path containing GNUnet with
@code{--with-gnunet=/path/to/gnunet} and the prefix where you want to
install the extension using @code{--prefix=/path/to/install}:
987
988@example
989./bootstrap
990./configure --prefix=/path/to/install --with-gnunet=/path/to/gnunet
991@end example
992
When your GNUnet installation is not included in the default linker search
path, you have to add @code{/path/to/gnunet/lib} to the file
@file{/etc/ld.so.conf} and run @code{ldconfig}, or add it to the
environment variable @code{LD_LIBRARY_PATH} by using
997
998@example
999export LD_LIBRARY_PATH=/path/to/gnunet/lib
1000@end example
1001
1002@cindex writing testcases
1003@node Writing testcases
1004@section Writing testcases
1005
1006Ideally, any non-trivial GNUnet code should be covered by automated
1007testcases. Testcases should reside in the same place as the code that is
1008being tested. The name of source files implementing tests should begin
1009with "@code{test_}" followed by the name of the file
1010that contains the code that is being tested.
1011
1012Testcases in GNUnet should be integrated with the autotools build system.
1013This way, developers and anyone building binary packages will be able to
1014run all testcases simply by running @code{make check}. The final
1015testcases shipped with the distribution should output at most some brief
1016progress information and not display debug messages by default. The
1017success or failure of a testcase must be indicated by returning zero
1018(success) or non-zero (failure) from the main method of the testcase.
1019The integration with the autotools is relatively straightforward and only
1020requires modifications to the @file{Makefile.am} in the directory
1021containing the testcase. For a testcase testing the code in @file{foo.c}
1022the @file{Makefile.am} would contain the following lines:
1023
1024@example
1025check_PROGRAMS = test_foo
1026TESTS = $(check_PROGRAMS)
1027test_foo_SOURCES = test_foo.c
1028test_foo_LDADD = $(top_builddir)/src/util/libgnunetutil.la
1029@end example
1030
1031Naturally, other libraries used by the testcase may be specified in the
1032@code{LDADD} directive as necessary.
1033
1034Often testcases depend on additional input files, such as a configuration
1035file. These support files have to be listed using the @code{EXTRA_DIST}
1036directive in order to ensure that they are included in the distribution.
1037
1038Example:
1039
1040@example
1041EXTRA_DIST = test_foo_data.conf
1042@end example
1043
1044Executing @code{make check} will run all testcases in the current
1045directory and all subdirectories. Testcases can be compiled individually
1046by running @code{make test_foo} and then invoked directly using
1047@code{./test_foo}. Note that due to the use of plugins in GNUnet, it is
1048typically necessary to run @code{make install} before running any
1049testcases. Thus the canonical command @code{make check install} has to be
1050changed to @code{make install check} for GNUnet.
1051
1052@cindex TESTING library
1053@node GNUnet's TESTING library
1054@section GNUnet's TESTING library
1055
The TESTING library is used for writing testcases which involve starting
one or more peers. While peers can also be started by testcases
using the ARM subsystem, the TESTING library provides a more elegant way
to do this. The configurations of the peers are auto-generated from a given
template to have non-conflicting port numbers, ensuring that peers'
services do not run into bind errors. This is achieved by testing ports'
availability by binding a listening socket to them before allocating them
to services in the generated configurations.
1064
Another advantage of using TESTING is that it shortens the testcase
startup time, as the hostkeys for peers are copied from a pre-computed set
of hostkeys instead of being generated at peer startup, which may take a
considerable amount of time when starting multiple peers or on an embedded
processor.
1070
1071TESTING also allows for certain services to be shared among peers. This
1072feature is invaluable when testing with multiple peers as it helps to
reduce the number of services run per peer and hence the total
1074number of processes run per testcase.
1075
The TESTING library only handles creating, starting and stopping peers.
Features useful for testcases such as connecting peers in a topology are
not available in TESTING but are available in the TESTBED subsystem.
Furthermore, TESTING only creates peers on the localhost; by
using TESTBED, testcases can benefit from creating peers across multiple
hosts.
1082
1083@menu
1084* API::
1085* Finer control over peer stop::
1086* Helper functions::
1087* Testing with multiple processes::
1088@end menu
1089
1090@cindex TESTING API
1091@node API
1092@subsection API
1093
TESTING abstracts a group of peers as a TESTING system. All peers in a
system share a common hostname, and no two services of these peers use
the same port or UNIX domain socket path.

A TESTING system can be created with the function
1099@code{GNUNET_TESTING_system_create()} which returns a handle to the
1100system. This function takes a directory path which is used for generating
1101the configurations of peers, an IP address from which connections to the
1102peers' services should be allowed, the hostname to be used in peers'
1103configuration, and an array of shared service specifications of type
1104@code{struct GNUNET_TESTING_SharedService}.
1105
1106The shared service specification must specify the name of the service to
1107share, the configuration pertaining to that shared service and the
1108maximum number of peers that are allowed to share a single instance of
1109the shared service.
1110
A TESTING system created with @code{GNUNET_TESTING_system_create()} chooses
ports from the default range @code{12000} - @code{56000} while
auto-generating configurations for peers.
This range can be customised with the function
@code{GNUNET_TESTING_system_create_with_portrange()}. This function is
similar to @code{GNUNET_TESTING_system_create()} except that it takes two
additional parameters: the start and end of the port range to use.
1118
A TESTING system is destroyed with the function
@code{GNUNET_TESTING_system_destroy()}. This function takes the handle of
the system and a flag to remove the files created in the directory used
to generate configurations.
1123
A peer is created with the function
@code{GNUNET_TESTING_peer_configure()}. This function takes the system
handle, a configuration template from which the configuration for the peer
is auto-generated and the index from which the hostkey for the peer has to
be copied. When successful, this function returns a handle to the
peer which can be used to start and stop it and to obtain the identity of
the peer. If unsuccessful, a NULL pointer is returned with an error
message. This function adjusts the generated configuration to have
non-conflicting ports and paths.
1133
Peers can be started and stopped by calling the functions
@code{GNUNET_TESTING_peer_start()} and @code{GNUNET_TESTING_peer_stop()}
respectively. A peer can be destroyed by calling the function
@code{GNUNET_TESTING_peer_destroy()}. When a peer is destroyed, the ports
and paths allocated in its configuration are reclaimed for usage in new
peers.
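
@noindent
The life cycle described above can be sketched as follows. This is only
an illustration: the directory name @file{test-foo}, the template
@file{test_foo_data.conf} and the key index @code{0} are made up, and the
argument order reflects our understanding of
@file{gnunet_testing_lib.h}, so it should be checked against the header
before use:

@example
struct GNUNET_TESTING_System *sys;
struct GNUNET_TESTING_Peer *peer;
struct GNUNET_CONFIGURATION_Handle *cfg;
struct GNUNET_PeerIdentity id;
char *emsg = NULL;

/* no shared services; control connections allowed from localhost */
sys = GNUNET_TESTING_system_create ("test-foo", "127.0.0.1",
                                    "localhost", NULL);
GNUNET_assert (NULL != sys);
/* load the configuration template (assumed file name) */
cfg = GNUNET_CONFIGURATION_create ();
GNUNET_assert (GNUNET_OK ==
               GNUNET_CONFIGURATION_load (cfg, "test_foo_data.conf"));
/* derive a conflict-free configuration and copy hostkey number 0 */
peer = GNUNET_TESTING_peer_configure (sys, cfg, 0, &id, &emsg);
GNUNET_assert (NULL != peer);
GNUNET_assert (GNUNET_OK == GNUNET_TESTING_peer_start (peer));
/* do real test work here */
GNUNET_assert (GNUNET_OK == GNUNET_TESTING_peer_stop (peer));
GNUNET_TESTING_peer_destroy (peer);
GNUNET_CONFIGURATION_destroy (cfg);
/* GNUNET_YES: also remove the files created below "test-foo" */
GNUNET_TESTING_system_destroy (sys, GNUNET_YES);
@end example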
1140
1141@c ***********************************************************************
1142@node Finer control over peer stop
1143@subsection Finer control over peer stop
1144
Using @code{GNUNET_TESTING_peer_stop()} is normally fine for testcases.
However, calling this function for each peer is inefficient when trying to
shut down multiple peers, as this function sends the termination signal to
the given peer process and waits for it to terminate. It would be faster
in this case to send the termination signals to the peers first and then
wait on them. This is accomplished by the function
@code{GNUNET_TESTING_peer_kill()}, which sends a termination signal to the
peer, and the function @code{GNUNET_TESTING_peer_wait()}, which waits on
the peer.
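
@noindent
A minimal sketch of this two-phase shutdown, assuming a hypothetical
array @code{peers} that holds @code{num_peers} handles of started peers:

@example
unsigned int i;

/* first signal all peers ... */
for (i = 0; i < num_peers; i++)
  GNUNET_break (GNUNET_OK == GNUNET_TESTING_peer_kill (peers[i]));
/* ... then wait for each of them to terminate */
for (i = 0; i < num_peers; i++)
  GNUNET_break (GNUNET_OK == GNUNET_TESTING_peer_wait (peers[i]));
@end example

@noindent
This way the peers terminate in parallel, instead of each one being
waited for before the next is signalled.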
1154
1155Further finer control can be achieved by choosing to stop a peer
1156asynchronously with the function @code{GNUNET_TESTING_peer_stop_async()}.
1157This function takes a callback parameter and a closure for it in addition
1158to the handle to the peer to stop. The callback function is called with
1159the given closure when the peer is stopped. Using this function
1160eliminates blocking while waiting for the peer to terminate.
1161
1162An asynchronous peer stop can be cancelled by calling the function
1163@code{GNUNET_TESTING_peer_stop_async_cancel()}. Note that calling this
1164function does not prevent the peer from terminating if the termination
signal has already been sent to it. It does, however, cancel the
callback to be called when the peer is stopped.
1167
1168@c ***********************************************************************
1169@node Helper functions
1170@subsection Helper functions
1171
1172Most of the testcases can benefit from an abstraction which configures a
1173peer and starts it. This is provided by the function
1174@code{GNUNET_TESTING_peer_run()}. This function takes the testing
1175directory pathname, a configuration template, a callback and its closure.
1176This function creates a peer in the given testing directory by using the
1177configuration template, starts the peer and calls the given callback with
1178the given closure.
1179
1180The function @code{GNUNET_TESTING_peer_run()} starts the ARM service of
1181the peer which starts the rest of the configured services. A similar
1182function @code{GNUNET_TESTING_service_run} can be used to just start a
1183single service of a peer. In this case, the peer's ARM service is not
1184started; instead, only the given service is run.
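
@noindent
As a rough sketch, a testcase built on @code{GNUNET_TESTING_peer_run()}
could look like this. The names @file{test-foo} and
@file{test_foo_data.conf} are made up, and the callback signature and
return value convention are given as we understand them from
@file{gnunet_testing_lib.h}; please verify them against the header:

@example
#include "platform.h"
#include "gnunet_util_lib.h"
#include "gnunet_testing_lib.h"

/* result of the testcase: 0 on success */
static int ret;

static void
test_main (void *cls,
           const struct GNUNET_CONFIGURATION_Handle *cfg,
           struct GNUNET_TESTING_Peer *peer)
@{
  /* the peer is running; do real test work here */
  ret = 0;
  GNUNET_SCHEDULER_shutdown ();
@}

int
main (int argc, char *argv[])
@{
  ret = 1;
  if (0 != GNUNET_TESTING_peer_run ("test-foo",
                                    "test_foo_data.conf",
                                    &test_main,
                                    NULL))
    return 1;
  return ret;
@}
@end example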
1185
1186@c ***********************************************************************
1187@node Testing with multiple processes
1188@subsection Testing with multiple processes
1189
When testing GNUnet, the splitting of the code into services and clients
often complicates testing. The solution to this is to have the testcase
fork @code{gnunet-service-arm}, ask it to start the required server and
daemon processes and then execute appropriate client actions (to test the
client APIs or the core module or both). If necessary, multiple ARM
services can be forked using different ports (!) to simulate a network.
However, most of the time only one ARM process is needed. Note that on
exit, the testcase should shut down ARM with a @code{TERM} signal (to give
it the chance to cleanly stop its child processes).
1199
1200The following code illustrates spawning and killing an ARM process from a
1201testcase:
1202
@example
static void
run (void *cls,
     char *const *args,
     const char *cfgfile,
     const struct GNUNET_CONFIGURATION_Handle *cfg)
@{
  struct GNUNET_OS_Process *arm_pid;

  /* spawn ARM with the configuration file of this testcase */
  arm_pid = GNUNET_OS_start_process (NULL,
                                     NULL,
                                     "gnunet-service-arm",
                                     "gnunet-service-arm",
                                     "-c",
                                     cfgfile,
                                     NULL);
  /* do real test work here */
  if (0 != GNUNET_OS_process_kill (arm_pid, SIGTERM))
    GNUNET_log_strerror (GNUNET_ERROR_TYPE_WARNING, "kill");
  GNUNET_assert (GNUNET_OK == GNUNET_OS_process_wait (arm_pid));
  GNUNET_OS_process_close (arm_pid);
@}

/* called from main(): */
GNUNET_PROGRAM_run (argc, argv,
                    "NAME-OF-TEST",
                    "nohelp",
                    options,
                    &run,
                    cls);
@end example
1230
1231
1232An alternative way that works well to test plugins is to implement a
1233mock-version of the environment that the plugin expects and then to
1234simply load the plugin directly.
1235
1236@c ***********************************************************************
1237@node Performance regression analysis with Gauger
1238@section Performance regression analysis with Gauger
1239
1240To help avoid performance regressions, GNUnet uses Gauger. Gauger is a
1241simple logging tool that allows remote hosts to send performance data to
1242a central server, where this data can be analyzed and visualized. Gauger
shows graphs of the repository revisions and the performance data recorded
1244for each revision, so sudden performance peaks or drops can be identified
1245and linked to a specific revision number.
1246
1247In the case of GNUnet, the buildbots log the performance data obtained
during the tests after each build. The data can be accessed on GNUnet's
1249Gauger page.
1250
The menu on the left lets you select either the results of just one
build bot (under "Hosts") or review the data from all hosts for a given
test result (under "Metrics"). If the absolute values of the results
differ widely, for instance between arm and amd64 machines, the option
"Normalize" on a metric view can help to get an idea about the
performance evolution across all hosts.
1257
Using Gauger in GNUnet and having the performance of a module tracked over
time is very easy. First, of course, the testcase must generate some
consistent metric which makes sense to have logged. Highly volatile or
randomness-dependent metrics are probably not ideal candidates for
meaningful regression detection.
1263
1264To start logging any value, just include @code{gauger.h} in your testcase
1265code. Then, use the macro @code{GAUGER()} to make the Buildbots log
whatever value is of interest to you to @code{gnunet.org}'s Gauger
server. No setup is necessary as most Buildbots already have everything
in place and new metrics are created on demand. To delete a metric, you
1269need to contact a member of the GNUnet development team (a file will need
1270to be removed manually from the respective directory).
1271
1272The code in the test should look like this:
1273
@example
[other includes]
#include <gauger.h>

int main (int argc, char *argv[]) @{

  [run test, generate data]
  GAUGER("YOUR_MODULE",
         "METRIC_NAME",
         (float)value,
         "UNIT");
@}
@end example
1286
1287Where:
1288
1289@table @asis
1290
1291@item @strong{YOUR_MODULE} is a category in the gauger page and should be
1292the name of the module or subsystem like "Core" or "DHT"
@item @strong{METRIC_NAME} is
1294the name of the metric being collected and should be concise and
1295descriptive, like "PUT operations in sqlite-datastore".
1296@item @strong{value} is the value
1297of the metric that is logged for this run.
1298@item @strong{UNIT} is the unit in
1299which the value is measured, for instance "kb/s" or "kb of RAM/node".
1300@end table
1301
1302If you wish to use Gauger for your own project, you can grab a copy of the
1303latest stable release or check out Gauger's Subversion repository.
1304
1305@cindex TESTBED Subsystem
1306@node GNUnet's TESTBED Subsystem
1307@section GNUnet's TESTBED Subsystem
1308
1309The TESTBED subsystem facilitates testing and measuring of multi-peer
1310deployments on a single host or over multiple hosts.
1311
1312The architecture of the testbed module is divided into the following:
1313@itemize @bullet
1314
@item Testbed API: An API which is used by the testing driver programs. It
provides functions for creating, destroying, starting, stopping
peers, etc.
1318
1319@item Testbed service (controller): A service which is started through the
1320Testbed API. This service handles operations to create, destroy, start,
1321stop peers, connect them, modify their configurations.
1322
1323@item Testbed helper: When a controller has to be started on a host, the
1324testbed API starts the testbed helper on that host which in turn starts
1325the controller. The testbed helper receives a configuration for the
1326controller through its stdin and changes it to ensure the controller
1327doesn't run into any port conflict on that host.
1328@end itemize
1329
1330
1331The testbed service (controller) is different from the other GNUnet
1332services in that it is not started by ARM and is not supposed to be run
1333as a daemon. It is started by the testbed API through a testbed helper.
1334In a typical scenario involving multiple hosts, a controller is started
1335on each host. Controllers take up the actual task of creating peers,
starting and stopping them on the hosts they run on.
1337
While running deployments on a single localhost the testbed API starts the
testbed helper directly as a child process. When running deployments on
remote hosts the testbed API starts Testbed Helpers on each remote host
through a remote shell. By default the testbed API uses SSH as the remote
shell. This can be changed by setting the environment variable
@code{GNUNET_TESTBED_RSH_CMD} to the required remote shell program. This
variable can also contain parameters which are to be passed to the remote
shell program. For example:
1346
@example
export GNUNET_TESTBED_RSH_CMD="ssh -o BatchMode=yes \
-o NoHostAuthenticationForLocalhost=yes %h"
@end example
1351
The above command string also allows for substitutions through
placemarks which begin with a `%'. At present the
following substitutions are supported:
1355
1356@itemize @bullet
1357@item
1358%h: hostname
1359@item
1360%u: username
1361@item
1362%p: port
1363@end itemize
1364
Note that the substitution placemark is replaced only when the
corresponding field is available and only once. Specifying

@example
%u@atchar{}%h
@end example

@noindent
does not work either.
If you want to use username substitutions for SSH,
use the argument @code{-l} before the username substitution.
For example:

@example
ssh -l %u -p %p %h
@end example
1377
The testbed API and the helper communicate through the helper's stdin and
stdout. As the helper is started through a remote shell on remote hosts,
any output messages from the remote shell interfere with the communication
and result in a failure while starting the helper. For this reason, it is
suggested to use flags to make the remote shells produce no output
messages and to have password-less logins. For the default remote shell,
SSH, the default options are:
1385
@example
-o BatchMode=yes -o NoHostAuthenticationForLocalhost=yes
@end example
1389
1390Password-less logins should be ensured by using SSH keys.
1391
Since the testbed API executes the remote shell as a non-interactive
shell, certain scripts like .bashrc, .profile may not be executed. If
this is the case, the testbed API can be forced to execute an interactive
shell by setting the environment variable
@code{GNUNET_TESTBED_RSH_CMD_SUFFIX} to a shell program.
1397
1398An example could be:
1399
1400@example
1401export GNUNET_TESTBED_RSH_CMD_SUFFIX="sh -lc"
1402@end example
1403
1404The testbed API will then execute the remote shell program as:
1405
1406@example
1407$GNUNET_TESTBED_RSH_CMD -p $port $dest $GNUNET_TESTBED_RSH_CMD_SUFFIX \
1408gnunet-helper-testbed
1409@end example
1410
1411On some systems, problems may arise while starting testbed helpers if
1412GNUnet is installed into a custom location since the helper may not be
1413found in the standard path. This can be addressed by setting the variable
1414`@code{HELPER_BINARY_PATH}' to the path of the testbed helper.
1415Testbed API will then use this path to start helper binaries both
1416locally and remotely.
1417
The testbed API can be accessed by including the
"@file{gnunet_testbed_service.h}" file and linking with
@code{-lgnunettestbed}.
1420
1421@c ***********************************************************************
1422@menu
1423* Supported Topologies::
1424* Hosts file format::
1425* Topology file format::
1426* Testbed Barriers::
1427* Automatic large-scale deployment in the PlanetLab testbed::
1428* TESTBED Caveats::
1429@end menu
1430
1431@node Supported Topologies
1432@subsection Supported Topologies
1433
1434While testing multi-peer deployments, it is often needed that the peers
1435are connected in some topology. This requirement is addressed by the
1436function @code{GNUNET_TESTBED_overlay_connect()} which connects any given
1437two peers in the testbed.
1438
1439The API also provides a helper function
1440@code{GNUNET_TESTBED_overlay_configure_topology()} to connect a given set
1441of peers in any of the following supported topologies:
1442
1443@itemize @bullet
1444
1445@item @code{GNUNET_TESTBED_TOPOLOGY_CLIQUE}: All peers are connected with
1446each other
1447
1448@item @code{GNUNET_TESTBED_TOPOLOGY_LINE}: Peers are connected to form a
1449line
1450
1451@item @code{GNUNET_TESTBED_TOPOLOGY_RING}: Peers are connected to form a
1452ring topology
1453
@item @code{GNUNET_TESTBED_TOPOLOGY_2D_TORUS}: Peers are connected to
form a 2 dimensional torus topology. If the number of peers is not a
perfect square, the resulting torus may not have uniform
poloidal and toroidal lengths
1458
1459@item @code{GNUNET_TESTBED_TOPOLOGY_ERDOS_RENYI}: Topology is generated
1460to form a random graph. The number of links to be present should be given
1461
@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD}: Peers are connected to
form a 2D Torus with some random links among them. The number of random
links is to be given

@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD_RING}: Peers are
connected to form a ring with some random links among them. The number of
random links is to be given
1469
@item @code{GNUNET_TESTBED_TOPOLOGY_SCALE_FREE}: Connects peers in a
topology where peer connectivity follows a power law: new peers are
connected with high probability to well connected peers.
1473@footnote{See Emergence of Scaling in Random Networks. Science 286,
1474509-512, 1999
1475(@uref{https://gnunet.org/git/bibliography.git/plain/docs/emergence_of_scaling_in_random_networks__barabasi_albert_science_286__1999.pdf, pdf})}
1476
1477@item @code{GNUNET_TESTBED_TOPOLOGY_FROM_FILE}: The topology information
1478is loaded from a file. The path to the file has to be given.
1479@xref{Topology file format}, for the format of this file.
1480
1481@item @code{GNUNET_TESTBED_TOPOLOGY_NONE}: No topology
1482@end itemize
1483
1484
1485The above supported topologies can be specified respectively by setting
1486the variable @code{OVERLAY_TOPOLOGY} to the following values in the
1487configuration passed to Testbed API functions
1488@code{GNUNET_TESTBED_test_run()} and
1489@code{GNUNET_TESTBED_run()}:
1490
1491@itemize @bullet
1492@item @code{CLIQUE}
1493@item @code{RING}
1494@item @code{LINE}
1495@item @code{2D_TORUS}
1496@item @code{RANDOM}
1497@item @code{SMALL_WORLD}
1498@item @code{SMALL_WORLD_RING}
1499@item @code{SCALE_FREE}
1500@item @code{FROM_FILE}
1501@item @code{NONE}
1502@end itemize
1503
1504
1505Topologies @code{RANDOM}, @code{SMALL_WORLD} and @code{SMALL_WORLD_RING}
1506require the option @code{OVERLAY_RANDOM_LINKS} to be set to the number of
1507random links to be generated in the configuration. The option will be
1508ignored for the rest of the topologies.
1509
Topology @code{SCALE_FREE} requires the option
@code{SCALE_FREE_TOPOLOGY_CAP} to be set to the maximum number of peers
which can connect to a peer and the option @code{SCALE_FREE_TOPOLOGY_M} to
be set to the minimum number of peers a peer should be connected to.
1514
1515Similarly, the topology @code{FROM_FILE} requires the option
1516@code{OVERLAY_TOPOLOGY_FILE} to contain the path of the file containing
1517the topology information. This option is ignored for the rest of the
1518topologies. @xref{Topology file format}, for the format of this file.
1519
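As a sketch of how these options fit together, the two configuration
fragments below select a small-world ring with random links and a
topology loaded from a file. The option names are the ones described
above. The @code{[testbed]} section name, the number of links and the
file path are assumptions chosen for illustration and should be adapted
to your setup.

@example
[testbed]
OVERLAY_TOPOLOGY = SMALL_WORLD_RING
OVERLAY_RANDOM_LINKS = 10
@end example

@example
[testbed]
OVERLAY_TOPOLOGY = FROM_FILE
OVERLAY_TOPOLOGY_FILE = /path/to/topology.txt
@end example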
1520@c ***********************************************************************
1521@node Hosts file format
1522@subsection Hosts file format
1523
1524The testbed API offers the function
1525@code{GNUNET_TESTBED_hosts_load_from_file()} to load from a given file
1526details about the hosts which testbed can use for deploying peers.
1527This function is useful to keep the data about hosts
1528separate instead of hard coding them in code.
1529
1530Another helper function from testbed API, @code{GNUNET_TESTBED_run()}
1531also takes a hosts file name as its parameter. It uses the above
1532function to populate the hosts data structures and start controllers to
1533deploy peers.
1534
1535These functions require the hosts file to be of the following format:
1536@itemize @bullet
1537@item Each line is interpreted to have details about a host
1538@item Host details should include the username to use for logging into the
1539host, the hostname of the host and the port number to use for the remote
shell program. All three values should be given.
1541@item These details should be given in the following format:
1542@example
1543<username>@@<hostname>:<port>
1544@end example
1545@end itemize
1546
Note that using canonical hostnames may cause problems while resolving
the IP addresses (See this bug). Hence it is advised to provide the hosts'
numerical IP addresses as hostnames whenever possible.
1550
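For illustration, a hosts file describing two hosts could look as
follows. The user name and the addresses (taken from the range reserved
for documentation) are placeholders and not real testbed hosts.

@example
alice@@192.0.2.10:22
alice@@192.0.2.11:22
@end example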
1551@c ***********************************************************************
1552@node Topology file format
1553@subsection Topology file format
1554
1555A topology file describes how peers are to be connected. It should adhere
1556to the following format for testbed to parse it correctly.
1557
Each line should begin with the target peer id. This should be followed by
a colon (`:') and origin peer ids separated by `|'. All spaces except for
newline characters are ignored. The API will then try to connect each
origin peer to the target peer.
1562
For example, the following file will result in 5 overlay connections:
[2->1], [3->1], [4->3], [0->3], [2->0]

@example
1:2|3
3:4| 0
0: 2
@end example
1566
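The format rules above can be sketched in a few lines of Python. This is
only an illustration of the file format, not the parser used by testbed.
The function name @code{parse_topology} is made up for this example, and
unlike a real parser the sketch does no error handling.

@example
def parse_topology(text):
    """Return (origin, target) pairs described by a topology file."""
    links = []
    for raw in text.splitlines():
        # Spaces are ignored; only the newline separates entries.
        line = "".join(raw.split())
        if not line:
            continue
        target, _, origins = line.partition(":")
        for origin in origins.split("|"):
            links.append((int(origin), int(target)))
    return links

example = "1:2|3\n3:4| 0\n0: 2\n"
for origin, target in parse_topology(example):
    print("[%d->%d]" % (origin, target))
@end example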
1567@c ***********************************************************************
1568@node Testbed Barriers
1569@subsection Testbed Barriers
1570
1571The testbed subsystem's barriers API facilitates coordination among the
1572peers run by the testbed and the experiment driver. The concept is
1573similar to the barrier synchronisation mechanism found in parallel
1574programming or multi-threading paradigms - a peer waits at a barrier upon
1575reaching it until the barrier is reached by a predefined number of peers.
1576This predefined number of peers required to cross a barrier is also called
the quorum. We say a peer has reached a barrier if the peer is waiting for the
1578barrier to be crossed. Similarly a barrier is said to be reached if the
1579required quorum of peers reach the barrier. A barrier which is reached is
1580deemed as crossed after all the peers waiting on it are notified.
1581
1582The barriers API provides the following functions:
1583@itemize @bullet
@item @strong{@code{GNUNET_TESTBED_barrier_init()}:} function to
initialise a barrier in the experiment
1586@item @strong{@code{GNUNET_TESTBED_barrier_cancel()}:} function to cancel
1587a barrier which has been initialised before
1588@item @strong{@code{GNUNET_TESTBED_barrier_wait()}:} function to signal
1589barrier service that the caller has reached a barrier and is waiting for
1590it to be crossed
1591@item @strong{@code{GNUNET_TESTBED_barrier_wait_cancel()}:} function to
1592stop waiting for a barrier to be crossed
1593@end itemize
1594
1595
1596Among the above functions, the first two, namely
1597@code{GNUNET_TESTBED_barrier_init()} and
1598@code{GNUNET_TESTBED_barrier_cancel()} are used by experiment drivers. All
1599barriers should be initialised by the experiment driver by calling
1600@code{GNUNET_TESTBED_barrier_init()}. This function takes a name to
1601identify the barrier, the quorum required for the barrier to be crossed
1602and a notification callback for notifying the experiment driver when the
barrier is crossed. @code{GNUNET_TESTBED_barrier_cancel()} cancels an
initialised barrier and frees the resources allocated for it. This
function can be called on an initialised barrier before it is crossed.
1606
1607The remaining two functions @code{GNUNET_TESTBED_barrier_wait()} and
1608@code{GNUNET_TESTBED_barrier_wait_cancel()} are used in the peer's
1609processes. @code{GNUNET_TESTBED_barrier_wait()} connects to the local
1610barrier service running on the same host the peer is running on and
1611registers that the caller has reached the barrier and is waiting for the
1612barrier to be crossed. Note that this function can only be used by peers
1613which are started by testbed as this function tries to access the local
1614barrier service which is part of the testbed controller service. Calling
1615@code{GNUNET_TESTBED_barrier_wait()} on an uninitialised barrier results
1616in failure. @code{GNUNET_TESTBED_barrier_wait_cancel()} cancels the
1617notification registered by @code{GNUNET_TESTBED_barrier_wait()}.
1618
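The semantics described above can be modelled with a small Python
sketch. It is a toy model and does not call the testbed API. The names
@code{Barrier}, @code{barrier_init} and @code{barrier_wait} are
hypothetical, the quorum is taken to be a plain number of peers, and
peers that arrive after the barrier was crossed are not modelled.

@example
class Barrier:
    """Toy model of one barrier; the quorum is a number of peers."""

    def __init__(self, name, quorum, on_crossed):
        self.name = name
        self.quorum = quorum
        self.on_crossed = on_crossed  # notification callback of the driver
        self.waiting = []             # callbacks of peers that reached it

    def wait(self, peer_callback):
        """A peer reached the barrier and waits for it to be crossed."""
        self.waiting.append(peer_callback)
        if len(self.waiting) == self.quorum:
            # Quorum attained: inform the driver, then release the peers.
            self.on_crossed(self.name)
            for callback in self.waiting:
                callback(self.name)

barriers = dict()

def barrier_init(name, quorum, on_crossed):
    """Driver side: a barrier must be initialised before peers wait."""
    barriers[name] = Barrier(name, quorum, on_crossed)

def barrier_wait(name, peer_callback):
    """Peer side: waiting on an uninitialised barrier fails."""
    if name not in barriers:
        raise KeyError("barrier was not initialised: " + name)
    barriers[name].wait(peer_callback)
@end example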
1619
1620@c ***********************************************************************
1621@menu
1622* Implementation::
1623@end menu
1624
1625@node Implementation
1626@subsubsection Implementation
1627
1628Since barriers involve coordination between experiment driver and peers,
1629the barrier service in the testbed controller is split into two
1630components. The first component responds to the message generated by the
1631barrier API used by the experiment driver (functions
1632@code{GNUNET_TESTBED_barrier_init()} and
1633@code{GNUNET_TESTBED_barrier_cancel()}) and the second component to the
1634messages generated by barrier API used by peers (functions
1635@code{GNUNET_TESTBED_barrier_wait()} and
1636@code{GNUNET_TESTBED_barrier_wait_cancel()}).
1637
1638Calling @code{GNUNET_TESTBED_barrier_init()} sends a
1639@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_INIT} message to the master
controller. The master controller then registers a barrier and calls
@code{GNUNET_TESTBED_barrier_init()} for each of its subcontrollers. In this
way barrier initialisation is propagated through the controller hierarchy.
1643While propagating initialisation, any errors at a subcontroller such as
1644timeout during further propagation are reported up the hierarchy back to
1645the experiment driver.
1646
Similar to @code{GNUNET_TESTBED_barrier_init()},
@code{GNUNET_TESTBED_barrier_cancel()} propagates a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_CANCEL} message, which causes
controllers to remove an initialised barrier.
1651
1652The second component is implemented as a separate service in the binary
1653`gnunet-service-testbed' which already has the testbed controller service.
1654Although this deviates from the gnunet process architecture of having one
1655service per binary, it is needed in this case as this component needs
1656access to barrier data created by the first component. This component
1657responds to @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages from
1658local peers when they call @code{GNUNET_TESTBED_barrier_wait()}. Upon
receiving a @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} message, the
service checks whether the requested barrier has been initialised. If it
has not, an error status is sent through a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to the local
peer and the connection from the peer is terminated. If the barrier has
been initialised, the barrier's counter for reached peers is incremented
and a notification is registered to notify the peer when the barrier is
reached. The connection from the peer is left open.
1667
When enough peers to attain the quorum have sent
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages, the controller
sends a @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to its
parent, informing it that the barrier is crossed. If the controller has
started further subcontrollers, it delays this message until it receives
a similar notification from each of those subcontrollers. Finally, the
barriers API at the experiment driver receives the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message when the barrier
is reached at all the controllers.
1677
The barriers API at the experiment driver responds to the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message by echoing it
back to the master controller and notifying the experiment driver
through the notification callback that a barrier has been crossed. The
echoed @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message is
propagated by the master controller to the controller hierarchy. This
propagation triggers the notifications registered by peers at each of the
controllers in the hierarchy. Note the difference between the downward
propagation of the @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS}
message and its upward propagation: the upward propagation is needed
to ensure that the barrier is reached by all the controllers, and the
downward propagation signals that the barrier is crossed.
1690
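The two directions of propagation can be modelled with the following
Python sketch. It is a simplified model and not the controller
implementation. The class @code{Controller}, the function
@code{driver_poll} and the idea of a per-controller
@code{local_quorum} are assumptions made for this example. How the
quorum is actually divided among controllers is not specified here.

@example
class Controller:
    """Toy controller: local peers plus optional subcontrollers."""

    def __init__(self, name, local_quorum, children=()):
        self.name = name
        self.local_quorum = local_quorum
        self.children = list(children)
        self.local_waiting = 0
        self.released = False

    def peer_wait(self):
        """A local peer sent a BARRIER_WAIT message."""
        self.local_waiting += 1

    def reached(self):
        """Upward direction: may this controller report to its parent?"""
        return (self.local_waiting >= self.local_quorum
                and all(child.reached() for child in self.children))

    def release(self):
        """Downward direction: the echoed status releases waiting peers."""
        self.released = True
        for child in self.children:
            child.release()

def driver_poll(master):
    """Driver side: echo the status once the whole hierarchy reached."""
    if master.reached():
        master.release()
        return True
    return False
@end example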
1691@cindex PlanetLab testbed
1692@node Automatic large-scale deployment in the PlanetLab testbed
1693@subsection Automatic large-scale deployment in the PlanetLab testbed
1694
1695PlanetLab is a testbed for computer networking and distributed systems
1696research. It was established in 2002 and as of June 2010 was composed of
16971090 nodes at 507 sites worldwide.
1698
To simplify large-scale deployment of GNUnet, we created a set of
automation tools. We provide a set of scripts you can use
to deploy GNUnet on a set of nodes and manage your installation.
1702
1703Please also check @uref{https://gnunet.org/installation-fedora8-svn} and
@uref{https://gnunet.org/installation-fedora12-svn} to find detailed
instructions on how to install GNUnet on a PlanetLab node.
1706
1707
1708@c ***********************************************************************
1709@menu
1710* PlanetLab Automation for Fedora8 nodes::
1711* Install buildslave on PlanetLab nodes running fedora core 8::
1712* Setup a new PlanetLab testbed using GPLMT::
1713* Why do i get an ssh error when using the regex profiler?::
1714@end menu
1715
1716@node PlanetLab Automation for Fedora8 nodes
1717@subsubsection PlanetLab Automation for Fedora8 nodes
1718
1719@c ***********************************************************************
1720@node Install buildslave on PlanetLab nodes running fedora core 8
1721@subsubsection Install buildslave on PlanetLab nodes running fedora core 8
1722@c ** Actually this is a subsubsubsection, but must be fixed differently
1723@c ** as subsubsection is the lowest.
1724
Since most of the PlanetLab nodes are running the very old Fedora Core 8
image, installing the buildslave software is quite painful. For our
PlanetLab testbed we figured out how best to install the buildslave
software.
1729
@c This is a very terrible way to suggest installing software.
1731@c FIXME: Is there an official, safer way instead of blind-piping a
1732@c script?
1733@c FIXME: Use newer pypi URLs below.
1734Install Distribute for Python:
1735
1736@example
1737curl http://python-distribute.org/distribute_setup.py | sudo python
1738@end example
1739
Install zope.interface <= 3.8.0 (4.0 and 4.0.1 will not
work):
1742
1743@example
1744export PYPI=@value{PYPI-URL}
1745wget $PYPI/z/zope.interface/zope.interface-3.8.0.tar.gz
tar xvfz zope.interface-3.8.0.tar.gz
1747cd zope.interface-3.8.0
1748sudo python setup.py install
1749@end example
1750
1751Install the buildslave software (0.8.6 was the latest version):
1752
1753@example
1754export GCODE="http://buildbot.googlecode.com/files"
1755wget $GCODE/buildbot-slave-0.8.6p1.tar.gz
1756tar xvfz buildbot-slave-0.8.6p1.tar.gz
cd buildbot-slave-0.8.6p1
1758sudo python setup.py install
1759@end example
1760
1761The setup will download the matching twisted package and install it.
1762It will also try to install the latest version of zope.interface which
1763will fail to install. Buildslave will work anyway since version 3.8.0
1764was installed before!
1765
1766@c ***********************************************************************
1767@node Setup a new PlanetLab testbed using GPLMT
1768@subsubsection Setup a new PlanetLab testbed using GPLMT
1769
1770@itemize @bullet
@item Get a new slice and assign nodes.
Ask your PlanetLab PI to give you a new slice and assign the nodes you
need.
@item Install a buildmaster.
You can stick to the buildbot documentation:
@uref{http://buildbot.net/buildbot/docs/current/manual/installation.html}
@item Install the buildslave software on all nodes.
To install the buildslave on all nodes assigned to your slice you can use
the tasklist @code{install_buildslave_fc8.xml} provided with GPLMT:
1780
1781@example
1782./gplmt.py -c contrib/tumple_gnunet.conf -t \
1783contrib/tasklists/install_buildslave_fc8.xml -a -p <planetlab password>
1784@end example
1785
1786@item Create the buildmaster configuration and the slave setup commands
1787
The master and the slaves need to have credentials, and the
master has to have all nodes configured. This can be done with the
@file{create_buildbot_configuration.py} script in the @file{scripts}
directory.

This script takes a list of nodes, either retrieved directly from
PlanetLab or read from a file, and a configuration template, and creates:

@itemize @bullet
@item a tasklist which can be executed with gplmt to set up the slaves
@item a @file{master.cfg} file containing the PlanetLab nodes
@end itemize

A configuration template is included in the @file{contrib} directory.
Most importantly, the script replaces the following tags in the template:

@itemize @bullet
@item @code{%GPLMT_BUILDER_DEFINITION}
@item @code{GPLMT_BUILDER_SUMMARY}
@item @code{GPLMT_SLAVES}
@item @code{%GPLMT_SCHEDULER_BUILDERS}
@end itemize
1806
1807Create configuration for all nodes assigned to a slice:
1808
1809@example
1810./create_buildbot_configuration.py -u <planetlab username> \
1811-p <planetlab password> -s <slice> -m <buildmaster+port> \
1812-t <template>
1813@end example
1814
1815Create configuration for some nodes in a file:
1816
1817@example
./create_buildbot_configuration.py -f <node_file> \
1819-m <buildmaster+port> -t <template>
1820@end example
1821
@item Copy the @file{master.cfg} to the buildmaster and start it.
Use @code{buildbot start <basedir>} to start the server.
@item Set up the buildslaves.
1825@end itemize
1826
1827@c ***********************************************************************
1828@node Why do i get an ssh error when using the regex profiler?
@subsubsection Why do I get an ssh error when using the regex profiler?
1830
Why do I get an ssh error "Permission denied (publickey,password)." when
using the regex profiler although passwordless ssh to localhost works
using publickey and ssh-agent?

You have to generate a public/private-key pair with no password:

@example
ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_localhost
@end example

and then add the following to your @file{~/.ssh/config} file:

@example
Host 127.0.0.1
IdentityFile ~/.ssh/id_localhost
@end example

Now make sure your hosts file looks like this:
1842
1843@example
[USERNAME]@@127.0.0.1:22
1845[USERNAME]@@127.0.0.1:22
1846@end example
1847
1848You can test your setup by running @code{ssh 127.0.0.1} in a
1849terminal and then in the opened session run it again.
1850If you were not asked for a password on either login,
1851then you should be good to go.
1852
1853@cindex TESTBED Caveats
1854@node TESTBED Caveats
1855@subsection TESTBED Caveats
1856
1857This section documents a few caveats when using the GNUnet testbed
1858subsystem.
1859
1860@c ***********************************************************************
1861@menu
1862* CORE must be started::
1863* ATS must want the connections::
1864@end menu
1865
1866@node CORE must be started
1867@subsubsection CORE must be started
1868
1869A simple issue is #3993: Your configuration MUST somehow ensure that for
each peer the CORE service is started when the peer is set up, otherwise
1871TESTBED may fail to connect peers when the topology is initialized, as
1872TESTBED will start some CORE services but not necessarily all (but it
1873relies on all of them running). The easiest way is to set
1874'FORCESTART = YES' in the '[core]' section of the configuration file.
1875Alternatively, having any service that directly or indirectly depends on
1876CORE being started with FORCESTART will also do. This issue largely arises
1877if users try to over-optimize by not starting any services with
1878FORCESTART.
1879
1880@c ***********************************************************************
1881@node ATS must want the connections
1882@subsubsection ATS must want the connections
1883
1884When TESTBED sets up connections, it only offers the respective HELLO
1885information to the TRANSPORT service. It is then up to the ATS service to
1886@strong{decide} to use the connection. The ATS service will typically
1887eagerly establish any connection if the number of total connections is
1888low (relative to bandwidth). Details may further depend on the
1889specific ATS backend that was configured. If ATS decides to NOT establish
1890a connection (even though TESTBED provided the required information), then
1891that connection will count as failed for TESTBED. Note that you can
1892configure TESTBED to tolerate a certain number of connection failures
1893(see '-e' option of gnunet-testbed-profiler). This issue largely arises
1894for dense overlay topologies, especially if you try to create cliques
1895with more than 20 peers.
1896
1897@cindex libgnunetutil
1898@node libgnunetutil
1899@section libgnunetutil
1900
1901libgnunetutil is the fundamental library that all GNUnet code builds upon.
1902Ideally, this library should contain most of the platform dependent code
1903(except for user interfaces and really special needs that only few
1904applications have). It is also supposed to offer basic services that most
1905if not all GNUnet binaries require. The code of libgnunetutil is in the
@file{src/util/} directory. The public interface to the library is in the
@file{gnunet_util_lib.h} header. The functions provided by libgnunetutil
fall roughly into the following categories (in roughly the order of
importance for new developers):
1910
1911@itemize @bullet
@item logging (common_logging.c)
@item memory allocation (common_allocation.c)
@item endianness conversion (common_endian.c)
@item internationalization (common_gettext.c)
@item string manipulation (string.c)
@item file access (disk.c)
@item buffered disk IO (bio.c)
@item time manipulation (time.c)
@item configuration parsing (configuration.c)
@item command-line handling (getopt*.c)
@item cryptography (crypto_*.c)
@item data structures (container_*.c)
@item CPS-style scheduling (scheduler.c)
@item program initialization (program.c)
@item networking (network.c, client.c, server*.c, service.c)
@item message queueing (mq.c)
@item bandwidth calculations (bandwidth.c)
@item other OS-related functions (os*.c, plugin.c, signal.c)
@item pseudonym management (pseudonym.c)
1931@end itemize
1932
1933It should be noted that only developers that fully understand this entire
1934API will be able to write good GNUnet code.
1935
1936Ideally, porting GNUnet should only require porting the gnunetutil
1937library. More testcases for the gnunetutil APIs are therefore a great
1938way to make porting of GNUnet easier.
1939
1940@menu
1941* Logging::
1942* Interprocess communication API (IPC)::
1943* Cryptography API::
1944* Message Queue API::
1945* Service API::
1946* Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps::
1947* The CONTAINER_MDLL API::
1948@end menu
1949
1950@cindex Logging
1951@cindex log levels
1952@node Logging
1953@subsection Logging
1954
1955GNUnet is able to log its activity, mostly for the purposes of debugging
1956the program at various levels.
1957
1958@file{gnunet_common.h} defines several @strong{log levels}:
1959@table @asis
1960
@item ERROR
for errors (really problematic situations, often leading to
crashes)
@item WARNING
for warnings (troubling situations that might have
negative consequences, although not fatal)
@item INFO
for various information.
Used somewhat rarely, as GNUnet statistics is used to hold and display
most of the information that users might find interesting.
@item DEBUG
for debugging.
Does not produce much output on normal builds, but when extra logging is
enabled at compile time, a staggering amount of data is output under
this log level.
1972@end table
1973
1974
1975Normal builds of GNUnet (configured with @code{--enable-logging[=yes]})
1976are supposed to log nothing under DEBUG level. The
1977@code{--enable-logging=verbose} configure option can be used to create a
build with all logging enabled. However, such a build will produce large
1979amounts of log data, which is inconvenient when one tries to hunt down a
1980specific problem.
1981
1982To mitigate this problem, GNUnet provides facilities to apply a filter to
1983reduce the logs:
1984@table @asis
1985
@item Logging by default
When no log levels are configured in any other
way (see below), GNUnet will default to the WARNING log level. This
mostly applies to GNUnet command line utilities, services and daemons;
tests will always set log level to WARNING or, if
@code{--enable-logging=verbose} was passed to configure, to DEBUG. The
default level is suggested for normal operation.
@item The -L option
Most GNUnet executables accept an "-L loglevel" or
"--log=loglevel" option. If used, it makes the process set a global log
level to "loglevel". Thus it is possible to run some processes
with -L DEBUG, for example, and others with -L ERROR to enable specific
settings to diagnose problems with a particular process.
@item Configuration files
Because GNUnet
service and daemon processes are usually launched by gnunet-arm, it is not
1999possible to pass different custom command line options directly to every
2000one of them. The options passed to @code{gnunet-arm} only affect
2001gnunet-arm and not the rest of GNUnet. However, one can specify a
2002configuration key "OPTIONS" in the section that corresponds to a service
2003or a daemon, and put a value of "-L loglevel" there. This will make the
2004respective service or daemon set its log level to "loglevel" (as the
2005value of OPTIONS will be passed as a command-line argument).
2006
2007To specify the same log level for all services without creating separate
2008"OPTIONS" entries in the configuration for each one, the user can specify
2009a config key "GLOBAL_POSTFIX" in the [arm] section of the configuration
2010file. The value of GLOBAL_POSTFIX will be appended to all command lines
2011used by the ARM service to run other services. It can contain any option
2012valid for all GNUnet commands, thus in particular the "-L loglevel"
2013option. The ARM service itself is, however, unaffected by GLOBAL_POSTFIX;
2014to set log level for it, one has to specify "OPTIONS" key in the [arm]
2015section.
2016@item Environment variables.
2017Setting global per-process log levels with "-L loglevel" does not offer
2018sufficient log filtering granularity, as one service will call interface
2019libraries and supporting libraries of other GNUnet services, potentially
2020producing lots of debug log messages from these libraries. Also, changing
2021the config file is not always convenient (especially when running the
GNUnet test suite). To fix that, and to allow GNUnet to use different
2023log filtering at runtime without re-compiling the whole source tree, the
2024log calls were changed to be configurable at run time. To configure them
2025one has to define environment variables "GNUNET_FORCE_LOGFILE",
2026"GNUNET_LOG" and/or "GNUNET_FORCE_LOG":
2027@itemize @bullet
2028
2029@item "GNUNET_LOG" only affects the logging when no global log level is
2030configured by any other means (that is, the process does not explicitly
2031set its own log level, there are no "-L loglevel" options on command line
2032or in configuration files), and can be used to override the default
2033WARNING log level.
2034
2035@item "GNUNET_FORCE_LOG" will completely override any other log
2036configuration options given.
2037
2038@item "GNUNET_FORCE_LOGFILE" will completely override the location of the
2039file to log messages to. It should contain a relative or absolute file
2040name. Setting GNUNET_FORCE_LOGFILE is equivalent to passing
2041"--log-file=logfile" or "-l logfile" option (see below). It supports "[]"
2042format in file names, but not "@{@}" (see below).
2043@end itemize
2044
2045
2046Because environment variables are inherited by child processes when they
2047are launched, starting or re-starting the ARM service with these
2048variables will propagate them to all other services.
2049
2050"GNUNET_LOG" and "GNUNET_FORCE_LOG" variables must contain a specially
formatted @strong{logging definition} string, which looks like this:
2052
2053@c FIXME: Can we close this with [/component] instead?
2054@example
2055[component];[file];[function];[from_line[-to_line]];loglevel[/component...]
2056@end example
2057
That is, a logging definition consists of definition entries, separated by
slashes ('/'). If only one entry is present, there is no need to add a
slash to its end (although it is not forbidden either).

All definition
fields (component, file, function, lines and loglevel) are mandatory, but
(except for the loglevel) they can be empty. An empty field means
"match anything". Note that even if fields are empty, the semicolon (';')
separators must be present.

The loglevel field is mandatory, and must
contain one of the log level names (ERROR, WARNING, INFO or DEBUG).

The lines field might contain one non-negative number, in which case it
matches only one line, or a range "from_line-to_line", in which case it
matches any line in the interval [from_line;to_line] (that is, including
both start and end line).

GNUnet mostly defaults component name to the
name of the service that is implemented in a process ('transport',
'core', 'peerinfo', etc), but logging calls can specify custom component
names using @code{GNUNET_log_from}.

File name and function name are
provided by the compiler (__FILE__ and __FUNCTION__ built-ins).
2074
2075Component, file and function fields are interpreted as non-extended
2076regular expressions (GNU libc regex functions are used). Matching is
2077case-sensitive, "^" and "$" will match the beginning and the end of the
2078text. If a field is empty, its contents are automatically replaced with
2079a ".*" regular expression, which matches anything. Matching is done in
2080the default way, which means that the expression matches as long as it's
2081contained anywhere in the string. Thus "GNUNET_" will match both
2082"GNUNET_foo" and "BAR_GNUNET_BAZ". Use '^' and/or '$' to make sure that
2083the expression matches at the start and/or at the end of the string.
2084The semicolon (';') can't be escaped, and GNUnet will not use it in
2085component names (it can't be used in function names and file names
2086anyway).
2087
2088@end table
2089
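As an illustration of the configuration-file approach described above,
the following fragment asks one service to log at DEBUG level, all other
services started by ARM at WARNING level, and ARM itself at INFO level.
The keys @code{OPTIONS} and @code{GLOBAL_POSTFIX} are the ones named in
the text. The choice of the @code{[transport]} section and of the
levels is an assumption made for this example.

@example
[transport]
OPTIONS = -L DEBUG

[arm]
GLOBAL_POSTFIX = -L WARNING
OPTIONS = -L INFO
@end example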
2090
2091Every logging call in GNUnet code will be (at run time) matched against
the log definitions passed to the process. If the fields of a log definition
match the call arguments, then the call log level is compared to the
log level of that definition. If the call log level is less than or equal to
the definition log level, the call is allowed to proceed. Otherwise the
logging call is forbidden, and nothing is logged. If no definitions
matched at all, GNUnet will use the global log level or (if a global log
level is not specified) will default to WARNING (that is, it will allow
the call to proceed, if its level is less than or equal to the global log
level or to WARNING).
2101
2102That is, definitions are evaluated from left to right, and the first
2103matching definition is used to allow or deny the logging call. Thus it is
advised to place narrow definitions at the beginning of the logdef
string, and generic definitions at the end.
2106
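The first-match rule can be sketched in Python as follows. This is a
simplified model and not GNUnet's implementation. The names
@code{parse_logdef} and @code{allowed} are hypothetical, Python's
@code{re} module stands in for the POSIX non-extended regular
expressions that GNUnet uses, and malformed definitions raise an
exception here instead of being dropped silently.

@example
import re

# Log levels ordered from least to most verbose.
LEVELS = ["ERROR", "WARNING", "INFO", "DEBUG"]

def parse_logdef(text):
    """Split a logging definition string into a list of entries."""
    entries = []
    for chunk in text.split("/"):
        if not chunk:
            continue  # tolerate a trailing slash
        component, file, function, lines, level = chunk.split(";")
        line_range = None
        if lines:
            first, _, last = lines.partition("-")
            line_range = (int(first), int(last or first))
        # An empty field matches anything.
        entries.append((re.compile(component or ".*"),
                        re.compile(file or ".*"),
                        re.compile(function or ".*"),
                        line_range,
                        LEVELS.index(level)))
    return entries

def allowed(entries, component, file, function, line, level,
            fallback="WARNING"):
    """Decide a logging call using the first matching entry."""
    call_level = LEVELS.index(level)
    for comp_re, file_re, func_re, line_range, max_level in entries:
        if not (comp_re.search(component) and file_re.search(file)
                and func_re.search(function)):
            continue
        if line_range and not line_range[0] <= line <= line_range[1]:
            continue
        return call_level <= max_level
    # No entry matched: fall back to the global or default level.
    return call_level <= LEVELS.index(fallback)

defs = parse_logdef("transport.*;;.*send.*;;DEBUG/;;;;WARNING")
print(allowed(defs, "transport", "plugin.c", "do_send", 42, "DEBUG"))
print(allowed(defs, "core", "core.c", "do_send", 42, "DEBUG"))
@end example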
2107Whether a call is allowed or not is only decided the first time this
2108particular call is made. The evaluation result is then cached, so that
2109any attempts to make the same call later will be allowed or disallowed
right away. Because of that, runtime log level evaluation should not
significantly affect the process performance.
Log definition parsing is only done once, at the first call to
@code{GNUNET_log_setup ()} made by the process (which is usually done soon
after it starts).
2115
2116At the moment of writing there is no way to specify logging definitions
2117from configuration files, only via environment variables.
2118
2119At the moment GNUnet will stop processing a log definition when it
2120encounters an error in definition formatting or an error in regular
2121expression syntax, and will not report the failure in any way.
2122
2123
2124@c ***********************************************************************
2125@menu
2126* Examples::
2127* Log files::
2128* Updated behavior of GNUNET_log::
2129@end menu
2130
2131@node Examples
2132@subsubsection Examples
2133
2134@table @asis
2135
2136@item @code{GNUNET_FORCE_LOG=";;;;DEBUG" gnunet-arm -s} Start GNUnet
2137process tree, running all processes with DEBUG level (one should be
careful with it, as log files will grow at an alarming rate!)
2139@item @code{GNUNET_FORCE_LOG="core;;;;DEBUG" gnunet-arm -s} Start GNUnet
2140process tree, running the core service under DEBUG level (everything else
2141will use configured or default level).
2142
2143@item Start GNUnet process tree, allowing any logging calls from
2144gnunet-service-transport_validation.c (everything else will use
2145configured or default level).
2146
2147@example
GNUNET_FORCE_LOG=";gnunet-service-transport_validation.c;;;DEBUG" \
2149gnunet-arm -s
2150@end example
2151
2152@item Start GNUnet process tree, allowing any logging calls from
gnunet-service-fs_push.c (everything else will use configured or
2154default level).
2155
2156@example
2157GNUNET_FORCE_LOG="fs;gnunet-service-fs_push.c;;;DEBUG" gnunet-arm -s
2158@end example
2159
2160@item Start GNUnet process tree, allowing any logging calls from the
2161GNUNET_NETWORK_socket_select function (everything else will use
2162configured or default level).
2163
2164@example
2165GNUNET_FORCE_LOG=";;GNUNET_NETWORK_socket_select;;DEBUG" gnunet-arm -s
2166@end example
2167
@item Start GNUnet process tree, allowing any logging calls from the
components that have "transport" in their names, and are made from
functions that have "send" in their names. Everything else will be allowed
to be logged only if it has WARNING level.
2172
2173@example
2174GNUNET_FORCE_LOG="transport.*;;.*send.*;;DEBUG/;;;;WARNING" gnunet-arm -s
2175@end example
2176
2177@end table
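
Each definition in the examples above consists of five fields separated
by semicolons, and several definitions can be joined with a slash. The
following sketch splits a single definition into its fields. It is only
an illustration: @code{struct LogDef} and @code{parse_log_def} are
hypothetical names, and calling the fourth field @code{line} is an
assumption, since none of the examples above fill it in. A real parser
would also have to compile the regular expressions and split the input
on slashes first.

@example
#include <stddef.h>
#include <string.h>

/* One parsed definition; an empty string means "match anything". */
struct LogDef
@{
  char component[64];
  char file[64];
  char function[64];
  char line[32];
  char level[16];
@};

/* Split "component;file;function;line;LEVEL" into its five fields.
   Returns 0 on success, -1 if the field count is wrong or a field
   does not fit.  strtok() is avoided because it skips empty fields. */
static int
parse_log_def (const char *def, struct LogDef *out)
@{
  char *fields[5] = @{ out->component, out->file, out->function,
                      out->line, out->level @};
  size_t sizes[5] = @{ sizeof (out->component), sizeof (out->file),
                      sizeof (out->function), sizeof (out->line),
                      sizeof (out->level) @};

  for (int i = 0; i < 5; i++)
  @{
    size_t n = strcspn (def, ";");

    if (n >= sizes[i])
      return -1;
    memcpy (fields[i], def, n);
    fields[i][n] = '\0';
    def += n;
    if (i < 4)
    @{
      if (';' != *def)
        return -1;   /* too few fields */
      def++;
    @}
  @}
  return ('\0' == *def) ? 0 : -1;   /* reject trailing fields */
@}
@end example

Unlike GNUnet, which silently stops processing on a malformed
definition, this sketch reports the error to its caller.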
2178
2179
On Windows, one can use batch files to run GNUnet processes with special
environment variables, without affecting the whole system. Such a batch
file will look like this:
2183
2184@example
set GNUNET_FORCE_LOG=;;do_transmit;;DEBUG
gnunet-arm -s
2186@end example
2187
(note the absence of double quotes in the environment variable definition,
as opposed to earlier examples, which use the shell).
Another limitation: on Windows, GNUNET_FORCE_LOGFILE @strong{MUST} be set
in order for GNUNET_FORCE_LOG to work.
2192
2193
2194@cindex Log files
2195@node Log files
2196@subsubsection Log files
2197
2198GNUnet can be told to log everything into a file instead of stderr (which
2199is the default) using the "--log-file=logfile" or "-l logfile" option.
This option can be passed on the command line, or via the "OPTION" and
2201"GLOBAL_POSTFIX" configuration keys (see above). The file name passed
2202with this option is subject to GNUnet filename expansion. If specified in
2203"GLOBAL_POSTFIX", it is also subject to ARM service filename expansion,
2204in particular, it may contain "@{@}" (left and right curly brace)
2205sequence, which will be replaced by ARM with the name of the service.
2206This is used to keep logs from more than one service separate, while only
2207specifying one template containing "@{@}" in GLOBAL_POSTFIX.
2208
2209As part of a secondary file name expansion, the first occurrence of "[]"
2210sequence ("left square brace" followed by "right square brace") in the
file name will be replaced with the process identifier of the process when
2212it initializes its logging subsystem. As a result, all processes will log
2213into different files. This is convenient for isolating messages of a
2214particular process, and prevents I/O races when multiple processes try to
2215write into the file at the same time. This expansion is done
2216independently of "@{@}" expansion that ARM service does (see above).
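
As an illustration of this secondary expansion, the following sketch
replaces the first "[]" in a template with a process identifier. The
function name @code{expand_pid} is hypothetical and the code is not taken
from GNUnet; it only shows the substitution rule described above.

@example
#include <stdio.h>
#include <string.h>

/* Replace the first "[]" in tmpl with the decimal process id.
   Returns 0 on success, -1 if the result does not fit into buf. */
static int
expand_pid (const char *tmpl, long pid, char *buf, size_t size)
@{
  const char *mark = strstr (tmpl, "[]");
  int n;

  if (NULL == mark)
    n = snprintf (buf, size, "%s", tmpl);
  else
    n = snprintf (buf, size, "%.*s%ld%s",
                  (int) (mark - tmpl), tmpl, pid, mark + 2);
  return ((n < 0) || ((size_t) n >= size)) ? -1 : 0;
@}
@end example

A process would typically pass the result of @code{getpid ()} as the
@code{pid} argument; later occurrences of "[]" are left untouched.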
2217
2218The log file name that is specified via "-l" can contain format characters
2219from the 'strftime' function family. For example, "%Y" will be replaced
2220with the current year. Using "basename-%Y-%m-%d.log" would include the
current year, month and day in the log file name. If a GNUnet process runs for
2222long enough to need more than one log file, it will eventually clean up
2223old log files. Currently, only the last three log files (plus the current
2224log file) are preserved. So once the fifth log file goes into use (so
2225after 4 days if you use "%Y-%m-%d" as above), the first log file will be
2226automatically deleted. Note that if your log file name only contains "%Y",
2227then log files would be kept for 4 years and the logs from the first year
2228would be deleted once year 5 begins. If you do not use any date-related
2229string format codes, logs would never be automatically deleted by GNUnet.
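
The date expansion can be reproduced with the standard C function
@code{strftime}. The sketch below is an assumption about how such an
expansion could be written, not GNUnet's code; the helper name
@code{expand_log_name} is hypothetical.

@example
#include <time.h>

/* Expand strftime() format codes in a log file name template for
   the given point in time.  Returns 0 on success, -1 if the result
   is empty or does not fit into buf. */
static int
expand_log_name (const char *tmpl, const struct tm *when,
                 char *buf, size_t size)
@{
  return (0 == strftime (buf, size, tmpl, when)) ? -1 : 0;
@}
@end example

Because the expanded name only changes when one of the date fields used
in the template changes, a template containing just "%Y" yields one log
file per year, which is why rotation then takes years rather than days.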
2230
2231
2232@c ***********************************************************************
2233
2234@node Updated behavior of GNUNET_log
2235@subsubsection Updated behavior of GNUNET_log
2236
2237It's currently quite common to see constructions like this all over the
2238code:
2239
2240@example
2241#if MESH_DEBUG
2242GNUNET_log (GNUNET_ERROR_TYPE_DEBUG, "MESH: client disconnected\n");
2243#endif
2244@end example
2245
2246The reason for the #if is not to avoid displaying the message when
2247disabled (GNUNET_ERROR_TYPE takes care of that), but to avoid the
2248compiler including it in the binary at all, when compiling GNUnet for
2249platforms with restricted storage space / memory (MIPS routers,
2250ARM plug computers / dev boards, etc).
2251
This presents several problems: the code gets ugly and hard to write, and
it is very easy to forget to include the #if guards, creating inconsistent
code. A new change in GNUNET_log aims to solve these problems.
2255
@strong{This change requires running @file{./configure} with at least
@code{--enable-logging=verbose} to see debug messages.}
2258
2259Here is an example of code with dense debug statements:
2260
2261@example
switch (restrict_topology) @{
case GNUNET_TESTING_TOPOLOGY_CLIQUE:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but clique topology\n"));
#endif
  unblacklisted_connections =
    create_clique (pg, &remove_connections, BLACKLIST, GNUNET_NO);
  break;
case GNUNET_TESTING_TOPOLOGY_SMALL_WORLD_RING:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but small world (ring) topology\n"));
#endif
  unblacklisted_connections =
    create_small_world_ring (pg, &remove_connections, BLACKLIST);
  break;
@}
2271@end example
2272
2273
2274Pretty hard to follow, huh?
2275
2276From now on, it is not necessary to include the #if / #endif statements to
2277achieve the same behavior. The GNUNET_log and GNUNET_log_from macros take
2278care of it for you, depending on the configure option:
2279
2280@itemize @bullet
2281@item If @code{--enable-logging} is set to @code{no}, the binary will
2282contain no log messages at all.
2283@item If @code{--enable-logging} is set to @code{yes}, the binary will
2284contain no DEBUG messages, and therefore running with -L DEBUG will have
2285no effect. Other messages (ERROR, WARNING, INFO, etc) will be included.
@item If @code{--enable-logging} is set to @code{verbose} or
@code{veryverbose}, the binary will contain DEBUG messages (still, it will
be necessary to run with -L DEBUG or set the DEBUG config option to show
them).
2290@end itemize
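
How a macro can drop messages at compile time is sketched below. This is
a simplified illustration, not the actual definition of
@code{GNUNET_log}: the names @code{EXTRA_LOGGING}, @code{LOG} and
@code{log_emit} are hypothetical, and the three values of
@code{EXTRA_LOGGING} are assumed to mirror the three cases listed above.

@example
#include <stdio.h>

enum LogKind @{ LOG_ERROR = 1, LOG_WARNING = 2, LOG_INFO = 4,
               LOG_DEBUG = 8 @};

/* Hypothetical build-time switch, e.g. set by the configure script:
   0 = no logging, 1 = everything but DEBUG, 2 = everything. */
#ifndef EXTRA_LOGGING
#define EXTRA_LOGGING 1
#endif

static int emitted;   /* counts messages that were actually logged */

static void
log_emit (enum LogKind kind, const char *msg)
@{
  (void) kind;
  emitted++;
  fputs (msg, stderr);
@}

/* For a constant 'kind' the condition is a compile-time constant, so
   the compiler can drop the call together with the message string. */
#define LOG(kind, msg)                                        \
  do @{                                                        \
    if ((EXTRA_LOGGING > 1) ||                                \
        ((EXTRA_LOGGING > 0) && (LOG_DEBUG != (kind))))       \
      log_emit ((kind), (msg));                               \
  @} while (0)
@end example

With the default value of 1, a call such as
@code{LOG (LOG_DEBUG, "...")} produces no output and needs no
surrounding @code{#if}, while warnings are still logged.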
2291
2292
2293If you are a developer:
2294@itemize @bullet
2295@item please make sure that you @code{./configure
2296--enable-logging=@{verbose,veryverbose@}}, so you can see DEBUG messages.
@item please remove the @code{#if} statements around @code{GNUNET_log
(GNUNET_ERROR_TYPE_DEBUG, ...)} lines, to improve the readability of your
code.
2300@end itemize
2301
Since activating DEBUG now automatically makes it VERBOSE and activates
@strong{all} debug messages by default, you probably want to use the
https://gnunet.org/logging functionality to filter only relevant messages.
2305A suitable configuration could be:
2306
2307@example
2308$ export GNUNET_FORCE_LOG="^YOUR_SUBSYSTEM$;;;;DEBUG/;;;;WARNING"
2309@end example
2310
This will behave almost like enabling DEBUG in that subsystem before the
change. Of course you can adapt it to your particular needs; this is only
a quick example.
2314
2315@cindex Interprocess communication API
@cindex IPC
2317@node Interprocess communication API (IPC)
2318@subsection Interprocess communication API (IPC)
2319
In GNUnet, a variety of new message types can be defined and used in
interprocess communication. In this tutorial we use the
@code{struct AddressLookupMessage} as an example to introduce how to
construct our own message type in GNUnet and how to implement the message
communication between service and client.
(Here, a client uses the @code{struct AddressLookupMessage} as a request
to ask the server to return the address of any other peer connecting to
the service.)
2328
2329
2330@c ***********************************************************************
2331@menu
2332* Define new message types::
2333* Define message struct::
2334* Client - Establish connection::
2335* Client - Initialize request message::
2336* Client - Send request and receive response::
2337* Server - Startup service::
2338* Server - Add new handles for specified messages::
2339* Server - Process request message::
2340* Server - Response to client::
2341* Server - Notification of clients::
2342* Conversion between Network Byte Order (Big Endian) and Host Byte Order::
2343@end menu
2344
2345@node Define new message types
2346@subsubsection Define new message types
2347
2348First of all, you should define the new message type in
2349@file{gnunet_protocols.h}:
2350
2351@example
 // Request to look up addresses of peers in the server.
2353#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP 29
2354 // Response to the address lookup request.
2355#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY 30
2356@end example
2357
2358@c ***********************************************************************
2359@node Define message struct
2360@subsubsection Define message struct
2361
After the type definition, the specified message structure should also be
described in the header file, e.g. @file{transport.h} in our case.
2364
2365@example
GNUNET_NETWORK_STRUCT_BEGIN
struct AddressLookupMessage @{
  struct GNUNET_MessageHeader header;
  int32_t numeric_only GNUNET_PACKED;
  struct GNUNET_TIME_AbsoluteNBO timeout;
  uint32_t addrlen GNUNET_PACKED;
  /* followed by 'addrlen' bytes of the actual address, then
     followed by the 0-terminated name of the transport */
@};
GNUNET_NETWORK_STRUCT_END
2374@end example
2375
2376
2377Please note @code{GNUNET_NETWORK_STRUCT_BEGIN} and @code{GNUNET_PACKED}
2378which both ensure correct alignment when sending structs over the network.
2379
2382
2383@c ***********************************************************************
2384@node Client - Establish connection
2385@subsubsection Client - Establish connection
2386@c %**end of header
2387
2388
First, on the client side, the underlying API is used to create a new
connection to a service; in our example, we connect to the transport
service.
2392
2393@example
2394struct GNUNET_CLIENT_Connection *client;
2395client = GNUNET_CLIENT_connect ("transport", cfg);
2396@end example
2397
2398@c ***********************************************************************
2399@node Client - Initialize request message
2400@subsubsection Client - Initialize request message
2401@c %**end of header
2402
When the connection is ready, we initialize the message. In this step,
all the fields of the message should be properly initialized, namely the
size, type, and some extra user-defined data, such as the timeout, the
address and the name of the transport.
2407
2408@example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
  + addressLen
  + strlen (nameTrans)
  + 1;
/* allocate (zero-initialized) memory for header, address and name */
msg = GNUNET_malloc (len);
msg->header.size = htons (len);
msg->header.type = htons
  (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP);
msg->timeout = GNUNET_TIME_absolute_hton (abs_timeout);
msg->addrlen = htonl (addressLen);
/* the variable-size payload starts right after the struct */
char *addrbuf = (char *) &msg[1];
memcpy (addrbuf, address, addressLen);
char *tbuf = &addrbuf[addressLen];
memcpy (tbuf, nameTrans, strlen (nameTrans) + 1);
2423@end example
2424
Note that the functions @code{htonl}, @code{htons} and
@code{GNUNET_TIME_absolute_hton} are applied here to convert from host
byte order (often little endian) to network byte order (big endian). For
the usage of the two byte orders and the corresponding conversion
functions, please refer to the section on conversion between Network Byte
Order (Big Endian) and Host Byte Order.
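
The same construction steps can be tried without the GNUnet headers. The
following sketch uses hypothetical stand-in types (@code{struct Header},
@code{struct LookupMessage}) and a hypothetical helper
@code{make_lookup}; the packing attribute is GCC/Clang syntax and the
timeout field is left out for brevity. It is meant as an illustration of
the layout, not as the transport service's actual code.

@example
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the GNUnet types used in this section. */
struct Header
@{
  uint16_t size;   /* total message size, network byte order */
  uint16_t type;   /* message type, network byte order */
@} __attribute__ ((packed));

struct LookupMessage
@{
  struct Header header;
  uint32_t addrlen;   /* network byte order */
  /* followed by 'addrlen' address bytes and a 0-terminated name */
@} __attribute__ ((packed));

/* Allocate and fill a request.  Returns NULL if the message would
   not fit into the 16-bit size field or if allocation fails. */
static struct LookupMessage *
make_lookup (uint16_t type, const void *addr, uint32_t addrlen,
             const char *name)
@{
  size_t slen = strlen (name) + 1;
  size_t len;
  struct LookupMessage *msg;
  char *payload;

  if ((addrlen > UINT16_MAX) || (slen > UINT16_MAX))
    return NULL;
  len = sizeof (struct LookupMessage) + addrlen + slen;
  if (len > UINT16_MAX)
    return NULL;
  msg = calloc (1, len);
  if (NULL == msg)
    return NULL;
  msg->header.size = htons ((uint16_t) len);
  msg->header.type = htons (type);
  msg->addrlen = htonl (addrlen);
  payload = (char *) &msg[1];   /* first byte after the struct */
  memcpy (payload, addr, addrlen);
  memcpy (payload + addrlen, name, slen);
  return msg;
@}
@end example

The explicit length checks matter because the size field of a message
header is only 16 bits wide; the caller releases the message with
@code{free}.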
2430
2431@c ***********************************************************************
2432@node Client - Send request and receive response
2433@subsubsection Client - Send request and receive response
2434@c %**end of header
2435
2436@b{FIXME: This is very outdated, see the tutorial for the current API!}
2437
2438Next, the client would send the constructed message as a request to the
2439service and wait for the response from the service. To accomplish this
2440goal, there are a number of API calls that can be used. In this example,
2441@code{GNUNET_CLIENT_transmit_and_get_response} is chosen as the most
2442appropriate function to use.
2443
2444@example
GNUNET_CLIENT_transmit_and_get_response
(client, &msg->header, timeout, GNUNET_YES, &address_response_processor,
arp_ctx);
2448@end example
2449
The argument @code{address_response_processor} is a function with
2451@code{GNUNET_CLIENT_MessageHandler} type, which is used to process the
2452reply message from the service.
2453
2454@node Server - Startup service
2455@subsubsection Server - Startup service
2456
On the server side, we run a standard GNUnet service startup sequence
using @code{GNUNET_SERVICE_run}, as follows:
2459
2460@example
int main (int argc, char **argv) @{
  return (GNUNET_OK ==
          GNUNET_SERVICE_run (argc, argv, "transport",
                              GNUNET_SERVICE_OPTION_NONE, &run, NULL))
         ? 0 : 1;
@}
2464@end example
2465
2466@c ***********************************************************************
2467@node Server - Add new handles for specified messages
2468@subsubsection Server - Add new handles for specified messages
2469@c %**end of header
2470
In the function above, the argument @code{run} is used to initialize the
transport service, and is defined like this:
2473
2474@example
static void run (void *cls,
                 struct GNUNET_SERVER_Handle *serv,
                 const struct GNUNET_CONFIGURATION_Handle *cfg) @{
  GNUNET_SERVER_add_handlers (serv, handlers);
@}
2479@end example
2480
2481
2482Here, @code{GNUNET_SERVER_add_handlers} must be called in the run
2483function to add new handlers in the service. The parameter
2484@code{handlers} is a list of @code{struct GNUNET_SERVER_MessageHandler}
2485to tell the service which function should be called when a particular
2486type of message is received, and should be defined in this way:
2487
2488@example
2489static struct GNUNET_SERVER_MessageHandler handlers[] = @{
2490 @{&handle_start,
2491 NULL,
2492 GNUNET_MESSAGE_TYPE_TRANSPORT_START,
2493 0@},
2494 @{&handle_send,
2495 NULL,
2496 GNUNET_MESSAGE_TYPE_TRANSPORT_SEND,
2497 0@},
2498 @{&handle_try_connect,
2499 NULL,
2500 GNUNET_MESSAGE_TYPE_TRANSPORT_TRY_CONNECT,
2501 sizeof (struct TryConnectMessage)
2502 @},
2503 @{&handle_address_lookup,
2504 NULL,
2505 GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP,
2506 0@},
2507 @{NULL,
2508 NULL,
2509 0,
2510 0@}
2511@};
2512@end example
2513
2514
As shown, the first member of each entry is a callback function, which is
called to process messages of the type given as the third member. The
second member is the closure for the callback function, which is set to
@code{NULL} in most cases. The last member is the expected size of
messages of this type; usually we set it to 0 to accept variable sizes,
but the exact size can also be given where it is fixed. Finally, the
array is terminated by the entry @code{@{NULL, NULL, 0, 0@}}.
2523
2524@c ***********************************************************************
2525@node Server - Process request message
2526@subsubsection Server - Process request message
2527@c %**end of header
2528
After the initialization of the transport service, incoming request
messages can be processed. Before handling the main message data, the
validity of the message should be checked, e.g., whether the size of the
message is correct.
2533
2534@example
size = ntohs (message->size);
if (size < sizeof (struct AddressLookupMessage)) @{
  GNUNET_break_op (0);
  GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
  return;
@}
2540@end example
2541
2542
Note that, in contrast to the construction of the request message in the
client, the server uses the functions @code{ntohl} and @code{ntohs} when
extracting data from the message, so that values in network byte order
(big endian) are converted back into host byte order. For more detail,
please refer to the section on conversion between Network Byte Order
(Big Endian) and Host Byte Order.
2549
2550Moreover in this example, the name of the transport stored in the message
2551is a 0-terminated string, so we should also check whether the name of the
2552transport in the received message is 0-terminated:
2553
2554@example
nameTransport = (const char *) &address[addressLen];
if (nameTransport[size - sizeof (struct AddressLookupMessage)
                  - addressLen - 1] != '\0') @{
  GNUNET_break_op (0);
  GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
  return;
@}
2563@end example
2564
Here, @code{GNUNET_SERVER_receive_done} should be called to tell the
service that processing of the request is done and that it can receive
the next message. The argument @code{GNUNET_SYSERR} here indicates that
the service didn't understand the request message, and the processing of
this request is terminated.

In contrast, when the argument is @code{GNUNET_OK}, the request was
handled successfully and the service continues to receive messages from
this client.
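
Both checks can be combined into one self-contained validation routine.
The sketch below again uses hypothetical stand-in types and a
hypothetical function name (@code{check_lookup}); it assumes that the
caller has already made sure that the buffer really contains as many
bytes as the size field claims. In addition to the two checks shown
above, it verifies that the address length fits into the message.

@example
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the GNUnet types used in this section. */
struct Header
@{
  uint16_t size;
  uint16_t type;
@} __attribute__ ((packed));

struct LookupMessage
@{
  struct Header header;
  uint32_t addrlen;
  /* followed by 'addrlen' address bytes and a 0-terminated name */
@} __attribute__ ((packed));

/* Returns a pointer to the transport name inside msg if the message
   is well-formed, NULL otherwise. */
static const char *
check_lookup (const struct LookupMessage *msg)
@{
  uint16_t size = ntohs (msg->header.size);
  uint32_t addrlen;
  const char *address;
  size_t rest;

  if (size < sizeof (struct LookupMessage))
    return NULL;                 /* shorter than the fixed part */
  addrlen = ntohl (msg->addrlen);
  rest = size - sizeof (struct LookupMessage);
  if (addrlen >= rest)
    return NULL;                 /* no room for the terminating 0 */
  address = (const char *) &msg[1];
  if ('\0' != address[rest - 1])
    return NULL;                 /* name is not 0-terminated */
  return &address[addrlen];
@}
@end example

In a real handler, a @code{NULL} result would lead to the
@code{GNUNET_break_op} and @code{GNUNET_SERVER_receive_done} calls shown
above.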
2574
2575@c ***********************************************************************
2576@node Server - Response to client
2577@subsubsection Server - Response to client
2578@c %**end of header
2579
2580Once the processing of current request is done, the server should give the
2581response to the client. A new @code{struct AddressLookupMessage} would be
2582produced by the server in a similar way as the client did and sent to the
2583client, but here the type should be
2584@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY} rather than
@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP} used by the client.

2586@example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
  + addressLen
  + strlen (nameTrans) + 1;
msg = GNUNET_malloc (len);
msg->header.size = htons (len);
msg->header.type = htons
  (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
2594
2595// ...
2596
2597struct GNUNET_SERVER_TransmitContext *tc;
2598tc = GNUNET_SERVER_transmit_context_create (client);
2599GNUNET_SERVER_transmit_context_append_data
2600(tc,
2601 NULL,
2602 0,
2603 GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
2604GNUNET_SERVER_transmit_context_run (tc, rtimeout);
2605@end example
2606
2607
Note that there are also a number of other APIs provided to the service
to send the message.
2610
2611@c ***********************************************************************
2612@node Server - Notification of clients
2613@subsubsection Server - Notification of clients
2614@c %**end of header
2615
2616Often a service needs to (repeatedly) transmit notifications to a client
or a group of clients. In these cases, the client has typically registered
once for a set of events and then needs to receive a message
2619whenever such an event happens (until the client disconnects). The use of
2620a notification context can help manage message queues to clients and
2621handle disconnects. Notification contexts can be used to send
2622individualized messages to a particular client or to broadcast messages
2623to a group of clients. An individualized notification might look like
2624this:
2625
2626@example
2627GNUNET_SERVER_notification_context_unicast(nc,
2628 client,
2629 msg,
2630 GNUNET_YES);
2631@end example
2632
2633
2634Note that after processing the original registration message for
2635notifications, the server code still typically needs to call
2636@code{GNUNET_SERVER_receive_done} so that the client can transmit further
2637messages to the server.
2638
2639@c ***********************************************************************
2640@node Conversion between Network Byte Order (Big Endian) and Host Byte Order
2641@subsubsection Conversion between Network Byte Order (Big Endian) and Host Byte Order
2642@c %** subsub? it's a referenced page on the ipc document.
2643@c %**end of header
2644
Network Byte Order is big endian. Host Byte Order is whatever order the
local CPU uses, which on many common platforms is little endian. What is
the difference between the two?

Consider an integer that occupies 4 bytes in RAM. On a little-endian
host, the least significant byte is stored at the lowest address and the
most significant byte at the highest address. Network Byte Order does the
opposite: the most significant byte is stored at the lowest address (and
is transmitted first), and the least significant byte at the highest
address.

Peers exchange information by sending and receiving data packets over the
network. Since two communicating hosts may use different byte orders,
multi-byte values must be converted to Network Byte Order before sending
and back to Host Byte Order after receiving, so that both sides interpret
the data identically.
2663
There are ten convenient functions for converting the byte order in
GNUnet, as follows:
2666
2667@table @asis
2668
@item @code{uint16_t htons (uint16_t hostshort)}
Convert a short int from host byte order to network byte order.
@item @code{uint32_t htonl (uint32_t hostlong)}
Convert a long int from host byte order to network byte order.
@item @code{uint16_t ntohs (uint16_t netshort)}
Convert a short int from network byte order to host byte order.
@item @code{uint32_t ntohl (uint32_t netlong)}
Convert a long int from network byte order to host byte order.
@item @code{unsigned long long GNUNET_ntohll (unsigned long long netlonglong)}
Convert a long long int from network byte order to host byte order.
@item @code{unsigned long long GNUNET_htonll (unsigned long long hostlonglong)}
Convert a long long int from host byte order to network byte order.
@item @code{struct GNUNET_TIME_RelativeNBO GNUNET_TIME_relative_hton (struct GNUNET_TIME_Relative a)}
Convert relative time to network byte order.
@item @code{struct GNUNET_TIME_Relative GNUNET_TIME_relative_ntoh (struct GNUNET_TIME_RelativeNBO a)}
Convert relative time from network byte order.
@item @code{struct GNUNET_TIME_AbsoluteNBO GNUNET_TIME_absolute_hton (struct GNUNET_TIME_Absolute a)}
Convert absolute time to network byte order.
@item @code{struct GNUNET_TIME_Absolute GNUNET_TIME_absolute_ntoh (struct GNUNET_TIME_AbsoluteNBO a)}
Convert absolute time from network byte order.
2694@end table
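
The following sketch shows the effect of the standard conversion
functions on the bytes in memory. The helper names @code{put_u32} and
@code{get_u32} are hypothetical; only @code{htonl} and @code{ntohl} from
the table above are used.

@example
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value into buf in network byte order. */
static void
put_u32 (unsigned char *buf, uint32_t host_value)
@{
  uint32_t nbo = htonl (host_value);

  memcpy (buf, &nbo, sizeof (nbo));
@}

/* Read a 32-bit value in network byte order from buf and return it
   in host byte order. */
static uint32_t
get_u32 (const unsigned char *buf)
@{
  uint32_t nbo;

  memcpy (&nbo, buf, sizeof (nbo));
  return ntohl (nbo);
@}
@end example

Regardless of the host's own byte order, @code{put_u32} places the most
significant byte first, and @code{get_u32} recovers the original value.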
2695
2696@cindex Cryptography API
2697@node Cryptography API
2698@subsection Cryptography API
2699@c %**end of header
2700
The gnunetutil API provides the cryptographic primitives used in GNUnet.
2702GNUnet uses 2048 bit RSA keys for the session key exchange and for signing
2703messages by peers and most other public-key operations. Most researchers
2704in cryptography consider 2048 bit RSA keys as secure and practically
2705unbreakable for a long time. The API provides functions to create a fresh
2706key pair, read a private key from a file (or create a new file if the
file does not exist), encrypt, decrypt, sign, verify and extract
2708the public key into a format suitable for network transmission.
2709
2710For the encryption of files and the actual data exchanged between peers
GNUnet uses 256-bit AES encryption. Fresh session keys are negotiated
for every new connection. Again, there is no published technique to
2713break this cipher in any realistic amount of time. The API provides
2714functions for generation of keys, validation of keys (important for
2715checking that decryptions using RSA succeeded), encryption and decryption.
2716
2717GNUnet uses SHA-512 for computing one-way hash codes. The API provides
2718functions to compute a hash over a block in memory or over a file on disk.
2719
2720The crypto API also provides functions for randomizing a block of memory,
obtaining a single random number and for generating a permutation of the
2722numbers 0 to n-1. Random number generation distinguishes between WEAK and
2723STRONG random number quality; WEAK random numbers are pseudo-random
2724whereas STRONG random numbers use entropy gathered from the operating
2725system.
2726
2727Finally, the crypto API provides a means to deterministically generate a
27281024-bit RSA key from a hash code. These functions should most likely not
2729be used by most applications; most importantly,
2730GNUNET_CRYPTO_rsa_key_create_from_hash does not create an RSA-key that
2731should be considered secure for traditional applications of RSA.
2732
2733@cindex Message Queue API
2734@node Message Queue API
2735@subsection Message Queue API
2736@c %**end of header
2737
2738@strong{ Introduction }@
2739Often, applications need to queue messages that
2740are to be sent to other GNUnet peers, clients or services. As all of
2741GNUnet's message-based communication APIs, by design, do not allow
2742messages to be queued, it is common to implement custom message queues
2743manually when they are needed. However, writing very similar code in
2744multiple places is tedious and leads to code duplication.
2745
2746MQ (for Message Queue) is an API that provides the functionality to
2747implement and use message queues. We intend to eventually replace all of
2748the custom message queue implementations in GNUnet with MQ.
2749
2750@strong{ Basic Concepts }@
2751The two most important entities in MQ are queues and envelopes.
2752
2753Every queue is backed by a specific implementation (e.g. for mesh, stream,
2754connection, server client, etc.) that will actually deliver the queued
messages. For convenience, some queues also allow specifying a list of
2756message handlers. The message queue will then also wait for incoming
2757messages and dispatch them appropriately.
2758
An envelope holds the memory for a message, as well as metadata
2760(Where is the envelope queued? What should happen after it has been
2761sent?). Any envelope can only be queued in one message queue.
2762
2763@strong{ Creating Queues }@
2764The following is a list of currently available message queues. Note that
2765to avoid layering issues, message queues for higher level APIs are not
2766part of @code{libgnunetutil}, but@ the respective API itself provides the
2767queue implementation.
2768
2769@table @asis
2770
2771@item @code{GNUNET_MQ_queue_for_connection_client}
2772Transmits queued messages over a @code{GNUNET_CLIENT_Connection} handle.
2773Also supports receiving with message handlers.
2774
2775@item @code{GNUNET_MQ_queue_for_server_client}
2776Transmits queued messages over a @code{GNUNET_SERVER_Client} handle. Does
2777not support incoming message handlers.
2778
2779@item @code{GNUNET_MESH_mq_create} Transmits queued messages over a
2780@code{GNUNET_MESH_Tunnel} handle. Does not support incoming message
2781handlers.
2782
2783@item @code{GNUNET_MQ_queue_for_callbacks} This is the most general
2784implementation. Instead of delivering and receiving messages with one of
2785GNUnet's communication APIs, implementation callbacks are called. Refer to
2786"Implementing Queues" for a more detailed explanation.
2787@end table
2788
2789
2790@strong{ Allocating Envelopes }@
2791A GNUnet message (as defined by the GNUNET_MessageHeader) has three
2792parts: The size, the type, and the body.
2793
2794MQ provides macros to allocate an envelope containing a message
2795conveniently, automatically setting the size and type fields of the
2796message.
2797
2798Consider the following simple message, with the body consisting of a
2799single number value.
2802
2803@example
2804struct NumberMessage @{
2805 /** Type: GNUNET_MESSAGE_TYPE_EXAMPLE_1 */
2806 struct GNUNET_MessageHeader header;
2807 uint32_t number GNUNET_PACKED;
2808@};
2809@end example
2810
2811An envelope containing an instance of the NumberMessage can be
2812constructed like this:
2813
2814@example
2815struct GNUNET_MQ_Envelope *ev;
2816struct NumberMessage *msg;
2817ev = GNUNET_MQ_msg (msg, GNUNET_MESSAGE_TYPE_EXAMPLE_1);
2818msg->number = htonl (42);
2819@end example
2820
2821In the above code, @code{GNUNET_MQ_msg} is a macro. The return value is
2822the newly allocated envelope. The first argument must be a pointer to some
2823@code{struct} containing a @code{struct GNUNET_MessageHeader header}
2824field, while the second argument is the desired message type, in host
2825byte order.
2826
2827The @code{msg} pointer now points to an allocated message, where the
2828message type and the message size are already set. The message's size is
2829inferred from the type of the @code{msg} pointer: It will be set to
2830'sizeof(*msg)', properly converted to network byte order.
2831
If the message body's size is dynamic, the macro
2833@code{GNUNET_MQ_msg_extra} can be used to allocate an envelope whose
2834message has additional space allocated after the @code{msg} structure.
2835
2836If no structure has been defined for the message,
2837@code{GNUNET_MQ_msg_header_extra} can be used to allocate additional space
2838after the message header. The first argument then must be a pointer to a
2839@code{GNUNET_MessageHeader}.
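
To make the idea behind these macros more concrete, here is a simplified
sketch of an envelope allocator. It is not the MQ implementation: all
names (@code{struct Envelope}, @code{envelope_new}, @code{ENVELOPE_MSG})
and the envelope layout are hypothetical, and unlike
@code{GNUNET_MQ_msg} the macro takes the envelope variable as an extra
argument. What it shares with MQ is that the message size is derived
from the type of the message pointer and stored in network byte order.

@example
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>

struct Header
@{
  uint16_t size;   /* network byte order */
  uint16_t type;   /* network byte order */
@} __attribute__ ((packed));

/* Hypothetical envelope: bookkeeping, followed by the message. */
struct Envelope
@{
  struct Envelope *next;   /* link for the queue's list */
  struct Header *mh;       /* points just behind this struct */
@};

/* Allocate envelope and message in one block and fill in the header. */
static struct Envelope *
envelope_new (size_t size, uint16_t type)
@{
  struct Envelope *ev;

  if ((size < sizeof (struct Header)) || (size > UINT16_MAX))
    return NULL;
  ev = calloc (1, sizeof (struct Envelope) + size);
  if (NULL == ev)
    return NULL;
  ev->mh = (struct Header *) &ev[1];
  ev->mh->size = htons ((uint16_t) size);
  ev->mh->type = htons (type);
  return ev;
@}

/* Sets 'ev' and the typed message pointer 'mvar'; the message size
   is taken from the type that 'mvar' points to. */
#define ENVELOPE_MSG(ev, mvar, type)                          \
  do @{                                                        \
    (ev) = envelope_new (sizeof (*(mvar)), (type));           \
    (mvar) = (NULL == (ev)) ? NULL : (void *) (ev)->mh;       \
  @} while (0)

struct NumberMessage
@{
  struct Header header;
  uint32_t number;
@} __attribute__ ((packed));
@end example

After @code{ENVELOPE_MSG (ev, msg, 1)} with a
@code{struct NumberMessage *msg}, only the body remains to be filled in,
for example with @code{msg->number = htonl (42)}.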
2840
2841@strong{Envelope Properties}@
A few functions in MQ allow setting additional properties on envelopes:
2843
2844@table @asis
2845
@item @code{GNUNET_MQ_notify_sent}
Allows specifying a function that will be called once the envelope's
message has been sent irrevocably. An envelope can be canceled precisely
up to the point where the notify sent callback has been called.

@item @code{GNUNET_MQ_disable_corking}
No corking will be used when sending the message. Not every queue
supports this flag. By default, envelopes are sent with corking.
2854
2855@end table
2856
2857
2858@strong{Sending Envelopes}@
2859Once an envelope has been constructed, it can be queued for sending with
2860@code{GNUNET_MQ_send}.
2861
2862Note that in order to avoid memory leaks, an envelope must either be sent
2863(the queue will free it) or destroyed explicitly with
2864@code{GNUNET_MQ_discard}.
2865
2866@strong{Canceling Envelopes}@
2867An envelope queued with @code{GNUNET_MQ_send} can be canceled with
2868@code{GNUNET_MQ_cancel}. Note that after the notify sent callback has
2869been called, canceling a message results in undefined behavior.
2870Thus it is unsafe to cancel an envelope that does not have a notify sent
2871callback. When canceling an envelope, it is not necessary@ to call
2872@code{GNUNET_MQ_discard}, and the envelope can't be sent again.
2873
2874@strong{ Implementing Queues }@
2875@code{TODO}
2876
2877@cindex Service API
2878@node Service API
2879@subsection Service API
2880@c %**end of header
2881
2882Most GNUnet code lives in the form of services. Services are processes
2883that offer an API for other components of the system to build on. Those
2884other components can be command-line tools for users, graphical user
2885interfaces or other services. Services provide their API using an IPC
protocol. For this, each service must listen on either a TCP port or a
UNIX domain socket; the service implementation uses the server API to do
so. This use of server is exposed directly to the users of the service
API. Thus, when using the service API, one is usually also using
large parts of the server API. The service API provides various
2891convenience functions, such as parsing command-line arguments and the
2892configuration file, which are not found in the server API.
2893The dual to the service/server API is the client API, which can be used to
2894access services.
2895
2896The most common way to start a service is to use the
2897@code{GNUNET_SERVICE_run} function from the program's main function.
2898@code{GNUNET_SERVICE_run} will then parse the command line and
2899configuration files and, based on the options found there,
2900start the server. It will then give back control to the main
2901program, passing the server and the configuration to the
2902@code{GNUNET_SERVICE_Main} callback. @code{GNUNET_SERVICE_run}
2903will also take care of starting the scheduler loop.
2904If this is inappropriate (for example, because the scheduler loop
2905is already running), @code{GNUNET_SERVICE_start} and
2906related functions provide an alternative to @code{GNUNET_SERVICE_run}.
2907
When starting a service, the @code{service_name} option is used to determine
2909which sections in the configuration file should be used to configure the
2910service. A typical value here is the name of the @file{src/}
2911sub-directory, for example "@file{statistics}".
2912The same string would also be given to
2913@code{GNUNET_CLIENT_connect} to access the service.
2914
2915Once a service has been initialized, the program should use the
2916@code{GNUNET_SERVICE_Main} callback to register message handlers
2917using @code{GNUNET_SERVER_add_handlers}.
2918The service will already have registered a handler for the
2919"TEST" message.
2920
2921@fnindex GNUNET_SERVICE_Options
2922The option bitfield (@code{enum GNUNET_SERVICE_Options})
2923determines how a service should behave during shutdown.
2924There are three key strategies:
2925
2926@table @asis
2927
2928@item instant (@code{GNUNET_SERVICE_OPTION_NONE})
2929Upon receiving the shutdown
2930signal from the scheduler, the service immediately terminates the server,
2931closing all existing connections with clients.
2932@item manual (@code{GNUNET_SERVICE_OPTION_MANUAL_SHUTDOWN})
The service does nothing by itself
during shutdown. The main program will need to take the appropriate
action by calling @code{GNUNET_SERVER_destroy} or
@code{GNUNET_SERVICE_stop} (depending on how the service was initialized)
to terminate the service. This method is used by
@code{gnunet-service-arm} and is rather uncommon.
2938@item soft (@code{GNUNET_SERVICE_OPTION_SOFT_SHUTDOWN})
2939Upon receiving the shutdown signal from the scheduler,
2940the service immediately tells the server to stop
2941listening for incoming clients. Requests from normal existing clients are
2942still processed and the server/service terminates once all normal clients
have disconnected. Clients that are not expected to ever disconnect (such
as clients that monitor performance values) can be marked as 'monitor'
clients using @code{GNUNET_SERVER_client_mark_monitor}. Those clients will
continue to be processed until all 'normal' clients have disconnected.
Then, the server will terminate, closing the monitor connections.
This mode is used, for example, by 'statistics', allowing existing
'normal' clients to set (possibly persistent) statistic values before
terminating.
2950
2951@end table
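
The three strategies can be summarized as a small state model. The
following is only a sketch of the behavior described in the table above:
all type and function names are hypothetical, and none of this is the
actual server implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names: this models the behavior described above and is not
   GNUnet's enum GNUNET_SERVICE_Options or its server implementation. */
enum shutdown_mode { MODE_INSTANT, MODE_MANUAL, MODE_SOFT };

struct server_state {
  unsigned int normal_clients;   /* clients expected to disconnect eventually */
  unsigned int monitor_clients;  /* long-lived clients marked as 'monitor' */
  bool listening;                /* still accepting new clients? */
  bool draining;                 /* soft shutdown in progress? */
  bool running;                  /* false once the server has terminated */
};

/* Close all connections and stop the server. */
static void
stop_server (struct server_state *s)
{
  s->listening = false;
  s->normal_clients = 0;
  s->monitor_clients = 0;
  s->running = false;
}

/* React to the scheduler's shutdown signal according to the strategy. */
static void
on_shutdown_signal (struct server_state *s, enum shutdown_mode mode)
{
  assert (NULL != s);
  switch (mode)
  {
  case MODE_INSTANT:
    stop_server (s);
    break;
  case MODE_MANUAL:
    break;  /* nothing happens; the main program must stop the server */
  case MODE_SOFT:
    s->listening = false;  /* no new clients */
    s->draining = true;    /* existing clients are still served */
    if (0 == s->normal_clients)
      stop_server (s);
    break;
  }
}

/* A normal client went away; in soft mode the last one ends the server. */
static void
on_normal_client_disconnect (struct server_state *s)
{
  assert (NULL != s);
  if (s->normal_clients > 0)
    s->normal_clients--;
  if (s->draining && (0 == s->normal_clients))
    stop_server (s);  /* monitor connections are closed here as well */
}
```

In this model, monitor clients never keep the server alive on their own:
only the count of normal clients decides when a soft shutdown completes.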
2952
2953@c ***********************************************************************
2954@node Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
2955@subsection Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
2956@c %**end of header
2957
2958A commonly used data structure in GNUnet is a (multi-)hash map. It is most
2959often used to map a peer identity to some data structure, but also to map
2960arbitrary keys to values (for example to track requests in the distributed
hash table or in file-sharing). Because it is so commonly used, the hash
map is sometimes responsible for a large share of GNUnet's overall
memory consumption (for some processes, 30% is not uncommon). The
following text documents some API quirks (and their implications for
applications) that were recently introduced to minimize the footprint of
the hash map.
2967
2968
2969@c ***********************************************************************
2970@menu
2971* Analysis::
2972* Solution::
2973* Migration::
2974* Conclusion::
2975* Availability::
2976@end menu
2977
2978@node Analysis
2979@subsubsection Analysis
2980@c %**end of header
2981
2982The main reason for the "excessive" memory consumption by the hash map is
2983that GNUnet uses 512-bit cryptographic hash codes --- and the
2984(multi-)hash map also uses the same 512-bit 'struct GNUNET_HashCode'. As
2985a result, storing just the keys requires 64 bytes of memory for each key.
2986As some applications like to keep a large number of entries in the hash
2987map (after all, that's what maps are good for), 64 bytes per hash is
2988significant: keeping a pointer to the value and having a linked list for
2989collisions consume between 8 and 16 bytes, and 'malloc' may add about the
2990same overhead per allocation, putting us in the 16 to 32 byte per entry
ballpark. Adding a 64-byte key then at least triples the overall memory
requirement for the hash map (from 16 to 32 bytes up to 80 to 96 bytes
per entry).
2993
2994To make things "worse", most of the time storing the key in the hash map
2995is not required: it is typically already in memory elsewhere! In most
2996cases, the values stored in the hash map are some application-specific
struct that @emph{also} contains the hash. Here is a simplified example:
2998
2999@example
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};

// ...
val = GNUNET_malloc (sizeof (struct MyValue));
val->key = key;
val->my_data = 42;
GNUNET_CONTAINER_multihashmap_put (map, &key, val, ...);
3009@end example
3010
3011This is a common pattern as later the entries might need to be removed,
3012and at that time it is convenient to have the key immediately at hand:
3013
3014@example
GNUNET_CONTAINER_multihashmap_remove (map, &val->key, val);
3016@end example
3017
3018
3019Note that here we end up with two times 64 bytes for the key, plus maybe
302064 bytes total for the rest of the 'struct MyValue' and the map entry in
3021the hash map. The resulting redundant storage of the key increases
3022overall memory consumption per entry from the "optimal" 128 bytes to 192
3023bytes. This is not just an extreme example: overheads in practice are
3024actually sometimes close to those highlighted in this example. This is
3025especially true for maps with a significant number of entries, as there
3026we tend to really try to keep the entries small.
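
The per-entry estimate can be written out as arithmetic. The sketch below
uses only the rough figures quoted in the text (64 bytes for a key and
about 64 bytes for everything else); the names are hypothetical and the
numbers are estimates, not measurements.

```c
#include <assert.h>
#include <stddef.h>

enum { KEY_BYTES = 64 };    /* 512-bit hash code */
enum { OTHER_BYTES = 64 };  /* rest of the value plus map bookkeeping */

/* Bytes per entry when the map itself spends key_slot_bytes on the key
   (64 for a full copy, 0 for the idealized "optimal" case). */
static size_t
bytes_per_entry (size_t key_slot_bytes)
{
  assert (key_slot_bytes <= KEY_BYTES);
  return KEY_BYTES        /* key stored inside the application's value */
         + OTHER_BYTES
         + key_slot_bytes;
}
```

With a full copy of the key this gives 192 bytes per entry, and with no
key storage in the map at all it gives the "optimal" 128 bytes.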
3027
3028@c ***********************************************************************
3029@node Solution
3030@subsubsection Solution
3031@c %**end of header
3032
The solution that has now been implemented is to @strong{optionally}
allow the hash map to not make a (deep) copy of the hash but instead have
a pointer to the hash/key in the entry. This reduces the memory
consumption for the key from 64 bytes to 4 to 8 bytes. However, it can
only work if the key is actually stored in the entry (which is the
case most of the time) and if the key is never modified while the entry
is in the map (which, in all of the code I'm aware of, has always been
the case when the key is stored in the entry). Finally, when the client
stores an entry in the hash map, it @strong{must} provide a pointer to
the key within the entry, not just a pointer to a transient location of
the key. If the client code does not meet these requirements, the result
is a dangling pointer and undefined behavior of the (multi-)hash map API.
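
The lifetime requirement can be illustrated with a minimal stand-alone
sketch. The structures below are hypothetical stand-ins, not the internal
layout of the GNUnet hash map; they only show why the key pointer has to
refer to the key stored inside the value.

```c
#include <assert.h>
#include <stddef.h>

struct hash_code { unsigned char bits[64]; };  /* stand-in for a 512-bit hash */

struct my_value {
  struct hash_code key;
  unsigned int my_data;
};

/* Hypothetical map entry that borrows the key instead of copying it. */
struct map_entry {
  const struct hash_code *key;  /* 4 to 8 bytes instead of 64 */
  struct my_value *value;
};

/* Build an entry whose key pointer refers to the key stored in the value. */
static struct map_entry
make_entry (struct my_value *val)
{
  struct map_entry e;

  assert (NULL != val);
  e.key = &val->key;
  e.value = val;
  return e;
}

/* Returns 1 if the key pointer lies inside the entry's own value, so that
   the key lives exactly as long as the value does. */
static int
entry_key_is_owned (const struct map_entry *e)
{
  return e->key == &e->value->key;
}
```

An entry whose key pointer refers to a copy on the stack fails this check,
and that is exactly the case that ends in a dangling pointer once the
stack frame is gone.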
3045
3046@c ***********************************************************************
3047@node Migration
3048@subsubsection Migration
3049@c %**end of header
3050
3051To use the new feature, first check that the values contain the respective
3052key (and never modify it). Then, all calls to
3053@code{GNUNET_CONTAINER_multihashmap_put} on the respective map must be
3054audited and most likely changed to pass a pointer into the value's struct.
3055For the initial example, the new code would look like this:
3056
3057@example
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};

// ...
val = GNUNET_malloc (sizeof (struct MyValue));
val->key = key;
val->my_data = 42;
GNUNET_CONTAINER_multihashmap_put (map, &val->key, val, ...);
3066@end example
3067
3068
Note that @code{&key} was changed to @code{&val->key} in the argument to
the @code{put} call. This is critical as often @code{key} is on the stack
or in some other transient data structure and thus having the hash map
keep a pointer to @code{key} would not work. Only the key inside of
@code{val} has the same lifetime as the entry in the map (this must of
course be checked as well). Naturally, @code{val->key} must be
initialized before the @code{put} call. Once all @code{put} calls have
been converted and double-checked, you can change the call to create the
hash map from
3078
3079@example
map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_NO);
3082@end example
3083
3084to
3085
3086@example
map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_YES);
3088@end example
3089
3090If everything was done correctly, you now use about 60 bytes less memory
3091per entry in @code{map}. However, if now (or in the future) any call to
3092@code{put} does not ensure that the given key is valid until the entry is
3093removed from the map, undefined behavior is likely to be observed.
3094
3095@c ***********************************************************************
3096@node Conclusion
3097@subsubsection Conclusion
3098@c %**end of header
3099
The new optimization is often applicable and can result in a
reduction in memory consumption of up to 30% in practice. However, it
makes the code less robust as additional invariants are imposed on the
multi hash map client. Thus applications should refrain from enabling the
new mode unless the resulting memory savings are deemed significant
enough. In particular, it should generally not be used in new code (wait
at least until benchmarks exist).
3107
3108@c ***********************************************************************
3109@node Availability
3110@subsubsection Availability
3111@c %**end of header
3112
3113The new multi hash map code was committed in SVN 24319 (will be in GNUnet
31140.9.4). Various subsystems (transport, core, dht, file-sharing) were
3115previously audited and modified to take advantage of the new capability.
3116In particular, memory consumption of the file-sharing service is expected
3117to drop by 20-30% due to this change.
3118
3119
3120@cindex CONTAINER_MDLL API
3121@node The CONTAINER_MDLL API
3122@subsection The CONTAINER_MDLL API
3123@c %**end of header
3124
3125This text documents the GNUNET_CONTAINER_MDLL API. The
3126GNUNET_CONTAINER_MDLL API is similar to the GNUNET_CONTAINER_DLL API in
3127that it provides operations for the construction and manipulation of
doubly-linked lists. The key difference from the (simpler) DLL API is that
the MDLL version allows a single element (instance of a "struct") to be
in multiple linked lists at the same time.
3131
3132Like the DLL API, the MDLL API stores (most of) the data structures for
3133the doubly-linked list with the respective elements; only the 'head' and
3134'tail' pointers are stored "elsewhere" --- and the application needs to
3135provide the locations of head and tail to each of the calls in the
3136MDLL API. The key difference for the MDLL API is that the "next" and
3137"previous" pointers in the struct can no longer be simply called "next"
3138and "prev" --- after all, the element may be in multiple doubly-linked
3139lists, so we cannot just have one "next" and one "prev" pointer!
3140
3141The solution is to have multiple fields that must have a name of the
3142format "next_XX" and "prev_XX" where "XX" is the name of one of the
3143doubly-linked lists. Here is a simple example:
3144
3145@example
struct MyMultiListElement @{
  struct MyMultiListElement *next_ALIST;
  struct MyMultiListElement *prev_ALIST;
  struct MyMultiListElement *next_BLIST;
  struct MyMultiListElement *prev_BLIST;
  void *data;
@};
3154@end example
3155
3156
3157Note that by convention, we use all-uppercase letters for the list names.
3158In addition, the program needs to have a location for the head and tail
3159pointers for both lists, for example:
3160
3161@example
static struct MyMultiListElement *head_ALIST;
static struct MyMultiListElement *tail_ALIST;
static struct MyMultiListElement *head_BLIST;
static struct MyMultiListElement *tail_BLIST;
3166@end example
3167
3168
3169Using the MDLL-macros, we can now insert an element into the ALIST:
3170
3171@example
GNUNET_CONTAINER_MDLL_insert (ALIST, head_ALIST, tail_ALIST, element);
3173@end example
3174
3175
Passing "ALIST" as the first argument to the MDLL macros specifies which
of the next/prev fields in the 'struct MyMultiListElement' should be used.
The extra "ALIST" argument and the "_ALIST" in the names of the
next/prev-members are the only differences between the MDLL and DLL-API.
Like the DLL-API, the MDLL-API offers functions for inserting (at head,
at tail, after a given element) and removing elements from the list.
Iterating over the list should be done by directly accessing the
"next_XX" and/or "prev_XX" members.
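
To show how a list name can select the right pair of fields, here is a
minimal sketch based on preprocessor token pasting. The macro
@code{SKETCH_MDLL_INSERT} and the helper @code{count_alist} are
hypothetical and written for illustration only; they are not the
definitions from the GNUnet headers.

```c
#include <assert.h>
#include <stddef.h>

struct elem {
  struct elem *next_ALIST;
  struct elem *prev_ALIST;
  struct elem *next_BLIST;
  struct elem *prev_BLIST;
  int data;
};

/* Illustrative macro, not GNUnet's definition: insert 'element' at the
   head of list 'name', selecting next_/prev_ fields by token pasting. */
#define SKETCH_MDLL_INSERT(name, head, tail, element) do { \
    (element)->prev_ ## name = NULL;                       \
    (element)->next_ ## name = (head);                     \
    if (NULL == (tail))                                    \
      (tail) = (element);                                  \
    else                                                   \
      (head)->prev_ ## name = (element);                   \
    (head) = (element);                                    \
  } while (0)

/* Count the members of ALIST by following next_ALIST directly. */
static unsigned int
count_alist (const struct elem *head)
{
  unsigned int n = 0;

  /* the head of a list has no predecessor */
  assert ((NULL == head) || (NULL == head->prev_ALIST));
  for (const struct elem *pos = head; NULL != pos; pos = pos->next_ALIST)
    n++;
  return n;
}
```

The same element can be inserted into ALIST and BLIST independently,
because each list only touches its own pair of pointers.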
3184
3185@cindex Automatic Restart Manager
3186@cindex ARM
3187@node The Automatic Restart Manager (ARM)
3188@section The Automatic Restart Manager (ARM)
3189@c %**end of header
3190
GNUnet's Automatic Restart Manager (ARM) is the GNUnet service responsible
for system initialization and service babysitting. ARM starts and halts
services, detects configuration changes and restarts services impacted by
the changes as needed. It is also responsible for restarting services in
case of crashes. It is planned to incorporate automatic debugging for
diagnosing service crashes, giving developers insight into crash
reasons. The purpose of this document is to give GNUnet developers an idea
of how ARM works and how to interact with it.
3199
3200@menu
3201* Basic functionality::
3202* Key configuration options::
3203* ARM - Availability::
3204* Reliability::
3205@end menu
3206
3207@c ***********************************************************************
3208@node Basic functionality
3209@subsection Basic functionality
3210@c %**end of header
3211
3212@itemize @bullet
@item ARM source code can be found under "src/arm".@ Service processes are
managed by the functions in "gnunet-service-arm.c", which is controlled
with "gnunet-arm.c" (the main function in that file is ARM's entry point).

@item The functions responsible for communicating with ARM and for
starting and stopping services (including the ARM service itself) are
provided by the ARM API in "arm_api.c".@ The function
@code{GNUNET_ARM_connect()} returns an ARM handle to the caller after
setting it up for the caller's context (configuration and scheduler in
use). The caller can use this handle afterwards to communicate with ARM.
The functions @code{GNUNET_ARM_start_service()} and
@code{GNUNET_ARM_stop_service()} are used for starting and stopping
services, respectively.

@item A typical example of using these basic ARM services can be found in
the file test_arm_api.c. The test case connects to ARM, starts it, then
uses it to start the service "resolver", stops the "resolver" and then
stops ARM.
3229@end itemize
3230
3231@c ***********************************************************************
3232@node Key configuration options
3233@subsection Key configuration options
3234@c %**end of header
3235
Configurations for ARM and services should be available in a .conf file
(as an example, see test_arm_api_data.conf). When running ARM, the
configuration file to use should be passed to the command:
3239
3240@example
$ gnunet-arm -s -c configuration_to_use.conf
3242@end example
3243
If no configuration is passed, the default configuration file will be used
(see GNUNET_PREFIX/share/gnunet/defaults.conf which is created from
contrib/defaults.conf).@ Each service has a section that starts
with the service name between square brackets, for example: "[arm]".
The following options configure how ARM configures or interacts with the
various services:
3250
3251@table @asis
3252
@item PORT
Port number on which the service is listening for incoming TCP
connections. ARM will start the service should it notice a request at
this port.

@item HOSTNAME
Specifies on which host the service is deployed. Note
that ARM can only start services that are running on the local system
(but will not check that the hostname matches the local machine name).
This option is used by the @code{gnunet_client_lib.h} implementation to
determine which system to connect to. The default is "localhost".

@item BINARY
The name of the service binary file.

@item OPTIONS
Options to be passed to the service.

@item PREFIX
A command to prepend to the actual command, for example to run
a service with "valgrind" or "gdb".

@item DEBUG
Run in debug mode (very verbose).

@item AUTOSTART
ARM will listen on the UNIX domain socket and/or TCP port of
the service and start the service on demand.

@item FORCESTART
ARM will always start this service when the peer
is started.

@item ACCEPT_FROM
IPv4 addresses the service accepts connections from.

@item ACCEPT_FROM6
IPv6 addresses the service accepts connections from.
3281
3282@end table
3283
3284
3285Options that impact the operation of ARM overall are in the "[arm]"
3286section. ARM is a normal service and has (except for AUTOSTART) all of the
3287options that other services do. In addition, ARM has the
3288following options:
3289
3290@table @asis
3291
@item GLOBAL_PREFIX
Command to be prepended to all services that are
going to run.

@item GLOBAL_POSTFIX
Global option that will be supplied to all the
services that are going to run.
3297
3298@end table
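
As an illustration, a configuration using some of these options might look
like the fragment below. The service name "example", its binary name and
all values are hypothetical and chosen only to show the layout; consult
the shipped configuration files for real defaults.

```ini
# Hypothetical service section; all values are illustrative.
[example]
AUTOSTART = YES
PORT = 2999
HOSTNAME = localhost
BINARY = gnunet-service-example
PREFIX = valgrind
ACCEPT_FROM = 127.0.0.1;
ACCEPT_FROM6 = ::1;

[arm]
# Would be applied to every service that ARM starts; left empty here.
GLOBAL_PREFIX =
```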
3299
3300@c ***********************************************************************
3301@node ARM - Availability
3302@subsection ARM - Availability
3303@c %**end of header
3304
As mentioned before, one of the features provided by ARM is starting
services on demand. Consider the example of one service, the "client",
that wants to connect to another service, the "server". The "client" will
ask ARM to run the "server". ARM starts the "server". The "server" starts
listening for incoming connections. The "client" will establish a
connection with the "server". Then, they will start to communicate.@
One problem with that scheme is that it is slow!@
The "client" service wants to communicate with the "server" service at
once and is not willing to wait for it to be started and listening for
incoming connections before its request is served.@ One solution to that
problem would be for ARM to start all services as default services. That
would solve the problem, yet it is not practical, as some of the
services started this way might never be used or might only
be used after a relatively long time.@
The approach followed by ARM to solve this problem is as follows:
3320
3321@itemize @bullet
3322
@item For each service that has a PORT field in the configuration file
(that is, a service that accepts incoming connections from clients) and
that is not one of the default services, ARM creates listening sockets for
all addresses associated with that service.
3327
3328@item The "client" will immediately establish a connection with
3329the "server".
3330
@item ARM, pretending to be the "server", will listen on the
respective port and notice the incoming connection from the "client"
(but not accept it).

@item Once there is an incoming connection, ARM will start the "server",
passing on the listen sockets (now, the service is started and can do its
work).

@item Other client services can now connect directly to the
"server".
3341
3342@end itemize
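
The central trick in the steps above, noticing a pending connection on a
listen socket without accepting it, can be sketched with plain POSIX
calls. This is an illustration under stated assumptions and not ARM's
code: the function names are hypothetical, only IPv4 loopback is handled,
and starting the service and handing over the descriptor are omitted.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <poll.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a TCP listen socket on 127.0.0.1.  If *port is 0 the kernel picks
   a free port.  Returns the descriptor or -1; stores the bound port. */
static int
open_listen_socket (unsigned short *port)
{
  struct sockaddr_in addr;
  socklen_t len = sizeof (addr);
  int fd;

  assert (NULL != port);
  fd = socket (AF_INET, SOCK_STREAM, 0);
  if (fd < 0)
    return -1;
  memset (&addr, 0, sizeof (addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = htons (*port);
  if ( (0 != bind (fd, (struct sockaddr *) &addr, sizeof (addr))) ||
       (0 != listen (fd, 5)) ||
       (0 != getsockname (fd, (struct sockaddr *) &addr, &len)) )
  {
    close (fd);
    return -1;
  }
  *port = ntohs (addr.sin_port);
  return fd;
}

/* Returns 1 if a client is waiting on the listen socket, 0 if none showed
   up within timeout_ms, -1 on error.  The connection is deliberately NOT
   accepted: the service that inherits the descriptor will do that. */
static int
client_is_waiting (int listen_fd, int timeout_ms)
{
  struct pollfd pfd = { .fd = listen_fd, .events = POLLIN };
  int ret = poll (&pfd, 1, timeout_ms);

  if (ret < 0)
    return -1;
  return (ret > 0) && (0 != (pfd.revents & POLLIN));
}
```

Because the kernel completes the TCP handshake for sockets in the listen
backlog, the "client" can connect right away even though no process has
called @code{accept} yet.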
3343
3344@c ***********************************************************************
3345@node Reliability
3346@subsection Reliability
3347
One of the features provided by ARM is the automatic restart of crashed
services.@ ARM needs to know which of the running services died. The
function "gnunet-service-arm.c/maint_child_death()" is responsible for
that. The function is scheduled to run upon receiving a SIGCHLD signal. It
then iterates over ARM's list of running services and determines
which services have died (crashed). ARM restarts all crashed services.@
Now consider the case of a service with a serious problem that causes it
to crash each time it is started by ARM. If ARM kept blindly restarting
such a service, we would get the pattern
start-crash-restart-crash-restart-crash and so forth, which is of course
not practical.@
For that reason, ARM schedules the service to be restarted after waiting
for some delay that grows exponentially with each crash/restart of that
service.@ To clarify the idea, consider the following example:
3363
3364@itemize @bullet
3365
@item Service S crashes.

@item ARM receives the SIGCHLD and inspects its list of services to find
the dead one(s).

@item ARM finds S dead and schedules it for restarting after the "backoff"
time, which is initially set to 1ms. ARM then doubles the backoff time
for S (now backoff(S) = 2ms).

@item Because there is a severe problem with S, it crashes again.

@item Again ARM receives the SIGCHLD and detects that S has crashed
again. ARM schedules it for restarting, but after its new backoff time
(which became 2ms), and doubles its backoff time (now backoff(S) = 4ms).

@item And so on, until backoff(S) reaches a certain threshold
(@code{EXPONENTIAL_BACKOFF_THRESHOLD} is set to half an hour).
After reaching it, backoff(S) will remain at half an hour,
so ARM will not spend much time trying to restart a
problematic service.
3386@end itemize
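
The backoff schedule from the example can be sketched in a few lines. The
function and macro names are hypothetical; only the 1ms starting value,
the doubling and the half-hour cap are taken from the text, and the exact
rounding at the cap is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cap taken from the text: half an hour, expressed in milliseconds. */
#define BACKOFF_CAP_MS (30ULL * 60ULL * 1000ULL)

/* Returns the delay to wait before the upcoming restart and doubles the
   stored backoff for the crash after that, saturating at the cap.
   The stored backoff is expected to start at 1 (ms). */
static uint64_t
next_restart_delay_ms (uint64_t *backoff_ms)
{
  uint64_t delay;

  assert (NULL != backoff_ms);
  delay = *backoff_ms;
  if (*backoff_ms >= BACKOFF_CAP_MS / 2)
    *backoff_ms = BACKOFF_CAP_MS;  /* doubling would exceed the cap */
  else
    *backoff_ms *= 2;
  return delay;
}
```

Starting from 1ms, the cap is reached after about 21 consecutive crashes,
so a permanently broken service settles at one restart attempt every half
hour.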
3387
3388@cindex TRANSPORT Subsystem
3389@node GNUnet's TRANSPORT Subsystem
3390@section GNUnet's TRANSPORT Subsystem
3391@c %**end of header
3392
3393This chapter documents how the GNUnet transport subsystem works. The
3394GNUnet transport subsystem consists of three main components: the
3395transport API (the interface used by the rest of the system to access the
transport service), the transport service itself (most of the interesting
functionality, such as choosing transports, happens here) and the
transport plugins. A transport plugin is a concrete implementation for how two
3399GNUnet peers communicate; many plugins exist, for example for
3400communication via TCP, UDP, HTTP, HTTPS and others. Finally, the
3401transport subsystem uses supporting code, especially the NAT/UPnP
3402library to help with tasks such as NAT traversal.
3403
3404Key tasks of the transport service include:
3405
3406@itemize @bullet
3407
3408@item Create our HELLO message, notify clients and neighbours if our HELLO
3409changes (using NAT library as necessary)
3410
3411@item Validate HELLOs from other peers (send PING), allow other peers to
3412validate our HELLO's addresses (send PONG)
3413
3414@item Upon request, establish connections to other peers (using address
3415selection from ATS subsystem) and maintain them (again using PINGs and
3416PONGs) as long as desired
3417
3418@item Accept incoming connections, give ATS service the opportunity to
3419switch communication channels
3420
3421@item Notify clients about peers that have connected to us or that have
3422been disconnected from us
3423
3424@item If a (stateful) connection goes down unexpectedly (without explicit
3425DISCONNECT), quickly attempt to recover (without notifying clients) but do
3426notify clients quickly if reconnecting fails
3427
3428@item Send (payload) messages arriving from clients to other peers via
3429transport plugins and receive messages from other peers, forwarding
3430those to clients
3431
3432@item Enforce inbound traffic limits (using flow-control if it is
3433applicable); outbound traffic limits are enforced by CORE, not by us (!)
3434
3435@item Enforce restrictions on P2P connection as specified by the blacklist
3436configuration and blacklisting clients
3437@end itemize
3438
3439Note that the term "clients" in the list above really refers to the
3440GNUnet-CORE service, as CORE is typically the only client of the
3441transport service.
3442
3443@menu
3444* Address validation protocol::
3445@end menu
3446
3447@node Address validation protocol
3448@subsection Address validation protocol
3449@c %**end of header
3450
3451This section documents how the GNUnet transport service validates
3452connections with other peers. It is a high-level description of the
3453protocol necessary to understand the details of the implementation. It
3454should be noted that when we talk about PING and PONG messages in this
3455section, we refer to transport-level PING and PONG messages, which are
3456different from core-level PING and PONG messages (both in implementation
3457and function).
3458
3459The goal of transport-level address validation is to minimize the chances
3460of a successful man-in-the-middle attack against GNUnet peers on the
3461transport level. Such an attack would not allow the adversary to decrypt
3462the P2P transmissions, but a successful attacker could at least measure
traffic volumes and latencies (raising the adversary's capabilities to
those of a global passive adversary in the worst case). The scenario we
are concerned about is an attacker, Mallory, giving a HELLO to Alice that
claims to be for Bob, but contains Mallory's IP address instead of Bob's
(for some transport). Mallory would then forward the traffic to Bob (by
initiating a connection to Bob and claiming to be Alice). As a further
complication, the scheme has to work even if, say, Alice is behind a NAT
without traversal support and hence has no address of their own (and thus
Alice must always initiate the connection to Bob).
3472
3473An additional constraint is that HELLO messages do not contain a
3474cryptographic signature since other peers must be able to edit
3475(i.e. remove) addresses from the HELLO at any time (this was not true in
3476GNUnet 0.8.x). A basic @strong{assumption} is that each peer knows the
3477set of possible network addresses that it @strong{might} be reachable
3478under (so for example, the external IP address of the NAT plus the LAN
3479address(es) with the respective ports).
3480
The solution is the following. If Alice wants to validate that a given
address for Bob is valid (i.e. that the connection is actually established
@strong{directly} with the intended target), Alice sends a PING message
over that connection to Bob. Note that in this case, Alice initiated the
connection so only Alice knows for sure which address was used (Alice may
be behind NAT, so whatever address Bob sees may not be an address Alice
knows they have). Bob checks that the address given in the PING is
actually one of Bob's addresses (and does not belong to Mallory), and if
it is, sends back a PONG (with a signature that says that Bob owns/uses
the address from the PING). Alice checks the signature and is happy if it
is valid and the address in the PONG is the address Alice used.
This is similar to the 0.8.x protocol where the HELLO contained a
signature from Bob for each address used by Bob.
Here, the purpose code for the signature is
@code{GNUNET_SIGNATURE_PURPOSE_TRANSPORT_PONG_OWN}. After this, Alice will
remember Bob's address and consider the address valid for a while (12h in
the current implementation). Note that after this exchange, Alice only
considers Bob's address to be valid; the connection itself is not
considered 'established'. In particular, Alice may have many addresses
for Bob that Alice considers valid.
3503
3504The PONG message is protected with a nonce/challenge against replay
3505attacks and uses an expiration time for the signature (but those are
3506almost implementation details).
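
The checks made by both sides can be summarized in a small model. This is
only a sketch: the structures, field names and address strings are
hypothetical, the real wire format is different, and the cryptographic
signature is reduced to a boolean so that the example stays self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message layouts, for illustration only. */
struct ping { uint32_t nonce; char target_addr[64]; };
struct pong { uint32_t nonce; char addr[64]; uint64_t expires; bool sig_ok; };

/* Bob: answer only if the address in the PING is one of his own. */
static bool
bob_handle_ping (const struct ping *in, const char *const *own_addrs,
                 size_t n_addrs, uint64_t now, uint64_t validity,
                 struct pong *out)
{
  assert ((NULL != in) && (NULL != out));
  for (size_t i = 0; i < n_addrs; i++)
  {
    if (0 != strcmp (in->target_addr, own_addrs[i]))
      continue;
    out->nonce = in->nonce;  /* echo the challenge */
    memcpy (out->addr, in->target_addr, sizeof (out->addr));
    out->expires = now + validity;
    out->sig_ok = true;      /* stands in for Bob's signature */
    return true;
  }
  return false;  /* not Bob's address (possibly Mallory's): no PONG */
}

/* Alice: accept the address only if signature, nonce, address and
   expiration time all check out. */
static bool
alice_check_pong (const struct pong *p, uint32_t sent_nonce,
                  const char *addr_used, uint64_t now)
{
  assert ((NULL != p) && (NULL != addr_used));
  return p->sig_ok
         && (p->nonce == sent_nonce)
         && (0 == strcmp (p->addr, addr_used))
         && (now < p->expires);
}
```

A HELLO that lists Mallory's address under Bob's identity fails at Bob's
side of this exchange, since Bob will not sign an address that is not
his own.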
3507
3508@cindex NAT library
3509@node NAT library
3510@section NAT library
3511@c %**end of header
3512
3513The goal of the GNUnet NAT library is to provide a general-purpose API for
3514NAT traversal @strong{without} third-party support. So protocols that
3515involve contacting a third peer to help establish a connection between
3516two peers are outside of the scope of this API. That does not mean that
3517GNUnet doesn't support involving a third peer (we can do this with the
3518distance-vector transport or using application-level protocols), it just
3519means that the NAT API is not concerned with this possibility. The API is
3520written so that it will work for IPv6-NAT in the future as well as
3521current IPv4-NAT. Furthermore, the NAT API is always used, even for peers
3522that are not behind NAT --- in that case, the mapping provided is simply
3523the identity.
3524
3525NAT traversal is initiated by calling @code{GNUNET_NAT_register}. Given a
3526set of addresses that the peer has locally bound to (TCP or UDP), the NAT
3527library will return (via callback) a (possibly longer) list of addresses
3528the peer @strong{might} be reachable under. Internally, depending on the
3529configuration, the NAT library will try to punch a hole (using UPnP) or
3530just "know" that the NAT was manually punched and generate the respective
3531external IP address (the one that should be globally visible) based on
3532the given information.
3533
3534The NAT library also supports ICMP-based NAT traversal. Here, the other
3535peer can request connection-reversal by this peer (in this special case,
3536the peer is even allowed to configure a port number of zero). If the NAT
3537library detects a connection-reversal request, it returns the respective
3538target address to the client as well. It should be noted that
3539connection-reversal is currently only intended for TCP, so other plugins
3540@strong{must} pass @code{NULL} for the reversal callback. Naturally, the
3541NAT library also supports requesting connection reversal from a remote
3542peer (@code{GNUNET_NAT_run_client}).
3543
3544Once initialized, the NAT handle can be used to test if a given address is
3545possibly a valid address for this peer (@code{GNUNET_NAT_test_address}).
3546This is used for validating our addresses when generating PONGs.
3547
3548Finally, the NAT library contains an API to test if our NAT configuration
3549is correct. Using @code{GNUNET_NAT_test_start} @strong{before} binding to
3550the respective port, the NAT library can be used to test if the
configuration works. The test function acts as a local client, initializes
the NAT traversal and then contacts a @code{gnunet-nat-server} (running by
default on @code{gnunet.org}) and asks for a connection to be established.
This way, it is easy to test if the current NAT configuration is valid.
3555
3556@node Distance-Vector plugin
3557@section Distance-Vector plugin
3558@c %**end of header
3559
3560The Distance Vector (DV) transport is a transport mechanism that allows
3561peers to act as relays for each other, thereby connecting peers that would
3562otherwise be unable to connect. This gives a larger connection set to
3563applications that may work better with more peers to choose from (for
3564example, File Sharing and/or DHT).
3565
3566The Distance Vector transport essentially has two functions. The first is
3567"gossiping" connection information about more distant peers to directly
3568connected peers. The second is taking messages intended for non-directly
3569connected peers and encapsulating them in a DV wrapper that contains the
3570required information for routing the message through forwarding peers. Via
3571gossiping, optimal routes through the known DV neighborhood are discovered
3572and utilized and the message encapsulation provides some benefits in
3573addition to simply getting the message from the correct source to the
3574proper destination.
3575
The gossiping function of DV provides an up-to-date routing table of
peers that are available up to some number of hops. We call this a
fisheye view of the network (as for a fish, nearby objects are known while
more distant ones are unknown). Gossip messages are sent only to directly
connected peers, but they are sent about other known peers within the
"fisheye distance". Whenever two peers connect, they immediately gossip
to each other about their respective other neighbors. They also gossip
about the newly connected peer to previously
connected neighbors. In order to keep the routing tables up to date,
disconnect notifications are propagated as gossip as well (because
disconnects may not be sent or received, timeouts are also used to remove
stale routing table entries).
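
The bookkeeping described above can be sketched in C as follows. This is
only an illustration under stated assumptions, not the actual DV code: the
names @code{dv_route}, @code{dv_table_learn} and @code{dv_table_expire},
the constants @code{FISHEYE_DEPTH} and @code{ROUTE_TIMEOUT}, and the use of
small integers as peer identities are all made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ROUTES 32
#define FISHEYE_DEPTH 3   /* hypothetical hop limit ("fisheye distance") */
#define ROUTE_TIMEOUT 60  /* hypothetical entry lifetime in seconds */

/* One routing table entry: how to reach `target` and when we last heard of it. */
struct dv_route {
  int in_use;
  int target;      /* peer the entry is about (toy integer identity) */
  int next_hop;    /* directly connected neighbor that gossiped it */
  unsigned hops;   /* distance to target, 1 = direct neighbor */
  long last_seen;  /* time of the last gossip for this entry */
};

struct dv_table {
  struct dv_route routes[MAX_ROUTES];
};

/* Find the entry for `target`, or NULL if it is unknown. */
static struct dv_route *dv_table_find(struct dv_table *t, int target)
{
  for (size_t i = 0; i < MAX_ROUTES; i++)
    if (t->routes[i].in_use && t->routes[i].target == target)
      return &t->routes[i];
  return NULL;
}

/* Process gossip from neighbor `via` saying that `target` is
 * `hops_from_via` hops away from it (0 means `via` itself).
 * Returns 1 if the table changed, 0 otherwise. */
static int dv_table_learn(struct dv_table *t, int via, int target,
                          unsigned hops_from_via, long now)
{
  unsigned hops = hops_from_via + 1;
  struct dv_route *r;

  assert(t != NULL);
  if (hops > FISHEYE_DEPTH)
    return 0;                 /* beyond the fisheye distance: ignore */
  r = dv_table_find(t, target);
  if (r != NULL && hops >= r->hops) {
    if (r->next_hop == via)
      r->last_seen = now;     /* same route confirmed again: refresh it */
    return 0;
  }
  if (r == NULL) {
    for (size_t i = 0; i < MAX_ROUTES && r == NULL; i++)
      if (!t->routes[i].in_use)
        r = &t->routes[i];
    if (r == NULL)
      return 0;               /* table full */
  }
  r->in_use = 1;
  r->target = target;
  r->next_hop = via;
  r->hops = hops;
  r->last_seen = now;
  return 1;
}

/* Drop entries not refreshed within ROUTE_TIMEOUT; returns how many. */
static unsigned dv_table_expire(struct dv_table *t, long now)
{
  unsigned removed = 0;
  for (size_t i = 0; i < MAX_ROUTES; i++)
    if (t->routes[i].in_use && now - t->routes[i].last_seen > ROUTE_TIMEOUT) {
      t->routes[i].in_use = 0;
      removed++;
    }
  return removed;
}
```

The timeout pass stands in for the case where a disconnect notification
never arrives; an explicit disconnect would simply clear the matching
entries right away.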
3588
3589Routing of messages via DV is straightforward. When the DV transport is
3590notified of a message destined for a non-direct neighbor, the appropriate
3591forwarding peer is selected, and the base message is encapsulated in a DV
3592message which contains information about the initial peer and the intended
3593recipient. At each forwarding hop, the initial peer is validated (the
3594forwarding peer ensures that it has the initial peer in its neighborhood,
3595otherwise the message is dropped). Next the base message is
3596re-encapsulated in a new DV message for the next hop in the forwarding
3597chain (or delivered to the current peer, if it has arrived at the
3598destination).
3599
3600Assume a three peer network with peers Alice, Bob and Carol. Assume that
3601Alice <-> Bob and Bob <-> Carol are direct (e.g. over TCP or UDP
3602transports) connections, but that Alice cannot directly connect to Carol.
3603This may be the case due to NAT or firewall restrictions, or perhaps
based on one of the peers' respective configurations. If the Distance
3605Vector transport is enabled on all three peers, it will automatically
3606discover (from the gossip protocol) that Alice and Carol can connect via
Bob and provide a "virtual" Alice <-> Carol connection. Routing between
Alice and Carol happens as follows: Alice creates a message destined for
Carol and notifies the DV transport about it. The DV transport at Alice
looks up Carol in the routing table and finds that the message must be
sent through Bob to reach Carol. The message is encapsulated, setting Alice
as the initiator and Carol as the destination, and sent to Bob. Bob receives
the message, verifies that both Alice and Carol are known to Bob, and
re-wraps the message in a new DV message for Carol. The DV transport at Carol
3615receives this message, unwraps the original message, and delivers it to
3616Carol as though it came directly from Alice.
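
The Alice, Bob and Carol walk-through can be mirrored by a small C sketch.
It is a toy model under loud assumptions: @code{struct dv_message},
@code{dv_wrap}, @code{dv_handle} and the integer peer identities are
hypothetical and do not reflect the real wire format or the plugin's API.
It only shows the decision each hop makes: drop if the initiator is
unknown, deliver if the message has arrived, otherwise re-wrap and forward.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DV_MAX_PAYLOAD 256

/* Hypothetical DV wrapper: initiator and destination plus the base message. */
struct dv_message {
  int sender;                            /* initial peer (e.g. Alice) */
  int recipient;                         /* intended recipient (e.g. Carol) */
  size_t size;                           /* number of payload bytes in use */
  unsigned char payload[DV_MAX_PAYLOAD];
};

enum dv_action { DV_DROP, DV_DELIVER, DV_FORWARD };

/* Wrap `size` bytes of `data` for `recipient`.
 * Returns 0 on success, -1 if the base message is too large. */
static int dv_wrap(struct dv_message *out, int sender, int recipient,
                   const void *data, size_t size)
{
  assert(out != NULL && data != NULL);
  if (size > DV_MAX_PAYLOAD)
    return -1;
  out->sender = sender;
  out->recipient = recipient;
  out->size = size;
  memcpy(out->payload, data, size);
  return 0;
}

/* Is `peer` among the `n` peers this hop knows about? */
static int dv_knows(const int *known, size_t n, int peer)
{
  for (size_t i = 0; i < n; i++)
    if (known[i] == peer)
      return 1;
  return 0;
}

/* Decide what peer `self` does with the incoming DV message `in`.
 * On DV_FORWARD, `out` holds the re-wrapped message for the next hop. */
static enum dv_action dv_handle(int self, const int *known, size_t n,
                                const struct dv_message *in,
                                struct dv_message *out)
{
  if (!dv_knows(known, n, in->sender))
    return DV_DROP;                      /* initiator not in our neighborhood */
  if (in->recipient == self)
    return DV_DELIVER;                   /* unwrap and hand to the local peer */
  if (!dv_knows(known, n, in->recipient))
    return DV_DROP;                      /* no route to the destination */
  if (dv_wrap(out, in->sender, in->recipient, in->payload, in->size) != 0)
    return DV_DROP;
  return DV_FORWARD;
}
```

With Alice as 1, Bob as 2 and Carol as 3, a message wrapped at Alice yields
@code{DV_FORWARD} at Bob (who knows both ends) and @code{DV_DELIVER} at
Carol; a forwarding peer that does not know Alice returns @code{DV_DROP}.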
3617
3618@cindex SMTP plugin
3619@node SMTP plugin
3620@section SMTP plugin
3621@c %**end of header
3622
This section describes the new SMTP transport plugin for GNUnet as it
exists in the 0.7.x and 0.8.x branches. SMTP support is currently not
available in GNUnet 0.9.x. This section also describes the transport layer
abstraction (as it existed in 0.7.x and 0.8.x) in more detail and gives
some benchmarking results. The performance results presented are quite
old and may be outdated at this point.
3629
3630@itemize @bullet
3631@item Why use SMTP for a peer-to-peer transport?
@item How does it work?
3633@item How do I configure my peer?
3634@item How do I test if it works?
3635@item How fast is it?
3636@item Is there any additional documentation?
3637@end itemize
3638
3639
3640@menu
3641* Why use SMTP for a peer-to-peer transport?::
3642* How does it work?::
3643* How do I configure my peer?::
3644* How do I test if it works?::
3645* How fast is it?::
3646@end menu
3647
3648@node Why use SMTP for a peer-to-peer transport?
3649@subsection Why use SMTP for a peer-to-peer transport?
3650@c %**end of header
3651
3652There are many reasons why one would not want to use SMTP:
3653
3654@itemize @bullet
@item SMTP uses more bandwidth than TCP, UDP or HTTP.
@item SMTP has a much higher latency.
@item SMTP requires significantly more computation (encoding and decoding
time) for the peers.
@item SMTP is significantly more complicated to configure.
@item SMTP may be abused by tricking GNUnet into sending mail to
non-participating third parties.
3662@end itemize
3663
3664So why would anybody want to use SMTP?
3665@itemize @bullet
3666@item SMTP can be used to contact peers behind NAT boxes (in virtual
3667private networks).
@item SMTP can be used to circumvent policies that limit or prohibit
peer-to-peer traffic by masquerading as "legitimate" traffic.
3670@item SMTP uses E-mail addresses which are independent of a specific IP,
3671which can be useful to address peers that use dynamic IP addresses.
3672@item SMTP can be used to initiate a connection (e.g. initial address
3673exchange) and peers can then negotiate the use of a more efficient
3674protocol (e.g. TCP) for the actual communication.
3675@end itemize
3676
3677In summary, SMTP can for example be used to send a message to a peer
3678behind a NAT box that has a dynamic IP to tell the peer to establish a
3679TCP connection to a peer outside of the private network. Even an
3680extraordinary overhead for this first message would be irrelevant in this
3681type of situation.
3682
3683@node How does it work?
3684@subsection How does it work?
3685@c %**end of header
3686
3687When a GNUnet peer needs to send a message to another GNUnet peer that has
3688advertised (only) an SMTP transport address, GNUnet base64-encodes the
3689message and sends it in an E-mail to the advertised address. The
3690advertisement contains a filter which is placed in the E-mail header,
such that the receiving host can filter the tagged E-mails and forward them
3692to the GNUnet peer process. The filter can be specified individually by
3693each peer and be changed over time. This makes it impossible to censor
3694GNUnet E-mail messages by searching for a generic filter.
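
As a rough illustration of the sending side, the following C sketch
base64-encodes a binary message and places the receiver's filter line in
the mail header. It is not the plugin's code: @code{base64_encode} and
@code{compose_mail} are hypothetical helpers, the mail layout is reduced to
a @code{To:} line, the filter line and the body, and a real sender would
hand the result to an SMTP server rather than keep it in a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

static const char b64_alphabet[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Base64-encode `len` bytes from `in` into `out` (NUL-terminated).
 * `out` must have room for 4 * ((len + 2) / 3) + 1 bytes.
 * Returns the number of characters written, excluding the NUL. */
static size_t base64_encode(const unsigned char *in, size_t len, char *out)
{
  size_t o = 0;
  for (size_t i = 0; i < len; i += 3) {
    unsigned long v = (unsigned long) in[i] << 16;
    if (i + 1 < len)
      v |= (unsigned long) in[i + 1] << 8;
    if (i + 2 < len)
      v |= in[i + 2];
    out[o++] = b64_alphabet[(v >> 18) & 63];
    out[o++] = b64_alphabet[(v >> 12) & 63];
    out[o++] = (i + 1 < len) ? b64_alphabet[(v >> 6) & 63] : '=';
    out[o++] = (i + 2 < len) ? b64_alphabet[v & 63] : '=';
  }
  out[o] = '\0';
  return o;
}

/* Compose a mail whose header carries the receiver's advertised filter
 * line (for example "X-mailer: GNUnet") and whose body is the encoded
 * message. Returns what snprintf returns, or -1 on invalid input. */
static int compose_mail(char *buf, size_t buflen, const char *to,
                        const char *filter, const unsigned char *msg,
                        size_t msglen)
{
  char body[1024];

  assert(buf != NULL && to != NULL && filter != NULL && msg != NULL);
  if (strlen(filter) == 0)
    return -1;                 /* without a filter the mail cannot be sorted */
  if (4 * ((msglen + 2) / 3) + 1 > sizeof body)
    return -1;                 /* message too large for this sketch */
  base64_encode(msg, msglen, body);
  return snprintf(buf, buflen, "To: %s\r\n%s\r\n\r\n%s\r\n", to, filter, body);
}
```

Because the filter is just a header line chosen by the receiving peer, the
receiving host can match it with an ordinary mail filter rule, and the peer
can change it at any time by advertising a new one.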
3695
3696@node How do I configure my peer?
3697@subsection How do I configure my peer?
3698@c %**end of header
3699
3700First, you need to configure @code{procmail} to filter your inbound E-mail
3701for GNUnet traffic. The GNUnet messages must be delivered into a pipe, for
3702example @code{/tmp/gnunet.smtp}. You also need to define a filter that is
3703used by @command{procmail} to detect GNUnet messages. You are free to
3704choose whichever filter you like, but you should make sure that it does
3705not occur in your other E-mail. In our example, we will use
3706@code{X-mailer: GNUnet}. The @code{~/.procmailrc} configuration file then
3707looks like this:
3708
3709@example
3710:0:
3711* ^X-mailer: GNUnet
3712/tmp/gnunet.smtp
3713# where do you want your other e-mail delivered to
3714# (default: /var/spool/mail/)
:0:
/var/spool/mail/
3716@end example
3717
3718After adding this file, first make sure that your regular E-mail still
3719works (e.g. by sending an E-mail to yourself). Then edit the GNUnet
3720configuration. In the section @code{SMTP} you need to specify your E-mail
3721address under @code{EMAIL}, your mail server (for outgoing mail) under
3722@code{SERVER}, the filter (X-mailer: GNUnet in the example) under
3723@code{FILTER} and the name of the pipe under @code{PIPE}.@ The completed
3724section could then look like this:
3725
3726@example
EMAIL = me@@mail.gnu.org
MTU = 65000
SERVER = mail.gnu.org:25
FILTER = "X-mailer: GNUnet"
PIPE = /tmp/gnunet.smtp
3729@end example
3730
3731Finally, you need to add @code{smtp} to the list of @code{TRANSPORTS} in
3732the @code{GNUNETD} section. GNUnet peers will use the E-mail address that
3733you specified to contact your peer until the advertisement times out.
3734Thus, if you are not sure if everything works properly or if you are not
3735planning to be online for a long time, you may want to configure this
3736timeout to be short, e.g. just one hour. For this, set
3737@code{HELLOEXPIRES} to @code{1} in the @code{GNUNETD} section.
3738
That should be all, but you will probably want to test it first.
3740
3741@node How do I test if it works?
3742@subsection How do I test if it works?
3743@c %**end of header
3744
3745Any transport can be subjected to some rudimentary tests using the
3746@code{gnunet-transport-check} tool. The tool sends a message to the local
3747node via the transport and checks that a valid message is received. While
this test does not involve other peers and cannot check if firewalls or
other network obstacles prohibit proper operation, this is a great
test case for the SMTP transport since it tests nearly all of
the functionality.
3752
3753@code{gnunet-transport-check} should only be used without running
3754@code{gnunetd} at the same time. By default, @code{gnunet-transport-check}
3755tests all transports that are specified in the configuration file. But
3756you can specifically test SMTP by giving the option
3757@code{--transport=smtp}.
3758
3759Note that this test always checks if a transport can receive and send.
3760While you can configure most transports to only receive or only send
3761messages, this test will only work if you have configured the transport
3762to send and receive messages.
3763
3764@node How fast is it?
3765@subsection How fast is it?
3766@c %**end of header
3767
We have measured the performance of the UDP, TCP and SMTP transport layers
directly and when used from an application using the GNUnet core.
Measuring just the transport layer gives a better view of the actual
3771overhead of the protocol, whereas evaluating the transport from the
3772application puts the overhead into perspective from a practical point of
3773view.
3774
The loopback measurements of the SMTP transport were performed on three
different machines spanning a range of modern SMTP configurations. We
used a PIII-800 running RedHat 7.3 with the Purdue Computer Science
configuration, which includes filters for spam. We also used a Xeon 2 GHz
with a vanilla RedHat 8.0 sendmail configuration. Furthermore, we used
qmail on a PIII-1000 running Sorcerer GNU Linux (SGL). The numbers for
UDP and TCP are provided using the SGL configuration. The qmail benchmark
uses qmail's internal filtering whereas the sendmail benchmarks rely on
procmail to filter and deliver the mail.

We used the transport layer to
send a message of b bytes (excluding transport protocol headers) directly
to the local machine. This way, network latency and packet loss on the
wire have no impact on the timings. n messages were sent sequentially over
the transport layer, sending message i+1 after the i-th message was
received. All messages were sent over the same connection and the time to
establish the connection was not taken into account since this overhead is
minuscule in practice, as long as a connection is used for a
significant number of messages.
3792
3793@multitable @columnfractions .20 .15 .15 .15 .15 .15
3794@headitem Transport @tab UDP @tab TCP @tab SMTP (Purdue sendmail)
3795@tab SMTP (RH 8.0) @tab SMTP (SGL qmail)
3796@item 11 bytes @tab 31 ms @tab 55 ms @tab 781 s @tab 77 s @tab 24 s
3797@item 407 bytes @tab 37 ms @tab 62 ms @tab 789 s @tab 78 s @tab 25 s
3798@item 1,221 bytes @tab 46 ms @tab 73 ms @tab 804 s @tab 78 s @tab 25 s
3799@end multitable
3800
The benchmarks show that UDP and TCP are, as expected, both significantly
faster than any of the SMTP services. Among the SMTP
implementations, there can be significant differences depending on the
SMTP configuration. Filtering with an external tool like procmail that
needs to re-parse its configuration for each mail can be very expensive.
Applying spam filters can also significantly impact the performance of
the underlying SMTP implementation. The microbenchmark shows that SMTP
can be a viable solution for initiating peer-to-peer sessions: a couple of
seconds to connect to a peer are probably not even going to be noticed by
users.

The next benchmark measures the possible throughput for a
transport. Throughput can be measured by sending multiple messages in
parallel and measuring packet loss. Note that not only UDP but also the
TCP transport can actually lose messages since the TCP implementation
drops messages if the @code{write} to the socket would block. While the
SMTP protocol never drops messages itself, it is often so
slow that only a fraction of the messages can be sent and received in the
given time bounds. For this benchmark we report the message loss after
allowing t time for sending m messages. If messages were not sent (or
received) after an overall timeout of t, they were considered lost.

The benchmark was performed using two Xeon 2 GHz machines running RedHat 8.0
with sendmail. The machines were connected with a direct 100 MBit Ethernet
connection. Figures udp1200, tcp1200 and smtp-MTUs show that the
throughput for messages of size 1,200 octets is 2,343 kbps, 3,310 kbps
and 6 kbps for UDP, TCP and SMTP respectively. The high per-message
overhead of SMTP can be improved by increasing the MTU; for example, an
MTU of 12,000 octets improves the throughput to 13 kbps as figure
smtp-MTUs shows. Our research paper has some more details on the
benchmarking results.
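
To put the throughput figures into perspective, they can be converted into
messages per second. The helper below is a hypothetical illustration, not
part of the benchmark code, and it assumes that "kbps" means 1,000 bits per
second, which the text does not state. Under that assumption, 6 kbps with
1,200-octet messages is about 0.6 messages per second (roughly 1.6 seconds
per message), while 13 kbps with 12,000-octet messages is about 0.14
messages per second (roughly 7.4 seconds per message): the message is ten
times larger but takes less than five times as long, which is the
per-message overhead being amortized.

```c
#include <assert.h>

/* Messages per second implied by a throughput of `kbps` (ASSUMED to mean
 * 1000 bits per second) when every message carries `octets` bytes. */
static double messages_per_second(double kbps, unsigned octets)
{
  assert(octets > 0);
  return kbps * 1000.0 / (8.0 * octets);
}

/* Seconds spent per message at the given throughput and message size. */
static double seconds_per_message(double kbps, unsigned octets)
{
  double rate = messages_per_second(kbps, octets);
  assert(rate > 0.0);
  return 1.0 / rate;
}
```

The same conversion gives roughly 244 messages per second for UDP and 345
for TCP at 1,200 octets, again only under the stated unit assumption.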
3829
3830@cindex Bluetooth plugin
3831@node Bluetooth plugin
3832@section Bluetooth plugin
3833@c %**end of header
3834
3835This page describes the new Bluetooth transport plugin for GNUnet. The
3836plugin is still in the testing stage so don't expect it to work
3837perfectly. If you have any questions or problems just post them here or
3838ask on the IRC channel.
3839
3840@itemize @bullet
3841@item What do I need to use the Bluetooth plugin transport?
@item How does it work?
3843@item What possible errors should I be aware of?
3844@item How do I configure my peer?
3845@item How can I test it?
3846@end itemize
3847
3848@menu
3849* What do I need to use the Bluetooth plugin transport?::
3850* How does it work2?::
3851* What possible errors should I be aware of?::
3852* How do I configure my peer2?::
3853* How can I test it?::
3854* The implementation of the Bluetooth transport plugin::
3855@end menu
3856
3857@node What do I need to use the Bluetooth plugin transport?
3858@subsection What do I need to use the Bluetooth plugin transport?
3859@c %**end of header
3860
3861If you are a Linux user and you want to use the Bluetooth transport plugin
3862you should install the BlueZ development libraries (if they aren't already
3863installed). For instructions about how to install the libraries you should
3864check out the BlueZ site
3865(@uref{http://www.bluez.org/, http://www.bluez.org}). If you don't know if
you have the necessary libraries, don't worry: just run the GNUnet
3867configure script and you will be able to see a notification at the end
3868which will warn you if you don't have the necessary libraries.
3869
If you are a Windows user you should have installed
@emph{MinGW}/@emph{MSys2} with the latest updates (especially the
@emph{ws2bth} header). If this is your first build of GNUnet on Windows
you should check out the SBuild repository. It will semi-automatically
assemble a @emph{MinGW}/@emph{MSys2} installation with a lot of extra
packages which are needed for the GNUnet build, so this will ease your
work! Finally, you just have to be sure that you have the correct drivers
for your Bluetooth device installed and that your device is on and in
discoverable mode. The Windows Bluetooth Stack supports only the RFCOMM
protocol, so we cannot turn on your device programmatically!
3880
3881@c FIXME: Change to unique title
3882@node How does it work2?
3883@subsection How does it work2?
3884@c %**end of header
3885
3886The Bluetooth transport plugin uses virtually the same code as the WLAN
3887plugin and only the helper binary is different. The helper takes a single
3888argument, which represents the interface name and is specified in the
3889configuration file. Here are the basic steps that are followed by the
3890helper binary used on Linux:
3891
3892@itemize @bullet
@item it verifies if the name corresponds to a Bluetooth interface name
@item it verifies if the interface is up (if it is not, it tries to bring
it up)
@item it tries to enable the page and inquiry scan in order to make the
device discoverable and to accept incoming connection requests
@emph{The above operations require root access so you should start the
transport plugin with root privileges.}
@item it finds an available port number, registers an SDP service which
other devices use to find out which port the server is listening on, and
switches the socket to listening mode
@item it sends a HELLO message with its address
@item finally it forwards traffic from the reading sockets to STDOUT
and from STDIN to the writing socket
3906@end itemize
3907
Once in a while the device makes an inquiry scan to discover
nearby devices and randomly sends them HELLO messages for peer
discovery.
3911
3912@node What possible errors should I be aware of?
3913@subsection What possible errors should I be aware of?
3914@c %**end of header
3915
@emph{This section is intended for Linux users.}
3917
There are many ways in which things could go wrong, but I will try to
present some tools that you could use for debugging and some scenarios.
3920
3921@itemize @bullet
3922
3923@item @code{bluetoothd -n -d} : use this command to enable logging in the
3924foreground and to print the logging messages
3925
@item @code{hciconfig}: can be used to configure the Bluetooth devices.
If you run it without any arguments it will print information about the
state of the interfaces. So if you receive an error that the device
couldn't be brought up, you should try to bring it up manually and see if
that works (use @code{hciconfig -a hciX up}). If you can't and the
Bluetooth address has the form 00:00:00:00:00:00, it means that there is
something wrong with the D-Bus daemon or with the Bluetooth daemon. Use
the @code{bluetoothd} tool to see the logs.
3934
@item @code{sdptool} can be used to control and interrogate SDP servers.
If you encounter problems regarding the SDP server (like the SDP server
being down) you should check whether the D-Bus daemon is running correctly
and whether the Bluetooth daemon started correctly (use the
@code{bluetoothd} tool).
Also, sometimes the SDP service could work but somehow the device couldn't
register its service. Use @code{sdptool browse [dev-address]} to see if
the service is registered. There should be a service with the name of the
interface and GNUnet as provider.
3943
3944@item @code{hcitool} : another useful tool which can be used to configure
3945the device and to send some particular commands to it.
3946
3947@item @code{hcidump} : could be used for low level debugging
3948@end itemize
3949
3950@c FIXME: A more unique name
3951@node How do I configure my peer2?
3952@subsection How do I configure my peer2?
3953@c %**end of header
3954
3955On Linux, you just have to be sure that the interface name corresponds to
3956the one that you want to use. Use the @code{hciconfig} tool to check that.
3957By default it is set to hci0 but you can change it.
3958
3959A basic configuration looks like this:
3960
3961@example
3962[transport-bluetooth]
3963# Name of the interface (typically hciX)
3964INTERFACE = hci0
3965# Real hardware, no testing
TESTMODE = 0
TESTING_IGNORE_KEYS = ACCEPT_FROM;
3967@end example
3968
3969In order to use the Bluetooth transport plugin when the transport service
3970is started, you must add the plugin name to the default transport service
3971plugins list. For example:
3972
3973@example
[transport]
...
PLUGINS = dns bluetooth
...
3975@end example
3976
If you want to use only the Bluetooth plugin, set
@emph{PLUGINS = bluetooth}.

On Windows, you cannot specify which device to use. The only thing that
you should do is to add @emph{bluetooth} to the plugins list of the
transport service.
3983
3984@node How can I test it?
3985@subsection How can I test it?
3986@c %**end of header
3987
If you have two Bluetooth devices on the same Linux machine, you
must:
3990
3991@itemize @bullet
3992
@item create two different configuration files (one which will use the
first interface (@emph{hci0}) and the other which will use the second
interface (@emph{hci1})). Let's name them @emph{peer1.conf} and
@emph{peer2.conf}.

@item run @emph{gnunet-peerinfo -c peerX.conf -s} in order to generate the
peers' private keys. The @strong{X} must be replaced with 1 or 2.
4000
4001@item run @emph{gnunet-arm -c peerX.conf -s -i=transport} in order to
4002start the transport service. (Make sure that you have "bluetooth" on the
4003transport plugins list if the Bluetooth transport service doesn't start.)
4004
4005@item run @emph{gnunet-peerinfo -c peer1.conf -s} to get the first peer's
4006ID. If you already know your peer ID (you saved it from the first
4007command), this can be skipped.
4008
4009@item run @emph{gnunet-transport -c peer2.conf -p=PEER1_ID -s} to start
4010sending data for benchmarking to the other peer.
4011
4012@end itemize
4013
4014
4015This scenario will try to connect the second peer to the first one and
4016then start sending data for benchmarking.
4017
On Windows you cannot test the plugin functionality using two Bluetooth
devices on the same machine because after you install the drivers there
will be conflicts between the Bluetooth stacks. (At least that is
what happened on my machine: I wasn't able to use the Bluesoleil stack and
the WIDCOMM one at the same time.)

If you have two different machines and your configuration files are good
you can use the same scenario presented at the beginning of this section.
4026
4027Another way to test the plugin functionality is to create your own
4028application which will use the GNUnet framework with the Bluetooth
4029transport service.
4030
4031@node The implementation of the Bluetooth transport plugin
4032@subsection The implementation of the Bluetooth transport plugin
4033@c %**end of header
4034
4035This page describes the implementation of the Bluetooth transport plugin.
4036
First I want to remind you that the Bluetooth transport plugin uses
virtually the same code as the WLAN plugin and only the helper binary is
different. Also the scope of the helper binary from the Bluetooth
transport plugin is the same as the one used for the WLAN transport
plugin: it accesses the interface and then it forwards traffic in both
directions between the Bluetooth interface and stdin/stdout of the
process involved.

The Bluetooth transport plugin can be used on both Linux and Windows
platforms.
4047
4048@itemize @bullet
4049@item Linux functionality
4050@item Windows functionality
4051@item Pending Features
4052@end itemize
4053
4054
4055
4056@menu
4057* Linux functionality::
4058* THE INITIALIZATION::
4059* THE LOOP::
4060* Details about the broadcast implementation::
4061* Windows functionality::
4062* Pending features::
4063@end menu
4064
4065@node Linux functionality
4066@subsubsection Linux functionality
4067@c %**end of header
4068
4069In order to implement the plugin functionality on Linux I used the BlueZ
4070stack. For the communication with the other devices I used the RFCOMM
4071protocol. Also I used the HCI protocol to gain some control over the
device. The helper binary takes a single argument (the name of the
Bluetooth interface) and works in two stages:
4074
4075@c %** 'THE INITIALIZATION' should be in bigger letters or stand out, not
4076@c %** starting a new section?
4077@node THE INITIALIZATION
4078@subsubsection THE INITIALIZATION
4079
4080@itemize @bullet
@item first, it checks if we have root privileges
(@emph{Remember that we need to have root privileges in order to be able
to bring the interface up if it is down or to change its state.}).
4084
4085@item second, it verifies if the interface with the given name exists.
4086
4087@strong{If the interface with that name exists and it is a Bluetooth
4088interface:}
4089
@item it creates an RFCOMM socket which will be used for listening and
calls the @emph{open_device} method

In the @emph{open_device} method, it:
4094@itemize @bullet
@item creates an HCI socket used to send control events to the device
4096@item searches for the device ID using the interface name
4097@item saves the device MAC address
4098@item checks if the interface is down and tries to bring it UP
4099@item checks if the interface is in discoverable mode and tries to make it
4100discoverable
4101@item closes the HCI socket and binds the RFCOMM one
@item switches the RFCOMM socket to listening mode
@item registers the SDP service (the service will be used by the other
devices to get the port on which this device is listening)
4105@end itemize
4106
@item drops the root privileges
4108
4109@strong{If the interface is not a Bluetooth interface the helper exits
4110with a suitable error}
4111@end itemize
4112
4113@c %** Same as for @node entry above
4114@node THE LOOP
4115@subsubsection THE LOOP
4116
4117The helper binary uses a list where it saves all the connected neighbour
4118devices (@emph{neighbours.devices}) and two buffers (@emph{write_pout} and
@emph{write_std}). The first message which is sent is a control message
with the device's MAC address in order to announce the peer's presence to
the neighbours. Here is a short description of what happens in the main
loop:
4123
4124@itemize @bullet
@item Every time it receives something from STDIN it processes
the data and saves the message in the first buffer (@emph{write_pout}).
When it has something in the buffer, it gets the destination address from
the buffer, searches for the destination address in the list (if there is
no connection with that device, it creates a new one and saves it to the
list) and sends the message.
@item Every time it receives something on the listening socket it
accepts the connection and saves the socket in a list of reading
sockets.
@item Every time it receives something from a reading
socket it parses the message, verifies the CRC and saves it in the
@emph{write_std} buffer in order to be sent later to STDOUT.
4136@end itemize
4137
So in the main loop we use the select function to wait until one of the
file descriptors saved in one of the two file descriptor sets is
ready to use. The first set (@emph{rfds}) represents the reading set and
it could contain the list with the reading sockets, the STDIN file
descriptor or the listening socket. The second set (@emph{wfds}) is the
writing set and it could contain the sending socket or the STDOUT file
descriptor. After the select function returns, we check which file
descriptor is ready to use and we do what is supposed to be done for that
kind of event. @emph{For example:} if it is the listening socket then we
accept a new connection and save the socket in the reading list; if it is
the STDOUT file descriptor, then we write to STDOUT the message from the
@emph{write_std} buffer.
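
A minimal sketch of one such wait, using only POSIX @code{select}, could
look like this. It is a simplification, not the helper's actual loop: the
function names, the @code{enum ready} bitmask and the restriction to one
input descriptor, one socket and one output descriptor are assumptions
made for the example (the real helper also tracks a list of reading
sockets and a sending socket).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Bitmask describing which descriptors select() reported as ready. */
enum ready {
  READY_NONE = 0,
  READY_STDIN = 1,   /* data from the plugin is waiting on the input */
  READY_SOCKET = 2,  /* the (listening or reading) socket is readable */
  READY_STDOUT = 4   /* buffered output can be written */
};

/* Wait up to `timeout_ms` until one of the descriptors is usable.
 * `in_fd` and `sock_fd` go into the reading set (rfds); `out_fd` goes
 * into the writing set (wfds) only while there is pending output.
 * Returns a bitmask of `enum ready` values, or -1 on error. */
static int wait_ready(int in_fd, int sock_fd, int out_fd, int have_output,
                      int timeout_ms)
{
  fd_set rfds, wfds;
  struct timeval tv;
  int maxfd = in_fd > sock_fd ? in_fd : sock_fd;
  int result = READY_NONE;

  assert(in_fd >= 0 && sock_fd >= 0 && out_fd >= 0);
  tv.tv_sec = timeout_ms / 1000;
  tv.tv_usec = (timeout_ms % 1000) * 1000;
  FD_ZERO(&rfds);
  FD_ZERO(&wfds);
  FD_SET(in_fd, &rfds);
  FD_SET(sock_fd, &rfds);
  if (have_output) {
    FD_SET(out_fd, &wfds);
    if (out_fd > maxfd)
      maxfd = out_fd;
  }
  if (select(maxfd + 1, &rfds, &wfds, NULL, &tv) < 0)
    return -1;
  if (FD_ISSET(in_fd, &rfds))
    result |= READY_STDIN;
  if (FD_ISSET(sock_fd, &rfds))
    result |= READY_SOCKET;
  if (have_output && FD_ISSET(out_fd, &wfds))
    result |= READY_STDOUT;
  return result;
}

/* Convenience wrapper for the descriptors a helper process normally uses. */
static int wait_helper_ready(int sock_fd, int have_output, int timeout_ms)
{
  return wait_ready(STDIN_FILENO, sock_fd, STDOUT_FILENO, have_output,
                    timeout_ms);
}
```

Adding the output descriptor to the writing set only while something is
buffered avoids waking up constantly, since an idle output is almost always
writable.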
4150
To find out on which port a device is listening we connect to the local
SDP server and search for the registered service for that device.
4153
4154@emph{You should be aware of the fact that if the device fails to connect
4155to another one when trying to send a message it will attempt one more
4156time. If it fails again, then it skips the message.}
@emph{Also you should know that the Bluetooth transport plugin has
support for @strong{broadcast messages}.}
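
The retry rule mentioned above (one more attempt, then skip the message)
can be written down as a tiny C sketch. The callback type
@code{connect_fn}, the function @code{connect_or_skip} and the constant
@code{MAX_ATTEMPTS} are hypothetical names chosen for illustration; the
helper itself connects RFCOMM sockets rather than calling a callback.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ATTEMPTS 2  /* the first try plus one retry */

/* Hypothetical connect callback: returns 0 on success, -1 on failure. */
typedef int (*connect_fn)(void *ctx);

/* Try to connect up to MAX_ATTEMPTS times.
 * Returns 1 if the message can be sent, 0 if it has to be skipped. */
static int connect_or_skip(connect_fn try_connect, void *ctx)
{
  assert(try_connect != NULL);
  for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++)
    if (try_connect(ctx) == 0)
      return 1;
  return 0;
}

/* Example callback: fails as many times as the counter in `ctx` says. */
static int fail_n_times(void *ctx)
{
  int *remaining = ctx;
  if (*remaining > 0) {
    (*remaining)--;
    return -1;
  }
  return 0;
}
```

With a peer that fails once, the second attempt succeeds and the message is
sent; with a peer that fails twice, the message is dropped.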
4159
4160@node Details about the broadcast implementation
4161@subsubsection Details about the broadcast implementation
4162@c %**end of header
4163
4164First I want to point out that the broadcast functionality for the CONTROL
messages is not implemented in a conventional way. Since an inquiry scan
takes a long time and it would take a while to send a message to all the
discoverable devices, I decided to tackle the problem in a different way.
4168Here is how I did it:
4169
4170@itemize @bullet
@item If it is the first time I have to broadcast a message I make an
inquiry scan and save all the devices' addresses to a vector.
@item After the inquiry scan ends I take the first address from the list
and I try to connect to it. If it fails, I try to connect to the next one.
If it succeeds, I save the socket to a list and send the message to the
device.
@item When I have to broadcast another message, first I search the list
for a new device which I'm not connected to. If there is no new device on
the list I go to the beginning of the list and send the message to the
old devices. After 5 cycles I make a new inquiry scan to check if
there are new discoverable devices and save them to the list. If there
are no new discoverable devices I reset the cycling counter and go again
through the old list and send messages to the devices saved in it.
4184@end itemize
4185
4186@strong{Therefore}:
4187
4188@itemize @bullet
@item every time I have a broadcast message I look in the list
for a new device and send the message to it
@item if I have reached the end of the list 5 times and I'm connected to
all the devices from the list I make a new inquiry scan.
@emph{The number of the list's cycles after an inquiry scan could be
increased by redefining the MAX_LOOPS variable}
@item when there are no new devices I send messages to the old ones.
4196@end itemize
4197
Doing so, the broadcast control messages will reach the devices, but with
a delay.
4200
@emph{NOTICE:} When I have to send a message to a certain device, first I
check the broadcast list to see if I am connected to that device. If
not, I try to connect to it and in case of success I save the address and
the socket on the list. If I am already connected to that device I
simply use the socket.
4206
4207@node Windows functionality
4208@subsubsection Windows functionality
4209@c %**end of header
4210
For Windows I decided to use the Microsoft Bluetooth stack which has the
advantage of coming standard from Windows XP SP2. The main disadvantage is
that it only supports the RFCOMM protocol so we will not be able to have
low-level control over the Bluetooth device. Therefore it is the user's
responsibility to check if the device is up and in discoverable mode.
Also there are no tools which could be used for debugging in order to read
the data coming from and going to a Bluetooth device, which obviously
hindered my work. Another thing that slowed down the implementation of the
plugin (besides that I wasn't too familiar with the win32 API) was that
there were some bugs in MinGW regarding Bluetooth. Now they are solved
but you should keep in mind that you should have the latest updates
(especially the @emph{ws2bth} header).
4223
4224Besides the fact that it uses the Windows Sockets, the Windows
implementation follows the same principles as the Linux one:
4226
4227@itemize @bullet
@item It has an initialization part where it initializes
Windows Sockets, creates an RFCOMM socket which is bound and switched
to listening mode, and registers an SDP service. In the Microsoft
Bluetooth API there are two ways to work with SDP:
4232@itemize @bullet
4233@item an easy way which works with very simple service records
4234@item a hard way which is useful when you need to update or to delete the
4235record
4236@end itemize
4237@end itemize
4238
Since I only needed the SDP service to find out which port the device
is listening on, and that does not change, I decided to use the easy way.
To register the service I used the @emph{WSASetService} function,
and I generated the @emph{Universally Unique Identifier} with the
Windows @emph{guidgen.exe} tool.
4244
4245In the loop section the only difference from the Linux implementation is
4246that I used the GNUNET_NETWORK library for functions like @emph{accept},
4247@emph{bind}, @emph{connect} or @emph{select}. I decided to use the
4248GNUNET_NETWORK library because I also needed to interact with the STDIN
4249and STDOUT handles and on Windows the select function is only defined for
4250sockets, and it will not work for arbitrary file handles.
4251
Another difference between the Linux and Windows implementations is that
on Linux the Bluetooth address is represented in 48 bits, while on Windows
it is represented in 64 bits. I therefore had to make some changes to the
@emph{plugin_transport_wlan} header.
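
The difference in address width can be illustrated with a small
conversion sketch. It assumes that the 48-bit form is stored as six bytes
with the least significant byte first, and that the 64-bit form carries
the address in its lower 48 bits with the upper 16 bits set to zero. The
names @code{bt_addr48}, @code{addr48_to_u64} and @code{u64_to_addr48} are
hypothetical and are not taken from the plugin.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 48-bit address: six bytes, least significant byte first
   (an assumption of this sketch). */
struct bt_addr48 {
  uint8_t b[6];
};

/* Pack the six bytes into the lower 48 bits of a 64-bit integer. */
uint64_t addr48_to_u64(const struct bt_addr48 *a)
{
  uint64_t v = 0;
  for (int i = 5; i >= 0; i--)
    v = (v << 8) | a->b[i];
  return v;
}

/* Unpack the lower 48 bits again; the caller must pass a value whose
   upper 16 bits are zero. */
struct bt_addr48 u64_to_addr48(uint64_t v)
{
  struct bt_addr48 a;
  assert((v >> 48) == 0);
  for (int i = 0; i < 6; i++) {
    a.b[i] = (uint8_t)(v & 0xff);
    v >>= 8;
  }
  return a;
}
```

Keeping the conversion in one place means the rest of the code can work
with a single address representation on both platforms.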
4256
4257Also, currently on Windows the Bluetooth plugin doesn't have support for
4258broadcast messages. When it receives a broadcast message it will skip it.
4259
4260@node Pending features
4261@subsubsection Pending features
4262@c %**end of header
4263
4264@itemize @bullet
4265@item Implement the broadcast functionality on Windows @emph{(currently
4266working on)}
@item Implement a testcase for the helper: @emph{The testcase
consists of a program which emulates the plugin and uses the helper. It
will simulate connections, disconnections and data transfers.}
4270@end itemize
4271
4272If you have a new idea about a feature of the plugin or suggestions about
4273how I could improve the implementation you are welcome to comment or to
4274contact me.
4275
4276@node WLAN plugin
4277@section WLAN plugin
4278@c %**end of header
4279
4280This section documents how the wlan transport plugin works. Parts which
4281are not implemented yet or could be better implemented are described at
4282the end.
4283
4284@cindex ats subsystem
4285@node The ATS Subsystem
4286@section The ATS Subsystem
4287@c %**end of header
4288
ATS stands for "automatic transport selection". The function of ATS in
GNUnet is to decide which address (and thus transport plugin) should
be used for two peers to communicate, and what bandwidth limits should be
imposed on each individual connection. To help ATS make an informed
decision, higher-level services inform the ATS service about their
requirements and the quality of the service rendered. The ATS service
also interacts with the transport service to be apprised of working
addresses and to communicate its resource allocation decisions. Finally,
the ATS service's operation can be observed using a monitoring API.
4298
The main logic of the ATS service only collects the available addresses,
their performance characteristics and the applications' requirements, but
does not make the actual allocation decision. This last critical step is
left to an ATS plugin. We have implemented (currently three) different
allocation strategies which differ significantly in their performance and
maturity, and it is still unclear whether any particular plugin is
generally superior.
4306
4307@cindex core subsystem
4308@cindex CORE subsystem
4309@node GNUnet's CORE Subsystem
4310@section GNUnet's CORE Subsystem
4311@c %**end of header
4312
4313The CORE subsystem in GNUnet is responsible for securing link-layer
4314communications between nodes in the GNUnet overlay network. CORE builds
4315on the TRANSPORT subsystem which provides for the actual, insecure,
4316unreliable link-layer communication (for example, via UDP or WLAN), and
4317then adds fundamental security to the connections:
4318
4319@itemize @bullet
4320@item confidentiality with so-called perfect forward secrecy; we use
4321ECDHE@footnote{@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman, Elliptic-curve Diffie---Hellman}}
4322powered by Curve25519
4323@footnote{@uref{http://cr.yp.to/ecdh.html, Curve25519}} for the key
4324exchange and then use symmetric encryption, encrypting with both AES-256
4325@footnote{@uref{http://en.wikipedia.org/wiki/Rijndael, AES-256}} and
4326Twofish @footnote{@uref{http://en.wikipedia.org/wiki/Twofish, Twofish}}
4327@item @uref{http://en.wikipedia.org/wiki/Authentication, authentication}
4328is achieved by signing the ephemeral keys using Ed25519
4329@footnote{@uref{http://ed25519.cr.yp.to/, Ed25519}}, a deterministic
4330variant of ECDSA
4331@footnote{@uref{http://en.wikipedia.org/wiki/ECDSA, ECDSA}}
4332@item integrity protection (using SHA-512
4333@footnote{@uref{http://en.wikipedia.org/wiki/SHA-2, SHA-512}} to do
4334encrypt-then-MAC
4335@footnote{@uref{http://en.wikipedia.org/wiki/Authenticated_encryption, encrypt-then-MAC}})
@item replay
4337@footnote{@uref{http://en.wikipedia.org/wiki/Replay_attack, replay}}
4338protection (using nonces, timestamps, challenge-response,
4339message counters and ephemeral keys)
4340@item liveness (keep-alive messages, timeout)
4341@end itemize
4342
4343@menu
4344* Limitations::
4345* When is a peer "connected"?::
4346* libgnunetcore::
4347* The CORE Client-Service Protocol::
4348* The CORE Peer-to-Peer Protocol::
4349@end menu
4350
4351@cindex core subsystem limitations
4352@node Limitations
4353@subsection Limitations
4354@c %**end of header
4355
4356CORE does not perform
4357@uref{http://en.wikipedia.org/wiki/Routing, routing}; using CORE it is
4358only possible to communicate with peers that happen to already be
4359"directly" connected with each other. CORE also does not have an
4360API to allow applications to establish such "direct" connections --- for
4361this, applications can ask TRANSPORT, but TRANSPORT might not be able to
4362establish a "direct" connection. The TOPOLOGY subsystem is responsible for
4363trying to keep a few "direct" connections open at all times. Applications
4364that need to talk to particular peers should use the CADET subsystem, as
4365it can establish arbitrary "indirect" connections.
4366
4367Because CORE does not perform routing, CORE must only be used directly by
4368applications that either perform their own routing logic (such as
4369anonymous file-sharing) or that do not require routing, for example
4370because they are based on flooding the network. CORE communication is
4371unreliable and delivery is possibly out-of-order. Applications that
4372require reliable communication should use the CADET service. Each
4373application can only queue one message per target peer with the CORE
4374service at any time; messages cannot be larger than approximately
437563 kilobytes. If messages are small, CORE may group multiple messages
4376(possibly from different applications) prior to encryption. If permitted
4377by the application (using the @uref{http://baus.net/on-tcp_cork/, cork}
4378option), CORE may delay transmissions to facilitate grouping of multiple
4379small messages. If cork is not enabled, CORE will transmit the message as
4380soon as TRANSPORT allows it (TRANSPORT is responsible for limiting
4381bandwidth and congestion control). CORE does not allow flow control;
4382applications are expected to process messages at line-speed. If flow
4383control is needed, applications should use the CADET service.
4384
4385@cindex when is a peer connected
4386@node When is a peer "connected"?
4387@subsection When is a peer "connected"?
4388@c %**end of header
4389
4390In addition to the security features mentioned above, CORE also provides
4391one additional key feature to applications using it, and that is a
4392limited form of protocol-compatibility checking. CORE distinguishes
4393between TRANSPORT-level connections (which enable communication with other
4394peers) and application-level connections. Applications using the CORE API
4395will (typically) learn about application-level connections from CORE, and
4396not about TRANSPORT-level connections. When a typical application uses
4397CORE, it will specify a set of message types
4398(from @code{gnunet_protocols.h}) that it understands. CORE will then
4399notify the application about connections it has with other peers if and
4400only if those applications registered an intersecting set of message
4401types with their CORE service. Thus, it is quite possible that CORE only
4402exposes a subset of the established direct connections to a particular
4403application --- and different applications running above CORE might see
4404different sets of connections at the same time.
4405
Applications that do not register a handler for any message type are a
special case.
4408CORE assumes that these applications merely want to monitor connections
4409(or "all" messages via other callbacks) and will notify those applications
4410about all connections. This is used, for example, by the
4411@code{gnunet-core} command-line tool to display the active connections.
4412Note that it is also possible that the TRANSPORT service has more active
4413connections than the CORE service, as the CORE service first has to
4414perform a key exchange with connecting peers before exchanging information
4415about supported message types and notifying applications about the new
4416connection.
4417
4418@cindex libgnunetcore
4419@node libgnunetcore
4420@subsection libgnunetcore
4421@c %**end of header
4422
4423The CORE API (defined in @file{gnunet_core_service.h}) is the basic
4424messaging API used by P2P applications built using GNUnet. It provides
4425applications the ability to send and receive encrypted messages to the
4426peer's "directly" connected neighbours.
4427
4428As CORE connections are generally "direct" connections,@ applications must
4429not assume that they can connect to arbitrary peers this way, as "direct"
4430connections may not always be possible. Applications using CORE are
4431notified about which peers are connected. Creating new "direct"
4432connections must be done using the TRANSPORT API.
4433
The CORE API provides unreliable, out-of-order delivery. While the
implementation tries to ensure timely, in-order delivery, neither message
loss nor reordering is detected, and both must be tolerated by the
application. Most importantly, CORE will NOT retransmit messages that
could not be delivered.
4439
Note that CORE allows applications to queue one message per connected
peer. The rate at which each connection operates is influenced by the
preferences expressed by local applications as well as restrictions
imposed by the other peer. Local applications can express their
preferences for particular connections using the "performance" API of the
ATS service.
4446
Applications that require more sophisticated transmission capabilities,
such as TCP-like behavior, or that need to send messages to arbitrary
remote peers, should use the CADET API.
4450
4451The typical use of the CORE API is to connect to the CORE service using
4452@code{GNUNET_CORE_connect}, process events from the CORE service (such as
4453peers connecting, peers disconnecting and incoming messages) and send
4454messages to connected peers using
4455@code{GNUNET_CORE_notify_transmit_ready}. Note that applications must
4456cancel pending transmission requests if they receive a disconnect event
4457for a peer that had a transmission pending; furthermore, queueing more
4458than one transmission request per peer per application using the
4459service is not permitted.
4460
4461The CORE API also allows applications to monitor all communications of the
4462peer prior to encryption (for outgoing messages) or after decryption (for
4463incoming messages). This can be useful for debugging, diagnostics or to
4464establish the presence of cover traffic (for anonymity). As monitoring
4465applications are often not interested in the payload, the monitoring
4466callbacks can be configured to only provide the message headers (including
4467the message type and size) instead of copying the full data stream to the
4468monitoring client.
4469
4470The init callback of the @code{GNUNET_CORE_connect} function is called
4471with the hash of the public key of the peer. This public key is used to
4472identify the peer globally in the GNUnet network. Applications are
4473encouraged to check that the provided hash matches the hash that they are
4474using (as theoretically the application may be using a different
4475configuration file with a different private key, which would result in
4476hard to find bugs).
4477
4478As with most service APIs, the CORE API isolates applications from crashes
4479of the CORE service. If the CORE service crashes, the application will see
4480disconnect events for all existing connections. Once the connections are
re-established, the applications will receive matching connect events.
4482
@cindex core client-service protocol
4484@node The CORE Client-Service Protocol
4485@subsection The CORE Client-Service Protocol
4486@c %**end of header
4487
4488This section describes the protocol between an application using the CORE
4489service (the client) and the CORE service process itself.
4490
4491
4492@menu
4493* Setup2::
4494* Notifications::
4495* Sending::
4496@end menu
4497
4498@node Setup2
4499@subsubsection Setup2
4500@c %**end of header
4501
When a client connects to the CORE service, it first sends an
@code{InitMessage} which specifies options for the connection and a set of
4504message type values which are supported by the application. The options
4505bitmask specifies which events the client would like to be notified about.
4506The options include:
4507
@table @code
@item GNUNET_CORE_OPTION_NOTHING
No notifications
@item GNUNET_CORE_OPTION_STATUS_CHANGE
Peers connecting and disconnecting
@item GNUNET_CORE_OPTION_FULL_INBOUND
All inbound messages (after decryption) with full payload
@item GNUNET_CORE_OPTION_HDR_INBOUND
Just the @code{MessageHeader} of all inbound messages
@item GNUNET_CORE_OPTION_FULL_OUTBOUND
All outbound messages (prior to encryption) with full payload
@item GNUNET_CORE_OPTION_HDR_OUTBOUND
Just the @code{MessageHeader} of all outbound messages
@end table
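
How a service might interpret such an options bitmask can be sketched as
follows. The flag names and numeric values below are illustrative
assumptions chosen for this example; the real constants are defined in
the CORE headers and may differ. The helper names @code{wants_inbound}
and @code{inbound_copy_size} are hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values only; not the real CORE constants. */
enum core_option {
  OPT_NOTHING       = 0,
  OPT_STATUS_CHANGE = 1 << 0,
  OPT_FULL_INBOUND  = 1 << 1,
  OPT_HDR_INBOUND   = 1 << 2,
  OPT_FULL_OUTBOUND = 1 << 3,
  OPT_HDR_OUTBOUND  = 1 << 4
};

/* Returns 1 if the client monitors inbound traffic in any form. */
int wants_inbound(uint32_t options)
{
  return (options & (OPT_FULL_INBOUND | OPT_HDR_INBOUND)) != 0;
}

/* Number of bytes of an inbound message to copy to a monitoring client:
   the full message, only the header, or nothing. */
size_t inbound_copy_size(uint32_t options, size_t msg_size, size_t hdr_size)
{
  assert(hdr_size <= msg_size);
  if (options & OPT_FULL_INBOUND)
    return msg_size;
  if (options & OPT_HDR_INBOUND)
    return hdr_size;
  return 0;
}
```

A client that only cares about connection status would pass the
status-change flag alone, in which case no traffic is copied to it.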
4520
4521Typical applications will only monitor for connection status changes.
4522
4523The CORE service responds to the @code{InitMessage} with an
4524@code{InitReplyMessage} which contains the peer's identity. Afterwards,
4525both CORE and the client can send messages.
4526
4527@node Notifications
4528@subsubsection Notifications
4529@c %**end of header
4530
4531The CORE will send @code{ConnectNotifyMessage}s and
4532@code{DisconnectNotifyMessage}s whenever peers connect or disconnect from
4533the CORE (assuming their type maps overlap with the message types
4534registered by the client). When the CORE receives a message that matches
the set of message types specified during the @code{InitMessage} (or if
monitoring for inbound messages is enabled in the options), it sends a
4537@code{NotifyTrafficMessage} with the peer identity of the sender and the
4538decrypted payload. The same message format (except with
4539@code{GNUNET_MESSAGE_TYPE_CORE_NOTIFY_OUTBOUND} for the message type) is
4540used to notify clients monitoring outbound messages; here, the peer
4541identity given is that of the receiver.
4542
4543@node Sending
4544@subsubsection Sending
4545@c %**end of header
4546
4547When a client wants to transmit a message, it first requests a
4548transmission slot by sending a @code{SendMessageRequest} which specifies
4549the priority, deadline and size of the message. Note that these values
4550may be ignored by CORE. When CORE is ready for the message, it answers
4551with a @code{SendMessageReady} response. The client can then transmit the
4552payload with a @code{SendMessage} message. Note that the actual message
4553size in the @code{SendMessage} is allowed to be smaller than the size in
the original request. A client may at any time send a fresh
@code{SendMessageRequest}, which supersedes the previous
@code{SendMessageRequest}; the previous request is then no longer valid.
The client can tell which @code{SendMessageRequest} a
@code{SendMessageReady} message from the CORE service refers to, as all of
these messages contain a "unique" request ID (based on a counter
incremented by the client for each request).
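
The request ID bookkeeping on the client side can be sketched as follows.
This is a minimal illustration, not the library's implementation: the
@code{send_state} structure, the function names and the 16-bit width of
the ID are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical client-side state for one target peer. */
struct send_state {
  uint16_t next_id;      /* counter used to label requests */
  uint16_t pending_id;   /* ID of the only request that is still valid */
  size_t   pending_size; /* size announced in that request */
  int      has_pending;  /* 1 while a request is outstanding */
};

/* Issue a fresh request. Any earlier request is superseded. Returns the
   ID that would be placed in the SendMessageRequest. */
uint16_t send_request(struct send_state *s, size_t size)
{
  s->pending_id = s->next_id++;
  s->pending_size = size;
  s->has_pending = 1;
  return s->pending_id;
}

/* Handle a SendMessageReady carrying `id`. Returns 1 if it matches the
   current request, so the payload may now be sent; returns 0 if it
   refers to a superseded request and must be ignored. The caller must
   not send more bytes than it announced. */
int send_ready(struct send_state *s, uint16_t id, size_t actual_size)
{
  if (!s->has_pending || id != s->pending_id)
    return 0;
  assert(actual_size <= s->pending_size);
  s->has_pending = 0;
  return 1;
}
```

Because only the most recent ID is remembered, a late
@code{SendMessageReady} for an older request is dropped without any
further state.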
4561
4562@cindex CORE Peer-to-Peer Protocol
4563@node The CORE Peer-to-Peer Protocol
4564@subsection The CORE Peer-to-Peer Protocol
4565@c %**end of header
4566
4567
4568@menu
4569* Creating the EphemeralKeyMessage::
4570* Establishing a connection::
4571* Encryption and Decryption::
4572* Type maps::
4573@end menu
4574
4575@cindex EphemeralKeyMessage creation
4576@node Creating the EphemeralKeyMessage
4577@subsubsection Creating the EphemeralKeyMessage
4578@c %**end of header
4579
4580When the CORE service starts, each peer creates a fresh ephemeral (ECC)
4581public-private key pair and signs the corresponding
@code{EphemeralKeyMessage} with its long-term key (which we usually call
the peer's identity; the hash of the public long-term key is what results
in a @code{struct GNUNET_PeerIdentity} in all GNUnet APIs). The ephemeral
4585key is ONLY used for an ECDHE@footnote{@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman, Elliptic-curve Diffie---Hellman}}
4586exchange by the CORE service to establish symmetric session keys. A peer
4587will use the same @code{EphemeralKeyMessage} for all peers for
4588@code{REKEY_FREQUENCY}, which is usually 12 hours. After that time, it
4589will create a fresh ephemeral key (forgetting the old one) and broadcast
4590the new @code{EphemeralKeyMessage} to all connected peers, resulting in
4591fresh symmetric session keys. Note that peers independently decide on
4592when to discard ephemeral keys; it is not a protocol violation to discard
4593keys more often. Ephemeral keys are also never stored to disk; restarting
4594a peer will thus always create a fresh ephemeral key. The use of ephemeral
4595keys is what provides @uref{http://en.wikipedia.org/wiki/Forward_secrecy, forward secrecy}.
4596
4597Just before transmission, the @code{EphemeralKeyMessage} is patched to
4598reflect the current sender_status, which specifies the current state of
4599the connection from the point of view of the sender. The possible values
4600are:
4601
4602@itemize @bullet
4603@item @code{KX_STATE_DOWN} Initial value, never used on the network
4604@item @code{KX_STATE_KEY_SENT} We sent our ephemeral key, do not know the
4605key of the other peer
@item @code{KX_STATE_KEY_RECEIVED} This peer has received a valid
ephemeral key of the other peer, but we are waiting for the other peer to
confirm its authenticity (ability to decode) via challenge-response.
4609@item @code{KX_STATE_UP} The connection is fully up from the point of
4610view of the sender (now performing keep-alives)
4611@item @code{KX_STATE_REKEY_SENT} The sender has initiated a rekeying
4612operation; the other peer has so far failed to confirm a working
4613connection using the new ephemeral key
4614@end itemize
4615
4616@node Establishing a connection
4617@subsubsection Establishing a connection
4618@c %**end of header
4619
Peers begin their interaction by sending an @code{EphemeralKeyMessage} to
the other peer once the TRANSPORT service notifies the CORE service about
the connection.
When a peer receives an @code{EphemeralKeyMessage} with a status
indicating that the sender does not have the receiver's ephemeral key, it
sends its own @code{EphemeralKeyMessage} in response.
Additionally, if the receiver has not yet confirmed the authenticity of
the sender, it also sends an (encrypted) @code{PingMessage} with a
challenge (and the identity of the target) to the other peer. Peers
receiving a @code{PingMessage} respond with an (encrypted)
@code{PongMessage} which includes the challenge. Peers receiving a
@code{PongMessage} check the challenge and, if it matches, set the
connection to @code{KX_STATE_UP}.
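
The handshake described above can be read as a small state machine. The
following sketch is one simplified interpretation of the text, not the
logic of @file{gnunet-service-core_kx.c}: the event names, the functions
@code{kx_next} and @code{must_reply_with_key}, and the numeric values of
the states are assumptions, and cases such as a peer rekeying while the
connection is up are left out.

```c
#include <assert.h>

/* State names follow the text; their numeric values are assumed. */
enum kx_state {
  KX_STATE_DOWN,
  KX_STATE_KEY_SENT,
  KX_STATE_KEY_RECEIVED,
  KX_STATE_UP,
  KX_STATE_REKEY_SENT
};

/* Hypothetical events driving the handshake. */
enum kx_event {
  EV_TRANSPORT_CONNECTED, /* TRANSPORT reported a new connection */
  EV_EPHEMERAL_KEY,       /* valid EphemeralKeyMessage received */
  EV_PONG_OK,             /* PongMessage with the matching challenge */
  EV_REKEY                /* local ephemeral key was replaced */
};

/* Compute the next state; events that do not apply leave it unchanged. */
enum kx_state kx_next(enum kx_state s, enum kx_event e)
{
  assert(s <= KX_STATE_REKEY_SENT);
  switch (e) {
  case EV_TRANSPORT_CONNECTED:
    return (s == KX_STATE_DOWN) ? KX_STATE_KEY_SENT : s;
  case EV_EPHEMERAL_KEY:
    return (s == KX_STATE_KEY_SENT) ? KX_STATE_KEY_RECEIVED : s;
  case EV_PONG_OK:
    return (s == KX_STATE_KEY_RECEIVED || s == KX_STATE_REKEY_SENT)
             ? KX_STATE_UP : s;
  case EV_REKEY:
    return (s == KX_STATE_UP) ? KX_STATE_REKEY_SENT : s;
  }
  return s;
}

/* A receiver answers with its own key when the sender's status says
   that the sender does not have it yet. */
int must_reply_with_key(enum kx_state sender_status)
{
  return sender_status == KX_STATE_KEY_SENT;
}
```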
4633
4634@node Encryption and Decryption
4635@subsubsection Encryption and Decryption
4636@c %**end of header
4637
4638All functions related to the key exchange and encryption/decryption of
4639messages can be found in @file{gnunet-service-core_kx.c} (except for the
4640cryptographic primitives, which are in @file{util/crypto*.c}).
Given the key material from ECDHE, a key derivation function
@footnote{@uref{https://en.wikipedia.org/wiki/Key_derivation_function, Key derivation function}}
is used to derive two pairs of encryption and decryption keys for AES-256
and Twofish, as well as initialization vectors and authentication keys
4645(for HMAC@footnote{@uref{https://en.wikipedia.org/wiki/HMAC, HMAC}}).
4646The HMAC is computed over the encrypted payload.
4647Encrypted messages include an iv_seed and the HMAC in the header.
4648
4649Each encrypted message in the CORE service includes a sequence number and
4650a timestamp in the encrypted payload. The CORE service remembers the
4651largest observed sequence number and a bit-mask which represents which of
4652the previous 32 sequence numbers were already used.
Messages with sequence numbers lower than the largest observed sequence
number minus 32 are discarded. Messages with a timestamp that is more
than @code{REKEY_TOLERANCE} (5 minutes) off are also discarded. This of
4656course means that system clocks need to be reasonably synchronized for
4657peers to be able to communicate. Additionally, as the ephemeral key
4658changes every 12 hours, a peer would not even be able to decrypt messages
4659older than 12 hours.
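
The sequence number check can be sketched as a sliding window over the
last 32 sequence numbers. This is a minimal illustration under stated
assumptions, not the code of the CORE service: the names
@code{replay_window} and @code{replay_accept} are hypothetical, sequence
number wrap-around is ignored, and a zero-initialized window treats
sequence number 0 as already used, so senders are assumed to start at 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical replay window. */
struct replay_window {
  uint32_t last_seq;  /* largest sequence number observed so far */
  uint32_t seen_mask; /* bit i set: (last_seq - 1 - i) was already used */
};

/* Returns 1 and records `seq` if the message is acceptable; returns 0
   if it is a duplicate or older than the 32-entry window. */
int replay_accept(struct replay_window *w, uint32_t seq)
{
  assert(w != NULL);
  if (seq == w->last_seq)
    return 0; /* duplicate of the newest message */
  if (seq > w->last_seq) {
    uint32_t shift = seq - w->last_seq;
    if (shift > 32)
      w->seen_mask = 0;          /* everything old left the window */
    else if (shift == 32)
      w->seen_mask = 1u << 31;   /* only the old last_seq remains */
    else
      w->seen_mask = (w->seen_mask << shift) | (1u << (shift - 1));
    w->last_seq = seq;
    return 1;
  }
  uint32_t age = w->last_seq - seq; /* at least 1 here */
  if (age > 32)
    return 0; /* lower than last_seq minus 32 */
  uint32_t bit = 1u << (age - 1);
  if (w->seen_mask & bit)
    return 0; /* already used */
  w->seen_mask |= bit;
  return 1;
}
```

The window tolerates reordering of up to 32 messages while still
rejecting every duplicate inside that range.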
4660
4661@node Type maps
4662@subsubsection Type maps
4663@c %**end of header
4664
4665Once an encrypted connection has been established, peers begin to exchange
4666type maps. Type maps are used to allow the CORE service to determine which
4667(encrypted) connections should be shown to which applications. A type map
4668is an array of 65536 bits representing the different types of messages
4669understood by applications using the CORE service. Each CORE service
4670maintains this map, simply by setting the respective bit for each message
4671type supported by any of the applications using the CORE service. Note
that bits for message types embedded in higher-level protocols (such as
CADET) will not be included in these type maps.
4674
4675Typically, the type map of a peer will be sparse. Thus, the CORE service
4676attempts to compress its type map using @code{gzip}-style compression
4677("deflate") prior to transmission. However, if the compression fails to
4678compact the map, the map may also be transmitted without compression
4679(resulting in @code{GNUNET_MESSAGE_TYPE_CORE_COMPRESSED_TYPE_MAP} or
4680@code{GNUNET_MESSAGE_TYPE_CORE_BINARY_TYPE_MAP} messages respectively).
4681Upon receiving a type map, the respective CORE service notifies
4682applications about the connection to the other peer if they support any
4683message type indicated in the type map (or no message type at all).
If the CORE service experiences a connect or disconnect event from an
application, it updates its type map (setting or unsetting the respective
bits) and notifies its neighbours about the change.
4687The CORE services of the neighbours then in turn generate connect and
4688disconnect events for the peer that sent the type map for their respective
4689applications. As CORE messages may be lost, the CORE service confirms
4690receiving a type map by sending back a
4691@code{GNUNET_MESSAGE_TYPE_CORE_CONFIRM_TYPE_MAP}. If such a confirmation
4692(with the correct hash of the type map) is not received, the sender will
4693retransmit the type map (with exponential back-off).
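
A type map and the notification rule can be sketched as a plain bit
array. The structure and function names below (@code{type_map},
@code{type_map_set}, @code{type_map_test}, @code{should_notify}) are
hypothetical and only illustrate the idea; compression and the
confirmation exchange are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TYPE_MAP_BITS 65536

/* One bit per possible 16-bit message type. */
struct type_map {
  uint32_t bits[TYPE_MAP_BITS / 32];
};

/* Mark a message type as supported. */
void type_map_set(struct type_map *m, uint16_t type)
{
  m->bits[type / 32] |= 1u << (type % 32);
}

/* Returns 1 if the message type is marked as supported, else 0. */
int type_map_test(const struct type_map *m, uint16_t type)
{
  return (int)((m->bits[type / 32] >> (type % 32)) & 1u);
}

/* Decide whether a local application should learn about a peer: yes if
   it registered no types at all (pure monitoring), or if at least one of
   its types appears in the peer's type map. */
int should_notify(const struct type_map *peer,
                  const uint16_t *app_types, size_t n)
{
  assert(n == 0 || app_types != NULL);
  if (n == 0)
    return 1;
  for (size_t i = 0; i < n; i++)
    if (type_map_test(peer, app_types[i]))
      return 1;
  return 0;
}
```

The full map occupies 8192 bytes, which is why a sparse map is worth
compressing before transmission.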
4694
4695@cindex cadet subsystem
4696@cindex CADET
4697@node GNUnet's CADET subsystem
4698@section GNUnet's CADET subsystem
4699
4700The CADET subsystem in GNUnet is responsible for secure end-to-end
4701communications between nodes in the GNUnet overlay network. CADET builds
4702on the CORE subsystem which provides for the link-layer communication and
4703then adds routing, forwarding and additional security to the connections.
4704CADET offers the same cryptographic services as CORE, but on an
4705end-to-end level. This is done so peers retransmitting traffic on behalf
4706of other peers cannot access the payload data.
4707
4708@itemize @bullet
4709@item CADET provides confidentiality with so-called perfect forward
4710secrecy; we use ECDHE powered by Curve25519 for the key exchange and then
4711use symmetric encryption, encrypting with both AES-256 and Twofish
4712@item authentication is achieved by signing the ephemeral keys using
4713Ed25519, a deterministic variant of ECDSA
4714@item integrity protection (using SHA-512 to do encrypt-then-MAC, although
4715only 256 bits are sent to reduce overhead)
4716@item replay protection (using nonces, timestamps, challenge-response,
4717message counters and ephemeral keys)
4718@item liveness (keep-alive messages, timeout)
4719@end itemize
4720
In addition to the CORE-like security benefits, CADET offers other
properties that make it a more universal service than CORE.
4723
4724@itemize @bullet
4725@item CADET can establish channels to arbitrary peers in GNUnet. If a
4726peer is not immediately reachable, CADET will find a path through the
4727network and ask other peers to retransmit the traffic on its behalf.
4728@item CADET offers (optional) reliability mechanisms. In a reliable
4729channel traffic is guaranteed to arrive complete, unchanged and in-order.
4730@item CADET takes care of flow and congestion control mechanisms, not
4731allowing the sender to send more traffic than the receiver or the network
4732are able to process.
4733@end itemize
4734
4735@menu
4736* libgnunetcadet::
4737@end menu
4738
4739@cindex libgnunetcadet
4740@node libgnunetcadet
4741@subsection libgnunetcadet
4742
4743
4744The CADET API (defined in @file{gnunet_cadet_service.h}) is the
4745messaging API used by P2P applications built using GNUnet.
4746It provides applications the ability to send and receive encrypted
4747messages to any peer participating in GNUnet.
The API is heavily based on the CORE API.
4749
4750CADET delivers messages to other peers in "channels".
4751A channel is a permanent connection defined by a destination peer
4752(identified by its public key) and a port number.
Internally, CADET tunnels all channels towards a destination peer
4754using one session key and relays the data on multiple "connections",
4755independent from the channels.
4756
Each channel has optional parameters, the most important being the
reliability flag.
If a channel is created as reliable and a message gets lost at the
TRANSPORT/CORE level, CADET will retransmit the lost message and
deliver it in order to the destination application.
4762
4763To communicate with other peers using CADET, it is necessary to first
4764connect to the service using @code{GNUNET_CADET_connect}.
This function takes several parameters in the form of callbacks, which
allow the client to react to various events, such as incoming channels or
channels that terminate, as well as a list of ports the client wishes to
listen to (at the moment it is not possible to start listening on further
ports once connected, but nothing prevents a client from connecting
several times to CADET, even using one connection per listening port).
4771The function returns a handle which has to be used for any further
4772interaction with the service.
4773
To connect to a remote peer, a client has to call the
@code{GNUNET_CADET_channel_create} function. The most important parameters
are the remote peer's identity (its public key) and a port, which
specifies which application on the remote peer to connect to, similar to
TCP/UDP ports. CADET will then find the peer in the GNUnet network,
establish the proper low-level connections and perform the necessary key
exchanges to ensure authenticated, secure and verified communication.
Similar to @code{GNUNET_CADET_connect}, @code{GNUNET_CADET_channel_create}
returns a handle to interact with the created channel.
4783
4784For every message the client wants to send to the remote application,
4785@code{GNUNET_CADET_notify_transmit_ready} must be called, indicating the
4786channel on which the message should be sent and the size of the message
4787(but not the message itself!). Once CADET is ready to send the message,
4788the provided callback will fire, and the message contents are provided to
4789this callback.
4790
Please note that CADET does not provide an explicit notification of when a
channel is connected. In loosely connected networks, like big wireless
mesh networks, this can take several seconds, even minutes in the worst
case. To be alerted when a channel is online, a client can call
@code{GNUNET_CADET_notify_transmit_ready} immediately after
@code{GNUNET_CADET_channel_create}. When the callback is activated, it
means that the channel is online. The callback can give 0 bytes to CADET
if no message is to be sent; this is fine.
4799
If a transmission was requested but is no longer needed before the
callback fires, it can be cancelled with
@code{GNUNET_CADET_notify_transmit_ready_cancel}, which uses the handle
returned by @code{GNUNET_CADET_notify_transmit_ready}.
4804As in the case of CORE, only one message can be requested at a time: a
4805client must not call @code{GNUNET_CADET_notify_transmit_ready} again until
4806the callback is called or the request is cancelled.
4807
4808When a channel is no longer needed, a client can call
4809@code{GNUNET_CADET_channel_destroy} to get rid of it.
4810Note that CADET will try to transmit all pending traffic before notifying
4811the remote peer of the destruction of the channel, including
4812retransmitting lost messages if the channel was reliable.
4813
4814Incoming channels, channels being closed by the remote peer, and traffic
4815on any incoming or outgoing channels are given to the client when CADET
4816executes the callbacks given to it at the time of
4817@code{GNUNET_CADET_connect}.
4818
4819Finally, when an application no longer wants to use CADET, it should call
4820@code{GNUNET_CADET_disconnect}, but first all channels and pending
4821transmissions must be closed (otherwise CADET will complain).
4822
4823@cindex nse subsystem
4824@cindex NSE
4825@node GNUnet's NSE subsystem
4826@section GNUnet's NSE subsystem
4827
4828
4829NSE stands for @dfn{Network Size Estimation}. The NSE subsystem provides
4830other subsystems and users with a rough estimate of the number of peers
4831currently participating in the GNUnet overlay.
4832The computed value is not a precise number as producing a precise number
4833in a decentralized, efficient and secure way is impossible.
4834While NSE's estimate is inherently imprecise, NSE also gives the expected
4835range. For a peer that has been running in a stable network for a
4836while, the real network size will typically (99.7% of the time) be in the
4837range of [2/3 estimate, 3/2 estimate]. We will now give an overview of the
4838algorithm used to calculate the estimate;
4839all of the details can be found in this technical report.
4840
4841@c FIXME: link to the report.
4842
4843@menu
4844* Motivation::
4845* Principle::
4846* libgnunetnse::
4847* The NSE Client-Service Protocol::
4848* The NSE Peer-to-Peer Protocol::
4849@end menu
4850
4851@node Motivation
4852@subsection Motivation
4853
4854
Some subsystems, like DHT, need to know the size of the GNUnet network to
optimize some parameters of their own protocol. The decentralized nature
of GNUnet makes efficiently and securely counting the exact number of peers
infeasible. Although there are several decentralized algorithms to count
the number of peers in a system, so far there is none to do so securely.
4860Other protocols may allow any malicious peer to manipulate the final
4861result or to take advantage of the system to perform
4862@dfn{Denial of Service} (DoS) attacks against the network.
4863GNUnet's NSE protocol avoids these drawbacks.
4864
4865
4866
4867@menu
4868* Security::
4869@end menu
4870
4871@cindex NSE security
4872@cindex nse security
4873@node Security
4874@subsubsection Security
4875
4876
4877The NSE subsystem is designed to be resilient against these attacks.
4878It uses @uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proofs of work}
4879to prevent one peer from impersonating a large number of participants,
which would otherwise allow an adversary to artificially inflate the
4881estimate.
4882The DoS protection comes from the time-based nature of the protocol:
4883the estimates are calculated periodically and out-of-time traffic is
4884either ignored or stored for later retransmission by benign peers.
4885In particular, peers cannot trigger global network communication at will.
4886
4887@cindex NSE principle
4888@cindex nse principle
4889@node Principle
4890@subsection Principle
4891
4892
4893The algorithm calculates the estimate by finding the globally closest
4894peer ID to a random, time-based value.
4895
4896The idea is that the closer the ID is to the random value, the more
4897"densely packed" the ID space is, and therefore, more peers are in the
4898network.
4899
4900
4901
4902@menu
4903* Example::
4904* Algorithm::
4905* Target value::
4906* Timing::
4907* Controlled Flooding::
4908* Calculating the estimate::
4909@end menu
4910
4911@node Example
4912@subsubsection Example
4913
4914
4915Suppose all peers have IDs between 0 and 100 (our ID space), and the
4916random value is 42.
If the closest peer has the ID 70 we can imagine that the average
"distance" between peers is around 30 and therefore there are around 3
peers in the whole ID space. On the other hand, if the closest peer has
the ID 44, we can imagine that the space is rather packed with peers,
maybe as many as 50 of them.
Naturally, we could have been rather unlucky: there might be only one peer,
which happens to have the ID 44. Thus, the current estimate is calculated
as the average over multiple rounds, and not just a single sample.
4925
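The arithmetic of this toy example can be sketched in C. This is only an
illustration: the function name @code{toy_estimate} and the formula
(size of the ID space divided by the distance to the target) are
assumptions chosen to reproduce the numbers above. They are not the
formula from the technical report.

@example
#include <stdio.h>
#include <stdlib.h>

/* Toy model only: estimate = size of ID space / distance to target. */
static double
toy_estimate (int id_space, int target, int closest_id)
@{
  int distance = abs (closest_id - target);

  if (0 == distance)
    distance = 1; /* avoid dividing by zero in this toy model */
  return (double) id_space / distance;
@}

int
main (void)
@{
  printf ("closest ID 70: about %.1f peers\n", toy_estimate (100, 42, 70));
  printf ("closest ID 44: about %.1f peers\n", toy_estimate (100, 42, 44));
  return 0;
@}
@end example

With a distance of 28 the sketch yields roughly 3.6 peers, and with a
distance of 2 it yields 50, matching the rough figures given above.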
4926@node Algorithm
4927@subsubsection Algorithm
4928
4929
4930Given that example, one can imagine that the job of the subsystem is to
4931efficiently communicate the ID of the closest peer to the target value
4932to all the other peers, who will calculate the estimate from it.
4933
4934@node Target value
4935@subsubsection Target value
4936
4937@c %**end of header
4938
The target value itself is generated by hashing the current time, rounded
down to an agreed value. If the rounding amount is 1h (default) and the
time is 12:34:56, the time to hash would be 12:00:00. The process is
repeated once per rounding amount (in this example, every hour).
Every repetition is called a round.
4944
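The rounding step can be illustrated with a few lines of C. The sketch
below works on plain seconds since midnight and uses the hypothetical
helper @code{round_start}; the hashing of the rounded value is left out,
since the hash function is not described here.

@example
#include <stdio.h>
#include <stdint.h>

/* Round a timestamp (in seconds) down to the start of its round. */
static uint64_t
round_start (uint64_t now_s, uint64_t round_s)
@{
  return now_s - (now_s % round_s);
@}

int
main (void)
@{
  uint64_t now = 12 * 3600 + 34 * 60 + 56; /* 12:34:56 */
  uint64_t start = round_start (now, 3600); /* 1h rounds */

  printf ("%02u:%02u:%02u\n",
          (unsigned int) (start / 3600),
          (unsigned int) ((start % 3600) / 60),
          (unsigned int) (start % 60));
  return 0;
@}
@end example

Since every peer rounds down to the same boundary, all peers with
roughly synchronized clocks obtain the same input for the hash.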
4945@node Timing
4946@subsubsection Timing
4947@c %**end of header
4948
The NSE subsystem has some timing control to avoid everybody broadcasting
its ID all at once. Once each peer has the target random value, it
compares its own ID to the target and calculates the hypothetical size of
the network if that peer were to be the closest.
Then it compares the hypothetical size with the estimate from the previous
rounds. For each value there is an associated point in the period,
let's call it "broadcast time". If its own hypothetical estimate
is the same as the previous global estimate, its "broadcast time" will be
in the middle of the round. If it is bigger it will be earlier, and if it is
smaller (the most likely case) it will be later. This ensures that the
peers closest to the target value start broadcasting their ID first.
4960
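The relation between the hypothetical estimate and the broadcast time can
be sketched as a monotone mapping. The function below is an illustrative
assumption, not the formula used by the NSE service: it only reproduces
the three properties stated above (middle of the round for an equal
estimate, earlier for a bigger one, later for a smaller one).

@example
#include <stdio.h>

/* Illustrative mapping only: squash the difference between the two
   estimates into (-1, 1) and shift the broadcast time around the
   middle of the round accordingly. */
static double
broadcast_delay (double round_len, double own_estimate, double prev_estimate)
@{
  double d = own_estimate - prev_estimate;
  double magnitude = (d < 0.0) ? -d : d;
  double squashed = d / (1.0 + magnitude); /* in (-1, 1) */

  return (round_len / 2.0) * (1.0 - squashed);
@}

int
main (void)
@{
  printf ("same estimate:    %.0f s\n", broadcast_delay (3600.0, 10.0, 10.0));
  printf ("bigger estimate:  %.0f s\n", broadcast_delay (3600.0, 12.0, 10.0));
  printf ("smaller estimate: %.0f s\n", broadcast_delay (3600.0, 8.0, 10.0));
  return 0;
@}
@end example

Any strictly decreasing function of the difference that stays within the
round would serve the same purpose in this sketch.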
4961@node Controlled Flooding
4962@subsubsection Controlled Flooding
4963
4964@c %**end of header
4965
4966When a peer receives a value, first it verifies that it is closer than the
4967closest value it had so far, otherwise it answers the incoming message
4968with a message containing the better value. Then it checks a proof of
4969work that must be included in the incoming message, to ensure that the
4970other peer's ID is not made up (otherwise a malicious peer could claim to
4971have an ID of exactly the target value every round). Once validated, it
compares the broadcast time of the received value with the current time
4973and if it's not too early, sends the received value to its neighbors.
4974Otherwise it stores the value until the correct broadcast time comes.
4975This prevents unnecessary traffic of sub-optimal values, since a better
4976value can come before the broadcast time, rendering the previous one
4977obsolete and saving the traffic that would have been used to broadcast it
4978to the neighbors.
4979
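The decisions described above can be summarized in a small C sketch. All
names are hypothetical, distances and times are plain numbers, and two
details are assumptions: a value that is merely as close as the stored
one is treated as not closer, and a message with an invalid proof of work
is simply dropped.

@example
#include <stdio.h>
#include <stdbool.h>

enum FloodAction
@{
  REPLY_WITH_BETTER, /* we already know a closer value: send it back */
  DROP_INVALID,      /* proof of work does not verify (assumed) */
  FORWARD_NOW,       /* broadcast time reached: send to neighbors */
  STORE_FOR_LATER    /* too early: keep until the broadcast time */
@};

static enum FloodAction
on_flood_value (unsigned int best_distance,
                unsigned int incoming_distance,
                bool pow_valid,
                double broadcast_time,
                double now)
@{
  if (incoming_distance >= best_distance)
    return REPLY_WITH_BETTER;
  if (! pow_valid)
    return DROP_INVALID;
  if (now >= broadcast_time)
    return FORWARD_NOW;
  return STORE_FOR_LATER;
@}

int
main (void)
@{
  /* closer value, valid proof of work, broadcast time not yet reached */
  enum FloodAction a = on_flood_value (30, 12, true, 1800.0, 900.0);

  printf ("%s\n", (STORE_FOR_LATER == a) ? "stored for later" : "other");
  return 0;
@}
@end example

The order of the checks follows the text: the cheap comparison comes
first, the proof of work is verified only for values that would actually
improve on the stored one.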
4980@node Calculating the estimate
4981@subsubsection Calculating the estimate
4982
4983@c %**end of header
4984
Once the closest ID has been spread across the network, each peer gets the
exact distance between this ID and the target value of the round and
calculates the estimate with a mathematical formula described in the tech
report. The estimate generated with this method for a single round is not
very precise. Remember the case of the example, where the only peer has the
ID 44 and we happen to generate the target value 42, thinking there are
50 peers in the network. Therefore, the NSE subsystem remembers the last
64 estimates and calculates an average over them, giving a result which
usually has one bit of uncertainty (the real size could be half of the
4994estimate or twice as much). Note that the actual network size is
4995calculated in powers of two of the raw input, thus one bit of uncertainty
4996means a factor of two in the size estimate.
4997
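Keeping the last 64 estimates and averaging them can be sketched with a
ring buffer. The type and function names below are hypothetical, and the
per-round estimates are plain numbers; how a single estimate is derived
from the distance is left to the technical report.

@example
#include <stdio.h>
#include <stddef.h>

#define HISTORY_SIZE 64

struct History
@{
  double values[HISTORY_SIZE];
  size_t count; /* number of valid entries, at most HISTORY_SIZE */
  size_t next;  /* slot that the next estimate overwrites */
@};

/* Store the estimate of a round, overwriting the oldest if full. */
static void
history_add (struct History *h, double estimate)
@{
  h->values[h->next] = estimate;
  h->next = (h->next + 1) % HISTORY_SIZE;
  if (h->count < HISTORY_SIZE)
    h->count++;
@}

/* Average over the stored estimates (0 if there are none). */
static double
history_mean (const struct History *h)
@{
  double sum = 0.0;

  for (size_t i = 0; i < h->count; i++)
    sum += h->values[i];
  return (0 == h->count) ? 0.0 : sum / h->count;
@}

int
main (void)
@{
  struct History h = @{ .count = 0, .next = 0 @};

  /* 100 rounds; only the last 64 contribute to the mean */
  for (int round = 0; round < 100; round++)
    history_add (&h, (round < 36) ? 8.0 : 10.0);
  printf ("mean over the last %zu rounds: %.1f\n",
          h.count, history_mean (&h));
  return 0;
@}
@end example

A single unlucky round therefore shifts the reported average by at most
one sixty-fourth of its deviation.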
4998@cindex libgnunetnse
4999@node libgnunetnse
5000@subsection libgnunetnse
5001
5002@c %**end of header
5003
5004The NSE subsystem has the simplest API of all services, with only two
5005calls: @code{GNUNET_NSE_connect} and @code{GNUNET_NSE_disconnect}.
5006
5007The connect call gets a callback function as a parameter and this function
5008is called each time the network agrees on an estimate. This usually is
5009once per round, with some exceptions: if the closest peer has a late
local clock and starts spreading its ID after everyone else agreed on a
5011value, the callback might be activated twice in a round, the second value
5012being always bigger than the first. The default round time is set to
50131 hour.
5014
5015The disconnect call disconnects from the NSE subsystem and the callback
5016is no longer called with new estimates.
5017
5018
5019
5020@menu
5021* Results::
5022* libgnunetnse - Examples::
5023@end menu
5024
5025@node Results
5026@subsubsection Results
5027
5028@c %**end of header
5029
5030The callback provides two values: the average and the
5031@uref{http://en.wikipedia.org/wiki/Standard_deviation, standard deviation}
5032of the last 64 rounds. The values provided by the callback function are
5033logarithmic, this means that the real estimate numbers can be obtained by
calculating 2 to the power of the given value (2^average). From a
statistics point of view this means that:

@itemize @bullet
@item 68% of the time the real size is included in the interval
[2^(average-stddev), 2^(average+stddev)]
@item 95% of the time the real size is included in the interval
[2^(average-2*stddev), 2^(average+2*stddev)]
@item 99.7% of the time the real size is included in the interval
[2^(average-3*stddev), 2^(average+3*stddev)]
@end itemize
5045
The expected standard deviation for 64 rounds in a network of stable size
5047is 0.2. Thus, we can say that normally:
5048
5049@itemize @bullet
5050@item 68% of the time the real size is in the range [-13%, +15%]
5051@item 95% of the time the real size is in the range [-24%, +32%]
5052@item 99.7% of the time the real size is in the range [-34%, +52%]
5053@end itemize
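
The intervals above follow directly from the logarithmic values. The
following C sketch computes them for a given number of standard
deviations; the helper name @code{size_interval} is hypothetical, and the
inputs stand for the two values handed to the callback.

@example
#include <math.h>
#include <stdio.h>

/* Bounds of the k-sigma interval, in peers, from the logarithmic
   average and standard deviation. */
static void
size_interval (double average, double stddev, int k,
               double *low, double *high)
@{
  *low = exp2 (average - k * stddev);
  *high = exp2 (average + k * stddev);
@}

int
main (void)
@{
  double low;
  double high;

  for (int k = 1; k <= 3; k++)
  @{
    size_interval (10.0, 0.2, k, &low, &high);
    printf ("%d sigma: [%.0f, %.0f] peers\n", k, low, high);
  @}
  return 0;
@}
@end example

For a standard deviation of 0.2 the factors are 2^-0.2 (about 0.87) and
2^0.2 (about 1.15), which is where the -13% and +15% come from. On many
systems the program has to be linked against the math library (for
example with @option{-lm}).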
5054
As said in the introduction, we can be quite sure that usually the real
size is between two thirds and three halves of the estimate. This can of
course vary with network conditions.
Thus, applications may want to also consider the provided standard
deviation value, not only the average (in particular, if the standard
deviation is very high, the average may be meaningless: the network size is
changing rapidly).
5062
5063@node libgnunetnse - Examples
@subsubsection libgnunetnse - Examples
5065
5066@c %**end of header
5067
Let's close with a couple of examples.
5069
5070@table @asis
5071
@item Average: 10, std dev: 1
Here the estimate would be
2^10 = 1024 peers. @footnote{The range in which we can be 95% sure is:
[2^8, 2^12] = [256, 4096]. We can be very (>99.7%) sure that the network
is not a hundred peers and absolutely sure that it is not a million peers,
but somewhere around a thousand.}

@item Average: 22, std dev: 0.2
Here the estimate would be
2^22 = 4 Million peers. @footnote{The range in which we can be 99.7% sure
is: [2^21.4, 2^22.6] = [2.8M, 6.3M]. We can be sure that the network size
is around four million, with absolutely no way of it being 1 million.}
5082
5083@end table
5084
To put this in perspective: those who remember the LHC Higgs boson
results may recall that they were announced with "5 sigma" and "6 sigma"
certainties. In the second example, a 5 sigma minimum would be 2 million
and a 6 sigma minimum 1.8 million.
5089
5090@node The NSE Client-Service Protocol
5091@subsection The NSE Client-Service Protocol
5092
5093@c %**end of header
5094
As with the API, the client-service protocol is very simple: it has only
two different messages, defined in @code{src/nse/nse.h}:
5097
5098@itemize @bullet
5099@item @code{GNUNET_MESSAGE_TYPE_NSE_START}@ This message has no parameters
5100and is sent from the client to the service upon connection.
5101@item @code{GNUNET_MESSAGE_TYPE_NSE_ESTIMATE}@ This message is sent from
5102the service to the client for every new estimate and upon connection.
It contains a timestamp for the estimate, the average and the standard
deviation for the respective round.
5105@end itemize
5106
5107When the @code{GNUNET_NSE_disconnect} API call is executed, the client
5108simply disconnects from the service, with no message involved.
5109
5110@cindex NSE Peer-to-Peer Protocol
5111@node The NSE Peer-to-Peer Protocol
5112@subsection The NSE Peer-to-Peer Protocol
5113
5114@c %**end of header
5115
5116The NSE subsystem only has one message in the P2P protocol, the
5117@code{GNUNET_MESSAGE_TYPE_NSE_P2P_FLOOD} message.
5118
The key contents of this message are the timestamp to identify the round
5120(differences in system clocks may cause some peers to send messages way
5121too early or way too late, so the timestamp allows other peers to
5122identify such messages easily), the
5123@uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proof of work}
5124used to make it difficult to mount a
5125@uref{http://en.wikipedia.org/wiki/Sybil_attack, Sybil attack}, and the
5126public key, which is used to verify the signature on the message.
5127
5128Every peer stores a message for the previous, current and next round. The
5129messages for the previous and current round are given to peers that
5130connect to us. The message for the next round is simply stored until our
5131system clock advances to the next round. The message for the current round
5132is what we are flooding the network with right now.
5133At the beginning of each round the peer does the following:
5134
5135@itemize @bullet
@item calculates its own distance to the target value
@item creates, signs and stores the message for the current round (unless
it has a better message in the "next round" slot which came early in the
previous round)
@item calculates, based on the stored round message (own or received), when
to start flooding it to its neighbors
5142@end itemize
5143
5144Upon receiving a message the peer checks the validity of the message
5145(round, proof of work, signature). The next action depends on the
5146contents of the incoming message:
5147
5148@itemize @bullet
5149@item if the message is worse than the current stored message, the peer
5150sends the current message back immediately, to stop the other peer from
5151spreading suboptimal results
5152@item if the message is better than the current stored message, the peer
5153stores the new message and calculates the new target time to start
5154spreading it to its neighbors (excluding the one the message came from)
5155@item if the message is for the previous round, it is compared to the
5156message stored in the "previous round slot", which may then be updated
5157@item if the message is for the next round, it is compared to the message
5158stored in the "next round slot", which again may then be updated
5159@end itemize
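
The bookkeeping with three slots can be sketched as follows. This is a
simplified illustration with hypothetical names: rounds are identified by
consecutive round numbers instead of timestamps, a message is reduced to
its distance from the target value, and messages for any other round are
assumed to be ignored.

@example
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

enum Slot @{ SLOT_PREVIOUS, SLOT_CURRENT, SLOT_NEXT, SLOT_COUNT @};

/* Smallest distance to the target seen so far, one entry per slot. */
struct Slots
@{
  uint32_t best_distance[SLOT_COUNT];
@};

/* Pick the slot for a message; returns false for any other round. */
static bool
slot_for_round (uint64_t msg_round, uint64_t current_round, enum Slot *slot)
@{
  if (msg_round == current_round)
    *slot = SLOT_CURRENT;
  else if (msg_round + 1 == current_round)
    *slot = SLOT_PREVIOUS;
  else if (msg_round == current_round + 1)
    *slot = SLOT_NEXT;
  else
    return false;
  return true;
@}

/* Keep the message if it is better (closer) than the stored one. */
static bool
offer (struct Slots *s, enum Slot slot, uint32_t distance)
@{
  if (distance >= s->best_distance[slot])
    return false;
  s->best_distance[slot] = distance;
  return true;
@}

int
main (void)
@{
  struct Slots s = @{ @{ UINT32_MAX, UINT32_MAX, UINT32_MAX @} @};
  enum Slot slot;

  if (slot_for_round (8, 7, &slot)) /* message for the next round */
    printf ("slot %d updated: %d\n", (int) slot, (int) offer (&s, slot, 40));
  if (! slot_for_round (3, 7, &slot)) /* far too old */
    printf ("ignored\n");
  return 0;
@}
@end example

When the local clock advances to the next round, the contents of the
slots would shift by one position; that step is omitted from the sketch.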
5160
Finally, when the time comes to send the stored message for the current
round to the neighbors, a random delay is added for each neighbor, to avoid
traffic spikes and minimize cross-messages.
5164
5165@cindex HOSTLIST subsystem
5166@cindex hostlist subsystem
5167@node GNUnet's HOSTLIST subsystem
5168@section GNUnet's HOSTLIST subsystem
5169
5170@c %**end of header
5171
5172Peers in the GNUnet overlay network need address information so that they
can connect with other peers. GNUnet uses so-called HELLO messages to
5174store and exchange peer addresses.
5175GNUnet provides several methods for peers to obtain this information:
5176
5177@itemize @bullet
5178@item out-of-band exchange of HELLO messages (manually, using for example
5179gnunet-peerinfo)
5180@item HELLO messages shipped with GNUnet (automatic with distribution)
5181@item UDP neighbor discovery in LAN (IPv4 broadcast, IPv6 multicast)
5182@item topology gossiping (learning from other peers we already connected
5183to), and
5184@item the HOSTLIST daemon covered in this section, which is particularly
5185relevant for bootstrapping new peers.
5186@end itemize
5187
5188New peers have no existing connections (and thus cannot learn from gossip
5189among peers), may not have other peers in their LAN and might be started
5190with an outdated set of HELLO messages from the distribution.
5191In this case, getting new peers to connect to the network requires either
5192manual effort or the use of a HOSTLIST to obtain HELLOs.
5193
5194@menu
5195* HELLOs::
5196* Overview for the HOSTLIST subsystem::
5197* Interacting with the HOSTLIST daemon::
5198* Hostlist security address validation::
5199* The HOSTLIST daemon::
5200* The HOSTLIST server::
5201* The HOSTLIST client::
5202* Usage::
5203@end menu
5204
5205@node HELLOs
5206@subsection HELLOs
5207
5208@c %**end of header
5209
The basic information peers require to connect to other peers is
contained in so-called HELLO messages, which you can think of as business
cards.
5212Besides the identity of the peer (based on the cryptographic public key) a
5213HELLO message may contain address information that specifies ways to
5214contact a peer. By obtaining HELLO messages, a peer can learn how to
5215contact other peers.
5216
5217@node Overview for the HOSTLIST subsystem
5218@subsection Overview for the HOSTLIST subsystem
5219
5220@c %**end of header
5221
5222The HOSTLIST subsystem provides a way to distribute and obtain contact
5223information to connect to other peers using a simple HTTP GET request.
Its implementation is split into three parts: the main file for the daemon
itself (@file{gnunet-daemon-hostlist.c}), the HTTP client used to download
peer information (@file{hostlist-client.c}) and the server component used
to provide this information to other peers (@file{hostlist-server.c}).
The server is basically a small HTTP web server (based on GNU
libmicrohttpd) which provides a list of HELLOs known to the local peer for
download. The client component is basically an HTTP client
(based on libcurl) which can download hostlists from one or more websites.
The hostlist format is a binary blob containing a sequence of HELLO
messages. Note that any HTTP server can theoretically serve a hostlist;
the built-in hostlist server simply makes it convenient to offer this
service.
5236
5237
5238@menu
5239* Features::
5240* HOSTLIST - Limitations::
5241@end menu
5242
5243@node Features
5244@subsubsection Features
5245
5246@c %**end of header
5247
5248The HOSTLIST daemon can:
5249
5250@itemize @bullet
@item provide HELLO messages with validated addresses obtained from
PEERINFO for download by other peers
@item download HELLO messages and forward these messages to the TRANSPORT
subsystem for validation
@item advertise the URL of this peer's hostlist to other peers
via gossip
5257@item automatically learn about hostlist servers from the gossip of other
5258peers
5259@end itemize
5260
5261@node HOSTLIST - Limitations
5262@subsubsection HOSTLIST - Limitations
5263
5264@c %**end of header
5265
5266The HOSTLIST daemon does not:
5267
5268@itemize @bullet
5269@item verify the cryptographic information in the HELLO messages
5270@item verify the address information in the HELLO messages
5271@end itemize
5272
5273@node Interacting with the HOSTLIST daemon
5274@subsection Interacting with the HOSTLIST daemon
5275
5276@c %**end of header
5277
5278The HOSTLIST subsystem is currently implemented as a daemon, so there is
5279no need for the user to interact with it and therefore there is no
5280command line tool and no API to communicate with the daemon. In the
5281future, we can envision changing this to allow users to manually trigger
5282the download of a hostlist.
5283
5284Since there is no command line interface to interact with HOSTLIST, the
5285only way to interact with the hostlist is to use STATISTICS to obtain or
5286modify information about the status of HOSTLIST:
5287
5288@example
5289$ gnunet-statistics -s hostlist
5290@end example
5291
5292@noindent
5293In particular, HOSTLIST includes a @strong{persistent} value in statistics
5294that specifies when the hostlist server might be queried next. As this
5295value is exponentially increasing during runtime, developers may want to
reset or manually adjust it. Note that HOSTLIST (but not STATISTICS) needs
to be shut down if changes to this value are to have any effect on the
5298daemon (as HOSTLIST does not monitor STATISTICS for changes to the
5299download frequency).
5300
5301@node Hostlist security address validation
5302@subsection Hostlist security address validation
5303
5304@c %**end of header
5305
5306Since information obtained from other parties cannot be trusted without
5307validation, we have to distinguish between @emph{validated} and
5308@emph{not validated} addresses. Before using (and so trusting)
5309information from other parties, this information has to be double-checked
5310(validated). Address validation is not done by HOSTLIST but by the
5311TRANSPORT service.
5312
5313The HOSTLIST component is functionally located between the PEERINFO and
5314the TRANSPORT subsystem. When acting as a server, the daemon obtains valid
5315(@emph{validated}) peer information (HELLO messages) from the PEERINFO
5316service and provides it to other peers. When acting as a client, it
5317contacts the HOSTLIST servers specified in the configuration, downloads
the (unvalidated) list of HELLO messages and forwards this information
to the TRANSPORT service to validate the addresses.
5320
5321@cindex HOSTLIST daemon
5322@node The HOSTLIST daemon
5323@subsection The HOSTLIST daemon
5324
5325@c %**end of header
5326
5327The hostlist daemon is the main component of the HOSTLIST subsystem. It is
5328started by the ARM service and (if configured) starts the HOSTLIST client
5329and server components.
5330
If the daemon provides a hostlist itself it can advertise its own
5332hostlist to other peers. To do so it sends a
5333@code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT} message to other peers
5334when they connect to this peer on the CORE level. This hostlist
5335advertisement message contains the URL to access the HOSTLIST HTTP
5336server of the sender. The daemon may also subscribe to this type of
message from the CORE service, and then forward messages of this kind to the
5338HOSTLIST client. The client then uses all available URLs to download peer
5339information when necessary.
5340
5341When starting, the HOSTLIST daemon first connects to the CORE subsystem
and, if hostlist learning is enabled, registers a CORE handler to receive
messages of this kind. Next it starts (if configured) the client and
5344server. It passes pointers to CORE connect and disconnect and receive
5345handlers where the client and server store their functions, so the daemon
5346can notify them about CORE events.
5347
5348To clean up on shutdown, the daemon has a cleaning task, shutting down all
5349subsystems and disconnecting from CORE.
5350
5351@cindex HOSTLIST server
5352@node The HOSTLIST server
5353@subsection The HOSTLIST server
5354
5355@c %**end of header
5356
5357The server provides a way for other peers to obtain HELLOs. Basically it
5358is a small web server other peers can connect to and download a list of
5359HELLOs using standard HTTP; it may also advertise the URL of the hostlist
5360to other peers connecting on CORE level.
5361
5362
5363@menu
5364* The HTTP Server::
5365* Advertising the URL::
5366@end menu
5367
5368@node The HTTP Server
5369@subsubsection The HTTP Server
5370
5371@c %**end of header
5372
5373During startup, the server starts a web server listening on the port
5374specified with the HTTPPORT value (default 8080). In addition it connects
5375to the PEERINFO service to obtain peer information. The HOSTLIST server
uses the @code{GNUNET_PEERINFO_iterate} function to request HELLO
information for
all peers and adds their information to a new hostlist if they are
suitable (expired addresses and HELLOs without addresses are both not
suitable) and the maximum size for a hostlist is not exceeded
(@code{MAX_BYTES_PER_HOSTLISTS} = 500000).
5381When PEERINFO finishes (with a last NULL callback), the server destroys
5382the previous hostlist response available for download on the web server
5383and replaces it with the updated hostlist. The hostlist format is
5384basically a sequence of HELLO messages (as obtained from PEERINFO) without
5385any special tokenization. Since each HELLO message contains a size field,
5386the response can easily be split into separate HELLO messages by the
5387client.
5388
A HOSTLIST client connecting to the HOSTLIST server will receive the
hostlist as an HTTP response, and the server will terminate the
connection with the result code @code{HTTP 200 OK}.
5392The connection will be closed immediately if no hostlist is available.
5393
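Because the hostlist is a plain sequence of messages, a client can split
it using only the size field of each message. The following C sketch
illustrates this. It assumes that every message starts with a 4-byte
header consisting of a 16-bit size in network byte order (covering the
whole message, header included) followed by a 16-bit type; the function
name @code{count_messages} and the sample bytes are made up for the
illustration.

@example
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Assumed header: 16-bit size (network byte order, whole message)
   followed by a 16-bit type. */
#define HEADER_LEN 4

/* Returns the number of complete messages in buf, or -1 if a size
   field is smaller than the header itself. */
static int
count_messages (const uint8_t *buf, size_t len)
@{
  size_t off = 0;
  int count = 0;

  while (len - off >= HEADER_LEN)
  @{
    size_t msize = ((size_t) buf[off] << 8) | buf[off + 1];

    if (msize < HEADER_LEN)
      return -1;
    if (msize > len - off)
      break; /* incomplete trailing message */
    count++;
    off += msize;
  @}
  return count;
@}

int
main (void)
@{
  /* two made-up messages of 6 and 4 bytes */
  const uint8_t blob[] = @{ 0x00, 0x06, 0x00, 0x01, 0xaa, 0xbb,
                           0x00, 0x04, 0x00, 0x01 @};

  printf ("%d messages\n", count_messages (blob, sizeof (blob)));
  return 0;
@}
@end example

A real client would hand each complete message to the next processing
step instead of only counting it, and would keep an incomplete trailing
message until more data arrives.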
5394@node Advertising the URL
5395@subsubsection Advertising the URL
5396
5397@c %**end of header
5398
5399The server also advertises the URL to download the hostlist to other peers
5400if hostlist advertisement is enabled.
5401When a new peer connects and has hostlist learning enabled, the server
5402sends a @code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT} message to this
5403peer using the CORE service.
5404
5405@cindex HOSTLIST client
5406@node The HOSTLIST client
5407@subsection The HOSTLIST client
5408
5409@c %**end of header
5410
5411The client provides the functionality to download the list of HELLOs from
5412a set of URLs.
5413It performs a standard HTTP request to the URLs configured and learned
5414from advertisement messages received from other peers. When a HELLO is
5415downloaded, the HOSTLIST client forwards the HELLO to the TRANSPORT
5416service for validation.
5417
5418The client supports two modes of operation:
5419
5420@itemize @bullet
5421@item download of HELLOs (bootstrapping)
5422@item learning of URLs
5423@end itemize
5424
5425@menu
5426* Bootstrapping::
5427* Learning::
5428@end menu
5429
5430@node Bootstrapping
5431@subsubsection Bootstrapping
5432
5433@c %**end of header
5434
5435For bootstrapping, it schedules a task to download the hostlist from the
5436set of known URLs.
5437The downloads are only performed if the number of current
5438connections is smaller than a minimum number of connections
5439(at the moment 4).
The interval between downloads increases exponentially; however, once the
interval becomes longer than an hour, its growth is capped at
(number of connections * 1h).
5444
5445Once the decision has been taken to download HELLOs, the daemon chooses a
5446random URL from the list of known URLs. URLs can be configured in the
5447configuration or be learned from advertisement messages.
The client uses an HTTP client library (libcurl) to initiate the download
using the libcurl multi interface.
Libcurl passes the data to the @code{callback_download} function which
stores the data in a buffer if space is available and the maximum size for
a hostlist download is not exceeded
(@code{MAX_BYTES_PER_HOSTLISTS} = 500000).
When a full HELLO has been downloaded, the HOSTLIST client offers this
HELLO message to the TRANSPORT service for validation.
When the download finishes or fails, statistical information about the
quality of this URL is updated.
5457
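The download schedule described above can be sketched in C. The sketch is
one possible reading of the rules, not the daemon's code: the names are
hypothetical, the delay is doubled after every download, and the cap is
taken to be one hour per current connection (with a minimum of one hour,
which is an additional assumption for the case of zero connections).

@example
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define HOUR_S 3600u
#define MIN_CONNECTIONS 4u

/* Downloads only happen while the peer has too few connections. */
static bool
should_download (unsigned int connections)
@{
  return connections < MIN_CONNECTIONS;
@}

/* Double the delay, but never beyond (connections * 1h). */
static uint64_t
next_delay (uint64_t delay_s, unsigned int connections)
@{
  uint64_t cap = (uint64_t) HOUR_S * ((connections > 0) ? connections : 1);
  uint64_t next = delay_s * 2;

  return (next > cap) ? cap : next;
@}

int
main (void)
@{
  uint64_t delay = 60;

  for (int i = 0; (i < 8) && should_download (2); i++)
  @{
    delay = next_delay (delay, 2);
    printf ("next download in %llu s\n", (unsigned long long) delay);
  @}
  return 0;
@}
@end example

Since the cap is never smaller than one hour in this reading, intervals
below an hour always grow by plain doubling.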
5458@cindex HOSTLIST learning
5459@node Learning
5460@subsubsection Learning
5461
5462@c %**end of header
5463
5464The client also manages hostlist advertisements from other peers. The
5465HOSTLIST daemon forwards @code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT}
5466messages to the client subsystem, which extracts the URL from the message.
5467Next, a test of the newly obtained URL is performed by triggering a
5468download from the new URL. If the URL works correctly, it is added to the
5469list of working URLs.
5470
The size of the list of URLs is restricted, so if an additional server is
added and the list is full, the URL with the worst quality ranking
(determined, for example, by the number of successful downloads and the
number of HELLOs obtained) is
discarded. During shutdown the list of URLs is saved to a file for
persistence and loaded on startup. URLs from the configuration file are
never discarded.
5477
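The replacement policy for learned URLs can be illustrated with a small
fixed-size list. Everything in this sketch is hypothetical: the type and
function names, the single numeric quality score, the tiny list size and
the example URLs. It also assumes that configured and learned URLs share
one list.

@example
#include <stdio.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_URLS 3 /* deliberately small for the demonstration */

struct Entry
@{
  const char *url;
  unsigned int quality; /* higher is better */
  bool from_config;     /* configured URLs are never discarded */
@};

struct UrlList
@{
  struct Entry entries[MAX_URLS];
  size_t count;
@};

/* Add a learned URL; if the list is full, replace the learned entry
   with the worst quality.  Returns false if nothing can be replaced. */
static bool
add_learned (struct UrlList *l, const char *url, unsigned int quality)
@{
  struct Entry *worst = NULL;

  if (l->count < MAX_URLS)
  @{
    l->entries[l->count++] = (struct Entry) @{ url, quality, false @};
    return true;
  @}
  for (size_t i = 0; i < l->count; i++)
  @{
    struct Entry *e = &l->entries[i];

    if (e->from_config)
      continue;
    if ((NULL == worst) || (e->quality < worst->quality))
      worst = e;
  @}
  if (NULL == worst)
    return false;
  *worst = (struct Entry) @{ url, quality, false @};
  return true;
@}

int
main (void)
@{
  struct UrlList list = @{ .count = 0 @};

  list.entries[list.count++] =
    (struct Entry) @{ "http://config.example/hostlist", 0, true @};
  add_learned (&list, "http://a.example/hostlist", 5);
  add_learned (&list, "http://b.example/hostlist", 2);
  add_learned (&list, "http://c.example/hostlist", 1); /* replaces b */
  for (size_t i = 0; i < list.count; i++)
    printf ("%s (quality %u)\n",
            list.entries[i].url, list.entries[i].quality);
  return 0;
@}
@end example

The configured entry survives even though its score is the lowest,
because entries marked as coming from the configuration are skipped when
looking for a victim.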
5478@node Usage
5479@subsection Usage
5480
5481@c %**end of header
5482
5483To start HOSTLIST by default, it has to be added to the DEFAULTSERVICES
5484section for the ARM services. This is done in the default configuration.
5485
5486For more information on how to configure the HOSTLIST subsystem see the
5487installation handbook:@
5488Configuring the hostlist to bootstrap@
5489Configuring your peer to provide a hostlist
5490
5491@cindex IDENTITY
5492@cindex identity subsystem
5493@node GNUnet's IDENTITY subsystem
5494@section GNUnet's IDENTITY subsystem
5495
5496@c %**end of header
5497
5498Identities of "users" in GNUnet are called egos.
5499Egos can be used as pseudonyms ("fake names") or be tied to an
5500organization (for example, "GNU") or even the actual identity of a human.
5501GNUnet users are expected to have many egos. They might have one tied to
5502their real identity, some for organizations they manage, and more for
5503different domains where they want to operate under a pseudonym.
5504
The IDENTITY service allows users to manage their egos. The identity
service manages the private keys of the egos of the local user; it does not
manage identities of other users (public keys). Public keys for other
users need names to become manageable. GNUnet uses the
@dfn{GNU Name System} (GNS) to give names to other users and manage their
public keys securely. This section is about the IDENTITY service,
which handles the management of private keys.
5512
5513On the network, an ego corresponds to an ECDSA key (over Curve25519,
5514using RFC 6979, as required by GNS). Thus, users can perform actions
5515under a particular ego by using (signing with) a particular private key.
5516Other users can then confirm that the action was really performed by that
5517ego by checking the signature against the respective public key.
5518
5519The IDENTITY service allows users to associate a human-readable name with
5520each ego. This way, users can use names that will remind them of the
5521purpose of a particular ego.
5522The IDENTITY service will store the respective private keys and
5523allows applications to access key information by name.
5524Users can change the name that is locally (!) associated with an ego.
5525Egos can also be deleted, which means that the private key will be removed
5526and it thus will not be possible to perform actions with that ego in the
5527future.
5528
5529Additionally, the IDENTITY subsystem can associate service functions with
5530egos.
5531For example, GNS requires the ego that should be used for the shorten
5532zone. GNS will ask IDENTITY for an ego for the "gns-short" service.
5533The IDENTITY service has a mapping of such service strings to the name of
5534the ego that the user wants to use for this service, for example
5535"my-short-zone-ego".
5536
5537Finally, the IDENTITY API provides access to a special ego, the
5538anonymous ego. The anonymous ego is special in that its private key is not
5539really private, but fixed and known to everyone.
5540Thus, anyone can perform actions as anonymous. This can be useful as with
5541this trick, code does not have to contain a special case to distinguish
5542between anonymous and pseudonymous egos.
5543
5544@menu
5545* libgnunetidentity::
5546* The IDENTITY Client-Service Protocol::
5547@end menu
5548
5549@cindex libgnunetidentity
5550@node libgnunetidentity
5551@subsection libgnunetidentity
5552@c %**end of header
5553
5554
5555@menu
5556* Connecting to the service::
5557* Operations on Egos::
5558* The anonymous Ego::
5559* Convenience API to lookup a single ego::
5560* Associating egos with service functions::
5561@end menu
5562
5563@node Connecting to the service
5564@subsubsection Connecting to the service
5565
5566@c %**end of header
5567
5568First, typical clients connect to the identity service using
5569@code{GNUNET_IDENTITY_connect}. This function takes a callback as a
5570parameter.
5571If the given callback parameter is non-null, it will be invoked to notify
5572the application about the current state of the identities in the system.
5573
5574@itemize @bullet
5575@item First, it will be invoked on all known egos at the time of the
5576connection. For each ego, a handle to the ego and the user's name for the
5577ego will be passed to the callback. Furthermore, a @code{void **} context
5578argument will be provided which gives the client the opportunity to
5579associate some state with the ego.
5580@item Second, the callback will be invoked with NULL for the ego, the name
5581and the context. This signals that the (initial) iteration over all egos
5582has completed.
5583@item Then, the callback will be invoked whenever something changes about
5584an ego.
5585If an ego is renamed, the callback is invoked with the ego handle of the
5586ego that was renamed, and the new name. If an ego is deleted, the callback
5587is invoked with the ego handle and a name of NULL. In the deletion case,
5588the application should also release resources stored in the context.
5589@item When the application destroys the connection to the identity service
5590using @code{GNUNET_IDENTITY_disconnect}, the callback is again invoked
5591with the ego and a name of NULL (equivalent to deletion of the egos).
5592This should again be used to clean up the per-ego context.
5593@end itemize
5594
5595The ego handle passed to the callback remains valid until the callback is
5596invoked with a name of NULL, so it is safe to store a reference to the
5597ego's handle.
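
The callback states described above can be told apart from the ego handle
and the name alone. The following is a minimal, self-contained sketch of
that decision; @code{enum EgoEvent} and @code{classify_ego_event} are
hypothetical names used for illustration and are not part of
libgnunetidentity, and the ego handle is modelled as an opaque pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds; these names are not part of libgnunetidentity. */
enum EgoEvent
{
  EGO_EVENT_END_OF_INITIAL_LIST, /* ego == NULL: initial iteration finished */
  EGO_EVENT_DELETED,             /* ego != NULL, name == NULL: free context */
  EGO_EVENT_ADDED_OR_RENAMED     /* ego != NULL, name != NULL */
};

/* Classify one callback invocation from its ego handle and name. */
static enum EgoEvent
classify_ego_event (const void *ego, const char *name)
{
  if (NULL == ego)
    {
      /* The end-of-list signal carries no name either. */
      assert (NULL == name);
      return EGO_EVENT_END_OF_INITIAL_LIST;
    }
  if (NULL == name)
    return EGO_EVENT_DELETED;
  return EGO_EVENT_ADDED_OR_RENAMED;
}
```

A real callback would run this kind of check first and, in the deleted
case, release whatever it stored in the per-ego context.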
5598
5599@node Operations on Egos
5600@subsubsection Operations on Egos
5601
5602@c %**end of header
5603
5604Given an ego handle, the main operations are to get its associated private
5605key using @code{GNUNET_IDENTITY_ego_get_private_key} or its associated
5606public key using @code{GNUNET_IDENTITY_ego_get_public_key}.
5607
5608The other operations on egos are pretty straightforward.
5609Using @code{GNUNET_IDENTITY_create}, an application can request the
5610creation of an ego by specifying the desired name.
5611The operation will fail if that name is
5612already in use. Using @code{GNUNET_IDENTITY_rename} the name of an
5613existing ego can be changed. Finally, egos can be deleted using
5614@code{GNUNET_IDENTITY_delete}. All of these operations will trigger
5615updates to the callback given to the @code{GNUNET_IDENTITY_connect}
5616function of all applications that are connected with the identity service
5617at the time. @code{GNUNET_IDENTITY_cancel} can be used to cancel the
5618operations before the respective continuations would be called.
5619Cancellation does not guarantee that the operation will not be completed
5620anyway; it only ensures that the continuation is no longer called.
5621
5622@node The anonymous Ego
5623@subsubsection The anonymous Ego
5624
5625@c %**end of header
5626
5627A special way to obtain an ego handle is to call
5628@code{GNUNET_IDENTITY_ego_get_anonymous}, which returns an ego for the
5629"anonymous" user --- anyone knows and can get the private key for this
5630user, so it is suitable for operations that are supposed to be anonymous
5631but require signatures (for example, to avoid a special path in the code).
5632The anonymous ego is always valid and accessing it does not require a
5633connection to the identity service.
5634
5635@node Convenience API to lookup a single ego
5636@subsubsection Convenience API to lookup a single ego
5637
5638
5639As applications commonly only need to look up a single ego, there is a
5640convenience API to do just that. Use @code{GNUNET_IDENTITY_ego_lookup} to
5641look up a single ego by name. Note that this is the user's name for the
5642ego, not the service function. The resulting ego will be returned via a
5643callback and will only be valid during that callback. The operation can
5644be cancelled via @code{GNUNET_IDENTITY_ego_lookup_cancel}
5645(cancellation is only legal before the callback is invoked).
5646
5647@node Associating egos with service functions
5648@subsubsection Associating egos with service functions
5649
5650
5651The @code{GNUNET_IDENTITY_set} function is used to associate a particular
5652ego with a service function. The name used by the service and the ego are
5653given as arguments.
5654Afterwards, the service can use its name to look up the associated ego
5655using @code{GNUNET_IDENTITY_get}.
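
Conceptually, the association is a small table from service function names
to egos. The sketch below models only that idea in memory. All names
(@code{sketch_set}, @code{sketch_get}, @code{struct SketchBinding}) are
hypothetical, the service function name "gns-short" is an assumed example,
and the rule that a second binding replaces the first is an assumption of
this model rather than documented behaviour of the service.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BINDINGS 8

/* Hypothetical in-memory stand-in for the mapping kept by IDENTITY. */
struct SketchBinding
{
  const char *service_function;
  const char *ego_name;
};

static struct SketchBinding bindings[MAX_BINDINGS];
static size_t num_bindings;

/* Bind a service function to an ego name. In this model, rebinding
   replaces the previous ego. Returns 0 on success, -1 if the table is full. */
static int
sketch_set (const char *service_function, const char *ego_name)
{
  assert ((NULL != service_function) && (NULL != ego_name));
  for (size_t i = 0; i < num_bindings; i++)
    if (0 == strcmp (bindings[i].service_function, service_function))
      {
        bindings[i].ego_name = ego_name;
        return 0;
      }
  if (MAX_BINDINGS == num_bindings)
    return -1;
  bindings[num_bindings].service_function = service_function;
  bindings[num_bindings].ego_name = ego_name;
  num_bindings++;
  return 0;
}

/* Resolve a service function to its ego name, or NULL if it is unbound. */
static const char *
sketch_get (const char *service_function)
{
  for (size_t i = 0; i < num_bindings; i++)
    if (0 == strcmp (bindings[i].service_function, service_function))
      return bindings[i].ego_name;
  return NULL;
}
```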
5656
5657@node The IDENTITY Client-Service Protocol
5658@subsection The IDENTITY Client-Service Protocol
5659
5660@c %**end of header
5661
5662A client connecting to the identity service first sends a message with
5663type
5664@code{GNUNET_MESSAGE_TYPE_IDENTITY_START} to the service. After that, the
5665client will receive information about changes to the egos by receiving
5666messages of type @code{GNUNET_MESSAGE_TYPE_IDENTITY_UPDATE}.
5667Those messages contain the private key of the ego and the user's name of
5668the ego (or zero bytes for the name to indicate that the ego was deleted).
5669A special bit @code{end_of_list} is used to indicate the end of the
5670initial iteration over the identity service's egos.
5671
5672The client can trigger changes to the egos by sending @code{CREATE},
5673@code{RENAME} or @code{DELETE} messages.
5674The CREATE message contains the private key and the desired name.@
5675The RENAME message contains the old name and the new name.@
5676The DELETE message only needs to include the name of the ego to delete.@
5677The service responds to each of these messages with a @code{RESULT_CODE}
5678message which indicates success or error of the operation, and possibly
5679a human-readable error message.
5680
5681Finally, the client can bind the name of a service function to an ego by
5682sending a @code{SET_DEFAULT} message with the name of the service function
5683and the private key of the ego.
5684Such bindings can then be resolved using a @code{GET_DEFAULT} message,
5685which includes the name of the service function. The identity service
5686will respond to a GET_DEFAULT request with a SET_DEFAULT message
5687containing the respective information, or with a RESULT_CODE to
5688indicate an error.
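
As an illustration of how a client might interpret an @code{UPDATE}
message, the sketch below looks only at the name length and the
@code{end_of_list} flag. The struct layout is a simplified assumption (the
real message also carries a header, the private key and the name), and the
use of network byte order for both fields is likewise assumed here.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of two fields of an UPDATE message. */
struct UpdateFields
{
  uint16_t name_len;    /* assumed network byte order; 0 means no name */
  uint16_t end_of_list; /* assumed network byte order; non-zero ends list */
};

enum UpdateKind
{
  UPDATE_END_OF_LIST,
  UPDATE_EGO_DELETED,
  UPDATE_EGO_CHANGED
};

/* Decide what an UPDATE message means for the client. */
static enum UpdateKind
classify_update (const struct UpdateFields *u)
{
  assert (NULL != u);
  if (0 != ntohs (u->end_of_list))
    return UPDATE_END_OF_LIST;
  if (0 == ntohs (u->name_len))
    return UPDATE_EGO_DELETED;
  return UPDATE_EGO_CHANGED;
}

/* Helper for experiments: build the fields from host-order values
   and classify them. */
static enum UpdateKind
classify_host_values (uint16_t name_len, int end_of_list)
{
  struct UpdateFields u;

  u.name_len = htons (name_len);
  u.end_of_list = htons (end_of_list ? 1 : 0);
  return classify_update (&u);
}
```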
5689
5690@cindex NAMESTORE
5691@cindex namestore subsystem
5692@node GNUnet's NAMESTORE Subsystem
5693@section GNUnet's NAMESTORE Subsystem
5694
5695The NAMESTORE subsystem provides persistent storage for local GNS zone
5696information. All local GNS zone information is managed by NAMESTORE. It
5697provides both the functionality to administer local GNS information (e.g.
5698delete and add records) as well as to retrieve GNS information (e.g. to
5699list name information in a client).
5700NAMESTORE only manages the persistent storage of zone information
5701belonging to the user running the service: GNS information from other
5702users obtained from the DHT is stored by the NAMECACHE subsystem.
5703
5704NAMESTORE uses a plugin-based database backend to store GNS information
5705with good performance. SQLite, MySQL and PostgreSQL are the supported
5706database backends.
5707NAMESTORE clients interact with the IDENTITY subsystem to obtain
5708cryptographic information about zones based on egos as described with the
5709IDENTITY subsystem, but internally NAMESTORE refers to zones using the
5710ECDSA private key.
5711In addition, it collaborates with the NAMECACHE subsystem: when local
5712information is modified, it stores the zone information in the
5713GNS cache to increase look-up performance for local information.
5714
5715NAMESTORE provides functionality to look up and store records, to iterate
5716over a specific or all zones and to monitor zones for changes. NAMESTORE
5717functionality can be accessed using the NAMESTORE API or the NAMESTORE
5718command line tool.
5719
5720@menu
5721* libgnunetnamestore::
5722@end menu
5723
5724@cindex libgnunetnamestore
5725@node libgnunetnamestore
5726@subsection libgnunetnamestore
5727
5728To interact with NAMESTORE, clients first connect to the NAMESTORE service
5729using @code{GNUNET_NAMESTORE_connect}, passing a configuration handle.
5730As a result they obtain a NAMESTORE handle that they can use for
5731operations, or NULL if the connection failed.
5732
5733To disconnect from NAMESTORE, clients use
5734@code{GNUNET_NAMESTORE_disconnect} and specify the handle to disconnect.
5735
5736NAMESTORE internally uses the ECDSA private key to refer to zones. These
5737private keys can be obtained from the IDENTITY subsystem.
5738Here @emph{egos} can be used to refer to zones, or the default ego
5739assigned to the GNS subsystem can be used to obtain the master zone's
5740private key.
5741
5742
5743@menu
5744* Editing Zone Information::
5745* Iterating Zone Information::
5746* Monitoring Zone Information::
5747@end menu
5748
5749@node Editing Zone Information
5750@subsubsection Editing Zone Information
5751
5752@c %**end of header
5753
5754NAMESTORE provides functions to look up records stored under a label in a
5755zone and to store records under a label in a zone.
5756
5757To store (and delete) records, the client uses the
5758@code{GNUNET_NAMESTORE_records_store} function and has to provide the
5759namestore handle to use, the private key of the zone, the label to store
5760the records under, the records and number of records, plus a callback
5761function.
5762After the operation is performed, NAMESTORE will call the provided
5763callback function with the result: @code{GNUNET_SYSERR} on failure
5764(including timeout, queue drop or failure to validate), @code{GNUNET_NO}
5765if content was already there or not found, and @code{GNUNET_YES} (or
5766another positive value) on success, plus an additional error message.
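
A store continuation therefore has three outcomes to distinguish. The
sketch below shows one way to branch on the result value. It redefines the
three constants under local, hypothetical names (with the values GNUnet
conventionally uses) so that it compiles on its own; a real client would
use the constants from the GNUnet headers instead.

```c
/* Local stand-ins for GNUNET_SYSERR, GNUNET_NO and GNUNET_YES; the names
   are hypothetical and exist only to keep the sketch self-contained. */
enum
{
  SKETCH_SYSERR = -1,
  SKETCH_NO = 0,
  SKETCH_YES = 1
};

enum StoreOutcome
{
  STORE_FAILED,    /* timeout, queue drop or failure to validate */
  STORE_UNCHANGED, /* content was already there or not found */
  STORE_DONE,      /* success */
  STORE_UNKNOWN    /* any other negative value */
};

/* Map the result passed to the store continuation to an outcome. */
static enum StoreOutcome
interpret_store_result (int success)
{
  if (SKETCH_SYSERR == success)
    return STORE_FAILED;
  if (SKETCH_NO == success)
    return STORE_UNCHANGED;
  if (success > 0)
    return STORE_DONE; /* SKETCH_YES or another positive value */
  return STORE_UNKNOWN;
}
```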
5767
5768Records are deleted by using the store command with 0 records to store.
5769It is important to note that records are not merged when records already
5770exist under the label.
5771So a client first has to retrieve the existing records, merge them with
5772the new records and then store the result.
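
The merge step is left to the client. A minimal sketch of such a merge,
under the assumption that a record can be reduced to a type number and a
text value, might look as follows. @code{struct SketchRecord} and
@code{merge_records} are hypothetical and much simpler than the real GNS
record structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical record: a DNS-style type number and a value. */
struct SketchRecord
{
  unsigned int type;
  const char *value;
};

static int
same_record (const struct SketchRecord *a, const struct SketchRecord *b)
{
  return (a->type == b->type) && (0 == strcmp (a->value, b->value));
}

/* Copy `existing` into `out`, then append each entry of `added` that is not
   already present. Returns the merged count, or -1 if `out` is too small. */
static int
merge_records (const struct SketchRecord *existing, size_t n_existing,
               const struct SketchRecord *added, size_t n_added,
               struct SketchRecord *out, size_t out_cap)
{
  size_t n = 0;

  for (size_t i = 0; i < n_existing; i++)
    {
      if (n == out_cap)
        return -1;
      out[n++] = existing[i];
    }
  for (size_t i = 0; i < n_added; i++)
    {
      int duplicate = 0;

      for (size_t j = 0; j < n; j++)
        if (same_record (&out[j], &added[i]))
          duplicate = 1;
      if (duplicate)
        continue;
      if (n == out_cap)
        return -1;
      out[n++] = added[i];
    }
  return (int) n;
}

/* Example: two existing records, one duplicate and one new record added. */
static int
demo_merge (size_t out_cap)
{
  const struct SketchRecord existing[] = { { 1, "192.0.2.1" }, { 16, "hello" } };
  const struct SketchRecord added[] = { { 1, "192.0.2.1" }, { 1, "192.0.2.2" } };
  struct SketchRecord merged[4];

  assert (out_cap <= 4);
  return merge_records (existing, 2, added, 2, merged, out_cap);
}
```

The merged array is what the client would then pass to the store
operation; passing only the new records would replace the existing ones.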
5773
5774To perform a lookup operation, the client uses the
5775@code{GNUNET_NAMESTORE_records_lookup} function. The client has to pass
5776the namestore handle, the private key of the zone and the label, and
5777to provide a callback function which will be called with the result of
5778the lookup operation:
5779the zone for the records, the label, and the records including the
5780number of records included.
5781
5782A special operation is used to set the preferred nickname for a zone.
5783This nickname is stored with the zone and is automatically merged with
5784all labels and records stored in a zone. Here the client uses the
5785@code{GNUNET_NAMESTORE_set_nick} function and passes the private key of
5786the zone, the nickname as a string, and a callback for the result of
5787the operation.
5788
5789@node Iterating Zone Information
5790@subsubsection Iterating Zone Information
5791
5792@c %**end of header
5793
5794A client can iterate over all information in a zone or all zones managed
5795by NAMESTORE.
5796The client uses the @code{GNUNET_NAMESTORE_zone_iteration_start}
5797function and passes the namestore handle, the zone to iterate over and a
5798callback function to call with the result.
5799To iterate over all zones, the client passes NULL for the zone.
5800A @code{GNUNET_NAMESTORE_ZoneIterator} handle is returned to be used to
5801continue iteration.
5802
5803NAMESTORE calls the callback for every result and expects the client to
5804call @code{GNUNET_NAMESTORE_zone_iterator_next} to continue to iterate or
5805@code{GNUNET_NAMESTORE_zone_iteration_stop} to interrupt the iteration.
5806When NAMESTORE has reached the last item, it will call the callback with a
5807NULL value to indicate the end of the iteration.
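
The important property of this interface is that the client drives the
iteration: no further result arrives until the client asks for one. The
sketch below models that control flow with an in-memory array of labels.
The types and functions are hypothetical stand-ins, and the model is
synchronous, whereas the real API delivers results from the scheduler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a zone iterator over an array of labels. */
struct SketchIterator
{
  const char *const *labels;
  size_t count;
  size_t pos;
};

/* State kept by the client across callback invocations. */
struct IterationState
{
  size_t seen;
  int finished;
};

/* Client callback: a NULL label marks the end of the iteration. */
static void
on_result (struct IterationState *st, const char *label)
{
  if (NULL == label)
    {
      st->finished = 1;
      return;
    }
  st->seen++;
}

/* Models the reaction to a "next" request: deliver exactly one result,
   or the end marker once all labels have been delivered. */
static void
sketch_iterator_next (struct SketchIterator *it, struct IterationState *st)
{
  if (it->pos < it->count)
    on_result (st, it->labels[it->pos++]);
  else
    on_result (st, NULL);
}

/* Walk all labels, explicitly requesting each result. */
static size_t
count_labels (const char *const *labels, size_t count)
{
  struct SketchIterator it = { labels, count, 0 };
  struct IterationState st = { 0, 0 };

  while (! st.finished)
    sketch_iterator_next (&it, &st);
  return st.seen;
}

static size_t
demo_count (void)
{
  static const char *const labels[] = { "www", "mail", "home" };

  return count_labels (labels, 3);
}
```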
5808
5809@node Monitoring Zone Information
5810@subsubsection Monitoring Zone Information
5811
5812@c %**end of header
5813
5814Clients can also monitor zones to be notified about changes. Here the
5815client uses the @code{GNUNET_NAMESTORE_zone_monitor_start} function and
5816passes the private key of the zone and a callback function to call
5817with updates for a zone.
5818The client can request to first obtain the existing zone information by
5819iterating over the zone, and can specify a synchronization callback to be
5820called when the client and the namestore are synced.
5821
5822On an update, NAMESTORE will call the callback with the private key of the
5823zone, the label and the records and their number.
5824
5825To stop monitoring, the client calls
5826@code{GNUNET_NAMESTORE_zone_monitor_stop} and passes the handle obtained
5827from the function to start the monitoring.
5828
5829@cindex PEERINFO
5830@cindex peerinfo subsystem
5831@node GNUnet's PEERINFO subsystem
5832@section GNUnet's PEERINFO subsystem
5833
5834@c %**end of header
5835
5836The PEERINFO subsystem is used to store verified (validated) information
5837about known peers in a persistent way. It obtains these addresses, for
5838example, from the TRANSPORT service, which is in charge of address validation.
5839Validation means that the information in the HELLO message is checked by
5840connecting to the addresses and performing a cryptographic handshake to
5841authenticate the peer instance claiming to be reachable at these
5842addresses.
5843PEERINFO does not validate the HELLO messages itself but only stores them
5844and gives them to interested clients.
5845
5846As future work, we are considering moving from storing just HELLO messages
5847to providing a generic persistent per-peer information store.
5848More and more subsystems need to store per-peer information in
5849a persistent way.
5850To avoid duplicating this functionality, we plan to provide a PEERSTORE
5851service for this purpose.
5852
5853@menu
5854* PEERINFO - Features::
5855* PEERINFO - Limitations::
5856* DeveloperPeer Information::
5857* Startup::
5858* Managing Information::
5859* Obtaining Information::
5860* The PEERINFO Client-Service Protocol::
5861* libgnunetpeerinfo::
5862@end menu
5863
5864@node PEERINFO - Features
5865@subsection PEERINFO - Features
5866
5867@c %**end of header
5868
5869@itemize @bullet
5870@item Persistent storage
5871@item Client notification mechanism on update
5872@item Periodic clean up for expired information
5873@item Differentiation between public and friend-only HELLO
5874@end itemize
5875
5876@node PEERINFO - Limitations
5877@subsection PEERINFO - Limitations
5878
5879
5880@itemize @bullet
5881@item Does not perform HELLO validation
5882@end itemize
5883
5884@node DeveloperPeer Information
5885@subsection Peer Information
5886
5887@c %**end of header
5888
5889The PEERINFO subsystem stores this information in the form of HELLO
5890messages, which you can think of as business cards.
5891These HELLO messages contain the public key of a peer and the addresses
5892a peer can be reached under.
5893The addresses include an expiration date describing how long they are
5894valid. This information is updated regularly by the TRANSPORT service by
5895revalidating the address.
5896If an address is expired and not renewed, it can be removed from the
5897HELLO message.
5898
5899Some peers do not want to have their HELLO messages distributed to other
5900peers, especially when GNUnet's friend-to-friend mode is enabled.
5901To prevent this undesired distribution, PEERINFO distinguishes between
5902@emph{public} and @emph{friend-only} HELLO messages.
5903Public HELLO messages can be freely distributed to other (possibly
5904unknown) peers (for example using the hostlist, gossiping, broadcasting),
5905whereas friend-only HELLO messages may not be distributed to other peers.
5906Friend-only HELLO messages have an additional flag @code{friend_only} set
5907internally. For public HELLO messages this flag is not set.
5908PEERINFO does not and cannot check if a client is allowed to obtain a
5909specific HELLO type.
5910
5911The HELLO messages can be managed using the GNUnet HELLO library.
5912Other GNUnet systems can obtain this information from PEERINFO and use
5913it for their purposes.
5914Clients are, for example, the HOSTLIST component, which provides this
5915information to other peers in the form of a hostlist, or the TRANSPORT
5916subsystem, which uses this information to maintain connections to other peers.
5917
5918@node Startup
5919@subsection Startup
5920
5921@c %**end of header
5922
5923During startup the PEERINFO service loads persistent HELLOs from disk.
5924First PEERINFO parses the directory configured in the HOSTS value of the
5925@code{PEERINFO} configuration section to store PEERINFO information.
5926For all files found in this directory valid HELLO messages are extracted.
5927In addition it loads HELLO messages shipped with the GNUnet distribution.
5928These HELLOs are used to simplify network bootstrapping by providing
5929valid peer information with the distribution.
5930The use of these HELLOs can be prevented by setting the
5931@code{USE_INCLUDED_HELLOS} option in the @code{PEERINFO} configuration
5932section to @code{NO}. Files containing invalid information are removed.
5933
5934@node Managing Information
5935@subsection Managing Information
5936
5937@c %**end of header
5938
5939The PEERINFO service stores information about known peers and a single
5940HELLO message for every peer.
5941A peer does not need to have a HELLO if no information is available.
5942HELLO information from different sources, for example a HELLO obtained
5943from a remote HOSTLIST and a second HELLO stored on disk, are combined
5944and merged into one single HELLO message per peer which will be given to
5945clients. During this merge process the HELLO is immediately written to
5946disk to ensure persistence.
5947
5948In addition, PEERINFO periodically scans the directory where information
5949is stored for empty HELLO messages with expired TRANSPORT addresses.
5950This periodic task scans all files in the directory and recreates the
5951HELLO messages it finds.
5952Expired TRANSPORT addresses are removed from the HELLO and if the
5953HELLO does not contain any valid addresses, it is discarded and removed
5954from the disk.
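
The clean-up rule can be summarised as: drop every expired address, and
drop the HELLO itself once no address is left. The sketch below expresses
that rule over a plain array. @code{struct SketchAddress} is a hypothetical
simplification (real HELLO addresses are binary and plugin-specific), and
expiration times are modelled as plain integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified address entry of a HELLO. */
struct SketchAddress
{
  const char *address;
  uint64_t expires_at; /* absolute time in arbitrary units */
};

/* Compact the array in place, keeping only addresses that expire after
   `now`. Returns the number of remaining addresses; 0 means the HELLO
   holds no valid address and would be discarded. */
static size_t
prune_expired (struct SketchAddress *addrs, size_t count, uint64_t now)
{
  size_t kept = 0;

  assert ((NULL != addrs) || (0 == count));
  for (size_t i = 0; i < count; i++)
    if (addrs[i].expires_at > now)
      addrs[kept++] = addrs[i];
  return kept;
}

/* Example HELLO with two addresses expiring at times 100 and 200. */
static size_t
demo_prune (uint64_t now)
{
  struct SketchAddress addrs[] = {
    { "tcp://192.0.2.1:2086", 100 },
    { "udp://192.0.2.1:2086", 200 }
  };

  return prune_expired (addrs, 2, now);
}
```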
5955
5956@node Obtaining Information
5957@subsection Obtaining Information
5958
5959@c %**end of header
5960
5961When a client requests information from PEERINFO, PEERINFO performs a
5962lookup for the respective peer or all peers if desired and transmits this
5963information to the client.
5964The client can specify if friend-only HELLOs have to be included or not
5965and PEERINFO filters the respective HELLO messages before transmitting
5966information.
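
The filtering rule itself is small: a public HELLO is always transmitted,
and a friend-only HELLO is transmitted only if the client asked for
friend-only HELLOs. As a sketch, with a hypothetical function name and
both flags modelled as plain integers:

```c
/* Returns 1 if a HELLO should be transmitted to the requesting client,
   0 if it has to be filtered out. Both arguments are boolean flags. */
static int
hello_visible_to_client (int hello_is_friend_only,
                         int client_includes_friend_only)
{
  return (! hello_is_friend_only) || client_includes_friend_only;
}
```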
5967
5968To notify clients about changes to PEERINFO information, PEERINFO
5969maintains a list of clients interested in these notifications.
5970Such a notification occurs if a HELLO for a peer was updated (due to a
5971merge for example) or a new peer was added.
5972
5973@node The PEERINFO Client-Service Protocol
5974@subsection The PEERINFO Client-Service Protocol
5975
5976@c %**end of header
5977
5978To connect to and disconnect from the PEERINFO service, PEERINFO
5979utilizes the util client/server infrastructure, so no special message
5980types are used here.
5981
5982To add information for a peer, the plain HELLO message is transmitted to
5983the service without any wrapping. All pieces of information required are
5984stored within the HELLO message.
5985The PEERINFO service provides a message handler accepting and processing
5986these HELLO messages.
5987
5988When obtaining PEERINFO information using the iterate functionality,
5989specific messages are used. To obtain information for all peers, a
5990@code{struct ListAllPeersMessage} with message type
5991@code{GNUNET_MESSAGE_TYPE_PEERINFO_GET_ALL} and a flag
5992@code{include_friend_only} to indicate if friend-only HELLO messages should be
5993included is transmitted. If information for a specific peer is required,
5994a @code{struct ListPeerMessage} with
5995@code{GNUNET_MESSAGE_TYPE_PEERINFO_GET} containing the peer identity is
5996used.
5997
5998For both variants the PEERINFO service replies for each HELLO message it
5999wants to transmit with a @code{struct InfoMessage} with type
6000@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO} containing the plain HELLO.
6001The final message is a @code{struct GNUNET_MessageHeader} with type
6002@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO_END}. If the client receives this
6003message, it can proceed with the next request if any is pending.
6004
6005@node libgnunetpeerinfo
6006@subsection libgnunetpeerinfo
6007
6008@c %**end of header
6009
6010The PEERINFO API consists mainly of three different functionalities:
6011
6012@itemize @bullet
6013@item maintaining a connection to the service
6014@item adding new information to the PEERINFO service
6015@item retrieving information from the PEERINFO service
6016@end itemize
6017
6018@menu
6019* Connecting to the PEERINFO Service::
6020* Adding Information to the PEERINFO Service::
6021* Obtaining Information from the PEERINFO Service::
6022@end menu
6023
6024@node Connecting to the PEERINFO Service
6025@subsubsection Connecting to the PEERINFO Service
6026
6027@c %**end of header
6028
6029To connect to the PEERINFO service, the function
6030@code{GNUNET_PEERINFO_connect} is used, taking a configuration handle as
6031an argument. To disconnect from PEERINFO, the function
6032@code{GNUNET_PEERINFO_disconnect} has to be called with the PEERINFO
6033handle returned from the connect function.
6034
6035@node Adding Information to the PEERINFO Service
6036@subsubsection Adding Information to the PEERINFO Service
6037
6038@c %**end of header
6039
6040@code{GNUNET_PEERINFO_add_peer} adds a new peer to the PEERINFO subsystem
6041storage. This function takes the PEERINFO handle, the HELLO
6042message to store and a continuation with a closure to be called with the
6043result of the operation.
6044@code{GNUNET_PEERINFO_add_peer} returns a handle to this operation,
6045which allows cancelling it with the respective cancel function
6046@code{GNUNET_PEERINFO_add_peer_cancel}. To retrieve information from
6047PEERINFO you can iterate over all information stored with PEERINFO or you
6048can tell PEERINFO to notify you when new peer information is available.
6049
6050@node Obtaining Information from the PEERINFO Service
6051@subsubsection Obtaining Information from the PEERINFO Service
6052
6053@c %**end of header
6054
6055To iterate over information in PEERINFO you use
6056@code{GNUNET_PEERINFO_iterate}.
6057This function expects the PEERINFO handle, a flag indicating whether
6058friend-only HELLO messages should be included, a timeout for the
6059operation and a callback with a callback closure to be called
6060for the results.
6061If you want to obtain information for a specific peer, you can specify
6062the peer identity; if this identity is NULL, information for all peers is
6063returned. The function returns a handle that can be used to cancel the
6064operation using @code{GNUNET_PEERINFO_iterate_cancel}.
6065
6066To get notified when peer information changes, you can use
6067@code{GNUNET_PEERINFO_notify}.
6068This function expects a configuration handle and a flag indicating whether
6069friend-only HELLO messages should be included. The PEERINFO service will
6070invoke the callback function for every change.
6071The function returns a handle to cancel notifications
6072with @code{GNUNET_PEERINFO_notify_cancel}.
6073
6074@cindex PEERSTORE subsystem
6075@node GNUnet's PEERSTORE subsystem
6076@section GNUnet's PEERSTORE subsystem
6077
6078@c %**end of header
6079
6080GNUnet's PEERSTORE subsystem offers persistent per-peer storage for other
6081GNUnet subsystems. GNUnet subsystems can use PEERSTORE to persistently
6082store and retrieve arbitrary data.
6083Each data record stored with PEERSTORE contains the following fields:
6084
6085@itemize @bullet
6086@item subsystem: Name of the subsystem responsible for the record.
6087@item peerid: Identity of the peer this record is related to.
6088@item key: a key string identifying the record.
6089@item value: binary record value.
6090@item expiry: record expiry date.
6091@end itemize
6092
6093@menu
6094* Functionality::
6095* Architecture::
6096* libgnunetpeerstore::
6097@end menu
6098
6099@node Functionality
6100@subsection Functionality
6101
6102@c %**end of header
6103
6104Subsystems can store any type of value under a (subsystem, peerid, key)
6105combination. A "replace" flag set during store operations forces the
6106PEERSTORE to replace any old values stored under the same
6107(subsystem, peerid, key) combination with the new value.
6108Additionally, an expiry date is set after which the record is @emph{possibly}
6109deleted by PEERSTORE.
6110
6111Subsystems can iterate over all values stored under any of the following
6112combinations of fields:
6113
6114@itemize @bullet
6115@item (subsystem)
6116@item (subsystem, peerid)
6117@item (subsystem, key)
6118@item (subsystem, peerid, key)
6119@end itemize
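
All four combinations follow one matching rule: the subsystem must always
match, while the peer identity and the key act as wildcards when omitted.
The sketch below illustrates that rule. It is a simplified model in which
peer identities are plain strings (the real API uses a binary peer
identity), and all type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record: only the three lookup fields. */
struct SketchPeerRecord
{
  const char *subsystem;
  const char *peerid;
  const char *key;
};

/* The subsystem is mandatory; a NULL peerid or key matches any value. */
static int
record_matches (const struct SketchPeerRecord *r, const char *subsystem,
                const char *peerid, const char *key)
{
  assert (NULL != subsystem);
  if (0 != strcmp (r->subsystem, subsystem))
    return 0;
  if ((NULL != peerid) && (0 != strcmp (r->peerid, peerid)))
    return 0;
  if ((NULL != key) && (0 != strcmp (r->key, key)))
    return 0;
  return 1;
}

/* Count the matching records in a small example store. */
static size_t
count_matches (const char *subsystem, const char *peerid, const char *key)
{
  static const struct SketchPeerRecord store[] = {
    { "transport", "peerA", "hello" },
    { "transport", "peerB", "hello" },
    { "transport", "peerA", "latency" },
    { "dht", "peerA", "hello" }
  };
  size_t matches = 0;

  for (size_t i = 0; i < sizeof (store) / sizeof (store[0]); i++)
    if (record_matches (&store[i], subsystem, peerid, key))
      matches++;
  return matches;
}
```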
6120
6121Subsystems can also request to be notified about any new values stored
6122under a (subsystem, peerid, key) combination by sending a "watch"
6123request to PEERSTORE.
6124
6125@node Architecture
6126@subsection Architecture
6127
6128@c %**end of header
6129
6130PEERSTORE implements the following components:
6131
6132@itemize @bullet
6133@item PEERSTORE service: Handles store, iterate and watch operations.
6134@item PEERSTORE API: API to be used by other subsystems to communicate and
6135issue commands to the PEERSTORE service.
6136@item PEERSTORE plugins: Handles the persistent storage. At the moment,
6137only an "sqlite" plugin is implemented.
6138@end itemize
6139
6140@cindex libgnunetpeerstore
6141@node libgnunetpeerstore
6142@subsection libgnunetpeerstore
6143
6144@c %**end of header
6145
6146libgnunetpeerstore is the library containing the PEERSTORE API. Subsystems
6147wishing to communicate with the PEERSTORE service use this API to open a
6148connection to PEERSTORE. This is done by calling
6149@code{GNUNET_PEERSTORE_connect} which returns a handle to the newly
6150created connection.
6151This handle has to be used with any further calls to the API.
6152
6153To store a new record, the function @code{GNUNET_PEERSTORE_store} is to
6154be used which requires the record fields and a continuation function that
6155will be called by the API after the STORE request is sent to the
6156PEERSTORE service.
6157Note that calling the continuation function does not mean that the record
6158is successfully stored, only that the STORE request has been successfully
6159sent to the PEERSTORE service.
6160@code{GNUNET_PEERSTORE_store_cancel} can be called to cancel the STORE
6161request only before the continuation function has been called.
6162
6163To iterate over stored records, the function
6164@code{GNUNET_PEERSTORE_iterate} is
6165to be used. @emph{peerid} and @emph{key} can be set to NULL. An iterator
6166callback function will be called with each matching record found and a
6167NULL record at the end to signal the end of result set.
6168@code{GNUNET_PEERSTORE_iterate_cancel} can be used to cancel the ITERATE
6169request before the iterator callback is called with a NULL record.
6170
6171To be notified of new values stored under a (subsystem, peerid, key)
6172combination, the function @code{GNUNET_PEERSTORE_watch} is to be used.
6173This will register the watcher with the PEERSTORE service; any new
6174records matching the given combination will trigger the callback
6175function passed to @code{GNUNET_PEERSTORE_watch}. This continues until
6176@code{GNUNET_PEERSTORE_watch_cancel} is called or the connection to the
6177service is destroyed.
6178
6179After the connection is no longer needed, the function
6180@code{GNUNET_PEERSTORE_disconnect} can be called to disconnect from the
6181PEERSTORE service.
6182Any pending ITERATE or WATCH requests will be destroyed.
6183If the @code{sync_first} flag is set to @code{GNUNET_YES}, the API will
6184delay the disconnection until all pending STORE requests are sent to
6185the PEERSTORE service, otherwise, the pending STORE requests will be
6186destroyed as well.
6187
6188@cindex SET Subsystem
6189@node GNUnet's SET Subsystem
6190@section GNUnet's SET Subsystem
6191
6192@c %**end of header
6193
6194The SET service implements efficient set operations between two peers
6195over a mesh tunnel.
6196Currently, set union and set intersection are the only supported
6197operations. Elements of a set consist of an @emph{element type} and
6198arbitrary binary @emph{data}.
6199The size of an element's data is limited to around 62 KB.
6200
6201@menu
6202* Local Sets::
6203* Set Modifications::
6204* Set Operations::
6205* Result Elements::
6206* libgnunetset::
6207* The SET Client-Service Protocol::
6208* The SET Intersection Peer-to-Peer Protocol::
6209* The SET Union Peer-to-Peer Protocol::
6210@end menu
6211
6212@node Local Sets
6213@subsection Local Sets
6214
6215@c %**end of header
6216
6217Sets created by a local client can be modified and reused for multiple
6218operations. As each set operation requires potentially expensive special
6219auxiliary data to be computed for each element of a set, a set can only
6220participate in one type of set operation (i.e. union or intersection).
6221The type of a set is determined upon its creation.
6222If the elements of a set are needed for an operation of a different
6223type, all of the set's elements must be copied to a new set of appropriate
6224type.
6225
6226@node Set Modifications
6227@subsection Set Modifications
6228
6229@c %**end of header
6230
6231Even when set operations are active, one can add elements to and remove
6232elements from a set.
6233However, these changes will only be visible to operations that have been
6234created after the changes have taken place. That is, every set operation
6235only sees a snapshot of the set from the time the operation was started.
6236This mechanism is @emph{not} implemented by copying the whole set, but by
6237attaching @emph{generation information} to each element and operation.
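
One way to picture generation information is to record, for each element,
the generation in which it was added and the generation in which it was
removed, and to give every operation the generation current at its start.
The sketch below is a simplified model of that idea under these
assumptions; it is not the data structure used by the SET service, and all
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element with generation information. Generations start
   at 1; a removed_gen of 0 means the element was never removed. */
struct SketchElement
{
  int value;
  unsigned int added_gen;
  unsigned int removed_gen;
};

/* An operation started in generation `op_gen` sees an element if it was
   added no later than op_gen and not removed by then. */
static int
element_visible (const struct SketchElement *e, unsigned int op_gen)
{
  assert (e->added_gen >= 1);
  if (e->added_gen > op_gen)
    return 0;
  if ((0 != e->removed_gen) && (e->removed_gen <= op_gen))
    return 0;
  return 1;
}

/* Count the elements of an example set visible to an operation. */
static size_t
visible_count (unsigned int op_gen)
{
  static const struct SketchElement set[] = {
    { 10, 1, 0 }, /* added in generation 1, still present */
    { 20, 1, 3 }, /* added in generation 1, removed in generation 3 */
    { 30, 2, 0 }, /* added in generation 2 */
    { 40, 4, 0 }  /* added in generation 4 */
  };
  size_t visible = 0;

  for (size_t i = 0; i < sizeof (set) / sizeof (set[0]); i++)
    if (element_visible (&set[i], op_gen))
      visible++;
  return visible;
}
```

In this model no element is ever copied: an operation started in
generation 2 keeps seeing element 20 even after it is removed in
generation 3.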
6238
6239@node Set Operations
6240@subsection Set Operations
6241
6242@c %**end of header
6243
6244Set operations can be started in two ways: Either by accepting an
6245operation request from a remote peer, or by requesting a set operation
6246from a remote peer.
6247Set operations are uniquely identified by the involved @emph{peers}, an
6248@emph{application id} and the @emph{operation type}.
6249
6250The client is notified of incoming set operations by @emph{set listeners}.
6251A set listener listens for incoming operations of a specific operation
6252type and application id.
6253Once notified of an incoming set request, the client can accept the set
6254request (providing a local set for the operation) or reject it.
6255
6256@node Result Elements
6257@subsection Result Elements
6258
6259@c %**end of header
6260
6261The SET service has three @emph{result modes} that determine how an
6262operation's result set is delivered to the client:
6263
6264@itemize @bullet
6265@item @strong{Full Result Set.} All elements of the set resulting from
6266the set operation are returned to the client.
6267@item @strong{Added Elements.} Only elements that result from the
6268operation and are not already in the local peer's set are returned.
6269Note that for some operations (like set intersection) this result mode
6270will never return any elements.
6271This can be useful if only the remote peer is actually interested in
6272the result of the set operation.
6273@item @strong{Removed Elements.} Only elements that are in the local
6274peer's initial set but not in the operation's result set are returned.
6275Note that for some operations (like set union) this result mode will
6276never return any elements. This can be useful if only the remote peer is
6277actually interested in the result of the set operation.
6278@end itemize
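
To make the three result modes concrete, the sketch below computes how
many elements the local client would receive for a union or an
intersection of two small integer sets. It is an illustrative model only:
the enum and function names are hypothetical, elements are plain integers
and each array is assumed to contain distinct values.

```c
#include <assert.h>
#include <stddef.h>

enum SketchOp { OP_UNION, OP_INTERSECTION };
enum SketchMode { MODE_FULL, MODE_ADDED, MODE_REMOVED };

static int
contains (const int *set, size_t n, int v)
{
  for (size_t i = 0; i < n; i++)
    if (set[i] == v)
      return 1;
  return 0;
}

/* Number of elements delivered to the local client for the given
   operation and result mode. */
static size_t
delivered_count (enum SketchOp op, enum SketchMode mode,
                 const int *local, size_t n_local,
                 const int *remote, size_t n_remote)
{
  size_t common = 0;
  size_t result_size;
  size_t kept_local; /* local elements that are part of the result */

  for (size_t i = 0; i < n_local; i++)
    if (contains (remote, n_remote, local[i]))
      common++;
  result_size = (OP_UNION == op) ? n_local + n_remote - common : common;
  kept_local = (OP_UNION == op) ? n_local : common;
  switch (mode)
    {
    case MODE_FULL:
      return result_size;
    case MODE_ADDED:
      return result_size - kept_local;
    case MODE_REMOVED:
      return n_local - kept_local;
    }
  return 0;
}

/* Example: local set {1, 2, 3}, remote set {3, 4}. */
static size_t
demo_delivered (enum SketchOp op, enum SketchMode mode)
{
  static const int local[] = { 1, 2, 3 };
  static const int remote[] = { 3, 4 };

  return delivered_count (op, mode, local, 3, remote, 2);
}
```

The example reproduces the two remarks in the list: a union never
delivers removed elements, and an intersection never delivers added
elements.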
6279
6280@cindex libgnunetset
6281@node libgnunetset
6282@subsection libgnunetset
6283
6284@c %**end of header
6285
6286@menu
6287* Sets::
6288* Listeners::
6289* Operations::
6290* Supplying a Set::
6291* The Result Callback::
6292@end menu
6293
6294@node Sets
6295@subsubsection Sets
6296
6297@c %**end of header
6298
6299New sets are created with @code{GNUNET_SET_create}. Both the local peer's
6300configuration (as each set has its own client connection) and the
6301operation type must be specified.
6302The set exists until either the client calls @code{GNUNET_SET_destroy} or
6303the client's connection to the service is disrupted.
6304In the latter case, the client is notified by the return value of
6305functions dealing with sets. This return value must always be checked.
6306
6307Elements are added and removed with @code{GNUNET_SET_add_element} and
6308@code{GNUNET_SET_remove_element}.
6309
6310@node Listeners
6311@subsubsection Listeners
6312
6313@c %**end of header
6314
6315Listeners are created with @code{GNUNET_SET_listen}. Each time a
6316remote peer suggests a set operation with an application id and operation
6317type matching a listener, the listener's callback is invoked.
6318The client then must synchronously call either @code{GNUNET_SET_accept}
6319or @code{GNUNET_SET_reject}. Note that the operation will not be started
6320until the client calls @code{GNUNET_SET_commit}
6321(see Section "Supplying a Set").
6322
6323@node Operations
6324@subsubsection Operations
6325
6326@c %**end of header
6327
6328Operations to be initiated by the local peer are created with
6329@code{GNUNET_SET_prepare}. Note that the operation will not be started
6330until the client calls @code{GNUNET_SET_commit}
6331(see Section "Supplying a Set").
6332
6333@node Supplying a Set
6334@subsubsection Supplying a Set
6335
6336@c %**end of header
6337
6338To create symmetry between the two ways of starting a set operation
6339(accepting and initiating it), the operation handles returned by
6340@code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare} do not yet have a
6341set to operate on, thus they cannot do any work yet.
6342
6343The client must call @code{GNUNET_SET_commit} to specify a set to use for
6344an operation. @code{GNUNET_SET_commit} may only be called once per set
6345operation.
6346
6347@node The Result Callback
6348@subsubsection The Result Callback
6349
6350@c %**end of header
6351
6352Clients must specify both a result mode and a result callback with
6353@code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare}. The result
6354callback is invoked with a status indicating either that an element was
6355received, or that the operation failed or succeeded.
6356The interpretation of the received element depends on the result mode.
6357The callback needs to know which result mode it is used in, as the
6358arguments do not indicate if an element is part of the full result set,
6359or if it is in the difference between the original set and the final set.
6360
6361@node The SET Client-Service Protocol
6362@subsection The SET Client-Service Protocol
6363
6364@c %**end of header
6365
6366@menu
6367* Creating Sets::
6368* Listeners2::
6369* Initiating Operations::
6370* Modifying Sets::
6371* Results and Operation Status::
6372* Iterating Sets::
6373@end menu
6374
6375@node Creating Sets
6376@subsubsection Creating Sets
6377
6378@c %**end of header
6379
6380For each set of a client, there exists a client connection to the service.
6381Sets are created by sending the @code{GNUNET_SERVICE_SET_CREATE} message
6382over a new client connection. Multiple operations for one set are
6383multiplexed over one client connection, using a request id supplied by
6384the client.
6385
6386@node Listeners2
6387@subsubsection Listeners
6388
6389@c %**end of header
6390
6391Each listener also requires a separate client connection. By sending the
6392@code{GNUNET_SERVICE_SET_LISTEN} message, the client notifies the service
6393of the application id and operation type it is interested in. A client
6394rejects an incoming request by sending @code{GNUNET_SERVICE_SET_REJECT}
6395on the listener's client connection.
6396In contrast, when accepting an incoming request, a
6397@code{GNUNET_SERVICE_SET_ACCEPT} message must be sent over the client
6398connection of the set that is supplied for the set operation.
6399
6400@node Initiating Operations
6401@subsubsection Initiating Operations
6402
6403@c %**end of header
6404
6405Operations with remote peers are initiated by sending a
6406@code{GNUNET_SERVICE_SET_EVALUATE} message to the service. The client
6407connection over which this message is sent determines the set to use.
6408
6409@node Modifying Sets
6410@subsubsection Modifying Sets
6411
6412@c %**end of header
6413
6414Sets are modified with the @code{GNUNET_SERVICE_SET_ADD} and
6415@code{GNUNET_SERVICE_SET_REMOVE} messages.
6416
6417
6418@c %@menu
6419@c %* Results and Operation Status::
6420@c %* Iterating Sets::
6421@c %@end menu
6422
6423@node Results and Operation Status
6424@subsubsection Results and Operation Status
6425@c %**end of header
6426
6427The service notifies the client of result elements and success/failure of
6428a set operation with the @code{GNUNET_SERVICE_SET_RESULT} message.
6429
6430@node Iterating Sets
6431@subsubsection Iterating Sets
6432
6433@c %**end of header
6434
6435All elements of a set can be requested by sending
6436@code{GNUNET_SERVICE_SET_ITER_REQUEST}. The server responds with
6437@code{GNUNET_SERVICE_SET_ITER_ELEMENT} and eventually terminates the
6438iteration with @code{GNUNET_SERVICE_SET_ITER_DONE}.
6439After each received element, the client
6440must send @code{GNUNET_SERVICE_SET_ITER_ACK}. Note that only one set
6441iteration may be active for a set at any given time.
6442
6443@node The SET Intersection Peer-to-Peer Protocol
6444@subsection The SET Intersection Peer-to-Peer Protocol
6445
6446@c %**end of header
6447
6448The intersection protocol operates over CADET and starts with a
6449@code{GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST} being sent by the peer
6450initiating the operation to the peer listening for inbound requests.
6451It includes the number of elements of the initiating peer, which is used
6452to decide which side will send a Bloom filter first.
6453
6454The listening peer checks if the operation type and application
6455identifier are acceptable for its current state.
6456If not, it responds with a @code{GNUNET_MESSAGE_TYPE_SET_RESULT} and a status of
6457@code{GNUNET_SET_STATUS_FAILURE} (and terminates the CADET channel).
6458
6459If the application accepts the request, the listener sends back a
6460@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO} if it has
6461more elements in the set than the initiator.
6462Otherwise, it immediately starts with the Bloom filter exchange.
6463If the initiator receives a
6464@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO} response,
6465it begins the Bloom filter exchange, unless the set size is indicated to
6466be zero, in which case the intersection is considered finished after
6467just the initial handshake.
6468
6469
6470@menu
6471* The Bloom filter exchange::
6472* Salt::
6473@end menu
6474
6475@node The Bloom filter exchange
6476@subsubsection The Bloom filter exchange
6477
6478@c %**end of header
6479
6480In this phase, each peer transmits a Bloom filter over the remaining
6481keys of the local set to the other peer using a
6482@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_BF} message. This
6483message additionally includes the number of elements left in the sender's
6484set, as well as the XOR over all of the keys in that set.
6485
6486The number of bits 'k' set per element in the Bloom filter is calculated
6487based on the relative size of the two sets.
6488Furthermore, the size of the Bloom filter is calculated based on 'k' and
6489the number of elements in the set to maximize the amount of data filtered
6490per byte transmitted on the wire (while avoiding an excessively high
6491number of iterations).
6492
6493The receiver of the message removes all elements from its local set that
6494do not pass the Bloom filter test.
6495It then checks if the set size of the sender and the XOR over the keys
6496match what is left of its own set. If they do, it sends a
6497@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_DONE} back to indicate
6498that the latest set is the final result.
6499Otherwise, the receiver starts another Bloom filter exchange, except
6500this time as the sender.
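
The filtering step on the receiving side can be sketched as follows.
This is an illustration under simplifying assumptions, not the code used
by the SET service: keys are 64-bit integers instead of hash codes, the
filter size @code{BF_BITS} and the bits per element @code{BF_K} are
fixed constants rather than being derived from the set sizes, and
@code{mix} is an arbitrary toy hash.

@verbatim
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BF_BITS 256u  /* filter size in bits (illustrative) */
#define BF_K 3u       /* bits set per element (illustrative) */

struct bloom
{
  uint8_t bits[BF_BITS / 8];
};

/* Toy 64-bit mixing function; any good hash would do. */
static uint64_t
mix (uint64_t x)
{
  x ^= x >> 30;
  x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 27;
  x *= 0x94d049bb133111ebULL;
  x ^= x >> 31;
  return x;
}

/* Position of the i-th bit for 'key' under the given salt. */
static uint32_t
bit_index (uint64_t key, uint32_t salt, unsigned int i)
{
  return (uint32_t) (mix (mix (key ^ salt) + i + 1) % BF_BITS);
}

/* Sender: build a filter over all keys remaining in the local set. */
static void
bloom_build (struct bloom *bf, const uint64_t *keys, size_t n,
             uint32_t salt)
{
  memset (bf, 0, sizeof (*bf));
  for (size_t j = 0; j < n; j++)
    for (unsigned int i = 0; i < BF_K; i++)
    {
      uint32_t b = bit_index (keys[j], salt, i);

      bf->bits[b / 8] |= (uint8_t) (1u << (b % 8));
    }
}

/* True if 'key' may be in the sender's set, false if it is not. */
static bool
bloom_test (const struct bloom *bf, uint64_t key, uint32_t salt)
{
  for (unsigned int i = 0; i < BF_K; i++)
  {
    uint32_t b = bit_index (key, salt, i);

    if (0 == (bf->bits[b / 8] & (1u << (b % 8))))
      return false;
  }
  return true;
}

/* Receiver: drop every key that fails the test, keeping the order of
   the survivors.  Returns the number of keys left. */
static size_t
filter_local_set (uint64_t *keys, size_t n, const struct bloom *bf,
                  uint32_t salt)
{
  size_t kept = 0;

  for (size_t j = 0; j < n; j++)
    if (bloom_test (bf, keys[j], salt))
      keys[kept++] = keys[j];
  return kept;
}
@end verbatim

Keys present in both sets always survive the filter. Keys present only
at the receiver usually fail the test, but may survive as false
positives, which is why further rounds can be needed.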
6501
6502@node Salt
6503@subsubsection Salt
6504
6505@c %**end of header
6506
6507Bloom filter operations are probabilistic: with some non-zero probability
6508the test may incorrectly say an element is in the set, even though it is
6509not.
6510
6511To mitigate this problem, the intersection protocol iterates exchanging
6512Bloom filters using a different random 32-bit salt in each iteration (the
6513salt is also included in the message).
6514With different salts, the Bloom filter test may fail for different elements.
6515Combining the results of the iterations, the probability of failure
6516quickly approaches zero.
6517
6518The iterations terminate once both peers have established that they have
6519sets of the same size, and where the XOR over all keys computes the same
6520512-bit value (leaving a failure probability of @math{2^{-511}}).
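
The termination test amounts to comparing an element count and a 512-bit
XOR checksum. A minimal sketch, assuming keys are represented as eight
64-bit words (the type @code{key512} and both function names are
hypothetical):

@verbatim
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A 512-bit key, stored as 8 x 64 bits. */
struct key512
{
  uint64_t w[8];
};

/* XOR over all keys of a set; the result does not depend on order. */
static struct key512
xor_keys (const struct key512 *keys, size_t n)
{
  struct key512 acc;

  memset (&acc, 0, sizeof (acc));
  for (size_t j = 0; j < n; j++)
    for (size_t i = 0; i < 8; i++)
      acc.w[i] ^= keys[j].w[i];
  return acc;
}

/* The exchange may stop once both sides report the same number of
   elements and the same XOR over their remaining keys. */
static bool
sets_look_equal (size_t n_a, const struct key512 *xor_a,
                 size_t n_b, const struct key512 *xor_b)
{
  return (n_a == n_b) &&
         (0 == memcmp (xor_a, xor_b, sizeof (*xor_a)));
}
@end verbatim

Because XOR is commutative, the two peers do not need to agree on any
ordering of their elements to compute matching checksums.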
6521
6522@node The SET Union Peer-to-Peer Protocol
6523@subsection The SET Union Peer-to-Peer Protocol
6524
6525@c %**end of header
6526
6527The SET union protocol is based on Eppstein's efficient set reconciliation
6528without prior context. You should read this paper first if you want to
6529understand the protocol.
6530
6531The union protocol operates over CADET and starts with a
6532@code{GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST} being sent by the peer
6533initiating the operation to the peer listening for inbound requests.
6534It includes the number of elements of the initiating peer, which is
6535currently not used.
6536
6537The listening peer checks if the operation type and application
6538identifier are acceptable for its current state. If not, it responds with
6539a @code{GNUNET_MESSAGE_TYPE_SET_RESULT} and a status of
6540@code{GNUNET_SET_STATUS_FAILURE} (and terminates the CADET channel).
6541
6542If the application accepts the request, the listener sends back a strata estimator
6543using a message of type @code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_SE}. The
6544initiator evaluates the strata estimator and initiates the exchange of
6545invertible Bloom filters, sending a @code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF}.
6546
6547During the IBF exchange, if the receiver cannot invert the Bloom filter or
6548detects a cycle, it sends a larger IBF in response (up to a defined
6549maximum limit; if that limit is reached, the operation fails).
6550Elements decoded while processing the IBF are transmitted to the other
6551peer using @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS}, or requested from the
6552other peer using @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS} messages,
6553depending on the sign observed during decoding of the IBF.
6554Peers respond to a @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS} message
6555with the respective element in a @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS}
6556message. If the IBF fully decodes, the peer responds with a
6557@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_DONE} message instead of another
6558@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF}.
6559
6560All Bloom filter operations use a salt to mingle keys before hashing them
6561into buckets, such that future iterations have a fresh chance of
6562succeeding if they failed due to collisions before.
6563
6564@cindex STATISTICS subsystem
6565@node GNUnet's STATISTICS subsystem
6566@section GNUnet's STATISTICS subsystem
6567
6568@c %**end of header
6569
6570In GNUnet, the STATISTICS subsystem offers a central place for all
6571subsystems to publish unsigned 64-bit integer run-time statistics.
6572Keeping this information centrally means that there is a unified way for
6573the user to obtain data on all subsystems, and individual subsystems do
6574not have to always include a custom data export method for performance
6575metrics and other statistics. For example, the TRANSPORT system uses
6576STATISTICS to update information about the number of directly connected
6577peers and the bandwidth that has been consumed by the various plugins.
6578This information is valuable for diagnosing connectivity and performance
6579issues.
6580
6581Following the GNUnet service architecture, the STATISTICS subsystem is
6582divided into an API which is exposed through the header
6583@strong{gnunet_statistics_service.h} and the STATISTICS service
6584@strong{gnunet-service-statistics}. The @strong{gnunet-statistics}
6585command-line tool can be used to obtain (and change) information about
6586the values stored by the STATISTICS service. The STATISTICS service does
6587not communicate with other peers.
6588
6589Data is stored in the STATISTICS service in the form of tuples
6590@strong{(subsystem, name, value, persistence)}. The subsystem determines
6591which GNUnet subsystem the data belongs to. The name is the label
6592under which the value is stored. It uniquely identifies the record
6593among the other records belonging to the same subsystem.
6594In some parts of the code, the pair @strong{(subsystem, name)} is called
6595a @strong{statistic} as it identifies the values stored in the STATISTICS
6596service. The persistence flag determines if the record has to be preserved
6597across service restarts. A record is said to be persistent if this flag
6598is set for it; if not, the record is treated as a non-persistent record
6599and it is lost after service restart. Persistent records are written to
6600the file @strong{statistics.data} before shutdown and read back from it
6601upon startup. The file is located in the HOME directory of the peer.
6602
6603An anomaly of the STATISTICS service is that it does not terminate
6604immediately upon receiving a shutdown signal if it has any clients
6605connected to it. It waits for all the clients that are not monitors to
6606close their connections before terminating itself.
6607This is to prevent the loss of data during peer shutdown --- delaying the
6608STATISTICS service shutdown helps other services to store important data
6609to STATISTICS during shutdown.
6610
6611@menu
6612* libgnunetstatistics::
6613* The STATISTICS Client-Service Protocol::
6614@end menu
6615
6616@cindex libgnunetstatistics
6617@node libgnunetstatistics
6618@subsection libgnunetstatistics
6619
6620@c %**end of header
6621
6622@strong{libgnunetstatistics} is the library containing the API for the
6623STATISTICS subsystem. Any process that wants to use STATISTICS should use
6624this API to open a connection to the STATISTICS service.
6625This is done by calling the function @code{GNUNET_STATISTICS_create()}.
6626This function takes the name of the subsystem that wants to use STATISTICS
6627and a configuration.
6628All values written to STATISTICS with this connection will be placed in
6629the section corresponding to the given subsystem's name.
6630The connection to STATISTICS can be destroyed with the function
6631@code{GNUNET_STATISTICS_destroy()}. This function allows for the
6632connection to be destroyed immediately or upon transferring all
6633pending write requests to the service.
6634
6635Note: The STATISTICS subsystem can be disabled by setting @code{DISABLE = YES}
6636under the @code{[STATISTICS]} section in the configuration. With such a
6637configuration all calls to @code{GNUNET_STATISTICS_create()} return
6638@code{NULL} as the STATISTICS subsystem is unavailable and no other
6639functions from the API can be used.
6640
6641
6642@menu
6643* Statistics retrieval::
6644* Setting statistics and updating them::
6645* Watches::
6646@end menu
6647
6648@node Statistics retrieval
6649@subsubsection Statistics retrieval
6650
6651@c %**end of header
6652
6653Once a connection to the statistics service is obtained, information
6654about any other system which uses statistics can be retrieved with the
6655function @code{GNUNET_STATISTICS_get()}.
6656This function takes the connection handle, the name of the subsystem
6657whose information we are interested in (a @code{NULL} value will
6658retrieve information of all available subsystems using STATISTICS), the
6659name of the statistic we are interested in (a @code{NULL} value will
6660retrieve all available statistics), a continuation callback which is
6661called when all of the requested information is retrieved, an iterator
6662callback which is called for each parameter in the retrieved information
6663and a closure for the aforementioned callbacks. The library then invokes
6664the iterator callback for each value matching the request.
6665
6666A call to @code{GNUNET_STATISTICS_get()} is asynchronous and can be
6667canceled with the function @code{GNUNET_STATISTICS_get_cancel()}.
6668This is helpful when retrieving statistics takes too long and especially
6669when we want to shut down and clean up everything.
6670
6671@node Setting statistics and updating them
6672@subsubsection Setting statistics and updating them
6673
6674@c %**end of header
6675
6676So far we have seen how to retrieve statistics; here we will learn how we
6677can set statistics and update them so that other subsystems can retrieve
6678them.
6679
6680A new statistic can be set using the function
6681@code{GNUNET_STATISTICS_set()}.
6682This function takes the name of the statistic and its value and a flag to
6683make the statistic persistent.
6684The value of the statistic should be of the type @code{uint64_t}.
6685The function does not take the name of the subsystem; it is determined
6686from the previous @code{GNUNET_STATISTICS_create()} invocation. If
6687the given statistic is already present, its value is overwritten.
6688
6689An existing statistic can be updated, i.e., its value can be increased or
6690decreased by an amount with the function
6691@code{GNUNET_STATISTICS_update()}.
6692The parameters to this function are similar to
6693@code{GNUNET_STATISTICS_set()}, except that it takes the amount to be
6694changed as a type @code{int64_t} instead of the value.
6695
6696The library will combine multiple set or update operations into one
6697message if the client performs requests at a rate that is faster than the
6698available IPC with the STATISTICS service. Thus, the client does not have
6699to worry about sending requests too quickly.
6700
6701@node Watches
6702@subsubsection Watches
6703
6704@c %**end of header
6705
6706An interesting feature of STATISTICS is that it serves notifications
6707whenever a statistic of interest is modified.
6708This is achieved by registering a watch through the function
6709@code{GNUNET_STATISTICS_watch()}.
6710The parameters of this function are similar to those of
6711@code{GNUNET_STATISTICS_get()}.
6712Changes to the respective statistic's value will then cause the given
6713iterator callback to be called.
6714Note: A watch can only be registered for a specific statistic. Hence
6715the subsystem name and the parameter name cannot be @code{NULL} in a
6716call to @code{GNUNET_STATISTICS_watch()}.
6717
6718A registered watch will keep notifying any value changes until
6719@code{GNUNET_STATISTICS_watch_cancel()} is called with the same
6720parameters that were used for registering the watch.
6721
6722@node The STATISTICS Client-Service Protocol
6723@subsection The STATISTICS Client-Service Protocol
6724@c %**end of header
6725
6726
6727@menu
6728* Statistics retrieval2::
6729* Setting and updating statistics::
6730* Watching for updates::
6731@end menu
6732
6733@node Statistics retrieval2
6734@subsubsection Statistics retrieval
6735
6736@c %**end of header
6737
6738To retrieve statistics, the client transmits a message of type
6739@code{GNUNET_MESSAGE_TYPE_STATISTICS_GET} containing the given subsystem
6740name and statistic parameter to the STATISTICS service.
6741The service responds with a message of type
6742@code{GNUNET_MESSAGE_TYPE_STATISTICS_VALUE} for each of the statistics
6743parameters that match the client request. The end of
6744information retrieved is signaled by the service by sending a message of
6745type @code{GNUNET_MESSAGE_TYPE_STATISTICS_END}.
6746
6747@node Setting and updating statistics
6748@subsubsection Setting and updating statistics
6749
6750@c %**end of header
6751
6752The subsystem name, parameter name, its value and the persistence flag are
6753communicated to the service through the message
6754@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}.
6755
6756When the service receives a message of type
6757@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}, it retrieves the subsystem
6758name and checks for a statistic parameter matching the name given in
6759the message.
6760If a statistic parameter is found, the value is overwritten by the new
6761value from the message; if not found then a new statistic parameter is
6762created with the given name and value.
6763
6764In addition to just setting an absolute value, it is possible to perform a
6765relative update by sending a message of type
6766@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET} with an update flag
6767(@code{GNUNET_STATISTICS_SETFLAG_RELATIVE}) signifying that the value in
6768the message should be treated as an update value.
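
The difference between an absolute set and a relative update can be
sketched in a few lines. The function name @code{apply_set_message} is
hypothetical. The sketch also rests on two assumptions that the text
above does not state: that a relative change travels as a two's
complement number in the unsigned 64-bit value field, and that a
decrease larger than the current value leaves the statistic at zero.

@verbatim
#include <stdbool.h>
#include <stdint.h>

/* Compute the new value of a statistic after a SET message.
   'current' is the stored value, 'msg_value' the value field of the
   message and 'relative' reflects the relative-update flag. */
static uint64_t
apply_set_message (uint64_t current, uint64_t msg_value, bool relative)
{
  if (! relative)
    return msg_value;                /* absolute set: overwrite */
  if (0 != (msg_value >> 63))
  {
    /* Top bit set: treat as a negative delta (assumption). */
    uint64_t decrease = ~msg_value + 1;   /* magnitude of the delta */

    /* Clamp at zero instead of wrapping around (assumption). */
    return (current < decrease) ? 0 : current - decrease;
  }
  return current + msg_value;        /* non-negative delta */
}
@end verbatim

For example, applying a relative value of 5 to a statistic that holds 10
yields 15, while an absolute set with the same value yields 5.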
6769
6770@node Watching for updates
6771@subsubsection Watching for updates
6772
6773@c %**end of header
6774
6775@code{GNUNET_STATISTICS_watch()} registers the watch at the service by sending a message of
6776type @code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH}. The service then sends
6777notifications through messages of type
6778@code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH_VALUE} whenever the statistic
6779parameter's value is changed.
6780
6781@cindex DHT
6782@cindex Distributed Hash Table
6783@node GNUnet's Distributed Hash Table (DHT)
6784@section GNUnet's Distributed Hash Table (DHT)
6785
6786@c %**end of header
6787
6788GNUnet includes a generic distributed hash table that can be used by
6789developers building P2P applications in the framework.
6790This section documents high-level features and how developers are
6791expected to use the DHT.
6792We have a research paper detailing how the DHT works.
6793Also, Nate's thesis includes a detailed description and performance
6794analysis (in chapter 6).
6795
6796Key features of GNUnet's DHT include:
6797
6798@itemize @bullet
6799@item stores key-value pairs with values up to (approximately) 63k in size
6800@item works with many underlay network topologies (small-world, random
6801graph), underlay does not need to be a full mesh / clique
6802@item support for extended queries (more than just a simple 'key'),
6803filtering duplicate replies within the network (bloomfilter) and content
6804validation (for details, please read the subsection on the block library)
6805@item can (optionally) return paths taken by the PUT and GET operations
6806to the application
6807@item provides content replication to handle churn
6808@end itemize
6809
6810GNUnet's DHT is randomized and unreliable. Unreliable means that there is
6811no strict guarantee that a value stored in the DHT is always
6812found --- values are only found with high probability.
6813While this is somewhat true in all P2P DHTs, GNUnet developers should be
6814particularly wary of this fact (this will help you write secure,
6815fault-tolerant code). Thus, when writing any application using the DHT,
6816you should always consider the possibility that a value stored in the
6817DHT by you or some other peer might simply not be returned, or returned
6818with a significant delay.
6819Your application logic must be written to tolerate this (naturally, some
6820loss of performance or quality of service is expected in this case).
6821
6822@menu
6823* Block library and plugins::
6824* libgnunetdht::
6825* The DHT Client-Service Protocol::
6826* The DHT Peer-to-Peer Protocol::
6827@end menu
6828
6829@node Block library and plugins
6830@subsection Block library and plugins
6831
6832@c %**end of header
6833
6834@menu
6835* What is a Block?::
6836* The API of libgnunetblock::
6837* Queries::
6838* Sample Code::
6839* Conclusion2::
6840@end menu
6841
6842@node What is a Block?
6843@subsubsection What is a Block?
6844
6845@c %**end of header
6846
6847Blocks are small (< 63k) pieces of data stored under a key (struct
6848GNUNET_HashCode). Blocks have a type (enum GNUNET_BlockType) which defines
6849their data format. Blocks are used in GNUnet as units of static data
6850exchanged between peers and stored (or cached) locally.
6851Uses of blocks include file-sharing (the files are broken up into blocks),
6852the VPN (DNS information is stored in blocks) and the DHT (all
6853information in the DHT and meta-information for the maintenance of the
6854DHT are both stored using blocks).
6855The block subsystem provides a few common functions that must be
6856available for any type of block.
6857
6858@cindex libgnunetblock API
6859@node The API of libgnunetblock
6860@subsubsection The API of libgnunetblock
6861
6862@c %**end of header
6863
6864The block library requires for each (family of) block type(s) a block
6865plugin (implementing @file{gnunet_block_plugin.h}) that provides basic
6866functions that are needed by the DHT (and possibly other subsystems) to
6867manage the block.
6868These block plugins are typically implemented within their respective
6869subsystems.
6870The main block library is then used to locate, load and query the
6871appropriate block plugin.
6872Which plugin is appropriate is determined by the block type (which is
6873just a 32-bit integer). Block plugins contain code that specifies which
6874block types are supported by a given plugin. The block library loads all
6875block plugins that are installed at the local peer and forwards the
6876application request to the respective plugin.
6877
6878The central functions of the block APIs (plugin and main library) are to
6879allow the mapping of blocks to their respective key (if possible) and the
6880ability to check that a block is well-formed and matches a given
6881request (again, if possible).
6882This way, GNUnet can avoid storing invalid blocks, storing blocks under
6883the wrong key and forwarding blocks in response to a query that they do
6884not answer.
6885
6886One key function of block plugins is that they allow GNUnet to detect
6887duplicate replies (via the Bloom filter). All plugins MUST support
6888detecting duplicate replies (by adding the current response to the
6889Bloom filter and rejecting it if it is encountered again).
6890If a plugin fails to do this, responses may loop in the network.
6891
6892@node Queries
6893@subsubsection Queries
6894@c %**end of header
6895
6896The query format for any block in GNUnet consists of four main components.
6897First, the type of the desired block must be specified. Second, the query
6898must contain a hash code. The hash code is used for lookups in hash
6899tables and databases and need not be unique for the block (however, if
6900possible a unique hash should be used as this would be best for
6901performance).
6902Third, an optional Bloom filter can be specified to exclude known results;
6903replies that hash to the bits set in the Bloom filter are considered
6904invalid. False-positives can be eliminated by sending the same query
6905again with a different Bloom filter mutator value, which parameterizes
6906the hash function that is used.
6907Finally, an optional application-specific "eXtended query" (xquery) can
6908be specified to further constrain the results. It is entirely up to
6909the type-specific plugin to determine whether or not a given block
6910matches a query (type, hash, Bloom filter, and xquery).
6911Naturally, not all xqueries are valid and some types of blocks may not
6912support Bloom filters either, so the plugin also needs to check if the
6913query is valid in the first place.
6914
6915Depending on the results from the plugin, the DHT will then discard the
6916(invalid) query, forward the query, discard the (invalid) reply, cache the
6917(valid) reply, and/or forward the (valid and non-duplicate) reply.
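
To make the four query components concrete, here is a toy evaluation
function. It is a sketch and not the interface of
@file{gnunet_block_plugin.h}: all type and function names are
hypothetical, keys are shortened to 64 bits, an exact list of known
reply hashes stands in for the Bloom filter, and the xquery rule (the
reply payload must start with the xquery string) is invented for the
example.

@verbatim
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum eval_result
{
  EVAL_QUERY_INVALID,    /* discard the query */
  EVAL_REPLY_INVALID,    /* discard the reply */
  EVAL_REPLY_DUPLICATE,  /* valid, but already known to the requester */
  EVAL_REPLY_OK          /* valid and new: cache and forward */
};

struct query
{
  uint32_t type;         /* 1. desired block type */
  uint64_t key;          /* 2. hash code used for the lookup */
  const uint64_t *seen;  /* 3. known replies (Bloom filter stand-in) */
  size_t n_seen;
  const char *xquery;    /* 4. optional extended query, or NULL */
};

struct reply
{
  uint32_t type;
  uint64_t key;
  uint64_t content_hash; /* hash over the reply itself */
  const char *payload;
};

static enum eval_result
evaluate (const struct query *q, const struct reply *r)
{
  /* Toy rule: an xquery, if given, must not be empty. */
  if ((NULL != q->xquery) && ('\0' == q->xquery[0]))
    return EVAL_QUERY_INVALID;
  /* The reply must be of the requested type and stored under the key. */
  if ((r->type != q->type) || (r->key != q->key))
    return EVAL_REPLY_INVALID;
  /* Toy rule: the payload must start with the xquery string. */
  if ((NULL != q->xquery) &&
      (0 != strncmp (r->payload, q->xquery, strlen (q->xquery))))
    return EVAL_REPLY_INVALID;
  /* Filter results the requester already has. */
  for (size_t i = 0; i < q->n_seen; i++)
    if (q->seen[i] == r->content_hash)
      return EVAL_REPLY_DUPLICATE;
  return EVAL_REPLY_OK;
}
@end verbatim

A real plugin would use a Bloom filter rather than an exact list, so a
new reply could occasionally be misclassified as a duplicate, which is
what the mutator value mentioned above is meant to address.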
6918
6919@node Sample Code
6920@subsubsection Sample Code
6921
6922@c %**end of header
6923
6924The source code in @strong{plugin_block_test.c} is a good starting point
6925for new block plugins --- it does the minimal work by implementing a
6926plugin that performs no validation at all.
6927The respective @strong{Makefile.am} shows how to build and install a
6928block plugin.
6929
6930@node Conclusion2
6931@subsubsection Conclusion
6932
6933@c %**end of header
6934
6935In conclusion, GNUnet subsystems that want to use the DHT need to define a
6936block format and write a plugin to match queries and replies. For testing,
6937the "@code{GNUNET_BLOCK_TYPE_TEST}" block type can be used; it accepts
6938any query as valid and any reply as matching any query.
6939This type is also used for the DHT command line tools.
6940However, it should NOT be used for normal applications due to the lack
6941of error checking that results from this primitive implementation.
6942
6943@cindex libgnunetdht
6944@node libgnunetdht
6945@subsection libgnunetdht
6946
6947@c %**end of header
6948
6949The DHT API itself is pretty simple and offers the usual GET and PUT
6950functions that work as expected. The specified block type refers to the
6951block library which allows the DHT to run application-specific logic for
6952data stored in the network.
6953
6954
6955@menu
6956* GET::
6957* PUT::
6958* MONITOR::
6959* DHT Routing Options::
6960@end menu
6961
6962@node GET
6963@subsubsection GET
6964
6965@c %**end of header
6966
6967When using GET, the main consideration for developers (other than the
6968block library) should be that after issuing a GET, the DHT will
6969continuously cause (small amounts of) network traffic until the operation
6970is explicitly canceled.
6971So GET does not simply send out a single network request once; instead,
6972the DHT will continue to search for data. This is needed to achieve good
6973success rates and also handles the case where the respective PUT
6974operation happens after the GET operation was started.
6975Developers should not cancel an existing GET operation and then
6976explicitly re-start it to trigger a new round of network requests;
6977this is simply inefficient, especially as the internal automated version
6978can be more efficient, for example by filtering results in the network
6979that have already been returned.
6980
6981If an application that performs a GET request has a set of replies that it
6982already knows and would like to filter, it can call
6983@code{GNUNET_DHT_get_filter_known_results} with an array of hashes over
6984the respective blocks to tell the DHT that these results are not
6985desired (any more).
6986This way, the DHT will filter the respective blocks using the block
6987library in the network, which may result in a significant reduction in
6988bandwidth consumption.
6989
6990@node PUT
6991@subsubsection PUT
6992
6993@c %**end of header
6994
6995In contrast to GET operations, developers @strong{must} manually re-run
6996PUT operations periodically (if they intend the content to continue to be
6997available). Content stored in the DHT expires or might be lost due to
6998churn.
6999Furthermore, GNUnet's DHT typically requires multiple rounds of PUT
7000operations before a key-value pair is consistently available to all
7001peers (the DHT randomizes paths and thus storage locations, and only
7002after multiple rounds of PUTs will there be a sufficient number of
7003replicas in large DHTs). An explicit PUT operation using the DHT API will
7004only cause network traffic once, so in order to ensure basic availability
7005and resistance to churn (and adversaries), PUTs must be repeated.
7006While the exact frequency depends on the application, a rule of thumb is
7007that there should be at least a dozen PUT operations within the content
7008lifetime. Content in the DHT typically expires after one day, so
7009DHT PUT operations should be repeated at least every 1-2 hours.
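
The rule of thumb is simple arithmetic: dividing the content lifetime by
the desired number of PUT operations gives the longest acceptable
interval between two PUTs. A minimal sketch (the function name is
hypothetical and not part of the DHT API):

@verbatim
#include <stdint.h>

/* Longest interval, in seconds, between two PUT operations so that at
   least 'min_puts' of them fall within 'lifetime_seconds'. */
static uint64_t
put_interval_seconds (uint64_t lifetime_seconds, unsigned int min_puts)
{
  if (0 == min_puts)
    return lifetime_seconds;   /* no repetition requested */
  return lifetime_seconds / min_puts;
}
@end verbatim

For a lifetime of one day (86400 seconds) and a dozen PUT operations,
this gives 7200 seconds, i.e. the upper end of the 1-2 hour range
mentioned above.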
7010
7011@node MONITOR
7012@subsubsection MONITOR
7013
7014@c %**end of header
7015
7016The DHT API also allows applications to monitor messages crossing the
7017local DHT service.
7018The types of messages used by the DHT are GET, PUT and RESULT messages.
7019Using the monitoring API, applications can choose to monitor these
7020requests, possibly limiting themselves to requests for a particular block
7021type.
7022
7023The monitoring API is not only useful for diagnostics; it can also be
7024used to trigger application operations based on PUT operations.
7025For example, an application may use PUTs to distribute work requests to
7026other peers.
7027The workers would then monitor for PUTs that give them work, instead of
7028looking for work using GET operations.
7029This can be beneficial, especially if the workers have no good way to
7030guess the keys under which work would be stored.
7031Naturally, additional protocols might be needed to ensure that the desired
7032number of workers will process the distributed workload.
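
The worker pattern described above can be sketched as a filter over
monitored PUT events. The structure @code{struct PutEvent} and both
functions are simplified, hypothetical stand-ins; the real monitoring
callbacks of the DHT API carry more information. Treating block type 0
as "no restriction" is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for a monitored PUT event. */
struct PutEvent
{
  uint32_t block_type;   /* application-defined block type */
  const char *payload;   /* work description carried by the PUT */
};

/* Block type 0 means "no restriction" in this sketch. */
static int
event_matches (const struct PutEvent *ev, uint32_t wanted_type)
{
  assert (NULL != ev);
  return (0 == wanted_type) || (ev->block_type == wanted_type);
}

/* Count the work items among `n` events; a real worker would process
   the payload of each matching event instead of counting it. */
static size_t
count_work (const struct PutEvent *evs, size_t n, uint32_t wanted_type)
{
  size_t hits = 0;

  for (size_t i = 0; i < n; i++)
    if (event_matches (&evs[i], wanted_type))
      hits++;
  return hits;
}
```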
7033
7034@node DHT Routing Options
7035@subsubsection DHT Routing Options
7036
7037@c %**end of header
7038
7039There are two important options for GET and PUT requests; the remaining two entries are listed for completeness:
7040
7041@table @asis
7042@item GNUNET_DHT_RO_DEMULTIPLEX_EVERYWHERE This option means that all
7043peers should process the request, even if their peer ID is not closest to
7044the key. For a PUT request, this means that all peers that a request
7045traverses may make a copy of the data.
7046Similarly for a GET request, all peers will check their local database
7047for a result. Setting this option can thus significantly improve caching
7048and reduce bandwidth consumption --- at the expense of a larger DHT
7049database. If in doubt, we recommend that this option should be used.
7050@item GNUNET_DHT_RO_RECORD_ROUTE This option instructs the DHT to record
7051the path that a GET or a PUT request is taking through the overlay
7052network. The resulting paths are then returned to the application with
7053the respective result. This allows the receiver of a result to construct
7054a path to the originator of the data, which might then be used for
7055routing. Naturally, setting this option requires additional bandwidth
7056and disk space, so applications should only set this if the paths are
7057needed by the application logic.
7058@item GNUNET_DHT_RO_FIND_PEER This option is an internal option used by
7059the DHT's peer discovery mechanism and should not be used by applications.
7060@item GNUNET_DHT_RO_BART This option is currently not implemented. It may
7061in the future offer performance improvements for clique topologies.
7062@end table
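
Routing options are combined as bit flags. The sketch below shows how an
application might build the recommended default and add route recording
only when it needs paths. The @code{SKETCH_RO_} names and their numeric
values are illustrative assumptions; the authoritative definitions are
in the DHT service header.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag values, assumed for this sketch only. */
enum SketchRouteOption
{
  SKETCH_RO_NONE = 0,
  SKETCH_RO_DEMULTIPLEX_EVERYWHERE = 1,
  SKETCH_RO_RECORD_ROUTE = 2
};

/* Recommended default, plus path recording only if the application
   logic actually needs the paths. */
static uint32_t
default_options (int need_paths)
{
  uint32_t ro = SKETCH_RO_DEMULTIPLEX_EVERYWHERE;

  if (need_paths)
    ro |= SKETCH_RO_RECORD_ROUTE;
  assert (0 == (ro & ~(uint32_t) 3));   /* only known bits are set */
  return ro;
}

static int
records_route (uint32_t options)
{
  return 0 != (options & SKETCH_RO_RECORD_ROUTE);
}
```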
7063
7064@node The DHT Client-Service Protocol
7065@subsection The DHT Client-Service Protocol
7066
7067@c %**end of header
7068
7069@menu
7070* PUTting data into the DHT::
7071* GETting data from the DHT::
7072* Monitoring the DHT::
7073@end menu
7074
7075@node PUTting data into the DHT
7076@subsubsection PUTting data into the DHT
7077
7078@c %**end of header
7079
7080To store (PUT) data into the DHT, the client sends a
7081@code{struct GNUNET_DHT_ClientPutMessage} to the service.
7082This message specifies the block type, routing options, the desired
7083replication level, the expiration time, key,
7084value and a 64-bit unique ID for the operation. The service responds with
7085a @code{struct GNUNET_DHT_ClientPutConfirmationMessage} with the same
708664-bit unique ID. Note that the service sends the confirmation as soon as
7087it has locally processed the PUT request. The PUT may still be
7088propagating through the network at this time.
7089
7090In the future, we may want to change this to provide (limited) feedback
7091to the client, for example if we detect that the PUT operation had no
7092effect because the same key-value pair was already stored in the DHT.
7093However, changing this would also require additional state and messages
7094in the P2P interaction.
7095
7096@node GETting data from the DHT
7097@subsubsection GETting data from the DHT
7098
7099@c %**end of header
7100
7101To retrieve (GET) data from the DHT, the client sends a
7102@code{struct GNUNET_DHT_ClientGetMessage} to the service. The message
7103specifies routing options, a replication level (for replicating the GET,
7104not the content), the desired block type, the key, the (optional)
7105extended query and a unique 64-bit request ID.
7106
7107Additionally, the client may send any number of
7108@code{struct GNUNET_DHT_ClientGetResultSeenMessage}s to notify the
7109service about results that the client is already aware of.
7110These messages consist of the key, the unique 64-bit ID of the request,
7111and an arbitrary number of hash codes over the blocks that the client is
7112already aware of. As messages are restricted to 64k, a client that
7113already knows more than about a thousand blocks may need to send
7114several of these messages. The client should transmit these
7115messages as quickly as possible after the original GET request such that
7116the DHT can filter those results in the network early on. However, as
7117these messages are sent after the original request, it is conceivable
7118that the DHT service may return blocks that match those already known
7119to the client anyway.
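
The "about a thousand blocks" figure follows from the message size
limit. The sketch below assumes 64-byte (512-bit) hash codes and an
80-byte fixed part (message header, key and request ID); both numbers
and all @code{SKETCH_} names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_MESSAGE_SIZE 65535u /* messages must stay below 64k */
#define SKETCH_FIXED_PART 80u          /* assumed header + key + request ID */
#define SKETCH_HASH_SIZE 64u           /* assumed size of one hash code */

/* Number of hash codes that fit into a single message. */
static size_t
hashes_per_message (void)
{
  return (SKETCH_MAX_MESSAGE_SIZE - SKETCH_FIXED_PART) / SKETCH_HASH_SIZE;
}

/* Number of messages needed to report `known_results` hash codes. */
static size_t
messages_needed (size_t known_results)
{
  size_t per_msg = hashes_per_message ();

  assert (per_msg > 0);
  return (known_results + per_msg - 1) / per_msg;
}
```

Under these assumptions, one message carries 1022 hash codes, so a
client that knows 3000 blocks would send three messages.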
7120
7121In response to a GET request, the service will send @code{struct
7122GNUNET_DHT_ClientResultMessage}s to the client. These messages contain the
7123block type, expiration, key, unique ID of the request and of course the
7124value (a block). Depending on the options set for the respective
7125operations, the replies may also contain the path the GET and/or the PUT
7126took through the network.
7127
7128A client can stop receiving replies either by disconnecting or by sending
7129a @code{struct GNUNET_DHT_ClientGetStopMessage} which must contain the
7130key and the 64-bit unique ID of the original request. Using an
7131explicit "stop" message is more common as this allows a client to run
7132many concurrent GET operations over the same connection with the DHT
7133service --- and to stop them individually.
7134
7135@node Monitoring the DHT
7136@subsubsection Monitoring the DHT
7137
7138@c %**end of header
7139
7140To begin monitoring, the client sends a
7141@code{struct GNUNET_DHT_MonitorStartStop} message to the DHT service.
7142In this message, flags can be set to enable (or disable) monitoring of
7143GET, PUT and RESULT messages that pass through a peer. The message can
7144also restrict monitoring to a particular block type or a particular key.
7145Once monitoring is enabled, the DHT service will notify the client about
7146any matching event using @code{struct GNUNET_DHT_MonitorGetMessage}s for
7147GET events, @code{struct GNUNET_DHT_MonitorPutMessage}s for PUT events
7148and @code{struct GNUNET_DHT_MonitorGetRespMessage}s for RESULTs. Each of
7149these messages contains all of the information about the event.
7150
7151@node The DHT Peer-to-Peer Protocol
7152@subsection The DHT Peer-to-Peer Protocol
7153@c %**end of header
7154
7155
7156@menu
7157* Routing GETs or PUTs::
7158* PUTting data into the DHT2::
7159* GETting data from the DHT2::
7160@end menu
7161
7162@node Routing GETs or PUTs
7163@subsubsection Routing GETs or PUTs
7164
7165@c %**end of header
7166
7167When routing GETs or PUTs, the DHT service selects a suitable subset of
7168neighbours for forwarding. The exact number of neighbours can be zero or
7169more and depends on the hop counter of the query (initially zero) in
7170relation to (the log of) the network size estimate, the desired
7171replication level and the peer's connectivity.
7172Depending on the hop counter and our network size estimate, the selection
7173of the peers may be randomized or based on proximity to the key.
7174Furthermore, requests include a set of peers that a request has already
7175traversed; those peers are also excluded from the selection.
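
The proximity-based case can be sketched as follows. This is a
simplification: identities and keys are shortened to 64 bits, XOR is
assumed as the distance metric, the randomized case is omitted, and only
a single neighbour is selected. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the neighbour closest to `key` by XOR distance,
   skipping neighbours flagged in `visited`; returns -1 if none is left. */
static int
closest_unvisited (const uint64_t *peers, const int *visited, size_t n,
                   uint64_t key)
{
  int best = -1;
  uint64_t best_dist = 0;

  assert (((NULL != peers) && (NULL != visited)) || (0 == n));
  for (size_t i = 0; i < n; i++)
  {
    uint64_t dist = peers[i] ^ key;

    if (visited[i])
      continue;   /* request already traversed this peer */
    if ((-1 == best) || (dist < best_dist))
    {
      best = (int) i;
      best_dist = dist;
    }
  }
  return best;
}
```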
7176
7177@node PUTting data into the DHT2
7178@subsubsection PUTting data into the DHT
7179
7180@c %**end of header
7181
7182To PUT data into the DHT, the service sends a @code{struct PeerPutMessage}
7183of type @code{GNUNET_MESSAGE_TYPE_DHT_P2P_PUT} to the respective
7184neighbour.
7185In addition to the usual information about the content (type, routing
7186options, desired replication level for the content, expiration time, key
7187and value), the message contains a fixed-size Bloom filter with
7188information about which peers (may) have already seen this request.
7189This Bloom filter is used to ensure that DHT messages never loop back to
7190a peer that has already processed the request.
7191Additionally, the message includes the current hop counter and, depending
7192on the routing options, the message may include the full path that the
7193message has taken so far.
7194The Bloom filter should already contain the identity of the previous hop;
7195however, the path should not include the identity of the previous hop and
7196the receiver should append the identity of the sender to the path, not
7197its own identity (this is done to reduce bandwidth).
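
The loop-prevention filter can be illustrated with a small fixed-size
Bloom filter. The filter size, the number of hash rounds and the mixing
function in @code{bit_index} are assumptions chosen to keep the sketch
short; they do not reflect the parameters of the actual wire format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BF_BYTES 16u   /* fixed filter size, chosen for the sketch */
#define SKETCH_BF_HASHES 3u   /* bits set per peer identity */

struct PeerBloom
{
  uint8_t bits[SKETCH_BF_BYTES];
};

static void
bloom_init (struct PeerBloom *bf)
{
  assert (NULL != bf);
  memset (bf->bits, 0, sizeof (bf->bits));
}

/* Cheap stand-in for hashing a peer identity; not cryptographic. */
static uint32_t
bit_index (uint64_t peer_id, uint32_t round)
{
  uint64_t x = peer_id + 0x9E3779B97F4A7C15ull * (round + 1);

  x ^= x >> 33;
  x *= 0xFF51AFD7ED558CCDull;
  x ^= x >> 33;
  return (uint32_t) (x % (SKETCH_BF_BYTES * 8u));
}

/* Record that `peer_id` has processed the request. */
static void
bloom_add (struct PeerBloom *bf, uint64_t peer_id)
{
  assert (NULL != bf);
  for (uint32_t r = 0; r < SKETCH_BF_HASHES; r++)
  {
    uint32_t i = bit_index (peer_id, r);

    bf->bits[i / 8] |= (uint8_t) (1u << (i % 8));
  }
}

/* 1 = peer may have seen the request (skip it), 0 = definitely not. */
static int
bloom_test (const struct PeerBloom *bf, uint64_t peer_id)
{
  assert (NULL != bf);
  for (uint32_t r = 0; r < SKETCH_BF_HASHES; r++)
  {
    uint32_t i = bit_index (peer_id, r);

    if (0 == (bf->bits[i / 8] & (1u << (i % 8))))
      return 0;
  }
  return 1;
}
```

A Bloom filter never reports a peer that was added as absent, which is
why it can guarantee that messages do not loop; the price is that it may
occasionally exclude a peer that has not yet seen the request.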
7198
7199@node GETting data from the DHT2
7200@subsubsection GETting data from the DHT
7201
7202@c %**end of header
7203
7204A peer can search the DHT by sending @code{struct PeerGetMessage}s of type
7205@code{GNUNET_MESSAGE_TYPE_DHT_P2P_GET} to other peers. In addition to the
7206usual information about the request (type, routing options, desired
7207replication level for the request, the key and the extended query), a GET
7208request also again contains a hop counter, a Bloom filter over the peers
7209that have processed the request already and, depending on the routing
7210options, the full path traversed by the GET.
7211Finally, a GET request includes a variable-size second Bloom filter and a
7212so-called Bloom filter mutator value which together indicate which
7213replies the sender has already seen. During the lookup, each block that
7214matches the block type, key and extended query is additionally subjected
7215to a test against this Bloom filter.
7216The block plugin is expected to take the hash of the block and combine it
7217with the mutator value and check if the result is not yet in the Bloom
7218filter. The originator of the query will from time to time modify the
7219mutator to (eventually) allow false-positives filtered by the Bloom filter
7220to be returned.
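
The role of the mutator can be sketched with a deliberately tiny reply
filter (one 64-bit word, one bit per reply). The mixing function and all
names are placeholders for whatever the block plugin really does. The
point is only that the bit position depends on both the block hash and
the mutator, so a block that collides under one mutator value will
usually map to a different bit under another.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_REPLY_BF_BITS 64u

/* Combine a (truncated) block hash with the mutator and map the result
   to a bit position; the mixing step is a placeholder. */
static uint32_t
mutated_bit (uint64_t block_hash, uint32_t mutator)
{
  uint64_t x = block_hash ^ ((uint64_t) mutator * 0x9E3779B97F4A7C15ull);

  x ^= x >> 29;
  return (uint32_t) (x % SKETCH_REPLY_BF_BITS);
}

/* Mark a reply as seen under the given mutator. */
static void
reply_mark_seen (uint64_t *filter, uint64_t block_hash, uint32_t mutator)
{
  assert (NULL != filter);
  *filter |= (uint64_t) 1 << mutated_bit (block_hash, mutator);
}

/* 1 = reply is filtered as a (possible) duplicate, 0 = reply is new. */
static int
reply_is_filtered (const uint64_t *filter, uint64_t block_hash,
                   uint32_t mutator)
{
  assert (NULL != filter);
  return 0 != (*filter
               & ((uint64_t) 1 << mutated_bit (block_hash, mutator)));
}
```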
7221
7222Peers that receive a GET request perform a local lookup (depending on
7223their proximity to the key and the query options) and forward the request
7224to other peers.
7225They then remember the request (including the Bloom filter for blocking
7226duplicate results) and, when they obtain a matching, non-filtered response,
7227a @code{struct PeerResultMessage} of type
7228@code{GNUNET_MESSAGE_TYPE_DHT_P2P_RESULT} is forwarded to the previous
7229hop.
7230Whenever a result is forwarded, the block plugin is used to update the
7231Bloom filter accordingly, to ensure that the same result is never
7232forwarded more than once.
7233The DHT service may also cache forwarded results locally if the
7234"CACHE_RESULTS" option is set to "YES" in the configuration.
7235
7236@node The GNU Name System (GNS)
7237@section The GNU Name System (GNS)
7238
7239@c %**end of header
7240
7241The GNU Name System (GNS) is a decentralized database that enables users
7242to securely resolve names to values.
7243Names can be used to identify other users (for example, in social
7244networking), or network services (for example, VPN services running at a
7245peer in GNUnet, or purely IP-based services on the Internet).
7246Users interact with GNS by typing in a hostname that ends in ".gnu"
7247or ".zkey".
7248
7249Videos giving an overview of most of the GNS and the motivations behind
7250it are available here and here.
7251The remainder of this chapter targets developers that are familiar with
7252high-level concepts of GNS as presented in these talks.
7253@c TODO: Add links to here and here and to these.
7254
7255GNS-aware applications should use the GNS resolver to obtain the
7256respective records that are stored under that name in GNS.
7257Each record consists of a type, value, expiration time and flags.
7258
7259The type specifies the format of the value. Types below 65536 correspond
7260to DNS record types; larger values are used for GNS-specific records.
7261Applications can define new GNS record types by reserving a number and
7262implementing a plugin (which mostly needs to convert the binary value
7263representation to a human-readable text format and vice-versa).
7264The expiration time specifies how long the record is to be valid.
7265The GNS API ensures that applications are only given non-expired values.
7266The flags are typically irrelevant for applications, as GNS uses them
7267internally to control visibility and validity of records.
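
The four parts of a record and the type-number convention can be
summarized in a few lines of C. The structure @code{struct SketchRecord}
is a simplified, hypothetical stand-in; the real record structure is
defined by @code{libgnunetgnsrecord} and differs in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record for illustration only. */
struct SketchRecord
{
  uint32_t type;            /* record type number */
  uint64_t expiration_us;   /* absolute expiration time */
  uint32_t flags;           /* used internally by GNS */
  const void *value;        /* binary value, format given by `type` */
  size_t value_size;
};

/* Types below 65536 correspond to DNS record types. */
static int
is_dns_type (uint32_t type)
{
  return type < 65536;
}

/* Applications never see records for which this returns 1. */
static int
is_expired (const struct SketchRecord *rd, uint64_t now_us)
{
  assert (NULL != rd);
  return rd->expiration_us <= now_us;
}
```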
7268
7269Records are stored along with a signature.
7270The signature is generated using the private key of the authoritative
7271zone. This allows any GNS resolver to verify the correctness of a
7272name-value mapping.
7273
7274Internally, GNS uses the NAMECACHE to cache information obtained from
7275other users, the NAMESTORE to store information specific to the local
7276users, and the DHT to exchange data between users.
7277A plugin API is used to enable applications to define new GNS
7278record types.
7279
7280@menu
7281* libgnunetgns::
7282* libgnunetgnsrecord::
7283* GNS plugins::
7284* The GNS Client-Service Protocol::
7285* Hijacking the DNS-Traffic using gnunet-service-dns::
7286* Serving DNS lookups via GNS on W32::
7287@end menu
7288
7289@node libgnunetgns
7290@subsection libgnunetgns
7291
7292@c %**end of header
7293
7294The GNS API itself is extremely simple. Clients first connect to the
7295GNS service using @code{GNUNET_GNS_connect}.
7296They can then perform lookups using @code{GNUNET_GNS_lookup} or cancel
7297pending lookups using @code{GNUNET_GNS_lookup_cancel}.
7298Once finished, clients disconnect using @code{GNUNET_GNS_disconnect}.
7299
7300@menu
7301* Looking up records::
7302* Accessing the records::
7303* Creating records::
7304* Future work::
7305@end menu
7306
7307@node Looking up records
7308@subsubsection Looking up records
7309
7310@c %**end of header
7311
7312@code{GNUNET_GNS_lookup} takes a number of arguments:
7313
7314@table @asis
7315@item handle This is simply the GNS connection handle from
7316@code{GNUNET_GNS_connect}.
7317@item name The client needs to specify the name to
7318be resolved. This can be any valid DNS or GNS hostname.
7319@item zone The client
7320needs to specify the public key of the GNS zone against which the
7321resolution should be done (the ".gnu" zone).
7322Note that a key must be provided, even if the name ends in ".zkey".
7323This should typically be the public key of the master-zone of the user.
7324@item type This is the desired GNS or DNS record type
7325to look for. While all records for the given name will be returned, this
7326can be important if the client wants to resolve record types that
7327themselves delegate resolution, such as CNAME, PKEY or GNS2DNS.
7328Resolving a record of any of these types will only work if the respective
7329record type is specified in the request, as the GNS resolver will
7330otherwise follow the delegation and return the records from the
7331respective destination, instead of the delegating record.
7332@item only_cached This argument should typically be set to
7333@code{GNUNET_NO}. Setting it to @code{GNUNET_YES} disables resolution via
7334the overlay network.
7335@item shorten_zone_key If GNS encounters new names during resolution,
7336their respective zones can automatically be learned and added to the
7337"shorten zone". If this is desired, clients must pass the private key of
7338the shorten zone. If NULL is passed, shortening is disabled.
7339@item proc This argument identifies
7340the function to call with the result. It is given proc_cls, the number of
7341records found (possibly zero) and the array of the records as arguments.
7342proc will only be called once. After proc has been called, the lookup
7343must no longer be cancelled.
7344@item proc_cls The closure for proc.
7345@end table
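
The calling convention for proc can be illustrated without the real
library. In the sketch below, @code{sketch_lookup} is a hypothetical
stand-in for the resolver that delivers a fixed answer; only the shape
of the processor (closure, record count, record array) and the
"called exactly once, possibly with zero records" contract are taken
from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record. */
struct SketchRecord
{
  uint32_t type;
  const char *value;
};

/* Same shape as the processor described above: closure, count, array. */
typedef void (*SketchResultProcessor) (void *proc_cls, uint32_t rd_count,
                                       const struct SketchRecord *rd);

/* Stand-in for the resolver: calls `proc` exactly once. */
static void
sketch_lookup (const char *name, uint32_t type,
               SketchResultProcessor proc, void *proc_cls)
{
  static const struct SketchRecord answer[] = { { 1, "192.0.2.1" } };

  assert (NULL != name);
  assert (NULL != proc);
  if (1 == type)
    proc (proc_cls, 1, answer);
  else
    proc (proc_cls, 0, NULL);   /* zero records is a valid result */
}

/* Example processor: adds the number of records to a counter that is
   passed in via the closure. */
static void
count_records (void *proc_cls, uint32_t rd_count,
               const struct SketchRecord *rd)
{
  uint32_t *total = proc_cls;

  (void) rd;
  *total += rd_count;
}
```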
7346
7347@node Accessing the records
7348@subsubsection Accessing the records
7349
7350@c %**end of header
7351
7352The @code{libgnunetgnsrecord} library provides an API to manipulate the
7353GNS record array that is given to proc. In particular, it offers
7354functions such as converting record values to human-readable
7355strings (and back). However, most @code{libgnunetgnsrecord} functions are
7356not interesting to GNS client applications.
7357
7358For DNS records, the @code{libgnunetdnsparser} library provides
7359functions for parsing (and serializing) common types of DNS records.
7360
7361@node Creating records
7362@subsubsection Creating records
7363
7364@c %**end of header
7365
7366Creating GNS records is typically done by building the respective record
7367information (possibly with the help of @code{libgnunetgnsrecord} and
7368@code{libgnunetdnsparser}) and then using the @code{libgnunetnamestore} to
7369publish the information. The GNS API is not involved in this
7370operation.
7371
7372@node Future work
7373@subsubsection Future work
7374
7375@c %**end of header
7376
7377In the future, we want to expand @code{libgnunetgns} to allow
7378applications to observe shortening operations performed during GNS
7379resolution, for example so that users can receive visual feedback when
7380this happens.
7381
7382@node libgnunetgnsrecord
7383@subsection libgnunetgnsrecord
7384
7385@c %**end of header
7386
7387The @code{libgnunetgnsrecord} library is used to manipulate GNS
7388records (in plaintext or in their encrypted format).
7389Applications mostly interact with @code{libgnunetgnsrecord} by using the
7390functions to convert GNS record values to strings or vice-versa, or to
7391lookup a GNS record type number by name (or vice-versa).
7392The library also provides various other functions that are mostly
7393used internally within GNS, such as converting keys to names, checking for
7394expiration, encrypting GNS records to GNS blocks, verifying GNS block
7395signatures and decrypting GNS records from GNS blocks.
7396
7397We will now discuss the four commonly used functions of the API.@
7398@code{libgnunetgnsrecord} does not perform these operations itself,
7399but instead uses plugins to perform the operation.
7400GNUnet includes plugins to support common DNS record types as well as
7401standard GNS record types.
7402
7403@menu
7404* Value handling::
7405* Type handling::
7406@end menu
7407
7408@node Value handling
7409@subsubsection Value handling
7410
7411@c %**end of header
7412
7413@code{GNUNET_GNSRECORD_value_to_string} can be used to convert
7414the (binary) representation of a GNS record value to a human-readable,
74150-terminated UTF-8 string.
7416NULL is returned if the specified record type is not supported by any
7417available plugin.
7418
7419@code{GNUNET_GNSRECORD_string_to_value} can be used to try to convert a
7420human-readable string to the respective (binary) representation of
7421a GNS record value.
7422
7423@node Type handling
7424@subsubsection Type handling
7425
7426@c %**end of header
7427
7428@code{GNUNET_GNSRECORD_typename_to_number} can be used to obtain the
7429numeric value associated with a given typename. For example, given the
7430typename "A" (for DNS A records), the function will return the number 1.
7431A list of common DNS record types is
7432@uref{http://en.wikipedia.org/wiki/List_of_DNS_record_types, here}.
7433Note that not all DNS record types are supported by GNUnet GNSRECORD
7434plugins at this time.
7435
7436@code{GNUNET_GNSRECORD_number_to_typename} can be used to obtain the
7437typename associated with a given numeric value.
7438For example, given the type number 1, the function will return the
7439typename "A".
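
Conceptually, both functions are lookups in a table of names and
numbers. The sketch below uses a handful of well-known DNS type numbers.
The @code{sketch_} functions are hypothetical, and returning
@code{UINT32_MAX} or @code{NULL} for unknown entries is a choice made
for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A few standard DNS record types and their numbers. */
static const struct
{
  const char *name;
  uint32_t number;
} dns_types[] = {
  { "A", 1 }, { "NS", 2 }, { "CNAME", 5 }, { "SOA", 6 },
  { "PTR", 12 }, { "MX", 15 }, { "TXT", 16 }, { "AAAA", 28 }
};

#define N_TYPES (sizeof (dns_types) / sizeof (dns_types[0]))

/* Returns UINT32_MAX if the name is unknown (sketch convention). */
static uint32_t
sketch_typename_to_number (const char *name)
{
  assert (NULL != name);
  for (size_t i = 0; i < N_TYPES; i++)
    if (0 == strcmp (name, dns_types[i].name))
      return dns_types[i].number;
  return UINT32_MAX;
}

/* Returns NULL if the number is unknown (sketch convention). */
static const char *
sketch_number_to_typename (uint32_t number)
{
  for (size_t i = 0; i < N_TYPES; i++)
    if (number == dns_types[i].number)
      return dns_types[i].name;
  return NULL;
}
```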
7440
7441@node GNS plugins
7442@subsection GNS plugins
7443
7444@c %**end of header
7445
7446Adding a new GNS record type typically involves writing (or extending) a
7447GNSRECORD plugin. The plugin needs to implement the
7448@code{gnunet_gnsrecord_plugin.h} API which provides basic functions that
7449are needed by GNSRECORD to convert typenames and values of the respective
7450record type to strings (and back).
7451These gnsrecord plugins are typically implemented within their respective
7452subsystems.
7453Examples for such plugins can be found in the GNSRECORD, GNS and
7454CONVERSATION subsystems.
7455
7456The @code{libgnunetgnsrecord} library is then used to locate, load and
7457query the appropriate gnsrecord plugin.
7458Which plugin is appropriate is determined by the record type (which is
7459just a 32-bit integer). The @code{libgnunetgnsrecord} library loads all
7460gnsrecord plugins that are installed at the local peer and forwards the
7461application request to the plugins. If the record type is not
7462supported by a plugin, that plugin should simply return an error code.
7463
7464The central functions of the GNSRECORD APIs (plugin and main library) are the
7465same four functions for converting between values and strings, and
7466typenames and numbers documented in the previous subsection.
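
The dispatch described above (ask every plugin, accept the first one
that supports the record type) can be sketched as follows. The function
pointer type, the two toy plugins and the return convention (0 on
success, -1 if unsupported) are assumptions for illustration and do not
mirror the real plugin header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef int (*SketchValueToString) (uint32_t type, const void *data,
                                    size_t size, char *out,
                                    size_t out_size);

/* Toy plugin for DNS "A" records (type 1): four bytes to dotted quad. */
static int
a_value_to_string (uint32_t type, const void *data, size_t size,
                   char *out, size_t out_size)
{
  const unsigned char *b = data;
  int n;

  if ((1 != type) || (4 != size))
    return -1;   /* not supported by this plugin */
  n = snprintf (out, out_size, "%u.%u.%u.%u", (unsigned int) b[0],
                (unsigned int) b[1], (unsigned int) b[2],
                (unsigned int) b[3]);
  return ((n >= 0) && ((size_t) n < out_size)) ? 0 : -1;
}

/* Toy plugin for DNS "TXT" records (type 16): value is the text. */
static int
txt_value_to_string (uint32_t type, const void *data, size_t size,
                     char *out, size_t out_size)
{
  int n;

  if (16 != type)
    return -1;   /* not supported by this plugin */
  n = snprintf (out, out_size, "%.*s", (int) size, (const char *) data);
  return ((n >= 0) && ((size_t) n < out_size)) ? 0 : -1;
}

static const SketchValueToString plugins[] = {
  a_value_to_string, txt_value_to_string
};

/* Forward the request to each plugin until one succeeds. */
static int
sketch_value_to_string (uint32_t type, const void *data, size_t size,
                        char *out, size_t out_size)
{
  assert (NULL != out);
  for (size_t i = 0; i < sizeof (plugins) / sizeof (plugins[0]); i++)
    if (0 == plugins[i] (type, data, size, out, out_size))
      return 0;
  return -1;   /* no plugin supports this record type */
}
```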
7467
7468@node The GNS Client-Service Protocol
7469@subsection The GNS Client-Service Protocol
7470@c %**end of header
7471
7472The GNS client-service protocol consists of two simple messages, the
7473@code{LOOKUP} message and the @code{LOOKUP_RESULT}. Each @code{LOOKUP}
7474message contains a unique 32-bit identifier, which will be included in the
7475corresponding response. Thus, clients can send many lookup requests in
7476parallel and receive responses out-of-order.
7477A @code{LOOKUP} request also includes the public key of the GNS zone,
7478the desired record type and fields specifying whether shortening is
7479enabled or networking is disabled. Finally, the @code{LOOKUP} message
7480includes the name to be resolved.
7481
7482The response includes the number of records and the records themselves
7483in the format created by @code{GNUNET_GNSRECORD_records_serialize}.
7484They can thus be deserialized using
7485@code{GNUNET_GNSRECORD_records_deserialize}.
7486
7487@node Hijacking the DNS-Traffic using gnunet-service-dns
7488@subsection Hijacking the DNS-Traffic using gnunet-service-dns
7489
7490@c %**end of header
7491
7492This section documents how the gnunet-service-dns (and the
7493gnunet-helper-dns) intercepts DNS queries from the local system.
7494This is merely one method for how we can obtain GNS queries.
7495It is also possible to change @code{resolv.conf} to point to a machine
7496running @code{gnunet-dns2gns} or to modify libc's name system switch
7497(NSS) configuration to include a GNS resolution plugin.
7498The method described in this chapter is more of a last-ditch catch-all
7499approach.
7500
7501@code{gnunet-service-dns} enables intercepting DNS traffic using policy
7502based routing.
7503We MARK every outgoing DNS-packet if it was not sent by our application.
7504Using a second routing table in the Linux kernel these marked packets are
7505then routed through our virtual network interface and can thus be
7506captured unchanged.
7507
7508Our application then reads the query and decides how to handle it: A
7509query to an address ending in ".gnu" or ".zkey" is hijacked by
7510@code{gnunet-service-gns} and resolved internally using GNS.
7511In the future, a reverse query for an address of the configured virtual
7512network could be answered with records kept about previous forward
7513queries.
7514Queries that are not hijacked by some application using the DNS service
7515will be sent to the original recipient.
7516The answer to the query will always be sent back through the virtual
7517interface with the original nameserver as source address.
7518
7519
7520@menu
7521* Network Setup Details::
7522@end menu
7523
7524@node Network Setup Details
7525@subsubsection Network Setup Details
7526
7527@c %**end of header
7528
7529The DNS interceptor adds the following rules to the Linux kernel:
7530@example
7531iptables -t mangle -I OUTPUT 1 -p udp --sport $LOCALPORT --dport 53 -j ACCEPT
7532iptables -t mangle -I OUTPUT 2 -p udp --dport 53 -j MARK --set-mark 3
7533ip rule add fwmark 3 table 2
7534ip route add default via $VIRTUALDNS table 2
7535@end example
7536
7537@c FIXME: Rewrite to reflect display which is no longer content by line
7538@c FIXME: due to the < 74 characters limit.
7539The first rule ensures that packets sent from a port that our application
7540opened beforehand (@code{$LOCALPORT}) are routed normally.
7541The second rule marks every other packet to a DNS server with mark 3
7542(chosen arbitrarily). The @code{ip rule} command routes packets carrying
7543mark 3 via the second table, whose default route is @code{$VIRTUALDNS}.
7544
7545@node Serving DNS lookups via GNS on W32
7546@subsection Serving DNS lookups via GNS on W32
7547
7548@c %**end of header
7549
7550This section documents how libw32nsp (and
7551gnunet-gns-helper-service-w32) resolve DNS queries on the
7552local system. This only applies to GNUnet running on W32.
7553
7554W32 has a concept of "Namespaces" and "Namespace providers".
7555These are used to present various name systems to applications in a
7556generic way.
7557Namespaces include DNS, mDNS, NLA and others. For each namespace any
7558number of providers could be registered, and they are queried in an order
7559of priority (which is adjustable).
7560
7561Applications can resolve names by using the WSALookupService*() family of
7562functions.
7563
7564However, these are WSA-only facilities. Common BSD socket functions for
7565namespace resolutions are gethostbyname and getaddrinfo (among others).
7566These functions are implemented internally (by default - by mswsock,
7567which also implements the default DNS provider) as wrappers around
7568WSALookupService*() functions (see "Sample Code for a Service Provider"
7569on MSDN).
7570
7571On W32 GNUnet builds a libw32nsp - a namespace provider, which can then be
7572installed into the system by using w32nsp-install (and uninstalled by
7573w32nsp-uninstall), as described in "Installation Handbook".
7574
7575libw32nsp is very simple and has almost no dependencies. As a response to
7576NSPLookupServiceBegin(), it only checks that the provider GUID passed to
7577it by the caller matches the GNUnet DNS Provider GUID and that the name
7578being resolved ends in ".gnu" or ".zkey", then connects to
7579gnunet-gns-helper-service-w32 at 127.0.0.1:5353 (hardcoded) and sends the
7580name resolution request there, returning the connected socket to the
7581caller.
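
The name check performed before contacting the helper service amounts to
a suffix test. The sketch below is a minimal illustration with
hypothetical function names; it compares case-sensitively, which is a
simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 1 if `name` ends in `suffix`, 0 otherwise. */
static int
ends_with (const char *name, const char *suffix)
{
  size_t n = strlen (name);
  size_t s = strlen (suffix);

  return (n >= s) && (0 == strcmp (name + n - s, suffix));
}

/* 1 if the name should be handed to GNS, 0 if it should be left to
   the other namespace providers. */
static int
is_gns_name (const char *name)
{
  assert (NULL != name);
  return ends_with (name, ".gnu") || ends_with (name, ".zkey");
}
```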
7582
7583When the caller invokes NSPLookupServiceNext(), libw32nsp reads a
7584completely formed reply from that socket, unmarshalls it, then gives
7585it back to the caller.
7586
7587At the moment, gnunet-gns-helper-service-w32 only ever gives
7588one reply, and subsequent calls to NSPLookupServiceNext() will fail
7589with WSA_NODATA (first call to NSPLookupServiceNext() might also fail if
7590GNS failed to find the name, or there was an error connecting to it).
7591
7592gnunet-gns-helper-service-w32 does most of the processing:
7593
7594@itemize @bullet
7595@item Maintains a connection to GNS.
7596@item Reads GNS config and loads appropriate keys.
7597@item Checks service GUID and decides on the type of record to look up,
7598refusing to make a lookup outright when an unsupported service GUID is
7599passed.
7600@item Launches the lookup
7601@end itemize
7602
7603When the lookup result arrives, gnunet-gns-helper-service-w32 forms a complete
7604reply (including filling a WSAQUERYSETW structure and, possibly, a binary
7605blob with a hostent structure for a gethostbyname() client), marshalls it,
7606and sends it back to libw32nsp. If no records were found, it sends an
7607empty header.
7608
7609This works for most normal applications that use gethostbyname() or
7610getaddrinfo() to resolve names, but fails to do anything with
7611applications that use alternative means of resolving names (such as
7612sending queries to a DNS server directly by themselves).
7613This includes some well-known utilities, like "ping" and "nslookup".
7614
7615@node The GNS Namecache
7616@section The GNS Namecache
7617
7618@c %**end of header
7619
7620The NAMECACHE subsystem is responsible for caching (encrypted) resolution
7621results of the GNU Name System (GNS). GNS makes zone information available
7622to other users via the DHT. However, as accessing the DHT for every
7623lookup is expensive (and as the DHT's local cache is lost whenever the
7624peer is restarted), GNS uses the NAMECACHE as a more persistent cache for
7625DHT lookups.
7626Thus, instead of always looking up every name in the DHT, GNS first
7627checks if the result is already available locally in the NAMECACHE.
7628Only if there is no result in the NAMECACHE does GNS query the DHT.
7629The NAMECACHE stores data in the same (encrypted) format as the DHT.
7630It thus makes no sense to iterate over all items in the
7631NAMECACHE --- the NAMECACHE does not have a way to provide the keys
7632required to decrypt the entries.
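
The lookup order described above (NAMECACHE first, DHT only on a miss)
is sketched below. The two source functions are toy stand-ins that
return fixed strings, and @code{dht_queries} merely counts how often the
expensive path is taken; none of these names exist in GNUnet.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef const char *(*SketchSource) (const char *query);

static unsigned int dht_queries;   /* counts expensive lookups */

/* Toy cache: knows exactly one query. */
static const char *
cache_lookup (const char *query)
{
  return (0 == strcmp (query, "cached")) ? "block-from-cache" : NULL;
}

/* Toy DHT: always answers, but every call is counted. */
static const char *
dht_lookup (const char *query)
{
  (void) query;
  dht_queries++;
  return "block-from-dht";
}

/* Try the cache first and fall back to the DHT only on a miss. */
static const char *
resolve_block (const char *query, SketchSource cache, SketchSource dht)
{
  const char *block;

  assert (NULL != query);
  block = cache (query);
  if (NULL != block)
    return block;
  return dht (query);
}
```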
7633
7634Blocks in the NAMECACHE share the same expiration mechanism as blocks in
7635the DHT --- the block expires whenever any of the records in
7636the (encrypted) block expires.
7637The expiration time of the block is the only information stored in
7638plaintext. The NAMECACHE service internally performs all of the required
7639work to expire blocks; clients do not have to worry about this.
7640Also, given that NAMECACHE stores only GNS blocks that local users
7641requested, there is no configuration option to limit the size of the
7642NAMECACHE. It is assumed to be always small enough (a few MB) to fit on
7643the drive.
7644
7645The NAMECACHE supports the use of different database backends via a
7646plugin API.
7647
7648@menu
7649* libgnunetnamecache::
7650* The NAMECACHE Client-Service Protocol::
7651* The NAMECACHE Plugin API::
7652@end menu
7653
7654@node libgnunetnamecache
7655@subsection libgnunetnamecache
7656
7657@c %**end of header
7658
7659The NAMECACHE API consists of five simple functions. First, there is
7660@code{GNUNET_NAMECACHE_connect} to connect to the NAMECACHE service.
7661This returns the handle required for all other operations on the
7662NAMECACHE. Using @code{GNUNET_NAMECACHE_block_cache} clients can insert a
7663block into the cache.
7664@code{GNUNET_NAMECACHE_lookup_block} can be used to look up blocks that
7665were stored in the NAMECACHE. Both operations can be cancelled using
7666@code{GNUNET_NAMECACHE_cancel}. Note that cancelling a
7667@code{GNUNET_NAMECACHE_block_cache} operation can result in the block
7668being stored in the NAMECACHE --- or not. Cancellation primarily ensures
7669that the continuation function with the result of the operation will no
7670longer be invoked.
7671Finally, @code{GNUNET_NAMECACHE_disconnect} closes the connection to the
7672NAMECACHE.
7673
7674The maximum size of a block that can be stored in the NAMECACHE is
7675@code{GNUNET_NAMECACHE_MAX_VALUE_SIZE}, which is defined to be 63 kB.
7676
7677@node The NAMECACHE Client-Service Protocol
7678@subsection The NAMECACHE Client-Service Protocol
7679
7680@c %**end of header
7681
7682All messages in the NAMECACHE IPC protocol start with the
7683@code{struct GNUNET_NAMECACHE_Header} which adds a request
7684ID (32-bit integer) to the standard message header.
7685The request ID is used to match requests with the
7686respective responses from the NAMECACHE, as they are allowed to happen
7687out-of-order.
7688
7689
7690@menu
7691* Lookup::
7692* Store::
7693@end menu
7694
7695@node Lookup
7696@subsubsection Lookup
7697
7698@c %**end of header
7699
7700The @code{struct LookupBlockMessage} is used to look up a block stored in
7701the cache.
7702It contains the query hash. The NAMECACHE always responds with a
7703@code{struct LookupBlockResponseMessage}. If the NAMECACHE has no
7704response, it sets the expiration time in the response to zero.
7705Otherwise, the response is expected to contain the expiration time, the
7706ECDSA signature, the derived key and the (variable-size) encrypted data
7707of the block.
7708
7709@node Store
7710@subsubsection Store
7711
7712@c %**end of header
7713
7714The @code{struct BlockCacheMessage} is used to cache a block in the
7715NAMECACHE.
7716It has the same structure as the @code{struct LookupBlockResponseMessage}.
7717The service responds with a @code{struct BlockCacheResponseMessage} which
7718contains the result of the operation (success or failure).
7719In the future, we might want to make it possible to provide an error
7720message as well.
7721
7722@node The NAMECACHE Plugin API
7723@subsection The NAMECACHE Plugin API
7724@c %**end of header
7725
7726The NAMECACHE plugin API consists of two functions, @code{cache_block} to
7727store a block in the database, and @code{lookup_block} to look up a block
7728in the database.
7729
7730
7731@menu
7732* Lookup2::
7733* Store2::
7734@end menu
7735
7736@node Lookup2
7737@subsubsection Lookup2
7738
7739@c %**end of header
7740
7741The @code{lookup_block} function is expected to return at most one block
7742to the iterator, and return @code{GNUNET_NO} if there were no non-expired
7743results.
7744If there are multiple non-expired results in the cache, the lookup is
7745supposed to return the result with the largest expiration time.
7746
7747@node Store2
7748@subsubsection Store2
7749
7750@c %**end of header
7751
7752The @code{cache_block} function is expected to try to store the block in
7753the database, and return @code{GNUNET_SYSERR} if this was not possible
7754for any reason.
7755Furthermore, @code{cache_block} is expected to implicitly perform cache
7756maintenance and purge blocks from the cache that have expired. Note that
7757@code{cache_block} might encounter the case where the database already has
7758another block stored under the same key. In this case, the plugin must
7759ensure that the block with the larger expiration time is preserved.
7760This can be done either by simply adding new blocks and selecting
7761for the most recent expiration time during lookup, or by checking which
7762block is more recent during the store operation.
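
The expected plugin behaviour can be modelled with a small in-memory store. This is only a sketch under stated assumptions: @code{MemoryNamecache} is a hypothetical class, blocks are reduced to an expiration time and opaque data, and the "check during store" strategy is chosen over "select during lookup". An in-memory store has no failure path, so @code{GNUNET_SYSERR} is defined but never returned here.

```python
import time

# Numeric values of the GNUnet result constants.
GNUNET_OK, GNUNET_NO, GNUNET_SYSERR = 1, 0, -1


class MemoryNamecache:
    """Hypothetical in-memory model of the two plugin functions."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._blocks = {}  # query hash -> (expiration, block data)

    def cache_block(self, query, expiration, data):
        now = self._clock()
        # Implicit cache maintenance: purge everything that has expired.
        self._blocks = {q: b for q, b in self._blocks.items() if b[0] > now}
        existing = self._blocks.get(query)
        # Preserve whichever block has the larger expiration time.
        if existing is None or expiration > existing[0]:
            self._blocks[query] = (expiration, data)
        return GNUNET_OK

    def lookup_block(self, query, iterator):
        block = self._blocks.get(query)
        if block is None or block[0] <= self._clock():
            return GNUNET_NO  # no non-expired result
        iterator(block[0], block[1])  # at most one block reaches the iterator
        return GNUNET_OK
```

A database-backed plugin could equally keep all blocks and pick the one with the largest expiration time at lookup, as the text above allows.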
7763
7764@node The REVOCATION Subsystem
7765@section The REVOCATION Subsystem
7766@c %**end of header
7767
7768The REVOCATION subsystem is responsible for key revocation of Egos.
7769If a user learns that their private key has been compromised, or has lost
7770it, they can use the REVOCATION system to inform all of the other users
7771that their private key is no longer valid.
7772The subsystem thus includes ways to query for the validity of keys and to
7773propagate revocation messages.
7774
7775@menu
7776* Dissemination::
7777* Revocation Message Design Requirements::
7778* libgnunetrevocation::
7779* The REVOCATION Client-Service Protocol::
7780* The REVOCATION Peer-to-Peer Protocol::
7781@end menu
7782
7783@node Dissemination
7784@subsection Dissemination
7785
7786@c %**end of header
7787
7788When a revocation is performed, the revocation is first of all
7789disseminated by flooding the overlay network.
7790The goal is to reach every peer, so that when a peer needs to check if a
7791key has been revoked, this will be purely a local operation where the
7792peer looks at its local revocation list. Flooding the network is also the
7793most robust form of key revocation --- an adversary would have to control
7794a separator of the overlay graph to restrict the propagation of the
7795revocation message. Flooding is also very easy to implement --- peers that
7796receive a revocation message for a key that they have never seen before
7797simply pass the message to all of their neighbours.
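
The flooding rule can be illustrated with a small simulation. This is a sketch, not GNUnet code: the overlay is modelled as a hypothetical adjacency mapping, and message delivery is modelled as a queue.

```python
from collections import deque


def flood(neighbours, origin):
    """Return the set of peers reached and the number of messages sent.

    `neighbours` maps each peer to the list of peers it is connected to.
    """
    seen = {origin}
    queue = deque([origin])
    messages = 0
    while queue:
        peer = queue.popleft()
        # A peer that sees the revocation for the first time passes it
        # to all of its neighbours.
        for other in neighbours[peer]:
            messages += 1
            if other not in seen:  # already-known revocations are not forwarded again
                seen.add(other)
                queue.append(other)
    return seen, messages
```

In a connected overlay every peer forwards exactly once to each of its neighbours, so the total number of messages is the sum of all degrees, that is, twice the number of edges.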
7798
7799Flooding can only distribute the revocation message to peers that are
7800online.
7801In order to notify peers that join the network later, the revocation
7802service performs efficient set reconciliation over the sets of known
7803revocation messages whenever two peers (that both support REVOCATION
7804dissemination) connect.
7805The SET service is used to perform this operation efficiently.
7806
7807@node Revocation Message Design Requirements
7808@subsection Revocation Message Design Requirements
7809
7810@c %**end of header
7811
7812However, flooding is also quite costly, creating O(|E|) messages on a
7813network with |E| edges.
7814Thus, revocation messages are required to contain a proof-of-work, the
7815result of an expensive computation (which, however, is cheap to verify).
7816Only peers that have expended the CPU time necessary to provide
7817this proof will be able to flood the network with the revocation message.
7818This ensures that an attacker cannot simply flood the network with
7819millions of revocation messages. The proof-of-work required by GNUnet is
7820set to take days on a typical PC to compute; if the ability to quickly
7821revoke a key is needed, users have the option to pre-compute revocation
7822messages to store off-line and use instantly after their key has been compromised.
7823
7824Revocation messages must also be signed by the private key that is being
7825revoked. Thus, they can only be created while the private key is in the
7826possession of the respective user. This is another reason to create a
7827revocation message ahead of time and store it in a secure location.
7828
7829@node libgnunetrevocation
7830@subsection libgnunetrevocation
7831
7832@c %**end of header
7833
7834The REVOCATION API consists of two parts, to query and to issue
7835revocations.
7836
7837
7838@menu
7839* Querying for revoked keys::
7840* Preparing revocations::
7841* Issuing revocations::
7842@end menu
7843
7844@node Querying for revoked keys
7845@subsubsection Querying for revoked keys
7846
7847@c %**end of header
7848
7849@code{GNUNET_REVOCATION_query} is used to check if a given ECDSA public
7850key has been revoked.
7851The given callback will be invoked with the result of the check.
7852The query can be cancelled using @code{GNUNET_REVOCATION_query_cancel} on
7853the return value.
7854
7855@node Preparing revocations
7856@subsubsection Preparing revocations
7857
7858@c %**end of header
7859
7860It is often desirable to create a revocation record ahead-of-time and
7861store it in an off-line location to be used later in an emergency.
7862This is particularly true for GNUnet revocations, where performing the
7863revocation operation itself is computationally expensive and thus is
7864likely to take some time.
7865Thus, if users want the ability to perform revocations quickly in an
7866emergency, they must pre-compute the revocation message.
7867The revocation API enables this with two functions that are used to
7868compute the revocation message, but not trigger the actual revocation
7869operation.
7870
7871@code{GNUNET_REVOCATION_check_pow} should be used to calculate the
7872proof-of-work required in the revocation message. This function takes the
7873public key, the required number of bits for the proof of work (which in
7874GNUnet is a network-wide constant) and finally a proof-of-work number as
7875arguments.
7876The function then checks if the given proof-of-work number is a valid
7877proof of work for the given public key. Clients preparing a revocation
7878are expected to call this function repeatedly (typically with a
7879monotonically increasing sequence of candidate proof-of-work numbers)
7880until a given number satisfies the check.
7881That number should then be saved for later use in the revocation
7882operation.
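
The search loop described above can be sketched as follows. Note the assumptions: @code{check_pow} here is a stand-in written for this example, not @code{GNUNET_REVOCATION_check_pow}; it uses SHA-512 over the candidate number and the public key and counts leading zero bits, which is not claimed to match the hash function or the encoding that GNUnet actually uses.

```python
import hashlib
from itertools import count


def count_leading_zero_bits(digest):
    """Number of zero bits before the first one bit in `digest`."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def check_pow(public_key, matching_bits, pow_number):
    """Stand-in check: is `pow_number` a valid proof of work for the key?"""
    data = pow_number.to_bytes(8, "big") + public_key
    digest = hashlib.sha512(data).digest()
    return count_leading_zero_bits(digest) >= matching_bits


def find_pow(public_key, matching_bits):
    """Try a monotonically increasing sequence of candidates until one passes."""
    for candidate in count():
        if check_pow(public_key, matching_bits, candidate):
            return candidate
```

Verification needs a single hash, whereas the search needs about 2^@code{matching_bits} attempts on average; this asymmetry is what makes the proof expensive to produce but cheap to check.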
7883
7884@code{GNUNET_REVOCATION_sign_revocation} is used to generate the
7885signature that is required in a revocation message.
7886It takes the private key that (possibly in the future) is to be revoked
7887and returns the signature.
7888The signature can again be saved to disk for later use, which will then
7889allow performing a revocation even without access to the private key.
7890
7891@node Issuing revocations
7892@subsubsection Issuing revocations
7893
7894
7895Given an ECDSA public key, the signature from
7896@code{GNUNET_REVOCATION_sign_revocation} and the proof-of-work,
7897@code{GNUNET_REVOCATION_revoke} can be used to perform the
7898actual revocation. The given callback is called upon completion of the
7899operation. @code{GNUNET_REVOCATION_revoke_cancel} can be used to stop the
7900library from calling the continuation; however, in that case it is
7901undefined whether or not the revocation operation will be executed.
7902
7903@node The REVOCATION Client-Service Protocol
7904@subsection The REVOCATION Client-Service Protocol
7905
7906
7907The REVOCATION protocol consists of four simple messages.
7908
7909A @code{QueryMessage} containing a public ECDSA key is used to check if a
7910particular key has been revoked. The service responds with a
7911@code{QueryResponseMessage} which simply contains a bit that says if the
7912given public key is still valid, or if it has been revoked.
7913
7914The second possible interaction is for a client to revoke a key by
7915passing a @code{RevokeMessage} to the service. The @code{RevokeMessage}
7916contains the ECDSA public key to be revoked, a signature by the
7917corresponding private key and the proof-of-work. The service responds
7918with a @code{RevocationResponseMessage} which can be used to indicate
7919that the @code{RevokeMessage} was invalid (e.g., proof-of-work incorrect),
7920or otherwise indicates that the revocation has been processed
7921successfully.
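
The two interactions can be summarised with a toy service model. This is a hedged sketch: @code{RevocationService} and its method names are hypothetical, the messages are reduced to plain arguments and boolean replies, and signature and proof-of-work validation is delegated to a caller-supplied function.

```python
class RevocationService:
    """Toy model of the two client-service interactions."""

    def __init__(self, verify):
        # verify(public_key, signature, pow_number) -> bool is supplied by
        # the caller; real validation is outside the scope of this sketch.
        self._verify = verify
        self._revoked = set()

    def handle_query(self, public_key):
        """QueryMessage -> QueryResponseMessage: True while the key is valid."""
        return public_key not in self._revoked

    def handle_revoke(self, public_key, signature, pow_number):
        """RevokeMessage -> RevocationResponseMessage: True if processed."""
        if not self._verify(public_key, signature, pow_number):
            return False  # invalid RevokeMessage
        self._revoked.add(public_key)
        return True
```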
7922
7923@node The REVOCATION Peer-to-Peer Protocol
7924@subsection The REVOCATION Peer-to-Peer Protocol
7925
7926@c %**end of header
7927
7928Revocation uses two disjoint ways to spread revocation information among
7929peers.
7930First of all, P2P gossip exchanged via CORE-level neighbours is used to
7931quickly spread revocations to all connected peers.
7932Second, whenever two peers (that both support revocations) connect,
7933the SET service is used to compute the union of the respective revocation
7934sets.
7935
7936In both cases, the exchanged messages are @code{RevokeMessage}s which
7937contain the public key that is being revoked, a matching ECDSA signature,
7938and a proof-of-work.
7939Whenever a peer learns about a new revocation this way, it first
7940validates the signature and the proof-of-work, then stores it to disk
7941(typically to a file $GNUNET_DATA_HOME/revocation.dat) and finally
7942spreads the information to all directly connected neighbours.
7943
7944For computing the union using the SET service, the peer with the smaller
7945hashed peer identity will connect (as a "client" in the two-party set
7946protocol) to the other peer after one second (to reduce traffic spikes
7947on connect) and initiate the computation of the set union.
7948All revocation services use a common hash to identify the SET operation
7949over revocation sets.
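
The role selection and the result of the union can be sketched as follows. The assumptions are stated plainly: peer identities are opaque byte strings, SHA-512 is used as the hash, and the hashes are compared bytewise; the comparison that GNUnet actually uses may differ. The SET protocol itself is replaced by an ordinary in-memory set union.

```python
import hashlib


def initiates_union(my_identity, other_identity):
    """True if this peer has the smaller hashed identity and should
    therefore act as the "client" that starts the set union."""
    mine = hashlib.sha512(my_identity).digest()
    theirs = hashlib.sha512(other_identity).digest()
    return mine < theirs


def union(local, remote):
    """Return the merged revocation set and the entries new to this peer."""
    merged = local | remote
    return merged, merged - local
```

Since both peers evaluate the same comparison, exactly one of two distinct peers initiates the operation. The entries reported as new are the ones a peer would still have to validate, store and pass on to its neighbours.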
7950
7951The current implementation accepts revocation set union operations from
7952all peers at any time; however, well-behaved peers should only initiate
7953this operation once after establishing a connection to a peer with a
7954larger hashed peer identity.
7955
7956@cindex gnunet-fs
7957@cindex FS
7958@cindex FS subsystem
7959@node GNUnet's File-sharing (FS) Subsystem
7960@section GNUnet's File-sharing (FS) Subsystem
7961
7962@c %**end of header
7963
7964This chapter describes the details of how the file-sharing service works.
7965As with all services, it is split into an API (libgnunetfs), the service
7966process (gnunet-service-fs) and user interface(s).
7967The file-sharing service uses the datastore service to store blocks and
7968the DHT (and indirectly datacache) for lookups for non-anonymous
7969file-sharing.
7970Furthermore, the file-sharing service uses the block library (and the
7971block fs plugin) for validation of DHT operations.
7972
7973In contrast to many other services, libgnunetfs is rather complex since
7974the client library includes a large number of high-level abstractions;
7975this is necessary since the FS service itself largely only operates on
7976the block level.
7977The FS library is responsible for providing a file-based abstraction to
7978applications, including directories, meta data, keyword search,
7979verification, and so on.
7980
7981The method used by GNUnet to break large files into blocks and to use
7982keyword search is called the
7983"Encoding for Censorship Resistant Sharing" (ECRS).
7984ECRS is largely implemented in the FS library; block validation is also
7985reflected in the block FS plugin and the FS service.
7986ECRS on-demand encoding is implemented in the FS service.
7987
7988NOTE: The documentation in this chapter is quite incomplete.
7989
7990@menu
7991* Encoding for Censorship-Resistant Sharing (ECRS)::
7992* File-sharing persistence directory structure::
7993@end menu
7994
7995@cindex ecrs
7996@cindex Encoding for Censorship-Resistant Sharing
7997@node Encoding for Censorship-Resistant Sharing (ECRS)
7998@subsection Encoding for Censorship-Resistant Sharing (ECRS)
7999
8000@c %**end of header
8001
8002When GNUnet shares files, it uses a content encoding that is called ECRS,
8003the Encoding for Censorship-Resistant Sharing.
8004Most of ECRS is described in the (so far unpublished) research paper
8005attached to this page. ECRS obsoletes the previous ESED and ESED II
8006encodings which were used in GNUnet before version 0.7.0.
8007The rest of this page assumes that the reader is familiar with the
8008attached paper. What follows is a description of some minor extensions
8009that GNUnet makes over what is described in the paper.
8010The reason why these extensions are not in the paper is that we felt
8011that they were obvious or trivial extensions to the original scheme and
8012thus did not warrant space in the research report.
8013
8014@menu
8015* Namespace Advertisements::
8016* KSBlocks::
8017@end menu
8018
8019@node Namespace Advertisements
8020@subsubsection Namespace Advertisements
8021
8022@c %**end of header
8023@c %**FIXME: all zeroses -> ?
8024
8025An @code{SBlock} with identifier all zeros is a signed
8026advertisement for a namespace. This special @code{SBlock} contains
8027metadata describing the content of the namespace.
8028Instead of the name of the identifier for a potential update, it contains
8029the identifier for the root of the namespace.
8030The URI should always be empty. The @code{SBlock} is signed with the
8031content provder's RSA private key (just like any other SBlock). Peers
8032can search for @code{SBlock}s in order to find out more about a namespace.
8033
8034@node KSBlocks
8035@subsubsection KSBlocks
8036
8037@c %**end of header
8038
8039GNUnet implements @code{KSBlocks} which are @code{KBlocks} that, instead
8040of encrypting a CHK and metadata, encrypt an @code{SBlock}.
8041In other words, @code{KSBlocks} enable GNUnet to find @code{SBlocks}
8042using the global keyword search.
8043Usually the encrypted @code{SBlock} is a namespace advertisement.
8044The rationale behind @code{KSBlock}s and @code{SBlock}s is to enable
8045peers to discover namespaces via keyword searches and to associate
8046useful information with namespaces. When GNUnet finds @code{KSBlocks}
8047during a normal keyword search, it adds the information to an internal
8048list of discovered namespaces. Users looking for interesting namespaces
8049can then inspect this list, reducing the need for out-of-band discovery
8050of namespaces.
8051Naturally, namespaces (or more specifically, namespace advertisements) can
8052also be referenced from directories, but @code{KSBlock}s should make it
8053easier to advertise namespaces for the owner of the pseudonym since they
8054eliminate the need to first create a directory.
8055
8056Collections are also advertised using @code{KSBlock}s.
8057
8058@noindent
8059The ECRS paper is available as an attachment:
8060@file{ecrs.pdf} (270.68 KB),
8061@uref{https://gnunet.org/sites/default/files/ecrs.pdf}.
8063
8064@node File-sharing persistence directory structure
8065@subsection File-sharing persistence directory structure
8066
8067@c %**end of header
8068
8069This section documents how the file-sharing library implements
8070persistence of file-sharing operations and specifically the resulting
8071directory structure.
8072This code is only active if the @code{GNUNET_FS_FLAGS_PERSISTENCE} flag
8073was set when calling @code{GNUNET_FS_start}.
8074In this case, the file-sharing library will try hard to ensure that all
8075major operations (searching, downloading, publishing, unindexing) are
8076persistent, that is, can live longer than the process itself.
8077More specifically, an operation is supposed to live until it is
8078explicitly stopped.
8079
8080If @code{GNUNET_FS_stop} is called before an operation has been stopped, a
8081@code{SUSPEND} event is generated and then when the process calls
8082@code{GNUNET_FS_start} next time, a @code{RESUME} event is generated.
8083Additionally, even if an application crashes (segfault, SIGKILL, system
8084crash) and hence @code{GNUNET_FS_stop} is never called and no
8085@code{SUSPEND} events are generated, operations are still resumed (with
8086@code{RESUME} events).
8087This is implemented by constantly writing the current state of the
8088file-sharing operations to disk.
8089Specifically, the current state is always written to disk whenever
8090anything significant changes (the exception is block-wise progress in
8091publishing and unindexing, since those operations would be slowed down
8092significantly and can be resumed cheaply even without detailed
8093accounting).
8094Note that if the process crashes (or is killed) during a serialization
8095operation, FS does not guarantee that this specific operation is
8096recoverable (no strict transactional semantics, again for performance
8097reasons). However, all other unrelated operations should resume nicely.
8098
8099Since we need to serialize the state continuously and want to recover as
8100much as possible even after crashing during a serialization operation,
8101we do not use one large file for serialization.
8102Instead, several directories are used for the various operations.
8103When @code{GNUNET_FS_start} executes, the master directories are scanned
8104for files describing operations to resume.
8105Sometimes, these operations can refer to related operations in child
8106directories which may also be resumed at this point.
8107Note that corrupted files are cleaned up automatically.
8108However, dangling files in child directories (those that are not
8109referenced by files from the master directories) are not automatically
8110removed.
8111
8112Persistence data is kept in a directory that begins with the "STATE_DIR"
8113prefix from the configuration file
8114(by default, "$SERVICEHOME/persistence/") followed by the name of the
8115client as given to @code{GNUNET_FS_start} (for example, "gnunet-gtk")
8116followed by the actual name of the master or child directory.
8117
8118The names for the master directories follow the names of the operations:
8119
8120@itemize @bullet
8121@item "search"
8122@item "download"
8123@item "publish"
8124@item "unindex"
8125@end itemize
8126
8127Each of the master directories contains names (chosen at random) for each
8128active top-level (master) operation.
8129Note that a download that is associated with a search result is not a
8130top-level operation.
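
The directory layout and the scan performed on start-up can be sketched as follows. This is an illustration only: the function names are hypothetical, the state files are assumed to be JSON so that "corrupted" has a simple meaning (the file cannot be parsed), and variable expansion is done with Python's @code{string.Template} rather than GNUnet's configuration code.

```python
import json
import os
from string import Template

MASTER_DIRECTORIES = ("search", "download", "publish", "unindex")


def persistence_directory(state_dir, variables, client_name, directory_name):
    """Expand STATE_DIR, then append the client and directory names."""
    prefix = Template(state_dir).substitute(variables)
    return os.path.join(prefix, client_name, directory_name)


def resume_operations(state_dir, variables, client_name):
    """Scan the master directories; return (kind, file name, state) tuples."""
    resumed = []
    for kind in MASTER_DIRECTORIES:
        directory = persistence_directory(state_dir, variables, client_name, kind)
        if not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            if not os.path.isfile(path):
                continue  # only regular files describe top-level operations
            try:
                with open(path, encoding="utf-8") as handle:
                    state = json.load(handle)
            except ValueError:
                os.remove(path)  # corrupted files are cleaned up
                continue
            resumed.append((kind, name, state))
    return resumed
```

Keeping one small file per operation means that a crash while one file is being written can damage at most that one operation; all other files remain readable on the next start.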
8131
8132In contrast to the master directories, the child directories are only
8133consulted when another operation refers to them.
8134For each search, a subdirectory (named after the master search
8135synchronization file) contains the search results.
8136Search results can have an associated download, which is then stored in
8137the general "download-child" directory.
8138Downloads can be recursive, in which case children are stored in
8139subdirectories mirroring the structure of the recursive download
8140(either starting in the master "download" directory or in the
8141"download-child" directory depending on how the download was initiated).
8142For publishing operations, the "publish-file" directory contains
8143information about the individual files and directories that are part of
8144the publication.
8145However, this directory structure is flat and does not mirror the
8146structure of the publishing operation.
8147Note that unindex operations cannot have associated child operations.
8148
8149@cindex REGEX subsystem
8150@cindex regex subsystem
8151@node GNUnet's REGEX Subsystem
8152@section GNUnet's REGEX Subsystem
8153
8154@c %**end of header
8155
8156Using the REGEX subsystem, you can discover peers that offer a particular
8157service using regular expressions.
8158The peers that offer a service specify it using a regular expression.
8159Peers that want to patronize a service search using a string.
8160The REGEX subsystem will then use the DHT to return a set of matching
8161offerers to the patrons.
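
The announce and search roles can be illustrated with a purely local model. This is a sketch under clear assumptions: @code{RegexDirectory} is a hypothetical stand-in for the DHT-backed lookup, and Python's @code{re} syntax is used, which is not claimed to be the regular expression syntax accepted by REGEX.

```python
import re


class RegexDirectory:
    """Local stand-in for the DHT-backed store of announcements."""

    def __init__(self):
        self._announcements = []  # list of (compiled regex, peer identity)

    def announce(self, peer, pattern):
        """An offerer describes its service with a regular expression."""
        self._announcements.append((re.compile(pattern), peer))

    def search(self, string):
        """A patron searches with a string; return all matching offerers."""
        return {
            peer
            for regex, peer in self._announcements
            if regex.fullmatch(string)
        }
```

The real subsystem avoids any central list of announcements by distributing the matching over the DHT; the sketch only shows which peers a search is expected to return.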
8162
8163For the technical details, see Max's defense talk and Max's Master's
8164thesis.
8165
8166@c An additional publication is under preparation and available to
8167@c team members (in Git).
8168@c FIXME: Where is the file? Point to it. Assuming that it's szengel2012ms
8169
8170@menu
8171* How to run the regex profiler::
8172@end menu
8173
8174@node How to run the regex profiler
8175@subsection How to run the regex profiler
8176
8177@c %**end of header
8178
8179The gnunet-regex-profiler can be used to profile the usage of mesh/regex
8180for a given set of regular expressions and strings.
8181Mesh/regex allows you to announce your peer ID under a certain regex and
8182search for peers matching a particular regex using a string.
8183See @uref{https://gnunet.org/szengel2012ms, szengel2012ms} for a full
8184introduction.
8185
8186First of all, the regex profiler uses GNUnet testbed, thus all the
8187implications for testbed also apply to the regex profiler
8188(for example you need password-less ssh login to the machines listed in
8189your hosts file).
8190
8191@strong{Configuration}
8192
8193Moreover, an appropriate configuration file is needed.
8194Generally you can refer to the
8195@file{contrib/regex_profiler_infiniband.conf} file in the sourcecode
8196of GNUnet for an example configuration.
8197The following paragraphs highlight the important details.
8198
8199Announcing of the regular expressions is done by the
8200gnunet-daemon-regexprofiler, therefore you have to make sure it is
8201started, by adding it to the AUTOSTART set of ARM:
8202
8203@example
8204[regexprofiler]
8205AUTOSTART = YES
8206@end example
8207
8208@noindent
8209Furthermore, you have to specify the location of the binary:
8210
8211@example
8212[regexprofiler]
8213# Location of the gnunet-daemon-regexprofiler binary.
8214BINARY = /home/szengel/gnunet/src/mesh/.libs/gnunet-daemon-regexprofiler
8215# Regex prefix that will be applied to all regular expressions and
8216# search string.
8217REGEX_PREFIX = "GNVPN-0001-PAD"
8218@end example
8219
8220@noindent
8221When running the profiler with a large scale deployment, you probably
8222want to reduce the workload of each peer.
8223Use the following options to do this.
8224
8225@example
8226[dht]
8227# Force network size estimation
8228FORCE_NSE = 1
8229
8230[dhtcache]
8231DATABASE = heap
8232# Disable RC-file for Bloom filter? (for benchmarking with limited IO
8233# availability)
8234DISABLE_BF_RC = YES
8235# Disable Bloom filter entirely
8236DISABLE_BF = YES
8237
8238[nse]
8239# Minimize proof-of-work CPU consumption by NSE
8240WORKBITS = 1
8241@end example
8242
8243@noindent
8244@strong{Options}
8245
8246Finally, to run the profiler, some options and the input data need to be
8247specified on the command line.
8248
8249@example
8250gnunet-regex-profiler -c config-file -d log-file -n num-links \
8251-p path-compression-length -s search-delay -t matching-timeout \
8252-a num-search-strings hosts-file policy-dir search-strings-file
8253@end example
8254
8255@noindent
8256Where...
8257
8258@itemize @bullet
8259@item ... @code{config-file} means the configuration file created earlier.
8260@item ... @code{log-file} is the file to which statistics output is written.
8261@item ... @code{num-links} indicates the number of random links between
8262started peers.
8263@item ... @code{path-compression-length} is the maximum path compression
8264length in the DFA.
8265@item ... @code{search-delay} is the time to wait between the peers
8266finishing linking and the start of string matching.
8267@item ... @code{matching-timeout} is the timeout after which searching
8268is cancelled.
8269@item ... @code{num-search-strings} is the number of strings in the
8270search-strings-file.
8271@item ... the @code{hosts-file} should contain a list of hosts for the
8272testbed, one per line in the following format:
8273
8274@itemize @bullet
8275@item @code{user@@host_ip:port}
8276@end itemize
8277@item ... the @code{policy-dir} is a folder of text files, each
8278containing one or more regular expressions. A peer is started for each
8279file in that folder and the regular expressions in the corresponding file
8280are announced by this peer.
8281@item ... the @code{search-strings-file} is a text file containing search
8282strings, one per line.
8283@end itemize
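
When the profiler is driven from a script, the argument list can be assembled programmatically. The following Python sketch uses only the options listed above; the helper names are hypothetical, and the values for @code{search-delay} and @code{matching-timeout} are passed through unchanged because the expected units are not specified here. The number of search strings is derived from the search strings file so that @code{-a} cannot drift out of sync with it.

```python
def hosts_line(user, host_ip, port):
    """Format one entry of the hosts file."""
    return f"{user}@{host_ip}:{port}"


def profiler_command(config_file, log_file, num_links,
                     path_compression_length, search_delay,
                     matching_timeout, hosts_file, policy_dir,
                     search_strings_file):
    """Build the gnunet-regex-profiler argument list."""
    # Count the non-empty lines: one search string per line.
    with open(search_strings_file, encoding="utf-8") as handle:
        num_search_strings = sum(1 for line in handle if line.strip())
    return [
        "gnunet-regex-profiler",
        "-c", config_file,
        "-d", log_file,
        "-n", str(num_links),
        "-p", str(path_compression_length),
        "-s", str(search_delay),
        "-t", str(matching_timeout),
        "-a", str(num_search_strings),
        hosts_file, policy_dir, search_strings_file,
    ]
```

The resulting list can be handed to @code{subprocess.run} without going through a shell, which avoids quoting problems with file names.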
8284
8285@noindent
8286You can create regular expressions and search strings for every AS in the
8287Internet using the attached scripts. You need one of the
8288@uref{http://data.caida.org/datasets/routing/routeviews-prefix2as/, CAIDA routeviews prefix2as}
8289data files for this. Run
8290
8291@example
8292create_regex.py <filename> <output path>
8293@end example
8294
8295@noindent
8296to create the regular expressions and
8297
8298@example
8299create_strings.py <input path> <outfile>
8300@end example
8301
8302@noindent
8303to create a search strings file from the previously created
8304regular expressions.