Diffstat (limited to 'doc/documentation/chapters/developer.texi')
-rw-r--r-- doc/documentation/chapters/developer.texi 7926
1 files changed, 7926 insertions, 0 deletions
diff --git a/doc/documentation/chapters/developer.texi b/doc/documentation/chapters/developer.texi
new file mode 100644
index 000000000..e690e5f5b
--- /dev/null
+++ b/doc/documentation/chapters/developer.texi
@@ -0,0 +1,7926 @@
1@c ***********************************************************************
2@node GNUnet Developer Handbook
3@chapter GNUnet Developer Handbook
4
5This book is intended to be an introduction for programmers that want to
6extend the GNUnet framework. GNUnet is more than a simple peer-to-peer
7application. For developers, GNUnet is:
8
9@itemize @bullet
10@item Free software under the GNU General Public License, with a community
11that believes in the GNU philosophy
12@item
13A set of standards, including coding conventions and architectural rules
14@item
A set of layered protocols, specifying both the communication between
peers and the communication between components of a single peer
17@item
18A set of libraries with well-defined APIs suitable for writing extensions
19@end itemize
20
21In particular, the architecture specifies that a peer consists of many
22processes communicating via protocols. Processes can be written in almost
23any language. C and Java APIs exist for accessing existing services and
24for writing extensions. It is possible to write extensions in other
25languages by implementing the necessary IPC protocols.
26
27GNUnet can be extended and improved along many possible dimensions, and
28anyone interested in free software and freedom-enhancing networking is
29welcome to join the effort. This developer handbook attempts to provide
30an initial introduction to some of the key design choices and central
31components of the system. This manual is far from complete, and we
32welcome informed contributions, be it in the form of new chapters or
33insightful comments.
34
35However, the website is experiencing a constant onslaught of sophisticated
36link-spam entered manually by exploited workers solving puzzles and
37customizing text. To limit this commercial defacement, we are strictly
38moderating comments and have disallowed "normal" users from posting new
39content. However, this is really only intended to keep the spam at bay. If
40you are a real user or aspiring developer, please drop us a note
41(IRC, e-mail, contact form) with your user profile ID number included.
42We will then relax these restrictions on your account. We're sorry for
43this inconvenience; however, few people would want to read this site
44if 99% of it was advertisements for bogus websites.
45
46
47
48@c ***********************************************************************
49
50
51
52
53
54
55
56
57@menu
58* Developer Introduction::
59* Code overview::
60* System Architecture::
61* Subsystem stability::
62* Naming conventions and coding style guide::
63* Build-system::
64* Developing extensions for GNUnet using the gnunet-ext template::
65* Writing testcases::
66* GNUnet's TESTING library::
67* Performance regression analysis with Gauger::
68* GNUnet's TESTBED Subsystem::
69* libgnunetutil::
70* The Automatic Restart Manager (ARM)::
71* GNUnet's TRANSPORT Subsystem::
72* NAT library::
73* Distance-Vector plugin::
74* SMTP plugin::
75* Bluetooth plugin::
76* WLAN plugin::
77* The ATS Subsystem::
78* GNUnet's CORE Subsystem::
79* GNUnet's CADET subsystem::
80* GNUnet's NSE subsystem::
81* GNUnet's HOSTLIST subsystem::
82* GNUnet's IDENTITY subsystem::
83* GNUnet's NAMESTORE Subsystem::
84* GNUnet's PEERINFO subsystem::
85* GNUnet's PEERSTORE subsystem::
86* GNUnet's SET Subsystem::
87* GNUnet's STATISTICS subsystem::
88* GNUnet's Distributed Hash Table (DHT)::
89* The GNU Name System (GNS)::
90* The GNS Namecache::
91* The REVOCATION Subsystem::
92* GNUnet's File-sharing (FS) Subsystem::
93* GNUnet's REGEX Subsystem::
94@end menu
95
96@node Developer Introduction
97@section Developer Introduction
98
This developer handbook is intended as a first introduction to GNUnet for
new developers who want to extend the GNUnet framework. After the
introduction, each of the GNUnet subsystems (directories in the
@file{src/} tree) is (supposed to be) covered in its own chapter. In
addition to this documentation, GNUnet developers should be aware of the
services available to them on the GNUnet server.

New developers can have a look at the GNUnet tutorials for C and Java
available in the @file{src/} directory of the repository or under the
following links:
109
110@c ** FIXME: Link to files in source, not online.
111@c ** FIXME: Where is the Java tutorial?
112@itemize @bullet
@item @uref{https://gnunet.org/git/gnunet.git/plain/doc/gnunet-c-tutorial.pdf, GNUnet C tutorial}
115@item GNUnet Java tutorial
116@end itemize
117
In addition to this book, the GNUnet server contains various resources for
GNUnet developers. They are all conveniently reachable via the "Developer"
entry in the navigation menu. Some additional tools (such as static
analysis reports) require special developer access to perform certain
operations. If you feel you need access, you should contact
@uref{http://grothoff.org/christian/, Christian Grothoff},
GNUnet's maintainer.
125
126The public subsystems on the GNUnet server that help developers are:
127
128@itemize @bullet
@item The version control system keeps our code and enables distributed
development. Only developers with write access can commit code; everyone
else is encouraged to submit patches to the
@uref{https://lists.gnu.org/mailman/listinfo/gnunet-developers,
GNUnet-developers mailing list}.
@item The GNUnet bug tracking system is used to track feature requests,
open bug reports and their resolutions. Anyone can report bugs; only
developers can claim to have fixed them.
@item A buildbot is used to check GNUnet builds automatically on a range
of platforms. Builds are triggered automatically after 30 minutes of no
changes to Git.
@item The current quality of our automated test suite is assessed using
code coverage analysis. This analysis is run daily; however, the webpage
is only updated if all automated tests pass at that time. Testcases that
improve our code coverage are always welcome.
@item We try to automatically find bugs using a static analysis scan.
This scan is run daily; however, the webpage is only updated if all
automated tests pass at the time. Note that not everything that is
flagged by the analysis is a bug; sometimes even good code can be marked
as possibly problematic. Nevertheless, developers are encouraged to at
least be aware of all issues in their code that are listed.
@item We use Gauger for automatic performance regression visualization.
Details on how to use Gauger are in
@ref{Performance regression analysis with Gauger}.
152@item We use @uref{http://junit.org/, junit} to automatically test
153gnunet-java. Automatically generated, current reports on the test suite
154are here.
155@item We use Cobertura to generate test coverage reports for gnunet-java.
156Current reports on test coverage are here.
157@end itemize
158
159
160
161@c ***********************************************************************
162@menu
163* Project overview::
164@end menu
165
166@node Project overview
167@subsection Project overview
168
The GNUnet project currently consists of several sub-projects. This
section gives an initial overview of the various
sub-projects. Note that this description also lists projects that are far
from complete, including even those that have literally not a single line
of code in them yet.
174
175GNUnet sub-projects in order of likely relevance are currently:
176
177@table @asis
178
179@item gnunet Core of the P2P framework, including file-sharing, VPN and
180chat applications; this is what the developer handbook covers mostly
181@item gnunet-gtk Gtk+-based user interfaces, including gnunet-fs-gtk
182(file-sharing), gnunet-statistics-gtk (statistics over time),
183gnunet-peerinfo-gtk (information about current connections and known
184peers), gnunet-chat-gtk (chat GUI) and gnunet-setup (setup tool for
185"everything")
186@item gnunet-fuse Mounting directories shared via GNUnet's file-sharing
187on Linux
188@item gnunet-update Installation and update tool
189@item gnunet-ext Template for starting 'external' GNUnet projects
190@item gnunet-java Java APIs for writing GNUnet services and applications
@c ** FIXME: Point to new website repository once we have it:
@c ** @item svn/gnunet-www/ Code and media helping drive the GNUnet website
194@item eclectic Code to run
195GNUnet nodes on testbeds for research, development, testing and evaluation
196@c ** FIXME: Solve the status and location of gnunet-qt
@item gnunet-qt Qt-based GNUnet GUI (dead?)
@item gnunet-cocoa Cocoa-based GNUnet GUI (dead?)
199
200@end table
201
202We are also working on various supporting libraries and tools:
203@c ** FIXME: What about gauger, and what about libmwmodem?
204
205@table @asis
206@item libextractor GNU libextractor (meta data extraction)
207@item libmicrohttpd GNU libmicrohttpd (embedded HTTP(S) server library)
208@item gauger Tool for performance regression analysis
209@item monkey Tool for automated debugging of distributed systems
210@item libmwmodem Library for accessing satellite connection quality
211reports
212@end table
213
214Finally, there are various external projects (see links for a list of
215those that have a public website) which build on top of the GNUnet
216framework.
217
218@c ***********************************************************************
219@node Code overview
220@section Code overview
221
222This section gives a brief overview of the GNUnet source code.
223Specifically, we sketch the function of each of the subdirectories in
224the @file{gnunet/src/} directory. The order given is roughly bottom-up
225(in terms of the layers of the system).
226
227@table @asis
@item util/ --- libgnunetutil Library with general utility functions; all
GNUnet binaries link against this library. It covers anything from memory
allocation and data structures to cryptography and inter-process
communication. The goal is to provide an OS-independent interface and
more 'secure' or convenient implementations of commonly used primitives.
The API is spread over more than a dozen headers; developers should study
those closely to avoid duplicating existing functions.
235@item hello/ --- libgnunethello HELLO messages are used to
236describe under which addresses a peer can be reached (for example,
237protocol, IP, port). This library manages parsing and generating of HELLO
238messages.
@item block/ --- libgnunetblock The DHT and other components of GNUnet
store information in units called 'blocks'. Each block has a type and the
type defines a particular format and how that binary format is to be
linked to a hash code (the key for the DHT and for databases). The block
library is a wrapper around block plugins which provide the necessary
functions for each block type.
@item statistics/ The statistics service enables associating
values (of type uint64_t) with a component name and a string. The main
uses are debugging (counting events), performance tracking and user
entertainment (what did my peer do today?).
@item arm/ The automatic-restart-manager (ARM) service
is the GNUnet master service. Its role is to start gnunet-services, to
restart them when they crash and finally to shut down the system when
requested.
@item peerinfo/ The peerinfo service keeps track of which peers are known
to the local peer and also tracks the validated addresses of each of
those peers (in the form of a HELLO message). The peer is not
necessarily connected to all peers known to the peerinfo service.
Peerinfo provides persistent storage for peer identities --- peers are
not forgotten just because of a system restart.
259@item datacache/ --- libgnunetdatacache The datacache
260library provides (temporary) block storage for the DHT. Existing plugins
261can store blocks in Sqlite, Postgres or MySQL databases. All data stored
262in the cache is lost when the peer is stopped or restarted (datacache
263uses temporary tables).
264@item datastore/ The datastore service stores file-sharing blocks in
265databases for extended periods of time. In contrast to the datacache, data
266is not lost when peers restart. However, quota restrictions may still
267cause old, expired or low-priority data to be eventually discarded.
268Existing plugins can store blocks in Sqlite, Postgres or MySQL databases.
269@item template/ Template for writing a new service. Does nothing.
270@item ats/ The automatic transport
271selection (ATS) service is responsible for deciding which address (i.e.
272which transport plugin) should be used for communication with other peers,
273and at what bandwidth.
274@item nat/ --- libgnunetnat Library that provides basic
275functions for NAT traversal. The library supports NAT traversal with
276manual hole-punching by the user, UPnP and ICMP-based autonomous NAT
277traversal. The library also includes an API for testing if the current
278configuration works and the @code{gnunet-nat-server} which provides an
279external service to test the local configuration.
@item fragmentation/ --- libgnunetfragmentation Some
transports (UDP and WLAN, mostly) have restrictions on the maximum
transfer unit (MTU) for packets. The fragmentation library can be used to
break larger packets into chunks of at most 1k and transmit the resulting
fragments reliably (with acknowledgement, retransmission, timeouts,
etc.).
@item transport/ The transport service is responsible for managing the
basic P2P communication. It uses plugins to support P2P communication
over TCP, UDP, HTTP, HTTPS and other protocols. The transport service
validates peer addresses, enforces bandwidth restrictions, limits the
total number of connections and enforces connectivity restrictions (e.g.,
friends-only).
292@item peerinfo-tool/
293This directory contains the gnunet-peerinfo binary which can be used to
294inspect the peers and HELLOs known to the peerinfo service.
295@item core/ The core
296service is responsible for establishing encrypted, authenticated
297connections with other peers, encrypting and decrypting messages and
298forwarding messages to higher-level services that are interested in them.
@item testing/ --- libgnunettesting The testing library allows starting
(and stopping) peers for writing testcases.
It also supports automatic generation of configurations for peers
ensuring that the ports and paths are disjoint. libgnunettesting is also
the foundation for the testbed service.
@item testbed/ The testbed service is
used for creating small or large scale deployments of GNUnet peers for
evaluation of protocols. It facilitates peer deployments on multiple
hosts (for example, in a cluster) and establishing various network
topologies (both underlay and overlay).
310@item nse/ The network size estimation (NSE) service
311implements a protocol for (securely) estimating the current size of the
312P2P network.
313@item dht/ The distributed hash table (DHT) service provides a
314distributed implementation of a hash table to store blocks under hash
315keys in the P2P network.
316@item hostlist/ The hostlist service allows learning about
317other peers in the network by downloading HELLO messages from an HTTP
318server, can be configured to run such an HTTP server and also implements
319a P2P protocol to advertise and automatically learn about other peers
320that offer a public hostlist server.
@item topology/ The topology service is responsible for
maintaining the mesh topology. It tries to maintain connections to friends
(depending on the configuration) and also tries to ensure that the peer
has a decent number of active connections at all times. If necessary, new
connections are added. All peers should run the topology service;
otherwise they may end up not being connected to any other peer (unless
some other service ensures that core establishes the required
connections). The topology service also tells the transport service which
connections are permitted (for friend-to-friend networking).
@item fs/ The file-sharing (FS) service implements GNUnet's
file-sharing application. Both anonymous file-sharing (using GAP) and
non-anonymous file-sharing (using the DHT) are supported.
333@item cadet/ The CADET
334service provides a general-purpose routing abstraction to create
335end-to-end encrypted tunnels in mesh networks. We wrote a paper
336documenting key aspects of the design.
337@item tun/ --- libgnunettun Library for building IPv4, IPv6
338packets and creating checksums for UDP, TCP and ICMP packets. The header
339defines C structs for common Internet packet formats and in particular
340structs for interacting with TUN (virtual network) interfaces.
341@item mysql/ ---
342libgnunetmysql Library for creating and executing prepared MySQL
343statements and to manage the connection to the MySQL database.
344Essentially a lightweight wrapper for the interaction between GNUnet
345components and libmysqlclient.
346@item dns/ Service that allows intercepting and modifying DNS requests of
347the local machine. Currently used for IPv4-IPv6 protocol translation
348(DNS-ALG) as implemented by "pt/" and for the GNUnet naming system. The
349service can also be configured to offer an exit service for DNS traffic.
350@item vpn/ The virtual
351public network (VPN) service provides a virtual tunnel interface (VTUN)
352for IP routing over GNUnet. Needs some other peers to run an "exit"
353service to work.
354Can be activated using the "gnunet-vpn" tool or integrated with DNS using
355the "pt" daemon.
@item exit/ Daemon to allow traffic from the VPN to exit this
peer to the Internet or to specific IP-based services of the local peer.
Currently, an exit service can only be restricted to IPv4 or IPv6, not to
specific ports and/or IP address ranges. If this is not acceptable,
additional firewall rules must be added manually. exit currently only
works for normal UDP, TCP and ICMP traffic; DNS queries need to leave the
system via a DNS service.
@item pt/ Protocol translation daemon. This daemon enables 4-to-6,
6-to-4, 4-over-6 or 6-over-4 transitions for the local system. It
essentially uses "DNS" to intercept DNS replies and then maps results to
those offered by the VPN, which then sends them using mesh to some daemon
offering an appropriate exit service.
368@item identity/ Management of egos (alter egos) of a user; identities are
369essentially named ECC private keys and used for zones in the GNU name
370system and for namespaces in file-sharing, but might find other uses later
371@item revocation/ Key revocation service, can be used to revoke the
372private key of an identity if it has been compromised
373@item namecache/ Cache
374for resolution results for the GNU name system; data is encrypted and can
375be shared among users, loss of the data should ideally only result in a
376performance degradation (persistence not required)
377@item namestore/ Database
378for the GNU name system with per-user private information, persistence
379required
380@item gns/ GNU name system, a GNU approach to DNS and PKI.
381@item dv/ A plugin
382for distance-vector (DV)-based routing. DV consists of a service and a
383transport plugin to provide peers with the illusion of a direct P2P
384connection for connections that use multiple (typically up to 3) hops in
385the actual underlay network.
386@item regex/ Service for the (distributed) evaluation of
387regular expressions.
388@item scalarproduct/ The scalar product service offers an
389API to perform a secure multiparty computation which calculates a scalar
390product between two peers without exposing the private input vectors of
391the peers to each other.
392@item consensus/ The consensus service will allow a set
393of peers to agree on a set of values via a distributed set union
394computation.
@item rest/ The REST API allows access to GNUnet services using RESTful
interaction. The services provide plugins that can be exposed by the REST
server.
@item experimentation/ The experimentation daemon coordinates distributed
experimentation to evaluate transport and ATS properties.
400@end table
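
The @file{fragmentation/} entry above describes breaking larger packets
into chunks of at most 1k. The splitting step on its own can be sketched
as follows. This is a minimal sketch, assuming a fixed limit of 1024
bytes; the names @code{fragment_count} and @code{fragment_at} are
hypothetical and are not part of libgnunetfragmentation, and
acknowledgement, retransmission and timeouts are left out entirely.

```c
#include <stddef.h>

/* Assumed chunk limit for this sketch ("at most 1k" in the text above). */
#define SKETCH_MAX_FRAGMENT 1024

/* Hypothetical helper: number of fragments needed for LEN bytes. */
static size_t
fragment_count (size_t len)
{
  return len / SKETCH_MAX_FRAGMENT
         + ((len % SKETCH_MAX_FRAGMENT) != 0 ? 1 : 0);
}

/* Hypothetical helper: compute offset and size of fragment number IDX.
   Returns 0 on success, -1 if IDX is out of range. */
static int
fragment_at (size_t len, size_t idx, size_t *off, size_t *size)
{
  size_t remaining;

  if (idx >= fragment_count (len))
    return -1;
  *off = idx * SKETCH_MAX_FRAGMENT;
  remaining = len - *off;
  /* Every fragment is full-sized except possibly the last one. */
  *size = (remaining < SKETCH_MAX_FRAGMENT) ? remaining : SKETCH_MAX_FRAGMENT;
  return 0;
}
```

A sender would iterate @code{idx} from 0 up to the fragment count and
transmit each slice; a receiver needs the offsets to reassemble the
message regardless of arrival order.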
401
402@c ***********************************************************************
403@node System Architecture
404@section System Architecture
405
406GNUnet developers like legos. The blocks are indestructible, can be
407stacked together to construct complex buildings and it is generally easy
408to swap one block for a different one that has the same shape. GNUnet's
409architecture is based on legos:
410
411@c images here
412
413This chapter documents the GNUnet lego system, also known as GNUnet's
414system architecture.
415
416The most common GNUnet component is a service. Services offer an API (or
417several, depending on what you count as "an API") which is implemented as
418a library. The library communicates with the main process of the service
419using a service-specific network protocol. The main process of the service
420typically doesn't fully provide everything that is needed --- it has holes
421to be filled by APIs to other services.
422
User interfaces and daemons are a special kind of component in GNUnet.
Like services, they have holes to be filled by APIs of other services.
Unlike services, daemons do not implement their own network protocol and
they have no API.
427
428The GNUnet system provides a range of services, daemons and user
429interfaces, which are then combined into a layered GNUnet instance (also
430known as a peer).
431
432Note that while it is generally possible to swap one service for another
433compatible service, there is often only one implementation. However,
434during development we often have a "new" version of a service in parallel
435with an "old" version. While the "new" version is not working, developers
436working on other parts of the service can continue their development by
437simply using the "old" service. Alternative design ideas can also be
438easily investigated by swapping out individual components. This is
439typically achieved by simply changing the name of the "BINARY" in the
440respective configuration section.
441
442Key properties of GNUnet services are that they must be separate
443processes and that they must protect themselves by applying tight error
444checking against the network protocol they implement (thereby achieving a
445certain degree of robustness).
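
The tight error checking mentioned above can be sketched as a length and
type check on an incoming buffer before any payload is interpreted. The
header layout used here (a 16-bit total size followed by a 16-bit type,
both in network byte order) and the names @code{SKETCH_HEADER_SIZE} and
@code{check_message} are assumptions made for illustration; they are not
taken from the GNUnet headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed header: 2 bytes size + 2 bytes type, both big-endian. */
#define SKETCH_HEADER_SIZE 4

/* Hypothetical validator: returns 0 if BUF starts with a complete message
   of type EXPECTED_TYPE that is at least MIN_SIZE bytes long, else -1. */
static int
check_message (const uint8_t *buf,
               size_t available,
               uint16_t expected_type,
               size_t min_size)
{
  uint16_t size;
  uint16_t type;

  /* Never read header fields that are not fully present. */
  if (available < SKETCH_HEADER_SIZE)
    return -1;
  size = (uint16_t) ((buf[0] << 8) | buf[1]);
  type = (uint16_t) ((buf[2] << 8) | buf[3]);
  /* The claimed size must cover the header, satisfy the minimum for this
     message type and must not exceed what was actually received. */
  if ((size < SKETCH_HEADER_SIZE) || (size < min_size) || (size > available))
    return -1;
  if (type != expected_type)
    return -1;
  return 0;
}
```

Rejecting a message as soon as one check fails means a malformed or
malicious peer cannot make the service read beyond the received buffer.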
446
447On the other hand, the APIs are implemented to tolerate failures of the
448service, isolating their host process from errors by the service. If the
449service process crashes, other services and daemons around it should not
450also fail, but instead wait for the service process to be restarted by
451ARM.
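
One way a service access library could wait for a crashed service to be
restarted is to retry the connection with a growing delay. The sketch
below only computes the delays and tracks the connection state; the
names (@code{sketch_client}, @code{sketch_try_reconnect},
@code{sketch_fail_n_times}) and the delay bounds are hypothetical and do
not reflect how the GNUnet libraries are actually implemented.

```c
#include <stdint.h>

/* Arbitrary bounds chosen for this sketch. */
#define SKETCH_BACKOFF_START_MS 100
#define SKETCH_BACKOFF_MAX_MS 30000

struct sketch_client
{
  int connected;        /* 1 while the service is reachable */
  uint32_t backoff_ms;  /* delay to use after the next failure */
};

/* A connection attempt; returns 1 on success, 0 on failure. */
typedef int (*sketch_connect_fn) (void *cls);

/* Try to connect once. Returns 0 on success, otherwise the number of
   milliseconds the caller should wait before trying again. */
static uint32_t
sketch_try_reconnect (struct sketch_client *c,
                      sketch_connect_fn try_connect,
                      void *cls)
{
  uint32_t delay;

  if (try_connect (cls))
  {
    c->connected = 1;
    c->backoff_ms = SKETCH_BACKOFF_START_MS; /* reset after success */
    return 0;
  }
  c->connected = 0;
  delay = c->backoff_ms;
  c->backoff_ms *= 2; /* exponential growth, capped below */
  if (c->backoff_ms > SKETCH_BACKOFF_MAX_MS)
    c->backoff_ms = SKETCH_BACKOFF_MAX_MS;
  return delay;
}

/* Stand-in for a real connection attempt: fails until the counter that
   CLS points to reaches zero, simulating a service being restarted. */
static int
sketch_fail_n_times (void *cls)
{
  int *remaining = cls;

  if (*remaining > 0)
  {
    (*remaining)--;
    return 0;
  }
  return 1;
}
```

The host process stays alive throughout; it simply sees a disconnected
client for a while and schedules the next attempt after the returned
delay.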
452
453
454@c ***********************************************************************
455@node Subsystem stability
456@section Subsystem stability
457
This section documents the current stability of the various GNUnet
subsystems. Stability here describes the expected degree of compatibility
with future versions of GNUnet. For each subsystem we distinguish between
compatibility on the P2P network level (communication protocol between
peers), the IPC level (communication between the service and the service
library) and the API level (stability of the API). P2P compatibility is
relevant in terms of which applications are likely going to be able to
communicate with future versions of the network. IPC compatibility is
relevant for the implementation of language bindings that re-implement the
IPC messages. Finally, API compatibility is relevant to developers who
hope to be able to avoid changes to applications built on top of the APIs
of the framework.
470
471The following table summarizes our current view of the stability of the
472respective protocols or APIs:
473
474@multitable @columnfractions .20 .20 .20 .20
475@headitem Subsystem @tab P2P @tab IPC @tab C API
476@item util @tab n/a @tab n/a @tab stable
477@item arm @tab n/a @tab stable @tab stable
478@item ats @tab n/a @tab unstable @tab testing
479@item block @tab n/a @tab n/a @tab stable
480@item cadet @tab testing @tab testing @tab testing
481@item consensus @tab experimental @tab experimental @tab experimental
482@item core @tab stable @tab stable @tab stable
483@item datacache @tab n/a @tab n/a @tab stable
484@item datastore @tab n/a @tab stable @tab stable
485@item dht @tab stable @tab stable @tab stable
486@item dns @tab stable @tab stable @tab stable
487@item dv @tab testing @tab testing @tab n/a
488@item exit @tab testing @tab n/a @tab n/a
489@item fragmentation @tab stable @tab n/a @tab stable
490@item fs @tab stable @tab stable @tab stable
491@item gns @tab stable @tab stable @tab stable
492@item hello @tab n/a @tab n/a @tab testing
493@item hostlist @tab stable @tab stable @tab n/a
494@item identity @tab stable @tab stable @tab n/a
495@item multicast @tab experimental @tab experimental @tab experimental
496@item mysql @tab stable @tab n/a @tab stable
497@item namestore @tab n/a @tab stable @tab stable
498@item nat @tab n/a @tab n/a @tab stable
499@item nse @tab stable @tab stable @tab stable
500@item peerinfo @tab n/a @tab stable @tab stable
501@item psyc @tab experimental @tab experimental @tab experimental
502@item pt @tab n/a @tab n/a @tab n/a
503@item regex @tab stable @tab stable @tab stable
504@item revocation @tab stable @tab stable @tab stable
505@item social @tab experimental @tab experimental @tab experimental
506@item statistics @tab n/a @tab stable @tab stable
507@item testbed @tab n/a @tab testing @tab testing
508@item testing @tab n/a @tab n/a @tab testing
509@item topology @tab n/a @tab n/a @tab n/a
510@item transport @tab stable @tab stable @tab stable
511@item tun @tab n/a @tab n/a @tab stable
512@item vpn @tab testing @tab n/a @tab n/a
513@end multitable
514
515Here is a rough explanation of the values:
516
517@table @samp
518@item stable
519No incompatible changes are planned at this time; for IPC/APIs, if
520there are incompatible changes, they will be minor and might only require
521minimal changes to existing code; for P2P, changes will be avoided if at
522all possible for the 0.10.x-series
523
524@item testing
525No incompatible changes are
526planned at this time, but the code is still known to be in flux; so while
527we have no concrete plans, our expectation is that there will still be
528minor modifications; for P2P, changes will likely be extensions that
529should not break existing code
530
531@item unstable
532Changes are planned and will happen; however, they
533will not be totally radical and the result should still resemble what is
534there now; nevertheless, anticipated changes will break protocol/API
535compatibility
536
537@item experimental
538Changes are planned and the result may look nothing like
539what the API/protocol looks like today
540
@item unknown
Someone should think about where this subsystem is headed
543
544@item n/a
545This subsystem does not have an API/IPC-protocol/P2P-protocol
546@end table
547
548@c ***********************************************************************
549@node Naming conventions and coding style guide
550@section Naming conventions and coding style guide
551
552Here you can find some rules to help you write code for GNUnet.
553
554
555
556@c ***********************************************************************
557@menu
558* Naming conventions::
559* Coding style::
560@end menu
561
562@node Naming conventions
563@subsection Naming conventions
564
565
566@c ***********************************************************************
567@menu
568* include files::
569* binaries::
570* logging::
571* configuration::
572* exported symbols::
573* private (library-internal) symbols (including structs and macros)::
574* testcases::
575* performance tests::
576* src/ directories::
577@end menu
578
579@node include files
580@subsubsection include files
581
582@itemize @bullet
583@item _lib: library without need for a process
584@item _service: library that needs a service process
585@item _plugin: plugin definition
586@item _protocol: structs used in network protocol
587@item exceptions:
588@itemize @bullet
589@item gnunet_config.h --- generated
590@item platform.h --- first included
591@item plibc.h --- external library
592@item gnunet_common.h --- fundamental routines
593@item gnunet_directories.h --- generated
594@item gettext.h --- external library
595@end itemize
596@end itemize
597
598@c ***********************************************************************
599@node binaries
600@subsubsection binaries
601
602@itemize @bullet
603@item gnunet-service-xxx: service process (has listen socket)
604@item gnunet-daemon-xxx: daemon process (no listen socket)
605@item gnunet-helper-xxx[-yyy]: SUID helper for module xxx
606@item gnunet-yyy: command-line tool for end-users
607@item libgnunet_plugin_xxx_yyy.so: plugin for API xxx
608@item libgnunetxxx.so: library for API xxx
609@end itemize
610
611@c ***********************************************************************
612@node logging
613@subsubsection logging
614
615@itemize @bullet
@item services and daemons use their directory name in GNUNET_log_setup
(e.g., 'core') and log using plain 'GNUNET_log'.
@item command-line tools use their full name in GNUNET_log_setup (e.g.,
'gnunet-publish') and log using plain 'GNUNET_log'.
@item service access libraries log using 'GNUNET_log_from' and use
'DIRNAME-api' for the component (e.g., 'core-api')
@item pure libraries (without associated service) use 'GNUNET_log_from'
with the component set to their library name (without lib or '.so'),
which should also be their directory name (e.g., 'nat')
@item plugins should use 'GNUNET_log_from' with the directory name and the
plugin name combined to produce the component name (e.g., 'transport-tcp').
@item logging should be unified per-file by defining a LOG macro with the
appropriate arguments, along these lines:
@code{#define LOG(kind,...) GNUNET_log_from (kind, "example-api", __VA_ARGS__)}
630@end itemize
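
The per-file LOG macro from the last item can be tried out without the
GNUnet headers. In the sketch below, @code{sketch_log_from} is a
hypothetical stand-in for @code{GNUNET_log_from} that formats the
message into a buffer instead of writing to the real log; only the shape
of the @code{LOG} macro follows the convention described above.

```c
#include <stdarg.h>
#include <stdio.h>

/* Holds the most recent message so that the effect can be inspected. */
static char sketch_last_log[256];

/* Stand-in for GNUNET_log_from (an assumption of this sketch, not the
   real function): stores "COMPONENT: MESSAGE" in sketch_last_log. */
static void
sketch_log_from (int kind, const char *component, const char *fmt, ...)
{
  va_list ap;
  int off;

  (void) kind; /* the severity is ignored in this sketch */
  off = snprintf (sketch_last_log, sizeof (sketch_last_log),
                  "%s: ", component);
  if ((off < 0) || ((size_t) off >= sizeof (sketch_last_log)))
    return;
  va_start (ap, fmt);
  vsnprintf (sketch_last_log + off,
             sizeof (sketch_last_log) - (size_t) off,
             fmt, ap);
  va_end (ap);
}

/* One LOG macro per file fixes the component name in a single place. */
#define LOG(kind, ...) sketch_log_from (kind, "example-api", __VA_ARGS__)
```

With this in place, a call such as
@code{LOG (0, "connected after %d tries", 3)} carries the component name
automatically, so renaming the component only requires changing the
macro.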
631
632@c ***********************************************************************
633@node configuration
634@subsubsection configuration
635
636@itemize @bullet
637@item paths (that are substituted in all filenames) are in PATHS (have as
638few as possible)
639@item all options for a particular module (src/MODULE) are under [MODULE]
640@item options for a plugin of a module are under [MODULE-PLUGINNAME]
641@end itemize
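
The section naming rules above can be sketched as a small helper that
derives the section name for a module or for one of its plugins. The
function @code{sketch_section_name} is hypothetical, and the module and
plugin names used with it (@code{transport}, @code{tcp}) are example
values only; whether a given section exists in an actual configuration
is not implied.

```c
#include <stdio.h>

/* Hypothetical helper: writes "MODULE" (if PLUGIN is NULL) or
   "MODULE-PLUGINNAME" into BUF. Returns 0 on success, -1 if the name
   does not fit into BUFLEN bytes. */
static int
sketch_section_name (char *buf,
                     size_t buflen,
                     const char *module,
                     const char *plugin)
{
  int n;

  if (NULL == plugin)
    n = snprintf (buf, buflen, "%s", module);
  else
    n = snprintf (buf, buflen, "%s-%s", module, plugin);
  /* snprintf reports the length it wanted to write; treat truncation
     as an error instead of returning a cut-off section name. */
  return ((n < 0) || ((size_t) n >= buflen)) ? -1 : 0;
}
```

Keeping the derivation in one place avoids typing the combined name by
hand in several files.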
642
643@c ***********************************************************************
644@node exported symbols
645@subsubsection exported symbols
646
647@itemize @bullet
648@item must start with "GNUNET_modulename_" and be defined in
649"modulename.c"
650@item exceptions: those defined in gnunet_common.h
651@end itemize
652
653@c ***********************************************************************
654@node private (library-internal) symbols (including structs and macros)
655@subsubsection private (library-internal) symbols (including structs and macros)
656
657@itemize @bullet
658@item must NOT start with any prefix
@item must not be exported in a way that linkers could use them or other
libraries might see them via headers; they must be either
declared/defined in C source files or in headers that are in the
respective directory under src/modulename/ and NEVER be declared
in src/include/.
664@end itemize
665
666@node testcases
667@subsubsection testcases
668
669@itemize @bullet
670@item must be called "test_module-under-test_case-description.c"
@item "case-description" may be omitted if there is only one test
672@end itemize
673
674@c ***********************************************************************
675@node performance tests
676@subsubsection performance tests
677
678@itemize @bullet
679@item must be called "perf_module-under-test_case-description.c"
@item "case-description" may be omitted if there is only one performance
test
682@item Must only be run if HAVE_BENCHMARKS is satisfied
683@end itemize
684
685@c ***********************************************************************
686@node src/ directories
687@subsubsection src/ directories
688
689@itemize @bullet
@item gnunet-NAME: end-user applications (e.g., gnunet-search, gnunet-arm)
@item gnunet-service-NAME: service processes with accessor library (e.g.,
gnunet-service-arm)
@item libgnunetNAME: accessor library (_service.h-header) or standalone
library (_lib.h-header)
@item gnunet-daemon-NAME: daemon process without accessor library (e.g.,
gnunet-daemon-hostlist) and no GNUnet management port
@item libgnunet_plugin_DIR_NAME: loadable plugins (e.g.,
libgnunet_plugin_transport_tcp)
699@end itemize
700
701@c ***********************************************************************
702@node Coding style
703@subsection Coding style
704
705@itemize @bullet
706@item GNU guidelines generally apply
707@item Indentation is done with spaces, two per level, no tabs
708@item C99 struct initialization is fine
@item declare only one variable per line, so

@example
int i;
int j;
@end example
714
715instead of
716
717@example
718int i,j;
719@end example
720
721This helps keep diffs small and forces developers to think precisely about
722the type of every variable. Note that @code{char *} is different from
723@code{const char*} and @code{int} is different from @code{unsigned int}
724or @code{uint32_t}. Each variable type should be chosen with care.
725
726@item While @code{goto} should generally be avoided, having a @code{goto}
727to the end of a function to a block of clean up statements (free, close,
728etc.) can be acceptable.
729
730@item Conditions should be written with constants on the left (to avoid
731accidental assignment) and with the 'true' target being either the
732'error' case or the significantly simpler continuation. For example:
733
@example
if (0 != stat ("filename", &sbuf)) @{
  error();
@} else @{
  /* handle normal case here */
@}
@end example
739
740instead of
741
@example
if (stat ("filename", &sbuf) == 0) @{
  /* handle normal case here */
@} else @{
  error();
@}
@end example
747
748If possible, the error clause should be terminated with a 'return' (or
749'goto' to some cleanup routine) and in this case, the 'else' clause
750should be omitted:
751
@example
if (0 != stat ("filename", &sbuf)) @{
  error();
  return;
@}
/* handle normal case here */
@end example
756
This serves to avoid deep nesting. The 'constants on the left' rule
applies to all constants (including @code{GNUNET_SCHEDULER_NO_TASK},
NULL, and enums). With the two above rules (constants on left, errors in
'true' branch), there is only one way to write most branches correctly.
761
762@item Combined assignments and tests are allowed if they do not hinder
763code clarity. For example, one can write:
764
765@example
766if (NULL == (value = lookup_function())) @{ error(); return; @}
767@end example
768
769
770@item Use @code{break} and @code{continue} wherever possible to avoid
771deep(er) nesting. Thus, we would write:
772
@example
next = head;
while (NULL != (pos = next))
@{
  next = pos->next;
  if (! should_free (pos))
    continue;
  GNUNET_CONTAINER_DLL_remove (head, tail, pos);
  GNUNET_free (pos);
@}
@end example
778
779
780instead of
@example
next = head;
while (NULL != (pos = next))
@{
  next = pos->next;
  if (should_free (pos))
  @{
    /* unnecessary nesting! */
    GNUNET_CONTAINER_DLL_remove (head, tail, pos);
    GNUNET_free (pos);
  @}
@}
@end example
787
788
789@item We primarily use @code{for} and @code{while} loops. A @code{while}
790loop is used if the method for advancing in the loop is not a
791straightforward increment operation. In particular, we use:
792
793@example
794next = head;
795while (NULL != (pos = next))
796@{
797 next = pos->next;
798 if (! should_free (pos))
799 continue;
800 GNUNET_CONTAINER_DLL_remove (head, tail, pos);
801 GNUNET_free (pos);
802@}
803@end example
804
805
806to free entries in a list (as the iteration changes the structure of the
list due to the free; the equivalent @code{for} loop no longer
follows the simple @code{for} paradigm of @code{for(INIT;TEST;INC)}).
809However, for loops that do follow the simple @code{for} paradigm we do
810use @code{for}, even if it involves linked lists:
811
812@example
813/* simple iteration over a linked list */
814for (pos = head; NULL != pos; pos = pos->next)
815@{
816 use (pos);
817@}
818@end example
819
820
821@item The first argument to all higher-order functions in GNUnet must be
822declared to be of type @code{void *} and is reserved for a closure. We do
not use inner functions, as trampolines would conflict with setups that
use non-executable stacks. The first statement in a higher-order
function, which usually should be part of the variable declarations,
should assign the @code{cls} argument to the precise expected type.
For example:
828
@example
int
callback (void *cls, char *args)
@{
  struct Foo *foo = cls;
  int other_variables;

  /* rest of function */
@}
@end example
836
837
838@item It is good practice to write complex @code{if} expressions instead
839of using deeply nested @code{if} statements. However, except for addition
840and multiplication, all operators should use parens. This is fine:
841
842@example
843if ( (1 == foo) || ((0 == bar) && (x != y)) )
844 return x;
845@end example
846
847
848However, this is not:
849@example
850if (1 == foo)
851 return x;
852if (0 == bar && x != y)
853 return x;
854@end example
855
856
857Note that splitting the @code{if} statement above is debateable as the
858@code{return x} is a very trivial statement. However, once the logic after
859the branch becomes more complicated (and is still identical), the "or"
860formulation should be used for sure.
861
862@item There should be two empty lines between the end of the function and
863the comments describing the following function. There should be a single
864empty line after the initial variable declarations of a function. If a
865function has no local variables, there should be no initial empty line. If
866a long function consists of several complex steps, those steps might be
867separated by an empty line (possibly followed by a comment describing the
868following step). The code should not contain empty lines in arbitrary
869places; if in doubt, it is likely better to NOT have an empty line (this
870way, more code will fit on the screen).
871@end itemize
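
Several of the rules above can be seen together in the following
self-contained sketch. It deliberately uses no GNUnet API:
@code{struct Entry}, @code{EntryCallback}, @code{for_each},
@code{remove_matching} and the other names are hypothetical and exist only
for illustration. The sketch shows a closure as the first callback
argument, constants on the left of comparisons, a combined assignment and
test, a plain @code{for} loop for simple iteration, and a @code{while}
loop with @code{continue} for a traversal that modifies the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list element; not a GNUnet type. */
struct Entry
{
  struct Entry *next;
  int value;
};

/* Higher-order function type: the closure is always the first argument. */
typedef int (*EntryCallback) (void *cls, const struct Entry *e);


/* Callback that counts entries; the closure is an 'unsigned int *'. */
static int
count_cb (void *cls, const struct Entry *e)
{
  unsigned int *counter = cls;

  (void) e;
  (*counter)++;
  return 1;
}


/* Predicate that selects entries with odd values; the closure is unused. */
static int
is_odd (void *cls, const struct Entry *e)
{
  (void) cls;
  return 0 != (e->value % 2);
}


/* Add a new entry at the front of the list; returns 0 on success. */
static int
prepend (struct Entry **head, int value)
{
  struct Entry *e;

  assert (NULL != head);
  if (NULL == (e = malloc (sizeof (*e))))
    return -1;
  e->value = value;
  e->next = *head;
  *head = e;
  return 0;
}


/* Simple iteration follows the for(INIT;TEST;INC) paradigm. */
static void
for_each (const struct Entry *head, EntryCallback cb, void *cb_cls)
{
  const struct Entry *pos;

  for (pos = head; NULL != pos; pos = pos->next)
    cb (cb_cls, pos);
}


/* Removal changes the list during the traversal, so a while loop with an
   early 'continue' is used. Returns the number of entries removed. */
static unsigned int
remove_matching (struct Entry **head, EntryCallback should_free, void *cls)
{
  struct Entry **link = head;
  struct Entry *pos;
  unsigned int removed = 0;

  assert (NULL != head);
  while (NULL != (pos = *link))
  {
    if (! should_free (cls, pos))
    {
      link = &pos->next;
      continue;
    }
    *link = pos->next;
    free (pos);
    removed++;
  }
  return removed;
}
```

A caller would build a list with @code{prepend}, count its entries with
@code{for_each (head, &count_cb, &counter)} and drop the odd ones with
@code{remove_matching (&head, &is_odd, NULL)}.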
872
873@c ***********************************************************************
874@node Build-system
875@section Build-system
876
If you have code that is likely not to compile, or build rules that
should not be triggered for most developers, use "if HAVE_EXPERIMENTAL" in
your Makefile.am. Then it is OK to (temporarily) add non-compiling (or
known-to-not-port) code.
881
882If you want to compile all testcases but NOT run them, run configure with
883the @code{--enable-test-suppression} option.
884
885If you want to run all testcases, including those that take a while, run
886configure with the @code{--enable-expensive-testcases} option.
887
888If you want to compile and run benchmarks, run configure with the
889@code{--enable-benchmarks} option.
890
891If you want to obtain code coverage results, run configure with the
892@code{--enable-coverage} option and run the coverage.sh script in
893@file{contrib/}.
894
895@c ***********************************************************************
896@node Developing extensions for GNUnet using the gnunet-ext template
897@section Developing extensions for GNUnet using the gnunet-ext template
898
899
For developers who want to write extensions for GNUnet, we provide the
gnunet-ext template as an easy-to-use skeleton.
902
903gnunet-ext contains the build environment and template files for the
904development of GNUnet services, command line tools, APIs and tests.
905
906First of all you have to obtain gnunet-ext from git:
907
908@code{git clone https://gnunet.org/git/gnunet-ext.git}
909
The next step is to bootstrap and configure it. For configure you have to
provide the path containing GNUnet with
@code{--with-gnunet=/path/to/gnunet} and the prefix where you want to
install the extension using @code{--prefix=/path/to/install}:
914
915@example
916./bootstrap
917./configure --prefix=/path/to/install --with-gnunet=/path/to/gnunet
918@end example
919
When your GNUnet installation is not included in the default linker search
path, you have to add @code{/path/to/gnunet/lib} to the file
@file{/etc/ld.so.conf} and run @code{ldconfig}, or add it to the
environment variable @code{LD_LIBRARY_PATH} by using
924
925@code{export LD_LIBRARY_PATH=/path/to/gnunet/lib}
926
927@c ***********************************************************************
928@node Writing testcases
929@section Writing testcases
930
931Ideally, any non-trivial GNUnet code should be covered by automated
932testcases. Testcases should reside in the same place as the code that is
933being tested. The name of source files implementing tests should begin
934with "test_" followed by the name of the file that contains the code that
935is being tested.
936
937Testcases in GNUnet should be integrated with the autotools build system.
938This way, developers and anyone building binary packages will be able to
939run all testcases simply by running @code{make check}. The final
940testcases shipped with the distribution should output at most some brief
941progress information and not display debug messages by default. The
942success or failure of a testcase must be indicated by returning zero
943(success) or non-zero (failure) from the main method of the testcase. The
944integration with the autotools is relatively straightforward and only
945requires modifications to the @code{Makefile.am} in the directory
946containing the testcase. For a testcase testing the code in @code{foo.c}
947the @code{Makefile.am} would contain the following lines:
948
@example
check_PROGRAMS = test_foo
TESTS = $(check_PROGRAMS)
test_foo_SOURCES = test_foo.c
test_foo_LDADD = $(top_builddir)/src/util/libgnunetutil.la
@end example
953
954Naturally, other libraries used by the testcase may be specified in the
955@code{LDADD} directive as necessary.
956
957Often testcases depend on additional input files, such as a configuration
958file. These support files have to be listed using the EXTRA_DIST
959directive in order to ensure that they are included in the distribution.
960Example:
961
962@example
963EXTRA_DIST = test_foo_data.conf
964@end example
965
966Executing @code{make check} will run all testcases in the current
967directory and all subdirectories. Testcases can be compiled individually
968by running @code{make test_foo} and then invoked directly using
969@code{./test_foo}. Note that due to the use of plugins in GNUnet, it is
970typically necessary to run @code{make install} before running any
971testcases. Thus the canonical command @code{make check install} has to be
972changed to @code{make install check} for GNUnet.
973
974@c ***********************************************************************
975@node GNUnet's TESTING library
976@section GNUnet's TESTING library
977
978The TESTING library is used for writing testcases which involve starting a
979single or multiple peers. While peers can also be started by testcases
using the ARM subsystem, using the TESTING library provides an elegant way to
981do this. The configurations of the peers are auto-generated from a given
982template to have non-conflicting port numbers ensuring that peers'
983services do not run into bind errors. This is achieved by testing ports'
984availability by binding a listening socket to them before allocating them
985to services in the generated configurations.
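
The port check described above can be sketched with plain POSIX sockets.
This is not the TESTING implementation: the helper names
@code{try_listen}, @code{port_is_free} and @code{find_free_port} are
hypothetical, and the sketch only probes TCP on the IPv4 loopback
address. It illustrates the idea of binding a listening socket to a
candidate port and treating a failure as "port in use".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Try to bind and listen on 127.0.0.1:port.
   Returns the listening socket, or -1 if the port cannot be used. */
static int
try_listen (uint16_t port)
{
  struct sockaddr_in addr;
  int fd;

  if (-1 == (fd = socket (AF_INET, SOCK_STREAM, 0)))
    return -1;
  memset (&addr, 0, sizeof (addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = htons (port);
  if ( (0 != bind (fd, (const struct sockaddr *) &addr, sizeof (addr))) ||
       (0 != listen (fd, 1)) )
  {
    close (fd);
    return -1;
  }
  return fd;
}


/* Returns 1 if a listening socket could be bound to the port, else 0. */
static int
port_is_free (uint16_t port)
{
  int fd;

  if (-1 == (fd = try_listen (port)))
    return 0;
  close (fd);
  return 1;
}


/* Returns the first free port in [low, high], or 0 if there is none. */
static uint16_t
find_free_port (uint16_t low, uint16_t high)
{
  uint32_t port; /* wider than uint16_t so that high == 65535 terminates */

  assert (low <= high);
  for (port = low; port <= high; port++)
    if (port_is_free ((uint16_t) port))
      return (uint16_t) port;
  return 0;
}
```

For instance, @code{find_free_port (12000, 56000)} scans a range from low
to high. Note that such a check is inherently racy: another process may
take the port between the probe and the moment the service binds to it.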
986
Another advantage of using TESTING is that it shortens the testcase
startup time, as the hostkeys for peers are copied from a pre-computed set
of hostkeys instead of being generated at peer startup, which may take a
considerable amount of time when starting multiple peers or on an embedded
processor.
992
993TESTING also allows for certain services to be shared among peers. This
994feature is invaluable when testing with multiple peers as it helps to
reduce the number of services run per peer and hence the total
996number of processes run per testcase.
997
The TESTING library only handles creating, starting and stopping peers.
Features useful for testcases such as connecting peers in a topology are
not available in TESTING but are available in the TESTBED subsystem.
Furthermore, TESTING only creates peers on the localhost; by
using TESTBED, testcases can benefit from creating peers across multiple
hosts.
1004
1005@menu
1006* API::
1007* Finer control over peer stop::
1008* Helper functions::
1009* Testing with multiple processes::
1010@end menu
1011
1012@c ***********************************************************************
1013@node API
1014@subsection API
1015
TESTING abstracts a group of peers as a TESTING system. All peers in a
system share a common hostname, and no two services of these peers use the
same port or UNIX domain socket path.

A TESTING system can be created with the function
1021@code{GNUNET_TESTING_system_create()} which returns a handle to the
1022system. This function takes a directory path which is used for generating
1023the configurations of peers, an IP address from which connections to the
1024peers' services should be allowed, the hostname to be used in peers'
1025configuration, and an array of shared service specifications of type
1026@code{struct GNUNET_TESTING_SharedService}.
1027
1028The shared service specification must specify the name of the service to
1029share, the configuration pertaining to that shared service and the
1030maximum number of peers that are allowed to share a single instance of
1031the shared service.
1032
A TESTING system created with @code{GNUNET_TESTING_system_create()} chooses
ports from the default range 12000 - 56000 while auto-generating
configurations for peers. This range can be customised with the function
@code{GNUNET_TESTING_system_create_with_portrange()}. This function is
similar to @code{GNUNET_TESTING_system_create()} except that it takes two
additional parameters: the start and end of the port range to use.
1039
A TESTING system is destroyed with the function
@code{GNUNET_TESTING_system_destroy()}. This function takes the handle of
the system and a flag to remove the files created in the directory used
to generate configurations.
1044
A peer is created with the function
@code{GNUNET_TESTING_peer_configure()}. This function takes the system
handle, a configuration template from which the configuration for the peer
is auto-generated and the index of the hostkey that is to be copied for
the peer. When successful, this function returns a handle to the
peer which can be used to start and stop it and to obtain the identity of
the peer. If unsuccessful, a NULL pointer is returned with an error
message. This function ensures that the generated configuration has
non-conflicting ports and paths.
1054
1055Peers can be started and stopped by calling the functions
1056@code{GNUNET_TESTING_peer_start()} and @code{GNUNET_TESTING_peer_stop()}
1057respectively. A peer can be destroyed by calling the function
@code{GNUNET_TESTING_peer_destroy()}. When a peer is destroyed, the ports
and paths allocated in its configuration are reclaimed for usage in new
peers.
1061
1062@c ***********************************************************************
1063@node Finer control over peer stop
1064@subsection Finer control over peer stop
1065
1066Using @code{GNUNET_TESTING_peer_stop()} is normally fine for testcases.
However, calling this function for each peer is inefficient when trying to
shut down multiple peers as this function sends the termination signal to
the given peer process and waits for it to terminate. It would be faster
in this case to send the termination signals to the peers first and then
wait on them. This is accomplished by the function
@code{GNUNET_TESTING_peer_kill()}, which sends a termination signal to the
peer, and the function @code{GNUNET_TESTING_peer_wait()}, which waits on
the peer.
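
The benefit of signalling first and waiting afterwards can be illustrated
without the TESTING API. The following sketch uses only POSIX process
calls; @code{spawn_sleeper} and @code{stop_all} are hypothetical helpers
standing in for "start a peer" and "stop all peers". All children receive
@code{SIGTERM} before the parent blocks on any of them, so they shut down
concurrently rather than one after the other.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start a child process that only waits for a signal.
   Returns the child's PID in the parent, or -1 on error. */
static pid_t
spawn_sleeper (void)
{
  pid_t pid;

  pid = fork ();
  if (0 != pid)
    return pid; /* parent (or fork failure) */
  signal (SIGTERM, SIG_DFL); /* make sure SIGTERM terminates the child */
  for (;;)
    pause ();
}


/* Send SIGTERM to all processes first, then wait for each of them.
   Returns the number of processes that were terminated by SIGTERM. */
static size_t
stop_all (const pid_t *pids, size_t count)
{
  size_t i;
  size_t stopped = 0;
  int status;

  assert (NULL != pids);
  for (i = 0; i < count; i++)
    (void) kill (pids[i], SIGTERM);
  for (i = 0; i < count; i++)
  {
    if (pids[i] != waitpid (pids[i], &status, 0))
      continue;
    if ( (WIFSIGNALED (status)) && (SIGTERM == WTERMSIG (status)) )
      stopped++;
  }
  return stopped;
}
```

With the TESTING library, the same order would be obtained by calling
@code{GNUNET_TESTING_peer_kill()} on every peer in one loop and
@code{GNUNET_TESTING_peer_wait()} in a second loop.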
1075
1076Further finer control can be achieved by choosing to stop a peer
1077asynchronously with the function @code{GNUNET_TESTING_peer_stop_async()}.
1078This function takes a callback parameter and a closure for it in addition
1079to the handle to the peer to stop. The callback function is called with
1080the given closure when the peer is stopped. Using this function
1081eliminates blocking while waiting for the peer to terminate.
1082
1083An asynchronous peer stop can be cancelled by calling the function
1084@code{GNUNET_TESTING_peer_stop_async_cancel()}. Note that calling this
1085function does not prevent the peer from terminating if the termination
signal has already been sent to it. It does, however, cancel the
callback that would be called when the peer is stopped.
1088
1089@c ***********************************************************************
1090@node Helper functions
1091@subsection Helper functions
1092
1093Most of the testcases can benefit from an abstraction which configures a
1094peer and starts it. This is provided by the function
1095@code{GNUNET_TESTING_peer_run()}. This function takes the testing
1096directory pathname, a configuration template, a callback and its closure.
1097This function creates a peer in the given testing directory by using the
1098configuration template, starts the peer and calls the given callback with
1099the given closure.
1100
1101The function @code{GNUNET_TESTING_peer_run()} starts the ARM service of
1102the peer which starts the rest of the configured services. A similar
1103function @code{GNUNET_TESTING_service_run} can be used to just start a
1104single service of a peer. In this case, the peer's ARM service is not
1105started; instead, only the given service is run.
1106
1107@c ***********************************************************************
1108@node Testing with multiple processes
1109@subsection Testing with multiple processes
1110
When testing GNUnet, the splitting of the code into services and clients
often complicates testing. The solution to this is to have the testcase
fork @code{gnunet-service-arm}, ask it to start the required server and
daemon processes and then execute appropriate client actions (to test the
client APIs or the core module or both). If necessary, multiple ARM
services can be forked using different ports (!) to simulate a network.
However, most of the time only one ARM process is needed. Note that on
exit, the testcase should shut down ARM with a @code{TERM} signal (to give
it the chance to cleanly stop its child processes).
1120
1121The following code illustrates spawning and killing an ARM process from a
1122testcase:
1123
@example
static void
run (void *cls, char *const *args, const char *cfgfile,
     const struct GNUNET_CONFIGURATION_Handle *cfg)
@{
  struct GNUNET_OS_Process *arm_pid;

  arm_pid = GNUNET_OS_start_process (NULL, NULL,
                                     "gnunet-service-arm",
                                     "gnunet-service-arm",
                                     "-c", cfgfile, NULL);
  /* do real test work here */
  if (0 != GNUNET_OS_process_kill (arm_pid, SIGTERM))
    GNUNET_log_strerror (GNUNET_ERROR_TYPE_WARNING, "kill");
  GNUNET_assert (GNUNET_OK == GNUNET_OS_process_wait (arm_pid));
  GNUNET_OS_process_close (arm_pid);
@}

GNUNET_PROGRAM_run (argc, argv, "NAME-OF-TEST", "nohelp", options, &run, cls);
@end example
1136
1137
1138An alternative way that works well to test plugins is to implement a
1139mock-version of the environment that the plugin expects and then to
1140simply load the plugin directly.
1141
1142@c ***********************************************************************
1143@node Performance regression analysis with Gauger
1144@section Performance regression analysis with Gauger
1145
1146To help avoid performance regressions, GNUnet uses Gauger. Gauger is a
1147simple logging tool that allows remote hosts to send performance data to
1148a central server, where this data can be analyzed and visualized. Gauger
shows graphs of the repository revisions and the performance data recorded
1150for each revision, so sudden performance peaks or drops can be identified
1151and linked to a specific revision number.
1152
1153In the case of GNUnet, the buildbots log the performance data obtained
during the tests after each build. The data can be accessed on GNUnet's
1155Gauger page.
1156
The menu on the left lets you select either the results of just one
build bot (under "Hosts") or review the data from all hosts for a given
test result (under "Metrics"). If the absolute values of the results
differ widely, for instance between arm and amd64 machines, the option
"Normalize" on a metric view can help to get an idea about the
performance evolution across all hosts.
1163
Using Gauger in GNUnet and having the performance of a module tracked over
time is very easy. First, of course, the testcase must generate some
consistent metric which makes sense to have logged. Highly volatile or
randomness-dependent metrics are probably not ideal candidates for
meaningful regression detection.
1169
1170To start logging any value, just include @code{gauger.h} in your testcase
1171code. Then, use the macro @code{GAUGER()} to make the buildbots log
1172whatever value is of interest for you to @code{gnunet.org}'s Gauger
server. No setup is necessary as most buildbots already have everything
in place and new metrics are created on demand. To delete a metric, you
1175need to contact a member of the GNUnet development team (a file will need
1176to be removed manually from the respective directory).
1177
1178The code in the test should look like this:
1179
@example
[other includes]
#include <gauger.h>

int
main (int argc, char *argv[])
@{
  [run test, generate data]
  GAUGER ("YOUR_MODULE", "METRIC_NAME", (float) value, "UNIT");
@}
@end example
1189
1190
1191Where:
1192
1193@table @asis
1194
1195@item @strong{YOUR_MODULE} is a category in the gauger page and should be
1196the name of the module or subsystem like "Core" or "DHT"
@item @strong{METRIC_NAME} is the name of the metric being collected and
should be concise and descriptive, like "PUT operations in
sqlite-datastore".
@item @strong{value} is the value of the metric that is logged for this
run.
@item @strong{UNIT} is the unit in which the value is measured, for
instance "kb/s" or "kb of RAM/node".
1204@end table
1205
1206If you wish to use Gauger for your own project, you can grab a copy of the
1207latest stable release or check out Gauger's Subversion repository.
1208
1209@c ***********************************************************************
1210@node GNUnet's TESTBED Subsystem
1211@section GNUnet's TESTBED Subsystem
1212
1213The TESTBED subsystem facilitates testing and measuring of multi-peer
1214deployments on a single host or over multiple hosts.
1215
1216The architecture of the testbed module is divided into the following:
1217@itemize @bullet
1218
@item Testbed API: An API which is used by the testing driver programs. It
provides functions for creating, destroying, starting, stopping
peers, etc.

@item Testbed service (controller): A service which is started through the
Testbed API. This service handles operations to create, destroy, start,
stop peers, connect them, and modify their configurations.
1226
1227@item Testbed helper: When a controller has to be started on a host, the
1228testbed API starts the testbed helper on that host which in turn starts
1229the controller. The testbed helper receives a configuration for the
1230controller through its stdin and changes it to ensure the controller
1231doesn't run into any port conflict on that host.
1232@end itemize
1233
1234
1235The testbed service (controller) is different from the other GNUnet
1236services in that it is not started by ARM and is not supposed to be run
1237as a daemon. It is started by the testbed API through a testbed helper.
1238In a typical scenario involving multiple hosts, a controller is started
1239on each host. Controllers take up the actual task of creating peers,
starting and stopping them on the hosts on which they run.
1241
While running deployments on a single localhost, the testbed API starts the
testbed helper directly as a child process. When running deployments on
remote hosts, the testbed API starts testbed helpers on each remote host
through a remote shell. By default, the testbed API uses SSH as the remote
shell. This can be changed by setting the environment variable
GNUNET_TESTBED_RSH_CMD to the required remote shell program. This
variable can also contain parameters which are to be passed to the remote
shell program. For example:
1250
1251@example
1252export GNUNET_TESTBED_RSH_CMD="ssh -o BatchMode=yes \
-o NoHostAuthenticationForLocalhost=yes %h"
1254@end example
1255
The above command string also allows for substitutions through
placemarks which begin with a `%'. At present the
following substitutions are supported:
1259
1260@itemize @bullet
1261@item
1262%h: hostname
1263@item
1264%u: username
1265@item
1266%p: port
1267@end itemize
1268
Note that the substitution placemark is replaced only when the
corresponding field is available and only once. Specifying @code{%u@@%h}
doesn't work either. If you want to use username substitutions for SSH,
use the argument @code{-l} before the username substitution.
For example: @code{ssh -l %u -p %p %h}
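
The documented substitution rules can be mimicked with a small standalone
function. This is a sketch, not the testbed's code:
@code{expand_rsh_command} is a hypothetical name, and matching only
whole, space-separated tokens is an assumption made here to reproduce the
documented behaviour that @code{%u@@%h} is not expanded. Each placemark is
replaced at most once, and only if a value for it is available.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Expand the placemarks %h, %u and %p in a space-separated command
   template. A NULL value means "field not available". The caller must
   free the returned string. Returns NULL if memory allocation fails. */
static char *
expand_rsh_command (const char *tmpl,
                    const char *host,
                    const char *user,
                    const char *port)
{
  const char *marks[3] = { "%h", "%u", "%p" };
  const char *values[3];
  int used[3] = { 0, 0, 0 };
  const char *pos;
  char *out;
  char *dst;
  size_t cap;
  size_t len;
  size_t vlen;
  int hit;
  int i;

  assert (NULL != tmpl);
  values[0] = host;
  values[1] = user;
  values[2] = port;
  /* Each value is inserted at most once, so this bound is sufficient. */
  cap = strlen (tmpl) + 1;
  for (i = 0; i < 3; i++)
    if (NULL != values[i])
      cap += strlen (values[i]);
  if (NULL == (out = malloc (cap)))
    return NULL;
  dst = out;
  pos = tmpl;
  while ('\0' != *pos)
  {
    if (' ' == *pos)
    {
      *dst++ = *pos++; /* copy separators unchanged */
      continue;
    }
    len = strcspn (pos, " "); /* length of the current token */
    hit = -1;
    for (i = 0; i < 3; i++)
      if ( (2 == len) &&
           (0 == strncmp (pos, marks[i], 2)) &&
           (NULL != values[i]) &&
           (! used[i]) )
        hit = i;
    if (-1 == hit)
    {
      memcpy (dst, pos, len); /* ordinary token: copy verbatim */
      dst += len;
    }
    else
    {
      vlen = strlen (values[hit]);
      memcpy (dst, values[hit], vlen);
      dst += vlen;
      used[hit] = 1;
    }
    pos += len;
  }
  *dst = '\0';
  return out;
}
```

Under these assumptions, the template @code{ssh -l %u -p %p %h} expands
fully when all three fields are given, while a token such as
@code{%u@@%h} is passed through unchanged.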
1274
The testbed API and the helper communicate through the helper's stdin and
stdout. As the helper is started through a remote shell on remote hosts,
any output messages from the remote shell interfere with the communication
and result in a failure while starting the helper. For this reason, it is
suggested to use flags to make the remote shells produce no output
messages and to have password-less logins. For the default remote shell,
SSH, the default options are:

@example
-o BatchMode=yes -o NoHostAuthenticationForLocalhost=yes
@end example
1286
1287Password-less logins should be ensured by using SSH keys.
1288
Since the testbed API executes the remote shell as a non-interactive
shell, certain scripts like .bashrc, .profile may not be executed. If
this is the case, the testbed API can be forced to execute an interactive
shell by setting the environment variable
`GNUNET_TESTBED_RSH_CMD_SUFFIX' to a shell program.
An example could be:
1295
1296@example
1297export GNUNET_TESTBED_RSH_CMD_SUFFIX="sh -lc"
1298@end example
1299
1300The testbed API will then execute the remote shell program as:
1301
1302@example
1303$GNUNET_TESTBED_RSH_CMD -p $port $dest $GNUNET_TESTBED_RSH_CMD_SUFFIX \
1304gnunet-helper-testbed
1305@end example
1306
1307On some systems, problems may arise while starting testbed helpers if
1308GNUnet is installed into a custom location since the helper may not be
1309found in the standard path. This can be addressed by setting the variable
1310`HELPER_BINARY_PATH' to the path of the testbed helper. Testbed API will
1311then use this path to start helper binaries both locally and remotely.
1312
The testbed API can be accessed by including the "gnunet_testbed_service.h"
file and linking with -lgnunettestbed.
1315
1316
1317
1318@c ***********************************************************************
1319@menu
1320* Supported Topologies::
1321* Hosts file format::
1322* Topology file format::
1323* Testbed Barriers::
1324* Automatic large-scale deployment of GNUnet in the PlanetLab testbed::
1325* TESTBED Caveats::
1326@end menu
1327
1328@node Supported Topologies
1329@subsection Supported Topologies
1330
1331While testing multi-peer deployments, it is often needed that the peers
1332are connected in some topology. This requirement is addressed by the
1333function @code{GNUNET_TESTBED_overlay_connect()} which connects any given
1334two peers in the testbed.
1335
1336The API also provides a helper function
1337@code{GNUNET_TESTBED_overlay_configure_topology()} to connect a given set
1338of peers in any of the following supported topologies:
1339
1340@itemize @bullet
1341
1342@item @code{GNUNET_TESTBED_TOPOLOGY_CLIQUE}: All peers are connected with
1343each other
1344
1345@item @code{GNUNET_TESTBED_TOPOLOGY_LINE}: Peers are connected to form a
1346line
1347
1348@item @code{GNUNET_TESTBED_TOPOLOGY_RING}: Peers are connected to form a
1349ring topology
1350
@item @code{GNUNET_TESTBED_TOPOLOGY_2D_TORUS}: Peers are connected to
form a 2 dimensional torus topology. The number of peers need not be a
perfect square; if it is not, the resulting torus may not have uniform
poloidal and toroidal lengths
1355
1356@item @code{GNUNET_TESTBED_TOPOLOGY_ERDOS_RENYI}: Topology is generated
1357to form a random graph. The number of links to be present should be given
1358
@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD}: Peers are connected to
form a 2D Torus with some random links among them. The number of random
links is to be given

@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD_RING}: Peers are
connected to form a ring with some random links among them. The number of
random links is to be given
1366
@item @code{GNUNET_TESTBED_TOPOLOGY_SCALE_FREE}: Connects peers in a
topology where peer connectivity follows a power law: new peers are
connected with high probability to well connected peers.
1370@footnote{See Emergence of Scaling in Random Networks. Science 286,
1371509-512, 1999.}
1372
1373@item @code{GNUNET_TESTBED_TOPOLOGY_FROM_FILE}: The topology information
1374is loaded from a file. The path to the file has to be given. See Topology
1375file format for the format of this file.
1376
1377@item @code{GNUNET_TESTBED_TOPOLOGY_NONE}: No topology
1378@end itemize
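
To make the shapes of the simplest topologies concrete, the following
sketch enumerates the links of a line, a ring and a clique over peers
numbered from 0 to n-1. It does not call the TESTBED API; the names
@code{LinkCallback}, @code{line_links}, @code{ring_links},
@code{clique_links} and @code{count_link} are hypothetical. Each link is
reported once through a callback that takes a closure as its first
argument.

```c
#include <assert.h>
#include <stddef.h>

/* Called once per link between peers 'a' and 'b'. */
typedef void (*LinkCallback) (void *cls, unsigned int a, unsigned int b);


/* Example callback: counts links; the closure is an 'unsigned int *'. */
static void
count_link (void *cls, unsigned int a, unsigned int b)
{
  unsigned int *counter = cls;

  (void) a;
  (void) b;
  (*counter)++;
}


/* LINE: peer i is linked to peer i+1, giving n-1 links. */
static void
line_links (unsigned int n, LinkCallback cb, void *cb_cls)
{
  unsigned int i;

  assert (NULL != cb);
  for (i = 0; i + 1 < n; i++)
    cb (cb_cls, i, i + 1);
}


/* RING: a line whose last peer is additionally linked to the first. */
static void
ring_links (unsigned int n, LinkCallback cb, void *cb_cls)
{
  line_links (n, cb, cb_cls);
  if (2 < n)
    cb (cb_cls, n - 1, 0);
}


/* CLIQUE: every pair of peers is linked, giving n*(n-1)/2 links. */
static void
clique_links (unsigned int n, LinkCallback cb, void *cb_cls)
{
  unsigned int i;
  unsigned int j;

  assert (NULL != cb);
  for (i = 0; i < n; i++)
    for (j = i + 1; j < n; j++)
      cb (cb_cls, i, j);
}
```

The link counts grow very differently: linearly for a line or ring, but
quadratically for a clique, which matters when choosing a topology for a
large number of peers.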
1379
1380
1381The above supported topologies can be specified respectively by setting
1382the variable @code{OVERLAY_TOPOLOGY} to the following values in the
1383configuration passed to Testbed API functions
1384@code{GNUNET_TESTBED_test_run()} and
1385@code{GNUNET_TESTBED_run()}:
1386@itemize @bullet
1387@item @code{CLIQUE}
@item @code{LINE}
@item @code{RING}
1390@item @code{2D_TORUS}
1391@item @code{RANDOM}
1392@item @code{SMALL_WORLD}
1393@item @code{SMALL_WORLD_RING}
1394@item @code{SCALE_FREE}
1395@item @code{FROM_FILE}
1396@item @code{NONE}
1397@end itemize
1398
1399
1400Topologies @code{RANDOM}, @code{SMALL_WORLD} and @code{SMALL_WORLD_RING}
1401require the option @code{OVERLAY_RANDOM_LINKS} to be set to the number of
1402random links to be generated in the configuration. The option will be
1403ignored for the rest of the topologies.
1404
1405Topology @code{SCALE_FREE} requires the options
1406@code{SCALE_FREE_TOPOLOGY_CAP} to be set to the maximum number of peers
1407which can connect to a peer and @code{SCALE_FREE_TOPOLOGY_M} to be set to
1408how many peers a peer should be atleast connected to.
1409
1410Similarly, the topology @code{FROM_FILE} requires the option
1411@code{OVERLAY_TOPOLOGY_FILE} to contain the path of the file containing
1412the topology information. This option is ignored for the rest of the
1413topologies. See Topology file format for the format of this file.
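
The option requirements above can be summarised in a few lines of code.
The following Python sketch is not part of GNUnet; the function name
@code{missing_options} and the use of a plain dictionary in place of a
parsed configuration are assumptions made for illustration. It reports
which of the options named above are still missing for a chosen topology.

```python
# Options each overlay topology requires, as described in the text above.
REQUIRED_OPTIONS = {
    "RANDOM": ["OVERLAY_RANDOM_LINKS"],
    "SMALL_WORLD": ["OVERLAY_RANDOM_LINKS"],
    "SMALL_WORLD_RING": ["OVERLAY_RANDOM_LINKS"],
    "SCALE_FREE": ["SCALE_FREE_TOPOLOGY_CAP", "SCALE_FREE_TOPOLOGY_M"],
    "FROM_FILE": ["OVERLAY_TOPOLOGY_FILE"],
}

KNOWN_TOPOLOGIES = {
    "CLIQUE", "RING", "LINE", "2D_TORUS", "RANDOM", "SMALL_WORLD",
    "SMALL_WORLD_RING", "SCALE_FREE", "FROM_FILE", "NONE",
}


def missing_options(options):
    """Return the required options that are absent from `options`.

    `options` is a dict of option names to values; it stands in for the
    configuration handed to the Testbed API (hypothetical helper).
    """
    topology = options["OVERLAY_TOPOLOGY"]
    if topology not in KNOWN_TOPOLOGIES:
        raise ValueError("unknown topology: %s" % topology)
    return [name for name in REQUIRED_OPTIONS.get(topology, [])
            if name not in options]
```

For example, a configuration that selects @code{SCALE_FREE} but only sets
@code{SCALE_FREE_TOPOLOGY_CAP} would be reported as missing
@code{SCALE_FREE_TOPOLOGY_M}.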

@c ***********************************************************************
@node Hosts file format
@subsection Hosts file format

The testbed API offers the function
@code{GNUNET_TESTBED_hosts_load_from_file()} to load, from a given file,
details about the hosts which testbed can use for deploying peers. This
function is useful to keep the data about hosts separate instead of
hard-coding it.

Another helper function from the testbed API, @code{GNUNET_TESTBED_run()},
also takes a hosts file name as a parameter. It uses the above function
to populate the hosts data structures and start controllers to deploy
peers.

These functions require the hosts file to be of the following format:
@itemize @bullet
@item Each line is interpreted to have details about a host
@item Host details should include the username to use for logging into the
host, the hostname of the host and the port number to use for the remote
shell program. All three values should be given.
@item These details should be given in the following format:
@code{<username>@@<hostname>:<port>}
@end itemize

Note that having canonical hostnames may cause problems while resolving
the IP addresses (See this bug). Hence it is advised to provide the hosts'
numerical IP addresses as hostnames whenever possible.
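
To make the line format concrete, here is a small Python sketch that
reads hosts in the @code{<username>@@<hostname>:<port>} form. It is an
illustration of the format only and not the parser behind
@code{GNUNET_TESTBED_hosts_load_from_file()}; the function name
@code{parse_hosts}, the skipping of blank lines and the rejection of
anything else are assumptions of this sketch.

```python
import re

# One host per line: <username>@<hostname>:<port>
HOST_LINE = re.compile(
    r"^(?P<user>[^@\s]+)@(?P<host>[^:\s]+):(?P<port>\d+)$")


def parse_hosts(text):
    """Parse hosts-file text into a list of (username, hostname, port)."""
    hosts = []
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are skipped
        match = HOST_LINE.match(line)
        if match is None:
            raise ValueError(
                "line %d: expected <username>@<hostname>:<port>" % number)
        hosts.append((match.group("user"), match.group("host"),
                      int(match.group("port"))))
    return hosts
```

Following the advice above, the hostnames in such a file would normally
be numerical IP addresses, for instance @code{alice@@192.0.2.10:22}.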

@c ***********************************************************************
@node Topology file format
@subsection Topology file format

A topology file describes how peers are to be connected. It should adhere
to the following format for testbed to parse it correctly.

Each line should begin with the target peer id. This should be followed by
a colon (`:') and origin peer ids separated by `|'. All spaces except for
newline characters are ignored. The API will then try to connect each
origin peer to the target peer.

For example, the following file will result in 5 overlay connections:
[2->1], [3->1], [4->3], [0->3], [2->0]

@example
1:2|3
3:4| 0
0: 2
@end example
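
The parsing rule above can be sketched as follows. This is an
illustrative Python reimplementation of the format as described here,
not the parser used by testbed; the function name @code{parse_topology}
is an assumption, and handling of malformed lines is left to Python's
own @code{ValueError}.

```python
def parse_topology(text):
    """Return the overlay connections as (origin, target) pairs."""
    connections = []
    for line in text.splitlines():
        # All whitespace other than the newline is ignored.
        line = "".join(line.split())
        if not line:
            continue
        # The target peer id comes first, then a colon, then the origins.
        target, _, origins = line.partition(":")
        for origin in origins.split("|"):
            if origin:
                connections.append((int(origin), int(target)))
    return connections
```

Applied to the three-line example file, the function yields the five
connections listed above, each as an (origin, target) pair.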

@c ***********************************************************************
@node Testbed Barriers
@subsection Testbed Barriers

The testbed subsystem's barriers API facilitates coordination among the
peers run by the testbed and the experiment driver. The concept is
similar to the barrier synchronisation mechanism found in parallel
programming or multi-threading paradigms: a peer waits at a barrier upon
reaching it until the barrier is reached by a predefined number of peers.
This predefined number of peers required to cross a barrier is also called
the quorum. We say a peer has reached a barrier if the peer is waiting for
the barrier to be crossed. Similarly, a barrier is said to be reached if
the required quorum of peers reach the barrier. A barrier which is reached
is deemed as crossed after all the peers waiting on it are notified.
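
The terms "reached" and "crossed" can be illustrated with a toy model.
The Python class below is a sketch and not GNUnet code: the class name
@code{Barrier}, the method @code{reach} and the choice to express the
quorum as a plain number of peers are assumptions of this example, as is
the handling of peers that arrive after the barrier was crossed.

```python
class Barrier:
    """Toy barrier whose quorum is a plain number of peers."""

    def __init__(self, name, quorum):
        self.name = name
        self.quorum = quorum
        self.waiting = []     # peers that reached the barrier, in order
        self.crossed = False

    def reach(self, peer):
        """Register `peer` as waiting; return the peers notified now."""
        if self.crossed:
            # Assumption of this sketch: a late peer is released at once.
            return [peer]
        self.waiting.append(peer)
        if len(self.waiting) < self.quorum:
            return []         # barrier not reached yet, peer keeps waiting
        # Quorum met: the barrier is reached. Once every waiting peer has
        # been notified, the barrier counts as crossed.
        notified, self.waiting = self.waiting, []
        self.crossed = True
        return notified
```

With a quorum of three, the first two calls to @code{reach} return an
empty list, and the third call returns all three peers at once.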

The barriers API provides the following functions:
@itemize @bullet
@item @strong{@code{GNUNET_TESTBED_barrier_init()}:} function to
initialise a barrier in the experiment
@item @strong{@code{GNUNET_TESTBED_barrier_cancel()}:} function to cancel
a barrier which has been initialised before
@item @strong{@code{GNUNET_TESTBED_barrier_wait()}:} function to signal
the barrier service that the caller has reached a barrier and is waiting
for it to be crossed
@item @strong{@code{GNUNET_TESTBED_barrier_wait_cancel()}:} function to
stop waiting for a barrier to be crossed
@end itemize


Among the above functions, the first two, namely
@code{GNUNET_TESTBED_barrier_init()} and
@code{GNUNET_TESTBED_barrier_cancel()}, are used by experiment drivers. All
barriers should be initialised by the experiment driver by calling
@code{GNUNET_TESTBED_barrier_init()}. This function takes a name to
identify the barrier, the quorum required for the barrier to be crossed
and a notification callback for notifying the experiment driver when the
barrier is crossed. @code{GNUNET_TESTBED_barrier_cancel()} cancels an
initialised barrier and frees the resources allocated for it. This
function can be called on an initialised barrier before it is crossed.

The remaining two functions, @code{GNUNET_TESTBED_barrier_wait()} and
@code{GNUNET_TESTBED_barrier_wait_cancel()}, are used in the peers'
processes. @code{GNUNET_TESTBED_barrier_wait()} connects to the local
barrier service running on the same host the peer is running on and
registers that the caller has reached the barrier and is waiting for the
barrier to be crossed. Note that this function can only be used by peers
which are started by testbed, as this function tries to access the local
barrier service which is part of the testbed controller service. Calling
@code{GNUNET_TESTBED_barrier_wait()} on an uninitialised barrier results
in failure. @code{GNUNET_TESTBED_barrier_wait_cancel()} cancels the
notification registered by @code{GNUNET_TESTBED_barrier_wait()}.


@c ***********************************************************************
@menu
* Implementation::
@end menu

@node Implementation
@subsubsection Implementation

Since barriers involve coordination between the experiment driver and
peers, the barrier service in the testbed controller is split into two
components. The first component responds to the messages generated by the
barrier API used by the experiment driver (functions
@code{GNUNET_TESTBED_barrier_init()} and
@code{GNUNET_TESTBED_barrier_cancel()}) and the second component to the
messages generated by the barrier API used by peers (functions
@code{GNUNET_TESTBED_barrier_wait()} and
@code{GNUNET_TESTBED_barrier_wait_cancel()}).

Calling @code{GNUNET_TESTBED_barrier_init()} sends a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_INIT} message to the master
controller. The master controller then registers a barrier and calls
@code{GNUNET_TESTBED_barrier_init()} for each of its subcontrollers. In
this way barrier initialisation is propagated through the controller
hierarchy. While propagating initialisation, any errors at a
subcontroller, such as a timeout during further propagation, are reported
up the hierarchy back to the experiment driver.

Similar to @code{GNUNET_TESTBED_barrier_init()},
@code{GNUNET_TESTBED_barrier_cancel()} propagates a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_CANCEL} message which causes
controllers to remove an initialised barrier.

The second component is implemented as a separate service in the binary
`gnunet-service-testbed' which already has the testbed controller service.
Although this deviates from the GNUnet process architecture of having one
service per binary, it is needed in this case as this component needs
access to barrier data created by the first component. This component
responds to @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages from
local peers when they call @code{GNUNET_TESTBED_barrier_wait()}. Upon
receiving a @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} message, the
service checks if the requested barrier has been initialised. If it has
not, an error status is sent through a
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to the local
peer and the connection from the peer is terminated. If the barrier has
been initialised, the barrier's counter for reached peers is incremented
and a notification is registered to notify the peer when the barrier is
reached. The connection from the peer is left open.

When enough peers to attain the quorum have sent
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages, the controller
sends a @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to its
parent, informing it that the barrier is crossed. If the controller has
started further subcontrollers, it delays this message until it receives
a similar notification from each of those subcontrollers. Finally, the
barriers API at the experiment driver receives the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message when the barrier
is reached at all the controllers.

The barriers API at the experiment driver responds to the
@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message by echoing it
back to the master controller and notifying the experiment driver
through the notification callback that a barrier has been crossed. The
echoed @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message is
propagated by the master controller through the controller hierarchy. This
propagation triggers the notifications registered by peers at each of the
controllers in the hierarchy. Note the difference between this downward
propagation of the @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS}
message and its upward propagation: the upward propagation is needed
for ensuring that the barrier is reached by all the controllers, and the
downward propagation is for signalling that the barrier is crossed.
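
The two directions of propagation can be traced with a toy model of the
controller hierarchy. The Python sketch below is not GNUnet's
implementation: the class @code{Controller}, its methods and the event
strings are hypothetical, each controller is given its own local quorum
as a plain number, and the experiment driver is reduced to a single
event. It only shows the order in which things happen.

```python
class Controller:
    """Toy controller in a hierarchy; not GNUnet's implementation."""

    def __init__(self, name, local_quorum, children=()):
        self.name = name
        self.local_quorum = local_quorum
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self
        self.waiting = []            # local peers waiting at the barrier
        self.children_reached = 0    # subcontrollers that reported upward
        self.reported = False

    def peer_waits(self, peer, events):
        """A local peer reached the barrier and now waits."""
        self.waiting.append(peer)
        self._maybe_report(events)

    def _maybe_report(self, events):
        """Upward propagation: report once everything below has reached."""
        if self.reported:
            return
        if len(self.waiting) < self.local_quorum:
            return
        if self.children_reached < len(self.children):
            return  # delay until every subcontroller has reported
        self.reported = True
        events.append("up:" + self.name)
        if self.parent is None:
            # The master reports to the experiment driver, which runs its
            # callback and echoes the status back down the hierarchy.
            events.append("driver:callback")
            self.crossed(events)
        else:
            self.parent.children_reached += 1
            self.parent._maybe_report(events)

    def crossed(self, events):
        """Downward propagation: release local peers, then subcontrollers."""
        for peer in self.waiting:
            events.append("released:" + peer)
        for child in self.children:
            child.crossed(events)
```

In a hierarchy with one master and one subcontroller, a peer waiting at
the master alone produces no event; only when the subcontroller has also
reported upward does the master report, after which all waiting peers
are released from the top down.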

@c ***********************************************************************
@node Automatic large-scale deployment of GNUnet in the PlanetLab testbed
@subsection Automatic large-scale deployment of GNUnet in the PlanetLab testbed

PlanetLab is a testbed for computer networking and distributed systems
research. It was established in 2002 and as of June 2010 was composed of
1090 nodes at 507 sites worldwide.

To automate the deployment of GNUnet, we created a set of tools that
simplify large-scale deployment. We provide a set of scripts you can use
to deploy GNUnet on a set of nodes and manage your installation.

Please also check @uref{https://gnunet.org/installation-fedora8-svn} and
@uref{https://gnunet.org/installation-fedora12-svn} to find detailed
instructions on how to install GNUnet on a PlanetLab node.


@c ***********************************************************************
@menu
* PlanetLab Automation for Fedora8 nodes::
* Install buildslave on PlanetLab nodes running fedora core 8::
* Setup a new PlanetLab testbed using GPLMT::
* Why do i get an ssh error when using the regex profiler?::
@end menu

@node PlanetLab Automation for Fedora8 nodes
@subsubsection PlanetLab Automation for Fedora8 nodes

@c ***********************************************************************
@node Install buildslave on PlanetLab nodes running fedora core 8
@subsubsection Install buildslave on PlanetLab nodes running fedora core 8
@c ** Actually this is a subsubsubsection, but must be fixed differently
@c ** as subsubsection is the lowest.

Since most of the PlanetLab nodes are running the very old Fedora Core 8
image, installing the buildslave software is quite painful. For our
PlanetLab testbed we figured out how best to install the buildslave
software.

@c This is a very terrible way to suggest installing software.
@c FIXME: Is there an official, safer way instead of blind-piping a
@c script?
@c FIXME: Use newer pypi URLs below.
Install Distribute for Python:

@example
curl http://python-distribute.org/distribute_setup.py | sudo python
@end example

Install zope.interface version 3.8.0 or older (4.0 and 4.0.1 will not
work):

@example
wget https://pypi.python.org/packages/source/z/zope.interface/zope.interface-3.8.0.tar.gz
tar xvfz zope.interface-3.8.0.tar.gz
cd zope.interface-3.8.0
sudo python setup.py install
@end example

Install the buildslave software (0.8.6 was the latest version):

@example
wget http://buildbot.googlecode.com/files/buildbot-slave-0.8.6p1.tar.gz
tar xvfz buildbot-slave-0.8.6p1.tar.gz
cd buildbot-slave-0.8.6p1
sudo python setup.py install
@end example

The setup will download the matching twisted package and install it.
It will also try to install the latest version of zope.interface, which
will fail to install. Buildslave will work anyway since version 3.8.0
was installed before.

@c ***********************************************************************
@node Setup a new PlanetLab testbed using GPLMT
@subsubsection Setup a new PlanetLab testbed using GPLMT

@itemize @bullet
@item Get a new slice and assign nodes

Ask your PlanetLab PI to give you a new slice and assign the nodes you
need.
@item Install a buildmaster

You can follow the buildbot documentation:
@uref{http://buildbot.net/buildbot/docs/current/manual/installation.html}
@item Install the buildslave software on all nodes

To install the buildslave on all nodes assigned to your slice you can use
the tasklist @code{install_buildslave_fc8.xml} provided with GPLMT:

@example
./gplmt.py -c contrib/tumple_gnunet.conf -t \
contrib/tasklists/install_buildslave_fc8.xml -a -p <planetlab password>
@end example

@item Create the buildmaster configuration and the slave setup commands

The master and the slaves need to have credentials, and the
master has to have all nodes configured. This can be done with the
@code{create_buildbot_configuration.py} script in the @code{scripts}
directory.

This script takes a list of nodes, retrieved directly from PlanetLab or
read from a file, and a configuration template and creates:

@itemize @bullet
@item a tasklist which can be executed with gplmt to set up the slaves
@item a master.cfg file containing the PlanetLab nodes
@end itemize

A configuration template is included in the @file{contrib} directory. Most
importantly, the script replaces the following tags in the template:

@example
%GPLMT_BUILDER_DEFINITION
GPLMT_BUILDER_SUMMARY
GPLMT_SLAVES
%GPLMT_SCHEDULER_BUILDERS
@end example

Create configuration for all nodes assigned to a slice:

@example
./create_buildbot_configuration.py -u <planetlab username> \
-p <planetlab password> -s <slice> -m <buildmaster+port> -t <template>
@end example

Create configuration for some nodes in a file:

@example
./create_buildbot_configuration.py -f <node_file> -m <buildmaster+port> \
-t <template>
@end example

@item Copy the @code{master.cfg} to the buildmaster and start it

Use @code{buildbot start <basedir>} to start the server.
@item Set up the buildslaves
@end itemize

@c ***********************************************************************
@node Why do i get an ssh error when using the regex profiler?
@subsubsection Why do I get an ssh error when using the regex profiler?

Why do I get an ssh error "Permission denied (publickey,password)." when
using the regex profiler although passwordless ssh to localhost works
using publickey and ssh-agent?

You have to generate a public/private-key pair with no password:

@example
ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_localhost
@end example

and then add the following to your @file{~/.ssh/config} file:

@example
Host 127.0.0.1
IdentityFile ~/.ssh/id_localhost
@end example

Now make sure your hosts file looks like this:

@example
[USERNAME]@@127.0.0.1:22
[USERNAME]@@127.0.0.1:22
@end example

You can test your setup by running @code{ssh 127.0.0.1} in a terminal and
then running it again in the opened session. If you were not asked for a
password on either login, then you should be good to go.

@c ***********************************************************************
@node TESTBED Caveats
@subsection TESTBED Caveats

This section documents a few caveats when using the GNUnet testbed
subsystem.


@c ***********************************************************************
@menu
* CORE must be started::
* ATS must want the connections::
@end menu

@node CORE must be started
@subsubsection CORE must be started

A simple issue is #3993: Your configuration MUST somehow ensure that for
each peer the CORE service is started when the peer is set up, otherwise
TESTBED may fail to connect peers when the topology is initialized, as
TESTBED will start some CORE services but not necessarily all (but it
relies on all of them running). The easiest way is to set
@code{FORCESTART = YES} in the @code{[core]} section of the configuration
file. Alternatively, having any service that directly or indirectly
depends on CORE being started with FORCESTART will also do. This issue
largely arises if users try to over-optimize by not starting any services
with FORCESTART.
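
A quick sanity check for the first of these two approaches could look
like the Python sketch below. It is illustrative only: the function name
@code{core_is_force_started} is an assumption, the standard
@code{configparser} module is used as a stand-in for GNUnet's own
configuration parser, and the check does not cover the alternative of
force-starting a service that depends on CORE.

```python
import configparser


def core_is_force_started(config_text):
    """Return True if the [core] section sets FORCESTART to YES."""
    # Interpolation is disabled so that '%' in values is left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    value = parser.get("core", "FORCESTART", fallback="NO")
    return value.strip().upper() == "YES"
```

Running the check over each peer's configuration before starting an
experiment makes the missing setting visible before TESTBED fails to
connect peers.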

@c ***********************************************************************
@node ATS must want the connections
@subsubsection ATS must want the connections

When TESTBED sets up connections, it only offers the respective HELLO
information to the TRANSPORT service. It is then up to the ATS service to
@strong{decide} to use the connection. The ATS service will typically
eagerly establish any connection if the number of total connections is
low (relative to bandwidth). Details may further depend on the
specific ATS backend that was configured. If ATS decides to NOT establish
a connection (even though TESTBED provided the required information), then
that connection will count as failed for TESTBED. Note that you can
configure TESTBED to tolerate a certain number of connection failures
(see the '-e' option of gnunet-testbed-profiler). This issue largely
arises for dense overlay topologies, especially if you try to create
cliques with more than 20 peers.

@c ***********************************************************************
@node libgnunetutil
@section libgnunetutil

libgnunetutil is the fundamental library that all GNUnet code builds upon.
Ideally, this library should contain most of the platform-dependent code
(except for user interfaces and really special needs that only a few
applications have). It is also supposed to offer basic services that most
if not all GNUnet binaries require. The code of libgnunetutil is in the
@file{src/util/} directory. The public interface to the library is in the
@file{gnunet_util_lib.h} header. The functions provided by libgnunetutil
fall roughly into the following categories (in roughly the order of
importance for new developers):

@itemize @bullet
@item logging (common_logging.c)
@item memory allocation (common_allocation.c)
@item endianness conversion (common_endian.c)
@item internationalization (common_gettext.c)
@item string manipulation (string.c)
@item file access (disk.c)
@item buffered disk IO (bio.c)
@item time manipulation (time.c)
@item configuration parsing (configuration.c)
@item command-line handling (getopt*.c)
@item cryptography (crypto_*.c)
@item data structures (container_*.c)
@item CPS-style scheduling (scheduler.c)
@item program initialization (program.c)
@item networking (network.c, client.c, server*.c, service.c)
@item message queueing (mq.c)
@item bandwidth calculations (bandwidth.c)
@item other OS-related functions (os*.c, plugin.c, signal.c)
@item pseudonym management (pseudonym.c)
@end itemize

It should be noted that only developers who fully understand this entire
API will be able to write good GNUnet code.

Ideally, porting GNUnet should only require porting the gnunetutil
library. More testcases for the gnunetutil APIs are therefore a great
way to make porting of GNUnet easier.

@menu
* Logging::
* Interprocess communication API (IPC)::
* Cryptography API::
* Message Queue API::
* Service API::
* Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps::
* The CONTAINER_MDLL API::
@end menu

@c ***********************************************************************
@node Logging
@subsection Logging

GNUnet is able to log its activity, mostly for the purposes of debugging
the program at various levels.

@file{gnunet_common.h} defines several @strong{log levels}:
@table @asis

@item ERROR
for errors (really problematic situations, often leading to
crashes)
@item WARNING
for warnings (troubling situations that might have
negative consequences, although not fatal)
@item INFO
for various information.
Used somewhat rarely, as GNUnet statistics is used to hold and display
most of the information that users might find interesting.
@item DEBUG
for debugging.
Does not produce much output on normal builds, but when extra logging is
enabled at compile time, a staggering amount of data is output under
this log level.
@end table


Normal builds of GNUnet (configured with @code{--enable-logging[=yes]})
are supposed to log nothing under DEBUG level. The
@code{--enable-logging=verbose} configure option can be used to create a
build with all logging enabled. However, such a build will produce large
amounts of log data, which is inconvenient when one tries to hunt down a
specific problem.

To mitigate this problem, GNUnet provides facilities to apply a filter to
reduce the logs:
@table @asis

@item Logging by default
When no log levels are configured in any other
way (see below), GNUnet will default to the WARNING log level. This
mostly applies to GNUnet command line utilities, services and daemons;
tests will always set log level to WARNING or, if
@code{--enable-logging=verbose} was passed to configure, to DEBUG. The
default level is suggested for normal operation.
@item The -L option
Most GNUnet executables accept an "-L loglevel" or
"--log=loglevel" option. If used, it makes the process set a global log
level to "loglevel". Thus it is possible to run some processes
with -L DEBUG, for example, and others with -L ERROR to enable specific
settings to diagnose problems with a particular process.
@item Configuration files
Because GNUnet
service and daemon processes are usually launched by gnunet-arm, it is not
possible to pass different custom command line options directly to every
one of them. The options passed to @code{gnunet-arm} only affect
gnunet-arm and not the rest of GNUnet. However, one can specify a
configuration key "OPTIONS" in the section that corresponds to a service
or a daemon, and put a value of "-L loglevel" there. This will make the
respective service or daemon set its log level to "loglevel" (as the
value of OPTIONS will be passed as a command-line argument).

To specify the same log level for all services without creating separate
"OPTIONS" entries in the configuration for each one, the user can specify
a config key "GLOBAL_POSTFIX" in the [arm] section of the configuration
file. The value of GLOBAL_POSTFIX will be appended to all command lines
used by the ARM service to run other services. It can contain any option
valid for all GNUnet commands, thus in particular the "-L loglevel"
option. The ARM service itself is, however, unaffected by GLOBAL_POSTFIX;
to set the log level for it, one has to specify the "OPTIONS" key in the
[arm] section.
@item Environment variables
Setting global per-process log levels with "-L loglevel" does not offer
sufficient log filtering granularity, as one service will call interface
libraries and supporting libraries of other GNUnet services, potentially
producing lots of debug log messages from these libraries. Also, changing
the config file is not always convenient (especially when running the
GNUnet test suite).

To fix that, and to allow GNUnet to use different
log filtering at runtime without re-compiling the whole source tree, the
log calls were changed to be configurable at run time. To configure them
one has to define environment variables "GNUNET_FORCE_LOGFILE",
"GNUNET_LOG" and/or "GNUNET_FORCE_LOG":
@itemize @bullet

@item "GNUNET_LOG" only affects the logging when no global log level is
configured by any other means (that is, the process does not explicitly
set its own log level, there are no "-L loglevel" options on command line
or in configuration files), and can be used to override the default
WARNING log level.

@item "GNUNET_FORCE_LOG" will completely override any other log
configuration options given.

@item "GNUNET_FORCE_LOGFILE" will completely override the location of the
file to log messages to. It should contain a relative or absolute file
name. Setting GNUNET_FORCE_LOGFILE is equivalent to passing
"--log-file=logfile" or "-l logfile" option (see below). It supports "[]"
format in file names, but not "@{@}" (see below).
@end itemize


Because environment variables are inherited by child processes when they
are launched, starting or re-starting the ARM service with these
variables will propagate them to all other services.

"GNUNET_LOG" and "GNUNET_FORCE_LOG" variables must contain a specially
formatted @strong{logging definition} string, which looks like this:

@example
[component];[file];[function];[from_line[-to_line]];loglevel[/component...]
@end example

That is, a logging definition consists of definition entries, separated by
slashes ('/'). If only one entry is present, there is no need to add a
slash to its end (although it is not forbidden either).

All definition fields (component, file, function, lines and loglevel) are
mandatory, but (except for the loglevel) they can be empty. An empty field
means "match anything". Note that even if fields are empty, the semicolon
(';') separators must be present.

The loglevel field is mandatory, and must
contain one of the log level names (ERROR, WARNING, INFO or DEBUG).

The lines field might contain one non-negative number, in which case it
matches only one line, or a range "from_line-to_line", in which case it
matches any line in the interval [from_line;to_line] (that is, including
both start and end line).

GNUnet mostly defaults the component name to the
name of the service that is implemented in a process ('transport',
'core', 'peerinfo', etc), but logging calls can specify custom component
names using @code{GNUNET_log_from}.

File name and function name are
provided by the compiler (__FILE__ and __FUNCTION__ built-ins).

Component, file and function fields are interpreted as non-extended
regular expressions (GNU libc regex functions are used). Matching is
case-sensitive, "^" and "$" will match the beginning and the end of the
text. If a field is empty, its contents are automatically replaced with
a ".*" regular expression, which matches anything. Matching is done in
the default way, which means that the expression matches as long as it's
contained anywhere in the string. Thus "GNUNET_" will match both
"GNUNET_foo" and "BAR_GNUNET_BAZ". Use '^' and/or '$' to make sure that
the expression matches at the start and/or at the end of the string.
The semicolon (';') can't be escaped, and GNUnet will not use it in
component names (it can't be used in function names and file names
anyway).

@end table
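
The structure of a logging definition string can be made concrete with a
small parser. The Python sketch below follows the format as described
above but is not GNUnet's parser: the name @code{parse_logdef} and the
dictionary layout are assumptions, and unlike GNUnet, which stops
silently on a malformed definition, this sketch raises an exception so
that mistakes are visible.

```python
LEVELS = ("ERROR", "WARNING", "INFO", "DEBUG")


def parse_logdef(definition):
    """Split a logging definition into a list of entry dictionaries."""
    entries = []
    for raw in definition.split("/"):
        if not raw:
            continue  # a trailing slash is allowed
        fields = raw.split(";")
        if len(fields) != 5:
            raise ValueError("expected 5 fields in %r" % raw)
        component, filename, function, lines, level = fields
        if level not in LEVELS:
            raise ValueError("bad log level in %r" % raw)
        if lines:
            # Either a single line number or an inclusive range.
            first, _, last = lines.partition("-")
            line_range = (int(first), int(last or first))
        else:
            line_range = None  # empty field: match any line
        entries.append({
            # Empty regex fields are replaced with ".*" (match anything).
            "component": component or ".*",
            "file": filename or ".*",
            "function": function or ".*",
            "lines": line_range,
            "level": level,
        })
    return entries
```

For instance, the string @code{;;;10-20;INFO} yields a single entry that
matches any component, file and function on lines 10 through 20.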


Every logging call in GNUnet code will be (at run time) matched against
the log definitions passed to the process. If the fields of a log
definition match the call arguments, then the call log level is compared
to the log level of that definition. If the call log level is less than or
equal to the definition log level, the call is allowed to proceed.
Otherwise the logging call is forbidden, and nothing is logged. If no
definitions matched at all, GNUnet will use the global log level or (if a
global log level is not specified) will default to WARNING (that is, it
will allow the call to proceed, if its level is less than or equal to the
global log level or to WARNING).

That is, definitions are evaluated from left to right, and the first
matching definition is used to allow or deny the logging call. Thus it is
advised to place narrow definitions at the beginning of the logdef
string, and generic definitions at the end.
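
The first-match rule can be sketched in a few lines of Python. This is
an approximation for illustration and not GNUnet code: the function
@code{call_allowed} and the tuple layout of a definition are
assumptions, and Python's @code{re} module is used in place of the GNU
libc non-extended regular expressions, which differ in syntax for more
complex patterns.

```python
import re

# Smaller value = more severe. A call may proceed when its level is less
# than or equal to the level of the definition that matched it.
LEVEL_ORDER = {"ERROR": 0, "WARNING": 1, "INFO": 2, "DEBUG": 3}


def call_allowed(definitions, component, filename, function, line, level,
                 global_level="WARNING"):
    """Return True if the logging call may proceed.

    Each definition is a tuple (component_re, file_re, function_re,
    lines, loglevel); an empty regex matches anything and `lines` is
    either None or an inclusive (from_line, to_line) pair.
    """
    for comp_re, file_re, func_re, lines, def_level in definitions:
        if not (re.search(comp_re, component)
                and re.search(file_re, filename)
                and re.search(func_re, function)):
            continue
        if lines is not None and not lines[0] <= line <= lines[1]:
            continue
        # The first matching definition decides; later ones are ignored.
        return LEVEL_ORDER[level] <= LEVEL_ORDER[def_level]
    # No definition matched: fall back to the global level.
    return LEVEL_ORDER[level] <= LEVEL_ORDER[global_level]
```

Placing a generic definition such as @code{;;;;WARNING} before a narrow
one hides the narrow one completely, which is why the narrow definitions
belong at the beginning of the string.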

Whether a call is allowed or not is only decided the first time this
particular call is made. The evaluation result is then cached, so that
any attempts to make the same call later will be allowed or disallowed
right away. Because of that, runtime log level evaluation should not
significantly affect the process performance.
Log definition parsing is only done once, at the first call to
@code{GNUNET_log_setup ()} made by the process (which is usually done soon
after it starts).
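
The caching described above amounts to remembering one decision per call
site. The Python sketch below shows the idea only; the class
@code{CallSite} and its interface are hypothetical and say nothing about
how GNUnet stores the cached result internally.

```python
class CallSite:
    """A logging call site that remembers whether it is allowed."""

    def __init__(self, decide):
        self._decide = decide    # function that evaluates the definitions
        self._allowed = None     # None means "not evaluated yet"

    def log(self, message, sink):
        """Append `message` to `sink` if this call site is allowed."""
        if self._allowed is None:
            # Evaluated only on the first call, then reused.
            self._allowed = bool(self._decide())
        if self._allowed:
            sink.append(message)
        return self._allowed
```

However often the call site is used, the potentially expensive matching
against the log definitions runs once.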
1986
1987At the moment of writing there is no way to specify logging definitions
1988from configuration files, only via environment variables.
1989
1990At the moment GNUnet will stop processing a log definition when it
1991encounters an error in definition formatting or an error in regular
1992expression syntax, and will not report the failure in any way.
1993
1994
1995@c ***********************************************************************
1996@menu
1997* Examples::
1998* Log files::
1999* Updated behavior of GNUNET_log::
2000@end menu
2001
2002@node Examples
2003@subsubsection Examples
2004
2005@table @asis
2006
2007@item @code{GNUNET_FORCE_LOG=";;;;DEBUG" gnunet-arm -s} Start GNUnet
2008process tree, running all processes with DEBUG level (one should be
2009careful with it, as log files will grow at alarming rate!)
2010@item @code{GNUNET_FORCE_LOG="core;;;;DEBUG" gnunet-arm -s} Start GNUnet
2011process tree, running the core service under DEBUG level (everything else
2012will use configured or default level).
2013
2014@item Start GNUnet process tree, allowing any logging calls from
2015gnunet-service-transport_validation.c (everything else will use
2016configured or default level).
2017
2018@example
2019GNUNET_FORCE_LOG=";gnunet-service-transport_validation.c;;; DEBUG" \
2020gnunet-arm -s
2021@end example
2022
2023@item Start GNUnet process tree, allowing any logging calls from
2024gnunet-gnunet-service-fs_push.c (everything else will use configured or
2025default level).
2026
2027@example
2028GNUNET_FORCE_LOG="fs;gnunet-service-fs_push.c;;;DEBUG" gnunet-arm -s
2029@end example
2030
2031@item Start GNUnet process tree, allowing any logging calls from the
2032GNUNET_NETWORK_socket_select function (everything else will use
2033configured or default level).
2034
2035@example
2036GNUNET_FORCE_LOG=";;GNUNET_NETWORK_socket_select;;DEBUG" gnunet-arm -s
2037@end example
2038
2039@item Start GNUnet process tree, allowing any logging calls from the
2040components that have "transport" in their names, and are made from
2041function that have "send" in their names. Everything else will be allowed
2042to be logged only if it has WARNING level.
2043
2044@example
2045GNUNET_FORCE_LOG="transport.*;;.*send.*;;DEBUG/;;;;WARNING" gnunet-arm -s
2046@end example
2047
2048@end table
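
All definitions above share one layout: five fields separated by
semicolons (component, file, function, line range, level), with several
definitions separated by a slash. The following stand-alone C sketch
splits one such definition into its fields. It is not GNUnet's actual
parser; @code{struct LogDef}, @code{parse_logdef} and the fixed field size
are hypothetical and only illustrate the field order.

```c
#include <stddef.h>
#include <string.h>

#define FIELD_MAX 64

/* Hypothetical container for one log definition. */
struct LogDef
{
  char component[FIELD_MAX];
  char file[FIELD_MAX];
  char function[FIELD_MAX];
  char lines[FIELD_MAX];
  char level[FIELD_MAX];
};

/* Split "component;file;function;lines;level" into its five fields.
   Returns 0 on success, -1 if the number of fields is wrong or a
   field does not fit into FIELD_MAX bytes. */
int
parse_logdef (const char *def, struct LogDef *out)
{
  char *fields[5];
  int idx = 0;
  size_t pos = 0;

  memset (out, 0, sizeof (*out));   /* all fields start out empty */
  fields[0] = out->component;
  fields[1] = out->file;
  fields[2] = out->function;
  fields[3] = out->lines;
  fields[4] = out->level;
  for (const char *p = def; '\0' != *p; p++)
  {
    if (';' == *p)
    {
      if (++idx >= 5)
        return -1;                  /* too many separators */
      pos = 0;
      continue;
    }
    if (pos + 1 >= FIELD_MAX)
      return -1;                    /* field too long */
    fields[idx][pos++] = *p;
  }
  return (4 == idx) ? 0 : -1;       /* exactly four separators expected */
}
```

An empty field stays an empty string, which corresponds to "match
anything" in the examples above.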
2049
2050
On Windows, one can use batch files to run GNUnet processes with special
environment variables, without affecting the whole system. Such a batch
file will look like this:
2054
2055@example
set GNUNET_FORCE_LOG=;;do_transmit;;DEBUG
gnunet-arm -s
2057@end example
2058
(Note the absence of double quotes in the environment variable definition,
as opposed to the earlier examples, which use the shell.)
Another limitation on Windows is that GNUNET_FORCE_LOGFILE @strong{MUST}
be set in order for GNUNET_FORCE_LOG to work.
2063
2064
2065@c ***********************************************************************
2066@node Log files
2067@subsubsection Log files
2068
GNUnet can be told to log everything into a file instead of stderr (which
is the default) using the "--log-file=logfile" or "-l logfile" option.
This option can be passed on the command line, or via the "OPTION" and
"GLOBAL_POSTFIX" configuration keys (see above). The file name passed
with this option is subject to GNUnet filename expansion. If specified in
"GLOBAL_POSTFIX", it is also subject to ARM service filename expansion;
in particular, it may contain the "@{@}" sequence (left and right curly
brace), which ARM replaces with the name of the service.
This keeps the logs of different services separate while specifying only
one template containing "@{@}" in GLOBAL_POSTFIX.
2079
As part of a secondary file name expansion, the first occurrence of the
"[]" sequence ("left square brace" followed by "right square brace") in
the file name will be replaced with the process identifier of the process
when it initializes its logging subsystem. As a result, all processes will
log into different files. This is convenient for isolating messages of a
particular process, and prevents I/O races when multiple processes try to
write into the file at the same time. This expansion is done
independently of the "@{@}" expansion that the ARM service does (see above).
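
The "[]" expansion can be sketched in a few lines of stand-alone C. This
is an illustration under assumptions, not GNUnet's implementation:
@code{expand_pid} is a hypothetical helper, and a real caller would pass
the value returned by @code{getpid}.

```c
#include <stdio.h>
#include <string.h>

/* Replace the first "[]" in `tmpl` with `pid` and store the result in
   `out`.  A template without "[]" is copied unchanged.
   Returns 0 on success, -1 if the result does not fit into `out`. */
int
expand_pid (const char *tmpl, long pid, char *out, size_t out_size)
{
  const char *mark = strstr (tmpl, "[]");
  int n;

  if (NULL == mark)
    n = snprintf (out, out_size, "%s", tmpl);
  else
    n = snprintf (out, out_size, "%.*s%ld%s",
                  (int) (mark - tmpl), tmpl,  /* text before "[]" */
                  pid,
                  mark + 2);                  /* text after "[]" */
  return ((n >= 0) && ((size_t) n < out_size)) ? 0 : -1;
}
```

Only the first occurrence is replaced, so a second "[]" in the template
is kept literally.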
2088
The log file name that is specified via "-l" can contain format characters
from the 'strftime' function family. For example, "%Y" will be replaced
with the current year. Using "basename-%Y-%m-%d.log" would include the
current year, month and day in the log file name. If a GNUnet process runs
for long enough to need more than one log file, it will eventually clean
up old log files. Currently, only the last three log files (plus the
current log file) are preserved. Once the fifth log file goes into use
(after 4 days if you use "%Y-%m-%d" as above), the first log file will be
automatically deleted. Note that if your log file name only contains "%Y",
then log files would be kept for 4 years and the logs from the first year
would be deleted once year 5 begins. If you do not use any date-related
string format codes, logs would never be automatically deleted by GNUnet.
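
The rotation rule described above (three old files plus the current one)
can be sketched as follows. This is a minimal stand-alone illustration,
not GNUnet's code: @code{struct Rotation}, @code{rotate} and the buffer
sizes are hypothetical, and the sketch only reports which file name
expires instead of deleting anything.

```c
#include <string.h>
#include <time.h>

#define KEEP_FILES 4      /* three old log files plus the current one */
#define LOG_NAME_LEN 128

struct Rotation
{
  char names[KEEP_FILES][LOG_NAME_LEN];   /* oldest name first */
  unsigned int used;                      /* number of valid names */
};

/* Expand `tmpl` with strftime() for the calendar time `when`.  If this
   yields a new file name, remember it; if KEEP_FILES names were already
   known, copy the oldest one to `expired` and return 1 (the caller
   would now delete that file).  Returns 0 if nothing expired and -1 if
   the expanded name is empty or does not fit. */
int
rotate (struct Rotation *rot, const char *tmpl, const struct tm *when,
        char expired[LOG_NAME_LEN])
{
  char name[LOG_NAME_LEN];

  if (0 == strftime (name, sizeof (name), tmpl, when))
    return -1;
  if ((rot->used > 0) &&
      (0 == strcmp (rot->names[rot->used - 1], name)))
    return 0;                             /* still the same log file */
  if (rot->used < KEEP_FILES)
  {
    strcpy (rot->names[rot->used++], name);
    return 0;
  }
  strcpy (expired, rot->names[0]);        /* the oldest name falls out */
  memmove (rot->names[0], rot->names[1],
           (KEEP_FILES - 1) * LOG_NAME_LEN);
  strcpy (rot->names[KEEP_FILES - 1], name);
  return 1;
}
```

With the template "basename-%Y-%m-%d.log", the first four days only fill
the table; on the fifth day the name of the first day is reported as
expired.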
2101
2102
2103@c ***********************************************************************
2104
2105@node Updated behavior of GNUNET_log
2106@subsubsection Updated behavior of GNUNET_log
2107
2108It's currently quite common to see constructions like this all over the
2109code:
2110
2111@example
2112#if MESH_DEBUG
2113GNUNET_log (GNUNET_ERROR_TYPE_DEBUG, "MESH: client disconnected\n");
2114#endif
2115@end example
2116
2117The reason for the #if is not to avoid displaying the message when
2118disabled (GNUNET_ERROR_TYPE takes care of that), but to avoid the
2119compiler including it in the binary at all, when compiling GNUnet for
2120platforms with restricted storage space / memory (MIPS routers,
2121ARM plug computers / dev boards, etc).
2122
This presents several problems: the code gets ugly and hard to write, and
it is very easy to forget the #if guards, resulting in inconsistent
code. A new change in GNUNET_log aims to solve these problems.
2126
@strong{With this change, you must run @file{./configure} with at least
@code{--enable-logging=verbose} to see debug messages.}
2129
2130Here is an example of code with dense debug statements:
2131
2132@example
switch (restrict_topology) @{
case GNUNET_TESTING_TOPOLOGY_CLIQUE:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but clique topology\n"));
#endif
  unblacklisted_connections =
    create_clique (pg, &remove_connections, BLACKLIST, GNUNET_NO);
  break;
case GNUNET_TESTING_TOPOLOGY_SMALL_WORLD_RING:
#if VERBOSE_TESTING
  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
              _("Blacklisting all but small world (ring) topology\n"));
#endif
  unblacklisted_connections =
    create_small_world_ring (pg, &remove_connections, BLACKLIST);
  break;
2142@end example
2143
2144
2145Pretty hard to follow, huh?
2146
2147From now on, it is not necessary to include the #if / #endif statements to
2148achieve the same behavior. The GNUNET_log and GNUNET_log_from macros take
2149care of it for you, depending on the configure option:
2150
2151@itemize @bullet
2152@item If @code{--enable-logging} is set to @code{no}, the binary will
2153contain no log messages at all.
2154@item If @code{--enable-logging} is set to @code{yes}, the binary will
2155contain no DEBUG messages, and therefore running with -L DEBUG will have
2156no effect. Other messages (ERROR, WARNING, INFO, etc) will be included.
@item If @code{--enable-logging} is set to @code{verbose} or
@code{veryverbose}, the binary will contain DEBUG messages (still, it will
be necessary to run with -L DEBUG or set the DEBUG config option to show
them).
2161@end itemize
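
The idea behind these options can be sketched with a small stand-alone
macro. This is only an illustration of the technique, not the definition
of @code{GNUNET_log}; all names starting with @code{SKETCH_} or
@code{sketch_} are hypothetical, and the compile-time switch
@code{SKETCH_EXTRA_LOGGING} stands in for whatever the configure option
defines.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message kinds (bit flags). */
#define SKETCH_TYPE_ERROR 1
#define SKETCH_TYPE_DEBUG 8

/* 0 models --enable-logging=yes, 1 models --enable-logging=verbose. */
#ifndef SKETCH_EXTRA_LOGGING
#define SKETCH_EXTRA_LOGGING 0
#endif

static unsigned int emitted;   /* number of messages actually printed */

void
sketch_log_emit (int kind, const char *fmt, ...)
{
  va_list ap;

  (void) kind;                 /* a real logger would filter by level */
  va_start (ap, fmt);
  vfprintf (stderr, fmt, ap);
  va_end (ap);
  emitted++;
}

/* For a DEBUG call the condition is a compile-time constant, so with
   SKETCH_EXTRA_LOGGING == 0 the compiler can drop the whole call,
   including its format string, without any #if at the call site. */
#define SKETCH_LOG(kind, ...)                          \
  do {                                                 \
    if ((SKETCH_EXTRA_LOGGING > 0) ||                  \
        (0 == ((kind) & SKETCH_TYPE_DEBUG)))           \
      sketch_log_emit ((kind), __VA_ARGS__);           \
  } while (0)
```

Call sites then look the same for every level, for example
@code{SKETCH_LOG (SKETCH_TYPE_DEBUG, "client disconnected\n");}.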
2162
2163
2164If you are a developer:
2165@itemize @bullet
2166@item please make sure that you @code{./configure
2167--enable-logging=@{verbose,veryverbose@}}, so you can see DEBUG messages.
@item please remove the @code{#if} statements around @code{GNUNET_log
(GNUNET_ERROR_TYPE_DEBUG, ...)} lines, to improve the readability of your
code.
2171@end itemize
2172
Because activating DEBUG now automatically implies VERBOSE and activates
@strong{all} debug messages by default, you probably want to use the
logging functionality described at https://gnunet.org/logging to filter
for the relevant messages. A suitable configuration could be:
2177
2178@example
2179$ export GNUNET_FORCE_LOG="^YOUR_SUBSYSTEM$;;;;DEBUG/;;;;WARNING"
2180@end example
2181
This will behave almost like enabling DEBUG in that subsystem before the
change. Of course you can adapt it to your particular needs; this is only
a quick example.
2185
2186@c ***********************************************************************
2187@node Interprocess communication API (IPC)
2188@subsection Interprocess communication API (IPC)
2189
In GNUnet, a variety of new message types may be defined and used in
interprocess communication. In this tutorial we use the
@code{struct AddressLookupMessage} as an example to show how to
construct our own message type in GNUnet and how to implement the message
exchange between service and client.
(Here, a client uses the @code{struct AddressLookupMessage} as a request
to ask the server to return the address of any other peer connecting to
the service.)
2198
2199
2200@c ***********************************************************************
2201@menu
2202* Define new message types::
2203* Define message struct::
2204* Client - Establish connection::
2205* Client - Initialize request message::
2206* Client - Send request and receive response::
2207* Server - Startup service::
2208* Server - Add new handles for specified messages::
2209* Server - Process request message::
2210* Server - Response to client::
2211* Server - Notification of clients::
2212* Conversion between Network Byte Order (Big Endian) and Host Byte Order::
2213@end menu
2214
2215@node Define new message types
2216@subsubsection Define new message types
2217
2218First of all, you should define the new message type in
2219@file{gnunet_protocols.h}:
2220
2221@example
 // Request to look up addresses of peers in the server.
2223#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP 29
2224 // Response to the address lookup request.
2225#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY 30
2226@end example
2227
2228@c ***********************************************************************
2229@node Define message struct
2230@subsubsection Define message struct
2231
After the type definition, the corresponding message structure should also
be declared in a header file, e.g. @file{transport.h} in our case.
2234@example
2235GNUNET_NETWORK_STRUCT_BEGIN
2236
struct AddressLookupMessage
@{
  struct GNUNET_MessageHeader header;
  int32_t numeric_only GNUNET_PACKED;
  struct GNUNET_TIME_AbsoluteNBO timeout;
  uint32_t addrlen GNUNET_PACKED;
  /* followed by 'addrlen' bytes of the actual address, then
     followed by the 0-terminated name of the transport */
@};
GNUNET_NETWORK_STRUCT_END
2243@end example
2244
2245
Please note @code{GNUNET_NETWORK_STRUCT_BEGIN} and @code{GNUNET_PACKED},
which together ensure a consistent, packed layout when sending structs
over the network.
2248
2251
2252@c ***********************************************************************
2253@node Client - Establish connection
2254@subsubsection Client - Establish connection
2255@c %**end of header
2256
2257
First, on the client side, the underlying API is used to create a new
connection to a service; in our example, we connect to the transport
service.
2261
2262@example
struct GNUNET_CLIENT_Connection *client;
client = GNUNET_CLIENT_connect ("transport", cfg);
2265@end example
2266
2267@c ***********************************************************************
2268@node Client - Initialize request message
2269@subsubsection Client - Initialize request message
2270@c %**end of header
2271
When the connection is ready, we initialize the message. In this step,
all the fields of the message should be properly initialized, namely the
size, the type, and the user-defined data, such as the timeout, the
address and the name of the transport.
2276
2277@example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
             + addressLen + strlen (nameTrans) + 1;

/* allocate the fixed part plus the variable-size payload */
msg = GNUNET_malloc (len);
msg->header.size = htons (len);
msg->header.type = htons (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP);
msg->timeout = GNUNET_TIME_absolute_hton (abs_timeout);
msg->addrlen = htonl (addressLen);
/* the payload starts right after the fixed part of the message */
char *addrbuf = (char *) &msg[1];
memcpy (addrbuf, address, addressLen);
char *tbuf = &addrbuf[addressLen];
memcpy (tbuf, nameTrans, strlen (nameTrans) + 1);
2285@end example
2286
Note that the functions @code{htonl}, @code{htons} and
@code{GNUNET_TIME_absolute_hton} are used here to convert values from
host byte order (often little endian) into network byte order (big
endian). For details on byte order and the corresponding conversion
functions, please refer to the section on conversion between Network
Byte Order (Big Endian) and Host Byte Order.
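
To experiment with this layout without the GNUnet headers, the following
stand-alone sketch builds a comparable variable-size message. It rests on
several assumptions: @code{struct SketchHeader}, @code{struct SketchLookup}
and @code{build_lookup} are hypothetical stand-ins, the
@code{numeric_only} and @code{timeout} fields are left out, and packing
relies on the GCC/Clang @code{packed} attribute.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the GNUnet message header. */
struct SketchHeader
{
  uint16_t size;   /* total message size, network byte order */
  uint16_t type;   /* message type, network byte order */
} __attribute__ ((packed));

struct SketchLookup
{
  struct SketchHeader header;
  uint32_t addrlen;   /* network byte order */
  /* followed by 'addrlen' address bytes and a 0-terminated name */
} __attribute__ ((packed));

/* Build a lookup message; the caller frees the result.  Returns NULL if
   the message would not fit into the 16-bit size field or if the
   allocation fails. */
struct SketchLookup *
build_lookup (uint16_t type, const void *addr, uint32_t addrlen,
              const char *name)
{
  size_t slen = strlen (name) + 1;    /* include the 0-terminator */
  size_t len;
  struct SketchLookup *msg;
  char *payload;

  if ((addrlen > UINT16_MAX) || (slen > UINT16_MAX))
    return NULL;
  len = sizeof (struct SketchLookup) + addrlen + slen;
  if (len > UINT16_MAX)
    return NULL;
  msg = calloc (1, len);
  if (NULL == msg)
    return NULL;
  msg->header.size = htons ((uint16_t) len);
  msg->header.type = htons (type);
  msg->addrlen = htonl (addrlen);
  payload = (char *) &msg[1];         /* first byte after the fixed part */
  memcpy (payload, addr, addrlen);
  memcpy (payload + addrlen, name, slen);
  return msg;
}
```

The explicit length check matters because the size field of the header
is only 16 bits wide.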
2292
2293@c ***********************************************************************
2294@node Client - Send request and receive response
2295@subsubsection Client - Send request and receive response
2296@c %**end of header
2297
2298@b{FIXME: This is very outdated, see the tutorial for the current API!}
2299
2300Next, the client would send the constructed message as a request to the
2301service and wait for the response from the service. To accomplish this
2302goal, there are a number of API calls that can be used. In this example,
2303@code{GNUNET_CLIENT_transmit_and_get_response} is chosen as the most
2304appropriate function to use.
2305
2306@example
GNUNET_CLIENT_transmit_and_get_response (client, &msg->header, timeout,
                                         GNUNET_YES,
                                         &address_response_processor,
                                         arp_ctx);
2310@end example
2311
The argument @code{address_response_processor} is a function of type
@code{GNUNET_CLIENT_MessageHandler}, which is used to process the
reply message from the service.
2315
2316@node Server - Startup service
2317@subsubsection Server - Startup service
2318
On the server side, before any request message can be received, we run
the standard GNUnet service startup sequence using
@code{GNUNET_SERVICE_run}, as follows:
2321
2322@example
int main (int argc, char **argv)
@{
  return (GNUNET_OK == GNUNET_SERVICE_run (argc, argv, "transport",
            GNUNET_SERVICE_OPTION_NONE, &run, NULL)) ? 0 : 1;
@}
2326@end example
2327
2328@c ***********************************************************************
2329@node Server - Add new handles for specified messages
2330@subsubsection Server - Add new handles for specified messages
2331@c %**end of header
2332
In the function above, the argument @code{run} is used to initialize the
transport service, and is defined like this:
2335
2336@example
static void
run (void *cls, struct GNUNET_SERVER_Handle *serv,
     const struct GNUNET_CONFIGURATION_Handle *cfg)
@{
  GNUNET_SERVER_add_handlers (serv, handlers);
@}
2340@end example
2341
2342
2343Here, @code{GNUNET_SERVER_add_handlers} must be called in the run
2344function to add new handlers in the service. The parameter
2345@code{handlers} is a list of @code{struct GNUNET_SERVER_MessageHandler}
2346to tell the service which function should be called when a particular
2347type of message is received, and should be defined in this way:
2348
2349@example
static struct GNUNET_SERVER_MessageHandler handlers[] = @{
  @{&handle_start, NULL, GNUNET_MESSAGE_TYPE_TRANSPORT_START, 0@},
  @{&handle_send, NULL, GNUNET_MESSAGE_TYPE_TRANSPORT_SEND, 0@},
  @{&handle_try_connect, NULL, GNUNET_MESSAGE_TYPE_TRANSPORT_TRY_CONNECT,
   sizeof (struct TryConnectMessage)@},
  @{&handle_address_lookup, NULL,
   GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP, 0@},
  @{NULL, NULL, 0, 0@}
@};
2356@end example
2357
2358
As shown, the first member of each entry is a callback function, which is
called to process messages of the type given as the third member. The
second member is the closure for the callback function, which is set to
@code{NULL} in most cases. The last member is the expected size of
messages of this type; usually we set it to 0 to accept variable-size
messages, but for special cases the exact size of the message can be
given. Finally, the table is terminated by the entry
@code{@{NULL, NULL, 0, 0@}}.
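
How such a table can be consulted is sketched below in stand-alone C.
This is not the GNUnet server code: @code{struct SketchHandler},
@code{dispatch} and @code{count_calls} are hypothetical, and the policy of
rejecting a message whose size differs from a non-zero expected size is an
assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*SketchCallback) (void *cls, const void *msg, uint16_t size);

struct SketchHandler
{
  SketchCallback callback;   /* function to call for this type */
  void *callback_cls;        /* closure passed to the callback */
  uint16_t type;             /* message type handled by this entry */
  uint16_t expected_size;    /* 0 accepts any size */
};

/* Walk the table up to the terminator entry (NULL callback) and call the
   first handler registered for `type`.  Returns the callback's result,
   or -1 if no handler matches or the size check fails. */
int
dispatch (const struct SketchHandler *handlers, uint16_t type,
          const void *msg, uint16_t size)
{
  for (size_t i = 0; NULL != handlers[i].callback; i++)
  {
    if (handlers[i].type != type)
      continue;
    if ((0 != handlers[i].expected_size) &&
        (size != handlers[i].expected_size))
      return -1;             /* fixed-size message with wrong size */
    return handlers[i].callback (handlers[i].callback_cls, msg, size);
  }
  return -1;                 /* reached the terminator: unknown type */
}

/* Example callback: counts invocations through its closure. */
int
count_calls (void *cls, const void *msg, uint16_t size)
{
  (void) msg;
  (void) size;
  (*(unsigned int *) cls)++;
  return 0;
}
```

The terminator entry is what lets @code{dispatch} find the end of the
table without a separate length argument.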
2367
2368@c ***********************************************************************
2369@node Server - Process request message
2370@subsubsection Server - Process request message
2371@c %**end of header
2372
After the transport service has been initialized, request messages can
be processed. Before handling the main message data, the validity of the
message should be checked, e.g., whether the size of the message is
correct.
2377
2378@example
size = ntohs (message->size);
if (size < sizeof (struct AddressLookupMessage)) @{
  GNUNET_break_op (0);
  GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
  return;
@}
2382@end example
2383
2384
Note that, in contrast to the construction of the request message in the
client, the server should use the functions @code{ntohl} and @code{ntohs}
when extracting data from the message, so that the data in network byte
order (big endian) is converted back into host byte order. For more
details, please refer to the section on conversion between Network Byte
Order (Big Endian) and Host Byte Order.
2391
2392Moreover in this example, the name of the transport stored in the message
2393is a 0-terminated string, so we should also check whether the name of the
2394transport in the received message is 0-terminated:
2395
2396@example
nameTransport = (const char *) &address[addressLen];
if (nameTransport[size - sizeof (struct AddressLookupMessage)
                  - addressLen - 1] != '\0') @{
  GNUNET_break_op (0);
  GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
  return;
@}
2403@end example
2404
Here, @code{GNUNET_SERVER_receive_done} should be called to tell the
service that the request is done and that it can receive the next
message. The argument @code{GNUNET_SYSERR} here indicates that the service
didn't understand the request message, and the processing of this request
would be terminated.

In contrast, when the argument is equal to @code{GNUNET_OK}, the service
would continue to process the request message.
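
The two checks of this section (minimum size and 0-termination) can be
combined into one validation routine. The stand-alone sketch below uses
hypothetical, simplified structures (@code{struct SketchHeader} and
@code{struct SketchLookup}, without the @code{numeric_only} and
@code{timeout} fields) and a hypothetical function @code{check_lookup};
it additionally verifies that the address length fits into the message,
which the fragments above leave out.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct SketchHeader
{
  uint16_t size;   /* total message size, network byte order */
  uint16_t type;   /* message type, network byte order */
} __attribute__ ((packed));

struct SketchLookup
{
  struct SketchHeader header;
  uint32_t addrlen;   /* network byte order */
} __attribute__ ((packed));

/* Validate a received buffer of `received` bytes.  On success, return a
   pointer to the 0-terminated transport name inside the buffer;
   otherwise return NULL. */
const char *
check_lookup (const void *buf, size_t received)
{
  struct SketchLookup fixed;
  size_t size;
  size_t addrlen;
  const char *payload;

  if (received < sizeof (fixed))
    return NULL;                          /* too short for fixed part */
  memcpy (&fixed, buf, sizeof (fixed));   /* buffer may be unaligned */
  size = ntohs (fixed.header.size);
  addrlen = ntohl (fixed.addrlen);
  if ((size != received) || (size < sizeof (fixed)))
    return NULL;                          /* inconsistent size field */
  if (addrlen >= size - sizeof (fixed))
    return NULL;                          /* no room left for the name */
  payload = (const char *) buf + sizeof (fixed);
  if ('\0' != payload[size - sizeof (fixed) - 1])
    return NULL;                          /* name is not 0-terminated */
  return payload + addrlen;
}
```

A handler would report failure (the @code{GNUNET_SYSERR} case above) when
this function returns @code{NULL}.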
2414
2415@c ***********************************************************************
2416@node Server - Response to client
2417@subsubsection Server - Response to client
2418@c %**end of header
2419
Once the processing of the current request is done, the server should send
the response to the client. A new @code{struct AddressLookupMessage} is
produced by the server in a similar way as the client did and sent to the
client, but here the type should be
@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY} rather than the
@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP} used by the client.
2426@example
struct AddressLookupMessage *msg;
size_t len = sizeof (struct AddressLookupMessage)
             + addressLen + strlen (nameTrans) + 1;

msg = GNUNET_malloc (len);
msg->header.size = htons (len);
msg->header.type = htons (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);

// ...

struct GNUNET_SERVER_TransmitContext *tc;
tc = GNUNET_SERVER_transmit_context_create (client);
GNUNET_SERVER_transmit_context_append_data (tc, NULL, 0,
    GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
GNUNET_SERVER_transmit_context_run (tc, rtimeout);
2439@end example
2440
2441
Note that a number of other APIs are also available to the service for
sending messages.
2444
2445@c ***********************************************************************
2446@node Server - Notification of clients
2447@subsubsection Server - Notification of clients
2448@c %**end of header
2449
2450Often a service needs to (repeatedly) transmit notifications to a client
2451or a group of clients. In these cases, the client typically has once
2452registered for a set of events and then needs to receive a message
2453whenever such an event happens (until the client disconnects). The use of
2454a notification context can help manage message queues to clients and
2455handle disconnects. Notification contexts can be used to send
2456individualized messages to a particular client or to broadcast messages
2457to a group of clients. An individualized notification might look like
2458this:
2459
2460@example
GNUNET_SERVER_notification_context_unicast (nc, client, msg, GNUNET_YES);
2463@end example
2464
2465
2466Note that after processing the original registration message for
2467notifications, the server code still typically needs to call
2468@code{GNUNET_SERVER_receive_done} so that the client can transmit further
2469messages to the server.
2470
2471@c ***********************************************************************
2472@node Conversion between Network Byte Order (Big Endian) and Host Byte Order
2473@subsubsection Conversion between Network Byte Order (Big Endian) and Host Byte Order
2474@c %** subsub? it's a referenced page on the ipc document.
2475@c %**end of header
2476
Network byte order is big endian. Host byte order is whatever order the
local CPU uses; on many common platforms this is little endian. What is
the difference between the two?

A multi-byte value, for example a 4-byte integer, can be laid out in RAM
in two ways. In big endian (network byte order), the most significant
byte is stored at the lowest address and the least significant byte at
the highest address. In little endian it is exactly the other way around:
the least significant byte is stored at the lowest address and the most
significant byte at the highest address.

Hosts communicate by exchanging data packets over the network. For both
sides to interpret multi-byte values identically, regardless of their own
byte order, the sender must convert such values from host byte order to
network byte order before sending, and the receiver must convert them
back after receiving.

There are ten convenient functions for converting the byte order in
GNUnet:
2498
2499@table @asis
2500
@item @code{uint16_t htons (uint16_t hostshort)}
Convert a short int from host byte order to network byte order.
@item @code{uint32_t htonl (uint32_t hostlong)}
Convert a long int from host byte order to network byte order.
@item @code{uint16_t ntohs (uint16_t netshort)}
Convert a short int from network byte order to host byte order.
@item @code{uint32_t ntohl (uint32_t netlong)}
Convert a long int from network byte order to host byte order.
@item @code{unsigned long long GNUNET_ntohll (unsigned long long netlonglong)}
Convert a long long int from network byte order to host byte order.
@item @code{unsigned long long GNUNET_htonll (unsigned long long hostlonglong)}
Convert a long long int from host byte order to network byte order.
@item @code{struct GNUNET_TIME_RelativeNBO GNUNET_TIME_relative_hton (struct GNUNET_TIME_Relative a)}
Convert relative time to network byte order.
@item @code{struct GNUNET_TIME_Relative GNUNET_TIME_relative_ntoh (struct GNUNET_TIME_RelativeNBO a)}
Convert relative time from network byte order.
@item @code{struct GNUNET_TIME_AbsoluteNBO GNUNET_TIME_absolute_hton (struct GNUNET_TIME_Absolute a)}
Convert absolute time to network byte order.
@item @code{struct GNUNET_TIME_Absolute GNUNET_TIME_absolute_ntoh (struct GNUNET_TIME_AbsoluteNBO a)}
Convert absolute time from network byte order.
2526@end table
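
The effect of these conversions can be observed with a short stand-alone
sketch. It uses only the standard @code{htonl} from @file{arpa/inet.h};
@code{put_u64_nbo}, @code{get_u64_nbo} and @code{msb_first_after_htonl}
are hypothetical helpers that show the 64-bit case by hand and are not
the implementation of @code{GNUNET_htonll}.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Serialize a 64-bit value in network byte order (most significant
   byte first), independent of the byte order of the host. */
void
put_u64_nbo (unsigned char out[8], uint64_t value)
{
  for (int i = 7; i >= 0; i--)
  {
    out[i] = (unsigned char) (value & 0xff);
    value >>= 8;
  }
}

/* Inverse of put_u64_nbo(). */
uint64_t
get_u64_nbo (const unsigned char in[8])
{
  uint64_t value = 0;

  for (int i = 0; i < 8; i++)
    value = (value << 8) | in[i];
  return value;
}

/* Returns 1 if, after htonl(), the byte at the lowest address is the
   most significant byte of `host_value`. */
int
msb_first_after_htonl (uint32_t host_value)
{
  uint32_t net = htonl (host_value);
  unsigned char bytes[4];

  memcpy (bytes, &net, sizeof (net));   /* look at the raw memory */
  return bytes[0] == (unsigned char) (host_value >> 24);
}
```

On a big endian host @code{htonl} leaves the value unchanged, on a little
endian host it swaps the bytes; in both cases the memory layout after the
call is the same.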
2527
2528@c ***********************************************************************
2529
2530@node Cryptography API
2531@subsection Cryptography API
2532@c %**end of header
2533
The gnunetutil APIs provide the cryptographic primitives used in GNUnet.
GNUnet uses 2048 bit RSA keys for the session key exchange and for signing
messages by peers and most other public-key operations. Most researchers
in cryptography consider 2048 bit RSA keys as secure and practically
unbreakable for a long time. The API provides functions to create a fresh
key pair, read a private key from a file (or create a new file if the
file does not exist), encrypt, decrypt, sign, verify, and extract the
public key into a format suitable for network transmission.
2542
For the encryption of files and the actual data exchanged between peers,
GNUnet uses 256-bit AES encryption. Fresh session keys are negotiated
for every new connection. Again, there is no published technique to
break this cipher in any realistic amount of time. The API provides
functions for generation of keys, validation of keys (important for
checking that decryptions using RSA succeeded), encryption and decryption.
2549
2550GNUnet uses SHA-512 for computing one-way hash codes. The API provides
2551functions to compute a hash over a block in memory or over a file on disk.
2552
The crypto API also provides functions for randomizing a block of memory,
obtaining a single random number and for generating a permutation of the
numbers 0 to n-1. Random number generation distinguishes between WEAK and
STRONG random number quality; WEAK random numbers are pseudo-random
whereas STRONG random numbers use entropy gathered from the operating
system.
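
What "a permutation of the numbers 0 to n-1" means can be illustrated
with a Fisher-Yates shuffle. The sketch below is stand-alone and does not
use the GNUnet crypto API: @code{permute} is a hypothetical function, and
the C library's @code{rand} is used as the random source, which is at
best comparable to WEAK quality and must not be used where STRONG
randomness is required.

```c
#include <stdlib.h>

/* Fill `perm` with a pseudo-random permutation of 0..n-1 using the
   Fisher-Yates shuffle.  rand() is a weak generator and the modulo
   operation introduces a slight bias. */
void
permute (unsigned int *perm, unsigned int n)
{
  for (unsigned int i = 0; i < n; i++)
    perm[i] = i;                         /* start with the identity */
  for (unsigned int i = n; i > 1; i--)
  {
    unsigned int j = (unsigned int) rand () % i;   /* 0 <= j < i */
    unsigned int tmp = perm[i - 1];

    perm[i - 1] = perm[j];               /* swap positions i-1 and j */
    perm[j] = tmp;
  }
}
```

Every number from 0 to n-1 appears exactly once in the result, whatever
values the random source returns.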
2559
2560Finally, the crypto API provides a means to deterministically generate a
25611024-bit RSA key from a hash code. These functions should most likely not
2562be used by most applications; most importantly,
2563GNUNET_CRYPTO_rsa_key_create_from_hash does not create an RSA-key that
2564should be considered secure for traditional applications of RSA.
2565
2566@c ***********************************************************************
2567@node Message Queue API
2568@subsection Message Queue API
2569@c %**end of header
2570
2571@strong{ Introduction }@
2572Often, applications need to queue messages that
2573are to be sent to other GNUnet peers, clients or services. As all of
2574GNUnet's message-based communication APIs, by design, do not allow
2575messages to be queued, it is common to implement custom message queues
2576manually when they are needed. However, writing very similar code in
2577multiple places is tedious and leads to code duplication.
2578
2579MQ (for Message Queue) is an API that provides the functionality to
2580implement and use message queues. We intend to eventually replace all of
2581the custom message queue implementations in GNUnet with MQ.
2582
2583@strong{ Basic Concepts }@
2584The two most important entities in MQ are queues and envelopes.
2585
Every queue is backed by a specific implementation (e.g. for mesh, stream,
connection, server client, etc.) that will actually deliver the queued
messages. For convenience, some queues also allow specifying a list of
message handlers. The message queue will then also wait for incoming
messages and dispatch them appropriately.

An envelope holds the memory for a message, as well as metadata
(Where is the envelope queued? What should happen after it has been
sent?). Any envelope can only be queued in one message queue.
2595
2596@strong{ Creating Queues }@
The following is a list of currently available message queues. Note that
to avoid layering issues, message queues for higher level APIs are not
part of @code{libgnunetutil}; instead, the respective API itself provides
the queue implementation.
2601
2602@table @asis
2603
2604@item @code{GNUNET_MQ_queue_for_connection_client}
2605Transmits queued messages over a @code{GNUNET_CLIENT_Connection} handle.
2606Also supports receiving with message handlers.
2607
2608@item @code{GNUNET_MQ_queue_for_server_client}
2609Transmits queued messages over a @code{GNUNET_SERVER_Client} handle. Does
2610not support incoming message handlers.
2611
@item @code{GNUNET_MESH_mq_create}
Transmits queued messages over a @code{GNUNET_MESH_Tunnel} handle. Does
not support incoming message handlers.

@item @code{GNUNET_MQ_queue_for_callbacks}
This is the most general implementation. Instead of delivering and
receiving messages with one of GNUnet's communication APIs,
implementation callbacks are called. Refer to "Implementing Queues" for a
more detailed explanation.
2620@end table
2621
2622
2623@strong{ Allocating Envelopes }@
2624A GNUnet message (as defined by the GNUNET_MessageHeader) has three
2625parts: The size, the type, and the body.
2626
2627MQ provides macros to allocate an envelope containing a message
2628conveniently, automatically setting the size and type fields of the
2629message.
2630
2631Consider the following simple message, with the body consisting of a
2632single number value.
2635
2636@example
struct NumberMessage @{
  /** Type: GNUNET_MESSAGE_TYPE_EXAMPLE_1 */
  struct GNUNET_MessageHeader header;
  uint32_t number GNUNET_PACKED;
@};
2640@end example
2641
2642An envelope containing an instance of the NumberMessage can be
2643constructed like this:
2644
2645@example
struct GNUNET_MQ_Envelope *ev;
struct NumberMessage *msg;
ev = GNUNET_MQ_msg (msg, GNUNET_MESSAGE_TYPE_EXAMPLE_1);
msg->number = htonl (42);
2648@end example
2649
2650In the above code, @code{GNUNET_MQ_msg} is a macro. The return value is
2651the newly allocated envelope. The first argument must be a pointer to some
2652@code{struct} containing a @code{struct GNUNET_MessageHeader header}
2653field, while the second argument is the desired message type, in host
2654byte order.
2655
The @code{msg} pointer now points to an allocated message, where the
message type and the message size are already set. The message's size is
inferred from the type of the @code{msg} pointer: It will be set to
@code{sizeof (*msg)}, properly converted to network byte order.

If the message body's size is dynamic, the macro
@code{GNUNET_MQ_msg_extra} can be used to allocate an envelope whose
message has additional space allocated after the @code{msg} structure.
2664
2665If no structure has been defined for the message,
2666@code{GNUNET_MQ_msg_header_extra} can be used to allocate additional space
2667after the message header. The first argument then must be a pointer to a
2668@code{GNUNET_MessageHeader}.
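
How a macro can infer the message size from the pointer type is sketched
below in stand-alone C. This is not the MQ implementation: all names
starting with @code{Sketch}, @code{sketch_} or @code{SKETCH_} are
hypothetical, and unlike @code{GNUNET_MQ_msg} the sketch macro takes the
envelope variable as an explicit first argument.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct SketchHeader
{
  uint16_t size;   /* total message size, network byte order */
  uint16_t type;   /* message type, network byte order */
} __attribute__ ((packed));

struct SketchNumberMessage
{
  struct SketchHeader header;
  uint32_t number;
} __attribute__ ((packed));

/* Envelope: queue metadata, directly followed by the message bytes. */
struct SketchEnvelope
{
  struct SketchEnvelope *next;   /* link used while queued */
  struct SketchHeader *mh;       /* points at the message that follows */
};

/* Allocate an envelope with room for a message of `size` bytes and fill
   in the header; `type` is given in host byte order.  Returns NULL if
   `size` is smaller than a header or the allocation fails. */
struct SketchEnvelope *
sketch_msg_alloc (uint16_t size, uint16_t type)
{
  struct SketchEnvelope *ev;

  if (size < sizeof (struct SketchHeader))
    return NULL;
  ev = calloc (1, sizeof (*ev) + size);
  if (NULL == ev)
    return NULL;
  ev->mh = (struct SketchHeader *) &ev[1];
  ev->mh->size = htons (size);
  ev->mh->type = htons (type);
  return ev;
}

/* Set `ev` to a fresh envelope and `mvar` to the message inside it; the
   message size is taken from the type that `mvar` points to. */
#define SKETCH_MSG(ev, mvar, type)                                \
  ((ev) = sketch_msg_alloc (sizeof (*(mvar)), (type)),            \
   (mvar) = (NULL == (ev)) ? NULL : (void *) (ev)->mh)
```

After @code{SKETCH_MSG (ev, msg, 7)}, the caller only has to fill in the
body, for example @code{msg->number = htonl (42)}.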
2669
2670@strong{Envelope Properties}@
A few functions in MQ allow setting additional properties on envelopes:
2672
2673@table @asis
2674
@item @code{GNUNET_MQ_notify_sent}
Allows specifying a function that will be called once the envelope's
message has been sent irrevocably. An envelope can be canceled precisely
up to the point where the notify sent callback has been called.

@item @code{GNUNET_MQ_disable_corking}
No corking will be used when sending the message. Not every queue
supports this flag. By default, envelopes are sent with corking.
2683
2684@end table
2685
2686
2687@strong{Sending Envelopes}@
2688Once an envelope has been constructed, it can be queued for sending with
2689@code{GNUNET_MQ_send}.
2690
2691Note that in order to avoid memory leaks, an envelope must either be sent
2692(the queue will free it) or destroyed explicitly with
2693@code{GNUNET_MQ_discard}.
2694
2695@strong{Canceling Envelopes}@
2696An envelope queued with @code{GNUNET_MQ_send} can be canceled with
2697@code{GNUNET_MQ_cancel}. Note that after the notify sent callback has
2698been called, canceling a message results in undefined behavior.
2699Thus it is unsafe to cancel an envelope that does not have a notify sent
callback. When canceling an envelope, it is not necessary to call
@code{GNUNET_MQ_discard}, and the envelope can't be sent again.
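
The ownership rules for sending and canceling can be illustrated with a
minimal stand-alone queue. This sketch is not the MQ implementation: all
names starting with @code{Sketch} or @code{sketch_} and the callback
@code{count_sent} are hypothetical, and "transmission" is simulated by
simply removing the head of the queue.

```c
#include <stddef.h>
#include <stdlib.h>

struct SketchEnvelope
{
  struct SketchEnvelope *next;
  void (*sent_cb) (void *cls);   /* optional "notify sent" callback */
  void *sent_cls;                /* closure for sent_cb */
};

struct SketchQueue
{
  struct SketchEnvelope *head;
  struct SketchEnvelope *tail;
};

/* Append `ev` to the queue; the queue now owns the envelope. */
void
sketch_send (struct SketchQueue *q, struct SketchEnvelope *ev)
{
  ev->next = NULL;
  if (NULL == q->head)
    q->head = ev;
  else
    q->tail->next = ev;
  q->tail = ev;
}

/* Unlink and free a still-queued envelope; no separate discard call is
   needed.  Returns 0 on success, -1 if `ev` is not in the queue. */
int
sketch_cancel (struct SketchQueue *q, struct SketchEnvelope *ev)
{
  struct SketchEnvelope *prev = NULL;

  for (struct SketchEnvelope *pos = q->head; NULL != pos; pos = pos->next)
  {
    if (pos == ev)
    {
      if (NULL == prev)
        q->head = ev->next;
      else
        prev->next = ev->next;
      if (q->tail == ev)
        q->tail = prev;
      free (ev);
      return 0;
    }
    prev = pos;
  }
  return -1;
}

/* Simulate transmission of the head envelope: notify, then free it.
   Returns 1 if an envelope was handled, 0 if the queue was empty. */
int
sketch_transmit_next (struct SketchQueue *q)
{
  struct SketchEnvelope *ev = q->head;

  if (NULL == ev)
    return 0;
  q->head = ev->next;
  if (NULL == q->head)
    q->tail = NULL;
  if (NULL != ev->sent_cb)
    ev->sent_cb (ev->sent_cls);   /* from here on, canceling is invalid */
  free (ev);
  return 1;
}

/* Example callback: counts sent envelopes through its closure. */
void
count_sent (void *cls)
{
  (*(unsigned int *) cls)++;
}
```

In this sketch the callback is the only way for the caller to learn that
the envelope pointer has become invalid, which mirrors why canceling
without a notify sent callback is unsafe.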
2702
2703@strong{ Implementing Queues }@
2704@code{TODO}
2705
2706@c ***********************************************************************
2707@node Service API
2708@subsection Service API
2709@c %**end of header
2710
Most GNUnet code lives in the form of services. Services are processes
that offer an API for other components of the system to build on. Those
other components can be command-line tools for users, graphical user
interfaces or other services. Services provide their API using an IPC
protocol. To do so, each service must listen on either a TCP port or a
UNIX domain socket; the service implementation uses the server API for
this. This use of the server API is exposed directly to the users of the
service API. Thus, when using the service API, one usually also uses
large parts of the server API. The service API provides various
convenience functions, such as parsing command-line arguments and the
configuration file, which are not found in the server API.
The dual to the service/server API is the client API, which can be used to
access services.
2724
2725The most common way to start a service is to use the GNUNET_SERVICE_run
2726function from the program's main function. GNUNET_SERVICE_run will then
2727parse the command line and configuration files and, based on the options
2728found there, start the server. It will then give back control to the main
2729program, passing the server and the configuration to the
2730GNUNET_SERVICE_Main callback. GNUNET_SERVICE_run will also take care of
2731starting the scheduler loop. If this is inappropriate (for example,
2732because the scheduler loop is already running), GNUNET_SERVICE_start and
2733related functions provide an alternative to GNUNET_SERVICE_run.
2734
2735When starting a service, the service_name option is used to determine
2736which sections in the configuration file should be used to configure the
2737service. A typical value here is the name of the src/ sub-directory, for
2738example "statistics". The same string would also be given to
2739GNUNET_CLIENT_connect to access the service.
2740
2741Once a service has been initialized, the program should use the
2742GNUNET_SERVICE_Main callback to register message handlers using
2743GNUNET_SERVER_add_handlers. The service will already have registered a
2744handler for the "TEST" message.
2745
2746The option bitfield (enum GNUNET_SERVICE_Options) determines how a service
2747should behave during shutdown. There are three key strategies:
2748
2749@table @asis
2750
2751@item instant
2752(GNUNET_SERVICE_OPTION_NONE) Upon receiving the shutdown signal from the
2753scheduler, the service immediately terminates the server, closing all existing client connections.
2754@item manual
2755(GNUNET_SERVICE_OPTION_MANUAL_SHUTDOWN) The service does nothing by itself
2756during shutdown. The main program will need to take the appropriate
2757action by calling GNUNET_SERVER_destroy or GNUNET_SERVICE_stop (depending
2758on how the service was initialized) to terminate the service. This method
2759is used by gnunet-service-arm and is rather uncommon.
2760@item soft
2761(GNUNET_SERVICE_OPTION_SOFT_SHUTDOWN) Upon receiving the shutdown signal
2762from the scheduler, the service immediately tells the server to stop
2763listening for incoming clients. Requests from normal existing clients are
2764still processed and the server/service terminates once all normal clients
2765have disconnected. Clients that are not expected to ever disconnect (such
2766as clients that monitor performance values) can be marked as 'monitor'
2767clients using GNUNET_SERVER_client_mark_monitor. Those clients will
2768continue to be processed until all 'normal' clients have disconnected.
2769Then, the server will terminate, closing the monitor connections.
2770This mode is for example used by 'statistics', allowing existing 'normal'
2771clients to set (possibly persistent) statistic values before terminating.
2772
2773@end table
2774
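The three strategies can be summarized as a small decision function. The
following is only a self-contained sketch: the enum, the struct and both
functions are hypothetical stand-ins invented for illustration and are not
part of the GNUnet service or server API.

@example
#include <assert.h>

/* Hypothetical stand-ins for the three strategies described above;
   these are NOT the values of enum GNUNET_SERVICE_Options. */
enum ShutdownStrategy
@{
  STRATEGY_INSTANT,
  STRATEGY_MANUAL,
  STRATEGY_SOFT
@};

/* Hypothetical bookkeeping of the currently connected clients. */
struct ClientCounts
@{
  unsigned int normal;  /* ordinary clients */
  unsigned int monitor; /* clients marked as 'monitor' */
@};

/* Does the server keep accepting new clients after the shutdown signal? */
static int
keeps_listening (enum ShutdownStrategy s)
@{
  return STRATEGY_MANUAL == s; /* manual: nothing changes by itself */
@}

/* May the server terminate now, given that shutdown was signalled? */
static int
may_terminate (enum ShutdownStrategy s, const struct ClientCounts *c)
@{
  switch (s)
  @{
  case STRATEGY_INSTANT:
    return 1;              /* drop all clients at once */
  case STRATEGY_MANUAL:
    return 0;              /* the main program decides */
  case STRATEGY_SOFT:
    return 0 == c->normal; /* monitor clients do not count */
  @}
  return 0;
@}
@end example

With the soft strategy, a server with two normal clients and one monitor
client keeps running; once both normal clients are gone it may terminate,
even though the monitor client is still connected.
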
2775@c ***********************************************************************
2776@node Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
2777@subsection Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
2778@c %**end of header
2779
2780A commonly used data structure in GNUnet is a (multi-)hash map. It is most
2781often used to map a peer identity to some data structure, but also to map
2782arbitrary keys to values (for example to track requests in the distributed
2783hash table or in file-sharing). As it is commonly used, the hash map is
2784actually sometimes responsible for a large share of GNUnet's overall
2785memory consumption (for some processes, 30% is not uncommon). The
2786following text documents some API quirks (and their implications for
2787applications) that were recently introduced to minimize the footprint of
2788the hash map.
2789
2790
2791@c ***********************************************************************
2792@menu
2793* Analysis::
2794* Solution::
2795* Migration::
2796* Conclusion::
2797* Availability::
2798@end menu
2799
2800@node Analysis
2801@subsubsection Analysis
2802@c %**end of header
2803
2804The main reason for the "excessive" memory consumption by the hash map is
2805that GNUnet uses 512-bit cryptographic hash codes --- and the
2806(multi-)hash map also uses the same 512-bit 'struct GNUNET_HashCode'. As
2807a result, storing just the keys requires 64 bytes of memory for each key.
2808As some applications like to keep a large number of entries in the hash
2809map (after all, that's what maps are good for), 64 bytes per hash is
2810significant: keeping a pointer to the value and having a linked list for
2811collisions consume between 8 and 16 bytes, and 'malloc' may add about the
2812same overhead per allocation, putting us in the 16 to 32 bytes per entry
2813ballpark. Adding a 64-byte key then triples the overall memory
2814requirement for the hash map.
2815
2816To make things "worse", most of the time storing the key in the hash map
2817is not required: it is typically already in memory elsewhere! In most
2818cases, the values stored in the hash map are some application-specific
2819struct that @emph{also} contains the hash. Here is a simplified example:
2820
@example
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};

// ...
val = GNUNET_malloc (sizeof (struct MyValue));
val->key = key;
val->my_data = 42;
GNUNET_CONTAINER_multihashmap_put (map, &key, val, ...);
@end example
2829
2830This is a common pattern as later the entries might need to be removed,
2831and at that time it is convenient to have the key immediately at hand:
2832
2833@example
2834GNUNET_CONTAINER_multihashmap_remove (map, &val->key, val);
2835@end example
2836
2837
2838Note that here we end up with two times 64 bytes for the key, plus maybe
283964 bytes total for the rest of the 'struct MyValue' and the map entry in
2840the hash map. The resulting redundant storage of the key increases
2841overall memory consumption per entry from the "optimal" 128 bytes to 192
2842bytes. This is not just an extreme example: overheads in practice are
2843actually sometimes close to those highlighted in this example. This is
2844especially true for maps with a significant number of entries, as there
2845we tend to really try to keep the entries small.
2846
2847@c ***********************************************************************
2848@node Solution
2849@subsubsection Solution
2850@c %**end of header
2851
2852The solution that has now been implemented is to @strong{optionally}
2853allow the hash map to not make a (deep) copy of the hash but instead have
2854a pointer to the hash/key in the entry. This reduces the memory
2855consumption for the key from 64 bytes to 4 or 8 bytes. However, it can
2856also only work if the key is actually stored in the entry (which is the
2857case most of the time) and if the entry does not modify the key (which in
2858all of the code I'm aware of has always been the case if the key is
2859stored in the entry). Finally, when the client stores an entry in the
2860hash map, it @strong{must} provide a pointer to the key within the entry,
2861not just a pointer to a transient location of the key. If
2862the client code does not meet these requirements, the result is a dangling
2863pointer and undefined behavior of the (multi-)hash map API.
2864
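To make the saving concrete, the following self-contained sketch compares
two map entry layouts. All struct and function names here are
hypothetical and chosen for illustration only; they do not reflect the
actual internals of the (multi-)hash map, and @code{struct HashCode} is
merely a 64-byte stand-in for @code{struct GNUNET_HashCode}.

@example
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* 64-byte stand-in for the 512-bit 'struct GNUNET_HashCode'. */
struct HashCode
@{
  unsigned char bits[64];
@};

/* Hypothetical entry that stores a deep copy of the key. */
struct EntryWithCopy
@{
  struct HashCode key;
  void *value;
  struct EntryWithCopy *next;
@};

/* Hypothetical entry that only points at the key. */
struct EntryWithPointer
@{
  const struct HashCode *key;
  void *value;
  struct EntryWithPointer *next;
@};

/* Application value that embeds its own key, as in the example above. */
struct MyValue
@{
  struct HashCode key;
  unsigned int my_data;
@};

/* Bytes saved per entry by pointing at the key instead of copying it. */
static size_t
bytes_saved_per_entry (void)
@{
  return sizeof (struct EntryWithCopy) - sizeof (struct EntryWithPointer);
@}

/* Create an entry whose key pointer refers to the key stored inside
   'val', so that the key lives exactly as long as the value does. */
static struct EntryWithPointer *
make_entry (struct MyValue *val)
@{
  struct EntryWithPointer *e = malloc (sizeof (*e));

  if (NULL == e)
    return NULL;
  e->key = &val->key;
  e->value = val;
  e->next = NULL;
  return e;
@}
@end example

On common platforms the difference is 64 bytes minus the size of one
pointer per entry. The sketch also shows why the key must live inside the
value: the entry merely borrows the key, so freeing or overwriting it
while the entry exists leaves a dangling pointer.
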
2865@c ***********************************************************************
2866@node Migration
2867@subsubsection Migration
2868@c %**end of header
2869
2870To use the new feature, first check that the values contain the respective
2871key (and never modify it). Then, all calls to
2872@code{GNUNET_CONTAINER_multihashmap_put} on the respective map must be
2873audited and most likely changed to pass a pointer into the value's struct.
2874For the initial example, the new code would look like this:
2875
@example
struct MyValue @{
  struct GNUNET_HashCode key;
  unsigned int my_data;
@};

// ...
val = GNUNET_malloc (sizeof (struct MyValue));
val->key = key;
val->my_data = 42;
GNUNET_CONTAINER_multihashmap_put (map, &val->key, val, ...);
@end example
2884
2885
2886Note that @code{&key} was changed to @code{&val->key} in the argument to
2887the @code{put} call. This is critical as often @code{key} is on the stack
2888or in some other transient data structure and thus having the hash map
2889keep a pointer to @code{key} would not work. Only the key inside of
2890@code{val} has the same lifetime as the entry in the map (this must of
2891course be checked as well). Naturally, @code{val->key} must be
2892initialized before the @code{put} call. Once all @code{put} calls have
2893been converted and double-checked, you can change the call to create the
2894hash map from
2895
@example
map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_NO);
@end example
2900
2901to
2902
2903@example
2904map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_YES);
2905@end example
2906
2907If everything was done correctly, you now use about 60 bytes less memory
2908per entry in @code{map}. However, if now (or in the future) any call to
2909@code{put} does not ensure that the given key is valid until the entry is
2910removed from the map, undefined behavior is likely to be observed.
2911
2912@c ***********************************************************************
2913@node Conclusion
2914@subsubsection Conclusion
2915@c %**end of header
2916
2917The new optimization is often applicable and can result in a
2918reduction in memory consumption of up to 30% in practice. However, it
2919makes the code less robust as additional invariants are imposed on the
2920multi hash map client. Thus applications should refrain from enabling the
2921new mode unless the resulting performance increase is deemed significant
2922enough. In particular, it should generally not be used in new code (wait
2923at least until benchmarks exist).
2924
2925@c ***********************************************************************
2926@node Availability
2927@subsubsection Availability
2928@c %**end of header
2929
2930The new multi hash map code was committed in SVN 24319 (will be in GNUnet
29310.9.4). Various subsystems (transport, core, dht, file-sharing) were
2932previously audited and modified to take advantage of the new capability.
2933In particular, memory consumption of the file-sharing service is expected
2934to drop by 20-30% due to this change.
2935
2936@c ***********************************************************************
2937@node The CONTAINER_MDLL API
2938@subsection The CONTAINER_MDLL API
2939@c %**end of header
2940
2941This text documents the GNUNET_CONTAINER_MDLL API. The
2942GNUNET_CONTAINER_MDLL API is similar to the GNUNET_CONTAINER_DLL API in
2943that it provides operations for the construction and manipulation of
2944doubly-linked lists. The key difference to the (simpler) DLL-API is that
2945the MDLL-version allows a single element (instance of a "struct") to be
2946in multiple linked lists at the same time.
2947
2948Like the DLL API, the MDLL API stores (most of) the data structures for
2949the doubly-linked list with the respective elements; only the 'head' and
2950'tail' pointers are stored "elsewhere" --- and the application needs to
2951provide the locations of head and tail to each of the calls in the
2952MDLL API. The key difference for the MDLL API is that the "next" and
2953"previous" pointers in the struct can no longer be simply called "next"
2954and "prev" --- after all, the element may be in multiple doubly-linked
2955lists, so we cannot just have one "next" and one "prev" pointer!
2956
2957The solution is to have multiple fields that must have a name of the
2958format "next_XX" and "prev_XX" where "XX" is the name of one of the
2959doubly-linked lists. Here is a simple example:
2960
@example
struct MyMultiListElement @{
  struct MyMultiListElement *next_ALIST;
  struct MyMultiListElement *prev_ALIST;
  struct MyMultiListElement *next_BLIST;
  struct MyMultiListElement *prev_BLIST;
  void *data;
@};
@end example
2971
2972
2973Note that by convention, we use all-uppercase letters for the list names.
2974In addition, the program needs to have a location for the head and tail
2975pointers for both lists, for example:
2976
2977@example
2978static struct MyMultiListElement *head_ALIST;
2979static struct MyMultiListElement *tail_ALIST;
2980static struct MyMultiListElement *head_BLIST;
2981static struct MyMultiListElement *tail_BLIST;
2982@end example
2983
2984
2985Using the MDLL-macros, we can now insert an element into the ALIST:
2986
2987@example
2988GNUNET_CONTAINER_MDLL_insert (ALIST, head_ALIST, tail_ALIST, element);
2989@end example
2990
2991
2992Passing "ALIST" as the first argument to MDLL specifies which of the
2993next/prev fields in the 'struct MyMultiListElement' should be used. The
2994extra "ALIST" argument and the "_ALIST" in the names of the
2995next/prev-members are the only differences between the MDLL and DLL-API.
2996Like the DLL-API, the MDLL-API offers functions for inserting (at head,
2997at tail, after a given element) and removing elements from the list.
2998Iterating over the list should be done by directly accessing the
2999"next_XX" and/or "prev_XX" members.
3000
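The way the list name selects the struct members can be illustrated with
token pasting. The macro below is a simplified stand-in written for this
sketch (it is @strong{not} the actual @code{GNUNET_CONTAINER_MDLL_insert}
definition, and @code{SKETCH_MDLL_insert}, @code{fill_lists} and the
counting helpers are hypothetical names). The sketch inserts at the head
and iterates by following the "next_XX" members directly.

@example
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for an MDLL "insert at head" macro: the first
   argument is pasted onto "next_" and "prev_" to pick the members. */
#define SKETCH_MDLL_insert(mdll, head, tail, element) do @{ \
    (element)->next_ ## mdll = (head);                      \
    (element)->prev_ ## mdll = NULL;                        \
    if (NULL == (tail))                                     \
      (tail) = (element);                                   \
    else                                                    \
      (head)->prev_ ## mdll = (element);                    \
    (head) = (element);                                     \
  @} while (0)

struct MyMultiListElement
@{
  struct MyMultiListElement *next_ALIST;
  struct MyMultiListElement *prev_ALIST;
  struct MyMultiListElement *next_BLIST;
  struct MyMultiListElement *prev_BLIST;
  void *data;
@};

static struct MyMultiListElement *head_ALIST;
static struct MyMultiListElement *tail_ALIST;
static struct MyMultiListElement *head_BLIST;
static struct MyMultiListElement *tail_BLIST;

/* Put 'a' and 'b' into the ALIST, but only 'a' into the BLIST. */
static void
fill_lists (struct MyMultiListElement *a, struct MyMultiListElement *b)
@{
  SKETCH_MDLL_insert (ALIST, head_ALIST, tail_ALIST, a);
  SKETCH_MDLL_insert (ALIST, head_ALIST, tail_ALIST, b);
  SKETCH_MDLL_insert (BLIST, head_BLIST, tail_BLIST, a);
@}

/* Count the elements of the ALIST by following next_ALIST. */
static unsigned int
count_ALIST (void)
@{
  unsigned int n = 0;

  for (struct MyMultiListElement *e = head_ALIST; NULL != e;
       e = e->next_ALIST)
    n++;
  return n;
@}

/* Count the elements of the BLIST by following next_BLIST. */
static unsigned int
count_BLIST (void)
@{
  unsigned int n = 0;

  for (struct MyMultiListElement *e = head_BLIST; NULL != e;
       e = e->next_BLIST)
    n++;
  return n;
@}
@end example

After @code{fill_lists}, element @code{a} is a member of both lists at
the same time, which is exactly what the plain DLL API cannot express.
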
3001@c ***********************************************************************
3002@node The Automatic Restart Manager (ARM)
3003@section The Automatic Restart Manager (ARM)
3004@c %**end of header
3005
3006GNUnet's Automatic Restart Manager (ARM) is the GNUnet service responsible
3007for system initialization and service babysitting. ARM starts and halts
3008services, detects configuration changes and restarts services impacted by
3009the changes as needed. It's also responsible for restarting services in
3010case of crashes and is planned to incorporate automatic debugging for
3011diagnosing service crashes, giving developers insight into crash
3012reasons. The purpose of this document is to give GNUnet developers an idea
3013about how ARM works and how to interact with it.
3014
3015@menu
3016* Basic functionality::
3017* Key configuration options::
3018* Availability2::
3019* Reliability::
3020@end menu
3021
3022@c ***********************************************************************
3023@node Basic functionality
3024@subsection Basic functionality
3025@c %**end of header
3026
3027@itemize @bullet
3028@item ARM source code can be found under "src/arm".@ Service processes are
3029managed by the functions in "gnunet-service-arm.c" which is controlled
3030with "gnunet-arm.c" (main function in that file is ARM's entry point).
3031
3032@item The functions responsible for communicating with ARM, and for starting and
3033stopping services (including the ARM service itself), are provided by the
3034ARM API "arm_api.c".@ The function GNUNET_ARM_connect() returns to the caller
3035an ARM handle after setting it to the caller's context (configuration and
3036scheduler in use). This handle can be used afterwards by the caller to
3037communicate with ARM. The functions GNUNET_ARM_start_service() and
3038GNUNET_ARM_stop_service() are used for starting and stopping services,
3039respectively.
3040
3041@item A typical example of using these basic ARM services can be found in
3042file test_arm_api.c. The test case connects to ARM, starts it, then uses
3043it to start a service "resolver", stops the "resolver" then stops "ARM".
3044@end itemize
3045
3046@c ***********************************************************************
3047@node Key configuration options
3048@subsection Key configuration options
3049@c %**end of header
3050
3051Configurations for ARM and services should be available in a .conf file
3052(as an example, see test_arm_api_data.conf). When running ARM, the
3053configuration file to use should be passed to the command:@
3054@code{@ $ gnunet-arm -s -c configuration_to_use.conf@ }@
3055If no configuration is passed, the default configuration file will be used
3056(see GNUNET_PREFIX/share/gnunet/defaults.conf which is created from
3057contrib/defaults.conf).@ Each service has a section that starts with
3058the service name in square brackets, for example: "[arm]".
3059The following options configure how ARM configures or interacts with the
3060various services:
3061
3062@table @asis
3063
3064@item PORT Port number on which the service is listening for incoming TCP
3065connections. ARM will start the service if it notices a request on
3066this port.
3067
3068@item HOSTNAME Specifies on which host the service is deployed. Note
3069that ARM can only start services that are running on the local system
3070(but will not check that the hostname matches the local machine name).
3071This option is used by the @code{gnunet_client_lib.h} implementation to
3072determine which system to connect to. The default is "localhost".
3073
3074@item BINARY The name of the service binary file.
3075
3076@item OPTIONS Options to be passed to the service.
3077
3078@item PREFIX A command to prepend to the actual command, for example,
3079to run a service under "valgrind" or "gdb".
3080
3081@item DEBUG Run in debug mode (very verbose).
3082
3083@item AUTOSTART ARM will listen on the UNIX domain socket and/or TCP port of
3084the service and start the service on-demand.
3085
3086@item FORCESTART ARM will always start this service when the peer
3087is started.
3088
3089@item ACCEPT_FROM IPv4 addresses the service accepts connections from.
3090
3091@item ACCEPT_FROM6 IPv6 addresses the service accepts connections from.
3092
3093@end table
3094
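As an illustration, a service section using some of these options might
look like the following fragment. The service name "myservice", the
binary name, the port number and all other values are hypothetical and
only show the general shape of such a section; consult the shipped
defaults.conf for real values.

@example
[myservice]
PORT = 2999
HOSTNAME = localhost
BINARY = gnunet-service-myservice
AUTOSTART = YES
ACCEPT_FROM = 127.0.0.1;
ACCEPT_FROM6 = ::1;
PREFIX = valgrind
@end example
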
3095
3096Options that impact the operation of ARM overall are in the "[arm]"
3097section. ARM is a normal service and has (except for AUTOSTART) all of the
3098options that other services do. In addition, ARM has the
3099following options:
3100
3101@table @asis
3102
3103@item GLOBAL_PREFIX Command to be prepended to all services that are
3104going to run.
3105
3106@item GLOBAL_POSTFIX Global option that will be supplied to all the
3107services that are going to run.
3108
3109@end table
3110
3111@c ***********************************************************************
3112@node Availability2
3113@subsection Availability
3114@c %**end of header
3115
3116As mentioned before, one of the features provided by ARM is starting
3117services on demand. Consider the example of one service "client" that
3118wants to connect to another service, a "server". The "client" will ask ARM
3119to run the "server". ARM starts the "server". The "server" starts
3120listening for incoming connections. The "client" will establish a
3121connection with the "server". Then they will start to communicate with
3122each other.@ One problem with that scheme is that it's slow!@
3123The "client" service wants to communicate with the "server" service at
3124once and is not willing to wait for it to be started and listening for
3125incoming connections before serving its request.@ One solution to that
3126problem would be for ARM to start all services as default services. That
3127would solve the problem, yet it is not quite practical, as some of the
3128services started this way might never be used, or be used only after a
3129relatively long time.@
3130The approach followed by ARM to solve this problem is as follows:
3131
3132@itemize @bullet
3133
3134@item For each service having a PORT field in the configuration file and
3135that is not one of the default services (a service that accepts incoming
3136connections from clients), ARM creates listening sockets for all addresses
3137associated with that service.
3138
3139@item The "client" will immediately establish a connection with
3140the "server".
3141
3142@item ARM --- pretending to be the "server" --- will listen on the
3143respective port and notice the incoming connection from the "client"
3144(but will not accept it itself).
3145
3146@item Once there is an incoming connection, ARM will start the "server",
3147passing on the listen sockets (now, the service is started and can do its
3148work).
3149
3150@item Other client services can now connect directly to the
3151"server".
3152
3153@end itemize
3154
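The central trick in the steps above is to notice a pending connection on
a listening socket without accepting it, so that the socket can later be
handed to the real service. The following sketch shows this technique
with plain POSIX calls; it is not taken from gnunet-service-arm.c, and
the function names @code{open_listen_socket} and
@code{client_is_waiting} are hypothetical.

@example
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <poll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Create a TCP socket listening on 127.0.0.1 with a port chosen by the
   operating system.  The bound address is stored in 'addr'.
   Returns the descriptor, or -1 on error. */
static int
open_listen_socket (struct sockaddr_in *addr)
@{
  socklen_t len = sizeof (*addr);
  int fd = socket (AF_INET, SOCK_STREAM, 0);

  if (fd < 0)
    return -1;
  memset (addr, 0, sizeof (*addr));
  addr->sin_family = AF_INET;
  addr->sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr->sin_port = 0;
  if ( (0 != bind (fd, (struct sockaddr *) addr, sizeof (*addr))) ||
       (0 != listen (fd, 5)) ||
       (0 != getsockname (fd, (struct sockaddr *) addr, &len)) )
  @{
    close (fd);
    return -1;
  @}
  return fd;
@}

/* Check whether a client is waiting on the listening socket 'lfd'.
   Returns 1 if so, 0 if none arrived within 'timeout_ms', -1 on error.
   The connection is deliberately NOT accepted here. */
static int
client_is_waiting (int lfd, int timeout_ms)
@{
  struct pollfd pfd = @{ .fd = lfd, .events = POLLIN @};
  int ret = poll (&pfd, 1, timeout_ms);

  if (ret < 0)
    return -1;
  return ( (ret > 0) && (0 != (pfd.revents & POLLIN)) ) ? 1 : 0;
@}
@end example

Once @code{client_is_waiting} reports a pending client, a manager in the
style of ARM would start the service and pass it the still-listening
descriptor; the service then accepts the queued connection itself.
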
3155@c ***********************************************************************
3156@node Reliability
3157@subsection Reliability
3158
3159One of the features provided by ARM is the automatic restart of crashed
3160services.@ ARM needs to know which of the running services died. Function
3161"gnunet-service-arm.c/maint_child_death()" is responsible for that. The
3162function is scheduled to run upon receiving a SIGCHLD signal. The
3163function then iterates over ARM's list of running services and checks
3164which services have died (crashed). ARM restarts all crashed
3165services.@
3166Now consider the case of a service with a serious problem that causes it
3167to crash each time it's started by ARM. If ARM kept blindly restarting
3168such a service, we would get the pattern
3169start-crash-restart-crash-restart-crash and so forth, which is of course
3170not practical.@
3171For that reason, ARM schedules the service to be restarted after waiting
3172for some delay that grows exponentially with each crash/restart of that
3173service.@ To clarify the idea, consider the following example:
3174
3175@itemize @bullet
3176
3177@item Service S crashed.
3178
3179@item ARM receives the SIGCHLD and inspects its list of services to find
3180the dead one(s).
3181
3182@item ARM finds S dead and schedules it for restarting after its "backoff"
3183time, which is initially set to 1ms. ARM then doubles the backoff time
3184corresponding to S (now backoff(S) = 2ms).
3185
3186@item Because there is a severe problem with S, it crashes again.
3187
3188@item Again ARM receives the SIGCHLD and detects that it's S again that's
3189crashed. ARM schedules it for restarting, but after its new backoff time
3190(which became 2ms), and doubles its backoff time (now backoff(S) = 4ms).
3191
3192@item And so on, until backoff(S) reaches a certain threshold
3193(EXPONENTIAL_BACKOFF_THRESHOLD is set to half an hour). After reaching it,
3194backoff(S) will remain half an hour, so ARM won't spend much time
3195trying to restart a problematic service.
3196@end itemize
3197
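The backoff rule from the example can be written down in a few lines.
This is a minimal sketch using the values stated above (1ms initial
delay, half an hour as the cap); the struct, the constant names and the
function @code{on_crash} are hypothetical and do not mirror the code in
gnunet-service-arm.c.

@example
#include <assert.h>
#include <stdint.h>

/* Values taken from the text: initial delay and upper bound. */
#define INITIAL_BACKOFF_MS 1ULL
#define BACKOFF_THRESHOLD_MS (30ULL * 60ULL * 1000ULL)

/* Hypothetical per-service state kept by the restart manager. */
struct ServiceState
@{
  uint64_t backoff_ms; /* delay to use for the next restart */
@};

/* Called when the service has crashed.  Returns the delay to wait
   before this restart and doubles the delay for the next crash,
   never exceeding the threshold. */
static uint64_t
on_crash (struct ServiceState *s)
@{
  uint64_t delay = s->backoff_ms;

  s->backoff_ms *= 2;
  if (s->backoff_ms > BACKOFF_THRESHOLD_MS)
    s->backoff_ms = BACKOFF_THRESHOLD_MS;
  return delay;
@}
@end example

Starting from 1ms, the delay exceeds half an hour after about twenty
consecutive crashes and stays at that value from then on.
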
3198@c ***********************************************************************
3199@node GNUnet's TRANSPORT Subsystem
3200@section GNUnet's TRANSPORT Subsystem
3201@c %**end of header
3202
3203This chapter documents how the GNUnet transport subsystem works. The
3204GNUnet transport subsystem consists of three main components: the
3205transport API (the interface used by the rest of the system to access the
3206transport service), the transport service itself (most of the interesting
3207functions, such as choosing transports, happen here) and the transport
3208plugins. A transport plugin is a concrete implementation for how two
3209GNUnet peers communicate; many plugins exist, for example for
3210communication via TCP, UDP, HTTP, HTTPS and others. Finally, the
3211transport subsystem uses supporting code, especially the NAT/UPnP
3212library to help with tasks such as NAT traversal.
3213
3214Key tasks of the transport service include:
3215
3216@itemize @bullet
3217
3218@item Create our HELLO message, notify clients and neighbours if our HELLO
3219changes (using NAT library as necessary)
3220
3221@item Validate HELLOs from other peers (send PING), allow other peers to
3222validate our HELLO's addresses (send PONG)
3223
3224@item Upon request, establish connections to other peers (using address
3225selection from ATS subsystem) and maintain them (again using PINGs and
3226PONGs) as long as desired
3227
3228@item Accept incoming connections, give ATS service the opportunity to
3229switch communication channels
3230
3231@item Notify clients about peers that have connected to us or that have
3232been disconnected from us
3233
3234@item If a (stateful) connection goes down unexpectedly (without explicit
3235DISCONNECT), quickly attempt to recover (without notifying clients) but do
3236notify clients quickly if reconnecting fails
3237
3238@item Send (payload) messages arriving from clients to other peers via
3239transport plugins and receive messages from other peers, forwarding
3240those to clients
3241
3242@item Enforce inbound traffic limits (using flow-control if it is
3243applicable); outbound traffic limits are enforced by CORE, not by us (!)
3244
3245@item Enforce restrictions on P2P connections as specified by the blacklist
3246configuration and blacklisting clients
3247@end itemize
3248
3249
3250Note that the term "clients" in the list above really refers to the
3251GNUnet-CORE service, as CORE is typically the only client of the
3252transport service.
3253
3254@menu
3255* Address validation protocol::
3256@end menu
3257
3258@node Address validation protocol
3259@subsection Address validation protocol
3260@c %**end of header
3261
3262This section documents how the GNUnet transport service validates
3263connections with other peers. It is a high-level description of the
3264protocol necessary to understand the details of the implementation. It
3265should be noted that when we talk about PING and PONG messages in this
3266section, we refer to transport-level PING and PONG messages, which are
3267different from core-level PING and PONG messages (both in implementation
3268and function).
3269
3270The goal of transport-level address validation is to minimize the chances
3271of a successful man-in-the-middle attack against GNUnet peers on the
3272transport level. Such an attack would not allow the adversary to decrypt
3273the P2P transmissions, but a successful attacker could at least measure
3274traffic volumes and latencies (raising the adversary's capabilities to
3275those of a global passive adversary in the worst case). The scenario we
3276are concerned about is an attacker, Mallory, giving a HELLO to Alice that
3277claims to be for Bob, but contains Mallory's IP address instead of Bob's
3278(for some transport). Mallory would then forward the traffic to Bob (by
3279initiating a connection to Bob and claiming to be Alice). As a further
3280complication, the scheme has to work even if, say, Alice is behind a NAT
3281without traversal support and hence has no address of her own (and thus
3282Alice must always initiate the connection to Bob).
3283
3284An additional constraint is that HELLO messages do not contain a
3285cryptographic signature since other peers must be able to edit
3286(i.e. remove) addresses from the HELLO at any time (this was not true in
3287GNUnet 0.8.x). A basic @strong{assumption} is that each peer knows the
3288set of possible network addresses that it @strong{might} be reachable
3289under (so for example, the external IP address of the NAT plus the LAN
3290address(es) with the respective ports).
3291
3292The solution is the following. If Alice wants to validate that a given
3293address for Bob is valid (i.e. is actually established @strong{directly}
3294with the intended target), she sends a PING message over that connection
3295to Bob. Note that in this case, Alice initiated the connection so only
3296she knows which address was used for sure (Alice may be behind NAT, so
3297whatever address Bob sees may not be an address Alice knows she has). Bob
3298checks that the address given in the PING is actually one of his addresses
3299(does not belong to Mallory), and if it is, sends back a PONG (with a
3300signature that says that Bob owns/uses the address from the PING). Alice
3301checks the signature and is happy if it is valid and the address in the
3302PONG is the address she used. This is similar to the 0.8.x protocol where
3303the HELLO contained a signature from Bob for each address used by Bob.
3304Here, the purpose code for the signature is
3305@code{GNUNET_SIGNATURE_PURPOSE_TRANSPORT_PONG_OWN}. After this, Alice will
3306remember Bob's address and consider the address valid for a while (12h in
3307the current implementation). Note that after this exchange, Alice only
3308considers Bob's address to be valid; the connection itself is not
3309considered 'established'. In particular, Alice may have many addresses
3310for Bob that she considers valid.
3311
3312The PONG message is protected with a nonce/challenge against replay
3313attacks and uses an expiration time for the signature (but those are
3314almost implementation details).
3315
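The checks Alice performs on a PONG can be sketched as follows. This is
an illustration of the logic described above only: the struct is a
hypothetical, simplified view and not the wire format, the signature
verification is assumed to have happened elsewhere, and the function name
@code{pong_validates_address} is invented for this example.

@example
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of a received PONG. */
struct PongSketch
@{
  uint32_t challenge;  /* nonce echoed from our PING */
  uint64_t expiration; /* time until which the signature is valid */
  const char *address; /* address that Bob claims to own */
  int signature_ok;    /* result of a signature check done elsewhere */
@};

/* Returns 1 if Alice may consider 'used_address' validated. */
static int
pong_validates_address (const struct PongSketch *pong,
                        uint32_t sent_challenge,
                        const char *used_address,
                        uint64_t now)
@{
  if (! pong->signature_ok)
    return 0; /* not signed by Bob */
  if (pong->challenge != sent_challenge)
    return 0; /* replayed or unrelated PONG */
  if (pong->expiration < now)
    return 0; /* signature has expired */
  if (0 != strcmp (pong->address, used_address))
    return 0; /* not the address Alice actually used */
  return 1;
@}
@end example

Only when all four checks pass would Alice remember the address as valid
for the validity period mentioned above.
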
3316@node NAT library
3317@section NAT library
3318@c %**end of header
3319
3320The goal of the GNUnet NAT library is to provide a general-purpose API for
3321NAT traversal @strong{without} third-party support. So protocols that
3322involve contacting a third peer to help establish a connection between
3323two peers are outside of the scope of this API. That does not mean that
3324GNUnet doesn't support involving a third peer (we can do this with the
3325distance-vector transport or using application-level protocols), it just
3326means that the NAT API is not concerned with this possibility. The API is
3327written so that it will work for IPv6-NAT in the future as well as
3328current IPv4-NAT. Furthermore, the NAT API is always used, even for peers
3329that are not behind NAT --- in that case, the mapping provided is simply
3330the identity.
3331
3332NAT traversal is initiated by calling @code{GNUNET_NAT_register}. Given a
3333set of addresses that the peer has locally bound to (TCP or UDP), the NAT
3334library will return (via callback) a (possibly longer) list of addresses
3335the peer @strong{might} be reachable under. Internally, depending on the
3336configuration, the NAT library will try to punch a hole (using UPnP) or
3337just "know" that the NAT was manually punched and generate the respective
3338external IP address (the one that should be globally visible) based on
3339the given information.
3340
3341The NAT library also supports ICMP-based NAT traversal. Here, the other
3342peer can request connection-reversal by this peer (in this special case,
3343the peer is even allowed to configure a port number of zero). If the NAT
3344library detects a connection-reversal request, it returns the respective
3345target address to the client as well. It should be noted that
3346connection-reversal is currently only intended for TCP, so other plugins
3347@strong{must} pass @code{NULL} for the reversal callback. Naturally, the
3348NAT library also supports requesting connection reversal from a remote
3349peer (@code{GNUNET_NAT_run_client}).
3350
3351Once initialized, the NAT handle can be used to test if a given address is
3352possibly a valid address for this peer (@code{GNUNET_NAT_test_address}).
3353This is used for validating our addresses when generating PONGs.
3354
3355Finally, the NAT library contains an API to test if our NAT configuration
3356is correct. Using @code{GNUNET_NAT_test_start} @strong{before} binding to
3357the respective port, the NAT library can be used to test if the
3358configuration works. The test function acts as a local client, initializes
3359the NAT traversal and then contacts a @code{gnunet-nat-server} (running by
3360default on @code{gnunet.org}) and asks for a connection to be established.
3361This way, it is easy to test if the current NAT configuration is valid.
3362
3363@node Distance-Vector plugin
3364@section Distance-Vector plugin
3365@c %**end of header
3366
3367The Distance Vector (DV) transport is a transport mechanism that allows
3368peers to act as relays for each other, thereby connecting peers that would
3369otherwise be unable to connect. This gives a larger connection set to
3370applications that may work better with more peers to choose from (for
3371example, File Sharing and/or DHT).
3372
3373The Distance Vector transport essentially has two functions. The first is
3374"gossiping" connection information about more distant peers to directly
3375connected peers. The second is taking messages intended for non-directly
3376connected peers and encapsulating them in a DV wrapper that contains the
3377required information for routing the message through forwarding peers. Via
3378gossiping, optimal routes through the known DV neighborhood are discovered
3379and utilized and the message encapsulation provides some benefits in
3380addition to simply getting the message from the correct source to the
3381proper destination.
3382
3383The gossiping function of DV provides an up to date routing table of
3384peers that are available up to some number of hops. We call this a
3385fisheye view of the network (like a fish, nearby objects are known while
3386more distant ones unknown). Gossip messages are sent only to directly
3387connected peers, but they are sent about other known peers within the
3388"fisheye distance". Whenever two peers connect, they immediately gossip
3389to each other about their appropriate other neighbors. They also gossip
3390about the newly connected peer to previously
3391connected neighbors. In order to keep the routing tables up to date,
3392disconnect notifications are propogated as gossip as well (because
3393disconnects may not be sent/received, timeouts are also used remove
3394stagnant routing table entries).
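
The bookkeeping behind such a fisheye view can be illustrated with a
small Python sketch. It is not taken from the GNUnet sources: the class
@code{RoutingTable}, its method names and the hop limit
@code{FISHEYE_DEPTH} are assumptions made for this example, and the
timeouts for stale entries are left out.

@example
FISHEYE_DEPTH = 3  # assumed hop limit, not a value from GNUnet


class RoutingTable:
    """Maps a distant peer to (next_hop, hop_count)."""

    def __init__(self, max_hops=FISHEYE_DEPTH):
        self.max_hops = max_hops
        self.routes = dict()

    def add_direct(self, peer):
        # A directly connected peer is one hop away via itself.
        self.routes[peer] = (peer, 1)

    def on_gossip(self, neighbor, peer, hops_from_neighbor):
        """Handle 'neighbor reaches peer in hops_from_neighbor hops'."""
        hops = hops_from_neighbor + 1
        if hops > self.max_hops:
            return False  # beyond the fisheye distance: ignore
        known = self.routes.get(peer)
        if known is not None and known[1] <= hops:
            return False  # the existing route is at least as short
        self.routes[peer] = (neighbor, hops)
        return True

    def on_disconnect(self, neighbor):
        # Drop the neighbor and every route that used it as next hop.
        stale = [p for p, (via, _) in self.routes.items()
                 if via == neighbor]
        for peer in stale:
            del self.routes[peer]
        return stale
@end example

In this sketch a peer would call @code{on_gossip} for every gossip
message received from a direct neighbor and @code{on_disconnect} when a
direct connection is lost.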
3395
3396Routing of messages via DV is straightforward. When the DV transport is
3397notified of a message destined for a non-direct neighbor, the appropriate
3398forwarding peer is selected, and the base message is encapsulated in a DV
3399message which contains information about the initial peer and the intended
3400recipient. At each forwarding hop, the initial peer is validated (the
3401forwarding peer ensures that it has the initial peer in its neighborhood,
3402otherwise the message is dropped). Next the base message is
3403re-encapsulated in a new DV message for the next hop in the forwarding
3404chain (or delivered to the current peer, if it has arrived at the
3405destination).
3406
Assume a three-peer network with peers Alice, Bob and Carol. Assume that
Alice <-> Bob and Bob <-> Carol are direct (e.g. over TCP or UDP
transports) connections, but that Alice cannot directly connect to Carol.
This may be the case due to NAT or firewall restrictions, or perhaps
based on one of the peers' respective configurations. If the Distance
Vector transport is enabled on all three peers, it will automatically
discover (from the gossip protocol) that Alice and Carol can connect via
Bob and provide a "virtual" Alice <-> Carol connection. Routing between
Alice and Carol happens as follows: Alice creates a message destined for
Carol and notifies the DV transport about it. The DV transport at Alice
looks up Carol in the routing table and finds that the message must be
sent through Bob for Carol. The message is encapsulated, setting Alice as
the initiator and Carol as the destination, and sent to Bob. Bob receives
the message, verifies that both Alice and Carol are known to Bob, and
re-wraps the message in a new DV message for Carol. The DV transport at
Carol receives this message, unwraps the original message, and delivers
it to Carol as though it came directly from Alice.
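
The forwarding step in this example can be sketched in a few lines of
Python. This is only an illustration of the logic described above, not
the DV wire format: the tuple layout and the names @code{dv_wrap} and
@code{dv_forward} are assumptions made for the example.

@example
def dv_wrap(sender, recipient, payload):
    # A DV message is modelled as a plain tuple in this sketch.
    return ("DV", sender, recipient, payload)


def dv_forward(me, next_hop_of, message):
    """Process one DV message at peer 'me'.

    next_hop_of maps each peer known to 'me' to the direct neighbor
    used to reach it.  The result is ("drop", reason),
    ("deliver", sender, payload) or ("send", next_hop, new_message).
    """
    _tag, sender, recipient, payload = message
    # The initiator must be in our neighborhood, otherwise drop.
    if sender not in next_hop_of:
        return ("drop", "unknown initiator")
    if recipient == me:
        return ("deliver", sender, payload)
    if recipient not in next_hop_of:
        return ("drop", "unknown recipient")
    # Re-encapsulate for the next hop; initiator and recipient are kept.
    rewrapped = dv_wrap(sender, recipient, payload)
    return ("send", next_hop_of[recipient], rewrapped)
@end example

With Bob knowing both Alice and Carol as direct neighbors, a message
wrapped by Alice for Carol yields a @code{send} result at Bob and a
@code{deliver} result at Carol.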
3424
3425@node SMTP plugin
3426@section SMTP plugin
3427@c %**end of header
3428
This section describes the new SMTP transport plugin for GNUnet as it
exists in the 0.7.x and 0.8.x branches. SMTP support is currently not
available in GNUnet 0.9.x. This section also describes the transport layer
abstraction (as it existed in 0.7.x and 0.8.x) in more detail and gives
some benchmarking results. The performance results presented are quite
old and may be outdated at this point.
3435
3436@itemize @bullet
3437@item Why use SMTP for a peer-to-peer transport?
@item How does it work?
3439@item How do I configure my peer?
3440@item How do I test if it works?
3441@item How fast is it?
3442@item Is there any additional documentation?
3443@end itemize
3444
3445
3446@menu
3447* Why use SMTP for a peer-to-peer transport?::
3448* How does it work?::
3449* How do I configure my peer?::
3450* How do I test if it works?::
3451* How fast is it?::
3452@end menu
3453
3454@node Why use SMTP for a peer-to-peer transport?
3455@subsection Why use SMTP for a peer-to-peer transport?
3456@c %**end of header
3457
3458There are many reasons why one would not want to use SMTP:
3459
3460@itemize @bullet
@item SMTP uses more bandwidth than TCP, UDP or HTTP.
3462@item SMTP has a much higher latency.
3463@item SMTP requires significantly more computation (encoding and decoding
3464time) for the peers.
3465@item SMTP is significantly more complicated to configure.
3466@item SMTP may be abused by tricking GNUnet into sending mail to@
3467non-participating third parties.
3468@end itemize
3469
3470So why would anybody want to use SMTP?
3471@itemize @bullet
3472@item SMTP can be used to contact peers behind NAT boxes (in virtual
3473private networks).
@item SMTP can be used to circumvent policies that limit or prohibit
peer-to-peer traffic by masquerading as "legitimate" traffic.
3476@item SMTP uses E-mail addresses which are independent of a specific IP,
3477which can be useful to address peers that use dynamic IP addresses.
3478@item SMTP can be used to initiate a connection (e.g. initial address
3479exchange) and peers can then negotiate the use of a more efficient
3480protocol (e.g. TCP) for the actual communication.
3481@end itemize
3482
3483In summary, SMTP can for example be used to send a message to a peer
3484behind a NAT box that has a dynamic IP to tell the peer to establish a
3485TCP connection to a peer outside of the private network. Even an
3486extraordinary overhead for this first message would be irrelevant in this
3487type of situation.
3488
3489@node How does it work?
3490@subsection How does it work?
3491@c %**end of header
3492
When a GNUnet peer needs to send a message to another GNUnet peer that has
advertised (only) an SMTP transport address, GNUnet base64-encodes the
message and sends it in an E-mail to the advertised address. The
advertisement contains a filter which is placed in the E-mail header,
such that the receiving host can filter the tagged E-mails and forward
them to the GNUnet peer process. The filter can be specified individually
by each peer and can be changed over time. This makes it impossible to
censor GNUnet E-mail messages by searching for a generic filter.
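
The encoding step can be illustrated with the Python standard library.
This is a minimal sketch and not the code of the SMTP plugin: the
function names @code{build_gnunet_mail} and @code{extract_payload} are
assumptions made for this example, and the real plugin may lay out the
E-mail differently.

@example
import base64
from email.message import EmailMessage


def build_gnunet_mail(payload, sender, recipient, filter_line):
    """Wrap a binary message in an E-mail (illustrative only)."""
    # filter_line is the advertised filter, e.g. "X-mailer: GNUnet".
    name, _, value = filter_line.partition(":")
    mail = EmailMessage()
    mail["From"] = sender
    mail["To"] = recipient
    mail[name.strip()] = value.strip()
    mail.set_content(base64.b64encode(payload).decode("ascii"))
    return mail


def extract_payload(mail):
    """Inverse operation on the receiving side."""
    return base64.b64decode(mail.get_content())
@end example

The receiving host only has to match the filter header to recognize the
E-mail and can then hand the decoded payload to the peer process.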
3501
3502@node How do I configure my peer?
3503@subsection How do I configure my peer?
3504@c %**end of header
3505
3506First, you need to configure @code{procmail} to filter your inbound E-mail
3507for GNUnet traffic. The GNUnet messages must be delivered into a pipe, for
3508example @code{/tmp/gnunet.smtp}. You also need to define a filter that is
3509used by @command{procmail} to detect GNUnet messages. You are free to
3510choose whichever filter you like, but you should make sure that it does
3511not occur in your other E-mail. In our example, we will use
3512@code{X-mailer: GNUnet}. The @code{~/.procmailrc} configuration file then
3513looks like this:
3514
@example
:0:
* ^X-mailer: GNUnet
/tmp/gnunet.smtp
# where do you want your other e-mail delivered to
# (default: /var/spool/mail/)
:0:
/var/spool/mail/
@end example
3522
3523After adding this file, first make sure that your regular E-mail still
3524works (e.g. by sending an E-mail to yourself). Then edit the GNUnet
3525configuration. In the section @code{SMTP} you need to specify your E-mail
3526address under @code{EMAIL}, your mail server (for outgoing mail) under
3527@code{SERVER}, the filter (X-mailer: GNUnet in the example) under
3528@code{FILTER} and the name of the pipe under @code{PIPE}.@ The completed
3529section could then look like this:
3530
@example
EMAIL = me@@mail.gnu.org
MTU = 65000
SERVER = mail.gnu.org:25
FILTER = "X-mailer: GNUnet"
PIPE = /tmp/gnunet.smtp
@end example
3535
3536Finally, you need to add @code{smtp} to the list of @code{TRANSPORTS} in
3537the @code{GNUNETD} section. GNUnet peers will use the E-mail address that
3538you specified to contact your peer until the advertisement times out.
3539Thus, if you are not sure if everything works properly or if you are not
3540planning to be online for a long time, you may want to configure this
3541timeout to be short, e.g. just one hour. For this, set
3542@code{HELLOEXPIRES} to @code{1} in the @code{GNUNETD} section.
3543
This should be all, but you will probably want to test it first.
3545
3546@node How do I test if it works?
3547@subsection How do I test if it works?
3548@c %**end of header
3549
Any transport can be subjected to some rudimentary tests using the
@code{gnunet-transport-check} tool. The tool sends a message to the local
node via the transport and checks that a valid message is received. While
this test does not involve other peers and cannot check if firewalls or
other network obstacles prohibit proper operation, it is a good
test case for the SMTP transport since it exercises nearly all of
the functionality.
3557
3558@code{gnunet-transport-check} should only be used without running
3559@code{gnunetd} at the same time. By default, @code{gnunet-transport-check}
3560tests all transports that are specified in the configuration file. But
3561you can specifically test SMTP by giving the option
3562@code{--transport=smtp}.
3563
3564Note that this test always checks if a transport can receive and send.
3565While you can configure most transports to only receive or only send
3566messages, this test will only work if you have configured the transport
3567to send and receive messages.
3568
3569@node How fast is it?
3570@subsection How fast is it?
3571@c %**end of header
3572
We have measured the performance of the UDP, TCP and SMTP transport layers
directly and when used from an application using the GNUnet core.
Measuring just the transport layer gives a better view of the actual
overhead of the protocol, whereas evaluating the transport from the
application puts the overhead into perspective from a practical point of
view.
3579
The loopback measurements of the SMTP transport were performed on three
different machines spanning a range of modern SMTP configurations. We
used a PIII-800 running RedHat 7.3 with the Purdue Computer Science
configuration which includes filters for spam. We also used a Xeon 2 GHz
with a vanilla RedHat 8.0 sendmail configuration. Furthermore, we used
qmail on a PIII-1000 running Sorcerer GNU Linux (SGL). The numbers for
UDP and TCP are provided using the SGL configuration. The qmail benchmark
uses qmail's internal filtering whereas the sendmail benchmarks rely on
procmail to filter and deliver the mail.

We used the transport layer to send a message of b bytes (excluding
transport protocol headers) directly to the local machine. This way,
network latency and packet loss on the wire have no impact on the
timings. n messages were sent sequentially over the transport layer,
sending message i+1 after the i-th message was received. All messages
were sent over the same connection and the time to establish the
connection was not taken into account since this overhead is minuscule in
practice, as long as a connection is used for a significant number of
messages.
3597
3598@multitable @columnfractions .20 .15 .15 .15 .15 .15
3599@headitem Transport @tab UDP @tab TCP @tab SMTP (Purdue sendmail) @tab SMTP (RH 8.0) @tab SMTP (SGL qmail)
3600@item 11 bytes @tab 31 ms @tab 55 ms @tab 781 s @tab 77 s @tab 24 s
3601@item 407 bytes @tab 37 ms @tab 62 ms @tab 789 s @tab 78 s @tab 25 s
3602@item 1,221 bytes @tab 46 ms @tab 73 ms @tab 804 s @tab 78 s @tab 25 s
3603@end multitable
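
The sequential measurement procedure described above (message i+1 is
sent only after message i was received) can be sketched as follows. This
is only an illustration in Python over a local socket pair. It is not
the benchmark program that produced the numbers in the table, and the
function name @code{sequential_roundtrip} is an assumption made for this
example.

@example
import socket
import time


def sequential_roundtrip(n, b):
    """Send n messages of b bytes over a local socket pair."""
    left, right = socket.socketpair()
    message = bytes(b)
    start = time.monotonic()
    try:
        for _ in range(n):
            # b is assumed to fit into the socket buffer.
            left.sendall(message)
            received = 0
            # Message i+1 is only sent after message i has arrived.
            while received < b:
                chunk = right.recv(b - received)
                if not chunk:
                    raise ConnectionError("peer closed the socket")
                received += len(chunk)
    finally:
        left.close()
        right.close()
    return time.monotonic() - start
@end example

Because the connection is created before the clock starts, the time to
establish it is excluded, as in the measurements above.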
3604
The benchmarks show that UDP and TCP are, as expected, both significantly
faster compared with any of the SMTP services. Among the SMTP
implementations, there can be significant differences depending on the
SMTP configuration. Filtering with an external tool like procmail that
needs to re-parse its configuration for each mail can be very expensive.
Applying spam filters can also significantly impact the performance of
the underlying SMTP implementation. The microbenchmark shows that SMTP
can be a viable solution for initiating peer-to-peer sessions: a couple of
seconds to connect to a peer are probably not even going to be noticed by
users.

The next benchmark measures the possible throughput for a
transport. Throughput can be measured by sending multiple messages in
parallel and measuring packet loss. Note that not only UDP but also the
TCP transport can actually lose messages since the TCP implementation
drops messages if the @code{write} to the socket would block. While the
SMTP protocol never drops messages itself, it is often so
slow that only a fraction of the messages can be sent and received in the
given time bounds. For this benchmark we report the message loss after
allowing t time for sending m messages. If messages were not sent (or
received) after an overall timeout of t, they were considered lost. The
benchmark was performed using two Xeon 2 GHz machines running RedHat 8.0
with sendmail. The machines were connected with a direct 100 MBit Ethernet
connection.

Figures udp1200, tcp1200 and smtp-MTUs show that the
throughput for messages of size 1,200 octets is 2,343 kbps, 3,310 kbps
and 6 kbps for UDP, TCP and SMTP respectively. The high per-message
overhead of SMTP can be improved by increasing the MTU; for example, an
MTU of 12,000 octets improves the throughput to 13 kbps as figure
smtp-MTUs shows. Our research paper has some more details on the
benchmarking results.
3633
3634@node Bluetooth plugin
3635@section Bluetooth plugin
3636@c %**end of header
3637
3638This page describes the new Bluetooth transport plugin for GNUnet. The
3639plugin is still in the testing stage so don't expect it to work
3640perfectly. If you have any questions or problems just post them here or
3641ask on the IRC channel.
3642
3643@itemize @bullet
3644@item What do I need to use the Bluetooth plugin transport?
@item How does it work?
3646@item What possible errors should I be aware of?
3647@item How do I configure my peer?
3648@item How can I test it?
3649@end itemize
3650
3651
3652
3653@menu
3654* What do I need to use the Bluetooth plugin transport?::
3655* How does it work2?::
3656* What possible errors should I be aware of?::
3657* How do I configure my peer2?::
3658* How can I test it?::
3659* The implementation of the Bluetooth transport plugin::
3660@end menu
3661
3662@node What do I need to use the Bluetooth plugin transport?
3663@subsection What do I need to use the Bluetooth plugin transport?
3664@c %**end of header
3665
If you are a Linux user and you want to use the Bluetooth transport plugin
you should install the BlueZ development libraries (if they aren't already
installed). For instructions about how to install the libraries you should
check out the BlueZ site
(@uref{http://www.bluez.org/, http://www.bluez.org}). If you don't know if
you have the necessary libraries, don't worry, just run the GNUnet
configure script and you will be able to see a notification at the end
which will warn you if you don't have the necessary libraries.
3674
If you are a Windows user you should have installed
@emph{MinGW}/@emph{MSys2} with the latest updates (especially the
@emph{ws2bth} header). If this is your first build of GNUnet on Windows
you should check out the SBuild repository. It will semi-automatically
assemble a @emph{MinGW}/@emph{MSys2} installation with a lot of extra
packages which are needed for the GNUnet build, which will ease your
work. Finally you just have to be sure that you have the correct drivers
for your Bluetooth device installed and that your device is on and in a
discoverable mode. The Windows Bluetooth Stack supports only the RFCOMM
protocol so we cannot turn on your device programmatically!
3685
3686@c FIXME: Change to unique title
3687@node How does it work2?
3688@subsection How does it work2?
3689@c %**end of header
3690
3691The Bluetooth transport plugin uses virtually the same code as the WLAN
3692plugin and only the helper binary is different. The helper takes a single
3693argument, which represents the interface name and is specified in the
3694configuration file. Here are the basic steps that are followed by the
3695helper binary used on Linux:
3696
@itemize @bullet
@item it verifies that the name corresponds to a Bluetooth interface name
@item it verifies that the interface is up (if it is not, it tries to bring
it up)
@item it tries to enable the page and inquiry scan in order to make the
device discoverable and to accept incoming connection requests.
@emph{The above operations require root access so you should start the
transport plugin with root privileges.}
@item it finds an available port number and registers an SDP service which
other devices use to find out which port the server is listening on, and
switches the socket to listening mode
@item it sends a HELLO message with its address
@item finally it forwards traffic from the reading sockets to STDOUT
and from STDIN to the writing socket
@end itemize
3712
Once in a while the device will make an inquiry scan to discover the
nearby devices and it will send them HELLO messages at random for peer
discovery.
3716
3717@node What possible errors should I be aware of?
3718@subsection What possible errors should I be aware of?
3719@c %**end of header
3720
@emph{This section is intended for Linux users.}

There are many ways in which things could go wrong, but I will try to
present some debugging tools and some scenarios.
3725
3726@itemize @bullet
3727
3728@item @code{bluetoothd -n -d} : use this command to enable logging in the
3729foreground and to print the logging messages
3730
@item @code{hciconfig}: can be used to configure the Bluetooth devices.
If you run it without any arguments it will print information about the
state of the interfaces. So if you receive an error that the device
couldn't be brought up you should try to bring it up manually and see if
it works (use @code{hciconfig -a hciX up}). If you can't and the
Bluetooth address has the form 00:00:00:00:00:00 it means that there is
something wrong with the D-Bus daemon or with the Bluetooth daemon. Use
the @code{bluetoothd} tool to see the logs.

@item @code{sdptool}: can be used to control and interrogate SDP servers.
If you encounter problems regarding the SDP server (like the SDP server is
down) you should check that the D-Bus daemon is running correctly and
that the Bluetooth daemon started correctly (use the @code{bluetoothd}
tool). Also, sometimes the SDP service could work but somehow the device
couldn't register its service. Use @code{sdptool browse [dev-address]} to
see if the service is registered. There should be a service with the name
of the interface and GNUnet as provider.
3748
3749@item @code{hcitool} : another useful tool which can be used to configure
3750the device and to send some particular commands to it.
3751
3752@item @code{hcidump} : could be used for low level debugging
3753@end itemize
3754
3755@c FIXME: A more unique name
3756@node How do I configure my peer2?
3757@subsection How do I configure my peer2?
3758@c %**end of header
3759
3760On Linux, you just have to be sure that the interface name corresponds to
3761the one that you want to use. Use the @code{hciconfig} tool to check that.
3762By default it is set to hci0 but you can change it.
3763
3764A basic configuration looks like this:
3765
@example
[transport-bluetooth]
# Name of the interface (typically hciX)
INTERFACE = hci0
# Real hardware, no testing
TESTMODE = 0
TESTING_IGNORE_KEYS = ACCEPT_FROM;
@end example
3773
3774In order to use the Bluetooth transport plugin when the transport service
3775is started, you must add the plugin name to the default transport service
3776plugins list. For example:
3777
@example
[transport]
...
PLUGINS = dns bluetooth
...
@end example
3781
If you want to use only the Bluetooth plugin, set
@emph{PLUGINS = bluetooth}.

On Windows, you cannot specify which device to use. The only thing that
you should do is to add @emph{bluetooth} to the plugins list of the
transport service.
3788
3789@node How can I test it?
3790@subsection How can I test it?
3791@c %**end of header
3792
If you have two Bluetooth devices on the same machine running Linux you
must:
3795
3796@itemize @bullet
3797
@item create two different configuration files (one which will use the
first interface (@emph{hci0}) and the other which will use the second
interface (@emph{hci1})). Let's name them @emph{peer1.conf} and
@emph{peer2.conf}.

@item run @emph{gnunet-peerinfo -c peerX.conf -s} in order to generate the
peers' private keys. The @strong{X} must be replaced with 1 or 2.
3805
3806@item run @emph{gnunet-arm -c peerX.conf -s -i=transport} in order to
3807start the transport service. (Make sure that you have "bluetooth" on the
3808transport plugins list if the Bluetooth transport service doesn't start.)
3809
3810@item run @emph{gnunet-peerinfo -c peer1.conf -s} to get the first peer's
3811ID. If you already know your peer ID (you saved it from the first
3812command), this can be skipped.
3813
3814@item run @emph{gnunet-transport -c peer2.conf -p=PEER1_ID -s} to start
3815sending data for benchmarking to the other peer.
3816
3817@end itemize
3818
3819
3820This scenario will try to connect the second peer to the first one and
3821then start sending data for benchmarking.
3822
On Windows you cannot test the plugin functionality using two Bluetooth
devices from the same machine because after you install the drivers
conflicts occur between the Bluetooth stacks. (At least that is
what happened on my machine: I wasn't able to use the Bluesoleil stack and
the WIDCOMM one at the same time).
3828
If you have two different machines and your configuration files are good
you can use the same scenario presented at the beginning of this section.
3831
3832Another way to test the plugin functionality is to create your own
3833application which will use the GNUnet framework with the Bluetooth
3834transport service.
3835
3836@node The implementation of the Bluetooth transport plugin
3837@subsection The implementation of the Bluetooth transport plugin
3838@c %**end of header
3839
This section describes the implementation of the Bluetooth transport plugin.

First I want to remind you that the Bluetooth transport plugin uses
virtually the same code as the WLAN plugin and only the helper binary is
different. Also the purpose of the helper binary from the Bluetooth
transport plugin is the same as the one used for the WLAN transport
plugin: it accesses the interface and then it forwards traffic in both
directions between the Bluetooth interface and stdin/stdout of the
process involved.

The Bluetooth transport plugin can be used on both Linux and Windows
platforms.
3852
3853@itemize @bullet
3854@item Linux functionality
3855@item Windows functionality
3856@item Pending Features
3857@end itemize
3858
3859
3860
3861@menu
3862* Linux functionality::
3863* THE INITIALIZATION::
3864* THE LOOP::
3865* Details about the broadcast implementation::
3866* Windows functionality::
3867* Pending features::
3868@end menu
3869
3870@node Linux functionality
3871@subsubsection Linux functionality
3872@c %**end of header
3873
In order to implement the plugin functionality on Linux I used the BlueZ
stack. For the communication with the other devices I used the RFCOMM
protocol. Also I used the HCI protocol to gain some control over the
device. The helper binary takes a single argument (the name of the
Bluetooth interface) and runs in two stages:
3879
3880@c %** 'THE INITIALIZATION' should be in bigger letters or stand out, not
3881@c %** starting a new section?
3882@node THE INITIALIZATION
3883@subsubsection THE INITIALIZATION
3884
@itemize @bullet
@item first, it checks if we have root privileges
(@emph{Remember that we need to have root privileges in order to be able
to bring the interface up if it is down or to change its state.}).

@item second, it verifies if the interface with the given name exists.

@strong{If the interface with that name exists and it is a Bluetooth
interface:}

@item it creates an RFCOMM socket which will be used for listening and
calls the @emph{open_device} method

In the @emph{open_device} method, it:
@itemize @bullet
@item creates an HCI socket used to send control events to the device
@item searches for the device ID using the interface name
@item saves the device MAC address
@item checks if the interface is down and tries to bring it up
@item checks if the interface is in discoverable mode and tries to make it
discoverable
@item closes the HCI socket and binds the RFCOMM one
@item switches the RFCOMM socket to listening mode
@item registers the SDP service (the service will be used by the other
devices to get the port on which this device is listening)
@end itemize

@item drops the root privileges

@strong{If the interface is not a Bluetooth interface the helper exits
with a suitable error}
@end itemize
3917
3918@c %** Same as for @node entry above
3919@node THE LOOP
3920@subsubsection THE LOOP
3921
The helper binary uses a list where it saves all the connected neighbour
devices (@emph{neighbours.devices}) and two buffers (@emph{write_pout} and
@emph{write_std}). The first message which is sent is a control message
with the device's MAC address in order to announce the peer's presence to
the neighbours. Here is a short description of what happens in the main
loop:
3928
@itemize @bullet
@item Every time it receives something from STDIN it processes
the data and saves the message in the first buffer (@emph{write_pout}).
When it has something in the buffer, it gets the destination address from
the buffer, searches for the destination address in the list (if there is
no connection with that device, it creates a new one and saves it to the
list) and sends the message.
@item Every time it receives something on the listening socket it
accepts the connection and saves the socket in a list of reading
sockets.
@item Every time it receives something from a reading
socket it parses the message, verifies the CRC and saves it in the
@emph{write_std} buffer in order to be sent later to STDOUT.
@end itemize
3942
So in the main loop we use the select function to wait until one of the
file descriptors saved in one of the two file descriptor sets is
ready to use. The first set (@emph{rfds}) represents the reading set and
it can contain the list with the reading sockets, the STDIN file
descriptor or the listening socket. The second set (@emph{wfds}) is the
writing set and it can contain the sending socket or the STDOUT file
descriptor. After the select function returns, we check which file
descriptor is ready to use and we do what is supposed to be done for that
kind of event. @emph{For example:} if it is the listening socket then we
accept a new connection and save the socket in the reading list; if it is
the STDOUT file descriptor, then we write to STDOUT the message from the
@emph{write_std} buffer.
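
One iteration of such a select-based loop can be sketched in Python as
follows. The sketch is a simplified illustration and not the helper's C
code: the function name @code{loop_once} and its parameters are
assumptions, the sending socket and the @emph{write_pout} buffer are
omitted, and received data is queued without parsing or CRC checks. It
is meant for POSIX systems, where select also accepts pipe descriptors.

@example
import os
import select


def loop_once(listener, readers, stdin_fd, stdout_fd, write_std,
              timeout=1.0):
    """Run one iteration; return the bytes read from stdin_fd."""
    # readers: connected sockets, write_std: bytearray for stdout.
    rfds = [listener, stdin_fd] + readers
    wfds = [stdout_fd] if write_std else []
    ready_r, ready_w, _ = select.select(rfds, wfds, [], timeout)
    from_stdin = b""
    for fd in ready_r:
        if fd is listener:
            conn, _addr = listener.accept()
            readers.append(conn)        # new incoming connection
        elif fd == stdin_fd:
            from_stdin = os.read(stdin_fd, 4096)
        else:
            data = fd.recv(4096)
            if data:
                write_std.extend(data)  # queue for stdout
            else:
                readers.remove(fd)      # peer closed the connection
                fd.close()
    if stdout_fd in ready_w:
        written = os.write(stdout_fd, bytes(write_std))
        del write_std[:written]
    return from_stdin
@end example

Data queued during one iteration is written to the output descriptor in
a later iteration, once select reports that descriptor as writable.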
3955
To find out which port a device is listening on we connect to the local
SDP server and search for the registered service for that device.
3958
3959@emph{You should be aware of the fact that if the device fails to connect
3960to another one when trying to send a message it will attempt one more
3961time. If it fails again, then it skips the message.}
3962@emph{Also you should know that the transport Bluetooth plugin has
3963support for @strong{broadcast messages}.}
3964
3965@node Details about the broadcast implementation
3966@subsubsection Details about the broadcast implementation
3967@c %**end of header
3968
First I want to point out that the broadcast functionality for the CONTROL
messages is not implemented in a conventional way. Since the inquiry scan
takes too long and it would take some time to send a message to all the
discoverable devices I decided to tackle the problem in a different way.
Here is how I did it:
3974
3975@itemize @bullet
3976@item If it is the first time when I have to broadcast a message I make an
3977inquiry scan and save all the devices' addresses to a vector.
3978@item After the inquiry scan ends I take the first address from the list
3979and I try to connect to it. If it fails, I try to connect to the next one.
3980If it succeeds, I save the socket to a list and send the message to the
3981device.
3982@item When I have to broadcast another message, first I search on the list
3983for a new device which I'm not connected to. If there is no new device on
3984the list I go to the beginning of the list and send the message to the
3985old devices. After 5 cycles I make a new inquiry scan to check out if
3986there are new discoverable devices and save them to the list. If there
3987are no new discoverable devices I reset the cycling counter and go again
3988through the old list and send messages to the devices saved in it.
3989@end itemize
3990
3991@strong{Therefore}:
3992
3993@itemize @bullet
3994@item every time when I have a broadcast message I look up on the list
3995for a new device and send the message to it
3996@item if I reached the end of the list for 5 times and I'm connected to
3997all the devices from the list I make a new inquiry scan.
3998@emph{The number of the list's cycles after an inquiry scan could be
3999increased by redefining the MAX_LOOPS variable}
4000@item when there are no new devices I send messages to the old ones.
4001@end itemize
4002
4003Doing so, the broadcast control messages will reach the devices but with
4004delay.
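
The cycling scheme can be sketched in Python as shown below. This is one
possible reading of the steps above, simplified to a plain round-robin
over the device list. It is not the helper's actual bookkeeping. Only
@code{MAX_LOOPS} is a name from the text; the class @code{Broadcaster}
and the two callables passed to it are assumptions made for this
example.

@example
MAX_LOOPS = 5  # list cycles between two inquiry scans


class Broadcaster:
    def __init__(self, inquiry_scan, connect):
        self.inquiry_scan = inquiry_scan  # returns found addresses
        self.connect = connect            # address -> socket or None
        self.devices = []                 # addresses found so far
        self.sockets = dict()             # address -> open socket
        self.position = 0
        self.loops = 0

    def _rescan(self):
        for address in self.inquiry_scan():
            if address not in self.devices:
                self.devices.append(address)
        self.loops = 0

    def next_target(self):
        """Pick the device that gets the next broadcast message."""
        if not self.devices or self.loops >= MAX_LOOPS:
            self._rescan()
        for _ in range(len(self.devices)):
            address = self.devices[self.position]
            self.position += 1
            if self.position == len(self.devices):
                self.position = 0
                self.loops += 1
            if address not in self.sockets:
                sock = self.connect(address)
                if sock is None:
                    continue  # connection failed: try the next one
                self.sockets[address] = sock
            return address, self.sockets[address]
        return None
@end example

Each broadcast message goes to a single device, so a message reaches
all devices only after a full pass over the list, which is the delay
mentioned above.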
4005
4006@emph{NOTICE:} When I have to send a message to a certain device first I
4007check on the broadcast list to see if we are connected to that device. If
4008not we try to connect to it and in case of success we save the address and
4009the socket on the list. If we are already connected to that device we
4010simply use the socket.
4011
4012@node Windows functionality
4013@subsubsection Windows functionality
4014@c %**end of header
4015
For Windows I decided to use the Microsoft Bluetooth stack which has the
advantage of being included as standard since Windows XP SP2. The main
disadvantage is that it only supports the RFCOMM protocol so we will not
be able to have low-level control over the Bluetooth device. Therefore it
is the user's responsibility to check that the device is up and in
discoverable mode. Also there are no tools which could be used for
debugging in order to read the data coming from and going to a Bluetooth
device, which obviously hindered my work. Another thing that slowed down
the implementation of the plugin (besides that I wasn't too familiar with
the win32 API) was that there were some bugs in MinGW regarding
Bluetooth. Now they are solved but you should keep in mind that you
should have the latest updates (especially the @emph{ws2bth} header).
4028
Besides the fact that it uses the Windows Sockets, the Windows
implementation follows the same principles as the Linux one:

@itemize @bullet
@item It has an initialization part where it initializes the
Windows Sockets, creates an RFCOMM socket which will be bound and switched
to listening mode, and registers an SDP service. In the Microsoft
Bluetooth API there are two ways to work with SDP:
@itemize @bullet
@item an easy way which works with very simple service records
@item a hard way which is useful when you need to update or to delete the
record
@end itemize
@end itemize
4043
Since I only needed the SDP service to find out which port the device
is listening on and that did not change, I decided to use the easy way.
In order to register the service I used the @emph{WSASetService} function
and I generated the @emph{Universally Unique Identifier} with the
Windows tool @emph{guidgen.exe}.
4049
4050In the loop section the only difference from the Linux implementation is
4051that I used the GNUNET_NETWORK library for functions like @emph{accept},
4052@emph{bind}, @emph{connect} or @emph{select}. I decided to use the
4053GNUNET_NETWORK library because I also needed to interact with the STDIN
4054and STDOUT handles and on Windows the select function is only defined for
4055sockets, and it will not work for arbitrary file handles.
4056
Another difference between the Linux and Windows implementations is that
on Linux the Bluetooth address is represented in 48 bits, while on Windows
it is represented in 64 bits. Therefore I had to make some changes to the
@emph{plugin_transport_wlan} header.
4061
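The width difference can be illustrated with a minimal sketch. This is
not the plugin's actual code: it assumes a hypothetical
@code{struct addr48} that stores the six address bytes least-significant
first, and packs them into the low 48 bits of a 64-bit integer. All type
and function names here are illustrative only.

@verbatim
```c
#include <stdint.h>

/* Hypothetical 48-bit address: six bytes, least-significant byte first. */
struct addr48 {
  uint8_t b[6];
};

/* Pack the six bytes into the low 48 bits of a 64-bit integer. */
static uint64_t addr48_to_u64(const struct addr48 *a) {
  uint64_t v = 0;
  for (int i = 5; i >= 0; i--)
    v = (v << 8) | a->b[i];
  return v;
}

/* Unpack the low 48 bits again; the upper 16 bits are ignored. */
static struct addr48 u64_to_addr48(uint64_t v) {
  struct addr48 a;
  for (int i = 0; i < 6; i++) {
    a.b[i] = (uint8_t)(v & 0xff);
    v >>= 8;
  }
  return a;
}
```
@end verbatim

Code shared between both platforms can then work on one representation
and convert at the platform boundary.
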
4062Also, currently on Windows the Bluetooth plugin doesn't have support for
4063broadcast messages. When it receives a broadcast message it will skip it.
4064
4065@node Pending features
4066@subsubsection Pending features
4067@c %**end of header
4068
4069@itemize @bullet
4070@item Implement the broadcast functionality on Windows @emph{(currently
4071working on)}
@item Implement a testcase for the helper: @emph{The testcase
consists of a program which emulates the plugin and uses the helper. It
will simulate connections, disconnections and data transfers.}
4075@end itemize
4076
4077If you have a new idea about a feature of the plugin or suggestions about
4078how I could improve the implementation you are welcome to comment or to
4079contact me.
4080
4081@node WLAN plugin
4082@section WLAN plugin
4083@c %**end of header
4084
4085This section documents how the wlan transport plugin works. Parts which
4086are not implemented yet or could be better implemented are described at
4087the end.
4088
4089@cindex ats subsystem
4090@node The ATS Subsystem
4091@section The ATS Subsystem
4092@c %**end of header
4093
ATS stands for "automatic transport selection", and the function of ATS in
GNUnet is to decide which address (and thus transport plugin) should
be used for two peers to communicate, and what bandwidth limits should be
imposed on such an individual connection. To help ATS make an informed
decision, higher-level services inform the ATS service about their
requirements and the quality of the service rendered. The ATS service
also interacts with the transport service to be apprised of working
addresses and to communicate its resource allocation decisions. Finally,
the ATS service's operation can be observed using a monitoring API.
4103
The main logic of the ATS service only collects the available addresses,
their performance characteristics and the applications' requirements, but
4106does not make the actual allocation decision. This last critical step is
4107left to an ATS plugin, as we have implemented (currently three) different
4108allocation strategies which differ significantly in their performance and
4109maturity, and it is still unclear if any particular plugin is generally
4110superior.
4111
4112@cindex core subsystem
4113@cindex CORE subsystem
4114@node GNUnet's CORE Subsystem
4115@section GNUnet's CORE Subsystem
4116@c %**end of header
4117
4118The CORE subsystem in GNUnet is responsible for securing link-layer
4119communications between nodes in the GNUnet overlay network. CORE builds
4120on the TRANSPORT subsystem which provides for the actual, insecure,
4121unreliable link-layer communication (for example, via UDP or WLAN), and
4122then adds fundamental security to the connections:
4123
4124@itemize @bullet
4125@item confidentiality with so-called perfect forward secrecy; we use
ECDHE@footnote{@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman,
Elliptic-curve Diffie---Hellman}}
4128powered by Curve25519
4129@footnote{@uref{http://cr.yp.to/ecdh.html, Curve25519}} for the key
4130exchange and then use symmetric encryption, encrypting with both AES-256
4131@footnote{@uref{http://en.wikipedia.org/wiki/Rijndael, AES-256}} and
4132Twofish @footnote{@uref{http://en.wikipedia.org/wiki/Twofish, Twofish}}
4133@item @uref{http://en.wikipedia.org/wiki/Authentication, authentication}
4134is achieved by signing the ephemeral keys using Ed25519
4135@footnote{@uref{http://ed25519.cr.yp.to/, Ed25519}}, a deterministic
4136variant of ECDSA
4137@footnote{@uref{http://en.wikipedia.org/wiki/ECDSA, ECDSA}}
4138@item integrity protection (using SHA-512
4139@footnote{@uref{http://en.wikipedia.org/wiki/SHA-2, SHA-512}} to do
4140encrypt-then-MAC
4141@footnote{@uref{http://en.wikipedia.org/wiki/Authenticated_encryption,
4142encrypt-then-MAC}})
4143@item Replay
4144@footnote{@uref{http://en.wikipedia.org/wiki/Replay_attack, replay}}
4145protection (using nonces, timestamps, challenge-response,
4146message counters and ephemeral keys)
4147@item liveness (keep-alive messages, timeout)
4148@end itemize
4149
4150@menu
4151* Limitations::
4152* When is a peer "connected"?::
4153* libgnunetcore::
4154* The CORE Client-Service Protocol::
4155* The CORE Peer-to-Peer Protocol::
4156@end menu
4157
4158@cindex core subsystem limitations
4159@node Limitations
4160@subsection Limitations
4161@c %**end of header
4162
4163CORE does not perform
4164@uref{http://en.wikipedia.org/wiki/Routing, routing}; using CORE it is
4165only possible to communicate with peers that happen to already be
4166"directly" connected with each other. CORE also does not have an
4167API to allow applications to establish such "direct" connections --- for
4168this, applications can ask TRANSPORT, but TRANSPORT might not be able to
4169establish a "direct" connection. The TOPOLOGY subsystem is responsible for
4170trying to keep a few "direct" connections open at all times. Applications
4171that need to talk to particular peers should use the CADET subsystem, as
4172it can establish arbitrary "indirect" connections.
4173
4174Because CORE does not perform routing, CORE must only be used directly by
4175applications that either perform their own routing logic (such as
4176anonymous file-sharing) or that do not require routing, for example
4177because they are based on flooding the network. CORE communication is
4178unreliable and delivery is possibly out-of-order. Applications that
4179require reliable communication should use the CADET service. Each
4180application can only queue one message per target peer with the CORE
4181service at any time; messages cannot be larger than approximately
418263 kilobytes. If messages are small, CORE may group multiple messages
4183(possibly from different applications) prior to encryption. If permitted
4184by the application (using the @uref{http://baus.net/on-tcp_cork/, cork}
4185option), CORE may delay transmissions to facilitate grouping of multiple
4186small messages. If cork is not enabled, CORE will transmit the message as
4187soon as TRANSPORT allows it (TRANSPORT is responsible for limiting
4188bandwidth and congestion control). CORE does not allow flow control;
4189applications are expected to process messages at line-speed. If flow
4190control is needed, applications should use the CADET service.
4191
4192@cindex when is a peer connected
4193@node When is a peer "connected"?
4194@subsection When is a peer "connected"?
4195@c %**end of header
4196
4197In addition to the security features mentioned above, CORE also provides
4198one additional key feature to applications using it, and that is a
4199limited form of protocol-compatibility checking. CORE distinguishes
4200between TRANSPORT-level connections (which enable communication with other
4201peers) and application-level connections. Applications using the CORE API
4202will (typically) learn about application-level connections from CORE, and
4203not about TRANSPORT-level connections. When a typical application uses
4204CORE, it will specify a set of message types
4205(from @code{gnunet_protocols.h}) that it understands. CORE will then
4206notify the application about connections it has with other peers if and
4207only if those applications registered an intersecting set of message
4208types with their CORE service. Thus, it is quite possible that CORE only
4209exposes a subset of the established direct connections to a particular
4210application --- and different applications running above CORE might see
4211different sets of connections at the same time.
4212
4213A special case are applications that do not register a handler for any
4214message type.
4215CORE assumes that these applications merely want to monitor connections
4216(or "all" messages via other callbacks) and will notify those applications
4217about all connections. This is used, for example, by the
4218@code{gnunet-core} command-line tool to display the active connections.
4219Note that it is also possible that the TRANSPORT service has more active
4220connections than the CORE service, as the CORE service first has to
4221perform a key exchange with connecting peers before exchanging information
4222about supported message types and notifying applications about the new
4223connection.
4224
4225@cindex libgnunetcore
4226@node libgnunetcore
4227@subsection libgnunetcore
4228@c %**end of header
4229
The CORE API (defined in @file{gnunet_core_service.h}) is the basic
messaging API used by P2P applications built using GNUnet. It gives
applications the ability to exchange encrypted messages with the
peer's "directly" connected neighbours.

As CORE connections are generally "direct" connections, applications must
4236not assume that they can connect to arbitrary peers this way, as "direct"
4237connections may not always be possible. Applications using CORE are
4238notified about which peers are connected. Creating new "direct"
4239connections must be done using the TRANSPORT API.
4240
4241The CORE API provides unreliable, out-of-order delivery. While the
4242implementation tries to ensure timely, in-order delivery, both message
4243losses and reordering are not detected and must be tolerated by the
application. Most importantly, CORE will NOT perform retransmission if
messages could not be delivered.
4246
4247Note that CORE allows applications to queue one message per connected
peer. The rate at which each connection operates is influenced by the
preferences expressed by local applications as well as restrictions
4250imposed by the other peer. Local applications can express their
4251preferences for particular connections using the "performance" API of the
4252ATS service.
4253
Applications that require more sophisticated transmission capabilities,
such as TCP-like behavior, or that need to send messages to arbitrary
remote peers should use the CADET API.
4257
4258The typical use of the CORE API is to connect to the CORE service using
4259@code{GNUNET_CORE_connect}, process events from the CORE service (such as
4260peers connecting, peers disconnecting and incoming messages) and send
4261messages to connected peers using
4262@code{GNUNET_CORE_notify_transmit_ready}. Note that applications must
4263cancel pending transmission requests if they receive a disconnect event
4264for a peer that had a transmission pending; furthermore, queueing more
4265than one transmission request per peer per application using the
4266service is not permitted.
4267
4268The CORE API also allows applications to monitor all communications of the
4269peer prior to encryption (for outgoing messages) or after decryption (for
4270incoming messages). This can be useful for debugging, diagnostics or to
4271establish the presence of cover traffic (for anonymity). As monitoring
4272applications are often not interested in the payload, the monitoring
4273callbacks can be configured to only provide the message headers (including
4274the message type and size) instead of copying the full data stream to the
4275monitoring client.
4276
4277The init callback of the @code{GNUNET_CORE_connect} function is called
4278with the hash of the public key of the peer. This public key is used to
4279identify the peer globally in the GNUnet network. Applications are
4280encouraged to check that the provided hash matches the hash that they are
4281using (as theoretically the application may be using a different
configuration file with a different private key, which would result in
hard-to-find bugs).
4284
4285As with most service APIs, the CORE API isolates applications from crashes
4286of the CORE service. If the CORE service crashes, the application will see
disconnect events for all existing connections. Once the connections are
re-established, the applications will receive matching connect events.
4289
@cindex core client-service protocol
4291@node The CORE Client-Service Protocol
4292@subsection The CORE Client-Service Protocol
4293@c %**end of header
4294
4295This section describes the protocol between an application using the CORE
4296service (the client) and the CORE service process itself.
4297
4298
4299@menu
4300* Setup2::
4301* Notifications::
4302* Sending::
4303@end menu
4304
4305@node Setup2
4306@subsubsection Setup2
4307@c %**end of header
4308
When a client connects to the CORE service, it first sends an
@code{InitMessage} which specifies options for the connection and a set of
4311message type values which are supported by the application. The options
4312bitmask specifies which events the client would like to be notified about.
4313The options include:
4314
@table @code
@item GNUNET_CORE_OPTION_NOTHING
No notifications
@item GNUNET_CORE_OPTION_STATUS_CHANGE
Peers connecting and disconnecting
@item GNUNET_CORE_OPTION_FULL_INBOUND
All inbound messages (after decryption) with full payload
@item GNUNET_CORE_OPTION_HDR_INBOUND
Just the @code{MessageHeader} of all inbound messages
@item GNUNET_CORE_OPTION_FULL_OUTBOUND
All outbound messages (prior to encryption) with full payload
@item GNUNET_CORE_OPTION_HDR_OUTBOUND
Just the @code{MessageHeader} of all outbound messages
@end table
4327
4328Typical applications will only monitor for connection status changes.
4329
4330The CORE service responds to the @code{InitMessage} with an
4331@code{InitReplyMessage} which contains the peer's identity. Afterwards,
4332both CORE and the client can send messages.
4333
4334@node Notifications
4335@subsubsection Notifications
4336@c %**end of header
4337
4338The CORE will send @code{ConnectNotifyMessage}s and
4339@code{DisconnectNotifyMessage}s whenever peers connect or disconnect from
4340the CORE (assuming their type maps overlap with the message types
4341registered by the client). When the CORE receives a message that matches
the set of message types specified during the @code{InitMessage} (or if
monitoring is enabled for inbound messages in the options), it sends a
4344@code{NotifyTrafficMessage} with the peer identity of the sender and the
4345decrypted payload. The same message format (except with
4346@code{GNUNET_MESSAGE_TYPE_CORE_NOTIFY_OUTBOUND} for the message type) is
4347used to notify clients monitoring outbound messages; here, the peer
4348identity given is that of the receiver.
4349
4350@node Sending
4351@subsubsection Sending
4352@c %**end of header
4353
4354When a client wants to transmit a message, it first requests a
4355transmission slot by sending a @code{SendMessageRequest} which specifies
4356the priority, deadline and size of the message. Note that these values
4357may be ignored by CORE. When CORE is ready for the message, it answers
4358with a @code{SendMessageReady} response. The client can then transmit the
4359payload with a @code{SendMessage} message. Note that the actual message
4360size in the @code{SendMessage} is allowed to be smaller than the size in
the original request. A client may at any time send a fresh
@code{SendMessageRequest}, which then supersedes the previous
@code{SendMessageRequest}, which is then no longer valid. The client can
4364tell which @code{SendMessageRequest} the CORE service's
4365@code{SendMessageReady} message is for as all of these messages contain a
4366"unique" request ID (based on a counter incremented by the client
4367for each request).
4368
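The request ID bookkeeping described above can be sketched as follows.
This is a simplified model and not the libgnunetcore implementation: the
structure, the field names and the 16-bit width of the ID are assumptions
made for illustration.

@verbatim
```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical client-side bookkeeping for one target peer. */
struct send_state {
  uint16_t next_id;    /* counter incremented for each request */
  uint16_t pending_id; /* ID of the request that is currently valid */
  bool has_pending;    /* is there an outstanding request at all? */
};

/* Issue a fresh request; it supersedes any earlier one. Returns its ID. */
static uint16_t send_request(struct send_state *s) {
  s->pending_id = s->next_id++;
  s->has_pending = true;
  return s->pending_id;
}

/* Handle a "ready" response; returns true if the payload may be sent. */
static bool on_ready(struct send_state *s, uint16_t ready_id) {
  if (!s->has_pending || ready_id != s->pending_id)
    return false; /* the response belongs to a superseded request */
  s->has_pending = false;
  return true;
}
```
@end verbatim

A "ready" response that names an older ID is simply dropped, so a late
answer to a superseded request cannot trigger a transmission.
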
4369@node The CORE Peer-to-Peer Protocol
4370@subsection The CORE Peer-to-Peer Protocol
4371@c %**end of header
4372
4373
4374@menu
4375* Creating the EphemeralKeyMessage::
4376* Establishing a connection::
4377* Encryption and Decryption::
4378* Type maps::
4379@end menu
4380
4381@cindex EphemeralKeyMessage creation
4382@node Creating the EphemeralKeyMessage
4383@subsubsection Creating the EphemeralKeyMessage
4384@c %**end of header
4385
When the CORE service starts, each peer creates a fresh ephemeral (ECC)
public-private key pair and signs the corresponding
@code{EphemeralKeyMessage} with its long-term key (which we usually call
the peer's identity; the hash of the public long-term key is what results
in a @code{struct GNUNET_PeerIdentity} in all GNUnet APIs). The ephemeral
key is ONLY used for an
ECDHE@footnote{@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman,
Elliptic-curve Diffie---Hellman}}
exchange by the CORE service to establish symmetric session keys. A peer
4394will use the same @code{EphemeralKeyMessage} for all peers for
4395@code{REKEY_FREQUENCY}, which is usually 12 hours. After that time, it
4396will create a fresh ephemeral key (forgetting the old one) and broadcast
4397the new @code{EphemeralKeyMessage} to all connected peers, resulting in
4398fresh symmetric session keys. Note that peers independently decide on
4399when to discard ephemeral keys; it is not a protocol violation to discard
4400keys more often. Ephemeral keys are also never stored to disk; restarting
4401a peer will thus always create a fresh ephemeral key. The use of ephemeral
4402keys is what provides @uref{http://en.wikipedia.org/wiki/Forward_secrecy,
4403forward secrecy}.
4404
4405Just before transmission, the @code{EphemeralKeyMessage} is patched to
4406reflect the current sender_status, which specifies the current state of
4407the connection from the point of view of the sender. The possible values
4408are:
4409
4410@itemize @bullet
4411@item @code{KX_STATE_DOWN} Initial value, never used on the network
4412@item @code{KX_STATE_KEY_SENT} We sent our ephemeral key, do not know the
4413key of the other peer
@item @code{KX_STATE_KEY_RECEIVED} This peer has received a valid
ephemeral key of the other peer, but we are waiting for the other peer to
confirm its authenticity (ability to decode) via challenge-response.
4417@item @code{KX_STATE_UP} The connection is fully up from the point of
4418view of the sender (now performing keep-alives)
4419@item @code{KX_STATE_REKEY_SENT} The sender has initiated a rekeying
4420operation; the other peer has so far failed to confirm a working
4421connection using the new ephemeral key
4422@end itemize
4423
4424@node Establishing a connection
4425@subsubsection Establishing a connection
4426@c %**end of header
4427
Peers begin their interaction by sending an @code{EphemeralKeyMessage} to
the other peer once the TRANSPORT service notifies the CORE service about
the connection.
When a peer receives an @code{EphemeralKeyMessage} with a status
indicating that the sender does not have the receiver's ephemeral key, the
receiver sends its own @code{EphemeralKeyMessage} in response.
Additionally, if the receiver has not yet confirmed the authenticity of
the sender, it also sends an (encrypted) @code{PingMessage} with a
challenge (and the identity of the target) to the other peer. Peers
receiving a @code{PingMessage} respond with an (encrypted)
@code{PongMessage} which includes the challenge. Peers receiving a
@code{PongMessage} check the challenge, and if it matches, set the
connection to @code{KX_STATE_UP}.
4441
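The handshake above can be read as a small state machine. The following
sketch models only the state transitions of one side. The
@code{KX_STATE_*} names come from the list in the previous section; the
event names, the numeric enum values, the 32-bit challenge and the
@code{kx_step} function are assumptions of this sketch, and rekeying as
well as all cryptography are left out.

@verbatim
```c
#include <stdint.h>

enum kx_state {
  KX_STATE_DOWN,
  KX_STATE_KEY_SENT,
  KX_STATE_KEY_RECEIVED,
  KX_STATE_UP
};

/* Hypothetical events as seen by one peer. */
enum kx_event {
  EV_TRANSPORT_CONNECT, /* TRANSPORT reports a new connection */
  EV_KEY_RECEIVED,      /* valid EphemeralKeyMessage arrived */
  EV_PONG               /* PongMessage arrived, carrying a challenge */
};

struct kx {
  enum kx_state state;
  uint32_t challenge; /* challenge we put into our PingMessage */
};

/* Apply one event; 'value' is only used for EV_PONG. */
static enum kx_state kx_step(struct kx *k, enum kx_event ev, uint32_t value) {
  switch (ev) {
  case EV_TRANSPORT_CONNECT:
    if (k->state == KX_STATE_DOWN)
      k->state = KX_STATE_KEY_SENT; /* our ephemeral key goes out */
    break;
  case EV_KEY_RECEIVED:
    if (k->state == KX_STATE_KEY_SENT)
      k->state = KX_STATE_KEY_RECEIVED; /* ping with challenge goes out */
    break;
  case EV_PONG:
    if (k->state == KX_STATE_KEY_RECEIVED && value == k->challenge)
      k->state = KX_STATE_UP; /* challenge matched */
    break;
  }
  return k->state;
}
```
@end verbatim

A pong with the wrong challenge leaves the state unchanged, so the
connection is only reported as up after a successful challenge-response.
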
4442@node Encryption and Decryption
4443@subsubsection Encryption and Decryption
4444@c %**end of header
4445
4446All functions related to the key exchange and encryption/decryption of
4447messages can be found in @file{gnunet-service-core_kx.c} (except for the
4448cryptographic primitives, which are in @file{util/crypto*.c}).
Given the key material from ECDHE, a key derivation function
@footnote{@uref{https://en.wikipedia.org/wiki/Key_derivation_function, Key
derivation function}} is used to derive two pairs of encryption and
decryption keys for AES-256 and Twofish, as well as initialization vectors
and authentication keys (for
HMAC@footnote{@uref{https://en.wikipedia.org/wiki/HMAC, HMAC}}).
The HMAC is computed over the encrypted payload.
Encrypted messages include an iv_seed and the HMAC in the header.
4456
4457Each encrypted message in the CORE service includes a sequence number and
4458a timestamp in the encrypted payload. The CORE service remembers the
4459largest observed sequence number and a bit-mask which represents which of
4460the previous 32 sequence numbers were already used.
4461Messages with sequence numbers lower than the largest observed sequence
number minus 32 are discarded. Messages with a timestamp that is more
than @code{REKEY_TOLERANCE} off (5 minutes) are also discarded. This of
4464course means that system clocks need to be reasonably synchronized for
4465peers to be able to communicate. Additionally, as the ephemeral key
4466changes every 12 hours, a peer would not even be able to decrypt messages
4467older than 12 hours.
4468
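The sequence number part of this replay protection can be sketched as a
sliding window. This is a minimal model under stated assumptions and not
the code from @file{gnunet-service-core_kx.c}: the structure and function
names are hypothetical, sequence numbers are assumed to start at 1, and
the timestamp check is left out.

@verbatim
```c
#include <stdbool.h>
#include <stdint.h>

struct replay_window {
  uint32_t last_seq;  /* largest sequence number observed so far */
  uint32_t seen_mask; /* bit i set: (last_seq - 1 - i) was already used */
};

/* Returns true if 'seq' is fresh (and records it), false to discard. */
static bool replay_check(struct replay_window *w, uint32_t seq) {
  uint32_t age;
  uint32_t bit;

  if (seq > w->last_seq) {
    uint32_t shift = seq - w->last_seq;
    if (shift > 32) {
      w->seen_mask = 0; /* everything older fell out of the window */
    } else {
      /* slide the window; shifting by 32 would be undefined in C */
      w->seen_mask = (shift == 32) ? 0 : (w->seen_mask << shift);
      w->seen_mask |= UINT32_C(1) << (shift - 1); /* old last_seq */
    }
    w->last_seq = seq;
    return true;
  }
  if (seq == w->last_seq)
    return false; /* exact duplicate of the newest message */
  age = w->last_seq - seq;
  if (age > 32)
    return false; /* older than the window */
  bit = UINT32_C(1) << (age - 1);
  if (w->seen_mask & bit)
    return false; /* already used */
  w->seen_mask |= bit;
  return true;
}
```
@end verbatim

Out-of-order delivery within the last 32 sequence numbers is accepted
once; duplicates and anything older are discarded.
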
4469@node Type maps
4470@subsubsection Type maps
4471@c %**end of header
4472
4473Once an encrypted connection has been established, peers begin to exchange
4474type maps. Type maps are used to allow the CORE service to determine which
4475(encrypted) connections should be shown to which applications. A type map
4476is an array of 65536 bits representing the different types of messages
4477understood by applications using the CORE service. Each CORE service
4478maintains this map, simply by setting the respective bit for each message
4479type supported by any of the applications using the CORE service. Note
that bits for message types embedded in higher-level protocols (such as
CADET) will not be included in these type maps.
4482
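A type map as described above can be modelled as a plain bit array. The
following sketch is illustrative only: the word size, the structure
layout and the function names are assumptions and may differ from what
the CORE service uses internally. The overlap test shows how a service
could decide whether a connection is relevant to an application.

@verbatim
```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TYPE_MAP_BITS 65536
#define TYPE_MAP_WORDS (TYPE_MAP_BITS / 32)

/* One bit per 16-bit message type (layout is an assumption). */
struct type_map {
  uint32_t bits[TYPE_MAP_WORDS];
};

static void type_map_clear(struct type_map *m) {
  memset(m, 0, sizeof(*m));
}

/* Mark a message type as supported. */
static void type_map_set(struct type_map *m, uint16_t type) {
  m->bits[type / 32] |= UINT32_C(1) << (type % 32);
}

/* Is this message type marked as supported? */
static bool type_map_test(const struct type_map *m, uint16_t type) {
  return 0 != (m->bits[type / 32] & (UINT32_C(1) << (type % 32)));
}

/* Do the two maps share at least one message type? */
static bool type_map_overlaps(const struct type_map *a,
                              const struct type_map *b) {
  for (size_t i = 0; i < TYPE_MAP_WORDS; i++)
    if (a->bits[i] & b->bits[i])
      return true;
  return false;
}
```
@end verbatim

With 65536 bits the uncompressed map occupies 8 KiB, which is why a
sparse map benefits from compression before transmission.
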
4483Typically, the type map of a peer will be sparse. Thus, the CORE service
4484attempts to compress its type map using @code{gzip}-style compression
4485("deflate") prior to transmission. However, if the compression fails to
4486compact the map, the map may also be transmitted without compression
4487(resulting in @code{GNUNET_MESSAGE_TYPE_CORE_COMPRESSED_TYPE_MAP} or
4488@code{GNUNET_MESSAGE_TYPE_CORE_BINARY_TYPE_MAP} messages respectively).
4489Upon receiving a type map, the respective CORE service notifies
4490applications about the connection to the other peer if they support any
4491message type indicated in the type map (or no message type at all).
If the CORE service experiences a connect or disconnect event from an
4493application, it updates its type map (setting or unsetting the respective
4494bits) and notifies its neighbours about the change.
4495The CORE services of the neighbours then in turn generate connect and
4496disconnect events for the peer that sent the type map for their respective
4497applications. As CORE messages may be lost, the CORE service confirms
4498receiving a type map by sending back a
4499@code{GNUNET_MESSAGE_TYPE_CORE_CONFIRM_TYPE_MAP}. If such a confirmation
4500(with the correct hash of the type map) is not received, the sender will
4501retransmit the type map (with exponential back-off).
4502
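The confirmation and retransmission logic can be sketched as follows.
This is a simplified model, not the service's code: the 32-bit value
stands in for the real hash of the type map, and the delay cap, the
structure and the function names are assumptions of this sketch.

@verbatim
```c
#include <stdbool.h>
#include <stdint.h>

struct tm_retransmit {
  uint64_t delay_ms;     /* wait before the next retransmission */
  uint64_t max_delay_ms; /* upper bound for the delay (assumption) */
  uint32_t sent_hash;    /* stand-in for the hash of the map we sent */
  bool confirmed;        /* did a matching confirmation arrive? */
};

/* Timer fired: returns how long to wait after retransmitting,
   or 0 if the map was confirmed and nothing needs to be sent. */
static uint64_t on_timeout(struct tm_retransmit *r) {
  uint64_t wait;

  if (r->confirmed)
    return 0;
  wait = r->delay_ms;
  /* double the delay for the next round, but never exceed the cap */
  if (r->delay_ms > r->max_delay_ms / 2)
    r->delay_ms = r->max_delay_ms;
  else
    r->delay_ms *= 2;
  return wait;
}

/* A confirmation only counts if it carries the hash of our map. */
static void on_confirm(struct tm_retransmit *r, uint32_t hash) {
  if (hash == r->sent_hash)
    r->confirmed = true;
}
```
@end verbatim

A confirmation for an outdated map carries a different hash and
therefore does not stop the retransmissions.
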
4503@cindex cadet subsystem
4504@cindex CADET
4505@node GNUnet's CADET subsystem
4506@section GNUnet's CADET subsystem
4507
4508The CADET subsystem in GNUnet is responsible for secure end-to-end
4509communications between nodes in the GNUnet overlay network. CADET builds
4510on the CORE subsystem which provides for the link-layer communication and
4511then adds routing, forwarding and additional security to the connections.
4512CADET offers the same cryptographic services as CORE, but on an
4513end-to-end level. This is done so peers retransmitting traffic on behalf
4514of other peers cannot access the payload data.
4515
4516@itemize @bullet
4517@item CADET provides confidentiality with so-called perfect forward
4518secrecy; we use ECDHE powered by Curve25519 for the key exchange and then
4519use symmetric encryption, encrypting with both AES-256 and Twofish
4520@item authentication is achieved by signing the ephemeral keys using
4521Ed25519, a deterministic variant of ECDSA
4522@item integrity protection (using SHA-512 to do encrypt-then-MAC, although
4523only 256 bits are sent to reduce overhead)
4524@item replay protection (using nonces, timestamps, challenge-response,
4525message counters and ephemeral keys)
4526@item liveness (keep-alive messages, timeout)
4527@end itemize
4528
In addition to the CORE-like security benefits, CADET offers other
4530properties that make it a more universal service than CORE.
4531
4532@itemize @bullet
4533@item CADET can establish channels to arbitrary peers in GNUnet. If a
4534peer is not immediately reachable, CADET will find a path through the
4535network and ask other peers to retransmit the traffic on its behalf.
4536@item CADET offers (optional) reliability mechanisms. In a reliable
4537channel traffic is guaranteed to arrive complete, unchanged and in-order.
4538@item CADET takes care of flow and congestion control mechanisms, not
4539allowing the sender to send more traffic than the receiver or the network
4540are able to process.
4541@end itemize
4542
4543@menu
4544* libgnunetcadet::
4545@end menu
4546
4547@cindex libgnunetcadet
4548@node libgnunetcadet
4549@subsection libgnunetcadet
4550
4551
The CADET API (defined in @file{gnunet_cadet_service.h}) is the
messaging API used by P2P applications built using GNUnet.
It gives applications the ability to exchange encrypted
messages with any peer participating in GNUnet.
The API is heavily based on the CORE API.
4557
4558CADET delivers messages to other peers in "channels".
4559A channel is a permanent connection defined by a destination peer
4560(identified by its public key) and a port number.
Internally, CADET tunnels all channels towards a destination peer
4562using one session key and relays the data on multiple "connections",
4563independent from the channels.
4564
Each channel has optional parameters, the most important being the
reliability flag.
If a channel is created as reliable and a message gets lost at the
TRANSPORT/CORE level, CADET will retransmit the lost message and
deliver it in order to the destination application.
4570
4571To communicate with other peers using CADET, it is necessary to first
4572connect to the service using @code{GNUNET_CADET_connect}.
4573This function takes several parameters in form of callbacks, to allow the
4574client to react to various events, like incoming channels or channels that
4575terminate, as well as specify a list of ports the client wishes to listen
to (at the moment it is not possible to start listening on further ports
once connected, but nothing prevents a client from connecting to CADET
several times, even with one connection per listening port).
4579The function returns a handle which has to be used for any further
4580interaction with the service.
4581
To connect to a remote peer, a client has to call the
@code{GNUNET_CADET_channel_create} function. The most important parameters
given are the remote peer's identity (its public key) and a port, which
specifies which application on the remote peer to connect to, similar to
TCP/UDP ports. CADET will then find the peer in the GNUnet network,
establish the proper low-level connections and do the necessary key
exchanges to ensure authenticated, secure and verified communication.
Similar to @code{GNUNET_CADET_connect}, @code{GNUNET_CADET_channel_create}
returns a handle to interact with the created channel.
4591
4592For every message the client wants to send to the remote application,
4593@code{GNUNET_CADET_notify_transmit_ready} must be called, indicating the
4594channel on which the message should be sent and the size of the message
4595(but not the message itself!). Once CADET is ready to send the message,
4596the provided callback will fire, and the message contents are provided to
4597this callback.
4598
Please note that CADET does not provide an explicit notification of when a
channel is connected. In loosely connected networks, like big wireless
mesh networks, this can take several seconds, even minutes in the worst
case. To be alerted when a channel is online, a client can call
@code{GNUNET_CADET_notify_transmit_ready} immediately after
@code{GNUNET_CADET_channel_create}. When the callback is activated, it
means that the channel is online. The callback may give 0 bytes to CADET
if no message is to be sent.
4607
If a requested transmission is no longer needed before its callback
fires, it can be cancelled with
@code{GNUNET_CADET_notify_transmit_ready_cancel}, which uses the handle
given back by @code{GNUNET_CADET_notify_transmit_ready}.
4612As in the case of CORE, only one message can be requested at a time: a
4613client must not call @code{GNUNET_CADET_notify_transmit_ready} again until
4614the callback is called or the request is cancelled.
4615
4616When a channel is no longer needed, a client can call
4617@code{GNUNET_CADET_channel_destroy} to get rid of it.
4618Note that CADET will try to transmit all pending traffic before notifying
4619the remote peer of the destruction of the channel, including
4620retransmitting lost messages if the channel was reliable.
4621
4622Incoming channels, channels being closed by the remote peer, and traffic
4623on any incoming or outgoing channels are given to the client when CADET
4624executes the callbacks given to it at the time of
4625@code{GNUNET_CADET_connect}.
4626
4627Finally, when an application no longer wants to use CADET, it should call
4628@code{GNUNET_CADET_disconnect}, but first all channels and pending
4629transmissions must be closed (otherwise CADET will complain).
4630
4631@cindex nse subsystem
4632@cindex NSE
4633@node GNUnet's NSE subsystem
4634@section GNUnet's NSE subsystem
4635
4636
4637NSE stands for @dfn{Network Size Estimation}. The NSE subsystem provides
4638other subsystems and users with a rough estimate of the number of peers
4639currently participating in the GNUnet overlay.
4640The computed value is not a precise number as producing a precise number
4641in a decentralized, efficient and secure way is impossible.
4642While NSE's estimate is inherently imprecise, NSE also gives the expected
4643range. For a peer that has been running in a stable network for a
4644while, the real network size will typically (99.7% of the time) be in the
4645range of [2/3 estimate, 3/2 estimate]. We will now give an overview of the
4646algorithm used to calculate the estimate;
4647all of the details can be found in this technical report.
4648
4649@c FIXME: link to the report.
4650
4651@menu
4652* Motivation::
4653* Principle::
4654* libgnunetnse::
4655* The NSE Client-Service Protocol::
4656* The NSE Peer-to-Peer Protocol::
4657@end menu
4658
4659@node Motivation
4660@subsection Motivation
4661
4662
Some subsystems, like DHT, need to know the size of the GNUnet network to
optimize some parameters of their own protocol. The decentralized nature
of GNUnet makes efficiently and securely counting the exact number of
peers infeasible. Although there are several decentralized algorithms to count
4667the number of peers in a system, so far there is none to do so securely.
4668Other protocols may allow any malicious peer to manipulate the final
4669result or to take advantage of the system to perform
4670@dfn{Denial of Service} (DoS) attacks against the network.
4671GNUnet's NSE protocol avoids these drawbacks.
4672
4673
4674
4675@menu
4676* Security::
4677@end menu
4678
4679@cindex NSE security
4680@cindex nse security
4681@node Security
4682@subsubsection Security
4683
4684
4685The NSE subsystem is designed to be resilient against these attacks.
4686It uses @uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proofs
4687of work} to prevent one peer from impersonating a large number of
participants, which would otherwise allow an adversary to artificially
4689inflate the estimate.
4690The DoS protection comes from the time-based nature of the protocol:
4691the estimates are calculated periodically and out-of-time traffic is
4692either ignored or stored for later retransmission by benign peers.
4693In particular, peers cannot trigger global network communication at will.
4694
4695@cindex NSE principle
4696@cindex nse principle
4697@node Principle
4698@subsection Principle
4699
4700
4701The algorithm calculates the estimate by finding the globally closest
4702peer ID to a random, time-based value.
4703
4704The idea is that the closer the ID is to the random value, the more
4705"densely packed" the ID space is, and therefore, more peers are in the
4706network.
4707
4708
4709
4710@menu
4711* Example::
4712* Algorithm::
4713* Target value::
4714* Timing::
4715* Controlled Flooding::
4716* Calculating the estimate::
4717@end menu
4718
4719@node Example
4720@subsubsection Example
4721
4722
4723Suppose all peers have IDs between 0 and 100 (our ID space), and the
4724random value is 42.
If the closest peer has the ID 70, we can imagine that the average
"distance" between peers is around 30 and therefore there are around 3
peers in the whole ID space. On the other hand, if the closest peer has
the ID 44, we can imagine that the space is rather packed with peers,
maybe as many as 50 of them.
Naturally, we could have been rather unlucky: there might be only one peer
and it happens to have the ID 44. Thus, the current estimate is calculated
as the average over multiple rounds, and not just a single sample.
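
The arithmetic of this toy example can be sketched in a few lines of C.
The function name @code{toy_estimate} and the rule "ID space divided by
distance" are assumptions made for illustration only; the real estimate
is computed with the formula from the technical report.

@verbatim
#include <stdlib.h>

/* Toy estimate from the example above: if the closest peer is
   `distance` away from the target, guess that peers are spaced about
   `distance` apart, so roughly id_space / distance peers fit into the
   ID space. */
unsigned int
toy_estimate (unsigned int id_space, int target, int closest_id)
{
  unsigned int distance = (unsigned int) abs (closest_id - target);

  if (0 == distance)
    return id_space;   /* exact hit: avoid division by zero */
  return id_space / distance;
}
@end verbatim

With an ID space of 100 and a target of 42, a closest ID of 70 yields 3
and a closest ID of 44 yields 50, matching the figures above.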
4733
4734@node Algorithm
4735@subsubsection Algorithm
4736
4737
4738Given that example, one can imagine that the job of the subsystem is to
4739efficiently communicate the ID of the closest peer to the target value
4740to all the other peers, who will calculate the estimate from it.
4741
4742@node Target value
4743@subsubsection Target value
4744
4745@c %**end of header
4746
4747The target value itself is generated by hashing the current time, rounded
4748down to an agreed value. If the rounding amount is 1h (default) and the
4749time is 12:34:56, the time to hash would be 12:00:00. The process is
repeated once per rounding amount (in this example, every hour).
4751Every repetition is called a round.
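
The rounding step can be sketched as follows. The function name
@code{round_start} and the use of seconds as the time unit are
assumptions; the hashing step is left out because the text above does
not say which hash function is used.

@verbatim
#include <stdint.h>

/* Round `now` down to the start of the current round.  Both values
   are in seconds; round_len must be non-zero. */
uint64_t
round_start (uint64_t now, uint64_t round_len)
{
  return now - (now % round_len);
}
@end verbatim

For a round length of 3600 seconds, the time 12:34:56 (45296 seconds
after midnight) is rounded down to 12:00:00 (43200 seconds).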
4752
4753@node Timing
4754@subsubsection Timing
4755@c %**end of header
4756
The NSE subsystem has some timing control to avoid everybody broadcasting
its ID all at once. Once each peer has the target random value, it
compares its own ID to the target and calculates the hypothetical size of
the network if that peer were to be the closest.
Then it compares the hypothetical size with the estimate from the previous
rounds. For each value there is an associated point in the period,
let's call it "broadcast time". If its own hypothetical estimate
is the same as the previous global estimate, its "broadcast time" will be
in the middle of the round. If it is bigger, it will be earlier, and if it
is smaller (the most likely case), it will be later. This ensures that the
peers closest to the target value start broadcasting their IDs first.
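
To make the shape of such a mapping concrete, here is a minimal sketch.
The curve used below is a hypothetical choice that merely has the three
properties described above (equal estimates give the middle of the round,
bigger estimates an earlier time, smaller estimates a later time); it is
not claimed to be the formula used by the service. The name
@code{broadcast_delay} and its parameters are likewise assumptions.

@verbatim
/* Hypothetical delay curve, NOT the formula from the technical report.
   own_est and prev_est are logarithmic size estimates; round_len is
   the round length in seconds.  The result is the offset into the
   round at which the peer would start broadcasting. */
double
broadcast_delay (double round_len, double own_est, double prev_est)
{
  double d = own_est - prev_est;
  double mag = (d < 0) ? -d : d;

  /* d / (1 + |d|) lies strictly between -1 and 1, so the result
     stays inside the round: 0 < delay < round_len. */
  return (round_len / 2.0) * (1.0 - d / (1.0 + mag));
}
@end verbatim

Any monotonically decreasing, bounded function of the difference would
serve the same purpose; this one was picked only because it needs no
math library.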
4768
4769@node Controlled Flooding
4770@subsubsection Controlled Flooding
4771
4772@c %**end of header
4773
4774When a peer receives a value, first it verifies that it is closer than the
4775closest value it had so far, otherwise it answers the incoming message
4776with a message containing the better value. Then it checks a proof of
4777work that must be included in the incoming message, to ensure that the
4778other peer's ID is not made up (otherwise a malicious peer could claim to
4779have an ID of exactly the target value every round). Once validated, it
compares the broadcast time of the received value with the current time
4781and if it's not too early, sends the received value to its neighbors.
4782Otherwise it stores the value until the correct broadcast time comes.
4783This prevents unnecessary traffic of sub-optimal values, since a better
4784value can come before the broadcast time, rendering the previous one
4785obsolete and saving the traffic that would have been used to broadcast it
4786to the neighbors.
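
The decision sequence described above can be written down as a small
function. All names (@code{flood_action}, @code{on_flood_value}, the
@code{closeness} measure) are hypothetical, and treating a value that is
exactly as close as the stored one as a duplicate to be dropped is an
assumption of this sketch.

@verbatim
enum flood_action
{
  FLOOD_REPLY_WITH_BETTER,  /* we already know a closer value */
  FLOOD_DROP,               /* duplicate, or proof of work invalid */
  FLOOD_FORWARD_NOW,        /* broadcast time already reached */
  FLOOD_STORE               /* keep until the broadcast time */
};

/* `closeness` is any measure where larger means closer to the target
   (for example, the number of leading bits shared with it). */
enum flood_action
on_flood_value (unsigned int closeness, unsigned int best_closeness,
                int pow_valid, long now, long broadcast_time)
{
  if (closeness < best_closeness)
    return FLOOD_REPLY_WITH_BETTER;
  if ( (closeness == best_closeness) || (! pow_valid) )
    return FLOOD_DROP;
  if (now >= broadcast_time)
    return FLOOD_FORWARD_NOW;
  return FLOOD_STORE;
}
@end verbatim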
4787
4788@node Calculating the estimate
4789@subsubsection Calculating the estimate
4790
4791@c %**end of header
4792
Once the closest ID has been spread across the network, each peer gets the
exact distance between this ID and the target value of the round and
calculates the estimate with a mathematical formula described in the tech
report. The estimate generated with this method for a single round is not
very precise. Remember the case of the example, where the only peer has
the ID 44 and we happen to generate the target value 42, thinking there
are 50 peers in the network. Therefore, the NSE subsystem remembers the
last 64 estimates and calculates an average over them, giving a result
which usually has one bit of uncertainty (the real size could be half of
the estimate or twice as much). Note that the actual network size is
calculated in powers of two of the raw input, thus one bit of uncertainty
means a factor of two in the size estimate.
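
Keeping the last 64 per-round estimates can be sketched with a ring
buffer. The type and function names below are hypothetical, and the plain
arithmetic mean is an assumption; the service may weight the rounds
differently.

@verbatim
#define HISTORY_SIZE 64

struct history
{
  double values[HISTORY_SIZE];  /* per-round estimates */
  unsigned int next;            /* slot for the next estimate */
  unsigned int count;           /* number of valid entries, at most 64 */
};

/* Store a new estimate, overwriting the oldest one once full. */
void
history_add (struct history *h, double estimate)
{
  h->values[h->next] = estimate;
  h->next = (h->next + 1) % HISTORY_SIZE;
  if (h->count < HISTORY_SIZE)
    h->count++;
}

/* Arithmetic mean over the stored estimates (0.0 if there are none). */
double
history_mean (const struct history *h)
{
  double sum = 0.0;

  if (0 == h->count)
    return 0.0;
  for (unsigned int i = 0; i < h->count; i++)
    sum += h->values[i];
  return sum / h->count;
}
@end verbatim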
4805
4806@cindex libgnunetnse
4807@node libgnunetnse
4808@subsection libgnunetnse
4809
4810@c %**end of header
4811
4812The NSE subsystem has the simplest API of all services, with only two
4813calls: @code{GNUNET_NSE_connect} and @code{GNUNET_NSE_disconnect}.
4814
4815The connect call gets a callback function as a parameter and this function
4816is called each time the network agrees on an estimate. This usually is
4817once per round, with some exceptions: if the closest peer has a late
local clock and starts spreading its ID after everyone else agreed on a
4819value, the callback might be activated twice in a round, the second value
4820being always bigger than the first. The default round time is set to
48211 hour.
4822
4823The disconnect call disconnects from the NSE subsystem and the callback
4824is no longer called with new estimates.
4825
4826
4827
4828@menu
4829* Results::
4830* Examples2::
4831@end menu
4832
4833@node Results
4834@subsubsection Results
4835
4836@c %**end of header
4837
The callback provides two values: the average and the
@uref{http://en.wikipedia.org/wiki/Standard_deviation, standard deviation}
of the last 64 rounds. The values provided by the callback function are
logarithmic, this means that the real estimate numbers can be obtained by
calculating 2 to the power of the given value (2^average). From a
statistics point of view this means that:

@itemize @bullet
@item 68% of the time the real size is included in the interval
[2^(average-stddev), 2^(average+stddev)]
@item 95% of the time the real size is included in the interval
[2^(average-2*stddev), 2^(average+2*stddev)]
@item 99.7% of the time the real size is included in the interval
[2^(average-3*stddev), 2^(average+3*stddev)]
@end itemize
4853
The expected standard deviation for 64 rounds in a network of stable size
4855is 0.2. Thus, we can say that normally:
4856
4857@itemize @bullet
4858@item 68% of the time the real size is in the range [-13%, +15%]
4859@item 95% of the time the real size is in the range [-24%, +32%]
4860@item 99.7% of the time the real size is in the range [-34%, +52%]
4861@end itemize
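
The conversion from the logarithmic values to peer counts can be sketched
as below. The names @code{two_to_the} and @code{nse_interval} are
hypothetical. The power of two is computed by hand only to keep the
sketch free of library dependencies; @code{exp2} from @file{math.h}
computes the same value.

@verbatim
/* 2 to the power of x, written without <math.h>. */
double
two_to_the (double x)
{
  const double ln2 = 0.6931471805599453;
  int whole = (int) x;
  double frac;
  double term = 1.0;
  double sum = 1.0;

  if (x < whole)
    whole--;                     /* floor for negative x */
  frac = x - whole;              /* 0 <= frac < 1 */
  for (int i = 1; i < 20; i++)   /* Taylor series of e^(frac * ln 2) */
  {
    term *= frac * ln2 / i;
    sum += term;
  }
  for (; whole > 0; whole--)
    sum *= 2.0;
  for (; whole < 0; whole++)
    sum /= 2.0;
  return sum;
}

/* Bounds of the k-sigma interval for a logarithmic estimate. */
void
nse_interval (double average, double stddev, int k,
              double *low, double *high)
{
  *low = two_to_the (average - k * stddev);
  *high = two_to_the (average + k * stddev);
}
@end verbatim

For an average of 10 and a standard deviation of 1, the 2-sigma interval
comes out as [256, 4096]. For an average of 0 and a standard deviation of
0.2, the 1-sigma bounds are about 0.87 and 1.15, which is where the
[-13%, +15%] range above comes from.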
4862
As said in the introduction, we can be quite sure that usually the real
size is between two thirds and one and a half times the estimate. This can
of course vary with network conditions.
Thus, applications may want to also consider the provided standard
deviation value, not only the average (in particular, if the standard
deviation is very high, the average may be meaningless: the network size
is changing rapidly).
4870
4871@node Examples2
@subsubsection Examples
4873
4874@c %**end of header
4875
Let's close with a couple of examples.
4877
4878@table @asis
4879
@item Average: 10, std dev: 1
Here the estimate would be
2^10 = 1024 peers. @footnote{The range in which we can be 95% sure is:
[2^8, 2^12] = [256, 4096]. We can be very (>99.7%) sure that the network
is not a hundred peers and absolutely sure that it is not a million peers,
but somewhere around a thousand.}

@item Average: 22, std dev: 0.2
Here the estimate would be
2^22 = 4 Million peers. @footnote{The range in which we can be 99.7% sure
is: [2^21.4, 2^22.6] = [2.8M, 6.4M]. We can be sure that the network size
is around four million, with absolutely no way of it being 1 million.}
4890
4891@end table
4892
To put this in perspective: the LHC Higgs boson results were announced
with "5 sigma" and "6 sigma" certainties. In this case, a 5 sigma minimum
would be 2 million and a 6 sigma minimum 1.8 million.
4897
4898@node The NSE Client-Service Protocol
4899@subsection The NSE Client-Service Protocol
4900
4901@c %**end of header
4902
As with the API, the client-service protocol is very simple. It has only
two different messages, defined in @code{src/nse/nse.h}:
4905
4906@itemize @bullet
4907@item @code{GNUNET_MESSAGE_TYPE_NSE_START}@ This message has no parameters
4908and is sent from the client to the service upon connection.
4909@item @code{GNUNET_MESSAGE_TYPE_NSE_ESTIMATE}@ This message is sent from
4910the service to the client for every new estimate and upon connection.
It contains a timestamp for the estimate, the average and the standard
4912deviation for the respective round.
4913@end itemize
4914
4915When the @code{GNUNET_NSE_disconnect} API call is executed, the client
4916simply disconnects from the service, with no message involved.
4917
4918@node The NSE Peer-to-Peer Protocol
4919@subsection The NSE Peer-to-Peer Protocol
4920
4921@c %**end of header
4922
4923The NSE subsystem only has one message in the P2P protocol, the
4924@code{GNUNET_MESSAGE_TYPE_NSE_P2P_FLOOD} message.
4925
The key contents of this message are the timestamp to identify the round
4927(differences in system clocks may cause some peers to send messages way
4928too early or way too late, so the timestamp allows other peers to
4929identify such messages easily), the
4930@uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proof of work}
4931used to make it difficult to mount a
4932@uref{http://en.wikipedia.org/wiki/Sybil_attack, Sybil attack}, and the
4933public key, which is used to verify the signature on the message.
4934
4935Every peer stores a message for the previous, current and next round. The
4936messages for the previous and current round are given to peers that
4937connect to us. The message for the next round is simply stored until our
4938system clock advances to the next round. The message for the current round
4939is what we are flooding the network with right now.
4940At the beginning of each round the peer does the following:
4941
4942@itemize @bullet
@item calculates its own distance to the target value
4944@item creates, signs and stores the message for the current round (unless
4945it has a better message in the "next round" slot which came early in the
4946previous round)
@item calculates, based on the stored round message (own or received), when
to start flooding it to its neighbors
4949@end itemize
4950
4951Upon receiving a message the peer checks the validity of the message
4952(round, proof of work, signature). The next action depends on the
4953contents of the incoming message:
4954
4955@itemize @bullet
4956@item if the message is worse than the current stored message, the peer
4957sends the current message back immediately, to stop the other peer from
4958spreading suboptimal results
4959@item if the message is better than the current stored message, the peer
4960stores the new message and calculates the new target time to start
4961spreading it to its neighbors (excluding the one the message came from)
4962@item if the message is for the previous round, it is compared to the
4963message stored in the "previous round slot", which may then be updated
4964@item if the message is for the next round, it is compared to the message
4965stored in the "next round slot", which again may then be updated
4966@end itemize
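
The three per-round slots can be sketched as follows. All names are
hypothetical. Discarding messages that belong to none of the three rounds
is an assumption of this sketch, as is the @code{closeness} measure
(larger means closer to the target).

@verbatim
#include <stdint.h>

enum round_slot
{
  SLOT_PREVIOUS,
  SLOT_CURRENT,
  SLOT_NEXT,
  SLOT_NONE      /* too old or too far in the future */
};

/* Both timestamps are round start times, in the same unit as
   round_len. */
enum round_slot
slot_for_round (uint64_t current_round, uint64_t round_len,
                uint64_t msg_round)
{
  if (msg_round == current_round)
    return SLOT_CURRENT;
  if (msg_round + round_len == current_round)
    return SLOT_PREVIOUS;
  if (current_round + round_len == msg_round)
    return SLOT_NEXT;
  return SLOT_NONE;
}

/* A stored message is replaced only by one that is closer to the
   target value of its round. */
int
should_replace (unsigned int stored_closeness,
                unsigned int new_closeness)
{
  return new_closeness > stored_closeness;
}
@end verbatim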
4967
Finally, when the time comes to send the stored message for the current
round to the neighbors, a random delay is added for each neighbor, to avoid
traffic spikes and minimize cross-messages.
4971
4972@cindex HOSTLIST subsystem
4973@cindex hostlist subsystem
4974@node GNUnet's HOSTLIST subsystem
4975@section GNUnet's HOSTLIST subsystem
4976
4977@c %**end of header
4978
4979Peers in the GNUnet overlay network need address information so that they
can connect with other peers. GNUnet uses so-called HELLO messages to
4981store and exchange peer addresses.
4982GNUnet provides several methods for peers to obtain this information:
4983
4984@itemize @bullet
4985@item out-of-band exchange of HELLO messages (manually, using for example
4986gnunet-peerinfo)
4987@item HELLO messages shipped with GNUnet (automatic with distribution)
4988@item UDP neighbor discovery in LAN (IPv4 broadcast, IPv6 multicast)
4989@item topology gossiping (learning from other peers we already connected
4990to), and
4991@item the HOSTLIST daemon covered in this section, which is particularly
4992relevant for bootstrapping new peers.
4993@end itemize
4994
4995New peers have no existing connections (and thus cannot learn from gossip
4996among peers), may not have other peers in their LAN and might be started
4997with an outdated set of HELLO messages from the distribution.
4998In this case, getting new peers to connect to the network requires either
4999manual effort or the use of a HOSTLIST to obtain HELLOs.
5000
5001@menu
5002* HELLOs::
5003* Overview for the HOSTLIST subsystem::
5004* Interacting with the HOSTLIST daemon::
5005* Hostlist security address validation::
5006* The HOSTLIST daemon::
5007* The HOSTLIST server::
5008* The HOSTLIST client::
5009* Usage::
5010@end menu
5011
5012@node HELLOs
5013@subsection HELLOs
5014
5015@c %**end of header
5016
The basic information peers require to connect to other peers is
contained in so-called HELLO messages, which you can think of as business
cards. Besides the identity of the peer (based on the cryptographic public
key), a HELLO message may contain address information that specifies ways
to contact a peer. By obtaining HELLO messages, a peer can learn how to
contact other peers.
5023
5024@node Overview for the HOSTLIST subsystem
5025@subsection Overview for the HOSTLIST subsystem
5026
5027@c %**end of header
5028
The HOSTLIST subsystem provides a way to distribute and obtain contact
information to connect to other peers using a simple HTTP GET request.
Its implementation is split into three parts: the main file for the daemon
itself (@file{gnunet-daemon-hostlist.c}), the HTTP client used to download
peer information (@file{hostlist-client.c}) and the server component used
to provide this information to other peers (@file{hostlist-server.c}).
The server is basically a small HTTP web server (based on GNU
libmicrohttpd) which provides a list of HELLOs known to the local peer for
download. The client component is basically an HTTP client
(based on libcurl) which can download hostlists from one or more websites.
The hostlist format is a binary blob containing a sequence of HELLO
messages. Note that any HTTP server can theoretically serve a hostlist;
the built-in hostlist server simply makes it convenient to offer this
service.
5043
5044
5045@menu
5046* Features::
5047* Limitations2::
5048@end menu
5049
5050@node Features
5051@subsubsection Features
5052
5053@c %**end of header
5054
5055The HOSTLIST daemon can:
5056
5057@itemize @bullet
5058@item provide HELLO messages with validated addresses obtained from
5059PEERINFO to download for other peers
@item download HELLO messages and forward these messages to the TRANSPORT
subsystem for validation
@item advertise the URL of this peer's hostlist address to other peers
5063via gossip
5064@item automatically learn about hostlist servers from the gossip of other
5065peers
5066@end itemize
5067
5068@node Limitations2
@subsubsection Limitations
5070
5071@c %**end of header
5072
5073The HOSTLIST daemon does not:
5074
5075@itemize @bullet
5076@item verify the cryptographic information in the HELLO messages
5077@item verify the address information in the HELLO messages
5078@end itemize
5079
5080@node Interacting with the HOSTLIST daemon
5081@subsection Interacting with the HOSTLIST daemon
5082
5083@c %**end of header
5084
5085The HOSTLIST subsystem is currently implemented as a daemon, so there is
5086no need for the user to interact with it and therefore there is no
5087command line tool and no API to communicate with the daemon. In the
5088future, we can envision changing this to allow users to manually trigger
5089the download of a hostlist.
5090
5091Since there is no command line interface to interact with HOSTLIST, the
5092only way to interact with the hostlist is to use STATISTICS to obtain or
5093modify information about the status of HOSTLIST:
5094
5095@example
5096$ gnunet-statistics -s hostlist
5097@end example
5098
5099@noindent
5100In particular, HOSTLIST includes a @strong{persistent} value in statistics
5101that specifies when the hostlist server might be queried next. As this
5102value is exponentially increasing during runtime, developers may want to
5103reset or manually adjust it. Note that HOSTLIST (but not STATISTICS) needs
to be shut down if changes to this value are to have any effect on the
5105daemon (as HOSTLIST does not monitor STATISTICS for changes to the
5106download frequency).
5107
5108@node Hostlist security address validation
5109@subsection Hostlist security address validation
5110
5111@c %**end of header
5112
5113Since information obtained from other parties cannot be trusted without
5114validation, we have to distinguish between @emph{validated} and
5115@emph{not validated} addresses. Before using (and so trusting)
5116information from other parties, this information has to be double-checked
5117(validated). Address validation is not done by HOSTLIST but by the
5118TRANSPORT service.
5119
The HOSTLIST component is functionally located between the PEERINFO and
the TRANSPORT subsystems. When acting as a server, the daemon obtains valid
(@emph{validated}) peer information (HELLO messages) from the PEERINFO
service and provides it to other peers. When acting as a client, it
contacts the HOSTLIST servers specified in the configuration, downloads
the (unvalidated) list of HELLO messages and forwards this information
to the TRANSPORT service to validate the addresses.
5127
5128@node The HOSTLIST daemon
5129@subsection The HOSTLIST daemon
5130
5131@c %**end of header
5132
5133The hostlist daemon is the main component of the HOSTLIST subsystem. It is
5134started by the ARM service and (if configured) starts the HOSTLIST client
5135and server components.
5136
If the daemon provides a hostlist itself, it can advertise its own
hostlist to other peers. To do so, it sends a
@code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT} message to other peers
when they connect to this peer on the CORE level. This hostlist
advertisement message contains the URL to access the HOSTLIST HTTP
server of the sender. The daemon may also subscribe to this type of
message from the CORE service, and then forward this kind of message to
the HOSTLIST client. The client then uses all available URLs to download
peer information when necessary.
5146
When starting, the HOSTLIST daemon first connects to the CORE subsystem
and, if hostlist learning is enabled, registers a CORE handler to receive
this kind of message. Next it starts (if configured) the client and
server. It passes pointers to CORE connect, disconnect and receive
handlers where the client and server store their functions, so the daemon
can notify them about CORE events.
5153
5154To clean up on shutdown, the daemon has a cleaning task, shutting down all
5155subsystems and disconnecting from CORE.
5156
5157@node The HOSTLIST server
5158@subsection The HOSTLIST server
5159
5160@c %**end of header
5161
5162The server provides a way for other peers to obtain HELLOs. Basically it
5163is a small web server other peers can connect to and download a list of
5164HELLOs using standard HTTP; it may also advertise the URL of the hostlist
5165to other peers connecting on CORE level.
5166
5167
5168@menu
5169* The HTTP Server::
5170* Advertising the URL::
5171@end menu
5172
5173@node The HTTP Server
5174@subsubsection The HTTP Server
5175
5176@c %**end of header
5177
During startup, the server starts a web server listening on the port
specified with the HTTPPORT value (default 8080). In addition it connects
to the PEERINFO service to obtain peer information. The HOSTLIST server
uses the @code{GNUNET_PEERINFO_iterate} function to request HELLO
information for all peers and adds their information to a new hostlist if
they are suitable (expired addresses and HELLOs without addresses are both
not suitable) and the maximum size for a hostlist is not exceeded
(@code{MAX_BYTES_PER_HOSTLISTS} = 500000).
5186When PEERINFO finishes (with a last NULL callback), the server destroys
5187the previous hostlist response available for download on the web server
5188and replaces it with the updated hostlist. The hostlist format is
5189basically a sequence of HELLO messages (as obtained from PEERINFO) without
5190any special tokenization. Since each HELLO message contains a size field,
5191the response can easily be split into separate HELLO messages by the
5192client.
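
Splitting such a response can be sketched as follows. The sketch assumes
that every message starts with a 4-byte header whose first two bytes hold
the total message size (header included) in big-endian byte order; this
layout and the name @code{count_hellos} are assumptions, not taken from
the text above.

@verbatim
#include <stddef.h>
#include <stdint.h>

#define HEADER_SIZE 4  /* assumed: 16-bit size, then 16-bit type */

/* Count the messages in a downloaded hostlist.  Returns -1 if a size
   field is smaller than the header or runs past the end of the
   buffer. */
int
count_hellos (const uint8_t *buf, size_t len)
{
  size_t off = 0;
  int count = 0;

  while (off < len)
  {
    size_t size;

    if (len - off < HEADER_SIZE)
      return -1;
    size = ((size_t) buf[off] << 8) | buf[off + 1];  /* big-endian */
    if ( (size < HEADER_SIZE) || (size > len - off) )
      return -1;
    off += size;
    count++;
  }
  return count;
}
@end verbatim

A real client would hand each complete message on for validation instead
of merely counting it; the bounds checks are the part that matters, since
the downloaded data is untrusted.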
5193
A HOSTLIST client connecting to the HOSTLIST server will receive the
hostlist as an HTTP response and the server will terminate the
connection with the result code @code{HTTP 200 OK}.
The connection will be closed immediately if no hostlist is available.
5198
5199@node Advertising the URL
5200@subsubsection Advertising the URL
5201
5202@c %**end of header
5203
5204The server also advertises the URL to download the hostlist to other peers
5205if hostlist advertisement is enabled.
5206When a new peer connects and has hostlist learning enabled, the server
5207sends a @code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT} message to this
5208peer using the CORE service.
5209
5210@node The HOSTLIST client
5211@subsection The HOSTLIST client
5212
5213@c %**end of header
5214
5215The client provides the functionality to download the list of HELLOs from
5216a set of URLs.
5217It performs a standard HTTP request to the URLs configured and learned
5218from advertisement messages received from other peers. When a HELLO is
5219downloaded, the HOSTLIST client forwards the HELLO to the TRANSPORT
5220service for validation.
5221
5222The client supports two modes of operation:
5223
5224@itemize @bullet
5225@item download of HELLOs (bootstrapping)
5226@item learning of URLs
5227@end itemize
5228
5229@menu
5230* Bootstrapping::
5231* Learning::
5232@end menu
5233
5234@node Bootstrapping
5235@subsubsection Bootstrapping
5236
5237@c %**end of header
5238
For bootstrapping, the client schedules a task to download the hostlist
from the set of known URLs.
The downloads are only performed if the number of current
connections is smaller than a minimum number of connections
(at the moment 4).
The interval between downloads increases exponentially; however, the
exponential growth is limited once the interval becomes longer than an
hour. At that point, the interval is capped at
(number of connections * 1h).
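
One possible reading of this schedule is sketched below. The function
names, the use of seconds, and the choice to never let the cap drop below
one hour (relevant when there are no connections at all) are assumptions
of this sketch rather than a description of the actual client code.

@verbatim
#include <stdint.h>

#define MIN_CONNECTIONS 4     /* "at the moment 4" in the text above */
#define ONE_HOUR 3600u        /* seconds */

/* Download only while the peer has too few connections. */
int
should_download (unsigned int connections)
{
  return connections < MIN_CONNECTIONS;
}

/* Double the delay, but cap it at connections * 1h.  The cap is
   never allowed to be smaller than 1h (assumption). */
uint64_t
next_download_delay (uint64_t delay, unsigned int connections)
{
  uint64_t doubled = delay * 2;
  uint64_t cap = (uint64_t) connections * ONE_HOUR;

  if (cap < ONE_HOUR)
    cap = ONE_HOUR;
  if (doubled > cap)
    return cap;
  return doubled;
}
@end verbatim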
5248
5249Once the decision has been taken to download HELLOs, the daemon chooses a
5250random URL from the list of known URLs. URLs can be configured in the
5251configuration or be learned from advertisement messages.
The client uses an HTTP client library (libcurl) to initiate the download
using the libcurl multi interface.
Libcurl passes the data to the @code{callback_download} function, which
stores the data in a buffer if space is available and the maximum size for
a hostlist download is not exceeded (@code{MAX_BYTES_PER_HOSTLISTS} = 500000).
5257When a full HELLO was downloaded, the HOSTLIST client offers this
5258HELLO message to the TRANSPORT service for validation.
5259When the download is finished or failed, statistical information about the
5260quality of this URL is updated.
5261
5262@cindex HOSTLIST learning
5263@node Learning
5264@subsubsection Learning
5265
5266@c %**end of header
5267
5268The client also manages hostlist advertisements from other peers. The
5269HOSTLIST daemon forwards @code{GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT}
5270messages to the client subsystem, which extracts the URL from the message.
5271Next, a test of the newly obtained URL is performed by triggering a
5272download from the new URL. If the URL works correctly, it is added to the
5273list of working URLs.
5274
The size of the list of URLs is restricted, so if an additional server is
added and the list is full, the URL with the worst quality ranking
(determined, for example, through successful downloads and the number of
HELLOs) is discarded. During shutdown the list of URLs is saved to a file
for persistence and loaded on startup. URLs from the configuration file
are never discarded.
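
The eviction rule can be sketched as a search for the worst learned
entry. The structure, its fields and the single numeric @code{quality}
score are hypothetical; the text above only says that the ranking is
derived from figures such as successful downloads and the number of
HELLOs. Keeping configured and learned URLs in one array is also an
assumption.

@verbatim
#include <stddef.h>

struct hostlist_url
{
  const char *url;
  unsigned int quality;   /* higher is better */
  int from_config;        /* non-zero: never discarded */
};

/* Return the index of the learned URL with the worst quality, or -1
   if every entry comes from the configuration. */
int
worst_learned_url (const struct hostlist_url *list, size_t n)
{
  int worst = -1;

  for (size_t i = 0; i < n; i++)
  {
    if (list[i].from_config)
      continue;
    if ( (-1 == worst) || (list[i].quality < list[worst].quality) )
      worst = (int) i;
  }
  return worst;
}
@end verbatim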
5281
5282@node Usage
5283@subsection Usage
5284
5285@c %**end of header
5286
5287To start HOSTLIST by default, it has to be added to the DEFAULTSERVICES
5288section for the ARM services. This is done in the default configuration.
5289
5290For more information on how to configure the HOSTLIST subsystem see the
5291installation handbook:@
5292Configuring the hostlist to bootstrap@
5293Configuring your peer to provide a hostlist
5294
5295@cindex IDENTITY
5296@cindex identity subsystem
5297@node GNUnet's IDENTITY subsystem
5298@section GNUnet's IDENTITY subsystem
5299
5300@c %**end of header
5301
5302Identities of "users" in GNUnet are called egos.
5303Egos can be used as pseudonyms ("fake names") or be tied to an
5304organization (for example, "GNU") or even the actual identity of a human.
5305GNUnet users are expected to have many egos. They might have one tied to
5306their real identity, some for organizations they manage, and more for
5307different domains where they want to operate under a pseudonym.
5308
5309The IDENTITY service allows users to manage their egos. The identity
service manages the private keys of the egos of the local user; it does not
5311manage identities of other users (public keys). Public keys for other
5312users need names to become manageable. GNUnet uses the
5313@dfn{GNU Name System} (GNS) to give names to other users and manage their
5314public keys securely. This chapter is about the IDENTITY service,
5315which is about the management of private keys.
5316
5317On the network, an ego corresponds to an ECDSA key (over Curve25519,
5318using RFC 6979, as required by GNS). Thus, users can perform actions
5319under a particular ego by using (signing with) a particular private key.
5320Other users can then confirm that the action was really performed by that
5321ego by checking the signature against the respective public key.
5322
5323The IDENTITY service allows users to associate a human-readable name with
5324each ego. This way, users can use names that will remind them of the
5325purpose of a particular ego.
5326The IDENTITY service will store the respective private keys and
5327allows applications to access key information by name.
5328Users can change the name that is locally (!) associated with an ego.
5329Egos can also be deleted, which means that the private key will be removed
5330and it thus will not be possible to perform actions with that ego in the
5331future.
5332
5333Additionally, the IDENTITY subsystem can associate service functions with
5334egos.
5335For example, GNS requires the ego that should be used for the shorten
5336zone. GNS will ask IDENTITY for an ego for the "gns-short" service.
5337The IDENTITY service has a mapping of such service strings to the name of
5338the ego that the user wants to use for this service, for example
5339"my-short-zone-ego".
5340
5341Finally, the IDENTITY API provides access to a special ego, the
5342anonymous ego. The anonymous ego is special in that its private key is not
5343really private, but fixed and known to everyone.
Thus, anyone can perform actions as anonymous. This can be useful because,
with this trick, code does not have to contain a special case to
distinguish between anonymous and pseudonymous egos.
5347
5348@menu
5349* libgnunetidentity::
5350* The IDENTITY Client-Service Protocol::
5351@end menu
5352
5353@cindex libgnunetidentity
5354@node libgnunetidentity
5355@subsection libgnunetidentity
5356@c %**end of header
5357
5358
5359@menu
5360* Connecting to the service::
5361* Operations on Egos::
5362* The anonymous Ego::
5363* Convenience API to lookup a single ego::
5364* Associating egos with service functions::
5365@end menu
5366
5367@node Connecting to the service
5368@subsubsection Connecting to the service
5369
5370@c %**end of header
5371
5372First, typical clients connect to the identity service using
5373@code{GNUNET_IDENTITY_connect}. This function takes a callback as a
5374parameter.
5375If the given callback parameter is non-null, it will be invoked to notify
5376the application about the current state of the identities in the system.
5377
5378@itemize @bullet
5379@item First, it will be invoked on all known egos at the time of the
5380connection. For each ego, a handle to the ego and the user's name for the
5381ego will be passed to the callback. Furthermore, a @code{void **} context
5382argument will be provided which gives the client the opportunity to
5383associate some state with the ego.
5384@item Second, the callback will be invoked with NULL for the ego, the name
5385and the context. This signals that the (initial) iteration over all egos
5386has completed.
5387@item Then, the callback will be invoked whenever something changes about
5388an ego.
5389If an ego is renamed, the callback is invoked with the ego handle of the
5390ego that was renamed, and the new name. If an ego is deleted, the callback
5391is invoked with the ego handle and a name of NULL. In the deletion case,
5392the application should also release resources stored in the context.
5393@item When the application destroys the connection to the identity service
5394using @code{GNUNET_IDENTITY_disconnect}, the callback is again invoked
5395with the ego and a name of NULL (equivalent to deletion of the egos).
5396This should again be used to clean up the per-ego context.
5397@end itemize
5398
5399The ego handle passed to the callback remains valid until the callback is
5400invoked with a name of NULL, so it is safe to store a reference to the
5401ego's handle.
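
The cases a callback has to distinguish can be summarized in a small
helper. The sketch is independent of the real header files: the opaque
@code{struct ego} merely stands in for the ego handle, and the names
@code{ego_event} and @code{classify_ego_event} are hypothetical.

@verbatim
#include <stddef.h>

struct ego;   /* opaque stand-in for the service's ego handle */

enum ego_event
{
  EGO_LIST_COMPLETE,   /* ego == NULL: initial iteration finished */
  EGO_GONE,            /* name == NULL: deleted, or we disconnected */
  EGO_NAMED            /* ego listed, created or renamed */
};

/* Classify one invocation of the callback by its arguments. */
enum ego_event
classify_ego_event (const struct ego *ego, const char *name)
{
  if (NULL == ego)
    return EGO_LIST_COMPLETE;
  if (NULL == name)
    return EGO_GONE;
  return EGO_NAMED;
}
@end verbatim

On @code{EGO_GONE}, an application would release whatever it stored in
the per-ego context and drop any reference to the handle.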
5402
5403@node Operations on Egos
5404@subsubsection Operations on Egos
5405
5406@c %**end of header
5407
5408Given an ego handle, the main operations are to get its associated private
5409key using @code{GNUNET_IDENTITY_ego_get_private_key} or its associated
5410public key using @code{GNUNET_IDENTITY_ego_get_public_key}.
5411
5412The other operations on egos are pretty straightforward.
5413Using @code{GNUNET_IDENTITY_create}, an application can request the
5414creation of an ego by specifying the desired name.
5415The operation will fail if that name is
5416already in use. Using @code{GNUNET_IDENTITY_rename} the name of an
5417existing ego can be changed. Finally, egos can be deleted using
5418@code{GNUNET_IDENTITY_delete}. All of these operations will trigger
5419updates to the callback given to the @code{GNUNET_IDENTITY_connect}
5420function of all applications that are connected with the identity service
5421at the time. @code{GNUNET_IDENTITY_cancel} can be used to cancel the
5422operations before the respective continuations would be called.
5423Cancellation does not guarantee that the operation will not be completed
5424anyway; it only ensures that the continuation will no longer be called.
5425
5426@node The anonymous Ego
5427@subsubsection The anonymous Ego
5428
5429@c %**end of header
5430
5431A special way to obtain an ego handle is to call
5432@code{GNUNET_IDENTITY_ego_get_anonymous}, which returns an ego for the
5433"anonymous" user --- anyone knows and can get the private key for this
5434user, so it is suitable for operations that are supposed to be anonymous
5435but require signatures (for example, to avoid a special path in the code).
5436The anonymous ego is always valid and accessing it does not require a
5437connection to the identity service.
5438
5439@node Convenience API to lookup a single ego
5440@subsubsection Convenience API to lookup a single ego
5441
5442
5443As applications commonly simply have to look up a single ego, there is a
5444convenience API to do just that. Use @code{GNUNET_IDENTITY_ego_lookup} to
5445look up a single ego by name. Note that this is the user's name for the
5446ego, not the service function. The resulting ego will be returned via a
5447callback and will only be valid during that callback. The operation can
5448be cancelled via @code{GNUNET_IDENTITY_ego_lookup_cancel}
5449(cancellation is only legal before the callback is invoked).
5450
5451@node Associating egos with service functions
5452@subsubsection Associating egos with service functions
5453
5454
5455The @code{GNUNET_IDENTITY_set} function is used to associate a particular
5456ego with a service function. The name used by the service and the ego are
5457given as arguments.
5458Afterwards, the service can use its name to look up the associated ego
5459using @code{GNUNET_IDENTITY_get}.
5460
5461@node The IDENTITY Client-Service Protocol
5462@subsection The IDENTITY Client-Service Protocol
5463
5464@c %**end of header
5465
5466A client connecting to the identity service first sends a message with
5467type
5468@code{GNUNET_MESSAGE_TYPE_IDENTITY_START} to the service. After that, the
5469client will receive information about changes to the egos by receiving
5470messages of type @code{GNUNET_MESSAGE_TYPE_IDENTITY_UPDATE}.
5471Those messages contain the private key of the ego and the user's name of
5472the ego (or zero bytes for the name to indicate that the ego was deleted).
5473A special bit @code{end_of_list} is used to indicate the end of the
5474initial iteration over the identity service's egos.
5475
5476The client can trigger changes to the egos by sending @code{CREATE},
5477@code{RENAME} or @code{DELETE} messages.
5478The CREATE message contains the private key and the desired name.@
5479The RENAME message contains the old name and the new name.@
5480The DELETE message only needs to include the name of the ego to delete.@
5481The service responds to each of these messages with a @code{RESULT_CODE}
5482message which indicates success or error of the operation, and possibly
5483a human-readable error message.
5484
5485Finally, the client can bind the name of a service function to an ego by
5486sending a @code{SET_DEFAULT} message with the name of the service function
5487and the private key of the ego.
5488Such bindings can then be resolved using a @code{GET_DEFAULT} message,
5489which includes the name of the service function. The identity service
5490will respond to a GET_DEFAULT request with a SET_DEFAULT message
5491containing the respective information, or with a RESULT_CODE to
5492indicate an error.
5493
5494@cindex NAMESTORE
5495@cindex namestore subsystem
5496@node GNUnet's NAMESTORE Subsystem
5497@section GNUnet's NAMESTORE Subsystem
5498
5499The NAMESTORE subsystem provides persistent storage for local GNS zone
5500information. All local GNS zone information is managed by NAMESTORE. It
5501provides both the functionality to administer local GNS information (e.g.
5502delete and add records) as well as to retrieve GNS information (e.g. to
5503list name information in a client).
5504NAMESTORE only manages the persistent storage of zone information
5505belonging to the user running the service: GNS information from other
5506users obtained from the DHT is stored by the NAMECACHE subsystem.
5507
5508NAMESTORE uses a plugin-based database backend to store GNS information
5509with good performance. Here sqlite, MySQL and PostgreSQL are supported
5510database backends.
5511NAMESTORE clients interact with the IDENTITY subsystem to obtain
5512cryptographic information about zones based on egos as described with the
5513IDENTITY subsystem, but internally NAMESTORE refers to zones using the
5514ECDSA private key.
5515In addition, it collaborates with the NAMECACHE subsystem and
5516stores zone information in the GNS cache when local information is
5517modified, to increase look-up performance for local information.
5518
5519NAMESTORE provides functionality to look up and store records, to iterate
5520over a specific or all zones and to monitor zones for changes. NAMESTORE
5521functionality can be accessed using the NAMESTORE API or the NAMESTORE
5522command line tool.
5523
5524@menu
5525* libgnunetnamestore::
5526@end menu
5527
5528@cindex libgnunetnamestore
5529@node libgnunetnamestore
5530@subsection libgnunetnamestore
5531
5532To interact with NAMESTORE, clients first connect to the NAMESTORE service
5533using @code{GNUNET_NAMESTORE_connect}, passing a configuration handle.
5534On success they obtain a NAMESTORE handle to use for further operations;
5535NULL is returned if the connection failed.
5536
5537To disconnect from NAMESTORE, clients use
5538@code{GNUNET_NAMESTORE_disconnect} and specify the handle to disconnect.
5539
5540NAMESTORE internally uses the ECDSA private key to refer to zones. These
5541private keys can be obtained from the IDENTITY subsystem.
5542Here @emph{egos} can be used to refer to zones, or the default ego
5543assigned to the GNS subsystem can be used to obtain the master zone's
5544private key.
5545
5546
5547@menu
5548* Editing Zone Information::
5549* Iterating Zone Information::
5550* Monitoring Zone Information::
5551@end menu
5552
5553@node Editing Zone Information
5554@subsubsection Editing Zone Information
5555
5556@c %**end of header
5557
5558NAMESTORE provides functions to lookup records stored under a label in a
5559zone and to store records under a label in a zone.
5560
5561To store (and delete) records, the client uses the
5562@code{GNUNET_NAMESTORE_records_store} function and has to provide the
5563namestore handle to use, the private key of the zone, the label to store
5564the records under, the records and number of records plus a callback
5565function.
5566After the operation is performed, NAMESTORE will call the provided
5567callback function with the result: GNUNET_SYSERR on failure
5568(including timeout/queue drop/failure to validate), GNUNET_NO if content
5569was already there or not found, GNUNET_YES (or other positive value) on
5570success, plus an additional error message.
5571
5572Records are deleted by using the store command with 0 records to store.
5573It is important to note that records are not merged when records already
5574exist under the label: a store replaces them.
5575So a client first has to retrieve the existing records, merge them with
5576the new records and then store the result.
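
The merge step a client has to perform itself can be sketched as follows.
This is a minimal illustration only: @code{struct Record} and
@code{merge_records} are hypothetical names made up for this example, and the
record layout is much simpler than real GNS record data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record: a type number and a printable value. */
struct Record
{
  unsigned int type;
  char value[32];
};

/* Two records are duplicates if both type and value agree. */
static int
same_record (const struct Record *a, const struct Record *b)
{
  return (a->type == b->type) && (0 == strcmp (a->value, b->value));
}

/* Merge the existing records with the records to add, skipping duplicates.
   Returns the number of records written to out, or -1 if out (with room
   for cap records) is too small. The merged array is what the client would
   then hand to the store operation. */
static int
merge_records (const struct Record *old, size_t n_old,
               const struct Record *add, size_t n_add,
               struct Record *out, size_t cap)
{
  size_t n = 0;

  assert (NULL != out);
  for (size_t i = 0; i < n_old + n_add; i++)
  {
    const struct Record *r = (i < n_old) ? &old[i] : &add[i - n_old];
    int dup = 0;

    for (size_t j = 0; j < n; j++)
      if (same_record (&out[j], r))
      {
        dup = 1;
        break;
      }
    if (dup)
      continue;
    if (n == cap)
      return -1;
    out[n++] = *r;
  }
  return (int) n;
}
```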
5577
5578To perform a lookup operation, the client uses the
5579@code{GNUNET_NAMESTORE_records_lookup} function. Here the client has to pass
5580the namestore handle, the private key of the zone and the label. It also has
5581to provide a callback function which will be called with the result of
5582the lookup operation:
5583the zone for the records, the label, and the records including the
5584number of records included.
5585
5586A special operation is used to set the preferred nickname for a zone.
5587This nickname is stored with the zone and is automatically merged with
5588all labels and records stored in a zone. Here the client uses the
5589@code{GNUNET_NAMESTORE_set_nick} function and passes the private key of
5590the zone, the nickname as a string plus a callback with the result of
5591the operation.
5592
5593@node Iterating Zone Information
5594@subsubsection Iterating Zone Information
5595
5596@c %**end of header
5597
5598A client can iterate over all information in a zone or all zones managed
5599by NAMESTORE.
5600Here a client uses the @code{GNUNET_NAMESTORE_zone_iteration_start}
5601function and passes the namestore handle, the zone to iterate over and a
5602callback function to call with the result.
5603If the client wants to iterate over all zones, it passes NULL for the zone.
5604A @code{GNUNET_NAMESTORE_ZoneIterator} handle is returned to be used to
5605continue iteration.
5606
5607NAMESTORE calls the callback for every result and expects the client to
5608call @code{GNUNET_NAMESTORE_zone_iterator_next} to continue to iterate or
5609@code{GNUNET_NAMESTORE_zone_iterator_stop} to interrupt the iteration.
5610When NAMESTORE has reached the last item, it will call the callback with a
5611NULL value to indicate the end of the iteration.
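
The next/stop control flow described above can be illustrated with a small
in-memory mock. None of the names below belong to the NAMESTORE API:
@code{struct ZoneIterator}, @code{iterator_next}, @code{iterator_stop} and
@code{run_iteration} are hypothetical and only mimic the pattern in which the
client drives the iteration from within its callback.

```c
#include <assert.h>
#include <stddef.h>

struct ZoneIterator;

/* Result callback: label is NULL once the last item has been passed. */
typedef void (*RecordCallback) (void *cls,
                                struct ZoneIterator *it,
                                const char *label);

/* Hypothetical in-memory iterator standing in for service-side state. */
struct ZoneIterator
{
  const char **labels;
  size_t count;
  size_t pos;
  RecordCallback cb;
  void *cls;
  int stopped;
};

/* Deliver the next result, or NULL after the last item. */
static void
iterator_next (struct ZoneIterator *it)
{
  assert (NULL != it);
  if (it->stopped)
    return;
  if (it->pos < it->count)
  {
    const char *label = it->labels[it->pos++];

    it->cb (it->cls, it, label);
  }
  else
  {
    it->stopped = 1;
    it->cb (it->cls, it, NULL);
  }
}

/* Interrupt the iteration; no further callbacks are made. */
static void
iterator_stop (struct ZoneIterator *it)
{
  it->stopped = 1;
}

struct Stats { int seen; int finished; int limit; };

/* Client callback: count results, stop early once limit is reached. */
static void
count_cb (void *cls, struct ZoneIterator *it, const char *label)
{
  struct Stats *s = cls;

  if (NULL == label)
  {
    s->finished = 1;
    return;
  }
  s->seen++;
  if (s->seen == s->limit)
    iterator_stop (it);
  else
    iterator_next (it);
}

/* Run one iteration over labels; a limit of 0 means "never stop early". */
static struct Stats
run_iteration (const char **labels, size_t count, int limit)
{
  struct Stats s = { 0, 0, limit };
  struct ZoneIterator it = { labels, count, 0, &count_cb, &s, 0 };

  iterator_next (&it); /* in this mock, this requests the first result */
  return s;
}
```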
5612
5613@node Monitoring Zone Information
5614@subsubsection Monitoring Zone Information
5615
5616@c %**end of header
5617
5618Clients can also monitor zones to be notified about changes. Here the
5619client uses the @code{GNUNET_NAMESTORE_zone_monitor_start} function and
5620passes the private key of the zone and a callback function to call
5621with updates for a zone.
5622The client can specify to obtain zone information first by iterating over
5623the zone and specify a synchronization callback to be called when the
5624client and the namestore are synced.
5625
5626On an update, NAMESTORE will call the callback with the private key of the
5627zone, the label and the records and their number.
5628
5629To stop monitoring, the client calls
5630@code{GNUNET_NAMESTORE_zone_monitor_stop} and passes the handle obtained
5631from the function to start the monitoring.
5632
5633@cindex PEERINFO
5634@cindex peerinfo subsystem
5635@node GNUnet's PEERINFO subsystem
5636@section GNUnet's PEERINFO subsystem
5637
5638@c %**end of header
5639
5640The PEERINFO subsystem is used to store verified (validated) information
5641about known peers in a persistent way. It obtains these addresses, for
5642example, from the TRANSPORT service, which is in charge of address validation.
5643Validation means that the information in the HELLO message is checked by
5644connecting to the addresses and performing a cryptographic handshake to
5645authenticate the peer instance claiming to be reachable at these
5646addresses.
5647PEERINFO does not validate the HELLO messages itself but only stores them
5648and gives them to interested clients.
5649
5650As future work, we are considering moving from storing just HELLO messages to
5651providing a generic persistent per-peer information store.
5652More and more subsystems need to store per-peer information in a
5653persistent way.
5654To avoid duplicating this functionality, we plan to provide it in a
5655PEERSTORE service.
5656
5657@menu
5658* Features2::
5659* Limitations3::
5660* DeveloperPeer Information::
5661* Startup::
5662* Managing Information::
5663* Obtaining Information::
5664* The PEERINFO Client-Service Protocol::
5665* libgnunetpeerinfo::
5666@end menu
5667
5668@node Features2
5669@subsection Features
5670
5671@c %**end of header
5672
5673@itemize @bullet
5674@item Persistent storage
5675@item Client notification mechanism on update
5676@item Periodic clean up for expired information
5677@item Differentiation between public and friend-only HELLO
5678@end itemize
5679
5680@node Limitations3
5681@subsection Limitations
5682
5683
5684@itemize @bullet
5685@item Does not perform HELLO validation
5686@end itemize
5687
5688@node DeveloperPeer Information
5689@subsection Peer Information
5690
5691@c %**end of header
5692
5693The PEERINFO subsystem stores this information in the form of HELLO messages,
5694which you can think of as business cards. A HELLO message contains the public
5695key of a peer and the addresses the peer can be reached under. The addresses
5696include an expiration date describing how long they are valid. This information
5697is updated regularly by the TRANSPORT service by revalidating the address. If an
5698address is expired and not renewed, it can be removed from the HELLO message.
5699
5700Some peers do not want to have their HELLO messages distributed to other peers,
5701especially when GNUnet's friend-to-friend mode is enabled. To prevent this
5702undesired distribution, PEERINFO distinguishes between @emph{public} and
5703@emph{friend-only} HELLO messages. Public HELLO messages can be freely
5704distributed to other (possibly unknown) peers (for example using the hostlist,
5705gossiping, broadcasting), whereas friend-only HELLO messages may not be
5706distributed to other peers. Friend-only HELLO messages have an additional flag
5707@code{friend_only} set internally. For public HELLO messages this flag is not
5708set. PEERINFO does not and cannot check if a client is allowed to obtain a
5709specific HELLO type.
5710
5711The HELLO messages can be managed using the GNUnet HELLO library. Other GNUnet
5712subsystems can obtain this information from PEERINFO and use it for their
5713purposes. Clients are for example the HOSTLIST component providing this
5714information to other peers in the form of a hostlist or the TRANSPORT subsystem
5715using this information to maintain connections to other peers.
5716
5717@node Startup
5718@subsection Startup
5719
5720@c %**end of header
5721
5722During startup the PEERINFO service loads persistent HELLOs from disk. First
5723PEERINFO parses the directory configured in the HOSTS value of the
5724@code{PEERINFO} configuration section to store PEERINFO information.@ For all
5725files found in this directory valid HELLO messages are extracted. In addition
5726it loads HELLO messages shipped with the GNUnet distribution. These HELLOs are
5727used to simplify network bootstrapping by providing valid peer information with
5728the distribution. The use of these HELLOs can be prevented by setting the
5729@code{USE_INCLUDED_HELLOS} in the @code{PEERINFO} configuration section to
5730@code{NO}. Files containing invalid information are removed.
5731
5732@node Managing Information
5733@subsection Managing Information
5734
5735@c %**end of header
5736
5737The PEERINFO service stores information about known peers and a single HELLO
5738message for every peer. A peer does not need to have a HELLO if no information
5739is available. HELLO information from different sources, for example a HELLO
5740obtained from a remote HOSTLIST and a second HELLO stored on disk, is combined
5741and merged into one single HELLO message per peer which will be given to
5742clients. During this merge process the HELLO is immediately written to disk to
5743ensure persistence.
5744
5745In addition, PEERINFO periodically scans the directory where information is
5746stored for HELLO messages that are empty or contain expired TRANSPORT addresses.@ This
5747periodic task scans all files in the directory and recreates the HELLO messages
5748it finds. Expired TRANSPORT addresses are removed from the HELLO and if the
5749HELLO does not contain any valid addresses, it is discarded and removed from
5750disk.
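
The clean-up rule for a single HELLO can be sketched in a few lines of C.
This is a simplified model and not the PEERINFO implementation:
@code{struct Address}, @code{struct Hello} and @code{prune_hello} are
hypothetical, and expiration times are plain @code{time_t} values here.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define MAX_ADDRS 8

/* Hypothetical address entry with its expiration time. */
struct Address
{
  const char *uri;
  time_t expiration;
};

/* Hypothetical, simplified HELLO: a bounded list of addresses. */
struct Hello
{
  struct Address addrs[MAX_ADDRS];
  size_t n_addrs;
};

/* Remove expired addresses in place.
   Returns 1 if the HELLO still contains at least one valid address and
   should be kept, 0 if it is now empty and should be discarded. */
static int
prune_hello (struct Hello *h, time_t now)
{
  size_t kept = 0;

  assert (NULL != h);
  assert (h->n_addrs <= MAX_ADDRS);
  for (size_t i = 0; i < h->n_addrs; i++)
    if (h->addrs[i].expiration > now)
      h->addrs[kept++] = h->addrs[i];
  h->n_addrs = kept;
  return kept > 0;
}
```

A caller modelling the periodic task would invoke @code{prune_hello} for every
stored HELLO and remove the corresponding file when it returns 0.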
5751
5752@node Obtaining Information
5753@subsection Obtaining Information
5754
5755@c %**end of header
5756
5757When a client requests information from PEERINFO, PEERINFO performs a lookup
5758for the respective peer or all peers if desired and transmits this information
5759to the client. The client can specify if friend-only HELLOs have to be included
5760or not and PEERINFO filters the respective HELLO messages before transmitting
5761information.
5762
5763To notify clients about changes to PEERINFO information, PEERINFO maintains a
5764list of clients interested in these notifications. Such a notification occurs if
5765a HELLO for a peer was updated (due to a merge for example) or a new peer was
5766added.
5767
5768@node The PEERINFO Client-Service Protocol
5769@subsection The PEERINFO Client-Service Protocol
5770
5771@c %**end of header
5772
5773To connect to and disconnect from the PEERINFO service, PEERINFO utilizes
5774the util client/server infrastructure, so no special message types are used
5775here.
5776
5777To add information for a peer, the plain HELLO message is transmitted to the
5778service without any wrapping. All required information is stored within the
5779HELLO message. The PEERINFO service provides a message handler accepting and
5780processing these HELLO messages.
5781
5782When obtaining PEERINFO information using the iterate functionality, specific
5783messages are used. To obtain information for all peers, a @code{struct
5784ListAllPeersMessage} with message type
5785@code{GNUNET_MESSAGE_TYPE_PEERINFO_GET_ALL} and a flag include_friend_only to
5786indicate if friend-only HELLO messages should be included is transmitted. If
5787information for a specific peer is required, a @code{struct ListPeerMessage}
5788with @code{GNUNET_MESSAGE_TYPE_PEERINFO_GET} containing the peer identity is
5789used.
5790
5791For both variants the PEERINFO service replies for each HELLO message it wants
5792to transmit with a @code{struct InfoMessage} with type
5793@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO} containing the plain HELLO. The final
5794message is a @code{struct GNUNET_MessageHeader} with type
5795@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO_END}. If the client receives this
5796message, it can proceed with the next request if any is pending.
5797
5798@node libgnunetpeerinfo
5799@subsection libgnunetpeerinfo
5800
5801@c %**end of header
5802
5803The PEERINFO API consists mainly of three different functionalities:
5804maintaining a connection to the service, adding new information and retrieving
5805information from the PEERINFO service.
5806
5807
5808@menu
5809* Connecting to the Service::
5810* Adding Information::
5811* Obtaining Information2::
5812@end menu
5813
5814@node Connecting to the Service
5815@subsubsection Connecting to the Service
5816
5817@c %**end of header
5818
5819To connect to the PEERINFO service, the function @code{GNUNET_PEERINFO_connect}
5820is used, taking a configuration handle as an argument. To disconnect from
5821PEERINFO, the function @code{GNUNET_PEERINFO_disconnect} has to be called with
5822the PEERINFO handle returned from the connect function.
5823
5824@node Adding Information
5825@subsubsection Adding Information
5826
5827@c %**end of header
5828
5829@code{GNUNET_PEERINFO_add_peer} adds a new peer to the PEERINFO subsystem
5830storage. This function takes the PEERINFO handle as an argument, the HELLO
5831message to store and a continuation with a closure to be called with the result
5832of the operation. @code{GNUNET_PEERINFO_add_peer} returns a handle to this
5833operation, which allows cancelling it with the respective cancel function
5834@code{GNUNET_PEERINFO_add_peer_cancel}. To retrieve information from PEERINFO
5835you can iterate over all information stored with PEERINFO or you can tell
5836PEERINFO to notify you when new peer information is available.
5837
5838@node Obtaining Information2
5839@subsubsection Obtaining Information
5840
5841@c %**end of header
5842
5843To iterate over information in PEERINFO you use @code{GNUNET_PEERINFO_iterate}.
5844This function expects the PEERINFO handle, a flag indicating whether HELLO
5845messages intended for friend-only mode should be included, a timeout for how
5846long the operation should take and a callback with a callback closure to be
5847called for the results. If you want to obtain information for a specific peer,
5848you can specify the peer identity; if this identity is NULL, information for
5849all peers is returned. The function returns a handle that allows cancelling the
5850operation using @code{GNUNET_PEERINFO_iterate_cancel}.
5851
5852To get notified when peer information changes, you can use
5853@code{GNUNET_PEERINFO_notify}. This function expects a configuration handle and
5854a flag indicating whether friend-only HELLO messages should be included. The
5855PEERINFO service will call the callback function to notify you about every
5856change. The function returns a handle to cancel notifications
5857with @code{GNUNET_PEERINFO_notify_cancel}.
5858
5859
5860@node GNUnet's PEERSTORE subsystem
5861@section GNUnet's PEERSTORE subsystem
5862
5863@c %**end of header
5864
5865GNUnet's PEERSTORE subsystem offers persistent per-peer storage for other
5866GNUnet subsystems. GNUnet subsystems can use PEERSTORE to persistently store
5867and retrieve arbitrary data. Each data record stored with PEERSTORE contains
5868the following fields:
5869
5870@itemize @bullet
5871@item subsystem: Name of the subsystem responsible for the record.
5872@item peerid: Identity of the peer this record is related to.
5873@item key: a key string identifying the record.
5874@item value: binary record value.
5875@item expiry: record expiry date.
5876@end itemize
5877
5878@menu
5879* Functionality::
5880* Architecture::
5881* libgnunetpeerstore::
5882@end menu
5883
5884@node Functionality
5885@subsection Functionality
5886
5887@c %**end of header
5888
5889Subsystems can store any type of value under a (subsystem, peerid, key)
5890combination. A "replace" flag set during store operations forces the PEERSTORE
5891to replace any old values stored under the same (subsystem, peerid, key)
5892combination with the new value. Additionally, an expiry date is set after which
5893the record is @emph{possibly} deleted by PEERSTORE.
5894
5895Subsystems can iterate over all values stored under any of the following
5896combinations of fields:
5897
5898@itemize @bullet
5899@item (subsystem)
5900@item (subsystem, peerid)
5901@item (subsystem, key)
5902@item (subsystem, peerid, key)
5903@end itemize
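
The four combinations above amount to treating a missing peerid or key as a
wildcard. The following sketch shows that matching rule on an in-memory
array; @code{struct StoreRecord}, @code{record_matches} and
@code{count_matches} are hypothetical names, and peer identities are
represented as plain strings purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record: only the fields used for matching. */
struct StoreRecord
{
  const char *sub_system;
  const char *peer;
  const char *key;
};

/* The subsystem is mandatory; a NULL peer or key matches any value. */
static int
record_matches (const struct StoreRecord *r,
                const char *sub_system,
                const char *peer,
                const char *key)
{
  assert (NULL != r);
  assert (NULL != sub_system);
  if (0 != strcmp (r->sub_system, sub_system))
    return 0;
  if ((NULL != peer) && (0 != strcmp (r->peer, peer)))
    return 0;
  if ((NULL != key) && (0 != strcmp (r->key, key)))
    return 0;
  return 1;
}

/* Count the records an iteration with the given filter would visit. */
static size_t
count_matches (const struct StoreRecord *recs, size_t n,
               const char *sub_system,
               const char *peer,
               const char *key)
{
  size_t hits = 0;

  for (size_t i = 0; i < n; i++)
    if (record_matches (&recs[i], sub_system, peer, key))
      hits++;
  return hits;
}
```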
5904
5905Subsystems can also request to be notified about any new values stored under a
5906(subsystem, peerid, key) combination by sending a "watch" request to
5907PEERSTORE.
5908
5909@node Architecture
5910@subsection Architecture
5911
5912@c %**end of header
5913
5914PEERSTORE implements the following components:
5915
5916@itemize @bullet
5917@item PEERSTORE service: Handles store, iterate and watch operations.
5918@item PEERSTORE API: API to be used by other subsystems to communicate and
5919issue commands to the PEERSTORE service.
5920@item PEERSTORE plugins: Handle the persistent storage. At the moment, only an
5921"sqlite" plugin is implemented.
5922@end itemize
5923
5924@node libgnunetpeerstore
5925@subsection libgnunetpeerstore
5926
5927@c %**end of header
5928
5929libgnunetpeerstore is the library containing the PEERSTORE API. Subsystems
5930wishing to communicate with the PEERSTORE service use this API to open a
5931connection to PEERSTORE. This is done by calling
5932@code{GNUNET_PEERSTORE_connect} which returns a handle to the newly created
5933connection. This handle has to be used with any further calls to the API.
5934
5935To store a new record, the function @code{GNUNET_PEERSTORE_store} is to be used
5936which requires the record fields and a continuation function that will be
5937called by the API after the STORE request is sent to the PEERSTORE service.
5938Note that calling the continuation function does not mean that the record is
5939successfully stored, only that the STORE request has been successfully sent to
5940the PEERSTORE service. @code{GNUNET_PEERSTORE_store_cancel} can be called to
5941cancel the STORE request only before the continuation function has been called.
5942
5943To iterate over stored records, the function @code{GNUNET_PEERSTORE_iterate} is
5944to be used. @emph{peerid} and @emph{key} can be set to NULL. An iterator
5945callback function will be called with each matching record found and a NULL
5946record at the end to signal the end of result set.
5947@code{GNUNET_PEERSTORE_iterate_cancel} can be used to cancel the ITERATE
5948request before the iterator callback is called with a NULL record.
5949
5950To be notified about new values stored under a (subsystem, peerid, key)
5951combination, the function @code{GNUNET_PEERSTORE_watch} is to be used. This
5952will register the watcher with the PEERSTORE service; any new records matching
5953the given combination will trigger the callback function passed to
5954@code{GNUNET_PEERSTORE_watch}. This continues until
5955@code{GNUNET_PEERSTORE_watch_cancel} is called or the connection to the service
5956is destroyed.
5957
5958After the connection is no longer needed, the function
5959@code{GNUNET_PEERSTORE_disconnect} can be called to disconnect from the
5960PEERSTORE service. Any pending ITERATE or WATCH requests will be destroyed. If
5961the @code{sync_first} flag is set to @code{GNUNET_YES}, the API will delay the
5962disconnection until all pending STORE requests are sent to the PEERSTORE
5963service; otherwise, the pending STORE requests will be destroyed as well.
5964
5965@node GNUnet's SET Subsystem
5966@section GNUnet's SET Subsystem
5967
5968@c %**end of header
5969
5970The SET service implements efficient set operations between two peers over a
5971mesh tunnel. Currently, set union and set intersection are the only supported
5972operations. Elements of a set consist of an @emph{element type} and arbitrary
5973binary @emph{data}. The size of an element's data is limited to around 62
5974KB.
5975
5976@menu
5977* Local Sets::
5978* Set Modifications::
5979* Set Operations::
5980* Result Elements::
5981* libgnunetset::
5982* The SET Client-Service Protocol::
5983* The SET Intersection Peer-to-Peer Protocol::
5984* The SET Union Peer-to-Peer Protocol::
5985@end menu
5986
5987@node Local Sets
5988@subsection Local Sets
5989
5990@c %**end of header
5991
5992Sets created by a local client can be modified and reused for multiple
5993operations. As each set operation requires potentially expensive special
5994auxiliary data to be computed for each element of a set, a set can only
5995participate in one type of set operation (i.e. union or intersection). The type
5996of a set is determined upon its creation. If the elements of a set are needed
5997for an operation of a different type, all of the set's elements must be copied
5998to a new set of appropriate type.
5999
6000@node Set Modifications
6001@subsection Set Modifications
6002
6003@c %**end of header
6004
6005Even when set operations are active, one can add to and remove elements from a
6006set. However, these changes will only be visible to operations that have been
6007created after the changes have taken place. That is, every set operation only
6008sees a snapshot of the set from the time the operation was started. This
6009mechanism is @emph{not} implemented by copying the whole set, but by attaching
6010@emph{generation information} to each element and operation.
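
One way to picture generation information is the sketch below. It is an
assumption about how such a scheme can work in general, not a description of
the SET service internals: every name (@code{struct Element},
@code{struct Set}, @code{operation_start}, @code{snapshot_size}) is
hypothetical. Each modification bumps a generation counter, and an operation
only sees elements that were added at or before its own generation and not
removed by then.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ELEMS 16
#define GEN_NEVER 0xFFFFFFFFu

/* Hypothetical element annotated with the generations of its changes. */
struct Element
{
  int value;
  unsigned int added;   /* generation in which the element was added */
  unsigned int removed; /* generation of removal, or GEN_NEVER */
};

struct Set
{
  struct Element elems[MAX_ELEMS];
  size_t n;
  unsigned int generation;
};

static void
set_init (struct Set *s)
{
  s->n = 0;
  s->generation = 0;
}

/* Adding an element starts a new generation. */
static void
set_add (struct Set *s, int value)
{
  assert (s->n < MAX_ELEMS);
  s->generation++;
  s->elems[s->n].value = value;
  s->elems[s->n].added = s->generation;
  s->elems[s->n].removed = GEN_NEVER;
  s->n++;
}

/* Removal only marks the element; older snapshots can still see it. */
static void
set_remove (struct Set *s, int value)
{
  s->generation++;
  for (size_t i = 0; i < s->n; i++)
    if ((s->elems[i].value == value) && (GEN_NEVER == s->elems[i].removed))
      s->elems[i].removed = s->generation;
}

/* An operation remembers the generation at which it was started. */
static unsigned int
operation_start (const struct Set *s)
{
  return s->generation;
}

/* Number of elements visible to an operation started at op_gen. */
static size_t
snapshot_size (const struct Set *s, unsigned int op_gen)
{
  size_t visible = 0;

  for (size_t i = 0; i < s->n; i++)
    if ((s->elems[i].added <= op_gen) && (s->elems[i].removed > op_gen))
      visible++;
  return visible;
}
```

The set itself is never copied: a running operation keeps only the single
integer returned by @code{operation_start}.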
6011
6012@node Set Operations
6013@subsection Set Operations
6014
6015@c %**end of header
6016
6017Set operations can be started in two ways: Either by accepting an operation
6018request from a remote peer, or by requesting a set operation from a remote
6019peer. Set operations are uniquely identified by the involved @emph{peers}, an
6020@emph{application id} and the @emph{operation type}.
6021
6022The client is notified of incoming set operations by @emph{set listeners}. A
6023set listener listens for incoming operations of a specific operation type and
6024application id. Once notified of an incoming set request, the client can
6025accept the set request (providing a local set for the operation) or reject
6026it.
6027
6028@node Result Elements
6029@subsection Result Elements
6030
6031@c %**end of header
6032
6033The SET service has three @emph{result modes} that determine how an operation's
6034result set is delivered to the client:
6035
6036@itemize @bullet
6037@item @strong{Full Result Set.} All elements of the set resulting from the set
6038operation are returned to the client.
6039@item @strong{Added Elements.} Only elements that result from the operation and
6040are not already in the local peer's set are returned. Note that for some
6041operations (like set intersection) this result mode will never return any
6042elements. This can be useful if only the remote peer is actually interested in
6043the result of the set operation.
6044@item @strong{Removed Elements.} Only elements that are in the local peer's
6045initial set but not in the operation's result set are returned. Note that for
6046some operations (like set union) this result mode will never return any
6047elements. This can be useful if only the remote peer is actually interested in
6048the result of the set operation.
6049@end itemize
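
The three result modes can be made concrete with a small sketch that operates
on plain integer arrays. The names @code{enum ResultMode}, @code{contains}
and @code{deliver} are hypothetical and unrelated to the actual libgnunetset
types; the code only shows which elements each mode would hand to the client
for a given local set and operation result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three result modes described above. */
enum ResultMode { RESULT_FULL, RESULT_ADDED, RESULT_REMOVED };

/* Linear membership test on a small array. */
static int
contains (const int *set, size_t n, int x)
{
  for (size_t i = 0; i < n; i++)
    if (set[i] == x)
      return 1;
  return 0;
}

/* Write the elements the client would receive into out and return their
   number. out must have room for n_local + n_result elements. */
static size_t
deliver (enum ResultMode mode,
         const int *local, size_t n_local,
         const int *result, size_t n_result,
         int *out)
{
  size_t n = 0;

  assert (NULL != out);
  switch (mode)
  {
  case RESULT_FULL:
    /* every element of the operation's result set */
    for (size_t i = 0; i < n_result; i++)
      out[n++] = result[i];
    break;
  case RESULT_ADDED:
    /* result elements that the local set did not contain before */
    for (size_t i = 0; i < n_result; i++)
      if (! contains (local, n_local, result[i]))
        out[n++] = result[i];
    break;
  case RESULT_REMOVED:
    /* local elements that are missing from the result set */
    for (size_t i = 0; i < n_local; i++)
      if (! contains (result, n_result, local[i]))
        out[n++] = local[i];
    break;
  }
  return n;
}
```

With a local set of 1, 2, 3, a union result of 1, 2, 3, 4 yields only the
element 4 in the added-elements mode and nothing in the removed-elements
mode, matching the notes in the list above.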
6050
6051@node libgnunetset
6052@subsection libgnunetset
6053
6054@c %**end of header
6055
6056@menu
6057* Sets::
6058* Listeners::
6059* Operations::
6060* Supplying a Set::
6061* The Result Callback::
6062@end menu
6063
6064@node Sets
6065@subsubsection Sets
6066
6067@c %**end of header
6068
6069New sets are created with @code{GNUNET_SET_create}. Both the local peer's
6070configuration (as each set has its own client connection) and the operation
6071type must be specified. The set exists until either the client calls
6072@code{GNUNET_SET_destroy} or the client's connection to the service is
6073disrupted. In the latter case, the client is notified by the return value of
6074functions dealing with sets. This return value must always be checked.
6075
6076Elements are added and removed with @code{GNUNET_SET_add_element} and
6077@code{GNUNET_SET_remove_element}.
6078
6079@node Listeners
6080@subsubsection Listeners
6081
6082@c %**end of header
6083
6084Listeners are created with @code{GNUNET_SET_listen}. Each time a remote
6085peer suggests a set operation with an application id and operation type
6086matching a listener, the listener's callback is invoked. The client then must
6087synchronously call either @code{GNUNET_SET_accept} or @code{GNUNET_SET_reject}.
6088Note that the operation will not be started until the client calls
6089@code{GNUNET_SET_commit} (see Section "Supplying a Set").
6090
6091@node Operations
6092@subsubsection Operations
6093
6094@c %**end of header
6095
6096Operations to be initiated by the local peer are created with
6097@code{GNUNET_SET_prepare}. Note that the operation will not be started until
6098the client calls @code{GNUNET_SET_commit} (see Section "Supplying a
6099Set").
6100
6101@node Supplying a Set
6102@subsubsection Supplying a Set
6103
6104@c %**end of header
6105
6106To create symmetry between the two ways of starting a set operation (accepting
6107and initiating it), the operation handles returned by @code{GNUNET_SET_accept}
6108and @code{GNUNET_SET_prepare} do not yet have a set to operate on, thus they
6109cannot do any work yet.
6110
6111The client must call @code{GNUNET_SET_commit} to specify a set to use for an
6112operation. @code{GNUNET_SET_commit} may only be called once per set
6113operation.
6114
6115@node The Result Callback
6116@subsubsection The Result Callback
6117
6118@c %**end of header
6119
Clients must specify both a result mode and a result callback with
@code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare}. The result callback
is invoked with a status indicating either that an element was received, or
that the operation failed or succeeded. The interpretation of the received
element depends on the result mode. The callback needs to know which result
mode it is used in, as the arguments do not indicate whether an element is
part of the full result set or of the difference between the original set and
the final set.
6127
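Because the callback arguments do not carry the result mode, one way to handle
this is to store the mode in the callback's closure. The following C sketch
illustrates the idea. All names (@code{sketch_status},
@code{sketch_result_mode}, @code{sketch_closure}, @code{sketch_result_cb}) are
hypothetical and do not reproduce the actual SET API types or signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status values passed to the callback. */
enum sketch_status
{
  SKETCH_STATUS_ELEMENT,
  SKETCH_STATUS_DONE,
  SKETCH_STATUS_FAILURE
};

/* Hypothetical result modes chosen when the operation is created. */
enum sketch_result_mode
{
  SKETCH_RESULT_FULL,      /* elements belong to the full result set */
  SKETCH_RESULT_DIFFERENCE /* elements are changes to the original set */
};

/* The closure remembers the mode, since the arguments do not. */
struct sketch_closure
{
  enum sketch_result_mode mode;
  unsigned int result_elements;
  unsigned int changed_elements;
  int outcome; /* 0 = running, 1 = succeeded, -1 = failed */
};

void
sketch_result_cb (void *cls, const char *element, enum sketch_status status)
{
  struct sketch_closure *c = cls;

  switch (status)
  {
  case SKETCH_STATUS_ELEMENT:
    if (NULL == element)
      break;
    /* The same argument means different things in different modes. */
    if (SKETCH_RESULT_FULL == c->mode)
      c->result_elements++;
    else
      c->changed_elements++;
    break;
  case SKETCH_STATUS_DONE:
    c->outcome = 1;
    break;
  case SKETCH_STATUS_FAILURE:
    c->outcome = -1;
    break;
  }
}
```
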
6128@node The SET Client-Service Protocol
6129@subsection The SET Client-Service Protocol
6130
6131@c %**end of header
6132
6133@menu
6134* Creating Sets::
6135* Listeners2::
6136* Initiating Operations::
6137* Modifying Sets::
6138* Results and Operation Status::
6139* Iterating Sets::
6140@end menu
6141
6142@node Creating Sets
6143@subsubsection Creating Sets
6144
6145@c %**end of header
6146
6147For each set of a client, there exists a client connection to the service. Sets
6148are created by sending the @code{GNUNET_SERVICE_SET_CREATE} message over a new
6149client connection. Multiple operations for one set are multiplexed over one
6150client connection, using a request id supplied by the client.
6151
6152@node Listeners2
@subsubsection Listeners
6154
6155@c %**end of header
6156
Each listener also requires a separate client connection. By sending the
@code{GNUNET_SERVICE_SET_LISTEN} message, the client notifies the service of
the application id and operation type it is interested in. A client rejects an
incoming request by sending @code{GNUNET_SERVICE_SET_REJECT} on the listener's
client connection. In contrast, when accepting an incoming request, a
@code{GNUNET_SERVICE_SET_ACCEPT} message must be sent over the client
connection of the set that is supplied for the set operation.
6164
6165@node Initiating Operations
6166@subsubsection Initiating Operations
6167
6168@c %**end of header
6169
Operations with remote peers are initiated by sending a
@code{GNUNET_SERVICE_SET_EVALUATE} message to the service. The client
connection over which this message is sent determines the set to use.
6173
6174@node Modifying Sets
6175@subsubsection Modifying Sets
6176
6177@c %**end of header
6178
6179Sets are modified with the @code{GNUNET_SERVICE_SET_ADD} and
6180@code{GNUNET_SERVICE_SET_REMOVE} messages.
6181
6182
6183@c %@menu
6184@c %* Results and Operation Status::
6185@c %* Iterating Sets::
6186@c %@end menu
6187
6188@node Results and Operation Status
6189@subsubsection Results and Operation Status
6190@c %**end of header
6191
6192The service notifies the client of result elements and success/failure of a set
6193operation with the @code{GNUNET_SERVICE_SET_RESULT} message.
6194
6195@node Iterating Sets
6196@subsubsection Iterating Sets
6197
6198@c %**end of header
6199
All elements of a set can be requested by sending
@code{GNUNET_SERVICE_SET_ITER_REQUEST}. The service responds with
@code{GNUNET_SERVICE_SET_ITER_ELEMENT} messages and eventually terminates the
iteration with @code{GNUNET_SERVICE_SET_ITER_DONE}. After each received
element, the client must send @code{GNUNET_SERVICE_SET_ITER_ACK}. Note that
only one set iteration may be active for a set at any given time.
6206
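The client side of this exchange behaves like a small state machine. The C
sketch below models it under simplifying assumptions: the message constants
and the @code{sketch_iteration} type are hypothetical, and the functions only
return which message the client would send next instead of performing any IPC.

```c
#include <assert.h>

/* Hypothetical message kinds, named after the protocol steps above. */
enum sketch_iter_msg
{
  SKETCH_ITER_NONE, /* nothing to send */
  SKETCH_ITER_REQUEST,
  SKETCH_ITER_ELEMENT,
  SKETCH_ITER_ACK,
  SKETCH_ITER_DONE
};

struct sketch_iteration
{
  int active;            /* non-zero while an iteration is running */
  unsigned int received; /* elements seen in the current iteration */
};

/* Start an iteration. Refused (SKETCH_ITER_NONE) if one is already
   active, since only one iteration per set may run at a time. */
enum sketch_iter_msg
sketch_iteration_start (struct sketch_iteration *it)
{
  if (it->active)
    return SKETCH_ITER_NONE;
  it->active = 1;
  it->received = 0;
  return SKETCH_ITER_REQUEST;
}

/* Handle a message from the service and return the client's reply:
   every element is acknowledged, and "done" ends the iteration. */
enum sketch_iter_msg
sketch_iteration_handle (struct sketch_iteration *it,
                         enum sketch_iter_msg incoming)
{
  if (! it->active)
    return SKETCH_ITER_NONE;
  if (SKETCH_ITER_ELEMENT == incoming)
  {
    it->received++;
    return SKETCH_ITER_ACK;
  }
  if (SKETCH_ITER_DONE == incoming)
    it->active = 0;
  return SKETCH_ITER_NONE;
}
```
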
6207@node The SET Intersection Peer-to-Peer Protocol
6208@subsection The SET Intersection Peer-to-Peer Protocol
6209
6210@c %**end of header
6211
The intersection protocol operates over CADET and starts with a
@code{GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST} being sent by the peer
initiating the operation to the peer listening for inbound requests. It
includes the number of elements of the initiating peer, which is used to decide
which side will send a Bloom filter first.

The listening peer checks if the operation type and application identifier are
acceptable for its current state. If not, it responds with a
@code{GNUNET_MESSAGE_TYPE_SET_RESULT} and a status of
@code{GNUNET_SET_STATUS_FAILURE} (and terminates the CADET channel).
6222
If the application accepts the request, the listener sends back a
@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO} if it has more
elements in its set than the initiator. Otherwise, it immediately starts with
the Bloom filter exchange. If the initiator receives a
@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO} response, it
begins the Bloom filter exchange, unless the set size is indicated to be zero,
in which case the intersection is considered finished after just the initial
handshake.
6231
6232
6233@menu
6234* The Bloom filter exchange::
6235* Salt::
6236@end menu
6237
6238@node The Bloom filter exchange
6239@subsubsection The Bloom filter exchange
6240
6241@c %**end of header
6242
In this phase, each peer transmits a Bloom filter over the remaining keys of
the local set to the other peer using a
@code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_BF} message. This message
additionally includes the number of elements left in the sender's set, as well
as the XOR over all of the keys in that set.
6248
6249The number of bits 'k' set per element in the Bloom filter is calculated based
6250on the relative size of the two sets. Furthermore, the size of the Bloom filter
6251is calculated based on 'k' and the number of elements in the set to maximize
6252the amount of data filtered per byte transmitted on the wire (while avoiding an
6253excessively high number of iterations).
6254
The receiver of the message removes all elements from its local set that do not
pass the Bloom filter test. It then checks if the set size of the sender and
the XOR over the keys match what is left of its own set. If they do, it sends
a @code{GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_DONE} back to indicate that
the latest set is the final result. Otherwise, the receiver starts another
Bloom filter exchange, except this time as the sender.
6261
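One round of this exchange can be sketched in self-contained C. This is an
illustration under stated assumptions, not the service's implementation: keys
are shortened to 64 bits, the filter has a fixed size of 256 bits, the mixing
function is a generic 64-bit mixer standing in for the real hash, and every
@code{sketch_} name is hypothetical. The sender builds a salted filter over
its keys, and the receiver drops every local key that fails the test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BF_BITS 256u

struct sketch_bf
{
  uint8_t bits[SKETCH_BF_BITS / 8];
  uint32_t salt;  /* fresh random value for every round */
  unsigned int k; /* bits set per element */
};

/* Generic 64-bit mixer (splitmix64 finalizer); a stand-in hash. */
uint64_t
sketch_mix (uint64_t x)
{
  x += 0x9e3779b97f4a7c15ULL;
  x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
  x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
  return x ^ (x >> 31);
}

/* Position of the i-th bit for a key; the salt moves every position. */
unsigned int
sketch_bf_index (const struct sketch_bf *bf, uint64_t key, unsigned int i)
{
  uint64_t seed = sketch_mix (((uint64_t) bf->salt << 32) | i);
  return (unsigned int) (sketch_mix (key ^ seed) % SKETCH_BF_BITS);
}

void
sketch_bf_init (struct sketch_bf *bf, uint32_t salt, unsigned int k)
{
  memset (bf->bits, 0, sizeof (bf->bits));
  bf->salt = salt;
  bf->k = k;
}

/* Sender side: add one key of the local set to the filter. */
void
sketch_bf_add (struct sketch_bf *bf, uint64_t key)
{
  for (unsigned int i = 0; i < bf->k; i++)
  {
    unsigned int idx = sketch_bf_index (bf, key, i);
    bf->bits[idx / 8] |= (uint8_t) (1u << (idx % 8));
  }
}

/* Returns 1 if the key may be in the sender's set, 0 if it is not. */
int
sketch_bf_test (const struct sketch_bf *bf, uint64_t key)
{
  for (unsigned int i = 0; i < bf->k; i++)
  {
    unsigned int idx = sketch_bf_index (bf, key, i);
    if (0 == (bf->bits[idx / 8] & (1u << (idx % 8))))
      return 0;
  }
  return 1;
}

/* Receiver side: keep only the keys that pass the sender's filter.
   Compacts the array in place and returns the number of keys kept. */
size_t
sketch_bf_filter (const struct sketch_bf *bf, uint64_t *keys, size_t n)
{
  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    if (sketch_bf_test (bf, keys[i]))
      keys[kept++] = keys[i];
  return kept;
}
```

Keys that both peers hold always survive the filter step, because a Bloom
filter has no false negatives. Keys that only the receiver holds are usually
removed, but may survive a round as false positives.
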
6262@node Salt
6263@subsubsection Salt
6264
6265@c %**end of header
6266
Bloom filter operations are probabilistic: with some non-zero probability the
test may incorrectly say an element is in the set, even though it is not.
6269
To mitigate this problem, the intersection protocol iterates exchanging Bloom
filters using a different random 32-bit salt in each iteration (the salt is
also included in the message). With different salts, the Bloom filter test may
fail for different elements. Combining the results of successive iterations,
the probability that a false positive survives every round approaches zero.
6275
The iterations terminate once both peers have established that they have sets
of the same size, and where the XOR over all keys computes the same 512-bit
value (leaving a failure probability of @math{2^{-511}}).
6279
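The termination test compares two small summaries instead of the sets
themselves. A minimal C sketch of such a summary follows. The names
@code{sketch_key} and @code{sketch_summary} are hypothetical, and a 512-bit
key is represented here simply as eight 64-bit words.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_KEY_WORDS 8 /* 8 * 64 bits = 512 bits */

struct sketch_key
{
  uint64_t w[SKETCH_KEY_WORDS];
};

/* What a peer reports about its remaining set. */
struct sketch_summary
{
  uint32_t count;            /* number of elements left */
  struct sketch_key xor_sum; /* XOR over all remaining keys */
};

/* Compute the summary of a set of keys. XOR is commutative, so the
   order in which the keys are stored does not matter. */
struct sketch_summary
sketch_summarize (const struct sketch_key *keys, uint32_t n)
{
  struct sketch_summary s;

  memset (&s, 0, sizeof (s));
  s.count = n;
  for (uint32_t i = 0; i < n; i++)
    for (unsigned int j = 0; j < SKETCH_KEY_WORDS; j++)
      s.xor_sum.w[j] ^= keys[i].w[j];
  return s;
}

/* The exchange may stop once size and XOR agree on both sides. */
int
sketch_summaries_match (const struct sketch_summary *a,
                        const struct sketch_summary *b)
{
  return (a->count == b->count) &&
         (0 == memcmp (&a->xor_sum, &b->xor_sum, sizeof (struct sketch_key)));
}
```
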
6280@node The SET Union Peer-to-Peer Protocol
6281@subsection The SET Union Peer-to-Peer Protocol
6282
6283@c %**end of header
6284
6285The SET union protocol is based on Eppstein's efficient set reconciliation
6286without prior context. You should read this paper first if you want to
6287understand the protocol.
6288
The union protocol operates over CADET and starts with a
@code{GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST} being sent by the peer
initiating the operation to the peer listening for inbound requests. It
includes the number of elements of the initiating peer, which is currently not
used.

The listening peer checks if the operation type and application identifier are
acceptable for its current state. If not, it responds with a
@code{GNUNET_MESSAGE_TYPE_SET_RESULT} and a status of
@code{GNUNET_SET_STATUS_FAILURE} (and terminates the CADET channel).

If the application accepts the request, it sends back a strata estimator using
a message of type @code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_SE}. The initiator
evaluates the strata estimator and initiates the exchange of invertible Bloom
filters (IBFs), sending a @code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF}.

During the IBF exchange, if the receiver cannot invert the Bloom filter or
detects a cycle, it sends a larger IBF in response (up to a defined maximum
limit; if that limit is reached, the operation fails). Elements decoded while
processing the IBF are transmitted to the other peer using
@code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS}, or requested from the other peer
using @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS} messages, depending
on the sign observed during decoding of the IBF. Peers respond to a
@code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS} message with the respective
element in a @code{GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS} message. If the IBF
fully decodes, the peer responds with a
@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_DONE} message instead of another
@code{GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF}.
6315
All Bloom filter operations use a salt to mingle keys before hashing them into
buckets, such that future iterations have a fresh chance of succeeding if they
failed due to collisions before.
6319
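To make the role of the sign during decoding more concrete, here is a compact,
self-contained invertible Bloom filter in C. It is a teaching sketch under
several assumptions and does not reflect the layout used by the SET service:
keys are 64-bit integers, the table has 24 cells split into one partition per
hash function, the mixer is a generic stand-in for the real hash, and all
@code{sketch_} names are hypothetical. After subtracting the remote filter
from the local one, a decoded sign of +1 marks a key that only the local peer
has (it would be sent), and a sign of -1 marks a key that only the remote peer
has (it would be requested).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_IBF_HASHES 3u
#define SKETCH_IBF_PART 8u /* cells owned by each hash function */
#define SKETCH_IBF_CELLS (SKETCH_IBF_HASHES * SKETCH_IBF_PART)

struct sketch_cell
{
  int32_t count;     /* insertions minus removals */
  uint64_t key_sum;  /* XOR of all keys mapped to this cell */
  uint64_t hash_sum; /* XOR of the hashes of those keys */
};

struct sketch_ibf
{
  struct sketch_cell cells[SKETCH_IBF_CELLS];
};

/* Generic 64-bit mixer (splitmix64 finalizer); a stand-in hash. */
uint64_t
sketch_mix (uint64_t x)
{
  x += 0x9e3779b97f4a7c15ULL;
  x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
  x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
  return x ^ (x >> 31);
}

/* Cell used by hash function i; the salt reshuffles the buckets. */
unsigned int
sketch_ibf_cell (uint64_t key, uint32_t salt, unsigned int i)
{
  uint64_t seed = sketch_mix (((uint64_t) salt << 32) | i);
  return i * SKETCH_IBF_PART
         + (unsigned int) (sketch_mix (key ^ seed) % SKETCH_IBF_PART);
}

/* Insert (side = +1) or remove (side = -1) a key. */
void
sketch_ibf_update (struct sketch_ibf *ibf, uint64_t key, uint32_t salt,
                   int side)
{
  for (unsigned int i = 0; i < SKETCH_IBF_HASHES; i++)
  {
    struct sketch_cell *c = &ibf->cells[sketch_ibf_cell (key, salt, i)];
    c->count += side;
    c->key_sum ^= key;
    c->hash_sum ^= sketch_mix (key);
  }
}

/* a := a - b, cell by cell. Keys present in both filters cancel. */
void
sketch_ibf_subtract (struct sketch_ibf *a, const struct sketch_ibf *b)
{
  for (unsigned int i = 0; i < SKETCH_IBF_CELLS; i++)
  {
    a->cells[i].count -= b->cells[i].count;
    a->cells[i].key_sum ^= b->cells[i].key_sum;
    a->cells[i].hash_sum ^= b->cells[i].hash_sum;
  }
}

/* Peel one pure cell. Returns 1 and stores the key and its sign, or
   returns 0 if no pure cell is left (empty filter or decoding failure). */
int
sketch_ibf_decode_one (struct sketch_ibf *ibf, uint32_t salt,
                       uint64_t *key, int *sign)
{
  for (unsigned int i = 0; i < SKETCH_IBF_CELLS; i++)
  {
    const struct sketch_cell *c = &ibf->cells[i];
    if ((1 != c->count) && (-1 != c->count))
      continue;
    if (c->hash_sum != sketch_mix (c->key_sum))
      continue; /* several keys collided in this cell */
    *key = c->key_sum;
    *sign = (int) c->count;
    sketch_ibf_update (ibf, *key, salt, -(*sign));
    return 1;
  }
  return 0;
}

/* A fully decoded filter has only zero cells. */
int
sketch_ibf_is_empty (const struct sketch_ibf *ibf)
{
  for (unsigned int i = 0; i < SKETCH_IBF_CELLS; i++)
    if ((0 != ibf->cells[i].count) || (0 != ibf->cells[i].key_sum) ||
        (0 != ibf->cells[i].hash_sum))
      return 0;
  return 1;
}
```

If @code{sketch_ibf_decode_one} returns 0 while the filter is not empty,
decoding has failed. In the protocol described above, this is the situation in
which a larger IBF would be sent.
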
6320@node GNUnet's STATISTICS subsystem
6321@section GNUnet's STATISTICS subsystem
6322
6323@c %**end of header
6324
6325In GNUnet, the STATISTICS subsystem offers a central place for all subsystems
6326to publish unsigned 64-bit integer run-time statistics. Keeping this
6327information centrally means that there is a unified way for the user to obtain
6328data on all subsystems, and individual subsystems do not have to always include
6329a custom data export method for performance metrics and other statistics. For
6330example, the TRANSPORT system uses STATISTICS to update information about the
6331number of directly connected peers and the bandwidth that has been consumed by
6332the various plugins. This information is valuable for diagnosing connectivity
6333and performance issues.
6334
6335Following the GNUnet service architecture, the STATISTICS subsystem is divided
6336into an API which is exposed through the header
6337@strong{gnunet_statistics_service.h} and the STATISTICS service
6338@strong{gnunet-service-statistics}. The @strong{gnunet-statistics} command-line
6339tool can be used to obtain (and change) information about the values stored by
6340the STATISTICS service. The STATISTICS service does not communicate with other
6341peers.
6342
Data is stored in the STATISTICS service in the form of tuples
@strong{(subsystem, name, value, persistence)}. The subsystem determines to
which GNUnet subsystem the data belongs. The name is the label under which the
value is stored. It uniquely identifies the record among the other records
belonging to the same subsystem. In some parts of the code, the pair
@strong{(subsystem, name)} is called a @strong{statistic} as it identifies the
values stored in the STATISTICS service. The persistence flag determines if the
record has to be preserved across service restarts. A record is said to be
persistent if this flag is set for it; if not, the record is treated as a
non-persistent record and it is lost after service restart. Persistent records
are written to the file @strong{statistics.data} before shutdown and read from
it upon startup. The file is located in the HOME directory of the peer.
6355
6356An anomaly of the STATISTICS service is that it does not terminate immediately
6357upon receiving a shutdown signal if it has any clients connected to it. It
6358waits for all the clients that are not monitors to close their connections
6359before terminating itself. This is to prevent the loss of data during peer
6360shutdown --- delaying the STATISTICS service shutdown helps other services to
6361store important data to STATISTICS during shutdown.
6362
6363@menu
6364* libgnunetstatistics::
6365* The STATISTICS Client-Service Protocol::
6366@end menu
6367
6368@node libgnunetstatistics
6369@subsection libgnunetstatistics
6370
6371@c %**end of header
6372
@strong{libgnunetstatistics} is the library containing the API for the
STATISTICS subsystem. Any process that wants to use STATISTICS should use this
API to open a connection to the STATISTICS service. This is done by calling
the function @code{GNUNET_STATISTICS_create()}. This function takes the name
of the subsystem which is trying to use STATISTICS and a configuration. All
values written to STATISTICS with this connection will be placed in the section
corresponding to the given subsystem's name. The connection to STATISTICS can
be destroyed with the function @code{GNUNET_STATISTICS_destroy()}. This
function allows for the connection to be destroyed immediately or upon
transferring all pending write requests to the service.
6383
Note: The STATISTICS subsystem can be disabled by setting @code{DISABLE = YES}
6385under the @code{[STATISTICS]} section in the configuration. With such a
6386configuration all calls to @code{GNUNET_STATISTICS_create()} return @code{NULL}
6387as the STATISTICS subsystem is unavailable and no other functions from the API
6388can be used.
6389
6390
6391@menu
6392* Statistics retrieval::
6393* Setting statistics and updating them::
6394* Watches::
6395@end menu
6396
6397@node Statistics retrieval
6398@subsubsection Statistics retrieval
6399
6400@c %**end of header
6401
Once a connection to the statistics service is obtained, information about any
other system which uses statistics can be retrieved with the function
@code{GNUNET_STATISTICS_get()}. This function takes the connection handle, the
name of the subsystem whose information we are interested in (a @code{NULL}
value will retrieve information of all available subsystems using STATISTICS),
the name of the statistic we are interested in (a @code{NULL} value will
retrieve all available statistics), a continuation callback which is called
when all of the requested information has been retrieved, an iterator callback
which is called for each parameter in the retrieved information and a closure
for the aforementioned callbacks. The library then invokes the iterator
callback for each value matching the request.
6413
The call to @code{GNUNET_STATISTICS_get()} is asynchronous and can be canceled
with the function @code{GNUNET_STATISTICS_get_cancel()}. This is helpful when
retrieving statistics takes too long and especially when we want to shut down
and clean up everything.
6418
6419@node Setting statistics and updating them
6420@subsubsection Setting statistics and updating them
6421
6422@c %**end of header
6423
So far we have seen how to retrieve statistics. Here we will learn how to
set statistics and update them so that other subsystems can retrieve them.
6426
6427A new statistic can be set using the function @code{GNUNET_STATISTICS_set()}.
6428This function takes the name of the statistic and its value and a flag to make
6429the statistic persistent. The value of the statistic should be of the type
6430@code{uint64_t}. The function does not take the name of the subsystem; it is
6431determined from the previous @code{GNUNET_STATISTICS_create()} invocation. If
6432the given statistic is already present, its value is overwritten.
6433
An existing statistic can be updated, i.e., its value can be increased or
decreased by an amount with the function @code{GNUNET_STATISTICS_update()}. The
parameters to this function are similar to @code{GNUNET_STATISTICS_set()},
except that it takes the amount to be changed as a type @code{int64_t} instead
of the value.
6439
6440The library will combine multiple set or update operations into one message if
6441the client performs requests at a rate that is faster than the available IPC
6442with the STATISTICS service. Thus, the client does not have to worry about
6443sending requests too quickly.
6444
6445@node Watches
6446@subsubsection Watches
6447
6448@c %**end of header
6449
An interesting feature of STATISTICS lies in serving notifications whenever a
6451statistic of our interest is modified. This is achieved by registering a watch
6452through the function @code{GNUNET_STATISTICS_watch()}. The parameters of this
6453function are similar to those of @code{GNUNET_STATISTICS_get()}. Changes to the
6454respective statistic's value will then cause the given iterator callback to be
6455called. Note: A watch can only be registered for a specific statistic. Hence
6456the subsystem name and the parameter name cannot be @code{NULL} in a call to
6457@code{GNUNET_STATISTICS_watch()}.
6458
A registered watch will keep notifying any value changes until
@code{GNUNET_STATISTICS_watch_cancel()} is called with the same parameters that
were used for registering the watch.
6462
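The watch semantics can be modelled in a few lines of self-contained C. The
sketch below is not the STATISTICS API: @code{sketch_watch}, its functions and
the example callback @code{sketch_record_cb} are hypothetical names. It
captures two rules from the text: a watch needs a specific subsystem and name,
and the callback fires only when the watched value actually changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void (*sketch_watch_cb) (void *cls, const char *subsystem,
                                 const char *name, uint64_t value);

struct sketch_watch
{
  const char *subsystem;
  const char *name;
  sketch_watch_cb cb;
  void *cls;
  int active;
};

/* State for the example callback below. */
struct sketch_seen
{
  unsigned int calls;
  uint64_t last;
};

/* Example callback: remembers how often it ran and the last value. */
void
sketch_record_cb (void *cls, const char *subsystem, const char *name,
                  uint64_t value)
{
  struct sketch_seen *seen = cls;

  (void) subsystem;
  (void) name;
  seen->calls++;
  seen->last = value;
}

/* Register a watch. Unlike retrieval, NULL wildcards are rejected. */
int
sketch_watch_register (struct sketch_watch *w, const char *subsystem,
                       const char *name, sketch_watch_cb cb, void *cls)
{
  w->active = 0;
  if ((NULL == subsystem) || (NULL == name) || (NULL == cb))
    return -1;
  w->subsystem = subsystem;
  w->name = name;
  w->cb = cb;
  w->cls = cls;
  w->active = 1;
  return 0;
}

void
sketch_watch_cancel (struct sketch_watch *w)
{
  w->active = 0;
}

/* Called by the (modelled) service whenever a statistic is written. */
void
sketch_watch_notify (const struct sketch_watch *w, const char *subsystem,
                     const char *name, uint64_t old_value,
                     uint64_t new_value)
{
  if ((! w->active) || (old_value == new_value))
    return;
  if ((0 != strcmp (w->subsystem, subsystem)) ||
      (0 != strcmp (w->name, name)))
    return;
  w->cb (w->cls, subsystem, name, new_value);
}
```
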
6463@node The STATISTICS Client-Service Protocol
6464@subsection The STATISTICS Client-Service Protocol
6465@c %**end of header
6466
6467
6468@menu
6469* Statistics retrieval2::
6470* Setting and updating statistics::
6471* Watching for updates::
6472@end menu
6473
6474@node Statistics retrieval2
@subsubsection Statistics retrieval
6476
6477@c %**end of header
6478
To retrieve statistics, the client transmits a message of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_GET} containing the given subsystem name
and statistic parameter to the STATISTICS service. The service responds with a
message of type @code{GNUNET_MESSAGE_TYPE_STATISTICS_VALUE} for each of the
statistics parameters that match the client request. The end of the
retrieved information is signaled by the service by sending a message of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_END}.
6486
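The reply sequence can be sketched as a function that turns one request into a
list of messages. The C model below is illustrative only. The record and
message types are hypothetical, and it assumes the API convention that a
@code{NULL} subsystem or name acts as a wildcard; how wildcards are encoded on
the wire is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum sketch_msg_type
{
  SKETCH_MSG_VALUE, /* one matching statistic */
  SKETCH_MSG_END    /* no more values follow */
};

struct sketch_record
{
  const char *subsystem;
  const char *name;
  uint64_t value;
  int persistent;
};

struct sketch_msg
{
  enum sketch_msg_type type;
  const struct sketch_record *record; /* NULL for SKETCH_MSG_END */
};

/* NULL patterns match everything (assumed wildcard convention). */
int
sketch_matches (const char *pattern, const char *s)
{
  return (NULL == pattern) || (0 == strcmp (pattern, s));
}

/* Answer one request: a VALUE message per matching record, followed
   by a single END. Returns the number of messages written, or 0 if
   the output buffer is too small to hold the complete reply. */
size_t
sketch_handle_get (const struct sketch_record *records, size_t n,
                   const char *subsystem, const char *name,
                   struct sketch_msg *out, size_t out_len)
{
  size_t used = 0;

  for (size_t i = 0; i < n; i++)
  {
    if ((! sketch_matches (subsystem, records[i].subsystem)) ||
        (! sketch_matches (name, records[i].name)))
      continue;
    if (used + 1 >= out_len)
      return 0; /* must keep room for the final END */
    out[used].type = SKETCH_MSG_VALUE;
    out[used].record = &records[i];
    used++;
  }
  if (used >= out_len)
    return 0;
  out[used].type = SKETCH_MSG_END;
  out[used].record = NULL;
  return used + 1;
}
```

A request that matches nothing still produces the END message, so the client
always learns that the retrieval has finished.
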
6487@node Setting and updating statistics
6488@subsubsection Setting and updating statistics
6489
6490@c %**end of header
6491
6492The subsystem name, parameter name, its value and the persistence flag are
6493communicated to the service through the message
6494@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}.
6495
When the service receives a message of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}, it retrieves the subsystem name and
checks for a statistic parameter matching the name given in the message.
If a statistic parameter is found, the value is overwritten by the new value
from the message; if not found then a new statistic parameter is created with
the given name and value.
6502
6503In addition to just setting an absolute value, it is possible to perform a
6504relative update by sending a message of type
6505@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET} with an update flag
6506(@code{GNUNET_STATISTICS_SETFLAG_RELATIVE}) signifying that the value in the
6507message should be treated as an update value.
6508
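The difference between an absolute set and a relative update can be sketched
as a single handler that inspects a flag. The C model below rests on
assumptions that are not taken from this section: the flag constants and their
numeric values are hypothetical, a negative delta is carried in the unsigned
value field in two's complement form, and a decrement below zero is clamped to
zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits; the real values may differ. */
#define SKETCH_SETFLAG_RELATIVE 1u
#define SKETCH_SETFLAG_PERSISTENT 2u

struct sketch_stat
{
  uint64_t value;
  int persistent;
};

/* Encode a signed delta into the unsigned value field. Conversion
   to an unsigned type is well defined (reduction modulo 2^64). */
uint64_t
sketch_encode_delta (int64_t delta)
{
  return (uint64_t) delta;
}

/* Apply one SET message to a statistic. */
void
sketch_handle_set (struct sketch_stat *st, uint32_t flags, uint64_t value)
{
  st->persistent = (0 != (flags & SKETCH_SETFLAG_PERSISTENT));
  if (0 == (flags & SKETCH_SETFLAG_RELATIVE))
  {
    /* Absolute: overwrite whatever was stored before. */
    st->value = value;
  }
  else if (value <= (uint64_t) INT64_MAX)
  {
    /* Relative, non-negative delta (wraps on overflow in this sketch). */
    st->value += value;
  }
  else
  {
    /* Relative, negative delta: recover its magnitude and clamp at 0. */
    uint64_t magnitude = (UINT64_MAX - value) + 1u;
    st->value = (st->value < magnitude) ? 0 : st->value - magnitude;
  }
}
```
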
6509@node Watching for updates
6510@subsubsection Watching for updates
6511
6512@c %**end of header
6513
The function @code{GNUNET_STATISTICS_watch()} registers the watch at the
service by sending a message of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH}. The service then sends
notifications through messages of type
@code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH_VALUE} whenever the statistic
parameter's value is changed.
6519
6520@node GNUnet's Distributed Hash Table (DHT)
6521@section GNUnet's Distributed Hash Table (DHT)
6522
6523@c %**end of header
6524
6525GNUnet includes a generic distributed hash table that can be used by developers
6526building P2P applications in the framework. This section documents high-level
6527features and how developers are expected to use the DHT. We have a research
6528paper detailing how the DHT works. Also, Nate's thesis includes a detailed
6529description and performance analysis (in chapter 6).
6530
6531Key features of GNUnet's DHT include:
6532
6533@itemize @bullet
6534@item stores key-value pairs with values up to (approximately) 63k in size
6535@item works with many underlay network topologies (small-world, random graph),
6536underlay does not need to be a full mesh / clique
6537@item support for extended queries (more than just a simple 'key'), filtering
duplicate replies within the network (Bloom filter) and content validation (for
6539details, please read the subsection on the block library)
6540@item can (optionally) return paths taken by the PUT and GET operations to the
6541application
6542@item provides content replication to handle churn
6543@end itemize
6544
6545GNUnet's DHT is randomized and unreliable. Unreliable means that there is no
6546strict guarantee that a value stored in the DHT is always found --- values are
6547only found with high probability. While this is somewhat true in all P2P DHTs,
6548GNUnet developers should be particularly wary of this fact (this will help you
6549write secure, fault-tolerant code). Thus, when writing any application using
6550the DHT, you should always consider the possibility that a value stored in the
6551DHT by you or some other peer might simply not be returned, or returned with a
6552significant delay. Your application logic must be written to tolerate this
6553(naturally, some loss of performance or quality of service is expected in this
6554case).
6555
6556@menu
6557* Block library and plugins::
6558* libgnunetdht::
6559* The DHT Client-Service Protocol::
6560* The DHT Peer-to-Peer Protocol::
6561@end menu
6562
6563@node Block library and plugins
6564@subsection Block library and plugins
6565
6566@c %**end of header
6567
6568@menu
6569* What is a Block?::
6570* The API of libgnunetblock::
6571* Queries::
6572* Sample Code::
6573* Conclusion2::
6574@end menu
6575
6576@node What is a Block?
6577@subsubsection What is a Block?
6578
6579@c %**end of header
6580
Blocks are small (< 63k) pieces of data stored under a key (@code{struct
GNUNET_HashCode}). Blocks have a type (@code{enum GNUNET_BlockType}) which defines
6583their data format. Blocks are used in GNUnet as units of static data exchanged
6584between peers and stored (or cached) locally. Uses of blocks include
6585file-sharing (the files are broken up into blocks), the VPN (DNS information is
6586stored in blocks) and the DHT (all information in the DHT and meta-information
6587for the maintenance of the DHT are both stored using blocks). The block
6588subsystem provides a few common functions that must be available for any type
6589of block.
6590
6591@node The API of libgnunetblock
6592@subsubsection The API of libgnunetblock
6593
6594@c %**end of header
6595
6596The block library requires for each (family of) block type(s) a block plugin
6597(implementing gnunet_block_plugin.h) that provides basic functions that are
6598needed by the DHT (and possibly other subsystems) to manage the block. These
block plugins are typically implemented within their respective subsystems.
6600The main block library is then used to locate, load and query the appropriate
6601block plugin. Which plugin is appropriate is determined by the block type
6602(which is just a 32-bit integer). Block plugins contain code that specifies
6603which block types are supported by a given plugin. The block library loads all
6604block plugins that are installed at the local peer and forwards the application
6605request to the respective plugin.
6606
6607The central functions of the block APIs (plugin and main library) are to allow
6608the mapping of blocks to their respective key (if possible) and the ability to
6609check that a block is well-formed and matches a given request (again, if
6610possible). This way, GNUnet can avoid storing invalid blocks, storing blocks
6611under the wrong key and forwarding blocks in response to a query that they do
6612not answer.
6613
One key function of block plugins is that they allow GNUnet to detect duplicate
6615replies (via the Bloom filter). All plugins MUST support detecting duplicate
6616replies (by adding the current response to the Bloom filter and rejecting it if
6617it is encountered again). If a plugin fails to do this, responses may loop in
6618the network.
6619
6620@node Queries
6621@subsubsection Queries
6622@c %**end of header
6623
The query format for any block in GNUnet consists of four main components.
First, the type of the desired block must be specified. Second, the query must
contain a hash code. The hash code is used for lookups in hash tables and
databases and need not be unique for the block (however, if possible a unique
hash should be used as this would be best for performance). Third, an optional
Bloom filter can be specified to exclude known results; replies that hash to
the bits set in the Bloom filter are considered invalid. False positives can be
eliminated by sending the same query again with a different Bloom filter
mutator value, which parameterizes the hash function that is used. Finally, an
optional application-specific "eXtended query" (xquery) can be specified to
further constrain the results. It is entirely up to the type-specific plugin to
determine whether or not a given block matches a query (type, hash, Bloom
filter, and xquery). Naturally, not all xqueries are valid and some types of
blocks may not support Bloom filters either, so the plugin also needs to check
if the query is valid in the first place.
6639
6640Depending on the results from the plugin, the DHT will then discard the
6641(invalid) query, forward the query, discard the (invalid) reply, cache the
6642(valid) reply, and/or forward the (valid and non-duplicate) reply.
6643
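The interplay of type, hash and Bloom filter can be sketched as a single
evaluation function. The C code below is a simplified model and not the block
plugin API: all types and names are hypothetical, keys and reply hashes are
shortened to 64 bits, the xquery is left out, and the mixing function is a
generic stand-in for the real hash. A reply that passes the checks is added to
the filter, so the same reply is reported as a duplicate the next time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SEEN_BITS 128u
#define SKETCH_SEEN_K 4u /* bits set per reply */

struct sketch_seen_filter
{
  uint8_t bits[SKETCH_SEEN_BITS / 8];
  uint32_t mutator; /* parameterizes the hash function */
};

enum sketch_eval
{
  SKETCH_EVAL_OK,
  SKETCH_EVAL_DUPLICATE,
  SKETCH_EVAL_TYPE_MISMATCH,
  SKETCH_EVAL_KEY_MISMATCH
};

struct sketch_query
{
  uint32_t type;
  uint64_t key;                    /* stand-in for the hash code */
  struct sketch_seen_filter *seen; /* optional, may be NULL */
};

struct sketch_reply
{
  uint32_t type;
  uint64_t key;
  uint64_t content_hash; /* hash over the reply's content */
};

/* Generic 64-bit mixer (splitmix64 finalizer); a stand-in hash. */
uint64_t
sketch_mix (uint64_t x)
{
  x += 0x9e3779b97f4a7c15ULL;
  x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
  x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
  return x ^ (x >> 31);
}

/* Decide what to do with a reply to a query. */
enum sketch_eval
sketch_evaluate (const struct sketch_query *q, const struct sketch_reply *r)
{
  int seen = 1;

  if (q->type != r->type)
    return SKETCH_EVAL_TYPE_MISMATCH;
  if (q->key != r->key)
    return SKETCH_EVAL_KEY_MISMATCH;
  if (NULL == q->seen)
    return SKETCH_EVAL_OK;
  /* Test and set in one pass: the reply counts as a duplicate only
     if every one of its bits was already set. */
  for (unsigned int i = 0; i < SKETCH_SEEN_K; i++)
  {
    uint64_t seed = sketch_mix (((uint64_t) q->seen->mutator << 32) | i);
    unsigned int idx
      = (unsigned int) (sketch_mix (r->content_hash ^ seed)
                        % SKETCH_SEEN_BITS);
    if (0 == (q->seen->bits[idx / 8] & (1u << (idx % 8))))
      seen = 0;
    q->seen->bits[idx / 8] |= (uint8_t) (1u << (idx % 8));
  }
  return seen ? SKETCH_EVAL_DUPLICATE : SKETCH_EVAL_OK;
}
```

A new reply can occasionally be misreported as a duplicate when its bits
collide with earlier replies. Repeating the query with a different mutator
value changes the bit positions, which is how such false positives are
eliminated.
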
6644@node Sample Code
6645@subsubsection Sample Code
6646
6647@c %**end of header
6648
6649The source code in @strong{plugin_block_test.c} is a good starting point for
6650new block plugins --- it does the minimal work by implementing a plugin that
6651performs no validation at all. The respective @strong{Makefile.am} shows how to
6652build and install a block plugin.
6653
6654@node Conclusion2
@subsubsection Conclusion
6656
6657@c %**end of header
6658
6659In conclusion, GNUnet subsystems that want to use the DHT need to define a
6660block format and write a plugin to match queries and replies. For testing, the
@code{GNUNET_BLOCK_TYPE_TEST} block type can be used; it accepts any query as valid
6662and any reply as matching any query. This type is also used for the DHT command
6663line tools. However, it should NOT be used for normal applications due to the
6664lack of error checking that results from this primitive implementation.
6665
6666@node libgnunetdht
6667@subsection libgnunetdht
6668
6669@c %**end of header
6670
6671The DHT API itself is pretty simple and offers the usual GET and PUT functions
6672that work as expected. The specified block type refers to the block library
6673which allows the DHT to run application-specific logic for data stored in the
6674network.
6675
6676
6677@menu
6678* GET::
6679* PUT::
6680* MONITOR::
6681* DHT Routing Options::
6682@end menu
6683
6684@node GET
6685@subsubsection GET
6686
6687@c %**end of header
6688
6689When using GET, the main consideration for developers (other than the block
6690library) should be that after issuing a GET, the DHT will continuously cause
6691(small amounts of) network traffic until the operation is explicitly canceled.
6692So GET does not simply send out a single network request once; instead, the
6693DHT will continue to search for data. This is needed to achieve good success
6694rates and also handles the case where the respective PUT operation happens
6695after the GET operation was started. Developers should not cancel an existing
6696GET operation and then explicitly re-start it to trigger a new round of
6697network requests; this is simply inefficient, especially as the internal
6698automated version can be more efficient, for example by filtering results in
6699the network that have already been returned.
6700
6701If an application that performs a GET request has a set of replies that it
already knows and would like to filter, it can call
6703@code{GNUNET_DHT_get_filter_known_results} with an array of hashes over the
6704respective blocks to tell the DHT that these results are not desired (any
6705more). This way, the DHT will filter the respective blocks using the block
6706library in the network, which may result in a significant reduction in
6707bandwidth consumption.
6708
6709@node PUT
6710@subsubsection PUT
6711
6712@c %**end of header
6713
6714In contrast to GET operations, developers @strong{must} manually re-run PUT
6715operations periodically (if they intend the content to continue to be
6716available). Content stored in the DHT expires or might be lost due to churn.
6717Furthermore, GNUnet's DHT typically requires multiple rounds of PUT operations
before a key-value pair is consistently available to all peers (the DHT
randomizes paths and thus storage locations, and only after multiple rounds of
PUTs will there be a sufficient number of replicas in large DHTs). An explicit
6721PUT operation using the DHT API will only cause network traffic once, so in
6722order to ensure basic availability and resistance to churn (and adversaries),
6723PUTs must be repeated. While the exact frequency depends on the application, a
6724rule of thumb is that there should be at least a dozen PUT operations within
6725the content lifetime. Content in the DHT typically expires after one day, so
6726DHT PUT operations should be repeated at least every 1-2 hours.
6727
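The rule of thumb reduces to a single division. The helper below is a
hypothetical convenience function, not part of the DHT API. It derives the
longest acceptable gap between PUT rounds from the content lifetime and the
desired number of rounds.

```c
#include <assert.h>
#include <stdint.h>

/* Longest gap, in seconds, between PUT rounds so that at least
   `rounds` PUTs fit into one content lifetime. */
uint64_t
sketch_put_interval_s (uint64_t lifetime_s, unsigned int rounds)
{
  if (0 == rounds)
    return lifetime_s; /* degenerate input: no repetition requested */
  return lifetime_s / rounds;
}
```

With a lifetime of one day (86400 seconds) and a dozen rounds, this yields
7200 seconds, which is the upper end of the 1-2 hour range given above.
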
6728@node MONITOR
6729@subsubsection MONITOR
6730
6731@c %**end of header
6732
6733The DHT API also allows applications to monitor messages crossing the local
6734DHT service. The types of messages used by the DHT are GET, PUT and RESULT
6735messages. Using the monitoring API, applications can choose to monitor these
6736requests, possibly limiting themselves to requests for a particular block
6737type.
6738
The monitoring API is not only useful for diagnostics; it can also be used
6740to trigger application operations based on PUT operations. For example, an
6741application may use PUTs to distribute work requests to other peers. The
6742workers would then monitor for PUTs that give them work, instead of looking
6743for work using GET operations. This can be beneficial, especially if the
6744workers have no good way to guess the keys under which work would be stored.
6745Naturally, additional protocols might be needed to ensure that the desired
6746number of workers will process the distributed workload.
6747
6748@node DHT Routing Options
6749@subsubsection DHT Routing Options
6750
6751@c %**end of header
6752
There are two important options for GET and PUT requests, followed by two
options that applications normally do not set:

@table @asis
@item @code{GNUNET_DHT_RO_DEMULTIPLEX_EVERYWHERE}
This option means that all peers should process the request, even if their
peer ID is not closest to the key. For a PUT request, this means that all
peers that a request traverses may make a copy of the data. Similarly for a
GET request, all peers will check their local database for a result. Setting
this option can thus significantly improve caching and reduce bandwidth
consumption, at the expense of a larger DHT database. If in doubt, we
recommend that this option should be used.
@item @code{GNUNET_DHT_RO_RECORD_ROUTE}
This option instructs the DHT to record the path that a GET or a PUT request
is taking through the overlay network. The resulting paths are then returned
to the application with the respective result. This allows the receiver of a
result to construct a path to the originator of the data, which might then be
used for routing. Naturally, setting this option requires additional bandwidth
and disk space, so applications should only set this if the paths are needed
by the application logic.
@item @code{GNUNET_DHT_RO_FIND_PEER}
This option is an internal option used by the DHT's peer discovery mechanism
and should not be used by applications.
@item @code{GNUNET_DHT_RO_BART}
This option is currently not implemented. It may in the future offer
performance improvements for clique topologies.
@end table
6776
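As an illustration of how an application might choose among these options,
the C sketch below treats them as bit flags that can be combined. This is an
assumption made for the example: the @code{SKETCH_RO_} constants mirror the
option names above, but their numeric values and both helper functions are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flags named after the options above; the numeric
   values are an assumption of this sketch. */
enum sketch_route_option
{
  SKETCH_RO_NONE = 0,
  SKETCH_RO_DEMULTIPLEX_EVERYWHERE = 1,
  SKETCH_RO_RECORD_ROUTE = 2,
  SKETCH_RO_FIND_PEER = 4,
  SKETCH_RO_BART = 8
};

/* Default choice following the recommendations above: demultiplex
   everywhere, and record the route only if paths are needed. */
uint32_t
sketch_default_options (int need_paths)
{
  uint32_t opts = SKETCH_RO_DEMULTIPLEX_EVERYWHERE;

  if (need_paths)
    opts |= SKETCH_RO_RECORD_ROUTE;
  return opts;
}

/* Applications should avoid the internal and the unimplemented option. */
int
sketch_options_ok_for_application (uint32_t opts)
{
  return 0 == (opts & (SKETCH_RO_FIND_PEER | SKETCH_RO_BART));
}
```
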
6777@node The DHT Client-Service Protocol
6778@subsection The DHT Client-Service Protocol
6779
6780@c %**end of header
6781
6782@menu
6783* PUTting data into the DHT::
6784* GETting data from the DHT::
6785* Monitoring the DHT::
6786@end menu
6787
6788@node PUTting data into the DHT
6789@subsubsection PUTting data into the DHT
6790
6791@c %**end of header
6792
6793To store (PUT) data into the DHT, the client sends a@ @code{struct
6794GNUNET_DHT_ClientPutMessage} to the service. This message specifies the block
6795type, routing options, the desired replication level, the expiration time, key,
6796value and a 64-bit unique ID for the operation. The service responds with a@
6797@code{struct GNUNET_DHT_ClientPutConfirmationMessage} with the same 64-bit
6798unique ID. Note that the service sends the confirmation as soon as it has
6799locally processed the PUT request. The PUT may still be propagating through the
6800network at this time.
6801
6802In the future, we may want to change this to provide (limited) feedback to the
6803client, for example if we detect that the PUT operation had no effect because
6804the same key-value pair was already stored in the DHT. However, changing this
6805would also require additional state and messages in the P2P
6806interaction.
6807
6808@node GETting data from the DHT
6809@subsubsection GETting data from the DHT
6810
6811@c %**end of header
6812
6813To retrieve (GET) data from the DHT, the client sends a@ @code{struct
6814GNUNET_DHT_ClientGetMessage} to the service. The message specifies routing
6815options, a replication level (for replicating the GET, not the content), the
6816desired block type, the key, the (optional) extended query and unique 64-bit
6817request ID.
6818
6819Additionally, the client may send any number of@ @code{struct
6820GNUNET_DHT_ClientGetResultSeenMessage}s to notify the service about results
6821that the client is already aware of. These messages consist of the key, the
6822unique 64-bit ID of the request, and an arbitrary number of hash codes over the
6823blocks that the client is already aware of. As messages are restricted to 64k,
6824a client that already knows more than about a thousand blocks may need to send
several of these messages. The client should transmit these messages as quickly
as possible after the original GET request so that the DHT can filter those
results in the network early on. However, as these messages are sent after the
original request, it is conceivable that the DHT service may
6829return blocks that match those already known to the client anyway.
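
The "about a thousand blocks" figure follows from simple arithmetic, sketched
below. The constants are assumptions made for this illustration: a 16-bit
message size field (at most 65535 bytes) and 64-byte hash codes. The size of
the fixed fields in front of the hash codes is left as a parameter, since the
exact layout is not described here.

```c
#include <stddef.h>

#define MAX_MESSAGE_SIZE 65535u /* assumption: 16-bit size field */
#define HASH_SIZE 64u           /* assumption: 512-bit hash codes */

/* Number of hash codes that fit after `overhead` bytes of fixed fields.
   `overhead` must leave room for at least one hash code. */
size_t
hashes_per_message (size_t overhead)
{
  return (MAX_MESSAGE_SIZE - overhead) / HASH_SIZE;
}

/* Number of "result seen" messages needed to report `known` blocks. */
size_t
messages_needed (size_t known, size_t overhead)
{
  size_t per_msg = hashes_per_message (overhead);

  return (known + per_msg - 1) / per_msg; /* ceiling division */
}
```

With a hypothetical overhead of 80 bytes, a single message carries 1022 hash
codes, so a client that already knows 3000 blocks would need three messages.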
6830
6831In response to a GET request, the service will send @code{struct
6832GNUNET_DHT_ClientResultMessage}s to the client. These messages contain the
6833block type, expiration, key, unique ID of the request and of course the value
6834(a block). Depending on the options set for the respective operations, the
6835replies may also contain the path the GET and/or the PUT took through the
6836network.
6837
6838A client can stop receiving replies either by disconnecting or by sending a
6839@code{struct GNUNET_DHT_ClientGetStopMessage} which must contain the key and
6840the 64-bit unique ID of the original request. Using an explicit "stop" message
6841is more common as this allows a client to run many concurrent GET operations
6842over the same connection with the DHT service --- and to stop them
6843individually.
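
On the client side, running many GET operations over one connection amounts to
keeping a table keyed by the 64-bit unique ID. The following is a minimal
sketch of that bookkeeping. All names and the fixed table size are
hypothetical; a real client library would also store the key and the result
callback for each operation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ACTIVE_GETS 16 /* hypothetical limit for this sketch */

struct active_get {
  bool in_use;
  uint64_t unique_id; /* echoed by the service in every result */
};

static struct active_get active_gets[MAX_ACTIVE_GETS];

/* Register a new GET; returns false if the table is full. */
bool
get_start (uint64_t unique_id)
{
  for (size_t i = 0; i < MAX_ACTIVE_GETS; i++)
    if (! active_gets[i].in_use)
    {
      active_gets[i].in_use = true;
      active_gets[i].unique_id = unique_id;
      return true;
    }
  return false;
}

/* Should a result carrying this ID still be delivered? */
bool
get_is_active (uint64_t unique_id)
{
  for (size_t i = 0; i < MAX_ACTIVE_GETS; i++)
    if (active_gets[i].in_use && (active_gets[i].unique_id == unique_id))
      return true;
  return false;
}

/* Stop one GET without touching the others. */
void
get_stop (uint64_t unique_id)
{
  for (size_t i = 0; i < MAX_ACTIVE_GETS; i++)
    if (active_gets[i].in_use && (active_gets[i].unique_id == unique_id))
      active_gets[i].in_use = false;
}
```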
6844
6845@node Monitoring the DHT
6846@subsubsection Monitoring the DHT
6847
6848@c %**end of header
6849
6850To begin monitoring, the client sends a @code{struct
6851GNUNET_DHT_MonitorStartStop} message to the DHT service. In this message, flags
6852can be set to enable (or disable) monitoring of GET, PUT and RESULT messages
6853that pass through a peer. The message can also restrict monitoring to a
6854particular block type or a particular key. Once monitoring is enabled, the DHT
6855service will notify the client about any matching event using @code{struct
6856GNUNET_DHT_MonitorGetMessage}s for GET events, @code{struct
6857GNUNET_DHT_MonitorPutMessage} for PUT events and@ @code{struct
6858GNUNET_DHT_MonitorGetRespMessage} for RESULTs. Each of these messages contains
6859all of the information about the event.
6860
6861@node The DHT Peer-to-Peer Protocol
6862@subsection The DHT Peer-to-Peer Protocol
6863@c %**end of header
6864
6865
6866@menu
6867* Routing GETs or PUTs::
6868* PUTting data into the DHT2::
6869* GETting data from the DHT2::
6870@end menu
6871
6872@node Routing GETs or PUTs
6873@subsubsection Routing GETs or PUTs
6874
6875@c %**end of header
6876
6877When routing GETs or PUTs, the DHT service selects a suitable subset of
6878neighbours for forwarding. The exact number of neighbours can be zero or more
6879and depends on the hop counter of the query (initially zero) in relation to the
6880(log of) the network size estimate, the desired replication level and the
6881peer's connectivity. Depending on the hop counter and our network size
estimate, the selection of the peers may be randomized or based on proximity to the
6883key. Furthermore, requests include a set of peers that a request has already
6884traversed; those peers are also excluded from the selection.
6885
6886@node PUTting data into the DHT2
6887@subsubsection PUTting data into the DHT2
6888
6889@c %**end of header
6890
6891To PUT data into the DHT, the service sends a @code{struct PeerPutMessage} of
6892type @code{GNUNET_MESSAGE_TYPE_DHT_P2P_PUT} to the respective neighbour. In
6893addition to the usual information about the content (type, routing options,
6894desired replication level for the content, expiration time, key and value), the
6895message contains a fixed-size Bloom filter with information about which peers
6896(may) have already seen this request. This Bloom filter is used to ensure that
6897DHT messages never loop back to a peer that has already processed the request.
6898Additionally, the message includes the current hop counter and, depending on
6899the routing options, the message may include the full path that the message has
6900taken so far. The Bloom filter should already contain the identity of the
6901previous hop; however, the path should not include the identity of the previous
6902hop and the receiver should append the identity of the sender to the path, not
6903its own identity (this is done to reduce bandwidth).
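
The path rule above is easy to get wrong, so here is a small sketch of the
receiver's side. Peer identities are reduced to 32-bit integers and the path
to a fixed-size array; both are simplifications made for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PATH 8 /* hypothetical bound for this sketch */

struct put_path {
  size_t length;
  uint32_t peers[MAX_PATH]; /* stand-in for full peer identities */
};

/* Called by the receiver of a P2P PUT: record the peer the message came
   from (not the receiver itself) before processing or forwarding it. */
bool
path_append_sender (struct put_path *path, uint32_t sender)
{
  if (path->length >= MAX_PATH)
    return false;
  path->peers[path->length++] = sender;
  return true;
}
```

For a PUT travelling from peer A via B to C, B receives an empty path and
appends A, and C receives the path (A) and appends B. The sender never has to
transmit its own identity.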
6904
6905@node GETting data from the DHT2
6906@subsubsection GETting data from the DHT2
6907
6908@c %**end of header
6909
6910A peer can search the DHT by sending @code{struct PeerGetMessage}s of type
6911@code{GNUNET_MESSAGE_TYPE_DHT_P2P_GET} to other peers. In addition to the usual
6912information about the request (type, routing options, desired replication level
for the request, the key and the extended query), a GET request also contains
a hop counter, a Bloom filter over the peers that have already processed the
request and, depending on the routing options, the full path traversed by the
GET. Finally, a GET request includes a variable-size second Bloom filter and a
so-called Bloom filter mutator value which together indicate which replies the
sender has already seen. During the lookup, each block that matches the block
type, key and extended query is additionally subjected to a test
6920against this Bloom filter. The block plugin is expected to take the hash of the
6921block and combine it with the mutator value and check if the result is not yet
6922in the Bloom filter. The originator of the query will from time to time modify
6923the mutator to (eventually) allow false-positives filtered by the Bloom filter
6924to be returned.
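
The interplay of block hash, mutator and Bloom filter can be sketched as
follows. This is not the block plugin's code: the filter size, the number of
probes, the 64-bit stand-in for the block hash and the mixing function are all
assumptions chosen to keep the example short.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BF_BITS 1024u /* hypothetical filter size in bits */
#define BF_K 4u       /* hypothetical number of probes per element */

struct reply_filter {
  uint8_t bits[BF_BITS / 8];
  uint32_t mutator;
};

/* Combine a (shortened, 64-bit) block hash with the mutator, so that a
   different mutator maps the same block to different bit positions. */
static uint64_t
mingle (uint64_t block_hash, uint32_t mutator)
{
  uint64_t x = block_hash ^ (((uint64_t) mutator << 32) | mutator);

  x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL; /* generic 64-bit mixing */
  x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
  return x ^ (x >> 31);
}

static uint32_t
probe (uint64_t mingled, unsigned int i)
{
  return (uint32_t) ((mingled >> (16 * i)) % BF_BITS);
}

bool
filter_seen (const struct reply_filter *f, uint64_t block_hash)
{
  uint64_t m = mingle (block_hash, f->mutator);

  for (unsigned int i = 0; i < BF_K; i++)
  {
    uint32_t p = probe (m, i);

    if (0 == (f->bits[p / 8] & (1u << (p % 8))))
      return false; /* definitely not seen */
  }
  return true; /* seen, or a false positive */
}

void
filter_add (struct reply_filter *f, uint64_t block_hash)
{
  uint64_t m = mingle (block_hash, f->mutator);

  for (unsigned int i = 0; i < BF_K; i++)
  {
    uint32_t p = probe (m, i);

    f->bits[p / 8] |= (uint8_t) (1u << (p % 8));
  }
}

/* In this sketch, changing the mutator clears the filter; the originator
   would then re-add the results it actually holds. */
void
filter_reset (struct reply_filter *f, uint32_t new_mutator)
{
  memset (f->bits, 0, sizeof (f->bits));
  f->mutator = new_mutator;
}
```

Because a block that collides with known results under one mutator is
unlikely to collide under the next one, rotating the mutator lets results
that were wrongly filtered get through eventually.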
6925
6926Peers that receive a GET request perform a local lookup (depending on their
6927proximity to the key and the query options) and forward the request to other
6928peers. They then remember the request (including the Bloom filter for blocking
duplicate results) and, when they obtain a matching, non-filtered response,
forward a @code{struct PeerResultMessage} of type
@code{GNUNET_MESSAGE_TYPE_DHT_P2P_RESULT} to the previous hop.
Whenever a result is forwarded, the block plugin is used to update the Bloom
6933filter accordingly, to ensure that the same result is never forwarded more than
6934once. The DHT service may also cache forwarded results locally if the
6935"CACHE_RESULTS" option is set to "YES" in the configuration.
6936
6937@node The GNU Name System (GNS)
6938@section The GNU Name System (GNS)
6939
6940@c %**end of header
6941
6942The GNU Name System (GNS) is a decentralized database that enables users to
6943securely resolve names to values. Names can be used to identify other users
6944(for example, in social networking), or network services (for example, VPN
6945services running at a peer in GNUnet, or purely IP-based services on the
6946Internet). Users interact with GNS by typing in a hostname that ends in ".gnu"
6947or ".zkey".
6948
Videos giving an overview of most of GNS and the motivations behind it are
available online. The remainder of this chapter targets developers that
are familiar with the high-level concepts of GNS as presented in these talks.
6952
6953GNS-aware applications should use the GNS resolver to obtain the respective
6954records that are stored under that name in GNS. Each record consists of a type,
6955value, expiration time and flags.
6956
The type specifies the format of the value. Types below 65536 correspond to DNS
record types; larger values are used for GNS-specific records. Applications can
6959define new GNS record types by reserving a number and implementing a plugin
6960(which mostly needs to convert the binary value representation to a
6961human-readable text format and vice-versa). The expiration time specifies how
6962long the record is to be valid. The GNS API ensures that applications are only
6963given non-expired values. The flags are typically irrelevant for applications,
6964as GNS uses them internally to control visibility and validity of records.
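
As an illustration of these four fields, the following sketch models a record
and the two rules stated above: the DNS/GNS split of the type space and the
guarantee that applications only see non-expired values. The struct is a
hypothetical in-memory form for this example, not the layout used by the GNS
libraries, and time is represented as a plain integer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of a record; not the library's struct. */
struct record {
  uint32_t type;       /* < 65536: DNS type, otherwise GNS-specific */
  uint64_t expiration; /* absolute expiration time */
  uint32_t flags;
  const void *value;
  size_t value_size;
};

bool
record_is_dns_type (const struct record *rd)
{
  return rd->type < 65536;
}

/* Compact `rd` in place so that only records still valid at `now`
   remain; returns the number of records kept. */
size_t
records_drop_expired (struct record *rd, size_t count, uint64_t now)
{
  size_t kept = 0;

  for (size_t i = 0; i < count; i++)
    if (rd[i].expiration > now)
      rd[kept++] = rd[i];
  return kept;
}
```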
6965
6966Records are stored along with a signature. The signature is generated using the
6967private key of the authoritative zone. This allows any GNS resolver to verify
6968the correctness of a name-value mapping.
6969
6970Internally, GNS uses the NAMECACHE to cache information obtained from other
6971users, the NAMESTORE to store information specific to the local users, and the
6972DHT to exchange data between users. A plugin API is used to enable applications
6973to define new GNS record types.
6974
6975@menu
6976* libgnunetgns::
6977* libgnunetgnsrecord::
6978* GNS plugins::
6979* The GNS Client-Service Protocol::
6980* Hijacking the DNS-Traffic using gnunet-service-dns::
6981* Serving DNS lookups via GNS on W32::
6982@end menu
6983
6984@node libgnunetgns
6985@subsection libgnunetgns
6986
6987@c %**end of header
6988
The GNS API itself is extremely simple. Clients first connect to the GNS service
6990using @code{GNUNET_GNS_connect}. They can then perform lookups using
6991@code{GNUNET_GNS_lookup} or cancel pending lookups using
6992@code{GNUNET_GNS_lookup_cancel}. Once finished, clients disconnect using
6993@code{GNUNET_GNS_disconnect}.
6994
6995
6996@menu
6997* Looking up records::
6998* Accessing the records::
6999* Creating records::
7000* Future work::
7001@end menu
7002
7003@node Looking up records
7004@subsubsection Looking up records
7005
7006@c %**end of header
7007
7008@code{GNUNET_GNS_lookup} takes a number of arguments:
7009
7010@table @asis
@item handle
This is simply the GNS connection handle from @code{GNUNET_GNS_connect}.
@item name
The client needs to specify the name to be resolved. This can be any valid
DNS or GNS hostname.
@item zone
The client needs to specify the public key of the GNS zone against which the
resolution should be done (the ".gnu" zone). Note that a key must be provided,
even if the name ends in ".zkey". This should typically be the public key of
the master-zone of the user.
@item type
This is the desired GNS or DNS record type to look for. While all records for
the given name will be returned, this can be important if the client wants to
resolve record types that themselves delegate resolution, such as CNAME, PKEY
or GNS2DNS. Resolving a record of any of these types will only work if the
respective record type is specified in the request, as the GNS resolver will
otherwise follow the delegation and return the records from the respective
destination, instead of the delegating record.
@item only_cached
This argument should typically be set to @code{GNUNET_NO}. Setting it to
@code{GNUNET_YES} disables resolution via the overlay network.
@item shorten_zone_key
If GNS encounters new names during resolution, their respective zones can
automatically be learned and added to the "shorten zone". If this is desired,
clients must pass the private key of the shorten zone. If NULL is passed,
shortening is disabled.
@item proc
This argument identifies the function to call with the result. It is given
@code{proc_cls}, the number of records found (possibly zero) and the array of
the records as arguments. @code{proc} will only be called once. After
@code{proc} has been called, the lookup must no longer be cancelled.
@item proc_cls
The closure for @code{proc}.
7039@end table
7040
7041@node Accessing the records
7042@subsubsection Accessing the records
7043
7044@c %**end of header
7045
7046The @code{libgnunetgnsrecord} library provides an API to manipulate the GNS
7047record array that is given to proc. In particular, it offers functions such as
7048converting record values to human-readable strings (and back). However, most
7049@code{libgnunetgnsrecord} functions are not interesting to GNS client
7050applications.
7051
7052For DNS records, the @code{libgnunetdnsparser} library provides functions for
7053parsing (and serializing) common types of DNS records.
7054
7055@node Creating records
7056@subsubsection Creating records
7057
7058@c %**end of header
7059
7060Creating GNS records is typically done by building the respective record
7061information (possibly with the help of @code{libgnunetgnsrecord} and
7062@code{libgnunetdnsparser}) and then using the @code{libgnunetnamestore} to
7063publish the information. The GNS API is not involved in this
7064operation.
7065
7066@node Future work
7067@subsubsection Future work
7068
7069@c %**end of header
7070
7071In the future, we want to expand @code{libgnunetgns} to allow applications to
7072observe shortening operations performed during GNS resolution, for example so
7073that users can receive visual feedback when this happens.
7074
7075@node libgnunetgnsrecord
7076@subsection libgnunetgnsrecord
7077
7078@c %**end of header
7079
7080The @code{libgnunetgnsrecord} library is used to manipulate GNS records (in
7081plaintext or in their encrypted format). Applications mostly interact with
7082@code{libgnunetgnsrecord} by using the functions to convert GNS record values
to strings or vice-versa, or to look up a GNS record type number by name (or
7084vice-versa). The library also provides various other functions that are mostly
7085used internally within GNS, such as converting keys to names, checking for
7086expiration, encrypting GNS records to GNS blocks, verifying GNS block
7087signatures and decrypting GNS records from GNS blocks.
7088
7089We will now discuss the four commonly used functions of the API.@
7090@code{libgnunetgnsrecord} does not perform these operations itself, but instead
7091uses plugins to perform the operation. GNUnet includes plugins to support
7092common DNS record types as well as standard GNS record types.
7093
7094
7095@menu
7096* Value handling::
7097* Type handling::
7098@end menu
7099
7100@node Value handling
7101@subsubsection Value handling
7102
7103@c %**end of header
7104
7105@code{GNUNET_GNSRECORD_value_to_string} can be used to convert the (binary)
7106representation of a GNS record value to a human readable, 0-terminated UTF-8
7107string. NULL is returned if the specified record type is not supported by any
7108available plugin.
7109
7110@code{GNUNET_GNSRECORD_string_to_value} can be used to try to convert a human
7111readable string to the respective (binary) representation of a GNS record
7112value.
7113
7114@node Type handling
7115@subsubsection Type handling
7116
7117@c %**end of header
7118
7119@code{GNUNET_GNSRECORD_typename_to_number} can be used to obtain the numeric
7120value associated with a given typename. For example, given the typename "A"
(for DNS A records), the function will return the number 1. A list of common
DNS record types is available
@uref{http://en.wikipedia.org/wiki/List_of_DNS_record_types, here}. Note that
not all DNS record types are supported by GNUnet GNSRECORD plugins at this
time.
7126
7127@code{GNUNET_GNSRECORD_number_to_typename} can be used to obtain the typename
7128associated with a given numeric value. For example, given the type number 1,
7129the function will return the typename "A".
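
Conceptually, both functions are lookups in a table of names and numbers. The
sketch below shows the idea for a handful of well-known DNS record types. The
function names carry a @code{sketch_} prefix to make clear that they are not
the library functions, and the return values for unknown inputs
(@code{UINT32_MAX} and @code{NULL}) are conventions chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct type_name {
  const char *name;
  uint32_t number;
};

/* A few well-known DNS record types with their assigned numbers. */
static const struct type_name dns_types[] = {
  { "A", 1 }, { "NS", 2 }, { "CNAME", 5 }, { "SOA", 6 },
  { "PTR", 12 }, { "MX", 15 }, { "TXT", 16 }, { "AAAA", 28 },
};

#define N_TYPES (sizeof (dns_types) / sizeof (dns_types[0]))

/* Returns UINT32_MAX for unknown names (convention of this sketch). */
uint32_t
sketch_typename_to_number (const char *name)
{
  for (size_t i = 0; i < N_TYPES; i++)
    if (0 == strcmp (dns_types[i].name, name))
      return dns_types[i].number;
  return UINT32_MAX;
}

/* Returns NULL for unknown numbers. */
const char *
sketch_number_to_typename (uint32_t number)
{
  for (size_t i = 0; i < N_TYPES; i++)
    if (dns_types[i].number == number)
      return dns_types[i].name;
  return NULL;
}
```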
7130
7131@node GNS plugins
7132@subsection GNS plugins
7133
7134@c %**end of header
7135
7136Adding a new GNS record type typically involves writing (or extending) a
7137GNSRECORD plugin. The plugin needs to implement the
7138@code{gnunet_gnsrecord_plugin.h} API which provides basic functions that are
7139needed by GNSRECORD to convert typenames and values of the respective record
7140type to strings (and back). These gnsrecord plugins are typically implemented
7141within their respective subsystems. Examples for such plugins can be found in
7142the GNSRECORD, GNS and CONVERSATION subsystems.
7143
7144The @code{libgnunetgnsrecord} library is then used to locate, load and query
7145the appropriate gnsrecord plugin. Which plugin is appropriate is determined by
7146the record type (which is just a 32-bit integer). The @code{libgnunetgnsrecord}
library loads all gnsrecord plugins that are installed at the local peer and
7148forwards the application request to the plugins. If the record type is not
7149supported by the plugin, it should simply return an error code.
7150
The central functions of the gnsrecord APIs (plugin and main library) are the same
7152four functions for converting between values and strings, and typenames and
7153numbers documented in the previous subsection.
7154
7155@node The GNS Client-Service Protocol
7156@subsection The GNS Client-Service Protocol
7157
7158@c %**end of header
7159
7160The GNS client-service protocol consists of two simple messages, the
7161@code{LOOKUP} message and the @code{LOOKUP_RESULT}. Each @code{LOOKUP} message
7162contains a unique 32-bit identifier, which will be included in the
7163corresponding response. Thus, clients can send many lookup requests in parallel
7164and receive responses out-of-order. A @code{LOOKUP} request also includes the
7165public key of the GNS zone, the desired record type and fields specifying
7166whether shortening is enabled or networking is disabled. Finally, the
7167@code{LOOKUP} message includes the name to be resolved.
7168
7169The response includes the number of records and the records themselves in the
7170format created by @code{GNUNET_GNSRECORD_records_serialize}. They can thus be
7171deserialized using @code{GNUNET_GNSRECORD_records_deserialize}.
7172
7173@node Hijacking the DNS-Traffic using gnunet-service-dns
7174@subsection Hijacking the DNS-Traffic using gnunet-service-dns
7175
7176@c %**end of header
7177
7178This section documents how the gnunet-service-dns (and the gnunet-helper-dns)
7179intercepts DNS queries from the local system.@ This is merely one method for
7180how we can obtain GNS queries. It is also possible to change @code{resolv.conf}
7181to point to a machine running @code{gnunet-dns2gns} or to modify libc's name
7182system switch (NSS) configuration to include a GNS resolution plugin. The
method described in this section is more of a last-ditch catch-all approach.
7184
7185@code{gnunet-service-dns} enables intercepting DNS traffic using policy based
7186routing. We MARK every outgoing DNS-packet if it was not sent by our
7187application. Using a second routing table in the Linux kernel these marked
7188packets are then routed through our virtual network interface and can thus be
7189captured unchanged.
7190
7191Our application then reads the query and decides how to handle it: A query to
7192an address ending in ".gnu" or ".zkey" is hijacked by @code{gnunet-service-gns}
7193and resolved internally using GNS. In the future, a reverse query for an
7194address of the configured virtual network could be answered with records kept
7195about previous forward queries. Queries that are not hijacked by some
7196application using the DNS service will be sent to the original recipient. The
7197answer to the query will always be sent back through the virtual interface with
7198the original nameserver as source address.
7199
7200
7201@menu
7202* Network Setup Details::
7203@end menu
7204
7205@node Network Setup Details
7206@subsubsection Network Setup Details
7207
7208@c %**end of header
7209
7210The DNS interceptor adds the following rules to the Linux kernel:
@example
iptables -t mangle -I OUTPUT 1 -p udp --sport $LOCALPORT --dport 53 -j ACCEPT
iptables -t mangle -I OUTPUT 2 -p udp --dport 53 -j MARK --set-mark 3
ip rule add fwmark 3 table2
ip route add default via $VIRTUALDNS table2
@end example

Line 1 makes sure that all packets coming from a port our application opened
beforehand (@code{$LOCALPORT}) will be routed normally. Line 2 marks every
other packet to a DNS server with mark 3 (chosen arbitrarily). Line 3 adds a
routing policy that sends packets carrying mark 3 to a second routing table,
and line 4 installs a default route in that table via @code{$VIRTUALDNS}, so
that the marked packets reach our virtual network interface.
7221
7222@node Serving DNS lookups via GNS on W32
7223@subsection Serving DNS lookups via GNS on W32
7224
7225@c %**end of header
7226
This section documents how libw32nsp (and gnunet-gns-helper-service-w32)
resolve DNS queries on the local system. This only applies to GNUnet
7229running on W32.
7230
7231W32 has a concept of "Namespaces" and "Namespace providers". These are used to
7232present various name systems to applications in a generic way. Namespaces
7233include DNS, mDNS, NLA and others. For each namespace any number of providers
7234could be registered, and they are queried in an order of priority (which is
7235adjustable).
7236
Applications can resolve names by using the WSALookupService*() family of
7238functions.
7239
7240However, these are WSA-only facilities. Common BSD socket functions for
7241namespace resolutions are gethostbyname and getaddrinfo (among others). These
7242functions are implemented internally (by default - by mswsock, which also
7243implements the default DNS provider) as wrappers around WSALookupService*()
7244functions (see "Sample Code for a Service Provider" on MSDN).
7245
On W32, GNUnet builds libw32nsp, a namespace provider, which can then be
installed into the system by using w32nsp-install (and uninstalled by
w32nsp-uninstall), as described in the "Installation Handbook".
7249
7250libw32nsp is very simple and has almost no dependencies. As a response to
7251NSPLookupServiceBegin(), it only checks that the provider GUID passed to it by
the caller matches the GNUnet DNS Provider GUID, checks that the name being resolved
7253ends in ".gnu" or ".zkey", then connects to gnunet-gns-helper-service-w32 at
7254127.0.0.1:5353 (hardcoded) and sends the name resolution request there,
7255returning the connected socket to the caller.
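
The name check is a plain suffix test. A minimal sketch follows, with two
simplifications that are assumptions of this example: it uses narrow
(@code{char}) strings rather than the wide strings a W32 namespace provider
receives, and it compares case-sensitively.

```c
#include <stdbool.h>
#include <string.h>

static bool
ends_with (const char *name, const char *suffix)
{
  size_t nlen = strlen (name);
  size_t slen = strlen (suffix);

  return (nlen >= slen) && (0 == strcmp (name + nlen - slen, suffix));
}

/* Would this name be handed to the GNS helper service? */
bool
is_gns_name (const char *name)
{
  return ends_with (name, ".gnu") || ends_with (name, ".zkey");
}
```

Names that fail this test are simply not handled by the provider, so the
other registered providers remain responsible for them.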
7256
7257When the caller invokes NSPLookupServiceNext(), libw32nsp reads a completely
7258formed reply from that socket, unmarshalls it, then gives it back to the
7259caller.
7260
At the moment, gnunet-gns-helper-service-w32 only ever gives one reply, and
subsequent calls to NSPLookupServiceNext() will fail with WSA_NODATA (the
first call to NSPLookupServiceNext() might also fail if GNS failed to find the
name, or if there was an error connecting to it).
7265
7266gnunet-gns-helper-service-w32 does most of the processing:
7267
7268@itemize @bullet
7269@item Maintains a connection to GNS.
7270@item Reads GNS config and loads appropriate keys.
7271@item Checks service GUID and decides on the type of record to look up,
refusing to make a lookup outright when an unsupported service GUID is passed.
7273@item Launches the lookup
7274@end itemize
7275
When the lookup result arrives, gnunet-gns-helper-service-w32 forms a complete
reply (including filling a WSAQUERYSETW structure and, possibly, a binary blob
with a hostent structure for a gethostbyname() client), marshalls it, and sends
7279it back to libw32nsp. If no records were found, it sends an empty header.
7280
7281This works for most normal applications that use gethostbyname() or
7282getaddrinfo() to resolve names, but fails to do anything with applications that
7283use alternative means of resolving names (such as sending queries to a DNS
server directly by themselves). This includes some well-known utilities,
7285like "ping" and "nslookup".
7286
7287@node The GNS Namecache
7288@section The GNS Namecache
7289
7290@c %**end of header
7291
7292The NAMECACHE subsystem is responsible for caching (encrypted) resolution
7293results of the GNU Name System (GNS). GNS makes zone information available to
7294other users via the DHT. However, as accessing the DHT for every lookup is
7295expensive (and as the DHT's local cache is lost whenever the peer is
7296restarted), GNS uses the NAMECACHE as a more persistent cache for DHT lookups.
7297Thus, instead of always looking up every name in the DHT, GNS first checks if
the result is already available locally in the NAMECACHE. Only if there is no
result in the NAMECACHE does GNS query the DHT. The NAMECACHE stores data in the
7300same (encrypted) format as the DHT. It thus makes no sense to iterate over all
7301items in the NAMECACHE --- the NAMECACHE does not have a way to provide the
7302keys required to decrypt the entries.
7303
7304Blocks in the NAMECACHE share the same expiration mechanism as blocks in the
DHT: the block expires whenever any of the records in the (encrypted) block
expires. The expiration time of the block is the only information stored in
plaintext. The NAMECACHE service internally performs all of the required work
to expire blocks; clients do not have to worry about this. Also, given that
7309NAMECACHE stores only GNS blocks that local users requested, there is no
7310configuration option to limit the size of the NAMECACHE. It is assumed to be
7311always small enough (a few MB) to fit on the drive.
7312
7313The NAMECACHE supports the use of different database backends via a plugin API.
7314
7315@menu
7316* libgnunetnamecache::
7317* The NAMECACHE Client-Service Protocol::
7318* The NAMECACHE Plugin API::
7319@end menu
7320
7321@node libgnunetnamecache
7322@subsection libgnunetnamecache
7323
7324@c %**end of header
7325
7326The NAMECACHE API consists of five simple functions. First, there is
7327@code{GNUNET_NAMECACHE_connect} to connect to the NAMECACHE service. This
7328returns the handle required for all other operations on the NAMECACHE. Using
7329@code{GNUNET_NAMECACHE_block_cache} clients can insert a block into the cache.
@code{GNUNET_NAMECACHE_lookup_block} can be used to look up blocks that were
7331stored in the NAMECACHE. Both operations can be cancelled using
7332@code{GNUNET_NAMECACHE_cancel}. Note that cancelling a
7333@code{GNUNET_NAMECACHE_block_cache} operation can result in the block being
7334stored in the NAMECACHE --- or not. Cancellation primarily ensures that the
7335continuation function with the result of the operation will no longer be
7336invoked. Finally, @code{GNUNET_NAMECACHE_disconnect} closes the connection to
7337the NAMECACHE.
7338
7339The maximum size of a block that can be stored in the NAMECACHE is
7340@code{GNUNET_NAMECACHE_MAX_VALUE_SIZE}, which is defined to be 63 kB.
7341
7342@node The NAMECACHE Client-Service Protocol
7343@subsection The NAMECACHE Client-Service Protocol
7344
7345@c %**end of header
7346
7347All messages in the NAMECACHE IPC protocol start with the @code{struct
7348GNUNET_NAMECACHE_Header} which adds a request ID (32-bit integer) to the
7349standard message header. The request ID is used to match requests with the
7350respective responses from the NAMECACHE, as they are allowed to happen
7351out-of-order.
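
The layout described above can be sketched as two nested structs. The struct
and function names below are hypothetical, and the sketch assumes that all
fields are transmitted in network byte order and that the standard message
header consists of a 16-bit size followed by a 16-bit type.

```c
#include <arpa/inet.h> /* htons, htonl, ntohl (POSIX) */
#include <stdbool.h>
#include <stdint.h>

struct sketch_message_header {
  uint16_t size; /* total message size, network byte order */
  uint16_t type; /* message type, network byte order */
};

struct sketch_namecache_header {
  struct sketch_message_header header;
  uint32_t r_id; /* request ID, network byte order */
};

struct sketch_namecache_header
make_header (uint16_t total_size, uint16_t type, uint32_t request_id)
{
  struct sketch_namecache_header h;

  h.header.size = htons (total_size);
  h.header.type = htons (type);
  h.r_id = htonl (request_id);
  return h;
}

/* Match a response to the request it answers, whatever the arrival order. */
bool
response_matches (const struct sketch_namecache_header *resp,
                  uint32_t request_id)
{
  return ntohl (resp->r_id) == request_id;
}
```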
7352
7353
7354@menu
7355* Lookup::
7356* Store::
7357@end menu
7358
7359@node Lookup
7360@subsubsection Lookup
7361
7362@c %**end of header
7363
The @code{struct LookupBlockMessage} is used to look up a block stored in the
7365cache. It contains the query hash. The NAMECACHE always responds with a
7366@code{struct LookupBlockResponseMessage}. If the NAMECACHE has no response, it
7367sets the expiration time in the response to zero. Otherwise, the response is
7368expected to contain the expiration time, the ECDSA signature, the derived key
7369and the (variable-size) encrypted data of the block.
7370
7371@node Store
7372@subsubsection Store
7373
7374@c %**end of header
7375
7376The @code{struct BlockCacheMessage} is used to cache a block in the NAMECACHE.
7377It has the same structure as the @code{struct LookupBlockResponseMessage}. The
7378service responds with a @code{struct BlockCacheResponseMessage} which contains
7379the result of the operation (success or failure). In the future, we might want
7380to make it possible to provide an error message as well.
7381
7382@node The NAMECACHE Plugin API
7383@subsection The NAMECACHE Plugin API
7384@c %**end of header
7385
7386The NAMECACHE plugin API consists of two functions, @code{cache_block} to store
a block in the database, and @code{lookup_block} to look up a block in the
7388database.
7389
7390
7391@menu
7392* Lookup2::
7393* Store2::
7394@end menu
7395
7396@node Lookup2
7397@subsubsection Lookup2
7398
7399@c %**end of header
7400
7401The @code{lookup_block} function is expected to return at most one block to the
7402iterator, and return @code{GNUNET_NO} if there were no non-expired results. If
7403there are multiple non-expired results in the cache, the lookup is supposed to
7404return the result with the largest expiration time.
7405
7406@node Store2
7407@subsubsection Store2
7408
7409@c %**end of header
7410
7411The @code{cache_block} function is expected to try to store the block in the
7412database, and return @code{GNUNET_SYSERR} if this was not possible for any
7413reason. Furthermore, @code{cache_block} is expected to implicitly perform cache
7414maintenance and purge blocks from the cache that have expired. Note that
7415@code{cache_block} might encounter the case where the database already has
7416another block stored under the same key. In this case, the plugin must ensure
that the block with the larger expiration time is preserved. This
can be done either by simply adding new blocks and selecting for the most recent
expiration time during lookup, or by checking which block is more recent during
7420the store operation.
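
The following in-memory sketch follows the second strategy (comparing
expiration times during the store operation). It is a toy, not a plugin: the
names, the fixed-size table, the integer stand-ins for query hash and
encrypted data, the explicit @code{now} parameter and the use of an output
parameter instead of an iterator callback are all assumptions of this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS 32 /* hypothetical capacity */

/* Stand-ins for the usual OK / NO / SYSERR return values. */
enum { SK_OK = 1, SK_NO = 0, SK_SYSERR = -1 };

struct cached_block {
  bool in_use;
  uint64_t query;      /* stand-in for the query hash */
  uint64_t expiration;
  uint32_t payload;    /* stand-in for the encrypted data */
};

static struct cached_block db[SLOTS];

int
sketch_cache_block (uint64_t query, uint64_t expiration, uint32_t payload,
                    uint64_t now)
{
  struct cached_block *free_slot = NULL;

  for (size_t i = 0; i < SLOTS; i++)
  {
    struct cached_block *b = &db[i];

    if (b->in_use && (b->expiration <= now))
      b->in_use = false; /* implicit maintenance: purge expired blocks */
    if (! b->in_use)
    {
      if (NULL == free_slot)
        free_slot = b;
      continue;
    }
    if (b->query == query)
    {
      if (expiration > b->expiration)
      {
        b->expiration = expiration;
        b->payload = payload;
      }
      return SK_OK; /* the block with the larger expiration survives */
    }
  }
  if (NULL == free_slot)
    return SK_SYSERR; /* table full */
  free_slot->in_use = true;
  free_slot->query = query;
  free_slot->expiration = expiration;
  free_slot->payload = payload;
  return SK_OK;
}

/* Returns at most one non-expired block via `payload`. */
int
sketch_lookup_block (uint64_t query, uint64_t now, uint32_t *payload)
{
  for (size_t i = 0; i < SLOTS; i++)
    if (db[i].in_use && (db[i].query == query) && (db[i].expiration > now))
    {
      *payload = db[i].payload;
      return SK_OK;
    }
  return SK_NO;
}
```

Since the store operation keeps at most one block per query, the lookup never
has to choose between several candidates.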
7421
7422@node The REVOCATION Subsystem
7423@section The REVOCATION Subsystem
7424@c %**end of header
7425
The REVOCATION subsystem is responsible for key revocation of Egos. If a user
learns that their private key has been compromised, or has lost it, they can use
the REVOCATION system to inform all of the other users that this private key is no
7429longer valid. The subsystem thus includes ways to query for the validity of
7430keys and to propagate revocation messages.
7431
7432@menu
7433* Dissemination::
7434* Revocation Message Design Requirements::
7435* libgnunetrevocation::
7436* The REVOCATION Client-Service Protocol::
7437* The REVOCATION Peer-to-Peer Protocol::
7438@end menu
7439
7440@node Dissemination
7441@subsection Dissemination
7442
7443@c %**end of header
7444
7445When a revocation is performed, the revocation is first of all disseminated by
7446flooding the overlay network. The goal is to reach every peer, so that when a
7447peer needs to check if a key has been revoked, this will be purely a local
7448operation where the peer looks at its local revocation list. Flooding the
7449network is also the most robust form of key revocation --- an adversary would
7450have to control a separator of the overlay graph to restrict the propagation of
7451the revocation message. Flooding is also very easy to implement --- peers that
7452receive a revocation message for a key that they have never seen before simply
7453pass the message to all of their neighbours.
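
The flooding rule can be modelled in a few lines. The following is a toy
simulation over an in-memory graph, not GNUnet code; the function name
@code{flood} and the representation of the overlay as a mapping from peers to
neighbour lists are assumptions made for illustration.

@example
def flood(neighbours, origin):
    # neighbours maps each peer to the list of peers it is connected to.
    seen = set([origin])   # peers that already know the revocation
    queue = [origin]       # peers that still have to forward it
    sent = 0               # total number of messages on the wire
    while queue:
        peer = queue.pop(0)
        for other in neighbours[peer]:
            sent += 1
            if other not in seen:
                # First time this peer sees the revocation: it will
                # forward the message to all of its own neighbours.
                seen.add(other)
                queue.append(other)
    return seen, sent
@end example

In this model every reached peer forwards the message exactly once to each
of its neighbours, so on a connected graph the number of messages is twice
the number of edges.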
7454
7455Flooding can only distribute the revocation message to peers that are online.
7456In order to notify peers that join the network later, the revocation service
7457performs efficient set reconciliation over the sets of known revocation
7458messages whenever two peers (that both support REVOCATION dissemination)
7459connect. The SET service is used to perform this operation
7460efficiently.
7461
7462@node Revocation Message Design Requirements
7463@subsection Revocation Message Design Requirements
7464
7465@c %**end of header
7466
7467However, flooding is also quite costly, creating O(|E|) messages on a network
7468with |E| edges. Thus, revocation messages are required to contain a
7469proof-of-work, the result of an expensive computation (which, however, is cheap
7470to verify). Only peers that have expended the CPU time necessary to provide
7471this proof will be able to flood the network with the revocation message. This
7472ensures that an attacker cannot simply flood the network with millions of
7473revocation messages. The proof-of-work required by GNUnet is set to take days
7474on a typical PC to compute; if the ability to quickly revoke a key is needed,
7475users have the option to pre-compute revocation messages to store off-line and
7476use instantly once their key has been compromised.
7477
7478Revocation messages must also be signed by the private key that is being
7479revoked. Thus, they can only be created while the private key is in the
7480possession of the respective user. This is another reason to create a
7481revocation message ahead of time and store it in a secure location.
7482
7483@node libgnunetrevocation
7484@subsection libgnunetrevocation
7485
7486@c %**end of header
7487
7488The REVOCATION API consists of two parts: querying for revoked keys and
7489preparing and issuing revocations.
7490
7491
7492@menu
7493* Querying for revoked keys::
7494* Preparing revocations::
7495* Issuing revocations::
7496@end menu
7497
7498@node Querying for revoked keys
7499@subsubsection Querying for revoked keys
7500
7501@c %**end of header
7502
7503@code{GNUNET_REVOCATION_query} is used to check if a given ECDSA public key has
7504been revoked. The given callback will be invoked with the result of the check.
7505The query can be cancelled using @code{GNUNET_REVOCATION_query_cancel} on the
7506return value.
7507
7508@node Preparing revocations
7509@subsubsection Preparing revocations
7510
7511@c %**end of header
7512
7513It is often desirable to create a revocation record ahead-of-time and store it
7514in an off-line location to be used later in an emergency. This is particularly
7515true for GNUnet revocations, where performing the revocation operation itself
7516is computationally expensive and thus is likely to take some time. Thus, if
7517users want the ability to perform revocations quickly in an emergency, they
7518must pre-compute the revocation message. The revocation API enables this with
7519two functions that are used to compute the revocation message, but not trigger
7520the actual revocation operation.
7521
7522@code{GNUNET_REVOCATION_check_pow} should be used to calculate the
7523proof-of-work required in the revocation message. This function takes the
7524public key, the required number of bits for the proof of work (which in GNUnet
7525is a network-wide constant) and finally a proof-of-work number as arguments.
7526The function then checks if the given proof-of-work number is a valid proof of
7527work for the given public key. Clients preparing a revocation are expected to
7528call this function repeatedly (typically with a monotonically increasing
7529sequence of candidate proof-of-work numbers) until one of them satisfies
7530the check. That number should then be saved for later use in the revocation
7531operation.
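
The search loop that a client runs around the check can be sketched as
follows. This is a self-contained toy and does not call the GNUnet library:
@code{check_pow} is a hypothetical stand-in for
@code{GNUNET_REVOCATION_check_pow}, and the use of SHA-256, the encoding of
the number and the ``leading zero bits'' criterion are all assumptions chosen
for illustration.

@example
import hashlib

def leading_zero_bits(digest):
    # Count the zero bits at the start of a byte string.
    count = 0
    for byte in digest:
        if byte == 0:
            count += 8
            continue
        count += 8 - byte.bit_length()
        break
    return count

def check_pow(public_key, pow_number, matching_bits):
    # Toy check: hash key and number, require enough leading zero bits.
    data = public_key + pow_number.to_bytes(8, "big")
    digest = hashlib.sha256(data).digest()
    return leading_zero_bits(digest) >= matching_bits

def find_pow(public_key, matching_bits):
    # Try a monotonically increasing sequence of candidate numbers.
    pow_number = 0
    while not check_pow(public_key, pow_number, matching_bits):
        pow_number += 1
    return pow_number
@end example

The asymmetry described above is visible here: @code{find_pow} may need many
iterations, while verifying the saved number takes a single call to
@code{check_pow}.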
7532
7533@code{GNUNET_REVOCATION_sign_revocation} is used to generate the signature that
7534is required in a revocation message. It takes the private key that (possibly in
7535the future) is to be revoked and returns the signature. The signature can again
7536be saved to disk for later use, which will then allow performing a revocation
7537even without access to the private key.
7538
7539@node Issuing revocations
7540@subsubsection Issuing revocations
7541
7542
7543Given an ECDSA public key, the signature from @code{GNUNET_REVOCATION_sign_revocation} and
7544the proof-of-work, @code{GNUNET_REVOCATION_revoke} can be used to perform the
7545actual revocation. The given callback is called upon completion of the
7546operation. @code{GNUNET_REVOCATION_revoke_cancel} can be used to stop the
7547library from calling the continuation; however, in that case it is undefined
7548whether or not the revocation operation will be executed.
7549
7550@node The REVOCATION Client-Service Protocol
7551@subsection The REVOCATION Client-Service Protocol
7552
7553
7554The REVOCATION protocol consists of four simple messages.
7555
7556A @code{QueryMessage} containing a public ECDSA key is used to check if a
7557particular key has been revoked. The service responds with a
7558@code{QueryResponseMessage} which simply contains a bit that says if the given
7559public key is still valid, or if it has been revoked.
7560
7561The second possible interaction is for a client to revoke a key by passing a
7562@code{RevokeMessage} to the service. The @code{RevokeMessage} contains the
7563ECDSA public key to be revoked, a signature by the corresponding private key
7564and the proof-of-work. The service responds with a
7565@code{RevocationResponseMessage} which can be used to indicate that the
7566@code{RevokeMessage} was invalid (e.g. because the proof-of-work is incorrect), or
7567otherwise indicates that the revocation has been processed successfully.
7568
7569@node The REVOCATION Peer-to-Peer Protocol
7570@subsection The REVOCATION Peer-to-Peer Protocol
7571
7572@c %**end of header
7573
7574Revocation uses two disjoint ways to spread revocation information among peers.
7575First of all, P2P gossip exchanged via CORE-level neighbours is used to quickly
7576spread revocations to all connected peers. Second, whenever two peers (that
7577both support revocations) connect, the SET service is used to compute the union
7578of the respective revocation sets.
7579
7580In both cases, the exchanged messages are @code{RevokeMessage}s which contain
7581the public key that is being revoked, a matching ECDSA signature, and a
7582proof-of-work. Whenever a peer learns about a new revocation this way, it first
7583validates the signature and the proof-of-work, then stores it to disk
7584(typically to a file @file{$GNUNET_DATA_HOME/revocation.dat}) and finally spreads the
7585information to all directly connected neighbours.
7586
7587For computing the union using the SET service, the peer with the smaller hashed
7588peer identity will connect (as a "client" in the two-party set protocol) to the
7589other peer after one second (to reduce traffic spikes on connect) and initiate
7590the computation of the set union. All revocation services use a common hash to
7591identify the SET operation over revocation sets.
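
The rule for deciding which side initiates the set union can be sketched as
follows. The helper names are hypothetical, and both the use of SHA-512 and
the byte-wise comparison of the digests are assumptions; the service may
compare hashed identities differently.

@example
import hashlib

def hashed_identity(peer_id):
    # Assumption: the peer identity is hashed with SHA-512.
    return hashlib.sha512(peer_id).digest()

def initiates_set_union(my_id, other_id):
    # The peer with the smaller hashed identity acts as the "client"
    # and starts the set union with the other peer.
    return hashed_identity(my_id) < hashed_identity(other_id)
@end example

For two distinct peers exactly one side obtains a true result, so only one
set union is started per connection.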
7592
7593The current implementation accepts revocation set union operations from all
7594peers at any time; however, well-behaved peers should only initiate this
7595operation once after establishing a connection to a peer with a larger hashed
7596peer identity.
7597
7598@node GNUnet's File-sharing (FS) Subsystem
7599@section GNUnet's File-sharing (FS) Subsystem
7600
7601@c %**end of header
7602
7603This chapter describes the details of how the file-sharing service works. As
7604with all services, it is split into an API (libgnunetfs), the service process
7605(gnunet-service-fs) and user interface(s). The file-sharing service uses the
7606datastore service to store blocks and the DHT (and indirectly datacache) for
7607lookups for non-anonymous file-sharing.@ Furthermore, the file-sharing service
7608uses the block library (and the block fs plugin) for validation of DHT
7609operations.
7610
7611In contrast to many other services, libgnunetfs is rather complex since the
7612client library includes a large number of high-level abstractions; this is
7613necessary since the FS service itself largely only operates on the block level.
7614The FS library is responsible for providing a file-based abstraction to
7615applications, including directories, meta data, keyword search, verification,
7616and so on.
7617
7618The method used by GNUnet to break large files into blocks and to use keyword
7619search is called the "Encoding for Censorship Resistant Sharing" (ECRS). ECRS
7620is largely implemented in the FS library; block validation is also reflected in
7621the block FS plugin and the FS service. ECRS on-demand encoding is implemented
7622in the FS service.
7623
7624NOTE: The documentation in this chapter is quite incomplete.
7625
7626@menu
7627* Encoding for Censorship-Resistant Sharing (ECRS)::
7628* File-sharing persistence directory structure::
7629@end menu
7630
7631@node Encoding for Censorship-Resistant Sharing (ECRS)
7632@subsection Encoding for Censorship-Resistant Sharing (ECRS)
7633
7634@c %**end of header
7635
7636When GNUnet shares files, it uses a content encoding that is called ECRS, the
7637Encoding for Censorship-Resistant Sharing. Most of ECRS is described in the
7638(so far unpublished) research paper attached to this page. ECRS obsoletes the
7639previous ESED and ESED II encodings which were used in GNUnet before version
76400.7.0.@ @ The rest of this page assumes that the reader is familiar with the
7641attached paper. What follows is a description of some minor extensions that
7642GNUnet makes over what is described in the paper. The reason why these
7643extensions are not in the paper is that we felt that they were obvious or
7644trivial extensions to the original scheme and thus did not warrant space in
7645the research report.
7646
7647
7648@menu
7649* Namespace Advertisements::
7650* KSBlocks::
7651@end menu
7652
7653@node Namespace Advertisements
7654@subsubsection Namespace Advertisements
7655
7656@c %**end of header
7657@c %**FIXME: all zeroses -> ?
7658
7659An @code{SBlock} with identifier all zeros is a signed
7660advertisement for a namespace. This special @code{SBlock} contains metadata
7661describing the content of the namespace. Instead of the name of the identifier
7662for a potential update, it contains the identifier for the root of the
7663namespace. The URI should always be empty. The @code{SBlock} is signed with
7664the content provider's RSA private key (just like any other @code{SBlock}). Peers
7665can search for @code{SBlock}s in order to find out more about a namespace.
7666
7667@node KSBlocks
7668@subsubsection KSBlocks
7669
7670@c %**end of header
7671
7672GNUnet implements @code{KSBlocks} which are @code{KBlocks} that, instead of
7673encrypting a CHK and metadata, encrypt an @code{SBlock} instead. In other
7674words, @code{KSBlocks} enable GNUnet to find @code{SBlocks} using the global
7675keyword search. Usually the encrypted @code{SBlock} is a namespace
7676advertisement. The rationale behind @code{KSBlock}s and @code{SBlock}s is to
7677enable peers to discover namespaces via keyword searches, and to associate
7678useful information with namespaces. When GNUnet finds @code{KSBlocks} during a
7679normal keyword search, it adds the information to an internal list of
7680discovered namespaces. Users looking for interesting namespaces can then
7681inspect this list, reducing the need for out-of-band discovery of namespaces.
7682Naturally, namespaces (or more specifically, namespace advertisements) can
7683also be referenced from directories, but @code{KSBlock}s should make it easier
7684to advertise namespaces for the owner of the pseudonym since they eliminate
7685the need to first create a directory.
7686
7687Collections are also advertised using @code{KSBlock}s.
7688
7689@table @asis
7690@item Attachment Size
7691@item ecrs.pdf 270.68 KB
7692@item https://gnunet.org/sites/default/files/ecrs.pdf
7693@end table
7694
7695@node File-sharing persistence directory structure
7696@subsection File-sharing persistence directory structure
7697
7698@c %**end of header
7699
7700This section documents how the file-sharing library implements persistence of
7701file-sharing operations and specifically the resulting directory structure.
7702This code is only active if the @code{GNUNET_FS_FLAGS_PERSISTENCE} flag was set
7703when calling @code{GNUNET_FS_start}. In this case, the file-sharing library
7704will try hard to ensure that all major operations (searching, downloading,
7705publishing, unindexing) are persistent, that is, can live longer than the
7706process itself. More specifically, an operation is supposed to live until it is
7707explicitly stopped.
7708
7709If @code{GNUNET_FS_stop} is called before an operation has been stopped, a
7710@code{SUSPEND} event is generated and then when the process calls
7711@code{GNUNET_FS_start} next time, a @code{RESUME} event is generated.
7712Additionally, even if an application crashes (segfault, SIGKILL, system crash)
7713and hence @code{GNUNET_FS_stop} is never called and no @code{SUSPEND} events
7714are generated, operations are still resumed (with @code{RESUME} events). This
7715is implemented by constantly writing the current state of the file-sharing
7716operations to disk. Specifically, the current state is always written to disk
7717whenever anything significant changes (the exceptions are block-wise progress in
7718publishing and unindexing, since those operations would be slowed down
7719significantly and can be resumed cheaply even without detailed accounting).
7720Note that if the process crashes (or is killed) during a serialization
7721operation, FS does not guarantee that this specific operation is recoverable
7722(no strict transactional semantics, again for performance reasons). However,
7723all other unrelated operations should resume nicely.
7724
7725Since we need to serialize the state continuously and want to recover as much
7726as possible even after crashing during a serialization operation, we do not use
7727one large file for serialization. Instead, several directories are used for the
7728various operations. When @code{GNUNET_FS_start} executes, the master
7729directories are scanned for files describing operations to resume. Sometimes,
7730these operations can refer to related operations in child directories which may
7731also be resumed at this point. Note that corrupted files are cleaned up
7732automatically. However, dangling files in child directories (those that are not
7733referenced by files from the master directories) are not automatically removed.
7734
7735Persistence data is kept in a directory that begins with the "STATE_DIR" prefix
7736from the configuration file (by default, "$SERVICEHOME/persistence/") followed
7737by the name of the client as given to @code{GNUNET_FS_start} (for example,
7738"gnunet-gtk") followed by the actual name of the master or child directory.
7739
7740The names for the master directories follow the names of the operations:
7741
7742@itemize @bullet
7743@item "search"
7744@item "download"
7745@item "publish"
7746@item "unindex"
7747@end itemize
7748
7749Each of the master directories contains a file (with a name chosen at random)
7750for each active top-level (master) operation.
7751Note that a download that is associated with a search result is not a
7752top-level operation.
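
The scan of the master directories at start-up can be pictured with the
following sketch. It is not the library code: the function
@code{operations_to_resume} is hypothetical, and it assumes that the prefix,
the client name and the directory name are joined as ordinary path components
and that every top-level operation is stored as a regular file.

@example
import os

MASTER_DIRS = ("search", "download", "publish", "unindex")

def operations_to_resume(state_dir, client_name):
    # Return (operation type, file name) pairs found in the master
    # directories below state_dir/client_name.
    found = []
    for master in MASTER_DIRS:
        path = os.path.join(state_dir, client_name, master)
        if not os.path.isdir(path):
            continue
        for name in sorted(os.listdir(path)):
            # Only regular files describe top-level operations here.
            if os.path.isfile(os.path.join(path, name)):
                found.append((master, name))
    return found
@end example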
7753
7754In contrast to the master directories, the child directories are only
7755consulted when another operation refers to them.
7756For each search, a subdirectory (named after the master search
7757synchronization file) contains the search results.
7758Search results can have an associated download, which is then stored in
7759the general "download-child" directory.
7760Downloads can be recursive, in which case children are stored in
7761subdirectories mirroring the structure of the recursive download
7762(either starting in the master "download" directory or in the
7763"download-child" directory depending on how the download was initiated).
7764For publishing operations, the "publish-file" directory contains
7765information about the individual files and directories that are part of
7766the publication.
7767However, this directory structure is flat and does not mirror the
7768structure of the publishing operation.
7769Note that unindex operations cannot have associated child operations.
7770
7771@cindex REGEX subsystem
7772@cindex regex subsystem
7773@node GNUnet's REGEX Subsystem
7774@section GNUnet's REGEX Subsystem
7775
7776@c %**end of header
7777
7778Using the REGEX subsystem, you can discover peers that offer a particular
7779service using regular expressions.
7780The peers that offer a service specify it using a regular expression.
7781Peers that want to patronize a service search using a string.
7782The REGEX subsystem will then use the DHT to return a set of matching
7783offerers to the patrons.
7784
7785For the technical details, we have Max's defense talk and Max's Master's
7786thesis.
7787
7788@c An additional publication is under preparation and available to
7789@c team members (in Git).
7790@c FIXME: Where is the file? Point to it. Assuming that it's szengel2012ms
7791
7792@menu
7793* How to run the regex profiler::
7794@end menu
7795
7796@node How to run the regex profiler
7797@subsection How to run the regex profiler
7798
7799@c %**end of header
7800
7801The gnunet-regex-profiler can be used to profile the usage of mesh/regex
7802for a given set of regular expressions and strings.
7803Mesh/regex allows you to announce your peer ID under a certain regex and
7804search for peers matching a particular regex using a string.
7805See @uref{https://gnunet.org/szengel2012ms, szengel2012ms} for a full
7806introduction.
7807
7808First of all, the regex profiler uses GNUnet testbed, thus all the
7809implications for testbed also apply to the regex profiler
7810(for example, you need password-less ssh login to the machines listed in
7811your hosts file).
7812
7813@strong{Configuration}
7814
7815Moreover, an appropriate configuration file is needed.
7816Generally you can refer to the
7817@file{contrib/regex_profiler_infiniband.conf} file in the source code
7818of GNUnet for an example configuration.
7819In the following paragraph the important details are highlighted.
7820
7821The regular expressions are announced by the
7822gnunet-daemon-regexprofiler, so you have to make sure it is
7823started by adding it to the AUTOSTART set of ARM:
7824
7825@example
7826[regexprofiler]
7827AUTOSTART = YES
7828@end example
7829
7830@noindent
7831Furthermore you have to specify the location of the binary:
7832
7833@example
7834[regexprofiler]
7835# Location of the gnunet-daemon-regexprofiler binary.
7836BINARY = /home/szengel/gnunet/src/mesh/.libs/gnunet-daemon-regexprofiler
7837# Regex prefix that will be applied to all regular expressions and
7838# search strings.
7839REGEX_PREFIX = "GNVPN-0001-PAD"
7840@end example
7841
7842@noindent
7843When running the profiler with a large scale deployment, you probably
7844want to reduce the workload of each peer.
7845Use the following options to do this.
7846
7847@example
7848[dht]
7849# Force network size estimation
7850FORCE_NSE = 1
7851
7852[dhtcache]
7853DATABASE = heap
7854# Disable RC-file for Bloom filter? (for benchmarking with limited IO
7855# availability)
7856DISABLE_BF_RC = YES
7857# Disable Bloom filter entirely
7858DISABLE_BF = YES
7859
7860[nse]
7861# Minimize proof-of-work CPU consumption by NSE
7862WORKBITS = 1
7863@end example
7864
7865@noindent
7866@strong{Options}
7867
7868Finally, to run the profiler, some options and the input data need to be
7869specified on the command line.
7870
7871@example
7872gnunet-regex-profiler -c config-file -d log-file -n num-links \
7873-p path-compression-length -s search-delay -t matching-timeout \
7874-a num-search-strings hosts-file policy-dir search-strings-file
7875@end example
7876
7877@noindent
7878Where...
7879
7880@itemize @bullet
7881@item ... @code{config-file} means the configuration file created earlier.
7882@item ... @code{log-file} is the file to which statistics output is written.
7883@item ... @code{num-links} indicates the number of random links between
7884started peers.
7885@item ... @code{path-compression-length} is the maximum path compression
7886length in the DFA.
7887@item ... @code{search-delay} is the time to wait between the peers finishing
7888linking and the start of string matching.
7889@item ... @code{matching-timeout} is the timeout after which the search is
7890cancelled.
7891@item ... @code{num-search-strings} is the number of strings in the
7892search-strings-file.
7893@item ... the @code{hosts-file} should contain a list of hosts for the
7894testbed, one per line in the following format:
7895
7896@itemize @bullet
7897@item @code{user@@host_ip:port}
7898@end itemize
7899@item ... the @code{policy-dir} is a folder of text files, each
7900containing one or more regular expressions. A peer is started for each
7901file in that folder and the regular expressions in the corresponding file
7902are announced by this peer.
7903@item ... the @code{search-strings-file} is a text file containing search
7904strings, one per line.
7905@end itemize
7906
7907@noindent
7908You can create regular expressions and search strings for every AS in the
7909Internet using the attached scripts. You need one of the
7910@uref{http://data.caida.org/datasets/routing/routeviews-prefix2as/, CAIDA
7911routeviews prefix2as} data files for this. Run
7912
7913@example
7914create_regex.py <filename> <output path>
7915@end example
7916
7917@noindent
7918to create the regular expressions and
7919
7920@example
7921create_strings.py <input path> <outfile>
7922@end example
7923
7924@noindent
7925to create a search strings file from the previously created
7926regular expressions.