1 files changed, 7486 insertions, 0 deletions
diff --git a/doc/chapters/developer.texi b/doc/chapters/developer.texi
new file mode 100644
index 000000000..ce6b16087
--- /dev/null
+++ b/doc/chapters/developer.texi
@@ -0,0 +1,7486 @@
+@c ***************************************************************************
+@node GNUnet Developer Handbook
+@chapter GNUnet Developer Handbook
+This book is intended to be an introduction for programmers that want to
+extend the GNUnet framework. GNUnet is more than a simple peer-to-peer
+application. For developers, GNUnet is:
+@itemize @bullet
+@item Free software under the GNU General Public License, with a community
+that believes in the GNU philosophy
+@item
+A set of standards, including coding conventions and architectural rules
+@item
+A set of layered protocols, specifying both the communication between peers
+and the communication between components of a single peer
+@item
+A set of libraries with well-defined APIs suitable for writing extensions
+@end itemize
+In particular, the architecture specifies that a peer consists of many
+processes communicating via protocols. Processes can be written in almost
+any language. C and Java APIs exist for accessing existing services and for
+writing extensions. It is possible to write extensions in other languages by
+implementing the necessary IPC protocols.
+GNUnet can be extended and improved along many possible dimensions, and anyone
+interested in free software and freedom-enhancing networking is welcome to
+join the effort. This developer handbook attempts to provide an initial
+introduction to some of the key design choices and central components of the
+system. This manual is far from complete, and we welcome informed
+contributions, be it in the form of new chapters or insightful comments.
+However, the website is experiencing a constant onslaught of sophisticated
+link-spam entered manually by exploited workers solving puzzles and
+customizing text. To limit this commercial defacement, we are strictly
+moderating comments and have disallowed "normal" users from posting new
+content. However, this is really only intended to keep the spam at bay. If
+you are a real user or aspiring developer, please drop us a note (IRC, e-mail,
+contact form) with your user profile ID number included. We will then relax
+these restrictions on your account. We're sorry for this inconvenience;
+however, few people would want to read this site if 99% of it was
+advertisements for bogus websites.
+@c ***************************************************************************
+@menu
+* Developer Introduction::
+* Code overview::
+* System Architecture::
+* Subsystem stability::
+* Naming conventions and coding style guide::
+* Build-system::
+* Developing extensions for GNUnet using the gnunet-ext template::
+* Writing testcases::
+* GNUnet's TESTING library::
+* Performance regression analysis with Gauger::
+* GNUnet's TESTBED Subsystem::
+* libgnunetutil::
+* The Automatic Restart Manager (ARM)::
+* GNUnet's TRANSPORT Subsystem::
+* NAT library::
+* Distance-Vector plugin::
+* SMTP plugin::
+* Bluetooth plugin::
+* WLAN plugin::
+* The ATS Subsystem::
+* GNUnet's CORE Subsystem::
+* GNUnet's CADET subsystem::
+* GNUnet's NSE subsystem::
+* GNUnet's HOSTLIST subsystem::
+* GNUnet's IDENTITY subsystem::
+* GNUnet's NAMESTORE Subsystem::
+* GNUnet's PEERINFO subsystem::
+* GNUnet's PEERSTORE subsystem::
+* GNUnet's SET Subsystem::
+* GNUnet's STATISTICS subsystem::
+* GNUnet's Distributed Hash Table (DHT)::
+* The GNU Name System (GNS)::
+* The GNS Namecache::
+* The REVOCATION Subsystem::
+* GNUnet's File-sharing (FS) Subsystem::
+* GNUnet's REGEX Subsystem::
+@end menu
+@node Developer Introduction
+@section Developer Introduction
+This developer handbook is intended as a first introduction to GNUnet for new
+developers that want to extend the GNUnet framework. After the introduction,
+each of the GNUnet subsystems (directories in the src/ tree) is (supposed to
+be) covered in its own chapter. In addition to this documentation, GNUnet
+developers should be aware of the services available on the GNUnet server to
+them.
+New developers can have a look at the GNUnet tutorials for C and Java available
+in the src/ directory of the repository or under the following links:
+@itemize @bullet
+@item GNUnet C tutorial
+@item GNUnet Java tutorial
+@end itemize
+In addition to this book, the GNUnet server contains various resources for
+GNUnet developers. They are all conveniently reachable via the "Developer"
+entry in the navigation menu. Some additional tools (such as static analysis
+reports) require a special developer access to perform certain operations. If
+you feel you need access, you should contact
+@uref{http://grothoff.org/christian/, Christian Grothoff}, GNUnet's maintainer.
+The public subsystems on the GNUnet server that help developers are:
+@itemize @bullet
+@item The version control system keeps our code and enables distributed
+development. Only developers with write access can commit code; everyone else
+is encouraged to submit patches to the
+@uref{http://mail.gnu.org/mailman/listinfo/gnunet-developers, developer
+mailing list}.
+@item The GNUnet bugtracking system is used to track feature requests, open bug
+reports and their resolutions. Anyone can report bugs; only developers can
+claim to have fixed them.
+@item A buildbot is used to check GNUnet builds automatically on a range of
+platforms. Builds are triggered automatically after 30 minutes of no changes to
+Git.
+@item The current quality of our automated test suite is assessed using code
+coverage analysis. This analysis is run daily; however, the webpage is only
+updated if all automated tests pass at that time. Testcases that improve our
+code coverage are always welcome.
+@item We try to automatically find bugs using a static analysis scan. This scan
+is run daily; however, the webpage is only updated if all automated tests pass
+at the time. Note that not everything that is flagged by the analysis is a bug;
+sometimes even good code can be marked as possibly problematic. Nevertheless,
+developers are encouraged to at least be aware of all issues in their code that
+are listed.
+@item We use Gauger for automatic performance regression visualization. Details
+on how to use Gauger are here.
+@item We use @uref{http://junit.org/, JUnit} to automatically test gnunet-java.
+Automatically generated, current reports on the test suite are here.
+@item We use Cobertura to generate test coverage reports for gnunet-java.
+Current reports on test coverage are here.
+@end itemize
+@c ***************************************************************************
+@menu
+* Project overview::
+@end menu
+@node Project overview
+@subsection Project overview
+The GNUnet project currently consists of several sub-projects. This section
+gives an initial overview of the various sub-projects. Note that this
+description also lists projects that are far from complete, including even
+those that do not yet contain a single line of code.
+GNUnet sub-projects in order of likely relevance are currently:
+@table @asis
+@item svn/gnunet
+Core of the P2P framework, including file-sharing, VPN and
+chat applications; this is what the developer handbook covers mostly
+@item svn/gnunet-gtk/
+Gtk+-based user interfaces, including gnunet-fs-gtk
+(file-sharing), gnunet-statistics-gtk (statistics over time),
+gnunet-peerinfo-gtk (information about current connections and known peers),
+gnunet-chat-gtk (chat GUI) and gnunet-setup (setup tool for "everything")
+@item svn/gnunet-fuse/
+Mounting directories shared via GNUnet's file-sharing on Linux
+@item svn/gnunet-update/
+Installation and update tool
+@item svn/gnunet-ext/
+Template for starting 'external' GNUnet projects
+@item svn/gnunet-java/
+Java APIs for writing GNUnet services and applications
+@item svn/gnunet-www/
+Code and media helping drive the GNUnet website
+@item svn/eclectic/
+Code to run GNUnet nodes on testbeds for research, development, testing and
+evaluation
+@item svn/gnunet-qt/
+Qt-based GNUnet GUI (dead?)
+@item svn/gnunet-cocoa/
+Cocoa-based GNUnet GUI (dead?)
+@end table
+We are also working on various supporting libraries and tools:
+@table @asis
+@item svn/Extractor/
+GNU libextractor (meta data extraction)
+@item svn/libmicrohttpd/
+GNU libmicrohttpd (embedded HTTP(S) server library)
+@item svn/gauger/
+Tool for performance regression analysis
+@item svn/monkey/
+Tool for automated debugging of distributed systems
+@item svn/libmwmodem/
+Library for accessing satellite connection quality reports
+@end table
+Finally, there are various external projects (see links for a list of those
+that have a public website) which build on top of the GNUnet framework.
+@c ***************************************************************************
+@node Code overview
+@section Code overview
+This section gives a brief overview of the GNUnet source code. Specifically, we
+sketch the function of each of the subdirectories in the @code{gnunet/src/}
+directory. The order given is roughly bottom-up (in terms of the layers of the
+system).
+@table @asis
+@item util/ --- libgnunetutil
+Library with general utility functions; all GNUnet binaries link against this
+library. It covers anything from memory allocation and data structures to
+cryptography and inter-process communication. The goal is to provide an
+OS-independent interface and more 'secure' or convenient implementations of
+commonly used primitives. The API is spread over more than a dozen headers;
+developers should study those closely to avoid duplicating existing functions.
+@item hello/ --- libgnunethello
+HELLO messages are used to describe under which addresses a peer can be
+reached (for example, protocol, IP, port). This library manages parsing and
+generating of HELLO messages.
+@item block/ --- libgnunetblock
+The DHT and other components of GNUnet store information in units called
+'blocks'. Each block has a type and the type defines a particular format and
+how that binary format is to be linked to a hash code (the key for the DHT and
+for databases). The block library is a wrapper around block plugins which
+provide the necessary functions for each block type.
+@item statistics/
+The statistics service enables associating values (of type uint64_t) with a
+component name and a string. The main uses are debugging (counting events),
+performance tracking and user entertainment (what did my peer do today?).
+@item arm/
+The automatic-restart-manager (ARM) service is the GNUnet master service. Its
+role is to start gnunet-services, to re-start them when they crash and
+finally to shut down the system when requested.
+@item peerinfo/
+The peerinfo service keeps track of which peers are known to the local peer
+and also tracks the validated addresses (in the form of a HELLO message) for
+each of those peers. The peer is not necessarily connected to all peers known
+to the peerinfo service. Peerinfo provides persistent storage for peer
+identities --- peers are not forgotten just because of a system restart.
+@item datacache/ --- libgnunetdatacache
+The datacache library provides (temporary) block storage for the DHT.
+Existing plugins can store blocks in Sqlite, Postgres or MySQL databases. All
+data stored in the cache is lost when the peer is stopped or restarted
+(datacache uses temporary tables).
+@item datastore/
+The datastore service stores file-sharing blocks in databases for extended
+periods of time. In contrast to the datacache, data is not lost when peers
+restart. However, quota restrictions may still cause old, expired or
+low-priority data to be eventually discarded. Existing plugins can store
+blocks in Sqlite, Postgres or MySQL databases.
+@item template/
+Template for writing a new service. Does nothing.
+@item ats/
+The automatic transport selection (ATS) service is responsible for deciding
+which address (i.e. which transport plugin) should be used for communication
+with other peers, and at what bandwidth.
+@item nat/ --- libgnunetnat
+Library that provides basic functions for NAT traversal. The library supports
+NAT traversal with manual hole-punching by the user, UPnP and ICMP-based
+autonomous NAT traversal. The library also includes an API for testing if the
+current configuration works and the @code{gnunet-nat-server} which provides an
+external service to test the local configuration.
+@item fragmentation/ --- libgnunetfragmentation
+Some transports (UDP and WLAN, mostly) have restrictions on the maximum
+transmission unit (MTU) for packets. The fragmentation library can be used to
+break larger packets into chunks of at most 1k and transmit the resulting
+fragments reliably (with acknowledgement, retransmission, timeouts, etc.).
+@item transport/
+The transport service is responsible for managing the basic P2P
+communication. It uses plugins to support P2P communication over TCP, UDP,
+HTTP, HTTPS and other protocols. The transport service validates peer
+addresses, enforces bandwidth restrictions, limits the total number of
+connections and enforces connectivity restrictions (e.g. friends-only).
+@item peerinfo-tool/
+This directory contains the gnunet-peerinfo binary which can be used to inspect
+the peers and HELLOs known to the peerinfo service.
+@item core/
+The core service is responsible for establishing encrypted, authenticated
+connections with other peers, encrypting and decrypting messages and
+forwarding messages to higher-level services that are interested in them.
+@item testing/ --- libgnunettesting
+The testing library allows starting (and stopping) peers for writing
+testcases. It also supports automatic generation of configurations for peers
+ensuring that the ports and paths are disjoint. libgnunettesting is also the
+foundation for the testbed service.
+@item testbed/
+The testbed service is used for creating small or large scale deployments of
+GNUnet peers for evaluation of protocols. It facilitates peer deployments on
+multiple hosts (for example, in a cluster) and establishing various network
+topologies (both underlay and overlay).
+@item nse/
+The network size estimation (NSE) service implements a protocol for
+(securely) estimating the current size of the P2P network.
+@item dht/
+The distributed hash table (DHT) service provides a distributed
+implementation of a hash table to store blocks under hash keys in the P2P
+network.
+@item hostlist/
+The hostlist service allows learning about other peers in the network by
+downloading HELLO messages from an HTTP server, can be configured to run such
+an HTTP server and also implements a P2P protocol to advertise and
+automatically learn about other peers that offer a public hostlist server.
+@item topology/
+The topology service is responsible for maintaining the mesh topology. It
+tries to maintain connections to friends (depending on the configuration) and
+also tries to ensure that the peer has a decent number of active connections
+at all times. If necessary, new connections are added. All peers should run
+the topology service, otherwise they may end up not being connected to any
+other peer (unless some other service ensures that core establishes the
+required connections). The topology service also tells the transport service
+which connections are permitted (for friend-to-friend networking).
+@item fs/
+The file-sharing (FS) service implements GNUnet's file-sharing application.
+Both anonymous file-sharing (using gap) and non-anonymous file-sharing (using
+dht) are supported.
+@item cadet/
+The CADET service provides a general-purpose routing abstraction to create
+end-to-end encrypted tunnels in mesh networks. We wrote a paper documenting
+key aspects of the design.
+@item tun/ --- libgnunettun
+Library for building IPv4, IPv6 packets and creating checksums for UDP, TCP
+and ICMP packets. The header defines C structs for common Internet packet
+formats and in particular structs for interacting with TUN (virtual network)
+interfaces.
+@item mysql/ --- libgnunetmysql
+Library for creating and executing prepared MySQL statements and for managing
+the connection to the MySQL database. Essentially a lightweight wrapper for
+the interaction between GNUnet components and libmysqlclient.
+@item dns/
+Service that allows intercepting and modifying DNS requests of the local
+machine. Currently used for IPv4-IPv6 protocol translation (DNS-ALG) as
+implemented by "pt/" and for the GNUnet naming system. The service can also be
+configured to offer an exit service for DNS traffic.
+@item vpn/
+The virtual public network (VPN) service provides a virtual tunnel interface
+(VTUN) for IP routing over GNUnet. Needs some other peers to run an "exit"
+service to work. Can be activated using the "gnunet-vpn" tool or integrated
+with DNS using the "pt" daemon.
+@item exit/
+Daemon to allow traffic from the VPN to exit this peer to the Internet or to
+specific IP-based services of the local peer. Currently, an exit service can
+only be restricted to IPv4 or IPv6, not to specific ports and/or IP address
+ranges. If this is not acceptable, additional firewall rules must be added
+manually. exit currently only works for normal UDP, TCP and ICMP traffic; DNS
+queries need to leave the system via a DNS service.
+@item pt/
+Protocol translation daemon. This daemon enables 4-to-6, 6-to-4, 4-over-6 or
+6-over-4 transitions for the local system. It essentially uses "DNS" to
+intercept DNS replies and then maps results to those offered by the VPN, which
+then sends them using mesh to some daemon offering an appropriate exit
+service.
+@item identity/
+Management of egos (alter egos) of a user; identities are essentially named
+ECC private keys and used for zones in the GNU name system and for namespaces
+in file-sharing, but might find other uses later.
+@item revocation/
+Key revocation service, can be used to revoke the private key of an identity
+if it has been compromised.
+@item namecache/
+Cache for resolution results for the GNU name system; data is encrypted and
+can be shared among users, loss of the data should ideally only result in a
+performance degradation (persistence not required).
+@item namestore/
+Database for the GNU name system with per-user private information,
+persistence required.
+@item gns/
+GNU name system, a GNU approach to DNS and PKI.
+@item dv/
+A plugin for distance-vector (DV)-based routing. DV consists of a service and
+a transport plugin to provide peers with the illusion of a direct P2P
+connection for connections that use multiple (typically up to 3) hops in the
+actual underlay network.
+@item regex/
+Service for the (distributed) evaluation of regular expressions.
+@item scalarproduct/
+The scalar product service offers an API to perform a secure multiparty
+computation which calculates a scalar product between two peers without
+exposing the private input vectors of the peers to each other.
+@item consensus/
+The consensus service will allow a set of peers to agree on a set of values
+via a distributed set union computation.
+@item rest/
+The rest API allows access to GNUnet services using RESTful interaction. The
+services provide plugins that can be exposed by the rest server.
+@item experimentation/
+The experimentation daemon coordinates distributed experimentation to
+evaluate transport and ATS properties.
+@end table
+@c ***************************************************************************
+@node System Architecture
+@section System Architecture
+GNUnet developers like legos. The blocks are indestructible, can be stacked
+together to construct complex buildings and it is generally easy to swap one
+block for a different one that has the same shape. GNUnet's architecture is
+based on legos.
+This chapter documents the GNUnet lego system, also known as GNUnet's system
+architecture.
+The most common GNUnet component is a service. Services offer an API (or
+several, depending on what you count as "an API") which is implemented as a
+library. The library communicates with the main process of the service using a
+service-specific network protocol. The main process of the service typically
+doesn't fully provide everything that is needed --- it has holes to be filled
+by APIs to other services.
+User interfaces and daemons are a special kind of component in GNUnet. Like
+services, they have holes to be filled by APIs of other services. Unlike
+services, daemons do not implement their own network protocol and they have no
+API.
+The GNUnet system provides a range of services, daemons and user interfaces,
+which are then combined into a layered GNUnet instance (also known as a peer).
+Note that while it is generally possible to swap one service for another
+compatible service, there is often only one implementation. However, during
+development we often have a "new" version of a service in parallel with an
+"old" version. While the "new" version is not working, developers working on
+other parts of the service can continue their development by simply using the
+"old" service. Alternative design ideas can also be easily investigated by
+swapping out individual components. This is typically achieved by simply
+changing the name of the "BINARY" in the respective configuration section.
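+As a minimal sketch of such a swap, assume a service whose configuration
+section is named @code{[example]}. The section name and both binary names
+below are hypothetical and serve only as illustration; the old setting is kept
+as a comment so that reverting is a one-line change:
+@example
+[example]
+# BINARY = gnunet-service-example
+BINARY = gnunet-service-example-new
+@end example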
+Key properties of GNUnet services are that they must be separate processes and
+that they must protect themselves by applying tight error checking against the
+network protocol they implement (thereby achieving a certain degree of
+robustness).
+On the other hand, the APIs are implemented to tolerate failures of the
+service, isolating their host process from errors by the service. If the
+service process crashes, other services and daemons around it should not also
+fail, but instead wait for the service process to be restarted by ARM.
+@c ***************************************************************************
+@node Subsystem stability
+@section Subsystem stability
+This section documents the current stability of the various GNUnet subsystems.
+Stability here describes the expected degree of compatibility with future
+versions of GNUnet. For each subsystem we distinguish between compatibility on
+the P2P network level (communication protocol between peers), the IPC level
+(communication between the service and the service library) and the API level
+(stability of the API). P2P compatibility is relevant in terms of which
+applications are likely going to be able to communicate with future versions of
+the network. IPC communication is relevant for the implementation of language
+bindings that re-implement the IPC messages. Finally, API compatibility is
+relevant to developers that hope to be able to avoid changes to applications
+built on top of the APIs of the framework.
+The following table summarizes our current view of the stability of the
+respective protocols or APIs:
+@multitable @columnfractions .20 .20 .20 .20
+@headitem Subsystem @tab P2P @tab IPC @tab C API
+@item util @tab n/a @tab n/a @tab stable
+@item arm @tab n/a @tab stable @tab stable
+@item ats @tab n/a @tab unstable @tab testing
+@item block @tab n/a @tab n/a @tab stable
+@item cadet @tab testing @tab testing @tab testing
+@item consensus @tab experimental @tab experimental @tab experimental
+@item core @tab stable @tab stable @tab stable
+@item datacache @tab n/a @tab n/a @tab stable
+@item datastore @tab n/a @tab stable @tab stable
+@item dht @tab stable @tab stable @tab stable
+@item dns @tab stable @tab stable @tab stable
+@item dv @tab testing @tab testing @tab n/a
+@item exit @tab testing @tab n/a @tab n/a
+@item fragmentation @tab stable @tab n/a @tab stable
+@item fs @tab stable @tab stable @tab stable
+@item gns @tab stable @tab stable @tab stable
+@item hello @tab n/a @tab n/a @tab testing
+@item hostlist @tab stable @tab stable @tab n/a
+@item identity @tab stable @tab stable @tab n/a
+@item multicast @tab experimental @tab experimental @tab experimental
+@item mysql @tab stable @tab n/a @tab stable
+@item namestore @tab n/a @tab stable @tab stable
+@item nat @tab n/a @tab n/a @tab stable
+@item nse @tab stable @tab stable @tab stable
+@item peerinfo @tab n/a @tab stable @tab stable
+@item psyc @tab experimental @tab experimental @tab experimental
+@item pt @tab n/a @tab n/a @tab n/a
+@item regex @tab stable @tab stable @tab stable
+@item revocation @tab stable @tab stable @tab stable
+@item social @tab experimental @tab experimental @tab experimental
+@item statistics @tab n/a @tab stable @tab stable
+@item testbed @tab n/a @tab testing @tab testing
+@item testing @tab n/a @tab n/a @tab testing
+@item topology @tab n/a @tab n/a @tab n/a
+@item transport @tab stable @tab stable @tab stable
+@item tun @tab n/a @tab n/a @tab stable
+@item vpn @tab testing @tab n/a @tab n/a
+@end multitable
+Here is a rough explanation of the values:
+@table @samp
+@item stable
+No incompatible changes are planned at this time; for IPC/APIs, if
+there are incompatible changes, they will be minor and might only require
+minimal changes to existing code; for P2P, changes will be avoided if at all
+possible for the 0.10.x-series
+@item testing
+No incompatible changes are
+planned at this time, but the code is still known to be in flux; so while we
+have no concrete plans, our expectation is that there will still be minor
+modifications; for P2P, changes will likely be extensions that should not break
+existing code
+@item unstable
+Changes are planned and will happen; however, they
+will not be totally radical and the result should still resemble what is there
+now; nevertheless, anticipated changes will break protocol/API compatibility
+@item experimental
+Changes are planned and the result may look nothing like
+what the API/protocol looks like today
+@item unknown
+Someone should think about where this subsystem is headed
+@item n/a
+This subsystem does not have an API/IPC-protocol/P2P-protocol
+@end table
+@c ***************************************************************************
+@node Naming conventions and coding style guide
+@section Naming conventions and coding style guide
+Here you can find some rules to help you write code for GNUnet.
+@c ***************************************************************************
+@menu
+* Naming conventions::
+* Coding style::
+@end menu
+@node Naming conventions
+@subsection Naming conventions
+@c ***************************************************************************
+@menu
+* include files::
+* binaries::
+* logging::
+* configuration::
+* exported symbols::
+* private (library-internal) symbols (including structs and macros)::
+* testcases::
+* performance tests::
+* src/ directories::
+@end menu
+@node include files
+@subsubsection include files
+@itemize @bullet
+@item _lib: library without need for a process
+@item _service: library that needs a service process
+@item _plugin: plugin definition
+@item _protocol: structs used in network protocol
+@item exceptions:
+@itemize @bullet
+@item gnunet_config.h --- generated
+@item platform.h --- first included
+@item plibc.h --- external library
+@item gnunet_common.h --- fundamental routines
+@item gnunet_directories.h --- generated
+@item gettext.h --- external library
+@end itemize
+@end itemize
+@c ***************************************************************************
+@node binaries
+@subsubsection binaries
+@itemize @bullet
+@item gnunet-service-xxx: service process (has listen socket)
+@item gnunet-daemon-xxx: daemon process (no listen socket)
+@item gnunet-helper-xxx[-yyy]: SUID helper for module xxx
+@item gnunet-yyy: command-line tool for end-users
+@item libgnunet_plugin_xxx_yyy.so: plugin for API xxx
+@item libgnunetxxx.so: library for API xxx
+@end itemize
+@c ***************************************************************************
+@node logging
+@subsubsection logging
+@itemize @bullet
+@item services and daemons use their directory name in GNUNET_log_setup (e.g.
+'core') and log using plain 'GNUNET_log'.
+@item command-line tools use their full name in GNUNET_log_setup (e.g.
+'gnunet-publish') and log using plain 'GNUNET_log'.
+@item service access libraries log using 'GNUNET_log_from' and use
+'DIRNAME-api' for the component (e.g. 'core-api')
+@item pure libraries (without associated service) use 'GNUNET_log_from' with
+the component set to their library name (without lib or '.so'), which should
+also be their directory name (e.g. 'nat')
+@item plugins should use 'GNUNET_log_from' with the directory name and the
+plugin name combined to produce the component name (e.g. 'transport-tcp').
+@item logging should be unified per-file by defining a LOG macro with the
+appropriate arguments, along these lines:
+@code{#define LOG(kind,...) GNUNET_log_from (kind, "example-api", __VA_ARGS__)}
+@end itemize
+@c ***************************************************************************
+@node configuration
+@subsubsection configuration
+@itemize @bullet
+@item paths (that are substituted in all filenames) are in PATHS (have as few
+as possible)
+@item all options for a particular module (src/MODULE) are under [MODULE]
+@item options for a plugin of a module are under [MODULE-PLUGINNAME]
+@end itemize
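+A minimal sketch of how these rules translate into a configuration file, using
+the transport module and its tcp plugin as the example. The option names and
+values shown are illustrative assumptions, not a reference for the actual
+options of these modules:
+@example
+[PATHS]
+EXAMPLE_DIR = /tmp/example
+
+[transport]
+PLUGINS = tcp
+
+[transport-tcp]
+PORT = 2086
+@end example
+Options of the module itself live under @code{[transport]}, while options that
+only concern the plugin live under @code{[transport-tcp]}, following the
+[MODULE-PLUGINNAME] pattern.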
+@c ***************************************************************************
+@node exported symbols
+@subsubsection exported symbols
+@itemize @bullet
+@item must start with "GNUNET_modulename_" and be defined in "modulename.c"
+@item exceptions: those defined in gnunet_common.h
+@end itemize
+@c ***************************************************************************
+@node private (library-internal) symbols (including structs and macros)
+@subsubsection private (library-internal) symbols (including structs and macros)
+@itemize @bullet
+@item must NOT start with any prefix
+@item must not be exported in a way that linkers could use them or other
+libraries might see them via headers; they must be either declared/defined in
+C source files or in headers that are in the respective directory under
+src/modulename/ and NEVER be declared in src/include/.
+@end itemize
+@node testcases
+@subsubsection testcases
+@itemize @bullet
+@item must be called "test_module-under-test_case-description.c"
+@item "case-description" may be omitted if there is only one test
+@end itemize
+@c ***************************************************************************
+@node performance tests
+@subsubsection performance tests
+@itemize @bullet
+@item must be called "perf_module-under-test_case-description.c"
+@item "case-description" may be omitted if there is only one performance test
+@item Must only be run if HAVE_BENCHMARKS is satisfied
+@end itemize
+@c ***************************************************************************
+@node src/ directories
+@subsubsection src/ directories
+@itemize @bullet
+@item gnunet-NAME: end-user applications (e.g., gnunet-search, gnunet-arm)
+@item gnunet-service-NAME: service processes with accessor library (e.g.,
+gnunet-service-arm)
+@item libgnunetNAME: accessor library (_service.h-header) or standalone library
+(_lib.h-header)
+@item gnunet-daemon-NAME: daemon process without accessor library (e.g.,
+gnunet-daemon-hostlist) and no GNUnet management port
+@item libgnunet_plugin_DIR_NAME: loadable plugins (e.g.,
+libgnunet_plugin_transport_tcp)
+@end itemize
+@c ***************************************************************************
+@node Coding style
+@subsection Coding style
+@itemize @bullet
+@item GNU guidelines generally apply
+@item Indentation is done with spaces, two per level, no tabs
+@item C99 struct initialization is fine
+@item declare only one variable per line, so@
+@example
+int i;
+int j;
+@end example
+instead of
+@example
+int i,j;
+@end example
+This helps keep diffs small and forces developers to think precisely about the
+type of every variable. Note that @code{char *} is different from @code{const
+char*} and @code{int} is different from @code{unsigned int} or @code{uint32_t}.
+Each variable type should be chosen with care.
+@item While @code{goto} should generally be avoided, having a @code{goto} to
+the end of a function to a block of clean up statements (free, close, etc.) can
+be acceptable.
+@item Conditions should be written with constants on the left (to avoid
+accidental assignment) and with the 'true' target being either the 'error' case
+or the significantly simpler continuation. For example:
+@example
+if (0 != stat ("filename", &sbuf)) @{
+  error ();
+@} else @{
+  /* handle normal case here */
+@}
+@end example
+instead of
+@example
+if (stat ("filename", &sbuf) == 0) @{
+  /* handle normal case here */
+@} else @{
+  error ();
+@}
+@end example
+If possible, the error clause should be terminated with a 'return' (or 'goto'
+to some cleanup routine) and in this case, the 'else' clause should be omitted:
+@example
+if (0 != stat ("filename", &sbuf)) @{
+  error ();
+  return;
+@}
+/* handle normal case here */
+@end example
+This serves to avoid deep nesting. The 'constants on the left' rule applies to
+all constants (including @code{GNUNET_SCHEDULER_NO_TASK}, NULL, and enums).
+With the two above rules (constants on left, errors in 'true' branch), there is
+only one way to write most branches correctly.
+@item Combined assignments and tests are allowed if they do not hinder code
+clarity. For example, one can write:@
+@example
+if (NULL == (value = lookup_function())) @{ error(); return; @}
+@end example
+@item Use @code{break} and @code{continue} wherever possible to avoid deep(er)
+nesting. Thus, we would write:
+@example
+next = head;
+while (NULL != (pos = next))
+@{
+  next = pos->next;
+  if (! should_free (pos))
+    continue;
+  GNUNET_CONTAINER_DLL_remove (head, tail, pos);
+  GNUNET_free (pos);
+@}
+@end example
+instead of
+@example
+next = head;
+while (NULL != (pos = next))
+@{
+  next = pos->next;
+  if (should_free (pos))
+  @{
+    /* unnecessary nesting! */
+    GNUNET_CONTAINER_DLL_remove (head, tail, pos);
+    GNUNET_free (pos);
+  @}
+@}
+@end example
+@item We primarily use @code{for} and @code{while} loops. A @code{while} loop
+is used if the method for advancing in the loop is not a straightforward
+increment operation. In particular, we use:@
+@example
+next = head;
+while (NULL != (pos = next))
+@{
+  next = pos->next;
+  if (! should_free (pos))
+    continue;
+  GNUNET_CONTAINER_DLL_remove (head, tail, pos);
+  GNUNET_free (pos);
+@}
+@end example
+to free entries in a list (as the iteration changes the structure of the list
+due to the free; the equivalent @code{for} loop no longer follows the
+simple @code{for} paradigm of @code{for(INIT;TEST;INC)}). However, for loops
+that do follow the simple @code{for} paradigm we do use @code{for}, even if it
+involves linked lists:
+@example
+/* simple iteration over a linked list */
+for (pos = head; NULL != pos; pos = pos->next)
+@{
+   use (pos);
+@}
+@end example
+@item The first argument to all higher-order functions in GNUnet must be
+declared to be of type @code{void *} and is reserved for a closure. We do not
+use inner functions, as trampolines would conflict with setups that use
+non-executable stacks. The first statement in a higher-order function, which
+unusually should be part of the variable declarations, should assign the
+@code{cls} argument to the precise expected type. For example:
+@example
+int
+callback (void *cls, char *args)
+@{
+  struct Foo *foo = cls;
+  int other_variables;
+  /* rest of function */
+@}
+@end example
+@item It is good practice to write complex @code{if} expressions instead of
+using deeply nested @code{if} statements. However, except for addition and
+multiplication, all operators should use parens. This is fine:@
+@example
+if ( (1 == foo) || ((0 == bar) && (x != y)) )
+  return x;
+@end example
+However, this is not:
+@example
+if (1 == foo)
+  return x;
+if (0 == bar && x != y)
+  return x;
+@end example
+Note that splitting the @code{if} statement above is debatable as the
+@code{return x} is a very trivial statement. However, once the logic after the
+branch becomes more complicated (and is still identical), the "or" formulation
+should be used for sure.
+@item There should be two empty lines between the end of the function and the
+comments describing the following function. There should be a single empty line
+after the initial variable declarations of a function. If a function has no
+local variables, there should be no initial empty line. If a long function
+consists of several complex steps, those steps might be separated by an empty
+line (possibly followed by a comment describing the following step). The code
+should not contain empty lines in arbitrary places; if in doubt, it is likely
+better to NOT have an empty line (this way, more code will fit on the screen).
+@end itemize
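+Several of the rules above (one variable per line, constants on the left,
+@code{continue} to avoid nesting, and a @code{goto} to a common clean-up block)
+can be seen together in the following minimal sketch. The function
+@code{count_lines} is a hypothetical helper written only for illustration; it
+is not part of GNUnet and uses only the C standard library.
+@example
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+/* Hypothetical helper: count newline-terminated lines of at most 1023
+   characters in a file; returns -1 on error. */
+long
+count_lines (const char *filename)
+@{
+  FILE *f;
+  char *buf;
+  long lines;
+  f = fopen (filename, "r");
+  if (NULL == f)
+    return -1;
+  lines = -1;
+  buf = malloc (1024);
+  if (NULL == buf)
+    goto cleanup;
+  lines = 0;
+  while (NULL != fgets (buf, 1024, f))
+  @{
+    if (NULL == strchr (buf, '\n'))
+      continue; /* partial line, do not count it */
+    lines++;
+  @}
+cleanup:
+  free (buf); /* free (NULL) is a no-op */
+  fclose (f);
+  return lines;
+@}
+@end example
+Both the allocation failure and the normal path leave through the same
+clean-up block, so the file is closed exactly once in either case.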
+@c ***************************************************************************
+@node Build-system
+@section Build-system
+If you have code that is likely not to compile, or build rules that you do not
+want to trigger for most developers, use "if HAVE_EXPERIMENTAL" in your
+Makefile.am. Then it is OK to (temporarily) add non-compiling (or
+known-to-not-port) code.
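+As an illustration, a @code{Makefile.am} might guard an experimental
+subdirectory as sketched below. The directory name @code{foo} is a
+placeholder, and the sketch assumes that @code{HAVE_EXPERIMENTAL} is defined as
+an Automake conditional by the configure script.
+@example
+# "foo" stands for a directory with code that may not build everywhere
+if HAVE_EXPERIMENTAL
+ EXP_DIR = foo
+endif
+SUBDIRS = util $(EXP_DIR)
+@end example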
+If you want to compile all testcases but NOT run them, run configure with the
+@code{--enable-test-suppression} option.
+If you want to run all testcases, including those that take a while, run
+configure with the @code{--enable-expensive-testcases} option.
+If you want to compile and run benchmarks, run configure with the
+@code{--enable-benchmarks} option.
+If you want to obtain code coverage results, run configure with the
+@code{--enable-coverage} option and run the coverage.sh script in contrib/.
+@c ***************************************************************************
+@node Developing extensions for GNUnet using the gnunet-ext template
+@section Developing extensions for GNUnet using the gnunet-ext template
+For developers who want to write extensions for GNUnet we provide the
+gnunet-ext template to provide an easy to use skeleton.
+gnunet-ext contains the build environment and template files for the
+development of GNUnet services, command line tools, APIs and tests.
+First of all you have to obtain gnunet-ext from SVN:
+@code{svn co https://gnunet.org/svn/gnunet-ext}
+The next step is to bootstrap and configure it. For configure you have to
+provide the path containing GNUnet with @code{--with-gnunet=/path/to/gnunet}
+and the prefix where you want to install the extension using
+@code{--prefix=/path/to/install}:
+@example
+./bootstrap
+./configure --prefix=/path/to/install --with-gnunet=/path/to/gnunet
+@end example
+When your GNUnet installation is not included in the default linker search
+path, you have to add @code{/path/to/gnunet/lib} to the file
+@code{/etc/ld.so.conf} and run @code{ldconfig}, or add it to the environment
+variable @code{LD_LIBRARY_PATH} by using
+@code{export LD_LIBRARY_PATH=/path/to/gnunet/lib}
+@c ***************************************************************************
+@node Writing testcases
+@section Writing testcases
+Ideally, any non-trivial GNUnet code should be covered by automated testcases.
+Testcases should reside in the same place as the code that is being tested. The
+name of source files implementing tests should begin with "test_" followed by
+the name of the file that contains the code that is being tested.
+Testcases in GNUnet should be integrated with the autotools build system. This
+way, developers and anyone building binary packages will be able to run all
+testcases simply by running @code{make check}. The final testcases shipped with
+the distribution should output at most some brief progress information and not
+display debug messages by default. The success or failure of a testcase must be
+indicated by returning zero (success) or non-zero (failure) from the main
+method of the testcase. The integration with the autotools is relatively
+straightforward and only requires modifications to the @code{Makefile.am} in
+the directory containing the testcase. For a testcase testing the code in
+@code{foo.c} the @code{Makefile.am} would contain the following lines:
+@example
+check_PROGRAMS = test_foo
+TESTS = $(check_PROGRAMS)
+test_foo_SOURCES = test_foo.c
+test_foo_LDADD = $(top_builddir)/src/util/libgnunetutil.la
+@end example
+Naturally, other libraries used by the testcase may be specified in the
+@code{LDADD} directive as necessary.
+Often testcases depend on additional input files, such as a configuration file.
+These support files have to be listed using the EXTRA_DIST directive in order
+to ensure that they are included in the distribution. Example:
+@example
+EXTRA_DIST = test_foo_data.conf
+@end example
+Executing @code{make check} will run all testcases in the current directory and
+all subdirectories. Testcases can be compiled individually by running
+@code{make test_foo} and then invoked directly using @code{./test_foo}. Note
+that due to the use of plugins in GNUnet, it is typically necessary to run
+@code{make install} before running any testcases. Thus the canonical command
+@code{make check install} has to be changed to @code{make install check} for
+GNUnet.
+@c ***************************************************************************
+@node GNUnet's TESTING library
+@section GNUnet's TESTING library
+The TESTING library is used for writing testcases which involve starting a
+single or multiple peers. While peers can also be started by testcases using
+the ARM subsystem, using the TESTING library provides an elegant way to do this.
+The configurations of the peers are auto-generated from a given template to
+have non-conflicting port numbers ensuring that peers' services do not run into
+bind errors. This is achieved by testing ports' availability by binding a
+listening socket to them before allocating them to services in the generated
+configurations.
+Another advantage of using TESTING is that it shortens the testcase
+startup time, as the hostkeys for peers are copied from a pre-computed set of
+hostkeys instead of being generated at peer startup, which may take a
+considerable amount of time when starting multiple peers or on an embedded
+processor.
+TESTING also allows for certain services to be shared among peers. This feature
+is invaluable when testing with multiple peers as it helps to reduce the number
+of services run per peer and hence the total number of processes run per
+testcase.
+The TESTING library only handles creating, starting and stopping peers.
+Features useful for testcases such as connecting peers in a topology are not
+available in TESTING but are available in the TESTBED subsystem. Furthermore,
+TESTING only creates peers on the localhost; by using TESTBED, testcases can
+benefit from creating peers across multiple hosts.
+@menu
+* API::
+* Finer control over peer stop::
+* Helper functions::
+* Testing with multiple processes::
+@end menu
+@c ***************************************************************************
+@node API
+@subsection API
+TESTING abstracts a group of peers as a TESTING system. All peers in a system
+have a common hostname and no two services of these peers share a port or a
+UNIX domain socket path.
+A TESTING system can be created with the function
+@code{GNUNET_TESTING_system_create()} which returns a handle to the system.
+This function takes a directory path which is used for generating the
+configurations of peers, an IP address from which connections to the peers'
+services should be allowed, the hostname to be used in peers' configuration,
+and an array of shared service specifications of type @code{struct
+GNUNET_TESTING_SharedService}.
+The shared service specification must specify the name of the service to share,
+the configuration pertaining to that shared service and the maximum number of
+peers that are allowed to share a single instance of the shared service.
+A TESTING system created with @code{GNUNET_TESTING_system_create()} chooses
+ports from the default range 12000 - 56000 while auto-generating configurations
+for peers. This range can be customised with the function
+@code{GNUNET_TESTING_system_create_with_portrange()}. This function is similar
+to @code{GNUNET_TESTING_system_create()} except that it takes 2 additional
+parameters --- the start and end of the port range to use.
+A TESTING system is destroyed with the function
+@code{GNUNET_TESTING_system_destroy()}. This function takes the handle of the
+system and a flag to remove the files created in the directory used to generate
+configurations.
+A peer is created with the function @code{GNUNET_TESTING_peer_configure()}.
+This function takes the system handle, a configuration template from which the
+configuration for the peer is auto-generated and the index from where the
+hostkey for the peer has to be copied. When successful, this function
+returns a handle to the peer which can be used to start and stop it and to
+obtain the identity of the peer. If unsuccessful, a NULL pointer is returned
+with an error message. This function adjusts the generated configuration to
+have non-conflicting ports and paths.
+Peers can be started and stopped by calling the functions
+@code{GNUNET_TESTING_peer_start()} and @code{GNUNET_TESTING_peer_stop()}
+respectively. A peer can be destroyed by calling the function
+@code{GNUNET_TESTING_peer_destroy()}. When a peer is destroyed, the ports and
+paths allocated in its configuration are reclaimed for usage in new
+peers.
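+The life cycle described above could be sketched as follows. This is an
+illustration under assumptions, not a reference: the argument lists are given
+as we recall them from @code{gnunet_testing_lib.h} and should be checked
+against that header, and the directory name, trusted IP, hostname, template
+file name and hostkey index 0 are placeholders.
+@example
+#include "platform.h"
+#include "gnunet_util_lib.h"
+#include "gnunet_testing_lib.h"
+/* Returns 0 on success, 1 on failure; all names are placeholders. */
+static int
+run_one_peer (void)
+@{
+  struct GNUNET_TESTING_System *system;
+  struct GNUNET_TESTING_Peer *peer;
+  struct GNUNET_CONFIGURATION_Handle *cfg;
+  struct GNUNET_PeerIdentity id;
+  char *emsg;
+  int ret;
+  ret = 1;
+  peer = NULL;
+  emsg = NULL;
+  system = GNUNET_TESTING_system_create ("test-example", "127.0.0.1",
+                                         "localhost", NULL);
+  if (NULL == system)
+    return 1;
+  cfg = GNUNET_CONFIGURATION_create ();
+  if (GNUNET_OK != GNUNET_CONFIGURATION_load (cfg, "template.conf"))
+    goto cleanup;
+  peer = GNUNET_TESTING_peer_configure (system, cfg, 0, &id, &emsg);
+  if (NULL == peer)
+  @{
+    GNUNET_log (GNUNET_ERROR_TYPE_ERROR, "configure failed: %s\n",
+                (NULL == emsg) ? "unknown" : emsg);
+    goto cleanup;
+  @}
+  if (GNUNET_OK != GNUNET_TESTING_peer_start (peer))
+    goto cleanup;
+  /* exercise the peer's services here */
+  if (GNUNET_OK == GNUNET_TESTING_peer_stop (peer))
+    ret = 0;
+cleanup:
+  if (NULL != emsg)
+    GNUNET_free (emsg);
+  if (NULL != peer)
+    GNUNET_TESTING_peer_destroy (peer);
+  GNUNET_CONFIGURATION_destroy (cfg);
+  GNUNET_TESTING_system_destroy (system, GNUNET_YES);
+  return ret;
+@}
+@end example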
+@c ***************************************************************************
+@node Finer control over peer stop
+@subsection Finer control over peer stop
+Using @code{GNUNET_TESTING_peer_stop()} is normally fine for testcases.
+However, calling this function for each peer is inefficient when trying to
+shut down multiple peers, as this function sends the termination signal to the
+given peer process and waits for it to terminate. It would be faster in this
+case to send the termination signals to the peers first and then wait on them.
+This is accomplished by the function @code{GNUNET_TESTING_peer_kill()}, which
+sends a termination signal to the peer, and the function
+@code{GNUNET_TESTING_peer_wait()}, which waits on the peer.
+Further finer control can be achieved by choosing to stop a peer asynchronously
+with the function @code{GNUNET_TESTING_peer_stop_async()}. This function takes
+a callback parameter and a closure for it in addition to the handle to the peer
+to stop. The callback function is called with the given closure when the peer
+is stopped. Using this function eliminates blocking while waiting for the peer
+to terminate.
+An asynchronous peer stop can be cancelled by calling the function
+@code{GNUNET_TESTING_peer_stop_async_cancel()}. Note that calling this function
+does not prevent the peer from terminating if the termination signal has
+already been sent to it. It does, however, cancel the callback to be called
+when the peer is stopped.
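+The "signal first, then wait" approach with @code{GNUNET_TESTING_peer_kill()}
+and @code{GNUNET_TESTING_peer_wait()} might look like the sketch below. The
+helper @code{stop_all_peers} is hypothetical, and we assume that both library
+functions take only the peer handle and return @code{GNUNET_OK} on success;
+please verify this against @code{gnunet_testing_lib.h}.
+@example
+#include "platform.h"
+#include "gnunet_util_lib.h"
+#include "gnunet_testing_lib.h"
+/* Hypothetical helper: stop n running peers without waiting serially. */
+static void
+stop_all_peers (struct GNUNET_TESTING_Peer **peers, unsigned int n)
+@{
+  unsigned int i;
+  /* first pass: only send the termination signals */
+  for (i = 0; i < n; i++)
+    if (GNUNET_OK != GNUNET_TESTING_peer_kill (peers[i]))
+      GNUNET_log (GNUNET_ERROR_TYPE_WARNING,
+                  "failed to signal peer %u\n", i);
+  /* second pass: the peers shut down in parallel while we wait */
+  for (i = 0; i < n; i++)
+    (void) GNUNET_TESTING_peer_wait (peers[i]);
+@}
+@end example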
+@c ***************************************************************************
+@node Helper functions
+@subsection Helper functions
+Most of the testcases can benefit from an abstraction which configures a peer
+and starts it. This is provided by the function
+@code{GNUNET_TESTING_peer_run()}. This function takes the testing directory
+pathname, a configuration template, a callback and its closure. This function
+creates a peer in the given testing directory by using the configuration
+template, starts the peer and calls the given callback with the given closure.
+The function @code{GNUNET_TESTING_peer_run()} starts the ARM service of the
+peer which starts the rest of the configured services. A similar function
+@code{GNUNET_TESTING_service_run} can be used to just start a single service of
+a peer. In this case, the peer's ARM service is not started; instead, only the
+given service is run.
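+A testcase built around @code{GNUNET_TESTING_peer_run()} could be as small as
+the following sketch. The callback signature (closure, configuration, peer
+handle) and the return convention (0 on success) are stated from memory of
+@code{gnunet_testing_lib.h} and should be double-checked; the directory and
+configuration file names are placeholders.
+@example
+#include "platform.h"
+#include "gnunet_util_lib.h"
+#include "gnunet_testing_lib.h"
+/* Called once the peer is up; cls points to the result variable. */
+static void
+test_main (void *cls, const struct GNUNET_CONFIGURATION_Handle *cfg,
+           struct GNUNET_TESTING_Peer *peer)
+@{
+  int *ok = cls;
+  /* connect to the peer's services using cfg and run the checks */
+  *ok = 0;
+@}
+int
+main (int argc, char *argv[])
+@{
+  int ok;
+  ok = 1;
+  if (0 != GNUNET_TESTING_peer_run ("test-example", "test_example.conf",
+                                    &test_main, &ok))
+    return 1;
+  return ok;
+@}
+@end example
+Passing the result variable through the closure avoids a global and follows
+the closure convention described in the coding style section.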
+@c ***************************************************************************
+@node Testing with multiple processes
+@subsection Testing with multiple processes
+When testing GNUnet, the splitting of the code into services and clients
+often complicates testing. The solution to this is to have the testcase fork
+@code{gnunet-service-arm}, ask it to start the required server and daemon
+processes and then execute appropriate client actions (to test the client APIs
+or the core module or both). If necessary, multiple ARM services can be forked
+using different ports (!) to simulate a network. However, most of the time only
+one ARM process is needed. Note that on exit, the testcase should shut down ARM
+with a @code{TERM} signal (to give it the chance to cleanly stop its child
+processes).
+The following code illustrates spawning and killing an ARM process from a
+testcase:
+@example
+static void
+run (void *cls, char *const *args, const char *cfgfile,
+     const struct GNUNET_CONFIGURATION_Handle *cfg)
+@{
+  struct GNUNET_OS_Process *arm_pid;
+  arm_pid = GNUNET_OS_start_process (NULL, NULL, "gnunet-service-arm",
+                                     "gnunet-service-arm", "-c", cfgfile,
+                                     NULL);
+  /* do real test work here */
+  if (0 != GNUNET_OS_process_kill (arm_pid, SIGTERM))
+    GNUNET_log_strerror (GNUNET_ERROR_TYPE_WARNING, "kill");
+  GNUNET_assert (GNUNET_OK == GNUNET_OS_process_wait (arm_pid));
+  GNUNET_OS_process_close (arm_pid);
+@}
+/* in main(): */
+GNUNET_PROGRAM_run (argc, argv, "NAME-OF-TEST", "nohelp", options, &run, cls);
+@end example
+An alternative way that works well to test plugins is to implement a
+mock-version of the environment that the plugin expects and then to simply load
+the plugin directly.
+@c ***************************************************************************
+@node Performance regression analysis with Gauger
+@section Performance regression analysis with Gauger
+To help avoid performance regressions, GNUnet uses Gauger. Gauger is a simple
+logging tool that allows remote hosts to send performance data to a central
+server, where this data can be analyzed and visualized. Gauger shows graphs of
+the repository revisions and the performance data recorded for each revision,
+so sudden performance peaks or drops can be identified and linked to a specific
+revision number.
+In the case of GNUnet, the buildbots log the performance data obtained during
+the tests after each build. The data can be accessed on GNUnet's Gauger page.
+The menu on the left lets you select either the results of just one build bot
+(under "Hosts") or review the data from all hosts for a given test result
+(under "Metrics"). If the absolute values of the results differ widely, for
+instance arm vs. amd64 machines, the option "Normalize" on a metric view can
+help to get an idea about the performance evolution across all hosts.
+Using Gauger in GNUnet and having the performance of a module tracked over time
+is very easy. First of course, the testcase must generate some consistent
+metric, which makes sense to have logged. Highly volatile or randomness-dependent
+metrics probably are not ideal candidates for meaningful regression detection.
+To start logging any value, just include @code{gauger.h} in your testcase code.
+Then, use the macro @code{GAUGER()} to make the buildbots log whatever value is
+of interest for you to @code{gnunet.org}'s Gauger server. No setup is necessary
+as most buildbots already have everything in place and new metrics are created
+on demand. To delete a metric, you need to contact a member of the GNUnet
+development team (a file will need to be removed manually from the respective
+directory).
+The code in the test should look like this:
+@example
+[other includes]
+#include <gauger.h>
+int main (int argc, char *argv[]) @{
+  [run test, generate data]
+  GAUGER ("YOUR_MODULE", "METRIC_NAME", (float) value, "UNIT");
+@}
+@end example
+Where:
+@table @asis
+@item @strong{YOUR_MODULE}
+is a category in the Gauger page and should be the name of the module or
+subsystem like "Core" or "DHT"
+@item @strong{METRIC_NAME}
+is the name of the metric being collected and should be concise and
+descriptive, like "PUT operations in sqlite-datastore".
+@item @strong{value}
+is the value of the metric that is logged for this run.
+@item @strong{UNIT}
+is the unit in which the value is measured, for instance "kb/s" or "kb of
+RAM/node".
+@end table
+If you wish to use Gauger for your own project, you can grab a copy of the
+latest stable release or check out Gauger's Subversion repository.
+@c ***************************************************************************
+@node GNUnet's TESTBED Subsystem
+@section GNUnet's TESTBED Subsystem
+The TESTBED subsystem facilitates testing and measuring of multi-peer
+deployments on a single host or over multiple hosts.
+The architecture of the testbed module is divided into the following:
+@itemize @bullet
+@item Testbed API: An API which is used by the testing driver programs. It
+provides functions for creating, destroying, starting and stopping peers,
+etc.
+@item Testbed service (controller): A service which is started through the
+Testbed API. This service handles operations to create, destroy, start and stop
+peers, connect them, and modify their configurations.
+@item Testbed helper: When a controller has to be started on a host, the
+testbed API starts the testbed helper on that host which in turn starts the
+controller. The testbed helper receives a configuration for the controller
+through its stdin and changes it to ensure the controller doesn't run into any
+port conflict on that host.
+@end itemize
+The testbed service (controller) is different from the other GNUnet services in
+that it is not started by ARM and is not supposed to be run as a daemon. It is
+started by the testbed API through a testbed helper. In a typical scenario
+involving multiple hosts, a controller is started on each host. Controllers
+take up the actual task of creating peers, starting and stopping them on the
+hosts they run on.
+While running deployments on a single localhost the testbed API starts the
+testbed helper directly as a child process. When running deployments on remote
+hosts the testbed API starts testbed helpers on each remote host through a
+remote shell. By default the testbed API uses SSH as the remote shell. This can
+be changed by setting the environment variable GNUNET_TESTBED_RSH_CMD to the
+required remote shell program. This variable can also contain parameters which
+are to be passed to the remote shell program. For example:
+@example
+export GNUNET_TESTBED_RSH_CMD="ssh -o BatchMode=yes -o NoHostAuthenticationForLocalhost=yes %h"
+@end example
+The above command string also allows for substitutions through placemarks
+which begin with a `%'. At present the following substitutions are supported:
+@itemize @bullet
+@item
+%h: hostname
+@item
+%u: username
+@item
+%p: port
+@end itemize
+Note that the substitution placemark is replaced only when the corresponding
+field is available and only once. Note also that specifying @code{%u@@%h} does
+not work. If you want to use username substitutions for SSH, use the argument
+@code{-l} before the username substitution. For example:
+@code{ssh -l %u -p %p %h}
+The testbed API and the helper communicate through the helper's stdin and
+stdout. As the helper is started through a remote shell on remote hosts, any
+output messages from the remote shell interfere with the communication and
+result in a failure while starting the helper. For this reason, it is
+suggested to use flags to make the remote shells produce no output messages and
+to have password-less logins. For the default remote shell, SSH, the default
+options are "-o BatchMode=yes -o NoHostAuthenticationForLocalhost=yes".
+Password-less logins should be ensured by using SSH keys.
+Since the testbed API executes the remote shell as a non-interactive shell,
+certain scripts like .bashrc, .profile may not be executed. If this is the
+case, the testbed API can be forced to execute an interactive shell by setting
+the environment variable `GNUNET_TESTBED_RSH_CMD_SUFFIX' to a shell program.
+An example could be:
+@example
+export GNUNET_TESTBED_RSH_CMD_SUFFIX="sh -lc"
+@end example
+The testbed API will then execute the remote shell program as:
+@example
+$GNUNET_TESTBED_RSH_CMD -p $port $dest $GNUNET_TESTBED_RSH_CMD_SUFFIX gnunet-helper-testbed
+@end example
+On some systems, problems may arise while starting testbed helpers if GNUnet is
+installed into a custom location since the helper may not be found in the
+standard path. This can be addressed by setting the variable
+`HELPER_BINARY_PATH' to the path of the testbed helper. The testbed API will
+then use this path to start helper binaries both locally and remotely.
+The testbed API can be accessed by including the "gnunet_testbed_service.h"
+file and linking with -lgnunettestbed.
+@c ***************************************************************************
+@menu
+* Supported Topologies::
+* Hosts file format::
+* Topology file format::
+* Testbed Barriers::
+* Automatic large-scale deployment of GNUnet in the PlanetLab testbed::
+* TESTBED Caveats::
+@end menu
+@node Supported Topologies
+@subsection Supported Topologies
+While testing multi-peer deployments, it is often needed that the peers are
+connected in some topology. This requirement is addressed by the function
+@code{GNUNET_TESTBED_overlay_connect()} which connects any given two peers in
+the testbed.
+The API also provides a helper function
+@code{GNUNET_TESTBED_overlay_configure_topology()} to connect a given set of
+peers in any of the following supported topologies:
+@itemize @bullet
+@item @code{GNUNET_TESTBED_TOPOLOGY_CLIQUE}: All peers are connected with each
+other
+@item @code{GNUNET_TESTBED_TOPOLOGY_LINE}: Peers are connected to form a line
+@item @code{GNUNET_TESTBED_TOPOLOGY_RING}: Peers are connected to form a ring
+topology
+@item @code{GNUNET_TESTBED_TOPOLOGY_2D_TORUS}: Peers are connected to form a
+2-dimensional torus topology. If the number of peers is not a perfect square,
+the resulting torus may not have uniform poloidal and toroidal lengths
+@item @code{GNUNET_TESTBED_TOPOLOGY_ERDOS_RENYI}: Topology is generated to form
+a random graph. The number of links to be present should be given
+@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD}: Peers are connected to form a
+2D Torus with some random links among them. The number of random links is to
+be given
+@item @code{GNUNET_TESTBED_TOPOLOGY_SMALL_WORLD_RING}: Peers are connected to
+form a ring with some random links among them. The number of random links is
+to be given
+@item @code{GNUNET_TESTBED_TOPOLOGY_SCALE_FREE}: Connects peers in a topology
+where peer connectivity follows a power law: new peers are connected with high
+probability to well connected peers. See Emergence of Scaling in Random
+Networks. Science 286, 509-512, 1999.
+@item @code{GNUNET_TESTBED_TOPOLOGY_FROM_FILE}: The topology information is
+loaded from a file. The path to the file has to be given. See Topology file
+format for the format of this file.
+@item @code{GNUNET_TESTBED_TOPOLOGY_NONE}: No topology
+@end itemize
+The above supported topologies can be specified respectively by setting the
+variable @code{OVERLAY_TOPOLOGY} to the following values in the configuration
+passed to Testbed API functions @code{GNUNET_TESTBED_test_run()} and
+@code{GNUNET_TESTBED_run()}:
+@itemize @bullet
+@item @code{CLIQUE}
+@item @code{LINE}
+@item @code{RING}
+@item @code{2D_TORUS}
+@item @code{RANDOM}
+@item @code{SMALL_WORLD}
+@item @code{SMALL_WORLD_RING}
+@item @code{SCALE_FREE}
+@item @code{FROM_FILE}
+@item @code{NONE}
+@end itemize
+Topologies @code{RANDOM}, @code{SMALL_WORLD} and @code{SMALL_WORLD_RING}
+require the option @code{OVERLAY_RANDOM_LINKS} to be set to the number of
+random links to be generated in the configuration. The option will be ignored
+for the rest of the topologies.
+Topology @code{SCALE_FREE} requires the options @code{SCALE_FREE_TOPOLOGY_CAP}
+to be set to the maximum number of peers which can connect to a peer and
+@code{SCALE_FREE_TOPOLOGY_M} to be set to how many peers a peer should be
+at least connected to.
+Similarly, the topology @code{FROM_FILE} requires the option
+@code{OVERLAY_TOPOLOGY_FILE} to contain the path of the file containing the
+topology information. This option is ignored for the rest of the topologies.
+See Topology file format for the format of this file.
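+For instance, a configuration requesting a ring with ten additional random
+links might contain the fragment below. The option names are the ones given
+above; the section name @code{[testbed]} and the value 10 are assumptions made
+for this sketch and should be checked against the testbed's default
+configuration.
+@example
+[testbed]
+OVERLAY_TOPOLOGY = SMALL_WORLD_RING
+OVERLAY_RANDOM_LINKS = 10
+@end example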
+@c ***************************************************************************
+@node Hosts file format
+@subsection Hosts file format
+The testbed API offers the function
+@code{GNUNET_TESTBED_hosts_load_from_file()} to load from a given file details
+about the hosts which testbed can use for deploying peers. This function is
+useful to keep the data about hosts separate instead of hard coding them in
+code.
+Another helper function from the testbed API, @code{GNUNET_TESTBED_run()}, also
+takes a hosts file name as its parameter. It uses the above function to
+populate the hosts data structures and start controllers to deploy peers.
+These functions require the hosts file to be of the following format:
+@itemize @bullet
+@item Each line is interpreted to have details about a host
+@item Host details should include the username to use for logging into the
+host, the hostname of the host and the port number to use for the remote shell
+program. All three values should be given.
+@item These details should be given in the following format:
+@code{<username>@@<hostname>:<port>}
+@end itemize
+Note that having canonical hostnames may cause problems while resolving the IP
+addresses (See this bug). Hence it is advised to provide the hosts' numerical
+IP addresses as hostnames whenever possible.
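+To make the line format concrete, the following stand-alone sketch splits one
+@code{<username>@@<hostname>:<port>} line into its three fields. The function
+@code{parse_host_line} is a hypothetical illustration using only the C
+standard library; it is not the parser used by the testbed API and it does not
+reject trailing characters after the port.
+@example
+#include <stdio.h>
+/* Hypothetical helper: returns 0 on success, -1 if the line is malformed.
+   user must hold 64 bytes and host 256 bytes. */
+int
+parse_host_line (const char *line, char *user, char *host,
+                 unsigned int *port)
+@{
+  if (3 != sscanf (line, "%63[^@@]@@%255[^:]:%u", user, host, port))
+    return -1;
+  if (65535 < *port)
+    return -1;
+  return 0;
+@}
+@end example
+A line such as @code{alice@@192.0.2.10:22} would yield the username
+@code{alice}, the hostname @code{192.0.2.10} and the port 22.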
+@c ***************************************************************************
+@node Topology file format
+@subsection Topology file format
+A topology file describes how peers are to be connected. It should adhere to
+the following format for testbed to parse it correctly.
+Each line should begin with the target peer id. This should be followed by a
+colon (`:') and origin peer ids separated by `|'. All spaces except for newline
+characters are ignored. The API will then try to connect each origin peer to
+the target peer.
+For example, the following file will result in 5 overlay connections: [2->1],
+[3->1], [4->3], [0->3], [2->0]
+@example
+1:2|3
+3:4| 0
+0: 2
+@end example
+@node Testbed Barriers
+@subsection Testbed Barriers
+The testbed subsystem's barriers API facilitates coordination among the peers
+run by the testbed and the experiment driver. The concept is similar to the
+barrier synchronisation mechanism found in parallel programming or
+multi-threading paradigms - a peer waits at a barrier upon reaching it until
+the barrier is reached by a predefined number of peers. This predefined number
+of peers required to cross a barrier is also called quorum. We say a peer has
+reached a barrier if the peer is waiting for the barrier to be crossed.
+Similarly a barrier is said to be reached if the required quorum of peers reach
+the barrier. A barrier which is reached is deemed as crossed after all the
+peers waiting on it are notified.
+The barriers API provides the following functions:
+@itemize @bullet
+@item @strong{@code{GNUNET_TESTBED_barrier_init()}:} function to initialise a
+barrier in the experiment
+@item @strong{@code{GNUNET_TESTBED_barrier_cancel()}:} function to cancel a
+barrier which has been initialised before
+@item @strong{@code{GNUNET_TESTBED_barrier_wait()}:} function to signal the
+barrier service that the caller has reached a barrier and is waiting for it to
+be crossed
+@item @strong{@code{GNUNET_TESTBED_barrier_wait_cancel()}:} function to stop
+waiting for a barrier to be crossed
+@end itemize
+Among the above functions, the first two, namely
+@code{GNUNET_TESTBED_barrier_init()} and @code{GNUNET_TESTBED_barrier_cancel()}
+are used by experiment drivers. All barriers should be initialised by the
+experiment driver by calling @code{GNUNET_TESTBED_barrier_init()}. This
+function takes a name to identify the barrier, the quorum required for the
+barrier to be crossed and a notification callback for notifying the experiment
+driver when the barrier is crossed. @code{GNUNET_TESTBED_barrier_cancel()}
+cancels an initialised barrier and frees the resources allocated for it. This
+function can be called on an initialised barrier before it is crossed.
+The remaining two functions, @code{GNUNET_TESTBED_barrier_wait()} and
+@code{GNUNET_TESTBED_barrier_wait_cancel()}, are used in the peers' processes.
+@code{GNUNET_TESTBED_barrier_wait()} connects to the local barrier service
+running on the same host the peer is running on and registers that the caller
+has reached the barrier and is waiting for the barrier to be crossed. Note that
+this function can only be used by peers which are started by the testbed, as
+it tries to access the local barrier service which is part of the testbed
+controller service. Calling @code{GNUNET_TESTBED_barrier_wait()} on an
+uninitialised barrier results in failure.
+@code{GNUNET_TESTBED_barrier_wait_cancel()} cancels the notification registered
+by @code{GNUNET_TESTBED_barrier_wait()}.
+@c ***************************************************************************
+@menu
+* Implementation::
+@end menu
+@node Implementation
+@subsubsection Implementation
+Since barriers involve coordination between the experiment driver and the
+peers, the barrier service in the testbed controller is split into two
+components. The first component responds to the messages generated by the
+barrier API used by the experiment driver (functions
+@code{GNUNET_TESTBED_barrier_init()} and
+@code{GNUNET_TESTBED_barrier_cancel()}) and the second component to the
+messages generated by the barrier API used by peers (functions
+@code{GNUNET_TESTBED_barrier_wait()} and
+@code{GNUNET_TESTBED_barrier_wait_cancel()}).
+Calling @code{GNUNET_TESTBED_barrier_init()} sends a
+@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_INIT} message to the master
+controller. The master controller then registers a barrier and calls
+@code{GNUNET_TESTBED_barrier_init()} for each of its subcontrollers. In this
+way barrier initialisation is propagated through the controller hierarchy.
+While propagating initialisation, any errors at a subcontroller, such as a
+timeout during further propagation, are reported up the hierarchy back to the
+experiment driver.
+Similar to @code{GNUNET_TESTBED_barrier_init()},
+@code{GNUNET_TESTBED_barrier_cancel()} propagates a
+@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_CANCEL} message which causes
+controllers to remove an initialised barrier.
+The second component is implemented as a separate service in the binary
+`gnunet-service-testbed', which already contains the testbed controller
+service. Although this deviates from the GNUnet process architecture of having
+one service per binary, it is needed in this case as this component needs
+access to the barrier data created by the first component. This component
+responds to @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages from local
+peers when they call @code{GNUNET_TESTBED_barrier_wait()}. Upon receiving a
+@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} message, the service checks
+whether the requested barrier has been initialised. If it has not, an error
+status is sent through a @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS}
+message to the local peer and the connection from the peer is terminated. If
+the barrier has been initialised, the barrier's counter for reached peers is
+incremented and a notification is registered to notify the peer when the
+barrier is reached. The connection from the peer is left open.
+When enough peers to attain the quorum have sent
+@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_WAIT} messages, the controller sends
+a @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message to its parent
+informing it that the barrier is crossed. If the controller has started further
+subcontrollers, it delays this message until it receives a similar notification
+from each of those subcontrollers. Finally, the barriers API at the experiment
+driver receives the @code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message
+when the barrier is reached at all the controllers.
+The barriers API at the experiment driver responds to the
+@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message by echoing it back to
+the master controller and notifying the experiment driver through the
+notification callback that a barrier has been crossed. The echoed
+@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message is propagated by the
+master controller through the controller hierarchy. This propagation triggers
+the notifications registered by peers at each of the controllers in the
+hierarchy.
+Note the difference between the downward propagation of the
+@code{GNUNET_MESSAGE_TYPE_TESTBED_BARRIER_STATUS} message and its upward
+propagation: the upward propagation is needed for ensuring that the barrier
+is reached by all the controllers, and the downward propagation is for
+signalling to the waiting peers that the barrier is crossed.
+@c ***************************************************************************
+@node Automatic large-scale deployment of GNUnet in the PlanetLab testbed
+@subsection Automatic large-scale deployment of GNUnet in the PlanetLab testbed
+PlanetLab is a testbed for computer networking and distributed systems
+research. It was established in 2002 and as of June 2010 was composed of 1090
+nodes at 507 sites worldwide.
+To simplify the large-scale deployment of GNUnet, we created a set of
+automation tools. We provide a set of scripts you can use to deploy GNUnet on
+a set of nodes and manage your installation.
+Please also check @uref{https://gnunet.org/installation-fedora8-svn} and@
+@uref{https://gnunet.org/installation-fedora12-svn} to find detailed
+instructions on how to install GNUnet on a PlanetLab node.
+@c ***************************************************************************
+@menu
+* PlanetLab Automation for Fedora8 nodes::
+* Install buildslave on PlanetLab nodes running fedora core 8::
+* Setup a new PlanetLab testbed using GPLMT::
+* Why do i get an ssh error when using the regex profiler?::
+@end menu
+@node PlanetLab Automation for Fedora8 nodes
+@subsubsection PlanetLab Automation for Fedora8 nodes
+@c ***************************************************************************
+@node Install buildslave on PlanetLab nodes running fedora core 8
+@subsubsection Install buildslave on PlanetLab nodes running fedora core 8
+@c ** Actually this is a subsubsubsection, but must be fixed differently
+@c ** as subsubsection is the lowest.
+Since most of the PlanetLab nodes are running the very old Fedora Core 8 image,
+installing the buildslave software is quite painful. For our PlanetLab
+testbed we figured out how best to install the buildslave software.
+Install Distribute for python:@ @code{@ curl
+http://python-distribute.org/distribute_setup.py | sudo python@ }
+Install zope.interface <= 3.8.0 (4.0 and 4.0.1 will not work):@
+@code{@ wget
+http://pypi.python.org/packages/source/z/zope.interface/zope.interface-3.8.0.tar.gz@
+tar xvfz zope.interface-3.8.0.tar.gz@ cd zope.interface-3.8.0@ sudo python
+setup.py install@ }
+Install the buildslave software (0.8.6 was the latest version):@ @code{@ wget
+http://buildbot.googlecode.com/files/buildbot-slave-0.8.6p1.tar.gz@ tar xvfz
+buildbot-slave-0.8.6p1.tar.gz@ cd buildbot-slave-0.8.6p1@ sudo python setup.py
+install@ }
+The setup will download the matching twisted package and install it.@ It will
+also try to install the latest version of zope.interface which will fail to
+install. Buildslave will work anyway since version 3.8.0 was installed before!
+@c ***************************************************************************
+@node Setup a new PlanetLab testbed using GPLMT
+@subsubsection Setup a new PlanetLab testbed using GPLMT
+@itemize @bullet
+@item Get a new slice and assign nodes
+Ask your PlanetLab PI to give you a new slice and assign the nodes you need
+@item Install a buildmaster
+You can stick to the buildbot documentation:@
+@uref{http://buildbot.net/buildbot/docs/current/manual/installation.html}
+@item Install the buildslave software on all nodes
+To install the buildslave on all nodes assigned to your slice you can use the
+tasklist @code{install_buildslave_fc8.xml} provided with GPLMT:
+@code{@ ./gplmt.py -c contrib/tumple_gnunet.conf -t
+contrib/tasklists/install_buildslave_fc8.xml -a -p <planetlab password>@ }
+@item Create the buildmaster configuration and the slave setup commands
+The master and the slaves need to have credentials, and the master
+has to have all nodes configured. This can be done with the
+@code{create_buildbot_configuration.py} script in the @code{scripts} directory.
+This script takes a list of nodes retrieved directly from PlanetLab or read
+from a file and a configuration template and creates:@
+ - a tasklist which can be executed with gplmt to set up the slaves@
+ - a master.cfg file containing the PlanetLab nodes
+A configuration template is included in <contrib>. Most importantly,
+the script replaces the following tags in the template:
+%GPLMT_BUILDER_DEFINITION :@ GPLMT_BUILDER_SUMMARY@ GPLMT_SLAVES@
+%GPLMT_SCHEDULER_BUILDERS
+Create configuration for all nodes assigned to a slice:@ @code{@
+./create_buildbot_configuration.py -u <planetlab username> -p <planetlab
+password> -s <slice> -m <buildmaster+port> -t <template>@ }@ Create
+configuration for some nodes in a file:@ @code{@
+./create_buildbot_configuration.py -f <node_file> -m <buildmaster+port> -t
+<template>@ }
+@item Copy the @code{master.cfg} to the buildmaster and start it
+Use @code{buildbot start <basedir>} to start the server.
+@item Set up the buildslaves
+@end itemize
+@c ***************************************************************************
+@node Why do i get an ssh error when using the regex profiler?
+@subsubsection Why do i get an ssh error when using the regex profiler?
+Why do I get an ssh error "Permission denied (publickey,password)." when using
+the regex profiler although passwordless ssh to localhost works using publickey
+and ssh-agent?
+You have to generate a public/private-key pair with no password:@
+@code{ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_localhost}@
+and then add the following to your ~/.ssh/config file:
+@code{Host 127.0.0.1@ IdentityFile ~/.ssh/id_localhost}
+Now make sure your hosts file looks like this:@
+[USERNAME]@@127.0.0.1:22@
+[USERNAME]@@127.0.0.1:22
+You can test your setup by running @code{ssh 127.0.0.1} in a terminal and then
+running it again in the opened session. If you were not asked for a password on
+either login, then you should be good to go.
+@c ***************************************************************************
+@node TESTBED Caveats
+@subsection TESTBED Caveats
+This section documents a few caveats when using the GNUnet testbed
+subsystem.
+@c ***************************************************************************
+@menu
+* CORE must be started::
+* ATS must want the connections::
+@end menu
+@node CORE must be started
+@subsubsection CORE must be started
+A simple issue is #3993: Your configuration MUST somehow ensure that for each
+peer the CORE service is started when the peer is set up, otherwise TESTBED may
+fail to connect peers when the topology is initialized, as TESTBED will start
+some CORE services but not necessarily all (but it relies on all of them
+running). The easiest way is to set 'FORCESTART = YES' in the '[core]' section
+of the configuration file. Alternatively, having any service that directly or
+indirectly depends on CORE being started with FORCESTART will also do. This
+issue largely arises if users try to over-optimize by not starting any services
+with FORCESTART.
+@c ***************************************************************************
+@node ATS must want the connections
+@subsubsection ATS must want the connections
+When TESTBED sets up connections, it only offers the respective HELLO
+information to the TRANSPORT service. It is then up to the ATS service to
+@strong{decide} to use the connection. The ATS service will typically eagerly
+establish any connection if the number of total connections is low (relative to
+bandwidth). Details may further depend on the specific ATS backend that was
+configured. If ATS decides to NOT establish a connection (even though TESTBED
+provided the required information), then that connection will count as failed
+for TESTBED. Note that you can configure TESTBED to tolerate a certain number
+of connection failures (see '-e' option of gnunet-testbed-profiler). This issue
+largely arises for dense overlay topologies, especially if you try to create
+cliques with more than 20 peers.
+@c ***************************************************************************
+@node libgnunetutil
+@section libgnunetutil
+libgnunetutil is the fundamental library that all GNUnet code builds upon.
+Ideally, this library should contain most of the platform-dependent code
+(except for user interfaces and really special needs that only few applications
+have). It is also supposed to offer basic services that most if not all GNUnet
+binaries require. The code of libgnunetutil is in the src/util/ directory. The
+public interface to the library is in the gnunet_util.h header. The functions
+provided by libgnunetutil fall roughly into the following categories (in
+roughly the order of importance for new developers):
+@itemize @bullet
+@item logging (common_logging.c)
+@item memory allocation (common_allocation.c)
+@item endianness conversion (common_endian.c)
+@item internationalization (common_gettext.c)
+@item string manipulation (string.c)
+@item file access (disk.c)
+@item buffered disk IO (bio.c)
+@item time manipulation (time.c)
+@item configuration parsing (configuration.c)
+@item command-line handling (getopt*.c)
+@item cryptography (crypto_*.c)
+@item data structures (container_*.c)
+@item CPS-style scheduling (scheduler.c)
+@item program initialization (program.c)
+@item networking (network.c, client.c, server*.c, service.c)
+@item message queueing (mq.c)
+@item bandwidth calculations (bandwidth.c)
+@item other OS-related functions (os*.c, plugin.c, signal.c)
+@item pseudonym management (pseudonym.c)
+@end itemize
+It should be noted that only developers that fully understand this entire API
+will be able to write good GNUnet code.
+Ideally, porting GNUnet should only require porting the gnunetutil library.
+More testcases for the gnunetutil APIs are therefore a great way to make
+porting of GNUnet easier.
+@menu
+* Logging::
+* Interprocess communication API (IPC)::
+* Cryptography API::
+* Message Queue API::
+* Service API::
+* Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps::
+* The CONTAINER_MDLL API::
+@end menu
+@c ***************************************************************************
+@node Logging
+@subsection Logging
+GNUnet is able to log its activity, mostly for the purposes of debugging the
+program at various levels.
+@code{gnunet_common.h} defines several @strong{log levels}:
+@table @asis
+@item ERROR
+for errors (really problematic situations, often leading to crashes)
+@item WARNING
+for warnings (troubling situations that might have negative consequences,
+although not fatal)
+@item INFO
+for various information.
+Used somewhat rarely, as GNUnet statistics is used to hold and display most of
+the information that users might find interesting.
+@item DEBUG
+for debugging.
+Does not produce much output on normal builds, but when extra logging is
+enabled at compile time, a staggering amount of data is output under this
+log level.
+@end table
+Normal builds of GNUnet (configured with @code{--enable-logging[=yes]}) are
+supposed to log nothing under DEBUG level. The @code{--enable-logging=verbose}
+configure option can be used to create a build with all logging enabled.
+However, such a build will produce large amounts of log data, which is
+inconvenient when one tries to hunt down a specific problem.
+To mitigate this problem, GNUnet provides facilities to apply a filter to
+reduce the logs:
+@table @asis
+@item Logging by default
+When no log levels are configured in any other way
+(see below), GNUnet will default to the WARNING log level. This mostly applies
+to GNUnet command line utilities, services and daemons; tests will always set
+log level to WARNING or, if @code{--enable-logging=verbose} was passed to
+configure, to DEBUG. The default level is suggested for normal operation.
+@item The -L option
+Most GNUnet executables accept an "-L loglevel" or
+"--log=loglevel" option. If used, it makes the process set a global log level
+to "loglevel". Thus it is possible to run some processes with -L DEBUG, for
+example, and others with -L ERROR to enable specific settings to diagnose
+problems with a particular process.
+@item Configuration files.
+Because GNUnet
+service and daemon processes are usually launched by gnunet-arm, it is not
+possible to pass different custom command line options directly to every one of
+them. The options passed to @code{gnunet-arm} only affect gnunet-arm and not
+the rest of GNUnet. However, one can specify a configuration key "OPTIONS" in
+the section that corresponds to a service or a daemon, and put a value of "-L
+loglevel" there. This will make the respective service or daemon set its log
+level to "loglevel" (as the value of OPTIONS will be passed as a command-line
+argument).
+To specify the same log level for all services without creating separate
+"OPTIONS" entries in the configuration for each one, the user can specify a
+config key "GLOBAL_POSTFIX" in the [arm] section of the configuration file. The
+value of GLOBAL_POSTFIX will be appended to all command lines used by the ARM
+service to run other services. It can contain any option valid for all GNUnet
+commands, thus in particular the "-L loglevel" option. The ARM service itself
+is, however, unaffected by GLOBAL_POSTFIX; to set the log level for it, one has
+to specify the "OPTIONS" key in the [arm] section.
+@item Environment variables.
+Setting global per-process log levels with "-L loglevel" does not offer
+sufficient log filtering granularity, as one service will call interface
+libraries and supporting libraries of other GNUnet services, potentially
+producing lots of debug log messages from these libraries. Also, changing the
+config file is not always convenient (especially when running the GNUnet test
+suite).@ To fix that, and to allow GNUnet to use different log filtering at
+runtime without re-compiling the whole source tree, the log calls were changed
+to be configurable at run time. To configure them one has to define environment
+variables "GNUNET_FORCE_LOGFILE", "GNUNET_LOG" and/or "GNUNET_FORCE_LOG":
+@itemize @bullet
+@item "GNUNET_LOG" only affects the logging when no global log level is
+configured by any other means (that is, the process does not explicitly set its
+own log level, there are no "-L loglevel" options on command line or in
+configuration files), and can be used to override the default WARNING log
+level.
+@item "GNUNET_FORCE_LOG" will completely override any other log configuration
+options given.
+@item "GNUNET_FORCE_LOGFILE" will completely override the location of the file
+to log messages to. It should contain a relative or absolute file name. Setting
+GNUNET_FORCE_LOGFILE is equivalent to passing "--log-file=logfile" or "-l
+logfile" option (see below). It supports "[]" format in file names, but not
+"@{@}" (see below).
+@end itemize
+Because environment variables are inherited by child processes when they are
+launched, starting or re-starting the ARM service with these variables will
+propagate them to all other services.
+"GNUNET_LOG" and "GNUNET_FORCE_LOG" variables must contain a specially
+formatted @strong{logging definition} string, which looks like this:@ @code{@
+[component];[file];[function];[from_line[-to_line]];loglevel@emph{[/component...]}@
+}@ That is, a logging definition consists of definition entries, separated by
+slashes ('/'). If only one entry is present, there is no need to add a slash
+to its end (although it is not forbidden either).@ All definition fields
+(component, file, function, lines and loglevel) are mandatory, but (except for
+the loglevel) they can be empty. An empty field means "match anything". Note
+that even if fields are empty, the semicolon (';') separators must be
+present.@ The loglevel field is mandatory, and must contain one of the log
+level names (ERROR, WARNING, INFO or DEBUG).@ The lines field might contain
+one non-negative number, in which case it matches only one line, or a range
+"from_line-to_line", in which case it matches any line in the interval
+[from_line;to_line] (that is, including both start and end line).@ GNUnet
+mostly defaults component name to the name of the service that is implemented
+in a process ('transport', 'core', 'peerinfo', etc), but logging calls can
+specify custom component names using @code{GNUNET_log_from}.@ File name and
+function name are provided by the compiler (__FILE__ and __FUNCTION__
+built-ins).
+Component, file and function fields are interpreted as non-extended regular
+expressions (GNU libc regex functions are used). Matching is case-sensitive, ^
+and $ will match the beginning and the end of the text. If a field is empty,
+its contents are automatically replaced with a ".*" regular expression, which
+matches anything. Matching is done in the default way, which means that the
+expression matches as long as it's contained anywhere in the string. Thus
+"GNUNET_" will match both "GNUNET_foo" and "BAR_GNUNET_BAZ". Use '^' and/or '$'
+to make sure that the expression matches at the start and/or at the end of the
+string.@ The semicolon (';') can't be escaped, and GNUnet will not use it in
+component names (it can't be used in function names and file names anyway).@
+@end table
+Every logging call in GNUnet code will be (at run time) matched against the
+log definitions passed to the process. If a log definition's fields match
+the call arguments, then the call log level is compared to the log level of
+that definition. If the call log level is less than or equal to the definition
+log level, the call is allowed to proceed. Otherwise the logging call is
+forbidden, and nothing is logged. If no definitions matched at all, GNUnet
+will use the global log level or (if a global log level is not specified) will
+default to WARNING (that is, it will allow the call to proceed, if its level
+is less than or equal to the global log level or to WARNING).
+That is, definitions are evaluated from left to right, and the first matching
+definition is used to allow or deny the logging call. Thus it is advised to
+place narrow definitions at the beginning of the logdef string, and generic
+definitions at the end.
+Whether a call is allowed or not is only decided the first time this particular
+call is made. The evaluation result is then cached, so that any attempts to
+make the same call later will be allowed or disallowed right away. Because of
+that, runtime log level evaluation should not significantly affect the process
+performance.@ Log definition parsing is only done once, at the first call to
+GNUNET_log_setup () made by the process (which is usually done soon after it
+starts).
+At the moment of writing there is no way to specify logging definitions from
+configuration files, only via environment variables.
+At the moment GNUnet will stop processing a log definition when it encounters
+an error in definition formatting or an error in regular expression syntax, and
+will not report the failure in any way.
+@c ***************************************************************************
+@menu
+* Examples::
+* Log files::
+* Updated behavior of GNUNET_log::
+@end menu
+@node Examples
+@subsubsection Examples
+@table @asis
+@item @code{GNUNET_FORCE_LOG=";;;;DEBUG" gnunet-arm -s}
+Start GNUnet process tree, running all processes with DEBUG level (one should
+be careful with it, as log files will grow at an alarming rate!)
+@item @code{GNUNET_FORCE_LOG="core;;;;DEBUG" gnunet-arm -s}
+Start GNUnet process tree, running the core service under DEBUG level
+(everything else will use configured or default level).
+@item @code{GNUNET_FORCE_LOG=";gnunet-service-transport_validation.c;;;DEBUG" gnunet-arm -s}
+Start GNUnet process tree, allowing any logging calls from
+gnunet-service-transport_validation.c (everything else will use configured or
+default level).
+@item @code{GNUNET_FORCE_LOG="fs;gnunet-service-fs_push.c;;;DEBUG" gnunet-arm -s}
+Start GNUnet process tree, allowing any logging calls made by the fs component
+from gnunet-service-fs_push.c (everything else will use configured or default
+level).
+@item @code{GNUNET_FORCE_LOG=";;GNUNET_NETWORK_socket_select;;DEBUG" gnunet-arm -s}
+Start GNUnet process tree, allowing any logging calls from the
+GNUNET_NETWORK_socket_select function (everything else will use configured or
+default level).
+@item @code{GNUNET_FORCE_LOG="transport.*;;.*send.*;;DEBUG/;;;;WARNING" gnunet-arm -s}
+Start GNUnet process tree, allowing any logging calls from the components
+that have "transport" in their names, and are made from functions that have
+"send" in their names. Everything else will be allowed to be logged only if
+its level is WARNING or more severe.
+@end table
+On Windows, one can use batch files to run GNUnet processes with special
+environment variables, without affecting the whole system. Such a batch file
+will look like this:@ @code{@ set GNUNET_FORCE_LOG=;;do_transmit;;DEBUG@
+gnunet-arm -s@ }@ (note the absence of double quotes in the environment
+variable definition, as opposed to earlier examples, which use the shell).@
+Another limitation: on Windows, GNUNET_FORCE_LOGFILE @strong{MUST} be set in
+order for GNUNET_FORCE_LOG to work.
+@c ***************************************************************************
+@node Log files
+@subsubsection Log files
+GNUnet can be told to log everything into a file instead of stderr (which is
+the default) using the "--log-file=logfile" or "-l logfile" option. This option
+can be passed on the command line, or via the "OPTIONS" and "GLOBAL_POSTFIX"
+configuration keys (see above). The file name passed with this option is
+subject to GNUnet filename expansion. If specified in "GLOBAL_POSTFIX", it is
+also subject to ARM service filename expansion, in particular, it may contain
+"@{@}" (left and right curly brace) sequence, which will be replaced by ARM
+with the name of the service. This is used to keep logs from more than one
+service separate, while only specifying one template containing "@{@}" in
+GLOBAL_POSTFIX.
+As part of a secondary file name expansion, the first occurrence of the "[]"
+sequence ("left square brace" followed by "right square brace") in the file
+name will be replaced with the process identifier of the process when it
+initializes its logging subsystem. As a result, all processes will log into
+different files. This is convenient for isolating messages of a particular
+process, and prevents I/O races when multiple processes try to write into the
+file at the same time. This expansion is done independently of "@{@}"
+expansion that ARM service does (see above).
+The log file name that is specified via "-l" can contain format characters
+from the 'strftime' function family. For example, "%Y" will be replaced with
+the current year. Using "basename-%Y-%m-%d.log" would include the current
+year, month and day in the log file name. If a GNUnet process runs for long
+enough to need more than one log file, it will eventually clean up old log
+files.
+Currently, only the last three log files (plus the current log file) are
+preserved. So once the fifth log file goes into use (so after 4 days if you
+use "%Y-%m-%d" as above), the first log file will be automatically deleted.
+Note that if your log file name only contains "%Y", then log files would be
+kept for 4 years and the logs from the first year would be deleted once year 5
+begins. If you do not use any date-related string format codes, logs would
+never be automatically deleted by GNUnet.
+@c ***************************************************************************
+@node Updated behavior of GNUNET_log
+@subsubsection Updated behavior of GNUNET_log
+It's currently quite common to see constructions like this all over the code:
+@example
+#if MESH_DEBUG
+GNUNET_log (GNUNET_ERROR_TYPE_DEBUG, "MESH: client disconnected\n");
+#endif
+@end example
+The reason for the #if is not to avoid displaying the message when disabled
+(GNUNET_ERROR_TYPE takes care of that), but to avoid the compiler including it
+in the binary at all, when compiling GNUnet for platforms with restricted
+storage space / memory (MIPS routers, ARM plug computers / dev boards, etc).
+This presents several problems: the code gets ugly and hard to write, and it is
+very easy to forget to include the #if guards, creating inconsistent code. A
+new change in GNUNET_log aims to solve these problems.
+@strong{With this change, you must run @code{./configure} with at least
+@code{--enable-logging=verbose} to see debug messages.}
+Here is an example of code with dense debug statements:
+@example
+switch (restrict_topology) @{
+case GNUNET_TESTING_TOPOLOGY_CLIQUE:
+#if VERBOSE_TESTING
+  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
+              _("Blacklisting all but clique topology\n"));
+#endif
+  unblacklisted_connections =
+    create_clique (pg, &remove_connections, BLACKLIST, GNUNET_NO);
+  break;
+case GNUNET_TESTING_TOPOLOGY_SMALL_WORLD_RING:
+#if VERBOSE_TESTING
+  GNUNET_log (GNUNET_ERROR_TYPE_DEBUG,
+              _("Blacklisting all but small world (ring) topology\n"));
+#endif
+  unblacklisted_connections =
+    create_small_world_ring (pg, &remove_connections, BLACKLIST);
+  break;
+@end example
+Pretty hard to follow, huh?
+From now on, it is not necessary to include the #if / #endif statements to
+achieve the same behavior. The GNUNET_log and GNUNET_log_from macros take care
+of it for you, depending on the configure option:
+@itemize @bullet
+@item If @code{--enable-logging} is set to @code{no}, the binary will contain
+no log messages at all.
+@item If @code{--enable-logging} is set to @code{yes}, the binary will contain
+no DEBUG messages, and therefore running with -L DEBUG will have no effect.
+Other messages (ERROR, WARNING, INFO, etc) will be included.
+@item If @code{--enable-logging} is set to @code{verbose} or
+@code{veryverbose}, the binary will contain DEBUG messages (still, it will be
+necessary to run with -L DEBUG or set the DEBUG config option to show them).
+@end itemize
+If you are a developer:
+@itemize @bullet
+@item please make sure that you @code{./configure
+--enable-logging=@{verbose,veryverbose@}}, so you can see DEBUG messages.
+@item please remove the @code{#if} statements around @code{GNUNET_log
+(GNUNET_ERROR_TYPE_DEBUG, ...)} lines, to improve the readability of your code.
+@end itemize
+Since activating DEBUG now automatically makes it VERBOSE and activates
+@strong{all} debug messages by default, you probably want to use the
+https://gnunet.org/logging functionality to filter only relevant messages. A
+suitable configuration could be:@ @code{$ export
+GNUNET_FORCE_LOG="^YOUR_SUBSYSTEM$;;;;DEBUG/;;;;WARNING"}@ This will behave
+almost like enabling DEBUG in that subsystem before the change. Of course you
+can adapt it to your particular needs; this is only a quick example.
+@c ***************************************************************************
+@node Interprocess communication API (IPC)
+@subsection Interprocess communication API (IPC)
+In GNUnet a variety of new message types might be defined and used in
+interprocess communication. In this tutorial we use the @code{struct
+AddressLookupMessage} as an example to introduce how to construct our own
+message type in GNUnet and how to implement the message communication between
+service and client.@ (Here, a client uses the @code{struct
+AddressLookupMessage} as a request to ask the server to return the address of
+any other peer connecting to the service.)
+@c ***************************************************************************
+@menu
+* Define new message types::
+* Define message struct::
+* Client: Establish connection::
+* Client: Initialize request message::
+* Client: Send request and receive response::
+* Server: Startup service::
+* Server: Add new handles for specified messages::
+* Server: Process request message::
+* Server: Response to client::
+* Server: Notification of clients::
+* Conversion between Network Byte Order (Big Endian) and Host Byte Order::
+@end menu
+@node Define new message types
+@subsubsection Define new message types
+First of all, you should define the new message type in
+@code{gnunet_protocols.h}:
+@example
+/* Request to look up addresses of peers in the server. */
+#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP 29
+/* Response to the address lookup request. */
+#define GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY 30
+@end example
+@c ***************************************************************************
+@node Define message struct
+@subsubsection Define message struct
+After the type definition, the specified message structure should also be
+described in the header file, e.g. transport.h in our case.
+@example
+GNUNET_NETWORK_STRUCT_BEGIN
+struct AddressLookupMessage
+@{
+  struct GNUNET_MessageHeader header;
+  int32_t numeric_only GNUNET_PACKED;
+  struct GNUNET_TIME_AbsoluteNBO timeout;
+  uint32_t addrlen GNUNET_PACKED;
+  /* followed by 'addrlen' bytes of the actual address, then
+     followed by the 0-terminated name of the transport */
+@};
+GNUNET_NETWORK_STRUCT_END
+@end example
+Please note @code{GNUNET_NETWORK_STRUCT_BEGIN} and @code{GNUNET_PACKED}, which
+together ensure correct alignment (that is, a layout without padding between
+the fields) when sending structs over the network.
+@c ***************************************************************************
+@node Client: Establish connection
+@subsubsection Client: Establish connection
+@c %**end of header
+First, on the client side, the underlying API is used to create a new
+connection to a service; in our example, we connect to the transport
+service.
+@example
+struct GNUNET_CLIENT_Connection *client;
+client = GNUNET_CLIENT_connect ("transport", cfg);
+@end example
+@c ***************************************************************************
+@node Client: Initialize request message
+@subsubsection Client: Initialize request message
+@c %**end of header
+When the connection is ready, we initialize the message. In this step, all the
+fields of the message should be properly initialized, namely the size, the
+type, and some extra user-defined data, such as the timeout, the address and
+the name of the transport.
+@example
+struct AddressLookupMessage *msg;
+size_t len = sizeof (struct AddressLookupMessage)
+             + addressLen + strlen (nameTrans) + 1;
+/* allocate room for the struct plus the trailing address and name */
+msg = GNUNET_malloc (len);
+msg->header.size = htons (len);
+msg->header.type = htons (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP);
+msg->timeout = GNUNET_TIME_absolute_hton (abs_timeout);
+msg->addrlen = htonl (addressLen);
+/* the variable-size payload starts right after the fixed-size struct */
+char *addrbuf = (char *) &msg[1];
+memcpy (addrbuf, address, addressLen);
+char *tbuf = &addrbuf[addressLen];
+memcpy (tbuf, nameTrans, strlen (nameTrans) + 1);
+@end example
+Note that the functions @code{htonl}, @code{htons} and
+@code{GNUNET_TIME_absolute_hton} are applied here to convert the values from
+host byte order into network byte order (big endian). For the usage of the
+byte orders and the corresponding conversion functions, please refer to the
+section on conversion between Network Byte Order and Host Byte Order.
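+The construction of such a variable-size message can also be tried outside of
+GNUnet. The following is only a sketch under stated assumptions: the
+@code{sketch_} types are hypothetical stand-ins for
+@code{struct GNUNET_MessageHeader} and @code{struct AddressLookupMessage}
+(without the timeout field), the GCC/Clang @code{packed} attribute stands in
+for @code{GNUNET_PACKED}, and plain @code{calloc} is used instead of the
+GNUnet allocator.
```c
#include <arpa/inet.h> /* htons, htonl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct GNUNET_MessageHeader. */
struct sketch_header
{
  uint16_t size; /* size of the whole message, network byte order */
  uint16_t type; /* message type, network byte order */
} __attribute__ ((packed));

/* Hypothetical, simplified stand-in for struct AddressLookupMessage. */
struct sketch_lookup
{
  struct sketch_header header;
  int32_t numeric_only;
  uint32_t addrlen;
  /* followed by 'addrlen' bytes of address, then a 0-terminated name */
} __attribute__ ((packed));

#define SKETCH_TYPE_LOOKUP 29

/* Build a request; returns NULL if it does not fit into a 16-bit size
   field or if memory is exhausted. The caller frees the result. */
static struct sketch_lookup *
sketch_build_lookup (const void *addr, size_t addr_len, const char *name)
{
  size_t name_len = strlen (name) + 1; /* include the terminator */
  size_t len = sizeof (struct sketch_lookup) + addr_len + name_len;
  struct sketch_lookup *msg;
  char *payload;

  if (len > UINT16_MAX)
    return NULL;
  msg = calloc (1, len);
  if (NULL == msg)
    return NULL;
  msg->header.size = htons ((uint16_t) len);
  msg->header.type = htons (SKETCH_TYPE_LOOKUP);
  msg->addrlen = htonl ((uint32_t) addr_len);
  payload = (char *) &msg[1]; /* first byte after the fixed part */
  memcpy (payload, addr, addr_len);
  memcpy (payload + addr_len, name, name_len);
  return msg;
}
```
+The explicit check against @code{UINT16_MAX} is an addition of this sketch; it
+reflects that the size field of the header is only 16 bits wide.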
+@c ***************************************************************************
+@node Client: Send request and receive response
+@subsubsection Client: Send request and receive response
+@c %**end of header
+FIXME: This is very outdated, see the tutorial for the
+current API!
+Next, the client sends the constructed message as a request to the service
+and waits for the response from the service. To accomplish this goal, there are
+a number of API calls that can be used. In this example,
+@code{GNUNET_CLIENT_transmit_and_get_response} is chosen as the most
+appropriate function to use.
+@example
+GNUNET_CLIENT_transmit_and_get_response (client, &msg->header, timeout,
+                                         GNUNET_YES,
+                                         &address_response_processor, arp_ctx);
+@end example
+The argument @code{address_response_processor} is a function of type
+@code{GNUNET_CLIENT_MessageHandler}, which is used to process the reply
+message from the service.
+@node Server: Startup service
+@subsubsection Server: Startup service
+To be able to receive the request message, the service runs a standard GNUnet
+service startup sequence using @code{GNUNET_SERVICE_run}, as follows:
+@example
+int
+main (int argc, char **argv)
+@{
+  GNUNET_SERVICE_run (argc, argv, "transport",
+                      GNUNET_SERVICE_OPTION_NONE, &run, NULL);
+@}
+@end example
+@c ***************************************************************************
+@node Server: Add new handles for specified messages
+@subsubsection Server: Add new handles for specified messages
+@c %**end of header
+In the function above, the argument @code{run} is used to initialize the
+transport service, and is defined like this:
+@example
+static void
+run (void *cls, struct GNUNET_SERVER_Handle *serv,
+     const struct GNUNET_CONFIGURATION_Handle *cfg)
+@{
+  GNUNET_SERVER_add_handlers (serv, handlers);
+@}
+@end example
+Here, @code{GNUNET_SERVER_add_handlers} must be called in the run function to
+add new handlers in the service. The parameter @code{handlers} is a list of
+@code{struct GNUNET_SERVER_MessageHandler} to tell the service which function
+should be called when a particular type of message is received, and should be
+defined in this way:
+@example
+static struct GNUNET_SERVER_MessageHandler handlers[] = @{
+  @{&handle_start, NULL, GNUNET_MESSAGE_TYPE_TRANSPORT_START, 0@},
+  @{&handle_send, NULL, GNUNET_MESSAGE_TYPE_TRANSPORT_SEND, 0@},
+  @{&handle_try_connect, NULL, GNUNET_MESSAGE_TYPE_TRANSPORT_TRY_CONNECT,
+   sizeof (struct TryConnectMessage)@},
+  @{&handle_address_lookup, NULL,
+   GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP, 0@},
+  @{NULL, NULL, 0, 0@}
+@};
+@end example
+As shown, the first member of each struct in the array is a callback
+function, which is called to process messages of the type given as the third
+member. The second member is the closure for the callback function, which is
+set to @code{NULL} in most cases, and the last member is the expected size of
+messages of this type. Usually we set it to 0 to accept variable-size
+messages; for special cases, the exact size of the specified message can also
+be set. In addition, the array must end with the terminator entry
+@code{@{NULL, NULL, 0, 0@}}.
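+How such a terminated handler table can be evaluated is sketched below. This
+is not the implementation of the GNUnet server; the @code{sketch_} names and
+the simplified callback signature are assumptions made for illustration. The
+sketch walks the array until it hits the terminator, matches on the message
+type and enforces the expected size unless it is 0:
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified callback type. */
typedef int (*sketch_callback) (void *cls, uint16_t type, uint16_t size);

/* Hypothetical analogue of a message handler entry. */
struct sketch_handler
{
  sketch_callback callback;
  void *callback_cls;     /* closure passed to the callback */
  uint16_t type;          /* message type this entry handles */
  uint16_t expected_size; /* 0 accepts any size */
};

/* Returns the callback's result, 0 if no handler is registered for the
   type, or -1 if the message has the wrong size. */
static int
sketch_dispatch (const struct sketch_handler *handlers,
                 uint16_t type, uint16_t size)
{
  for (size_t i = 0; NULL != handlers[i].callback; i++)
  {
    if (handlers[i].type != type)
      continue;
    if ((0 != handlers[i].expected_size) &&
        (size != handlers[i].expected_size))
      return -1;
    return handlers[i].callback (handlers[i].callback_cls, type, size);
  }
  return 0;
}

/* Example callback: counts invocations via its closure. */
static int
sketch_count_call (void *cls, uint16_t type, uint16_t size)
{
  unsigned int *counter = cls;

  (void) type;
  (void) size;
  (*counter)++;
  return 1;
}
```
+Forgetting the terminator entry would make a loop of this kind read past the
+end of the array, which is why the table must always end with it.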
+@c ***************************************************************************
+@node Server: Process request message
+@subsubsection Server: Process request message
+@c %**end of header
+After the initialization of the transport service, request messages can be
+processed. Before handling the main message data, the validity of the message
+should be checked, e.g., whether the size of the message is correct.
+@example
+size = ntohs (message->size);
+if (size < sizeof (struct AddressLookupMessage))
+@{
+  GNUNET_break_op (0);
+  GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
+  return;
+@}
+@end example
+Note that, in contrast to the construction of the request message in the
+client, the server should use the functions @code{ntohl} and @code{ntohs}
+when extracting data from the message, so that the data in network byte order
+(big endian) is converted back into host byte order. For more details, please
+refer to the section on conversion between Network Byte Order and Host Byte
+Order.
+Moreover in this example, the name of the transport stored in the message is a
+0-terminated string, so we should also check whether the name of the transport
+in the received message is 0-terminated:
+@example
+nameTransport = (const char *) &address[addressLen];
+if (nameTransport[size - sizeof (struct AddressLookupMessage)
+                  - addressLen - 1] != '\0')
+@{
+  GNUNET_break_op (0);
+  GNUNET_SERVER_receive_done (client, GNUNET_SYSERR);
+  return;
+@}
+@end example
+Here, @code{GNUNET_SERVER_receive_done} should be called to tell the service
+that processing of the request is done and that the next message can be
+received. The argument @code{GNUNET_SYSERR} here indicates that the service
+did not understand the request message, so the processing of this request is
+terminated.
+In contrast, when the argument is equal to @code{GNUNET_OK}, the service
+continues to process the request message.
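+Both checks can be combined into one self-contained validation routine. The
+following sketch makes several assumptions: the @code{sketch_} types are
+hypothetical stand-ins for the GNUnet structs (without the timeout field), the
+message arrives in a plain byte buffer, and, unlike the fragments above, the
+sketch additionally verifies that the address length claimed in the message
+fits into the received size.
```c
#include <arpa/inet.h> /* ntohs, ntohl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the GNUnet message structs. */
struct sketch_header
{
  uint16_t size;
  uint16_t type;
} __attribute__ ((packed));

struct sketch_lookup
{
  struct sketch_header header;
  int32_t numeric_only;
  uint32_t addrlen;
  /* followed by 'addrlen' bytes of address, then a 0-terminated name */
} __attribute__ ((packed));

/* Returns a pointer to the transport name inside 'buf', or NULL if the
   message is malformed. */
static const char *
sketch_check_lookup (const void *buf, size_t buf_len)
{
  struct sketch_lookup fixed;
  size_t size;
  size_t addr_len;
  const char *payload;

  if (buf_len < sizeof (fixed))
    return NULL;
  memcpy (&fixed, buf, sizeof (fixed)); /* avoids unaligned access */
  size = ntohs (fixed.header.size);
  if ((size < sizeof (fixed)) || (size > buf_len))
    return NULL;
  addr_len = ntohl (fixed.addrlen);
  /* at least one byte (the terminator) must follow the address */
  if (addr_len >= size - sizeof (fixed))
    return NULL;
  payload = (const char *) buf + sizeof (fixed);
  if ('\0' != payload[size - sizeof (fixed) - 1])
    return NULL;
  return &payload[addr_len];
}
```
+Checking the last byte of the message is sufficient to make string functions
+safe on the name, because any earlier 0 byte only shortens the string.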
+@c ***************************************************************************
+@node Server: Response to client
+@subsubsection Server: Response to client
+@c %**end of header
+Once the processing of the current request is done, the server should send the
+response to the client. A new @code{struct AddressLookupMessage} is
+produced by the server in a similar way as the client did and sent to the
+client, but here the type should be
+@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY} rather than
+@code{GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_LOOKUP} as in the client.
+@example
+struct AddressLookupMessage *msg;
+size_t len = sizeof (struct AddressLookupMessage)
+             + addressLen + strlen (nameTrans) + 1;
+msg = GNUNET_malloc (len);
+msg->header.size = htons (len);
+msg->header.type = htons (GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
+// ...
+struct GNUNET_SERVER_TransmitContext *tc;
+tc = GNUNET_SERVER_transmit_context_create (client);
+GNUNET_SERVER_transmit_context_append_data (tc, NULL, 0,
+    GNUNET_MESSAGE_TYPE_TRANSPORT_ADDRESS_REPLY);
+GNUNET_SERVER_transmit_context_run (tc, rtimeout);
+@end example
+Note that there are also a number of other APIs provided to the service to
+send the message.
+@c ***************************************************************************
+@node Server: Notification of clients
+@subsubsection Server: Notification of clients
+@c %**end of header
+Often a service needs to (repeatedly) transmit notifications to a client or a
+group of clients. In these cases, the client typically has once registered for
+a set of events and then needs to receive a message whenever such an event
+happens (until the client disconnects). The use of a notification context can
+help manage message queues to clients and handle disconnects. Notification
+contexts can be used to send individualized messages to a particular client or
+to broadcast messages to a group of clients. An individualized notification
+might look like this:
+@example
+GNUNET_SERVER_notification_context_unicast (nc, client, msg, GNUNET_YES);
+@end example
+Note that after processing the original registration message for notifications,
+the server code still typically needs to call@
+@code{GNUNET_SERVER_receive_done} so that the client can transmit further
+messages to the server.
+@c ***************************************************************************
+@node Conversion between Network Byte Order (Big Endian) and Host Byte Order
+@subsubsection Conversion between Network Byte Order (Big Endian) and Host Byte Order
+@c %** subsub? it's a referenced page on the ipc document.
+@c %**end of header
+Network Byte Order is always big endian, whereas Host Byte Order depends on
+the CPU: it is little endian on common platforms such as x86 and big endian on
+some others. What is the difference between the two?
+Consider an integer that occupies 4 bytes in RAM. In big endian (Network Byte
+Order), the most significant byte is stored at the lowest address and the
+least significant byte at the highest address. Little endian takes exactly the
+opposite approach: the least significant byte is stored at the lowest address
+and the most significant byte at the highest address.
+Hosts exchange information over the network by sending and receiving data
+packets, and two hosts that communicate with each other may use different
+host byte orders. In order to preserve the meaning of the data during
+transmission, the sender must convert it to Network Byte Order before sending,
+and the receiver must convert it back to Host Byte Order after receiving.
+There are ten convenient functions for converting the byte order in GNUnet:
+@table @asis
+@item @code{uint16_t htons (uint16_t hostshort)}
+Convert a short int from host byte order to network byte order.
+@item @code{uint32_t htonl (uint32_t hostlong)}
+Convert a long int from host byte order to network byte order.
+@item @code{uint16_t ntohs (uint16_t netshort)}
+Convert a short int from network byte order to host byte order.
+@item @code{uint32_t ntohl (uint32_t netlong)}
+Convert a long int from network byte order to host byte order.
+@item @code{unsigned long long GNUNET_ntohll (unsigned long long netlonglong)}
+Convert a long long int from network byte order to host byte order.
+@item @code{unsigned long long GNUNET_htonll (unsigned long long hostlonglong)}
+Convert a long long int from host byte order to network byte order.
+@item @code{struct GNUNET_TIME_RelativeNBO GNUNET_TIME_relative_hton (struct GNUNET_TIME_Relative a)}
+Convert relative time to network byte order.
+@item @code{struct GNUNET_TIME_Relative GNUNET_TIME_relative_ntoh (struct GNUNET_TIME_RelativeNBO a)}
+Convert relative time from network byte order.
+@item @code{struct GNUNET_TIME_AbsoluteNBO GNUNET_TIME_absolute_hton (struct GNUNET_TIME_Absolute a)}
+Convert absolute time to network byte order.
+@item @code{struct GNUNET_TIME_Absolute GNUNET_TIME_absolute_ntoh (struct GNUNET_TIME_AbsoluteNBO a)}
+Convert absolute time from network byte order.
+@end table
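+What a 64-bit conversion does can be shown without knowing the byte order of
+the host. The following is a portable sketch, not the implementation of
+@code{GNUNET_htonll}; the @code{sketch_} function names are hypothetical. The
+value is written to memory byte by byte, most significant byte first, which
+yields Network Byte Order on any host:
```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store 'value' in 8 bytes, most significant byte first (big endian). */
static void
sketch_store_be64 (uint64_t value, unsigned char out[8])
{
  for (int i = 7; i >= 0; i--)
  {
    out[i] = (unsigned char) (value & 0xff);
    value >>= 8;
  }
}

/* Read 8 big endian bytes back into a host integer. */
static uint64_t
sketch_load_be64 (const unsigned char in[8])
{
  uint64_t value = 0;

  for (int i = 0; i < 8; i++)
    value = (value << 8) | in[i];
  return value;
}

/* Hypothetical 64-bit host-to-network conversion: the returned integer
   has the big endian byte sequence as its in-memory representation. */
static uint64_t
sketch_htonll (uint64_t host)
{
  unsigned char bytes[8];
  uint64_t net;

  sketch_store_be64 (host, bytes);
  memcpy (&net, bytes, sizeof (net));
  return net;
}

/* Hypothetical 64-bit network-to-host conversion. */
static uint64_t
sketch_ntohll (uint64_t net)
{
  unsigned char bytes[8];

  memcpy (bytes, &net, sizeof (bytes));
  return sketch_load_be64 (bytes);
}
```
+On a big endian host both conversion functions return their argument
+unchanged; on a little endian host they reverse the byte sequence.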
+@c ***************************************************************************
+@node Cryptography API
+@subsection Cryptography API
+@c %**end of header
+The gnunetutil API provides the cryptographic primitives used in GNUnet.
+GNUnet uses 2048 bit RSA keys for the session key exchange and for signing
+messages by peers and most other public-key operations. Most researchers in
+cryptography consider 2048 bit RSA keys as secure and practically unbreakable
+for a long time. The API provides functions to create a fresh key pair, read a
+private key from a file (or create a new file if the file does not exist),
+encrypt, decrypt, sign, verify and extract the public key into a format
+suitable for network transmission.
+For the encryption of files and the actual data exchanged between peers GNUnet
+uses 256-bit AES encryption. Fresh session keys are negotiated for every new
+connection.@ Again, there is no published technique to break this cipher in any
+realistic amount of time. The API provides functions for generation of keys,
+validation of keys (important for checking that decryptions using RSA
+succeeded), encryption and decryption.
+GNUnet uses SHA-512 for computing one-way hash codes. The API provides
+functions to compute a hash over a block in memory or over a file on disk.
+The crypto API also provides functions for randomizing a block of memory,
+obtaining a single random number and for generating a permutation of the
+numbers 0 to n-1. Random number generation distinguishes between WEAK and
+STRONG random number quality; WEAK random numbers are pseudo-random whereas
+STRONG random numbers use entropy gathered from the operating system.
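+To illustrate what generating a permutation of the numbers 0 to n-1 means, here
+is a minimal Fisher-Yates shuffle. It is only a sketch and not the GNUnet
+implementation: the function name @code{sketch_permute} is hypothetical, and
+the C library's @code{rand} is used as a stand-in for a WEAK-quality random
+source, so the result must not be used where STRONG quality is required.
```c
#include <assert.h>
#include <stdlib.h>

/* Returns a malloc'ed array holding the numbers 0 to n-1 in pseudo-random
   order, or NULL if n is 0 or memory is exhausted. The caller frees it. */
static unsigned int *
sketch_permute (unsigned int n)
{
  unsigned int *ret;

  if (0 == n)
    return NULL;
  ret = malloc (n * sizeof (unsigned int));
  if (NULL == ret)
    return NULL;
  for (unsigned int i = 0; i < n; i++)
    ret[i] = i;
  /* Fisher-Yates: swap each position with a random earlier-or-equal one. */
  for (unsigned int i = n - 1; i > 0; i--)
  {
    /* The modulo introduces a slight bias; acceptable for a sketch. */
    unsigned int j = (unsigned int) rand () % (i + 1);
    unsigned int tmp = ret[i];

    ret[i] = ret[j];
    ret[j] = tmp;
  }
  return ret;
}
```
+Every number from 0 to n-1 appears exactly once in the result, regardless of
+the quality of the random source; only the unpredictability of the order
+depends on it.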
+Finally, the crypto API provides a means to deterministically generate a
+1024-bit RSA key from a hash code. These functions should most likely not be
+used by most applications; most importantly,@
+GNUNET_CRYPTO_rsa_key_create_from_hash does not create an RSA-key that should
+be considered secure for traditional applications of RSA.
+@c ***************************************************************************
+@node Message Queue API
+@subsection Message Queue API
+@c %**end of header
+@strong{ Introduction }@ Often, applications need to queue messages that are to
+be sent to other GNUnet peers, clients or services. As all of GNUnet's
+message-based communication APIs, by design, do not allow messages to be
+queued, it is common to implement custom message queues manually when they are
+needed. However, writing very similar code in multiple places is tedious and
+leads to code duplication.
+MQ (for Message Queue) is an API that provides the functionality to implement
+and use message queues. We intend to eventually replace all of the custom
+message queue implementations in GNUnet with MQ.
+@strong{ Basic Concepts }@ The two most important entities in MQ are queues and
+envelopes.
+Every queue is backed by a specific implementation (e.g. for mesh, stream,
+connection, server client, etc.) that will actually deliver the queued
+messages. For convenience,@ some queues also allow specifying a list of message
+handlers. The message queue will then also wait for incoming messages and
+dispatch them appropriately.
+An envelope holds the memory for a message, as well as metadata (Where is
+the envelope queued? What should happen after it has been sent?). Any envelope
+can only be queued in one message queue.
+@strong{ Creating Queues }@ The following is a list of currently available
+message queues. Note that to avoid layering issues, message queues for higher
+level APIs are not part of @code{libgnunetutil}, but@ the respective API itself
+provides the queue implementation.
+@table @asis
+@item @code{GNUNET_MQ_queue_for_connection_client}
+Transmits queued messages over a @code{GNUNET_CLIENT_Connection} handle. Also
+supports receiving with message handlers.
+@item @code{GNUNET_MQ_queue_for_server_client}
+Transmits queued messages over a @code{GNUNET_SERVER_Client} handle. Does not
+support incoming message handlers.
+@item @code{GNUNET_MESH_mq_create}
+Transmits queued messages over a @code{GNUNET_MESH_Tunnel} handle. Does not
+support incoming message handlers.
+@item @code{GNUNET_MQ_queue_for_callbacks}
+This is the most general implementation. Instead of delivering and receiving
+messages with one of GNUnet's communication APIs, implementation callbacks are
+called. Refer to "Implementing Queues" for a more detailed explanation.
+@end table
+@strong{ Allocating Envelopes }@ A GNUnet message (as defined by the
+@code{GNUNET_MessageHeader}) has three parts: the size, the type, and the body.
+MQ provides macros to conveniently allocate an envelope containing a message,
+automatically setting the size and type fields of the message.
+Consider the following simple message, with the body consisting of a single
+number value.
+@example
+struct NumberMessage @{
+  /** Type: GNUNET_MESSAGE_TYPE_EXAMPLE_1 */
+  struct GNUNET_MessageHeader header;
+  uint32_t number GNUNET_PACKED;
+@};
+@end example
+An envelope containing an instance of the NumberMessage can be constructed like
+this:
+@example
+struct GNUNET_MQ_Envelope *ev;
+struct NumberMessage *msg;
+ev = GNUNET_MQ_msg (msg, GNUNET_MESSAGE_TYPE_EXAMPLE_1);
+msg->number = htonl (42);
+@end example
+In the above code, @code{GNUNET_MQ_msg} is a macro. The return value is the
+newly allocated envelope. The first argument must be a pointer to some
+@code{struct} containing a @code{struct GNUNET_MessageHeader header} field,
+while the second argument is the desired message type, in host byte order.
+The @code{msg} pointer now points to an allocated message, where the message
+type and the message size are already set. The message's size is inferred from
+the type of the @code{msg} pointer: It will be set to 'sizeof(*msg)', properly
+converted to network byte order.
+If the message body's size is dynamic, the macro @code{GNUNET_MQ_msg_extra}
+can be used to allocate an envelope whose message has additional space
+allocated after the @code{msg} structure.
+If no structure has been defined for the message,
+@code{GNUNET_MQ_msg_header_extra} can be used to allocate additional space
+after the message header. The first argument then must be a pointer to a
+@code{GNUNET_MessageHeader}.
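+How a macro can infer the message size from the type of the @code{msg} pointer
+is sketched below. This is not the MQ implementation: all @code{sketch_} and
+@code{SKETCH_} names are hypothetical, the envelope carries no metadata, and,
+unlike @code{GNUNET_MQ_msg}, the sketch macro assigns the envelope to its
+first argument instead of returning it.
```c
#include <arpa/inet.h> /* htons, htonl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct GNUNET_MessageHeader. */
struct sketch_header
{
  uint16_t size; /* size of the whole message, network byte order */
  uint16_t type; /* message type, network byte order */
};

/* Hypothetical envelope; the message is stored directly behind it. */
struct sketch_envelope
{
  struct sketch_header *mh;
};

/* Allocate envelope and message in one block and fill in the header.
   'type' is given in host byte order. Returns NULL on invalid size or
   allocation failure; the caller frees the envelope. */
static struct sketch_envelope *
sketch_envelope_alloc (size_t size, uint16_t type)
{
  struct sketch_envelope *ev;

  if ((size < sizeof (struct sketch_header)) || (size > UINT16_MAX))
    return NULL;
  ev = calloc (1, sizeof (struct sketch_envelope) + size);
  if (NULL == ev)
    return NULL;
  ev->mh = (struct sketch_header *) &ev[1];
  ev->mh->size = htons ((uint16_t) size);
  ev->mh->type = htons (type);
  return ev;
}

/* sizeof (*(mvar)) is taken from the pointer's type and is not evaluated
   at run time, so 'mvar' may still be uninitialized here. The first line
   only checks at compile time that the struct has a 'header' field. */
#define SKETCH_MSG_EXTRA(ev, mvar, extra, type) \
  do { \
    (void) sizeof ((mvar)->header); \
    (ev) = sketch_envelope_alloc (sizeof (*(mvar)) + (extra), (type)); \
    (mvar) = (NULL == (ev)) ? NULL : (void *) (ev)->mh; \
  } while (0)

#define SKETCH_MSG(ev, mvar, type) SKETCH_MSG_EXTRA (ev, mvar, 0, type)

/* Example message with a fixed-size body. */
struct sketch_number_message
{
  struct sketch_header header;
  uint32_t number;
};
```
+After @code{SKETCH_MSG (ev, msg, type)}, only the body fields (here
+@code{number}) remain to be filled in by the caller, which mirrors the usage
+shown for @code{GNUNET_MQ_msg} above.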
+@strong{Envelope Properties}@ A few functions in MQ allow setting additional
+properties on envelopes:
+@table @asis
+@item @code{GNUNET_MQ_notify_sent}
+Allows specifying a function that will be called once the envelope's message
+has been sent irrevocably. An envelope can be canceled precisely up to the
+point where the notify sent callback has been called.
+@item @code{GNUNET_MQ_disable_corking}
+No corking will be used when sending the message. Not every queue supports
+this flag; by default, envelopes are sent with corking.
+@end table
+@strong{Sending Envelopes}@ Once an envelope has been constructed, it can be
+queued for sending with @code{GNUNET_MQ_send}.
+Note that in order to avoid memory leaks, an envelope must either be sent (the
+queue will free it) or destroyed explicitly with @code{GNUNET_MQ_discard}.
+@strong{Canceling Envelopes}@ An envelope queued with @code{GNUNET_MQ_send} can
+be canceled with @code{GNUNET_MQ_cancel}. Note that after the notify sent
+callback has been called, canceling a message results in undefined behavior.
+Thus it is unsafe to cancel an envelope that does not have a notify sent
+callback. When canceling an envelope, it is not necessary@ to call
+@code{GNUNET_MQ_discard}, and the envelope can't be sent again.
+@strong{ Implementing Queues }@ @code{TODO}
+@c ***************************************************************************
+@node Service API
+@subsection Service API
+@c %**end of header
+Most GNUnet code lives in the form of services. Services are processes that
+offer an API for other components of the system to build on. Those other
+components can be command-line tools for users, graphical user interfaces or
+other services. Services provide their API using an IPC protocol. For this,
+each service must listen on either a TCP port or a UNIX domain socket; for
+this, the service implementation uses the server API. This use of server is
+exposed directly to the users of the service API. Thus, when using the service
+API, one is usually also using large parts of the server API. The service
+API provides various convenience functions, such as parsing command-line
+arguments and the configuration file, which are not found in the server API.
+The dual to the service/server API is the client API, which can be used to
+access services.
+The most common way to start a service is to use the GNUNET_SERVICE_run
+function from the program's main function. GNUNET_SERVICE_run will then parse
+the command line and configuration files and, based on the options found there,
+start the server. It will then give back control to the main program, passing
+the server and the configuration to the GNUNET_SERVICE_Main callback.
+GNUNET_SERVICE_run will also take care of starting the scheduler loop. If this
+is inappropriate (for example, because the scheduler loop is already running),
+GNUNET_SERVICE_start and related functions provide an alternative to
+GNUNET_SERVICE_run.
+When starting a service, the service_name option is used to determine which
+sections in the configuration file should be used to configure the service. A
+typical value here is the name of the src/ sub-directory, for example
+"statistics". The same string would also be given to GNUNET_CLIENT_connect to
+access the service.
+Once a service has been initialized, the program should use the@
+GNUNET_SERVICE_Main callback to register message handlers using
+GNUNET_SERVER_add_handlers. The service will already have registered a handler
+for the "TEST" message.
+The option bitfield (enum GNUNET_SERVICE_Options) determines how a service
+should behave during shutdown. There are three key strategies:
+@table @asis
+@item instant (@code{GNUNET_SERVICE_OPTION_NONE})
+Upon receiving the shutdown signal from the scheduler, the service immediately
+terminates the server, closing all existing connections with clients.
+@item manual (@code{GNUNET_SERVICE_OPTION_MANUAL_SHUTDOWN})
+The service does nothing by itself during shutdown. The main program will need
+to take the appropriate action by calling @code{GNUNET_SERVER_destroy} or
+@code{GNUNET_SERVICE_stop} (depending on how the service was initialized) to
+terminate the service. This method is used by gnunet-service-arm and rather
+uncommon.
+@item soft (@code{GNUNET_SERVICE_OPTION_SOFT_SHUTDOWN})
+Upon receiving the shutdown signal from the scheduler, the service immediately
+tells the server to stop listening for incoming clients. Requests from normal
+existing clients are still processed and the server/service terminates once
+all normal clients have disconnected. Clients that are not expected to ever
+disconnect (such as clients that monitor performance values) can be marked as
+'monitor' clients using @code{GNUNET_SERVER_client_mark_monitor}. Those
+clients will continue to be processed until all 'normal' clients have
+disconnected. Then, the server will terminate, closing the monitor
+connections. This mode is for example used by 'statistics', allowing existing
+'normal' clients to set (possibly persistent) statistic values before
+terminating.
+@end table
+@c ***************************************************************************
+@node Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
+@subsection Optimizing Memory Consumption of GNUnet's (Multi-) Hash Maps
+@c %**end of header
+A commonly used data structure in GNUnet is a (multi-)hash map. It is most
+often used to map a peer identity to some data structure, but also to map
+arbitrary keys to values (for example to track requests in the distributed hash
+table or in file-sharing). As it is commonly used, the hash map is actually
+sometimes responsible for a large share of GNUnet's overall memory consumption
+(for some processes, 30% is not uncommon). The following text documents some
+API quirks (and their implications for applications) that were recently
+introduced to minimize the footprint of the hash map.
+@c ***************************************************************************
+@menu
+* Analysis::
+* Solution::
+* Migration::
+* Conclusion::
+* Availability::
+@end menu
+@node Analysis
+@subsubsection Analysis
+@c %**end of header
+The main reason for the "excessive" memory consumption by the hash map is that
+GNUnet uses 512-bit cryptographic hash codes --- and the (multi-)hash map also
+uses the same 512-bit 'struct GNUNET_HashCode'. As a result, storing just the
+keys requires 64 bytes of memory for each key. As some applications like to
+keep a large number of entries in the hash map (after all, that's what maps
+are good for), 64 bytes per hash is significant: keeping a pointer to the
+value and having a linked list for collisions consume between 8 and 16 bytes,
+and 'malloc' may add about the same overhead per allocation, putting us in the
+16 to 32 bytes per entry ballpark. Adding a 64-byte key then triples the
+overall memory requirement for the hash map.
+To make things "worse", most of the time storing the key in the hash map is
+not required: it is typically already in memory elsewhere! In most cases, the
+values stored in the hash map are some application-specific struct that
+@emph{also} contains the hash. Here is a simplified example:
+@example
+struct MyValue @{
+  struct GNUNET_HashCode key;
+  unsigned int my_data;
+@};
+// ...
+val = GNUNET_malloc (sizeof (struct MyValue));
+val->key = key;
+val->my_data = 42;
+GNUNET_CONTAINER_multihashmap_put (map, &key, val, ...);
+@end example
+This is a common pattern as later the entries might need to be removed, and at
+that time it is convenient to have the key immediately at hand:
+@example
+GNUNET_CONTAINER_multihashmap_remove (map, &val->key, val);
+@end example
+Note that here we end up with two times 64 bytes for the key, plus maybe 64
+bytes total for the rest of the 'struct MyValue' and the map entry in the hash
+map. The resulting redundant storage of the key increases overall memory
+consumption per entry from the "optimal" 128 bytes to 192 bytes. This is not
+just an extreme example: overheads in practice are actually sometimes close to
+those highlighted in this example. This is especially true for maps with a
+significant number of entries, as there we tend to really try to keep the
+entries small.
+@c ***************************************************************************
+@node Solution
+@subsubsection Solution
+@c %**end of header
+The solution that has now been implemented is to @strong{optionally} allow the
+hash map to not make a (deep) copy of the hash but instead have a pointer to
+the hash/key in the entry. This reduces the memory consumption for the key
+from 64 bytes to 4 to 8 bytes. However, it can also only work if the key is
+actually stored in the entry (which is the case most of the time) and if the
+entry does not modify the key (which in all of the code I'm aware of has been
+always the case if the key is stored in the entry). Finally, when the client
+stores an entry in the hash map, it @strong{must} provide a pointer to the key
+within the entry, not just a pointer to a transient location of the key. If
+the client code does not meet these requirements, the result is a dangling
+pointer and undefined behavior of the (multi-)hash map API.
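+The saving can be made concrete with a small sketch that compares a map entry
+holding a copy of the key with one holding only a pointer to it. The
+@code{sketch_} structs are hypothetical and much simpler than the real
+(multi-)hash map internals; only the 64-byte key size is taken from the text
+above.
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 512-bit hash code (64 bytes). */
struct sketch_hash
{
  uint32_t bits[512 / 8 / sizeof (uint32_t)];
};

/* Entry that keeps its own deep copy of the key. */
struct sketch_big_entry
{
  struct sketch_hash key;
  void *value;
  struct sketch_big_entry *next;
};

/* Entry that only points at a key owned by the value. */
struct sketch_small_entry
{
  const struct sketch_hash *key;
  void *value;
  struct sketch_small_entry *next;
};

/* Application value that already contains the key. */
struct sketch_value
{
  struct sketch_hash key;
  unsigned int my_data;
};

/* The key pointer must refer to memory that lives as long as the entry,
   so it points into the value and never at a transient copy. */
static void
sketch_small_init (struct sketch_small_entry *e, struct sketch_value *val)
{
  e->key = &val->key;
  e->value = val;
  e->next = NULL;
}

/* Lookups compare the hash contents, not the pointers. */
static int
sketch_small_matches (const struct sketch_small_entry *e,
                      const struct sketch_hash *key)
{
  return 0 == memcmp (e->key, key, sizeof (*key));
}
```
+With these definitions the small entry is smaller than the big one by 64 bytes
+minus the size of one pointer.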
+@c ***************************************************************************
+@node Migration
+@subsubsection Migration
+@c %**end of header
+To use the new feature, first check that the values contain the respective key
+(and never modify it). Then, all calls to
+@code{GNUNET_CONTAINER_multihashmap_put} on the respective map must be audited
+and most likely changed to pass a pointer into the value's struct. For the
+initial example, the new code would look like this:
+@example
+struct MyValue @{
+  struct GNUNET_HashCode key;
+  unsigned int my_data;
+@};
+// ...
+val = GNUNET_malloc (sizeof (struct MyValue));
+val->key = key;
+val->my_data = 42;
+GNUNET_CONTAINER_multihashmap_put (map, &val->key, val, ...);
+@end example
+Note that @code{&key} was changed to @code{&val->key} in the argument to the
+@code{put} call. This is critical as often @code{key} is on the stack or in
+some other transient data structure and thus having the hash map keep a pointer
+to @code{key} would not work. Only the key inside of @code{val} has the same
+lifetime as the entry in the map (this must of course be checked as well).
+Naturally, @code{val->key} must be initialized before the @code{put} call. Once
+all @code{put} calls have been converted and double-checked, you can change the
+call to create the hash map from
+@example
+map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_NO);
+@end example
+to
+@example
+map = GNUNET_CONTAINER_multihashmap_create (SIZE, GNUNET_YES);
+@end example
+If everything was done correctly, you now use about 60 bytes less memory per
+entry in @code{map}. However, if now (or in the future) any call to @code{put}
+does not ensure that the given key is valid until the entry is removed from the
+map, undefined behavior is likely to be observed.
+@c ***************************************************************************
+@node Conclusion
+@subsubsection Conclusion
+@c %**end of header
+The new optimization is often applicable and can result in a reduction in
+memory consumption of up to 30% in practice. However, it makes the code less
+robust as additional invariants are imposed on the multi hash map client. Thus
+applications should refrain from enabling the new mode unless the resulting
+performance increase is deemed significant enough. In particular, it should
+generally not be used in new code (wait at least until benchmarks exist).
+@c ***************************************************************************
+@node Availability
+@subsubsection Availability
+@c %**end of header
+The new multi hash map code was committed in SVN 24319 (will be in GNUnet
+0.9.4). Various subsystems (transport, core, dht, file-sharing) were
+previously audited and modified to take advantage of the new capability. In
+particular, memory consumption of the file-sharing service is expected to drop
+by 20-30% due to this change.
+@c ***************************************************************************
+@node The CONTAINER_MDLL API
+@subsection The CONTAINER_MDLL API
+@c %**end of header
+This text documents the GNUNET_CONTAINER_MDLL API. The GNUNET_CONTAINER_MDLL
+API is similar to the GNUNET_CONTAINER_DLL API in that it provides operations
+for the construction and manipulation of doubly-linked lists. The key
+difference to the (simpler) DLL-API is that the MDLL-version allows a single
+element (instance of a "struct") to be in multiple linked lists at the same
+time.
+Like the DLL API, the MDLL API stores (most of) the data structures for the
+doubly-linked list with the respective elements; only the 'head' and 'tail'
+pointers are stored "elsewhere" --- and the application needs to provide the
+locations of head and tail to each of the calls in the MDLL API. The key
+difference for the MDLL API is that the "next" and "previous" pointers in the
+struct can no longer be simply called "next" and "prev" --- after all, the
+element may be in multiple doubly-linked lists, so we cannot just have one
+"next" and one "prev" pointer!
+The solution is to have multiple fields that must have a name of the format
+"next_XX" and "prev_XX" where "XX" is the name of one of the doubly-linked
+lists. Here is a simple example:
+@example
+struct MyMultiListElement @{
+  struct MyMultiListElement *next_ALIST;
+  struct MyMultiListElement *prev_ALIST;
+  struct MyMultiListElement *next_BLIST;
+  struct MyMultiListElement *prev_BLIST;
+  void *data;
+@};
+@end example
+Note that by convention, we use all-uppercase letters for the list names. In
+addition, the program needs to have a location for the head and tail pointers
+for both lists, for example:
+@example
+static struct MyMultiListElement *head_ALIST;
+static struct MyMultiListElement *tail_ALIST;
+static struct MyMultiListElement *head_BLIST;
+static struct MyMultiListElement *tail_BLIST;
+@end example
+Using the MDLL-macros, we can now insert an element into the ALIST:
+@example
+GNUNET_CONTAINER_MDLL_insert (ALIST, head_ALIST, tail_ALIST, element);
+@end example
+Passing "ALIST" as the first argument to MDLL specifies which of the next/prev
+fields in the 'struct MyMultiListElement' should be used. The extra "ALIST"
+argument and the "_ALIST" in the names of the next/prev-members are the only
+differences between the MDLL and DLL-API. Like the DLL-API, the MDLL-API offers
+functions for inserting (at head, at tail, after a given element) and removing
+elements from the list. Iterating over the list should be done by directly
+accessing the "next_XX" and/or "prev_XX" members.
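+The field-selection mechanism described above can be sketched with the C
+preprocessor's token-pasting operator. The following is a minimal standalone
+illustration, not the definition shipped with GNUnet: the names
+@code{SKETCH_MDLL_insert}, @code{SKETCH_MDLL_remove}, @code{struct Elem} and
+@code{count_alist} are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element that can be a member of two lists at once. */
struct Elem
{
  struct Elem *next_ALIST;
  struct Elem *prev_ALIST;
  struct Elem *next_BLIST;
  struct Elem *prev_BLIST;
  int value;
};

/* Insert 'element' at the head of list 'mdll'.  The ## operator pastes the
   list name onto "next_" / "prev_" to select the matching pair of fields. */
#define SKETCH_MDLL_insert(mdll, head, tail, element) \
  do { \
    assert (NULL == (element)->prev_ ## mdll); \
    assert (NULL == (element)->next_ ## mdll); \
    (element)->next_ ## mdll = (head); \
    (element)->prev_ ## mdll = NULL; \
    if (NULL == (tail)) \
      (tail) = (element); \
    else \
      (head)->prev_ ## mdll = (element); \
    (head) = (element); \
  } while (0)

/* Unlink 'element' from list 'mdll' and clear its link fields for that list.
   The fields belonging to other lists are left untouched. */
#define SKETCH_MDLL_remove(mdll, head, tail, element) \
  do { \
    if (NULL == (element)->prev_ ## mdll) \
      (head) = (element)->next_ ## mdll; \
    else \
      (element)->prev_ ## mdll->next_ ## mdll = (element)->next_ ## mdll; \
    if (NULL == (element)->next_ ## mdll) \
      (tail) = (element)->prev_ ## mdll; \
    else \
      (element)->next_ ## mdll->prev_ ## mdll = (element)->prev_ ## mdll; \
    (element)->next_ ## mdll = NULL; \
    (element)->prev_ ## mdll = NULL; \
  } while (0)

/* Iterate over the ALIST by following the next_ALIST members directly. */
int
count_alist (const struct Elem *head)
{
  int n = 0;

  for (const struct Elem *pos = head; NULL != pos; pos = pos->next_ALIST)
    n++;
  return n;
}
```

+Because each list uses its own pair of fields, removing an element from the
+ALIST in this sketch leaves its membership in the BLIST intact.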
+@c ***************************************************************************
+@node The Automatic Restart Manager (ARM)
+@section The Automatic Restart Manager (ARM)
+@c %**end of header
+GNUnet's Automatic Restart Manager (ARM) is the GNUnet service responsible for
+system initialization and service babysitting. ARM starts and halts services,
+detects configuration changes and restarts services impacted by the changes as
+needed. It is also responsible for restarting services after crashes, and it is
+planned to incorporate automatic debugging that diagnoses service crashes and
+gives developers insight into their causes. The purpose of this document is to
+give GNUnet developers an idea of how ARM works and how to interact with it.
+@menu
+* Basic functionality::
+* Key configuration options::
+* Availability2::
+* Reliability::
+@end menu
+@c ***************************************************************************
+@node Basic functionality
+@subsection Basic functionality
+@c %**end of header
+@itemize @bullet
+@item ARM source code can be found under "src/arm". Service processes are
+managed by the functions in "gnunet-service-arm.c", which is controlled with
+"gnunet-arm.c" (the main function in that file is ARM's entry point).
+@item The functions responsible for communicating with ARM and for starting and
+stopping services (including the ARM service itself) are provided by the ARM
+API in "arm_api.c". The function GNUNET_ARM_connect() returns an ARM handle to
+the caller after binding it to the caller's context (the configuration and
+scheduler in use). The caller can then use this handle to communicate with ARM.
+The functions GNUNET_ARM_start_service() and GNUNET_ARM_stop_service() are used
+for starting and stopping services, respectively.
+@item A typical example of using these basic ARM services can be found in the
+file test_arm_api.c. The test case connects to ARM, starts it, then uses it to
+start the service "resolver", stops "resolver" and finally stops ARM itself.
+@end itemize
+@c ***************************************************************************
+@node Key configuration options
+@subsection Key configuration options
+@c %**end of header
+The configuration for ARM and the services should be available in a .conf file
+(as an example, see test_arm_api_data.conf). When running ARM, the configuration
+file to use should be passed to the command:
+@code{gnunet-arm -s -c configuration_to_use.conf}.
+If no configuration is passed, the default configuration file will be used (see
+GNUNET_PREFIX/share/gnunet/defaults.conf, which is created from
+contrib/defaults.conf). Each service has its own section, which starts with the
+service name in square brackets, for example "[arm]". The following options
+configure how ARM configures or interacts with the various services:
+@table @asis
+@item PORT
+Port number on which the service is listening for incoming TCP connections. ARM
+will start the service should it notice a request at this port.
+@item HOSTNAME
+Specifies on which host the service is deployed. Note that ARM can only start
+services that are running on the local system (but will not check that the
+hostname matches the local machine name). This option is used by the
+@code{gnunet_client_lib.h} implementation to determine which system to connect
+to. The default is "localhost".
+@item BINARY
+The name of the service binary file.
+@item OPTIONS
+Options to be passed to the service.
+@item PREFIX
+A command to prepend to the actual command, for example to run a service under
+"valgrind" or "gdb".
+@item DEBUG
+Run in debug mode (very verbose).
+@item AUTOSTART
+ARM will listen on the UNIX domain socket and/or TCP port of the service and
+start the service on demand.
+@item FORCESTART
+ARM will always start this service when the peer is started.
+@item ACCEPT_FROM
+IPv4 addresses the service accepts connections from.
+@item ACCEPT_FROM6
+IPv6 addresses the service accepts connections from.
+@end table
+Options that impact the operation of ARM overall are in the "[arm]" section.
+ARM is a normal service and has (except for AUTOSTART) all of the options that
+other services do. In addition, ARM has the following options:
+@table @asis
+@item GLOBAL_PREFIX
+Command to be prepended to all services that are going to run.
+@item GLOBAL_POSTFIX
+Global option that will be supplied to all the services that are going to run.
+@end table
+@c ***************************************************************************
+@node Availability2
+@subsection Availability2
+@c %**end of header
+As mentioned before, one of the features provided by ARM is starting services
+on demand. Consider the example of a service "client" that wants to connect to
+another service, the "server". The "client" asks ARM to run the "server". ARM
+starts the "server", which begins listening for incoming connections. The
+"client" then establishes a connection with the "server" and the two start to
+communicate. One problem with this scheme is that it is slow: the "client"
+wants to communicate with the "server" at once and is not willing to wait for
+it to be started and listening for incoming connections before its request is
+served. One solution would be for ARM to start all services as default
+services. That would solve the problem, but it is not practical, because some
+of the services started this way may never be used, or may only be used after a
+relatively long time. The approach followed by ARM to solve this problem is as
+follows:
+@itemize @bullet
+@item For each service that has a PORT field in the configuration file and that
+is not one of the default services (a service that accepts incoming connections
+from clients), ARM creates listening sockets for all addresses associated with
+that service.
+@item The "client" will immediately establish a connection with the "server".
+@item ARM, pretending to be the "server", will listen on the respective port
+and notice the incoming connection from the "client" (but not accept it).
+@item Instead, once there is an incoming connection, ARM will start the
+"server", passing on the listen sockets (now the service is started and can do
+its work).
+@item Other client services can now connect directly to the "server".
+@end itemize
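+The on-demand start sequence above can be sketched with plain POSIX sockets.
+This is a simplified single-process illustration and not ARM's code: the
+function names are hypothetical, the "manager" and the "service" are ordinary
+functions instead of separate processes, and handing over the listen socket is
+modelled by passing the same file descriptor to the service function.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Manager side: bind a listening TCP socket on the loopback interface.
   Port 0 lets the kernel pick a free port, which is reported via 'port'. */
int
manager_listen (unsigned short *port)
{
  struct sockaddr_in addr;
  socklen_t len = sizeof (addr);
  int fd = socket (AF_INET, SOCK_STREAM, 0);

  assert (NULL != port);
  if (fd < 0)
    return -1;
  memset (&addr, 0, sizeof (addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  if ( (0 != bind (fd, (struct sockaddr *) &addr, sizeof (addr))) ||
       (0 != listen (fd, 5)) ||
       (0 != getsockname (fd, (struct sockaddr *) &addr, &len)) )
  {
    close (fd);
    return -1;
  }
  *port = ntohs (addr.sin_port);
  return fd;
}

/* Client side: connect immediately and send a request, even though no
   service has accepted the connection yet. */
int
client_connect (unsigned short port, const char *request)
{
  struct sockaddr_in addr;
  size_t len = strlen (request);
  int fd = socket (AF_INET, SOCK_STREAM, 0);

  if (fd < 0)
    return -1;
  memset (&addr, 0, sizeof (addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  addr.sin_port = htons (port);
  if ( (0 != connect (fd, (struct sockaddr *) &addr, sizeof (addr))) ||
       ((ssize_t) len != send (fd, request, len, 0)) )
  {
    close (fd);
    return -1;
  }
  return fd;
}

/* Manager side: notice a pending connection WITHOUT accepting it.
   Returns 1 if a connection is waiting, 0 otherwise. */
int
manager_has_pending (int listen_fd, int timeout_ms)
{
  struct pollfd p = { .fd = listen_fd, .events = POLLIN };

  return (1 == poll (&p, 1, timeout_ms)) && (0 != (p.revents & POLLIN));
}

/* Service side: accept on the socket handed over by the manager and read
   the request.  Returns the number of bytes read, or -1 on error. */
int
service_serve (int listen_fd, char *buf, size_t size)
{
  ssize_t n;
  int c = accept (listen_fd, NULL, NULL);

  if (c < 0)
    return -1;
  n = recv (c, buf, size - 1, 0);
  close (c);
  if (n < 0)
    return -1;
  buf[n] = '\0';
  return (int) n;
}
```

+The point of the sketch is that the client's connection is queued by the
+kernel as soon as the manager listens, so the client does not have to wait for
+the service to start before connecting.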
+@c ***************************************************************************
+@node Reliability
+@subsection Reliability
+One of the features provided by ARM is the automatic restart of crashed
+services. ARM needs to know which of the running services died. The function
+"gnunet-service-arm.c/maint_child_death()" is responsible for that. It is
+scheduled to run upon receiving a SIGCHLD signal. The function then iterates
+over ARM's list of running services and determines which services have died
+(crashed). ARM restarts all crashed services. Now consider the case of a
+service with a serious problem that causes it to crash each time it is started
+by ARM. If ARM kept blindly restarting such a service, we would get the
+pattern start-crash-restart-crash-restart-crash and so forth, which is of
+course not practical. For that reason, ARM schedules the service to be
+restarted after a delay that grows exponentially with each crash/restart of
+that service. To clarify the idea, consider the following example:
+@itemize @bullet
+@item Service S crashed.
+@item ARM receives the SIGCHLD and inspects its list of services to find the
+dead one(s).
+@item ARM finds S dead and schedules it for restarting after its "backoff"
+time, which is initially set to 1ms. ARM then doubles the backoff time for S
+(now backoff(S) = 2ms).
+@item Because there is a severe problem with S, it crashes again.
+@item Again ARM receives the SIGCHLD and detects that it is S that has crashed.
+ARM schedules it for restarting, this time after its new backoff time (2ms),
+and doubles the backoff time again (now backoff(S) = 4ms).
+@item And so on, until backoff(S) reaches a certain threshold
+(EXPONENTIAL_BACKOFF_THRESHOLD is set to half an hour). After reaching it,
+backoff(S) remains at half an hour, so ARM does not spend much time trying to
+restart a problematic service.
+@end itemize
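+The backoff bookkeeping from this example can be sketched as follows. This is
+a minimal illustration under stated assumptions, not the code in
+"gnunet-service-arm.c": @code{struct ServiceState}, @code{service_init},
+@code{service_crashed} and @code{SKETCH_BACKOFF_THRESHOLD_MS} are hypothetical
+names, and time is simply counted in milliseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Cap on the restart delay: half an hour, expressed in milliseconds. */
#define SKETCH_BACKOFF_THRESHOLD_MS (30ULL * 60ULL * 1000ULL)

/* Hypothetical per-service record; only the backoff state is modelled. */
struct ServiceState
{
  uint64_t backoff_ms;   /* delay to apply before the next restart */
  unsigned int crashes;  /* number of crashes observed so far */
};

/* A freshly configured service starts with a backoff of 1 ms. */
void
service_init (struct ServiceState *s)
{
  s->backoff_ms = 1;
  s->crashes = 0;
}

/* Called when the service is found dead.  Returns the delay to wait before
   restarting it, then doubles the stored backoff, capped at the threshold. */
uint64_t
service_crashed (struct ServiceState *s)
{
  uint64_t delay = s->backoff_ms;

  assert (delay >= 1);
  s->crashes++;
  if (s->backoff_ms >= SKETCH_BACKOFF_THRESHOLD_MS / 2)
    s->backoff_ms = SKETCH_BACKOFF_THRESHOLD_MS;
  else
    s->backoff_ms *= 2;
  return delay;
}
```

+Successive calls to @code{service_crashed} yield delays of 1ms, 2ms, 4ms and
+so on, until the delay settles at the half-hour cap.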
+@c ***************************************************************************
+@node GNUnet's TRANSPORT Subsystem
+@section GNUnet's TRANSPORT Subsystem
+@c %**end of header
+This chapter documents how the GNUnet transport subsystem works. The GNUnet
+transport subsystem consists of three main components: the transport API (the
+interface used by the rest of the system to access the transport service), the
+transport service itself (most of the interesting functions, such as choosing
+transports, happen here) and the transport plugins. A transport plugin is a
+concrete implementation for how two GNUnet peers communicate; many plugins
+exist, for example for communication via TCP, UDP, HTTP, HTTPS and others.
+Finally, the transport subsystem uses supporting code, especially the NAT/UPnP
+library to help with tasks such as NAT traversal.
+Key tasks of the transport service include:
+@itemize @bullet
+@item Create our HELLO message, notify clients and neighbours if our HELLO
+changes (using NAT library as necessary)
+@item Validate HELLOs from other peers (send PING), allow other peers to
+validate our HELLO's addresses (send PONG)
+@item Upon request, establish connections to other peers (using address
+selection from ATS subsystem) and maintain them (again using PINGs and PONGs)
+as long as desired
+@item Accept incoming connections, give ATS service the opportunity to switch
+communication channels
+@item Notify clients about peers that have connected to us or that have been
+disconnected from us
+@item If a (stateful) connection goes down unexpectedly (without explicit
+DISCONNECT), quickly attempt to recover (without notifying clients) but do
+notify clients quickly if reconnecting fails
+@item Send (payload) messages arriving from clients to other peers via
+transport plugins and receive messages from other peers, forwarding those to
+clients
+@item Enforce inbound traffic limits (using flow-control if it is applicable);
+outbound traffic limits are enforced by CORE, not by us (!)
+@item Enforce restrictions on P2P connection as specified by the blacklist
+configuration and blacklisting clients
+@end itemize
+Note that the term "clients" in the list above really refers to the GNUnet-CORE
+service, as CORE is typically the only client of the transport service.
+@menu
+* Address validation protocol::
+@end menu
+@node Address validation protocol
+@subsection Address validation protocol
+@c %**end of header
+This section documents how the GNUnet transport service validates connections
+with other peers. It is a high-level description of the protocol necessary to
+understand the details of the implementation. It should be noted that when we
+talk about PING and PONG messages in this section, we refer to transport-level
+PING and PONG messages, which are different from core-level PING and PONG
+messages (both in implementation and function).
+The goal of transport-level address validation is to minimize the chances of a
+successful man-in-the-middle attack against GNUnet peers on the transport
+level. Such an attack would not allow the adversary to decrypt the P2P
+transmissions, but a successful attacker could at least measure traffic volumes
+and latencies (raising the adversary's capabilities to those of a global
+passive adversary in the worst case). The scenario we are concerned about is an
+attacker, Mallory, giving a HELLO to Alice that claims to be for Bob, but
+contains Mallory's IP address instead of Bob's (for some transport). Mallory
+would then forward the traffic to Bob (by initiating a connection to Bob and
+claiming to be Alice). As a further complication, the scheme has to work even
+if, say, Alice is behind a NAT without traversal support and hence has no
+address of her own (and thus Alice must always initiate the connection to Bob).
+An additional constraint is that HELLO messages do not contain a cryptographic
+signature since other peers must be able to edit (i.e. remove) addresses from
+the HELLO at any time (this was not true in GNUnet 0.8.x). A basic
+@strong{assumption} is that each peer knows the set of possible network
+addresses that it @strong{might} be reachable under (so for example, the
+external IP address of the NAT plus the LAN address(es) with the respective
+ports).
+The solution is the following. If Alice wants to validate that a given address
+for Bob is valid (i.e. that the connection is actually established
+@strong{directly} with the intended target), she sends a PING message over that
+connection to Bob. Note that in this case, Alice initiated the connection so
+only she knows for sure which address was used (Alice may be behind NAT, so
+whatever address Bob sees may not be an address Alice knows she has). Bob checks
+that the address
+given in the PING is actually one of his addresses (does not belong to
+Mallory), and if it is, sends back a PONG (with a signature that says that Bob
+owns/uses the address from the PING). Alice checks the signature and is happy
+if it is valid and the address in the PONG is the address she used. This is
+similar to the 0.8.x protocol where the HELLO contained a signature from Bob
+for each address used by Bob. Here, the purpose code for the signature is
+@code{GNUNET_SIGNATURE_PURPOSE_TRANSPORT_PONG_OWN}. After this, Alice will
+remember Bob's address and consider the address valid for a while (12h in the
+current implementation). Note that after this exchange, Alice only considers
+Bob's address to be valid, the connection itself is not considered
+'established'. In particular, Alice may have many addresses for Bob that she
+considers valid.
+The PONG message is protected with a nonce/challenge against replay attacks
+and uses an expiration time for the signature (but those are almost
+implementation details).
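+The checks performed on both sides of the exchange can be sketched as follows.
+This is only an illustration of the logic described above, not the wire format
+or the code of the transport service: the struct layouts, the function names
+and the address strings are hypothetical, and the cryptographic signature is
+replaced by a plain boolean flag so that the example stays self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ADDR_LEN 64

struct Ping
{
  uint32_t challenge;      /* nonce chosen by the sender of the PING */
  char address[ADDR_LEN];  /* address the sender used to reach the target */
};

struct Pong
{
  uint32_t challenge;      /* nonce echoed from the PING */
  char address[ADDR_LEN];  /* address the target confirms as its own */
  uint64_t expires;        /* absolute time at which the signature expires */
  bool signed_by_target;   /* stand-in for a real signature (assumption) */
};

/* Target side (Bob): answer only if the address named in the PING is one of
   our own addresses.  Returns true and fills 'pong' if a PONG is sent. */
bool
target_answer_ping (const char *const *own_addrs, size_t n_addrs,
                    const struct Ping *ping, uint64_t now,
                    uint64_t validity, struct Pong *pong)
{
  assert (NULL != pong);
  for (size_t i = 0; i < n_addrs; i++)
  {
    if (0 != strncmp (own_addrs[i], ping->address, ADDR_LEN))
      continue;
    pong->challenge = ping->challenge;
    memcpy (pong->address, ping->address, ADDR_LEN);
    pong->expires = now + validity;
    pong->signed_by_target = true;
    return true;
  }
  return false;
}

/* Sender side (Alice): accept the PONG only if the signature is valid and
   unexpired, the nonce matches, and the address is the one she used. */
bool
sender_accept_pong (const struct Ping *sent, const struct Pong *pong,
                    uint64_t now)
{
  return pong->signed_by_target &&
         (pong->challenge == sent->challenge) &&
         (0 == strncmp (pong->address, sent->address, ADDR_LEN)) &&
         (now < pong->expires);
}
```

+In this sketch a PING naming an address that Bob does not own is never
+answered, and a PONG replayed against a different challenge or after its
+expiration time is rejected.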
+@node NAT library
+@section NAT library
+@c %**end of header
+The goal of the GNUnet NAT library is to provide a general-purpose API for NAT
+traversal @strong{without} third-party support. So protocols that involve
+contacting a third peer to help establish a connection between two peers are
+outside of the scope of this API. That does not mean that GNUnet doesn't
+support involving a third peer (we can do this with the distance-vector
+transport or using application-level protocols), it just means that the NAT API
+is not concerned with this possibility. The API is written so that it will work
+for IPv6-NAT in the future as well as current IPv4-NAT. Furthermore, the NAT
+API is always used, even for peers that are not behind NAT --- in that case,
+the mapping provided is simply the identity.
+NAT traversal is initiated by calling @code{GNUNET_NAT_register}. Given a set
+of addresses that the peer has locally bound to (TCP or UDP), the NAT library
+will return (via callback) a (possibly longer) list of addresses the peer
+@strong{might} be reachable under. Internally, depending on the configuration,
+the NAT library will try to punch a hole (using UPnP) or just "know" that the
+NAT was manually punched and generate the respective external IP address (the
+one that should be globally visible) based on the given information.
+The NAT library also supports ICMP-based NAT traversal. Here, the other peer
+can request connection-reversal by this peer (in this special case, the peer is
+even allowed to configure a port number of zero). If the NAT library detects a
+connection-reversal request, it returns the respective target address to the
+client as well. It should be noted that connection-reversal is currently only
+intended for TCP, so other plugins @strong{must} pass @code{NULL} for the
+reversal callback. Naturally, the NAT library also supports requesting
+connection reversal from a remote peer (@code{GNUNET_NAT_run_client}).
+Once initialized, the NAT handle can be used to test if a given address is
+possibly a valid address for this peer (@code{GNUNET_NAT_test_address}). This
+is used for validating our addresses when generating PONGs.
+Finally, the NAT library contains an API to test if our NAT configuration is
+correct. Using @code{GNUNET_NAT_test_start} @strong{before} binding to the
+respective port, the NAT library can be used to test if the configuration
+works. The test function acts as a local client, initializes the NAT traversal
+and then contacts a @code{gnunet-nat-server} (running by default on
+@code{gnunet.org}) and asks for a connection to be established. This way, it is
+easy to test if the current NAT configuration is valid.
+@node Distance-Vector plugin
+@section Distance-Vector plugin
+@c %**end of header
+The Distance Vector (DV) transport is a transport mechanism that allows peers
+to act as relays for each other, thereby connecting peers that would otherwise
+be unable to connect. This gives a larger connection set to applications that
+may work better with more peers to choose from (for example, File Sharing
+and/or DHT).
+The Distance Vector transport essentially has two functions. The first is
+"gossiping" connection information about more distant peers to directly
+connected peers. The second is taking messages intended for non-directly
+connected peers and encapsulating them in a DV wrapper that contains the
+required information for routing the message through forwarding peers. Via
+gossiping, optimal routes through the known DV neighborhood are discovered and
+utilized and the message encapsulation provides some benefits in addition to
+simply getting the message from the correct source to the proper destination.
+The gossiping function of DV provides an up-to-date routing table of peers that
+are available up to some number of hops. We call this a fisheye view of the
+network (like a fish, nearby objects are known while more distant ones are
+unknown). Gossip messages are sent only to directly connected peers, but they
+are sent about other known peers within the "fisheye distance". Whenever two
+peers connect, they immediately gossip to each other about their appropriate
+other neighbors. They also gossip about the newly connected peer to previously
+connected neighbors. In order to keep the routing tables up to date, disconnect
+notifications are propagated as gossip as well (because disconnects may not be
+sent/received, timeouts are also used to remove stale routing table entries).
+Routing of messages via DV is straightforward. When the DV transport is
+notified of a message destined for a non-direct neighbor, the appropriate
+forwarding peer is selected, and the base message is encapsulated in a DV
+message which contains information about the initial peer and the intended
+recipient. At each forwarding hop, the initial peer is validated (the
+forwarding peer ensures that it has the initial peer in its neighborhood,
+otherwise the message is dropped). Next the base message is re-encapsulated in
+a new DV message for the next hop in the forwarding chain (or delivered to the
+current peer, if it has arrived at the destination).
+Assume a three peer network with peers Alice, Bob and Carol. Assume that Alice
+<-> Bob and Bob <-> Carol are direct (e.g. over TCP or UDP transports)
+connections, but that Alice cannot directly connect to Carol. This may be the
+case due to NAT or firewall restrictions, or perhaps based on one of the peers'
+respective configurations. If the Distance Vector transport is enabled on all
+three peers, it will automatically discover (from the gossip protocol) that
+Alice and Carol can connect via Bob and provide a "virtual" Alice <-> Carol
+connection. Routing between Alice and Carol happens as follows: Alice creates a
+message destined for Carol and notifies the DV transport about it. The DV
+transport at Alice looks up Carol in the routing table and finds that the
+message must be sent through Bob for Carol. The message is encapsulated setting
+Alice as the initiator and Carol as the destination and sent to Bob. Bob
+receives the message, verifies both Alice and Carol are known to Bob, and
+re-wraps the message in a new DV message for Carol. The DV transport at Carol
+receives this message, unwraps the original message, and delivers it to Carol
+as though it came directly from Alice.
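+The Alice, Bob and Carol walk-through can be sketched as a small routing-table
+simulation. This is an illustration only and not the DV plugin's data
+structures: all type and function names are hypothetical, the routing tables
+are filled in by hand instead of by gossip, and re-encapsulation at each hop is
+reduced to handing the same message struct to the next peer.

```c
#include <assert.h>
#include <stddef.h>

enum { ALICE, BOB, CAROL, PEER_COUNT };
#define NO_ROUTE (-1)

struct DvMessage
{
  int sender;           /* initial peer */
  int target;           /* intended recipient */
  const char *payload;  /* the encapsulated base message */
};

struct Peer
{
  int id;
  int next_hop[PEER_COUNT];  /* neighbour to use per target, or NO_ROUTE */
  const char *delivered;     /* last payload delivered locally */
  int delivered_from;        /* initial peer of that payload */
};

/* Process a message at one peer.  Returns this peer's own id if the message
   was delivered here, the id of the next hop if it is forwarded, or NO_ROUTE
   if it is dropped (unknown initial peer or unknown target). */
int
dv_handle (struct Peer *self, const struct DvMessage *msg)
{
  assert ((msg->sender >= 0) && (msg->sender < PEER_COUNT));
  assert ((msg->target >= 0) && (msg->target < PEER_COUNT));
  if (msg->target == self->id)
  {
    self->delivered = msg->payload;
    self->delivered_from = msg->sender;
    return self->id;
  }
  /* A forwarding peer validates that it knows the initial peer. */
  if ( (msg->sender != self->id) &&
       (NO_ROUTE == self->next_hop[msg->sender]) )
    return NO_ROUTE;
  return self->next_hop[msg->target];
}

/* Carry a message hop by hop from its initial peer to its target.
   Returns the number of hops taken, or -1 if the message was dropped. */
int
dv_route (struct Peer peers[PEER_COUNT], const struct DvMessage *msg)
{
  int at = msg->sender;

  for (int hops = 0; hops <= PEER_COUNT; hops++)
  {
    int next = dv_handle (&peers[at], msg);

    if (NO_ROUTE == next)
      return -1;
    if (next == at)
      return hops;
    at = next;
  }
  return -1; /* guard against routing loops */
}

/* Routing tables for the three-peer example: Alice <-> Bob <-> Carol. */
void
dv_setup_example (struct Peer peers[PEER_COUNT])
{
  for (int p = 0; p < PEER_COUNT; p++)
  {
    peers[p].id = p;
    peers[p].delivered = NULL;
    peers[p].delivered_from = NO_ROUTE;
    for (int t = 0; t < PEER_COUNT; t++)
      peers[p].next_hop[t] = NO_ROUTE;
  }
  peers[ALICE].next_hop[BOB] = BOB;
  peers[ALICE].next_hop[CAROL] = BOB;
  peers[BOB].next_hop[ALICE] = ALICE;
  peers[BOB].next_hop[CAROL] = CAROL;
  peers[CAROL].next_hop[BOB] = BOB;
  peers[CAROL].next_hop[ALICE] = BOB;
}
```

+With these tables a message from Alice to Carol takes two hops via Bob. If Bob
+does not have Alice in his neighborhood, the message is dropped at Bob.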
+@node SMTP plugin
+@section SMTP plugin
+@c %**end of header
+This page describes the new SMTP transport plugin for GNUnet as it exists in
+the 0.7.x and 0.8.x branch. SMTP support is currently not available in GNUnet
+0.9.x. This page also describes the transport layer abstraction (as it existed
+in 0.7.x and 0.8.x) in more detail and gives some benchmarking results. The
+performance results presented are quite old and may be outdated at this point.
+@itemize @bullet
+@item Why use SMTP for a peer-to-peer transport?
+@item How does it work?
+@item How do I configure my peer?
+@item How do I test if it works?
+@item How fast is it?
+@item Is there any additional documentation?
+@end itemize
+@menu
+* Why use SMTP for a peer-to-peer transport?::
+* How does it work?::
+* How do I configure my peer?::
+* How do I test if it works?::
+* How fast is it?::
+@end menu
+@node Why use SMTP for a peer-to-peer transport?
+@subsection Why use SMTP for a peer-to-peer transport?
+@c %**end of header
+There are many reasons why one would not want to use SMTP:
+@itemize @bullet
+@item SMTP uses more bandwidth than TCP, UDP or HTTP.
+@item SMTP has a much higher latency.
+@item SMTP requires significantly more computation (encoding and decoding time)
+for the peers.
+@item SMTP is significantly more complicated to configure.
+@item SMTP may be abused by tricking GNUnet into sending mail to
+non-participating third parties.
+@end itemize
+So why would anybody want to use SMTP?
+@itemize @bullet
+@item SMTP can be used to contact peers behind NAT boxes (in virtual private
+networks).
+@item SMTP can be used to circumvent policies that limit or prohibit
+peer-to-peer traffic by masquerading as "legitimate" traffic.
+@item SMTP uses E-mail addresses which are independent of a specific IP, which
+can be useful to address peers that use dynamic IP addresses.
+@item SMTP can be used to initiate a connection (e.g. initial address exchange)
+and peers can then negotiate the use of a more efficient protocol (e.g. TCP)
+for the actual communication.
+@end itemize
+In summary, SMTP can for example be used to send a message to a peer behind a
+NAT box that has a dynamic IP to tell the peer to establish a TCP connection
+to a peer outside of the private network. Even an extraordinary overhead for
+this first message would be irrelevant in this type of situation.
+@node How does it work?
+@subsection How does it work?
+@c %**end of header
+When a GNUnet peer needs to send a message to another GNUnet peer that has
+advertised (only) an SMTP transport address, GNUnet base64-encodes the message
+and sends it in an E-mail to the advertised address. The advertisement
+contains a filter which is placed in the E-mail header, such that the
+receiving host can filter the tagged E-mails and forward them to the GNUnet peer
+process. The filter can be specified individually by each peer and be changed
+over time. This makes it impossible to censor GNUnet E-mail messages by
+searching for a generic filter.
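+The encoding step can be sketched as follows. This is a simplified
+illustration and not the plugin's implementation: @code{base64_encode} and
+@code{compose_mail} are hypothetical helpers, the mail consists of just a
+@code{To:} line, the filter line and the encoded body, and details such as
+line wrapping of the body or handing the mail to an SMTP server are left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static const char b64_table[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Standard base64 encoding with '=' padding.  Returns the length of the
   encoded string (excluding the terminating NUL), or -1 if 'out' is too
   small. */
int
base64_encode (const unsigned char *in, size_t len,
               char *out, size_t out_size)
{
  size_t needed = 4 * ((len + 2) / 3) + 1;
  size_t o = 0;

  if (out_size < needed)
    return -1;
  for (size_t i = 0; i < len; i += 3)
  {
    /* Pack up to three input bytes into a 24-bit group. */
    unsigned long v = (unsigned long) in[i] << 16;

    if (i + 1 < len)
      v |= (unsigned long) in[i + 1] << 8;
    if (i + 2 < len)
      v |= (unsigned long) in[i + 2];
    out[o++] = b64_table[(v >> 18) & 63];
    out[o++] = b64_table[(v >> 12) & 63];
    out[o++] = (i + 1 < len) ? b64_table[(v >> 6) & 63] : '=';
    out[o++] = (i + 2 < len) ? b64_table[v & 63] : '=';
  }
  out[o] = '\0';
  return (int) o;
}

/* Build a minimal mail: recipient, the peer's filter as a header line, an
   empty line, then the base64-encoded message.  Returns the length of the
   mail, or -1 if it does not fit into the buffers used here. */
int
compose_mail (const char *to, const char *filter,
              const unsigned char *msg, size_t msg_len,
              char *out, size_t out_size)
{
  char body[1024];
  int n;

  /* The filter becomes a header line, so it must not contain a newline. */
  assert (NULL == strchr (filter, '\n'));
  if (base64_encode (msg, msg_len, body, sizeof (body)) < 0)
    return -1;
  n = snprintf (out, out_size, "To: %s\r\n%s\r\n\r\n%s\r\n",
                to, filter, body);
  if ( (n < 0) || ((size_t) n >= out_size) )
    return -1;
  return n;
}
```

+The receiving host only needs to match the filter line to recognize the mail
+as GNUnet traffic and can then decode the body back into the original message.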
+@node How do I configure my peer?
+@subsection How do I configure my peer?
+@c %**end of header
+First, you need to configure @code{procmail} to filter your inbound E-mail for
+GNUnet traffic. The GNUnet messages must be delivered into a pipe, for example
+@code{/tmp/gnunet.smtp}. You also need to define a filter that is used by
+procmail to detect GNUnet messages. You are free to choose whichever filter
+you like, but you should make sure that it does not occur in your other
+E-mail. In our example, we will use @code{X-mailer: GNUnet}. The
+@code{~/.procmailrc} configuration file then looks like this:
+@example
+:0:
+* ^X-mailer: GNUnet
+/tmp/gnunet.smtp
+# where do you want your other e-mail delivered to (default: /var/spool/mail/)
+:0:
+/var/spool/mail/
+@end example
+After adding this file, first make sure that your regular E-mail still works
+(e.g. by sending an E-mail to yourself). Then edit the GNUnet configuration.
+In the section @code{SMTP} you need to specify your E-mail address under
+@code{EMAIL}, your mail server (for outgoing mail) under @code{SERVER}, the
+filter (X-mailer: GNUnet in the example) under @code{FILTER} and the name of
+the pipe under @code{PIPE}. The completed section could then look like this:
+@example
+EMAIL = me@@mail.gnu.org
+MTU = 65000
+SERVER = mail.gnu.org:25
+FILTER = "X-mailer: GNUnet"
+PIPE = /tmp/gnunet.smtp
+@end example
+Finally, you need to add @code{smtp} to the list of @code{TRANSPORTS} in the
+@code{GNUNETD} section. GNUnet peers will use the E-mail address that you
+specified to contact your peer until the advertisement times out. Thus, if you
+are not sure if everything works properly or if you are not planning to be
+online for a long time, you may want to configure this timeout to be short,
+e.g. just one hour. For this, set @code{HELLOEXPIRES} to @code{1} in the
+@code{GNUNETD} section.
+This should be it, but you will probably want to test it first.
+@node How do I test if it works?
+@subsection How do I test if it works?
+@c %**end of header
+Any transport can be subjected to some rudimentary tests using the
+@code{gnunet-transport-check} tool. The tool sends a message to the local node
+via the transport and checks that a valid message is received. While this test
+does not involve other peers and cannot check if firewalls or other network
+obstacles prohibit proper operation, this is a great test case for the SMTP
+transport since it tests nearly all of the functionality.
+@code{gnunet-transport-check} should only be used without running
+@code{gnunetd} at the same time. By default, @code{gnunet-transport-check}
+tests all transports that are specified in the configuration file. But you can
+specifically test SMTP by giving the option @code{--transport=smtp}.
+Note that this test always checks if a transport can receive and send. While
+you can configure most transports to only receive or only send messages, this
+test will only work if you have configured the transport to send and receive
+messages.
+@node How fast is it?
+@subsection How fast is it?
+@c %**end of header
+We have measured the performance of the UDP, TCP and SMTP transport layers
+directly and when used from an application using the GNUnet core. Measuring
+just the transport layer gives a better view of the actual overhead of the
+protocol, whereas evaluating the transport from the application puts the
+overhead into perspective from a practical point of view.
+The loopback measurements of the SMTP transport were performed on three
+different machines spanning a range of modern SMTP configurations. We used a
+PIII-800 running RedHat 7.3 with the Purdue Computer Science configuration
+which includes filters for spam. We also used a Xeon 2 GHz with a vanilla
+RedHat 8.0 sendmail configuration. Furthermore, we used qmail on a PIII-1000
+running Sorcerer GNU Linux (SGL). The numbers for UDP and TCP are provided
+using the SGL configuration. The qmail benchmark uses qmail's internal
+filtering whereas the sendmail benchmarks rely on procmail to filter and
+deliver the mail. We used the transport layer to send a message of b bytes
+(excluding transport protocol headers) directly to the local machine. This
+way, network latency and packet loss on the wire have no impact on the
+timings. n messages were sent sequentially over the transport layer, sending
+message i+1 after the i-th message was received. All messages were sent over
+the same connection and the time to establish the connection was not taken
+into account since this overhead is minuscule in practice, as long as a
+connection is used for a significant number of messages.
+@multitable @columnfractions .20 .15 .15 .15 .15 .15
+@headitem Transport @tab UDP @tab TCP @tab SMTP (Purdue sendmail) @tab SMTP (RH 8.0) @tab SMTP (SGL qmail)
+@item  11 bytes @tab 31 ms @tab 55 ms @tab  781 s @tab 77 s @tab 24 s
+@item  407 bytes @tab 37 ms @tab 62 ms @tab  789 s @tab 78 s @tab 25 s
+@item 1,221 bytes @tab 46 ms @tab 73 ms @tab  804 s @tab 78 s @tab 25 s
+@end multitable
+The benchmarks show that UDP and TCP are, as expected, both significantly
+faster compared with any of the SMTP services. Among the SMTP implementations,
+there can be significant differences depending on the SMTP configuration.
+Filtering with an external tool like procmail that needs to re-parse its
+configuration for each mail can be very expensive. Applying spam filters can
+also significantly impact the performance of the underlying SMTP
+implementation. The microbenchmark shows that SMTP can be a viable solution
+for initiating peer-to-peer sessions: a couple of seconds to connect to a peer
+are probably not even going to be noticed by users. The next benchmark
+measures the possible throughput for a transport. Throughput can be measured
+by sending multiple messages in parallel and measuring packet loss. Note that
+not only UDP but also the TCP transport can actually lose messages since the
+TCP implementation drops messages if the @code{write} to the socket would
+block. While the SMTP protocol never drops messages itself, it is often so
+slow that only a fraction of the messages can be sent and received in the
+given time bounds. For this benchmark we report the message loss after
+allowing time t for sending m messages. If messages were not sent (or
+received) after an overall timeout of t, they were considered lost. The
+benchmark was performed using two Xeon 2 GHz machines running RedHat 8.0 with
+sendmail. The machines were connected with a direct 100 Mbit Ethernet
+connection. Figures udp1200, tcp1200 and smtp-MTUs show that the throughput
+for messages of size 1,200 octets is 2,343 kbps, 3,310 kbps and 6 kbps for
+UDP, TCP and SMTP respectively. The high per-message overhead of SMTP can be
+amortized by increasing the MTU; for example, an MTU of 12,000 octets improves
+the throughput to 13 kbps, as figure smtp-MTUs shows. Our research paper has
+some more details on the benchmarking results.
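+As a rough illustration of how loss and throughput figures of this kind can be
+derived, the following C sketch computes both from the number of messages sent
+and received within the time bound. The function names and the convention that
+1 kbps equals 1,000 bits per second are assumptions made for this example;
+they are not taken from the benchmark code.

```c
#include <assert.h>
#include <stddef.h>

/* Fraction of messages considered lost: "sent" messages were handed to
   the transport, "received" of them arrived before the timeout t. */
double
loss_fraction (size_t sent, size_t received)
{
  assert (sent > 0 && received <= sent);
  return (double) (sent - received) / (double) sent;
}

/* Throughput in kbps for "received" messages of "msg_octets" octets each
   that arrived within "seconds".
   Assumption: 1 kbps = 1,000 bits per second. */
double
throughput_kbps (size_t received, size_t msg_octets, double seconds)
{
  assert (seconds > 0.0);
  return ((double) received * (double) msg_octets * 8.0)
         / (seconds * 1000.0);
}
```

+Under these assumptions, receiving 1,000 messages of 1,200 octets within one
+second corresponds to 9,600 kbps.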
+@node Bluetooth plugin
+@section Bluetooth plugin
+@c %**end of header
+This section describes the new Bluetooth transport plugin for GNUnet. The plugin
+is still in the testing stage, so don't expect it to work perfectly. If you
+have any questions or problems, ask on the IRC channel.
+@itemize @bullet
+@item What do I need to use the Bluetooth plugin transport?
+@item How does it work?
+@item What possible errors should I be aware of?
+@item How do I configure my peer?
+@item How can I test it?
+@end itemize
+@menu
+* What do I need to use the Bluetooth plugin transport?::
+* How does it work2?::
+* What possible errors should I be aware of?::
+* How do I configure my peer2?::
+* How can I test it?::
+* The implementation of the Bluetooth transport plugin::
+@end menu
+@node What do I need to use the Bluetooth plugin transport?
+@subsection What do I need to use the Bluetooth plugin transport?
+@c %**end of header
+If you are a Linux user and you want to use the Bluetooth transport plugin, you
+should install the BlueZ development libraries (if they aren't already
+installed). For instructions on how to install the libraries, see the
+BlueZ site (@uref{http://www.bluez.org/, http://www.bluez.org}).
+If you don't know whether you have the necessary libraries, don't worry: just
+run the GNUnet configure script, and a notification at the end will warn you if
+the necessary libraries are missing.
+If you are a Windows user, you should have @emph{MinGW}/@emph{MSys2}
+installed with the latest updates (especially the @emph{ws2bth} header). If
+this is your first build of GNUnet on Windows, you should check out the SBuild
+repository. It semi-automatically assembles a
+@emph{MinGW}/@emph{MSys2} installation with a lot of extra packages which are
+needed for the GNUnet build, so this will ease your work! Finally, you just
+have to be sure that you have the correct drivers for your Bluetooth device
+installed and that your device is on and in discoverable mode. The Windows
+Bluetooth Stack supports only the RFCOMM protocol, so we cannot turn on your
+device programmatically!
+@node How does it work2?
+@subsection How does it work?
+@c %**end of header
+The Bluetooth transport plugin uses virtually the same code as the WLAN plugin
+and only the helper binary is different. The helper takes a single argument,
+which represents the interface name and is specified in the configuration
+file. Here are the basic steps that are followed by the helper binary used on
+Linux:
+@itemize @bullet
+@item it verifies that the name corresponds to a Bluetooth interface name
+@item it verifies that the interface is up (if it is not, it tries to bring it up)
+@item it tries to enable the page and inquiry scan in order to make the device
+discoverable and to accept incoming connection requests
+@emph{The above operations require root access so you should start the
+transport plugin with root privileges.}
+@item it finds an available port number and registers an SDP service, which
+other devices use to find out which port the server is listening on, and
+switches the socket to listening mode
+@item it sends a HELLO message with its address
+@item finally it forwards traffic from the reading sockets to STDOUT and
+from STDIN to the writing socket
+@end itemize
+Once in a while the device will make an inquiry scan to discover nearby
+devices and will randomly send them HELLO messages for peer discovery.
+@node What possible errors should I be aware of?
+@subsection What possible errors should I be aware of?
+@c %**end of header
+@emph{This section is intended for Linux users.}
+There are many ways in which things could go wrong, but I will try to
+present some tools that you could use for debugging, along with some scenarios.
+@itemize @bullet
+@item @code{bluetoothd -n -d} : use this command to enable logging in the
+foreground and to print the logging messages
+@item @code{hciconfig}: can be used to configure the Bluetooth devices. If you
+run it without any arguments it will print information about the state of the
+interfaces. So if you receive an error that the device couldn't be brought up,
+you should try to bring it up manually and see if that works (use
+@code{hciconfig -a hciX up}). If you can't and the Bluetooth address has the
+form 00:00:00:00:00:00, it means that there is something wrong with the D-Bus
+daemon or with the Bluetooth daemon. Use the @code{bluetoothd} tool to see the
+logs.
+@item @code{sdptool}: can be used to control and interrogate SDP servers. If you
+encounter problems regarding the SDP server (for example, the SDP server is
+down), you should check whether the D-Bus daemon is running correctly and
+whether the Bluetooth daemon started correctly (use the @code{bluetoothd}
+tool). Also, sometimes the SDP service may work but the device may still fail
+to register its service. Use @code{sdptool browse [dev-address]} to see if the
+service is registered. There should be a service with the name of the
+interface and GNUnet as provider.
+@item @code{hcitool} : another useful tool which can be used to configure the
+device and to send some particular commands to it.
+@item @code{hcidump} : could be used for low level debugging
+@end itemize
+@node How do I configure my peer2?
+@subsection How do I configure my peer?
+@c %**end of header
+On Linux, you just have to be sure that the interface name corresponds to the
+one that you want to use. Use the @code{hciconfig} tool to check that. By
+default it is set to hci0 but you can change it.
+A basic configuration looks like this:
+@example
+[transport-bluetooth]
+# Name of the interface (typically hciX)
+INTERFACE = hci0
+# Real hardware, no testing
+TESTMODE = 0
+TESTING_IGNORE_KEYS = ACCEPT_FROM;
+@end example
+In order to use the Bluetooth transport plugin when the transport service is
+started, you must add the plugin name to the default transport service plugins
+list. For example:
+@example
+[transport]
+...
+PLUGINS = dns bluetooth
+...
+@end example
+If you want to use only the Bluetooth plugin, set @emph{PLUGINS = bluetooth}.
+On Windows, you cannot specify which device to use. The only thing that you
+should do is to add @emph{bluetooth} to the plugins list of the transport
+service.
+@node How can I test it?
+@subsection How can I test it?
+@c %**end of header
+If you have two Bluetooth devices on the same machine which use Linux you
+must:
+@itemize @bullet
+@item create two different configuration files (one which will use the first
+interface (@emph{hci0}) and the other which will use the second interface
+(@emph{hci1})). Let's name them @emph{peer1.conf} and @emph{peer2.conf}.
+@item run @emph{gnunet-peerinfo -c peerX.conf -s} in order to generate the
+peers' private keys. The @strong{X} must be replaced with 1 or 2.
+@item run @emph{gnunet-arm -c peerX.conf -s -i=transport} in order to start the
+transport service. (Make sure that you have "bluetooth" on the transport
+plugins list if the Bluetooth transport service doesn't start.)
+@item run @emph{gnunet-peerinfo -c peer1.conf -s} to get the first peer's ID.
+If you already know your peer ID (you saved it from the first command), this
+can be skipped.
+@item run @emph{gnunet-transport -c peer2.conf -p=PEER1_ID -s} to start sending
+data for benchmarking to the other peer.
+@end itemize
+This scenario will try to connect the second peer to the first one and then
+start sending data for benchmarking.
+On Windows you cannot test the plugin functionality using two Bluetooth devices
+on the same machine because, after you install the drivers, conflicts will
+occur between the Bluetooth stacks. (At least that is what happened on
+my machine: I wasn't able to use the BlueSoleil stack and the WIDCOMM one at
+the same time.)
+If you have two different machines and your configuration files are good, you
+can use the same scenario presented at the beginning of this section.
+Another way to test the plugin functionality is to create your own application
+which will use the GNUnet framework with the Bluetooth transport service.
+@node The implementation of the Bluetooth transport plugin
+@subsection The implementation of the Bluetooth transport plugin
+@c %**end of header
+This section describes the implementation of the Bluetooth transport plugin.
+First, I want to remind you that the Bluetooth transport plugin uses virtually
+the same code as the WLAN plugin and only the helper binary is different. Also,
+the purpose of the helper binary of the Bluetooth transport plugin is the same
+as that of the one used for the WLAN transport plugin: it accesses the
+interface and then forwards traffic in both directions between the Bluetooth
+interface and stdin/stdout of the process involved.
+The Bluetooth transport plugin can be used on both Linux and Windows
+platforms.
+@itemize @bullet
+@item Linux functionality
+@item Windows functionality
+@item Pending Features
+@end itemize
+@menu
+* Linux functionality::
+* THE INITIALIZATION::
+* THE LOOP::
+* Details about the broadcast implementation::
+* Windows functionality::
+* Pending features::
+@end menu
+@node Linux functionality
+@subsubsection Linux functionality
+@c %**end of header
+In order to implement the plugin functionality on Linux I used the BlueZ
+stack. For the communication with the other devices I used the RFCOMM
+protocol. Also I used the HCI protocol to gain some control over the device.
+The helper binary takes a single argument (the name of the Bluetooth
+interface) and runs in two stages:
+@c %** 'THE INITIALIZATION' should be in bigger letters or stand out, not
+@c %** starting a new section?
+@node THE INITIALIZATION
+@subsubsection THE INITIALIZATION
+@itemize @bullet
+@item first, it checks if we have root privileges (@emph{Remember that we need
+to have root privileges in order to be able to bring the interface up if it is
+down or to change its state.}).
+@item second, it verifies if the interface with the given name exists.
+@strong{If the interface with that name exists and it is a Bluetooth
+interface:}
+@item it creates an RFCOMM socket which will be used for listening and calls the
+@emph{open_device} method.
+In the @emph{open_device} method, it:
+@itemize @bullet
+@item creates an HCI socket used to send control events to the device
+@item searches for the device ID using the interface name
+@item saves the device MAC address
+@item checks if the interface is down and tries to bring it up
+@item checks if the interface is in discoverable mode and tries to make it
+discoverable
+@item closes the HCI socket and binds the RFCOMM one
+@item switches the RFCOMM socket to listening mode
+@item registers the SDP service (the service will be used by the other devices
+to get the port on which this device is listening)
+@end itemize
+@item drops the root privileges
+@strong{If the interface is not a Bluetooth interface the helper exits with a
+suitable error}
+@end itemize
+@c %** Same as for @node entry above
+@node THE LOOP
+@subsubsection THE LOOP
+The helper binary uses a list where it saves all the connected neighbour
+devices (@emph{neighbours.devices}) and two buffers (@emph{write_pout} and
+@emph{write_std}). The first message which is sent is a control message with
+the device's MAC address in order to announce the peer's presence to the
+neighbours. Here is a short description of what happens in the main loop:
+@itemize @bullet
+@item Whenever it receives something from STDIN, it processes the
+data and saves the message in the first buffer (@emph{write_pout}). When it has
+something in the buffer, it gets the destination address from the buffer,
+searches for the destination address in the list (if there is no connection with
+that device, it creates a new one and saves it to the list) and sends the
+message.
+@item Whenever it receives something on the listening socket, it accepts
+the connection and saves the socket in a list of reading sockets.
+@item Whenever it receives something from a reading socket, it parses the
+message, verifies the CRC and saves it in the @emph{write_std} buffer in order
+to be sent later to STDOUT.
+@end itemize
+So in the main loop we use the select function to wait until one of the file
+descriptors saved in one of the two file descriptor sets is ready to use.
+The first set (@emph{rfds}) represents the reading set and it could contain the
+list with the reading sockets, the STDIN file descriptor or the listening
+socket. The second set (@emph{wfds}) is the writing set and it could contain
+the sending socket or the STDOUT file descriptor. After the select function
+returns, we check which file descriptor is ready to use and handle the
+corresponding event. @emph{For example:} if it is the
+listening socket then we accept a new connection and save the socket in the
+reading list; if it is the STDOUT file descriptor, then we write to STDOUT the
+message from the @emph{write_std} buffer.
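+The following minimal sketch illustrates this use of @code{select} with a
+reading set and a writing set. It is not the helper's actual code: the function
+@code{wait_for_io}, the @code{READY_READ} and @code{READY_WRITE} flags and the
+restriction to a single input and a single output descriptor are assumptions
+made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>   /* STDIN_FILENO / STDOUT_FILENO for callers */

/* Hypothetical result flags for this sketch. */
#define READY_READ  1
#define READY_WRITE 2

/**
 * Block until in_fd is readable and/or out_fd is writable.
 * out_fd is only watched when there is buffered output to flush.
 *
 * @return bitmask of READY_READ / READY_WRITE, or -1 on error
 */
int
wait_for_io (int in_fd, int out_fd, int have_pending_output)
{
  fd_set rfds;   /* reading set */
  fd_set wfds;   /* writing set */
  int maxfd = in_fd;
  int ready = 0;

  assert (in_fd >= 0 && in_fd < FD_SETSIZE);
  assert (out_fd >= 0 && out_fd < FD_SETSIZE);
  FD_ZERO (&rfds);
  FD_ZERO (&wfds);
  FD_SET (in_fd, &rfds);
  if (have_pending_output)
  {
    FD_SET (out_fd, &wfds);
    if (out_fd > maxfd)
      maxfd = out_fd;
  }
  /* No timeout: sleep until at least one descriptor is ready. */
  if (select (maxfd + 1, &rfds, &wfds, NULL, NULL) < 0)
    return -1;
  if (FD_ISSET (in_fd, &rfds))
    ready |= READY_READ;
  if (have_pending_output && FD_ISSET (out_fd, &wfds))
    ready |= READY_WRITE;
  return ready;
}
```

+A caller could invoke @code{wait_for_io (STDIN_FILENO, STDOUT_FILENO, 1)} in a
+loop and dispatch on the returned flags. A real helper would add the listening
+socket and every connected socket to the sets in the same way.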
+To find out on which port a device is listening, we connect to the local SDP
+server and search the registered service for that device.
+@emph{You should be aware of the fact that if the device fails to connect to
+another one when trying to send a message it will attempt one more time. If it
+fails again, then it skips the message.}
+@emph{Also you should know that the Bluetooth transport plugin has support for
+@strong{broadcast messages}.}
+@node Details about the broadcast implementation
+@subsubsection Details about the broadcast implementation
+@c %**end of header
+First I want to point out that the broadcast functionality for the CONTROL
+messages is not implemented in a conventional way. Since an inquiry scan takes
+a long time and it would take a while to send a message to all the
+discoverable devices, I decided to tackle the problem in a different way. Here
+is how I did it:
+@itemize @bullet
+@item The first time I have to broadcast a message, I make an
+inquiry scan and save all the devices' addresses to a vector.
+@item After the inquiry scan ends, I take the first address from the list and
+try to connect to it. If it fails, I try to connect to the next one. If it
+succeeds, I save the socket to a list and send the message to the device.
+@item When I have to broadcast another message, I first search the list for
+a new device which I'm not connected to. If there is no new device on the list,
+I go to the beginning of the list and send the message to the old devices.
+After 5 cycles I make a new inquiry scan to check whether there are new
+discoverable devices and save them to the list. If there are no new
+discoverable devices, I reset the cycling counter and go through the old
+list again, sending messages to the devices saved in it.
+@end itemize
+@strong{Therefore}:
+@itemize @bullet
+@item every time I have a broadcast message, I look in the list for a
+new device and send the message to it
+@item if I have reached the end of the list 5 times and I'm connected to all the
+devices from the list, I make a new inquiry scan. @emph{The number of list
+cycles between inquiry scans can be increased by redefining the MAX_LOOPS
+variable}
+@item when there are no new devices I send messages to the old ones.
+@end itemize
+This way, the broadcast control messages will reach the devices, but with a
+delay.
+@emph{NOTICE:} When I have to send a message to a certain device, I first check
+the broadcast list to see if we are connected to that device. If not, we try
+to connect to it and, in case of success, we save the address and the socket in
+the list. If we are already connected to that device, we simply use the socket.
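+The cycling strategy described in this section can be sketched in C as
+follows. This is a simplified model and not the helper's code: the
+@code{broadcast_state} structure, the @code{next_broadcast_target} function and
+the @code{NEED_INQUIRY_SCAN} return value are hypothetical, and the sketch
+assumes that every connection attempt succeeds. Only the name
+@code{MAX_LOOPS} and its value of 5 come from the description above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOOPS 5            /* list cycles between two inquiry scans */
#define MAX_DEVICES 16         /* capacity of this sketch (assumption) */
#define NEED_INQUIRY_SCAN (-1) /* hypothetical "scan now" result */

struct broadcast_state
{
  int connected[MAX_DEVICES]; /* 1 if we already hold a socket to device i */
  unsigned int count;         /* number of devices found by inquiry scans */
  unsigned int pos;           /* next old device in the round robin */
  unsigned int loops;         /* completed list cycles since the last scan */
};

/**
 * Pick the device that receives the next broadcast message.
 *
 * @return index into the device list, or NEED_INQUIRY_SCAN if the
 *         caller should run an inquiry scan and extend the list first
 */
int
next_broadcast_target (struct broadcast_state *s)
{
  unsigned int i;

  assert (NULL != s);
  assert (s->count <= MAX_DEVICES);
  if (0 == s->count)
    return NEED_INQUIRY_SCAN;  /* very first broadcast: nothing known yet */
  /* Prefer a new device that we are not connected to yet. */
  for (i = 0; i < s->count; i++)
    if (! s->connected[i])
    {
      s->connected[i] = 1;     /* assumption: the connect succeeds */
      return (int) i;
    }
  /* All devices are old ones; rescan after MAX_LOOPS full cycles. */
  if (s->loops >= MAX_LOOPS)
  {
    s->loops = 0;
    s->pos = 0;
    return NEED_INQUIRY_SCAN;
  }
  i = s->pos++;
  if (s->pos == s->count)
  {
    s->pos = 0;
    s->loops++;
  }
  return (int) i;
}
```

+With two known devices, this model first connects to both, then alternates
+between them for five full cycles before asking for a new inquiry scan.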
+@node Windows functionality
+@subsubsection Windows functionality
+@c %**end of header
+For Windows I decided to use the Microsoft Bluetooth stack, which has the
+advantage of being included as standard since Windows XP SP2. The main
+disadvantage is that it only supports the RFCOMM protocol, so we are not able
+to have low-level control over the Bluetooth device. Therefore it is the user's
+responsibility to check that the device is up and in discoverable mode. Also,
+there are no tools which could be used for debugging in order to read the data
+coming from and going to a Bluetooth device, which obviously hindered my work.
+Another thing that slowed down the implementation of the plugin (besides the
+fact that I wasn't very familiar with the win32 API) was that there were some
+bugs in MinGW regarding Bluetooth. They are now solved, but keep in mind
+that you should have the latest updates (especially the @emph{ws2bth} header).
+Besides the fact that it uses Windows Sockets, the Windows implementation
+follows the same principles as the Linux one:
+@itemize @bullet
+@item
+It has an initialization part where it initializes Windows Sockets, creates an
+RFCOMM socket which is bound and switched to listening mode, and
+registers an SDP service.
+In the Microsoft Bluetooth API there are two ways to work with the SDP:
+@itemize @bullet
+@item an easy way which works with very simple service records
+@item a hard way which is useful when you need to update or to delete the
+record
+@end itemize
+@end itemize
+Since I only needed the SDP service to find out on which port the device is
+listening, and that did not change, I decided to use the easy way. In order
+to register the service I used the @emph{WSASetService} function and I
+generated the @emph{Universally Unique Identifier} with the @emph{guidgen.exe}
+Windows tool.
+In the loop section the only difference from the Linux implementation is that
+I used the GNUNET_NETWORK library for functions like @emph{accept},
+@emph{bind}, @emph{connect} or @emph{select}. I decided to use the
+GNUNET_NETWORK library because I also needed to interact with the STDIN and
+STDOUT handles and on Windows the select function is only defined for sockets,
+and it will not work for arbitrary file handles.
+Another difference between the Linux and Windows implementations is that on
+Linux the Bluetooth address is represented in 48 bits, while on Windows it is
+represented in 64 bits. Therefore I had to make some changes to the
+@emph{plugin_transport_wlan} header.
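+To illustrate the difference in address width, the sketch below converts
+between a six-octet address and a 64-bit integer whose upper 16 bits are zero.
+The type @code{addr48}, both function names and the choice to store the least
+significant octet first are assumptions made for this example; the sketch does
+not use the actual BlueZ or Windows types and is not the change made to the
+header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 48-bit address: six octets, least significant first. */
struct addr48
{
  uint8_t b[6];
};

/* Widen a 48-bit address to a 64-bit integer (upper 16 bits are zero). */
uint64_t
addr48_to_u64 (const struct addr48 *a)
{
  uint64_t v = 0;
  int i;

  assert (NULL != a);
  for (i = 5; i >= 0; i--)
    v = (v << 8) | a->b[i];
  return v;
}

/* Narrow a 64-bit address back to six octets; the value must fit. */
struct addr48
u64_to_addr48 (uint64_t v)
{
  struct addr48 a;
  int i;

  assert (0 == (v >> 48));
  for (i = 0; i < 6; i++)
  {
    a.b[i] = (uint8_t) (v & 0xff);
    v >>= 8;
  }
  return a;
}
```

+Code shared between both platforms then has to reserve enough space for the
+wider of the two representations.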
+Also, currently on Windows the Bluetooth plugin doesn't have support for
+broadcast messages. When it receives a broadcast message it will skip it.
+@node Pending features
+@subsubsection Pending features
+@c %**end of header
+@itemize @bullet
+@item Implement the broadcast functionality on Windows @emph{(currently in
+progress)}
+@item Implement a testcase for the helper: @emph{The testcase consists of a
+program which emulates the plugin and uses the helper. It will simulate
+connections, disconnections and data transfers.}
+@end itemize
+If you have a new idea about a feature of the plugin or suggestions about how
+I could improve the implementation you are welcome to comment or to contact
+me.
+@node WLAN plugin
+@section WLAN plugin
+@c %**end of header
+This section documents how the WLAN transport plugin works. Parts which are not
+implemented yet or could be better implemented are described at the end.
+@node The ATS Subsystem
+@section The ATS Subsystem
+@c %**end of header
+ATS stands for "automatic transport selection", and the function of ATS in
+GNUnet is to decide which address (and thus transport plugin) should be used
+for two peers to communicate, and what bandwidth limits should be imposed on
+such an individual connection. To help ATS make an informed decision,
+higher-level services inform the ATS service about their requirements and the
+quality of the service rendered. The ATS service also interacts with the
+transport service to be apprised of working addresses and to communicate its
+resource allocation decisions. Finally, the ATS service's operation can be
+observed using a monitoring API.
+The main logic of the ATS service only collects the available addresses, their
+performance characteristics and the applications' requirements, but does not
+make the actual allocation decision. This last critical step is left to an ATS
+plugin, as we have implemented (currently three) different allocation
+strategies which differ significantly in their performance and maturity, and it
+is still unclear if any particular plugin is generally superior.
+@node GNUnet's CORE Subsystem
+@section GNUnet's CORE Subsystem
+@c %**end of header
+The CORE subsystem in GNUnet is responsible for securing link-layer
+communications between nodes in the GNUnet overlay network. CORE builds on the
+TRANSPORT subsystem which provides for the actual, insecure, unreliable
+link-layer communication (for example, via UDP or WLAN), and then adds
+fundamental security to the connections:
+@itemize @bullet
+@item confidentiality with so-called perfect forward secrecy; we use
+@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman,
+ECDHE} powered by @uref{http://cr.yp.to/ecdh.html, Curve25519} for the key
+exchange and then use symmetric encryption, encrypting with both
+@uref{http://en.wikipedia.org/wiki/Rijndael, AES-256} and
+@uref{http://en.wikipedia.org/wiki/Twofish, Twofish}
+@item @uref{http://en.wikipedia.org/wiki/Authentication, authentication} is
+achieved by signing the ephemeral keys using @uref{http://ed25519.cr.yp.to/,
+Ed25519}, a deterministic variant of @uref{http://en.wikipedia.org/wiki/ECDSA,
+ECDSA}
+@item integrity protection (using @uref{http://en.wikipedia.org/wiki/SHA-2,
+SHA-512} to do @uref{http://en.wikipedia.org/wiki/Authenticated_encryption,
+encrypt-then-MAC})
+@item @uref{http://en.wikipedia.org/wiki/Replay_attack, replay} protection
+(using nonces, timestamps, challenge-response, message counters and ephemeral
+keys)
+@item liveness (keep-alive messages, timeout)
+@end itemize
+@menu
+* Limitations::
+* When is a peer "connected"?::
+* libgnunetcore::
+* The CORE Client-Service Protocol::
+* The CORE Peer-to-Peer Protocol::
+@end menu
+@node Limitations
+@subsection Limitations
+@c %**end of header
+CORE does not perform @uref{http://en.wikipedia.org/wiki/Routing, routing};
+using CORE it is only possible to communicate with peers that happen to
+already be "directly" connected with each other. CORE also does not have an
+API to allow applications to establish such "direct" connections --- for this,
+applications can ask TRANSPORT, but TRANSPORT might not be able to establish a
+"direct" connection. The TOPOLOGY subsystem is responsible for trying to keep
+a few "direct" connections open at all times. Applications that need to talk
+to particular peers should use the CADET subsystem, as it can establish
+arbitrary "indirect" connections.
+Because CORE does not perform routing, CORE must only be used directly by
+applications that either perform their own routing logic (such as anonymous
+file-sharing) or that do not require routing, for example because they are
+based on flooding the network. CORE communication is unreliable and delivery
+is possibly out-of-order. Applications that require reliable communication
+should use the CADET service. Each application can only queue one message per
+target peer with the CORE service at any time; messages cannot be larger than
+approximately 63 kilobytes. If messages are small, CORE may group multiple
+messages (possibly from different applications) prior to encryption. If
+permitted by the application (using the @uref{http://baus.net/on-tcp_cork/,
+cork} option), CORE may delay transmissions to facilitate grouping of multiple
+small messages. If cork is not enabled, CORE will transmit the message as soon
+as TRANSPORT allows it (TRANSPORT is responsible for limiting bandwidth and
+congestion control). CORE does not allow flow control; applications are
+expected to process messages at line-speed. If flow control is needed,
+applications should use the CADET service.
+@node When is a peer "connected"?
+@subsection When is a peer "connected"?
+@c %**end of header
+In addition to the security features mentioned above, CORE also provides one
+additional key feature to applications using it, and that is a limited form of
+protocol-compatibility checking. CORE distinguishes between TRANSPORT-level
+connections (which enable communication with other peers) and
+application-level connections. Applications using the CORE API will
+(typically) learn about application-level connections from CORE, and not about
+TRANSPORT-level connections. When a typical application uses CORE, it will
+specify a set of message types (from @code{gnunet_protocols.h}) that it
+understands. CORE will then notify the application about a connection it has
+with another peer if and only if an application on that peer registered an
+intersecting set of message types with its CORE service. Thus, it is quite
+possible that
+CORE only exposes a subset of the established direct connections to a
+particular application --- and different applications running above CORE might
+see different sets of connections at the same time.
+Applications that do not register a handler for any message type are a special
+case. CORE assumes that these applications merely want to monitor connections
+(or "all" messages via other callbacks) and will notify those applications
+about all connections. This is used, for example, by the @code{gnunet-core}
+command-line tool to display the active connections. Note that it is also
+possible that the TRANSPORT service has more active connections than the CORE
+service, as the CORE service first has to perform a key exchange with
+connecting peers before exchanging information about supported message types
+and notifying applications about the new connection.
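+The notification rule described in this section can be sketched as a simple
+set intersection. The function @code{should_notify} and its array-based
+representation of message types are assumptions made for this example; this is
+not how CORE stores or exchanges the sets internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/**
 * Decide whether a local application should learn about a connection.
 *
 * @param local message types the local application registered
 * @param remote message types registered by applications at the other peer
 * @return 1 if the application should be notified, 0 otherwise
 */
int
should_notify (const uint16_t *local, size_t local_len,
               const uint16_t *remote, size_t remote_len)
{
  size_t i;
  size_t j;

  assert (NULL != local || 0 == local_len);
  assert (NULL != remote || 0 == remote_len);
  /* Special case: no handlers registered means "monitor everything". */
  if (0 == local_len)
    return 1;
  /* Otherwise, at least one message type must be common to both sets. */
  for (i = 0; i < local_len; i++)
    for (j = 0; j < remote_len; j++)
      if (local[i] == remote[j])
        return 1;
  return 0;
}
```

+The quadratic loop is adequate for the small sets used in this illustration.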
+@node libgnunetcore
+@subsection libgnunetcore
+@c %**end of header
+The CORE API (defined in @code{gnunet_core_service.h}) is the basic messaging
+API used by P2P applications built using GNUnet. It provides applications the
+ability to send encrypted messages to, and receive them from, the peer's
+"directly" connected neighbours.
+As CORE connections are generally "direct" connections, applications must not
+assume that they can connect to arbitrary peers this way, as "direct"
+connections may not always be possible. Applications using CORE are notified
+about which peers are connected. Creating new "direct" connections must be
+done using the TRANSPORT API.
+The CORE API provides unreliable, out-of-order delivery. While the
+implementation tries to ensure timely, in-order delivery, neither message loss
+nor reordering is detected, and both must be tolerated by the application. Most
+importantly, CORE will NOT perform retransmission if messages could not be
+delivered.
+Note that CORE allows applications to queue one message per connected peer.
+The rate at which each connection operates is influenced by the preferences
+expressed by the local application as well as restrictions imposed by the other
+peer. Local applications can express their preferences for particular
+connections using the "performance" API of the ATS service.
+Applications that require more sophisticated transmission capabilities such as
+TCP-like behavior, or that need to send messages to arbitrary remote
+peers, should use the CADET API.
+The typical use of the CORE API is to connect to the CORE service using
+@code{GNUNET_CORE_connect}, process events from the CORE service (such as
+peers connecting, peers disconnecting and incoming messages) and send messages
+to connected peers using @code{GNUNET_CORE_notify_transmit_ready}. Note that
+applications must cancel pending transmission requests if they receive a
+disconnect event for a peer that had a transmission pending; furthermore,
+queueing more than one transmission request per peer per application using the
+service is not permitted.
+The CORE API also allows applications to monitor all communications of the
+peer prior to encryption (for outgoing messages) or after decryption (for
+incoming messages). This can be useful for debugging, diagnostics or to
+establish the presence of cover traffic (for anonymity). As monitoring
+applications are often not interested in the payload, the monitoring callbacks
+can be configured to only provide the message headers (including the message
+type and size) instead of copying the full data stream to the monitoring
+client.
+The init callback of the @code{GNUNET_CORE_connect} function is called with
+the hash of the public key of the peer. This public key is used to identify
+the peer globally in the GNUnet network. Applications are encouraged to check
+that the provided hash matches the hash that they are using (as theoretically
+the application may be using a different configuration file with a different
+private key, which would result in hard to find bugs).
+As with most service APIs, the CORE API isolates applications from crashes of
+the CORE service. If the CORE service crashes, the application will see
+disconnect events for all existing connections. Once the connections are
+re-established, the applications will receive matching connect events.
+@node The CORE Client-Service Protocol
+@subsection The CORE Client-Service Protocol
+@c %**end of header
+This section describes the protocol between an application using the CORE
+service (the client) and the CORE service process itself.
+@menu
+* Setup2::
+* Notifications::
+* Sending::
+@end menu
+@node Setup2
+@subsubsection Setup
+@c %**end of header
+When a client connects to the CORE service, it first sends an
+@code{InitMessage} which specifies options for the connection and a set of
+message type values which are supported by the application. The options
+bitmask specifies which events the client would like to be notified about. The
+options include:
+@table @asis
+@item GNUNET_CORE_OPTION_NOTHING
+No notifications
+@item GNUNET_CORE_OPTION_STATUS_CHANGE
+Peers connecting and disconnecting
+@item GNUNET_CORE_OPTION_FULL_INBOUND
+All inbound messages (after decryption) with full payload
+@item GNUNET_CORE_OPTION_HDR_INBOUND
+Just the @code{MessageHeader} of all inbound messages
+@item GNUNET_CORE_OPTION_FULL_OUTBOUND
+All outbound messages (prior to encryption) with full payload
+@item GNUNET_CORE_OPTION_HDR_OUTBOUND
+Just the @code{MessageHeader} of all outbound messages
+@end table
+Typical applications will only monitor for connection status changes.
+The CORE service responds to the @code{InitMessage} with an
+@code{InitReplyMessage} which contains the peer's identity. Afterwards, both
+CORE and the client can send messages.
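+As an illustration of how such an options bitmask can be evaluated, the sketch
+below decides how many bytes of an inbound message a monitoring client would
+receive. The @code{OPT_*} constants, their numeric values, the function
+@code{inbound_copy_size} and the header size of 4 bytes are assumptions made
+for this example; they are not the definitions from
+@code{gnunet_core_service.h}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option bits; the real constants may use other values. */
enum monitor_option
{
  OPT_NOTHING = 0,
  OPT_STATUS_CHANGE = 1 << 0,
  OPT_FULL_INBOUND = 1 << 1,
  OPT_HDR_INBOUND = 1 << 2,
  OPT_FULL_OUTBOUND = 1 << 3,
  OPT_HDR_OUTBOUND = 1 << 4
};

/* Assumed size of a message header (size and type fields) in bytes. */
#define HEADER_SIZE 4

/**
 * Number of bytes of an inbound message to pass to a monitoring client.
 *
 * @param options bitmask of OPT_* values chosen by the client
 * @param msg_size total size of the decrypted message in bytes
 * @return msg_size for full monitoring, HEADER_SIZE for header-only
 *         monitoring, 0 if inbound traffic is not monitored
 */
size_t
inbound_copy_size (uint32_t options, size_t msg_size)
{
  assert (msg_size >= HEADER_SIZE);
  if (0 != (options & OPT_FULL_INBOUND))
    return msg_size;
  if (0 != (options & OPT_HDR_INBOUND))
    return HEADER_SIZE;
  return 0;
}
```

+Options are combined with a bitwise OR, for example
+@code{OPT_STATUS_CHANGE | OPT_HDR_INBOUND} in the notation of this sketch.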
+@node Notifications
+@subsubsection Notifications
+@c %**end of header
+The CORE will send @code{ConnectNotifyMessage}s and
+@code{DisconnectNotifyMessage}s whenever peers connect to or disconnect from
+the CORE (assuming their type maps overlap with the message types registered
+by the client). When the CORE receives a message that matches the set of
+message types specified during the @code{InitMessage} (or if monitoring is
+enabled for inbound messages in the options), it sends a
+@code{NotifyTrafficMessage} with the peer identity of the sender and the
+decrypted payload. The same
+message format (except with @code{GNUNET_MESSAGE_TYPE_CORE_NOTIFY_OUTBOUND}
+for the message type) is used to notify clients monitoring outbound messages;
+here, the peer identity given is that of the receiver.
+@node Sending
+@subsubsection Sending
+@c %**end of header
+When a client wants to transmit a message, it first requests a transmission
+slot by sending a @code{SendMessageRequest} which specifies the priority,
+deadline and size of the message. Note that these values may be ignored by
+CORE. When CORE is ready for the message, it answers with a
+@code{SendMessageReady} response. The client can then transmit the payload
+with a @code{SendMessage} message. Note that the actual message size in the
+@code{SendMessage} is allowed to be smaller than the size in the original
+request. A client may at any time send a fresh @code{SendMessageRequest},
+which then supersedes the previous @code{SendMessageRequest}, which is then no
+longer valid. The client can tell which @code{SendMessageRequest} the CORE
+service's @code{SendMessageReady} message is for as all of these messages
+contain a "unique" request ID (based on a counter incremented by the client
+for each request).
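+The request/ready/send handshake described above can be modelled in a few
+lines. The following Python sketch is illustrative only: the class and method
+names (@code{SendSlot}, @code{request}, @code{on_ready}) are hypothetical and
+do not correspond to the C API or to the wire format. It merely shows how a
+client-side counter lets the client match a @code{SendMessageReady} to its most
+recent @code{SendMessageRequest} and ignore replies to superseded requests.
```python
class SendSlot:
    """Toy model of one client's transmission slot towards CORE."""

    def __init__(self):
        self.next_id = 0      # counter incremented for each request
        self.pending = None   # (request_id, max_size) of the valid request

    def request(self, size):
        """Issue a new request; any earlier request is superseded."""
        self.next_id += 1
        self.pending = (self.next_id, size)
        return self.next_id

    def on_ready(self, request_id, payload):
        """Handle a ready notification; return the payload to send or None."""
        if self.pending is None or self.pending[0] != request_id:
            return None  # reply to a superseded request: ignore it
        _, max_size = self.pending
        if len(payload) > max_size:
            raise ValueError("payload larger than requested size")
        self.pending = None
        return payload  # may be smaller than the requested size
```
+In this model a second call to @code{request} before the first ready
+notification arrives simply replaces the pending entry, mirroring the rule that
+the earlier request is no longer valid.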
+@node The CORE Peer-to-Peer Protocol
+@subsection The CORE Peer-to-Peer Protocol
+@c %**end of header
+@menu
+* Creating the EphemeralKeyMessage::
+* Establishing a connection::
+* Encryption and Decryption::
+* Type maps::
+@end menu
+@node Creating the EphemeralKeyMessage
+@subsubsection Creating the EphemeralKeyMessage
+@c %**end of header
+When the CORE service starts, each peer creates a fresh ephemeral (ECC)
+public-private key pair and signs the corresponding @code{EphemeralKeyMessage}
+with its long-term key (which we usually call the peer's identity; the hash of
+the public long-term key is what results in a @code{struct
+GNUNET_PeerIdentity} in all GNUnet APIs). The ephemeral key is ONLY used for an
+@uref{http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman,
+ECDHE} exchange by the CORE service to establish symmetric session keys. A
+peer will use the same @code{EphemeralKeyMessage} for all peers for
+@code{REKEY_FREQUENCY}, which is usually 12 hours. After that time, it will
+create a fresh ephemeral key (forgetting the old one) and broadcast the new
+@code{EphemeralKeyMessage} to all connected peers, resulting in fresh
+symmetric session keys. Note that peers independently decide on when to
+discard ephemeral keys; it is not a protocol violation to discard keys more
+often. Ephemeral keys are also never stored to disk; restarting a peer will
+thus always create a fresh ephemeral key. The use of ephemeral keys is what
+provides @uref{http://en.wikipedia.org/wiki/Forward_secrecy, forward secrecy}.
+Just before transmission, the @code{EphemeralKeyMessage} is patched to reflect
+the current sender_status, which specifies the current state of the connection
+from the point of view of the sender. The possible values are:
+@table @asis
+@item KX_STATE_DOWN
+Initial value, never used on the network
+@item KX_STATE_KEY_SENT
+We sent our ephemeral key, but do not know the key of the other peer
+@item KX_STATE_KEY_RECEIVED
+This peer has received a valid ephemeral key of the other peer, but we are
+waiting for the other peer to confirm its authenticity (ability to decode) via
+challenge-response.
+@item KX_STATE_UP
+The connection is fully up from the point of view of the sender (now performing
+keep-alives)
+@item KX_STATE_REKEY_SENT
+The sender has initiated a rekeying operation; the other peer has so far failed
+to confirm a working connection using the new ephemeral key
+@end table
+@node Establishing a connection
+@subsubsection Establishing a connection
+@c %**end of header
+Peers begin their interaction by sending an @code{EphemeralKeyMessage} to the
+other peer once the TRANSPORT service notifies the CORE service about the
+connection. When a peer receives an @code{EphemeralKeyMessage} with a status
+indicating that the sender does not have the receiver's ephemeral key, the
+receiver sends its own @code{EphemeralKeyMessage} in response. Additionally, if
+the receiver has not yet confirmed the authenticity of the sender, it also
+sends an (encrypted) @code{PingMessage} with a challenge (and the identity of
+the target) to the other peer. Peers receiving a @code{PingMessage} respond
+with an (encrypted) @code{PongMessage} which includes the challenge. Peers
+receiving a @code{PongMessage} check the challenge and, if it matches, set the
+connection to @code{KX_STATE_UP}.
+@node Encryption and Decryption
+@subsubsection Encryption and Decryption
+@c %**end of header
+All functions related to the key exchange and encryption/decryption of
+messages can be found in @code{gnunet-service-core_kx.c} (except for the
+cryptographic primitives, which are in @code{util/crypto*.c}). Given the key
+material from ECDHE, a
+@uref{http://en.wikipedia.org/wiki/Key_derivation_function, key derivation
+function} is used to derive two pairs of encryption and decryption keys for
+AES-256 and Twofish, as well as initialization vectors and authentication keys
+(for @uref{http://en.wikipedia.org/wiki/HMAC, HMAC}). The HMAC is computed
+over the encrypted payload. Encrypted messages include an iv_seed and the HMAC
+in the header.
+Each encrypted message in the CORE service includes a sequence number and a
+timestamp in the encrypted payload. The CORE service remembers the largest
+observed sequence number and a bit-mask which represents which of the previous
+32 sequence numbers were already used. Messages with sequence numbers lower
+than the largest observed sequence number minus 32 are discarded. Messages
+with a timestamp that is more than @code{REKEY_TOLERANCE} off (5 minutes) are
+also discarded. This of course means that system clocks need to be reasonably
+synchronized for peers to be able to communicate. Additionally, as the
+ephemeral key changes every 12h, a peer would not even be able to decrypt
+messages older than 12h.
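+The replay protection described above is a sliding-window scheme. The
+following Python sketch shows one way such a window can be kept; it is not the
+code from @code{gnunet-service-core_kx.c}. The names @code{ReplayWindow},
+@code{accept} and @code{timestamp_ok} are hypothetical, and treating the
+timestamp tolerance as symmetric around the local clock is an assumption.
```python
WINDOW = 32              # number of earlier sequence numbers tracked
TOLERANCE_SECONDS = 300  # 5 minutes, assumed symmetric here


class ReplayWindow:
    def __init__(self):
        self.largest = None  # largest sequence number seen so far
        self.mask = 0        # bit i set: (largest - 1 - i) was already seen

    def accept(self, seq):
        """Return True if seq is fresh, False if too old or a duplicate."""
        if self.largest is None:
            self.largest = seq
            return True
        if seq > self.largest:
            shift = seq - self.largest
            # Slide the window; the old largest ends up at bit (shift - 1).
            self.mask = (self.mask << shift) | (1 << (shift - 1))
            self.mask &= (1 << WINDOW) - 1
            self.largest = seq
            return True
        if seq == self.largest:
            return False  # exact duplicate of the newest message
        offset = self.largest - seq - 1
        if offset >= WINDOW:
            return False  # lower than largest minus 32: discard
        if self.mask & (1 << offset):
            return False  # this sequence number was already used
        self.mask |= 1 << offset
        return True


def timestamp_ok(message_time, local_time, tolerance=TOLERANCE_SECONDS):
    """Reject messages whose timestamp is too far from the local clock."""
    return abs(local_time - message_time) <= tolerance
```
+Keeping only a 32-bit mask bounds the state per connection while still
+tolerating moderate reordering of messages.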
+@node Type maps
+@subsubsection Type maps
+@c %**end of header
+Once an encrypted connection has been established, peers begin to exchange
+type maps. Type maps are used to allow the CORE service to determine which
+(encrypted) connections should be shown to which applications. A type map is
+an array of 65536 bits representing the different types of messages understood
+by applications using the CORE service. Each CORE service maintains this map,
+simply by setting the respective bit for each message type supported by any of
+the applications using the CORE service. Note that bits for message types
+embedded in higher-level protocols (such as MESH) will not be included in
+these type maps.
+Typically, the type map of a peer will be sparse. Thus, the CORE service
+attempts to compress its type map using @code{gzip}-style compression
+("deflate") prior to transmission. However, if the compression fails to
+compact the map, the map may also be transmitted without compression
+(resulting in @code{GNUNET_MESSAGE_TYPE_CORE_COMPRESSED_TYPE_MAP} or
+@code{GNUNET_MESSAGE_TYPE_CORE_BINARY_TYPE_MAP} messages respectively). Upon
+receiving a type map, the respective CORE service notifies applications about
+the connection to the other peer if they support any message type indicated in
+the type map (or no message type at all). If the CORE service experiences a
+connect or disconnect event from an application, it updates its type map
+(setting or unsetting the respective bits) and notifies its neighbours about
+the change. The CORE services of the neighbours then in turn generate connect
+and disconnect events for the peer that sent the type map for their respective
+applications. As CORE messages may be lost, the CORE service confirms
+receiving a type map by sending back a
+@code{GNUNET_MESSAGE_TYPE_CORE_CONFIRM_TYPE_MAP}. If such a confirmation (with
+the correct hash of the type map) is not received, the sender will retransmit
+the type map (with exponential back-off).
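+To make the type map handling more concrete, the following Python sketch
+builds a 65536-bit map, compresses it with deflate when that saves space, and
+checks two maps for overlap. It is a sketch under stated assumptions: the
+function names are hypothetical, the bit order within each byte is chosen
+arbitrarily, and the real service adds message headers and a hash-based
+confirmation that are not modelled here.
```python
import zlib

TYPE_MAP_BITS = 65536


def make_type_map(message_types):
    """Return an 8192-byte bitmap with one bit set per supported type."""
    bitmap = bytearray(TYPE_MAP_BITS // 8)
    for t in message_types:
        bitmap[t // 8] |= 1 << (t % 8)  # bit order is an assumption
    return bytes(bitmap)


def encode_type_map(bitmap):
    """Return (kind, payload), compressing only when that saves space."""
    compressed = zlib.compress(bitmap, 9)
    if len(compressed) < len(bitmap):
        return "compressed", compressed
    return "binary", bitmap


def overlaps(map_a, map_b):
    """True if both maps have at least one message type in common."""
    return any(a & b for a, b in zip(map_a, map_b))
```
+A sparse map compresses to a few dozen bytes, which is why the compressed form
+is the common case.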
+@node GNUnet's CADET subsystem
+@section GNUnet's CADET subsystem
+The CADET subsystem in GNUnet is responsible for secure end-to-end
+communications between nodes in the GNUnet overlay network. CADET builds on the
+CORE subsystem which provides for the link-layer communication and then adds
+routing, forwarding and additional security to the connections. CADET offers
+the same cryptographic services as CORE, but on an end-to-end level. This is
+done so peers retransmitting traffic on behalf of other peers cannot access the
+payload data.
+@itemize @bullet
+@item CADET provides confidentiality with so-called perfect forward secrecy; we
+use ECDHE powered by Curve25519 for the key exchange and then use symmetric
+encryption, encrypting with both AES-256 and Twofish
+@item authentication is achieved by signing the ephemeral keys using Ed25519, a
+deterministic variant of ECDSA
+@item integrity protection (using SHA-512 to do encrypt-then-MAC, although only
+256 bits are sent to reduce overhead)
+@item replay protection (using nonces, timestamps, challenge-response, message
+counters and ephemeral keys)
+@item liveness (keep-alive messages, timeout)
+@end itemize
+In addition to the CORE-like security benefits, CADET offers other properties
+that make it a more universal service than CORE.
+@itemize @bullet
+@item CADET can establish channels to arbitrary peers in GNUnet. If a peer is
+not immediately reachable, CADET will find a path through the network and ask
+other peers to retransmit the traffic on its behalf.
+@item CADET offers (optional) reliability mechanisms. In a reliable channel
+traffic is guaranteed to arrive complete, unchanged and in-order.
+@item CADET takes care of flow and congestion control mechanisms, not allowing
+the sender to send more traffic than the receiver or the network are able to
+process.
+@end itemize
+@menu
+* libgnunetcadet::
+@end menu
+@node libgnunetcadet
+@subsection libgnunetcadet
+The CADET API (defined in gnunet_cadet_service.h) is the messaging API used by
+P2P applications built using GNUnet. It provides applications the ability to
+send and receive encrypted messages to any peer participating in GNUnet. The
+API is heavily based on the CORE API.
+CADET delivers messages to other peers in "channels". A channel is a permanent
+connection defined by a destination peer (identified by its public key) and a
+port number. Internally, CADET tunnels all channels towards a destination peer
+using one session key and relays the data on multiple "connections",
+independent from the channels.
+Each channel has optional parameters, the most important being the reliability
+flag. Should a message get lost on TRANSPORT/CORE level, if the channel was
+created as reliable, CADET will retransmit the lost message and deliver it
+in order to the destination application.
+To communicate with other peers using CADET, it is necessary to first connect
+to the service using @code{GNUNET_CADET_connect}. This function takes several
+parameters in form of callbacks, to allow the client to react to various
+events, like incoming channels or channels that terminate, as well as specify a
+list of ports the client wishes to listen to (at the moment it is not possible
+to start listening on further ports once connected, but nothing prevents a
+client from connecting several times to CADET, even using one connection per
+listening port). The function returns a handle which has to be used for any
+further interaction with the service.
+To connect to a remote peer a client has to call the
+@code{GNUNET_CADET_channel_create} function. The most important parameters
+given are the remote peer's identity (its public key) and a port, which
+specifies which application on the remote peer to connect to, similar to
+TCP/UDP ports. CADET will then find the peer in the GNUnet network and
+establish the proper low-level connections and do the necessary key exchanges
+to ensure an authenticated, secure and verified communication. Similar to
+@code{GNUNET_CADET_connect}, @code{GNUNET_CADET_channel_create} returns a handle
+to interact with the created channel.
+For every message the client wants to send to the remote application,
+@code{GNUNET_CADET_notify_transmit_ready} must be called, indicating the
+channel on which the message should be sent and the size of the message (but
+not the message itself!). Once CADET is ready to send the message, the provided
+callback will fire, and the message contents are provided to this callback.
+Please note that CADET does not provide an explicit notification of when a
+channel is connected. In loosely connected networks, like big wireless mesh
+networks, this can take several seconds, even minutes in the worst case. To be
+alerted when a channel is online, a client can call
+@code{GNUNET_CADET_notify_transmit_ready} immediately after
+@code{GNUNET_CADET_channel_create}. When the callback is activated, it means
+that the channel is online. The callback may give 0 bytes to CADET if no
+message is to be sent.
+If a transmission was requested but before the callback fires it is no longer
+needed, it can be cancelled with
+@code{GNUNET_CADET_notify_transmit_ready_cancel}, which uses the handle given
+back by @code{GNUNET_CADET_notify_transmit_ready}. As in the case of CORE, only
+one message can be requested at a time: a client must not call
+@code{GNUNET_CADET_notify_transmit_ready} again until the callback is called or
+the request is cancelled.
+When a channel is no longer needed, a client can call
+@code{GNUNET_CADET_channel_destroy} to get rid of it. Note that CADET will try
+to transmit all pending traffic before notifying the remote peer of the
+destruction of the channel, including retransmitting lost messages if the
+channel was reliable.
+Incoming channels, channels being closed by the remote peer, and traffic on any
+incoming or outgoing channels are given to the client when CADET executes the
+callbacks given to it at the time of @code{GNUNET_CADET_connect}.
+Finally, when an application no longer wants to use CADET, it should call
+@code{GNUNET_CADET_disconnect}, but first all channels and pending
+transmissions must be closed (otherwise CADET will complain).
+@node GNUnet's NSE subsystem
+@section GNUnet's NSE subsystem
+NSE stands for Network Size Estimation. The NSE subsystem provides other
+subsystems and users with a rough estimate of the number of peers currently
+participating in the GNUnet overlay. The computed value is not a precise number
+as producing a precise number in a decentralized, efficient and secure way is
+impossible. While NSE's estimate is inherently imprecise, NSE also gives the
+expected range. For a peer that has been running in a stable network for a
+while, the real network size will typically (99.7% of the time) be in the range
+of [2/3 estimate, 3/2 estimate]. We will now give an overview of the algorithm
+used to calculate the estimate; all of the details can be found in this
+technical report.
+@menu
+* Motivation::
+* Principle::
+* libgnunetnse::
+* The NSE Client-Service Protocol::
+* The NSE Peer-to-Peer Protocol::
+@end menu
+@node Motivation
+@subsection Motivation
+Some subsystems, like DHT, need to know the size of the GNUnet network to
+optimize some parameters of their own protocol. The decentralized nature of
+GNUnet makes efficiently and securely counting the exact number of peers
+infeasible. Although there are several decentralized algorithms to count the
+number of peers in a system, so far there is none to do so securely. Other
+protocols may allow any malicious peer to manipulate the final result or to
+take advantage of the system to perform DoS (Denial of Service) attacks against
+the network. GNUnet's NSE protocol avoids these drawbacks.
+@menu
+* Security::
+@end menu
+@node Security
+@subsubsection Security
+The NSE subsystem is designed to be resilient against these attacks. It uses
+@uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proofs of work} to
+prevent one peer from impersonating a large number of participants, which would
+otherwise allow an adversary to artificially inflate the estimate. The DoS
+protection comes from the time-based nature of the protocol: the estimates are
+calculated periodically and out-of-time traffic is either ignored or stored for
+later retransmission by benign peers. In particular, peers cannot trigger
+global network communication at will.
+@node Principle
+@subsection Principle
+The algorithm calculates the estimate by finding the globally closest peer ID
+to a random, time-based value.
+The idea is that the closer the ID is to the random value, the more "densely
+packed" the ID space is, and therefore, more peers are in the network.
+@menu
+* Example::
+* Algorithm::
+* Target value::
+* Timing::
+* Controlled Flooding::
+* Calculating the estimate::
+@end menu
+@node Example
+@subsubsection Example
+Suppose all peers have IDs between 0 and 100 (our ID space), and the random
+value is 42. If the closest peer has the ID 70 we can imagine that the average
+"distance" between peers is around 30 and therefore there are around 3 peers in
+the whole ID space. On the other hand, if the closest peer has the ID 44, we
+can imagine that the space is rather packed with peers, maybe as many as 50 of
+them. Naturally, we could have been rather unlucky, and there is only one peer
+which happens to have the ID 44. Thus, the current estimate is calculated as the
+average over multiple rounds, and not just a single sample.
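+The arithmetic of this toy example can be written down directly. The Python
+sketch below is deliberately naive: it divides the size of the ID space by the
+distance of the closest peer and averages plain values, whereas the real
+subsystem uses the formula from the technical report and works with
+logarithmic values. The function names are hypothetical.
```python
def naive_estimate(space, target, closest_id):
    """Estimate the peer count as ID-space size over the closest distance."""
    distance = max(abs(closest_id - target), 1)  # avoid division by zero
    return space / distance


def averaged_estimate(samples):
    """Average single-round estimates to smooth out lucky or unlucky rounds."""
    return sum(samples) / len(samples)
```
+With an ID space of 100 and target 42, a closest ID of 44 yields 50 peers,
+while a closest ID of 70 yields between 3 and 4.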
+@node Algorithm
+@subsubsection Algorithm
+Given that example, one can imagine that the job of the subsystem is to
+efficiently communicate the ID of the closest peer to the target value to all
+the other peers, who will calculate the estimate from it.
+@node Target value
+@subsubsection Target value
+@c %**end of header
+The target value itself is generated by hashing the current time, rounded down
+to an agreed value. If the rounding amount is 1h (default) and the time is
+12:34:56, the time to hash would be 12:00:00. The process is repeated once per
+rounding amount (in this example, every hour). Every repetition is
+called a round.
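+The derivation of the target value can be sketched as follows. This is an
+illustration, not the service's code: the choice of SHA-512 via Python's
+@code{hashlib}, the 8-byte big-endian encoding of the timestamp and the
+function names are all assumptions made for the example.
```python
import hashlib

ROUND_SECONDS = 3600  # rounding amount: one hour


def round_start(unix_time, interval=ROUND_SECONDS):
    """Round a timestamp down to the start of its round."""
    return unix_time - (unix_time % interval)


def target_value(unix_time, interval=ROUND_SECONDS):
    """Hash the rounded time; peers with sane clocks derive the same value."""
    start = round_start(unix_time, interval)
    return hashlib.sha512(start.to_bytes(8, "big")).digest()
```
+Because only the rounded time enters the hash, all peers compute the same
+target without exchanging any messages, and nobody can choose it freely.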
+@node Timing
+@subsubsection Timing
+@c %**end of header
+The NSE subsystem has some timing control to avoid everybody broadcasting its
+ID all at once. Once each peer has the target random value, it compares its own
+ID to the target and calculates the hypothetical size of the network if that
+peer were to be the closest. Then it compares the hypothetical size with the
+estimate from the previous rounds. For each value there is an associated point
+in the period, let's call it "broadcast time". If its own hypothetical estimate
+is the same as the previous global estimate, its "broadcast time" will be in
+the middle of the round. If it is bigger it will be earlier and if it is smaller
+(the most likely case) it will be later. This ensures that the peers closest
+to the target value start broadcasting their ID first.
+@node Controlled Flooding
+@subsubsection Controlled Flooding
+@c %**end of header
+When a peer receives a value, first it verifies that it is closer than the
+closest value it had so far, otherwise it answers the incoming message with a
+message containing the better value. Then it checks a proof of work that must
+be included in the incoming message, to ensure that the other peer's ID is not
+made up (otherwise a malicious peer could claim to have an ID of exactly the
+target value every round). Once validated, it compares the broadcast time of the
+received value with the current time and if it's not too early, sends the
+received value to its neighbors. Otherwise it stores the value until the
+correct broadcast time comes. This prevents unnecessary traffic of sub-optimal
+values, since a better value can come before the broadcast time, rendering the
+previous one obsolete and saving the traffic that would have been used to
+broadcast it to the neighbors.
+@node Calculating the estimate
+@subsubsection Calculating the estimate
+@c %**end of header
+Once the closest ID has been spread across the network each peer gets the exact
+distance between this ID and the target value of the round and calculates the
+estimate with a mathematical formula described in the tech report. The estimate
+generated with this method for a single round is not very precise. Remember the
+case of the example, where the only peer has the ID 44 and we happen to generate
+the target value 42, thinking there are 50 peers in the network. Therefore, the
+NSE subsystem remembers the last 64 estimates and calculates an average over
+them, giving a result which usually has one bit of uncertainty (the real
+size could be half of the estimate or twice as much). Note that the actual
+network size is calculated in powers of two of the raw input, thus one bit of
+uncertainty means a factor of two in the size estimate.
+@node libgnunetnse
+@subsection libgnunetnse
+@c %**end of header
+The NSE subsystem has the simplest API of all services, with only two calls:
+@code{GNUNET_NSE_connect} and @code{GNUNET_NSE_disconnect}.
+The connect call gets a callback function as a parameter and this function is
+called each time the network agrees on an estimate. This usually is once per
+round, with some exceptions: if the closest peer has a late local clock and
+starts spreading its ID after everyone else agreed on a value, the callback
+might be activated twice in a round, the second value being always bigger than
+the first. The default round time is set to 1 hour.
+The disconnect call disconnects from the NSE subsystem and the callback is no
+longer called with new estimates.
+@menu
+* Results::
+* Examples2::
+@end menu
+@node Results
+@subsubsection Results
+@c %**end of header
+The callback provides two values: the average and the
+@uref{http://en.wikipedia.org/wiki/Standard_deviation, standard deviation} of
+the last 64 rounds. The values provided by the callback function are
+logarithmic; this means that the real estimate numbers can be obtained by
+calculating 2 to the power of the given value (2^average). From a statistics
+point of view this means that:
+@itemize @bullet
+@item 68% of the time the real size is included in the interval
+[2^(average-stddev), 2^(average+stddev)]
+@item 95% of the time the real size is included in the interval
+[2^(average-2*stddev), 2^(average+2*stddev)]
+@item 99.7% of the time the real size is included in the interval
+[2^(average-3*stddev), 2^(average+3*stddev)]
+@end itemize
+The expected standard deviation for 64 rounds in a network of stable size is
+0.2. Thus, we can say that normally:
+@itemize @bullet
+@item 68% of the time the real size is in the range [-13%, +15%]
+@item 95% of the time the real size is in the range [-24%, +32%]
+@item 99.7% of the time the real size is in the range [-34%, +52%]
+@end itemize
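+The percentages above follow directly from the logarithmic representation. The
+Python sketch below converts the two callback values into peer counts; the
+function names are hypothetical and not part of @code{libgnunetnse}.
```python
def size_interval(average, stddev, sigmas):
    """Return (low, estimate, high) in peers for a +/- sigmas interval."""
    estimate = 2.0 ** average
    low = 2.0 ** (average - sigmas * stddev)
    high = 2.0 ** (average + sigmas * stddev)
    return low, estimate, high


def relative_range(stddev, sigmas):
    """Return the interval bounds as rounded percentages of the estimate."""
    low = (2.0 ** (-sigmas * stddev) - 1.0) * 100.0
    high = (2.0 ** (sigmas * stddev) - 1.0) * 100.0
    return round(low), round(high)
```
+For a standard deviation of 0.2, @code{relative_range(0.2, 3)} gives
+@code{(-34, 52)}, which matches the 99.7% range listed above.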
+As said in the introduction, we can be quite sure that usually the real size is
+between two thirds and one and a half times the estimate. This can of course
+vary with network conditions. Thus, applications may want to also consider the
+provided standard deviation value, not only the average (in particular, if the
+standard deviation is very high, the average may be meaningless: the network
+size is changing rapidly).
+@node Examples2
+@subsubsection Examples2
+@c %**end of header
+Let's close with a couple of examples.
+@table @asis
+@item Average: 10, std dev: 1
+Here the estimate would be 2^10 = 1024 peers.
+The range in which we can be 95% sure is: [2^8, 2^12] = [256, 4096]. We can be
+very (>99.7%) sure that the network is not a hundred peers and absolutely sure
+that it is not a million peers, but somewhere around a thousand.
+@item Average: 22, std dev: 0.2
+Here the estimate would be 2^22 = 4 million peers.
+The range in which we can be 99.7% sure is: [2^21.4, 2^22.6] = [2.8M, 6.3M].
+We can be sure that the network size is around four million, with absolutely
+no way of it being 1 million.
+@end table
+To put this in perspective, readers may remember that the LHC Higgs boson
+results were announced with "5 sigma" and "6 sigma" certainties. In this case a
+5 sigma minimum would be 2 million and a 6 sigma minimum, 1.8 million.
+@node The NSE Client-Service Protocol
+@subsection The NSE Client-Service Protocol
+@c %**end of header
+As with the API, the client-service protocol is very simple; it has only two
+different messages, defined in @code{src/nse/nse.h}:
+@itemize @bullet
+@item @code{GNUNET_MESSAGE_TYPE_NSE_START}@ This message has no parameters and
+is sent from the client to the service upon connection.
+@item @code{GNUNET_MESSAGE_TYPE_NSE_ESTIMATE}@ This message is sent from the
+service to the client for every new estimate and upon connection. It contains a
+timestamp for the estimate, the average and the standard deviation for the
+respective round.
+@end itemize
+When the @code{GNUNET_NSE_disconnect} API call is executed, the client simply
+disconnects from the service, with no message involved.
+@node The NSE Peer-to-Peer Protocol
+@subsection The NSE Peer-to-Peer Protocol
+@c %**end of header
+The NSE subsystem only has one message in the P2P protocol, the
+@code{GNUNET_MESSAGE_TYPE_NSE_P2P_FLOOD} message.
+The key contents of this message are the timestamp to identify the round
+(differences in system clocks may cause some peers to send messages way too
+early or way too late, so the timestamp allows other peers to identify such
+messages easily),
+the @uref{http://en.wikipedia.org/wiki/Proof-of-work_system, proof of work}
+used to make it difficult to mount a
+@uref{http://en.wikipedia.org/wiki/Sybil_attack, Sybil attack}, and the public
+key, which is used to verify the signature on the message.
+Every peer stores a message for the previous, current and next round. The
+messages for the previous and current round are given to peers that connect to
+us. The message for the next round is simply stored until our system clock
+advances to the next round. The message for the current round is what we are
+flooding the network with right now. At the beginning of each round the peer
+does the following:
+@itemize @bullet
+@item calculates its own distance to the target value
+@item creates, signs and stores the message for the current round (unless it
+has a better message in the "next round" slot which came early in the previous
+round)
+@item calculates, based on the stored round message (own or received) when to
+start flooding it to its neighbors
+@end itemize
+Upon receiving a message the peer checks the validity of the message (round,
+proof of work, signature). The next action depends on the contents of the
+incoming message:
+@itemize @bullet
+@item if the message is worse than the current stored message, the peer sends
+the current message back immediately, to stop the other peer from spreading
+suboptimal results
+@item if the message is better than the current stored message, the peer stores
+the new message and calculates the new target time to start spreading it to its
+neighbors (excluding the one the message came from)
+@item if the message is for the previous round, it is compared to the message
+stored in the "previous round slot", which may then be updated
+@item if the message is for the next round, it is compared to the message
+stored in the "next round slot", which again may then be updated
+@end itemize
+Finally, when it comes to sending the stored message for the current round to
+the neighbors, a random delay is added for each neighbor, to avoid traffic
+spikes and minimize cross-messages.
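+The message handling rules of this section can be summarised in a small state
+model. The Python sketch below is a simplification under stated assumptions: a
+message is reduced to its round number and its distance to the target (smaller
+is better), validation of proof of work and signature is assumed to have
+happened already, and the class name, method names and returned action strings
+are hypothetical.
```python
class RoundSlots:
    """Best known distance for the previous, current and next round."""

    def __init__(self, current_round):
        self.current_round = current_round
        self.best = {}  # round number -> smallest known distance

    def receive(self, round_no, distance):
        """Process a validated message and return the action taken."""
        if abs(round_no - self.current_round) > 1:
            return "ignore"  # neither previous, current nor next round
        known = self.best.get(round_no)
        if known is not None and distance == known:
            return "duplicate"
        if known is not None and distance > known:
            # The incoming message is worse than what we have stored.
            if round_no == self.current_round:
                return "reply-with-stored"
            return "keep"
        self.best[round_no] = distance
        if round_no == self.current_round:
            return "store-and-reschedule"
        return "store"

    def advance(self):
        """Move to the next round and drop slots that are now too old."""
        self.current_round += 1
        self.best = {r: d for r, d in self.best.items()
                     if abs(r - self.current_round) <= 1}
```
+A message that arrived early for the next round stays in its slot and becomes
+the current round's candidate once @code{advance} is called.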
+@node GNUnet's HOSTLIST subsystem
+@section GNUnet's HOSTLIST subsystem
+@c %**end of header
+Peers in the GNUnet overlay network need address information so that they can
+connect with other peers. GNUnet uses so-called HELLO messages to store and
+exchange peer addresses. GNUnet provides several methods for peers to obtain
+this information:
+@itemize @bullet
+@item out-of-band exchange of HELLO messages (manually, using for example
+gnunet-peerinfo)
+@item HELLO messages shipped with GNUnet (automatic with distribution)
+@item UDP neighbor discovery in LAN (IPv4 broadcast, IPv6 multicast)
+@item topology gossiping (learning from other peers we already connected to),
+and
+@item the HOSTLIST daemon covered in this section, which is particularly
+relevant for bootstrapping new peers.
+@end itemize
+New peers have no existing connections (and thus cannot learn from gossip among
+peers), may not have other peers in their LAN and might be started with an
+outdated set of HELLO messages from the distribution. In this case, getting new
+peers to connect to the network requires either manual effort or the use of a
+HOSTLIST to obtain HELLOs.
+@menu
+* HELLOs::
+* Overview for the HOSTLIST subsystem::
+* Interacting with the HOSTLIST daemon::
+* Hostlist security address validation::
+* The HOSTLIST daemon::
+* The HOSTLIST server::
+* The HOSTLIST client::
+* Usage::
+@end menu
+@node HELLOs
+@subsection HELLOs
+@c %**end of header
+The basic information peers require to connect to other peers is contained in
+so-called HELLO messages, which you can think of as business cards. Besides the
+identity of the peer (based on the cryptographic public key) a HELLO message
+may contain address information that specifies ways to contact a peer. By
+obtaining HELLO messages, a peer can learn how to contact other peers.
+@node Overview for the HOSTLIST subsystem
+@subsection Overview for the HOSTLIST subsystem
+@c %**end of header
+The HOSTLIST subsystem provides a way to distribute and obtain contact
+information to connect to other peers using a simple HTTP GET request. Its
+implementation is split in three parts, the main file for the daemon itself
+(gnunet-daemon-hostlist.c), the HTTP client used to download peer information
+(hostlist-client.c) and the server component used to provide this information
+to other peers (hostlist-server.c). The server is basically a small HTTP web
+server (based on GNU libmicrohttpd) which provides a list of HELLOs known to
+the local peer for download. The client component is basically an HTTP client
+(based on libcurl) which can download hostlists from one or more websites. The
+hostlist format is a binary blob containing a sequence of HELLO messages. Note
+that any HTTP server can theoretically serve a hostlist; the built-in hostlist
+server simply makes it convenient to offer this service.
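+Splitting such a blob into individual messages can be sketched as follows. The
+sketch assumes that every HELLO starts with the usual GNUnet message header (a
+16-bit size that includes the header, followed by a 16-bit type, both in
+network byte order); the function name is hypothetical and no HELLO contents
+are parsed or validated.
```python
import struct

HEADER = struct.Struct("!HH")  # size (including header), type


def split_messages(blob):
    """Split a byte string into (type, full_message) tuples."""
    messages = []
    offset = 0
    while offset < len(blob):
        if len(blob) - offset < HEADER.size:
            raise ValueError("truncated header")
        size, msg_type = HEADER.unpack_from(blob, offset)
        if size < HEADER.size or offset + size > len(blob):
            raise ValueError("invalid message size")
        messages.append((msg_type, blob[offset:offset + size]))
        offset += size
    return messages
```
+Rejecting sizes smaller than the header guarantees that the loop always makes
+progress, which matters because a downloaded hostlist is untrusted input.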
+@menu
+* Features::
+* Limitations2::
+@end menu
+@node Features
+@subsubsection Features
+@c %**end of header
+The HOSTLIST daemon can:
+@itemize @bullet
+@item provide HELLO messages with validated addresses obtained from PEERINFO to
+download for other peers
+@item download HELLO messages and forward these messages to the TRANSPORT
+subsystem for validation
+@item advertise the URL of this peer's hostlist address to other peers via
+gossip
+@item automatically learn about hostlist servers from the gossip of other peers
+@end itemize
+@node Limitations2
+@subsubsection Limitations2
+@c %**end of header
+The HOSTLIST daemon does not:
+@itemize @bullet
+@item verify the cryptographic information in the HELLO messages
+@item verify the address information in the HELLO messages
+@end itemize
+@node Interacting with the HOSTLIST daemon
+@subsection Interacting with the HOSTLIST daemon
+@c %**end of header
+The HOSTLIST subsystem is currently implemented as a daemon, so users do not
+need to interact with it; consequently, there is no command line tool and no
+API to communicate with the daemon. In the future, we can envision changing
+this to allow users to manually trigger the download of a hostlist.
+Since there is no command line interface to HOSTLIST, the only way to inspect
+or influence it is to use STATISTICS to obtain or modify information about the
+status of HOSTLIST:
+@example
+$ gnunet-statistics -s hostlist
+@end example
+In particular, HOSTLIST includes a @strong{persistent} value in statistics that
+specifies when the hostlist server might be queried next. As this value
+increases exponentially during runtime, developers may want to reset or
+manually adjust it. Note that HOSTLIST (but not STATISTICS) needs to be shut
+down if changes to this value are to have any effect on the daemon (as
+HOSTLIST does not monitor STATISTICS for changes to the download
+frequency).
+@node Hostlist security address validation
+@subsection Hostlist security: address validation
+@c %**end of header
+Since information obtained from other parties cannot be trusted without
+validation, we have to distinguish between @emph{validated} and @emph{not
+validated} addresses. Before using (and so trusting) information from other
+parties, this information has to be double-checked (validated). Address
+validation is not done by HOSTLIST but by the TRANSPORT service.
+The HOSTLIST component is functionally located between the PEERINFO and the
+TRANSPORT subsystems. When acting as a server, the daemon obtains valid
+(@emph{validated}) peer information (HELLO messages) from the PEERINFO service
+and provides it to other peers. When acting as a client, it contacts the
+HOSTLIST servers specified in the configuration, downloads the (unvalidated)
+list of HELLO messages and forwards this information to the TRANSPORT service
+to validate the addresses.
+@node The HOSTLIST daemon
+@subsection The HOSTLIST daemon
+@c %**end of header
+The hostlist daemon is the main component of the HOSTLIST subsystem. It is
+started by the ARM service and (if configured) starts the HOSTLIST client and
+server components.
+If the daemon provides a hostlist itself, it can advertise its own hostlist to
+other peers. To do so, it sends a GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT
+message to other peers when they connect to this peer on the CORE level. This
+hostlist advertisement message contains the URL to access the HOSTLIST HTTP
+server of the sender. The daemon may also subscribe to this type of message
+from the CORE service and then forward such messages to the HOSTLIST
+client. The client then uses all available URLs to download peer information
+when necessary.
+When starting, the HOSTLIST daemon first connects to the CORE subsystem and, if
+hostlist learning is enabled, registers a CORE handler to receive
+advertisement messages. Next, it starts (if configured) the client and server.
+It passes them pointers in which the client and server store their CORE
+connect, disconnect and receive handlers, so the daemon can notify them about
+CORE events.
+To clean up on shutdown, the daemon has a cleanup task that shuts down all
+subsystems and disconnects from CORE.
+@node The HOSTLIST server
+@subsection The HOSTLIST server
+@c %**end of header
+The server provides a way for other peers to obtain HELLOs. Basically, it is a
+small web server that other peers can connect to in order to download a list
+of HELLOs using standard HTTP; it may also advertise the URL of the hostlist to
+other peers connecting on the CORE level.
+@menu
+* The HTTP Server::
+* Advertising the URL::
+@end menu
+@node The HTTP Server
+@subsubsection The HTTP Server
+@c %**end of header
+During startup, the server starts a web server listening on the port specified
+with the HTTPPORT value (default 8080). In addition it connects to the PEERINFO
+service to obtain peer information. The HOSTLIST server uses the
+GNUNET_PEERINFO_iterate function to request HELLO information for all peers and
+adds their information to a new hostlist if they are suitable (expired
+addresses and HELLOs without addresses are both not suitable) and the maximum
+size for a hostlist is not exceeded (MAX_BYTES_PER_HOSTLISTS = 500000). When
+PEERINFO finishes (with a last NULL callback), the server destroys the previous
+hostlist response available for download on the web server and replaces it with
+the updated hostlist. The hostlist format is basically a sequence of HELLO
+messages (as obtained from PEERINFO) without any special tokenization. Since
+each HELLO message contains a size field, the response can easily be split into
+separate HELLO messages by the client.
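+The splitting step can be sketched as follows. This is a minimal,
+self-contained illustration and not the code from hostlist-client.c. It
+assumes only that every message begins with the usual GNUnet message header (a
+16-bit size followed by a 16-bit type, both in network byte order); the names
+@code{split_hostlist}, @code{on_message} and @code{count_cb} are hypothetical.
+@example
+#include <stddef.h>
+#include <stdint.h>
+
+/* Hypothetical callback type: receives one complete message. */
+typedef void (*on_message) (const uint8_t *msg, uint16_t size, void *cls);
+
+/* Read a 16-bit value stored in network byte order (big endian). */
+static uint16_t
+read_be16 (const uint8_t *p)
+@{
+  return (uint16_t) ((p[0] << 8) | p[1]);
+@}
+
+/* Walk a downloaded blob and pass each complete message to cb.
+   Returns the number of bytes consumed; any bytes after that offset
+   belong to a message that is malformed or not fully downloaded. */
+static size_t
+split_hostlist (const uint8_t *buf, size_t len, on_message cb, void *cls)
+@{
+  size_t off = 0;
+
+  while (len - off >= 4)
+  @{
+    uint16_t size = read_be16 (buf + off);
+
+    if ((size < 4) || (size > len - off))
+      break;
+    cb (buf + off, size, cls);
+    off += size;
+  @}
+  return off;
+@}
+
+/* Example callback: counts the messages it is given. */
+static void
+count_cb (const uint8_t *msg, uint16_t size, void *cls)
+@{
+  (void) msg;
+  (void) size;
+  (*(unsigned int *) cls)++;
+@}
+@end example
+In such a design, a client would call @code{split_hostlist} whenever new data
+arrives and keep the unconsumed tail of the buffer for the next call.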
+A HOSTLIST client connecting to the HOSTLIST server will receive the hostlist
+as an HTTP response, and the server will terminate the connection with the
+result code HTTP 200 OK. The connection will be closed immediately if no
+hostlist is available.
+@node Advertising the URL
+@subsubsection Advertising the URL
+@c %**end of header
+The server also advertises the URL to download the hostlist to other peers if
+hostlist advertisement is enabled. When a new peer connects and has hostlist
+learning enabled, the server sends a GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT
+message to this peer using the CORE service.
+@node The HOSTLIST client
+@subsection The HOSTLIST client
+@c %**end of header
+The client provides the functionality to download the list of HELLOs from a set
+of URLs. It performs a standard HTTP request to the URLs configured and learned
+from advertisement messages received from other peers. When a HELLO is
+downloaded, the HOSTLIST client forwards the HELLO to the TRANSPORT service for
+validation.
+The client supports two modes of operation: download of HELLOs (bootstrapping)
+and learning of URLs.
+@menu
+* Bootstrapping::
+* Learning::
+@end menu
+@node Bootstrapping
+@subsubsection Bootstrapping
+@c %**end of header
+For bootstrapping, the client schedules a task to download the hostlist from
+the set of known URLs. Downloads are only performed if the number of current
+connections is smaller than a minimum number of connections (at the moment 4).
+The interval between downloads increases exponentially; however, once the
+interval would become longer than an hour, its growth is limited: the interval
+is then capped at (number of connections * 1h).
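+The download schedule described above can be sketched as follows. This is an
+illustration under stated assumptions rather than the daemon's actual code: it
+assumes that the interval starts at one second and doubles after every
+attempt, and it treats zero connections like one connection so that the cap
+never drops to zero. The function and macro names are hypothetical.
+@example
+#include <stdint.h>
+
+#define MIN_CONNECTIONS 4      /* threshold mentioned in the text */
+#define ONE_HOUR_S 3600u       /* one hour, in seconds */
+
+/* Downloads are only attempted while we have too few connections. */
+static int
+should_download (unsigned int connections)
+@{
+  return connections < MIN_CONNECTIONS;
+@}
+
+/* Compute the next download interval in seconds.
+   Assumption: the interval starts at 1 s and doubles each time. */
+static uint64_t
+next_interval (uint64_t current_s, unsigned int connections)
+@{
+  /* Cap: number of connections * 1h (at least one hour, by assumption). */
+  uint64_t cap = (uint64_t) ONE_HOUR_S * (connections > 0 ? connections : 1);
+  uint64_t next = (0 == current_s) ? 1 : current_s * 2;
+
+  if (next > cap)
+    next = cap;
+  return next;
+@}
+@end example
+Because the cap grows with the number of connections, a peer in this sketch
+backs off further the better connected it already is.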
+Once the decision has been taken to download HELLOs, the daemon chooses a
+random URL from the list of known URLs. URLs can be configured in the
+configuration or be learned from advertisement messages. The client uses an
+HTTP client library (libcurl) to initiate the download using the libcurl multi
+interface. Libcurl passes the data to the callback_download function, which
+stores the data in a buffer if space is available and the maximum size for a
+hostlist download is not exceeded (MAX_BYTES_PER_HOSTLISTS = 500000). When a
+full HELLO has been downloaded, the HOSTLIST client offers this HELLO message
+to the TRANSPORT service for validation. When the download has finished or
+failed, statistical information about the quality of this URL is updated.
+@node Learning
+@subsubsection Learning
+@c %**end of header
+The client also manages hostlist advertisements from other peers. The HOSTLIST
+daemon forwards GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT messages to the
+client subsystem, which extracts the URL from the message. Next, a test of the
+newly obtained URL is performed by triggering a download from the new URL. If
+the URL works correctly, it is added to the list of working URLs.
+The size of the list of URLs is restricted, so if an additional server is added
+and the list is full, the URL with the worst quality ranking (determined, for
+example, by the number of successful downloads and the number of HELLOs
+obtained) is discarded. During shutdown, the list of URLs is saved to a file
+for persistence; it is loaded again on startup. URLs from the configuration
+file are never discarded.
+@node Usage
+@subsection Usage
+@c %**end of header
+To start HOSTLIST by default, it has to be added to the DEFAULTSERVICES section
+for the ARM services. This is done in the default configuration.
+For more information on how to configure the HOSTLIST subsystem, see the
+sections ``Configuring the hostlist to bootstrap'' and ``Configuring your peer
+to provide a hostlist'' in the installation handbook.
+@node GNUnet's IDENTITY subsystem
+@section GNUnet's IDENTITY subsystem
+@c %**end of header
+Identities of "users" in GNUnet are called egos. Egos can be used as pseudonyms
+(fake names) or be tied to an organization (for example, GNU) or even the
+actual identity of a human. GNUnet users are expected to have many egos. They
+might have one tied to their real identity, some for organizations they manage,
+and more for different domains where they want to operate under a pseudonym.
+The IDENTITY service allows users to manage their egos. The identity service
+manages the private keys of the egos of the local user; it does not manage
+identities of other users (public keys). Public keys for other users need names
+to become manageable. GNUnet uses the GNU Name System (GNS) to give names to
+other users and manage their public keys securely. This chapter is about the
+IDENTITY service, which is about the management of private keys.
+On the network, an ego corresponds to an ECDSA key (over Curve25519, using RFC
+6979, as required by GNS). Thus, users can perform actions under a particular
+ego by using (signing with) a particular private key. Other users can then
+confirm that the action was really performed by that ego by checking the
+signature against the respective public key.
+The IDENTITY service allows users to associate a human-readable name with each
+ego. This way, users can use names that will remind them of the purpose of a
+particular ego. The IDENTITY service stores the respective private keys and
+allows applications to access key information by name. Users can change the
+name that is locally (!) associated with an ego. Egos can also be deleted,
+which means that the private key will be removed, so it will no longer be
+possible to perform actions with that ego in the future.
+Additionally, the IDENTITY subsystem can associate service functions with egos.
+For example, GNS requires the ego that should be used for the shorten zone. GNS
+will ask IDENTITY for an ego for the "gns-short" service. The IDENTITY service
+has a mapping of such service strings to the name of the ego that the user
+wants to use for this service, for example "my-short-zone-ego".
+Finally, the IDENTITY API provides access to a special ego, the anonymous ego.
+The anonymous ego is special in that its private key is not really private, but
+fixed and known to everyone. Thus, anyone can perform actions as anonymous.
+This is useful because code then does not have to contain a special case to
+distinguish between anonymous and pseudonymous egos.
+@menu
+* libgnunetidentity::
+* The IDENTITY Client-Service Protocol::
+@end menu
+@node libgnunetidentity
+@subsection libgnunetidentity
+@c %**end of header
+@menu
+* Connecting to the service::
+* Operations on Egos::
+* The anonymous Ego::
+* Convenience API to lookup a single ego::
+* Associating egos with service functions::
+@end menu
+@node Connecting to the service
+@subsubsection Connecting to the service
+@c %**end of header
+First, typical clients connect to the identity service using
+@code{GNUNET_IDENTITY_connect}. This function takes a callback as a parameter.
+If the given callback parameter is non-NULL, it will be invoked to notify the
+application about the current state of the identities in the system.
+@itemize @bullet
+@item First, it will be invoked on all known egos at the time of the
+connection. For each ego, a handle to the ego and the user's name for the ego
+will be passed to the callback. Furthermore, a @code{void **} context argument
+will be provided which gives the client the opportunity to associate some state
+with the ego.
+@item Second, the callback will be invoked with NULL for the ego, the name and
+the context. This signals that the (initial) iteration over all egos has
+completed.
+@item Then, the callback will be invoked whenever something changes about an
+ego. If an ego is renamed, the callback is invoked with the ego handle of the
+ego that was renamed, and the new name. If an ego is deleted, the callback is
+invoked with the ego handle and a name of NULL. In the deletion case, the
+application should also release resources stored in the context.
+@item When the application destroys the connection to the identity service
+using @code{GNUNET_IDENTITY_disconnect}, the callback is again invoked for each
+ego with a name of NULL (equivalent to deletion of the egos). This should again
+be used to clean up the per-ego context.
+@end itemize
+The ego handle passed to the callback remains valid until the callback is
+invoked with a name of NULL, so it is safe to store a reference to the ego's
+handle.
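+The callback behaviour described above can be sketched as follows. This is a
+self-contained illustration that does not link against libgnunetidentity:
+@code{struct Ego} is an opaque stand-in for the library's ego handle, the
+argument order of @code{ego_cb} (closure, ego, context, name) is an assumption,
+and all names in the example are hypothetical. The per-ego context is used
+here to keep a private copy of the ego's current name.
+@example
+#include <stddef.h>
+#include <stdlib.h>
+#include <string.h>
+
+struct Ego;   /* opaque stand-in for the library's ego handle */
+
+enum EgoEvent
+@{
+  EGO_EVENT_END_OF_LIST,   /* initial iteration has completed */
+  EGO_EVENT_REMOVED,       /* ego deleted, or we are disconnecting */
+  EGO_EVENT_UPDATED        /* ego listed, created or renamed */
+@};
+
+/* Map the (ego, name) pair of one callback invocation to an event. */
+static enum EgoEvent
+classify (const struct Ego *ego, const char *name)
+@{
+  if (NULL == ego)
+    return EGO_EVENT_END_OF_LIST;
+  if (NULL == name)
+    return EGO_EVENT_REMOVED;
+  return EGO_EVENT_UPDATED;
+@}
+
+/* Example callback; cls points to a counter of currently known egos. */
+static void
+ego_cb (void *cls, struct Ego *ego, void **ctx, const char *name)
+@{
+  unsigned int *live = cls;
+
+  switch (classify (ego, name))
+  @{
+  case EGO_EVENT_END_OF_LIST:
+    break;                      /* ctx is NULL here, do not touch it */
+  case EGO_EVENT_REMOVED:
+    if (NULL != *ctx)
+    @{
+      free (*ctx);              /* release the per-ego state */
+      *ctx = NULL;
+      (*live)--;
+    @}
+    break;
+  case EGO_EVENT_UPDATED:
+    @{
+      char *copy = malloc (strlen (name) + 1);
+
+      if (NULL == copy)
+        return;                 /* out of memory: keep the old state */
+      strcpy (copy, name);
+      if (NULL == *ctx)
+        (*live)++;              /* first time this ego is reported */
+      free (*ctx);
+      *ctx = copy;
+    @}
+    break;
+  @}
+@}
+@end example
+Handling deletion and disconnect in the same branch mirrors the description
+above, where both are signalled by a name of NULL.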
+@node Operations on Egos
+@subsubsection Operations on Egos
+@c %**end of header
+Given an ego handle, the main operations are to get its associated private key
+using @code{GNUNET_IDENTITY_ego_get_private_key} or its associated public key
+using @code{GNUNET_IDENTITY_ego_get_public_key}.
+The other operations on egos are pretty straightforward. Using
+@code{GNUNET_IDENTITY_create}, an application can request the creation of an
+ego by specifying the desired name. The operation will fail if that name is
+already in use. Using @code{GNUNET_IDENTITY_rename} the name of an existing ego
+can be changed. Finally, egos can be deleted using
+@code{GNUNET_IDENTITY_delete}. All of these operations will trigger updates to
+the callback given to the @code{GNUNET_IDENTITY_connect} function of all
+applications that are connected with the identity service at the time.
+@code{GNUNET_IDENTITY_cancel} can be used to cancel the operations before the
+respective continuations would be called. Cancellation does not guarantee that
+the operation will not be completed anyway; it only guarantees that the
+continuation will no longer be called.
+@node The anonymous Ego
+@subsubsection The anonymous Ego
+@c %**end of header
+A special way to obtain an ego handle is to call
+@code{GNUNET_IDENTITY_ego_get_anonymous}, which returns an ego for the
+"anonymous" user --- anyone knows and can get the private key for this user, so
+it is suitable for operations that are supposed to be anonymous but require
+signatures (for example, to avoid a special path in the code). The anonymous
+ego is always valid and accessing it does not require a connection to the
+identity service.
+@node Convenience API to lookup a single ego
+@subsubsection Convenience API to lookup a single ego
+As applications commonly only need to look up a single ego, there is a
+convenience API to do just that. Use @code{GNUNET_IDENTITY_ego_lookup} to
+look up a single ego by name. Note that this is the user's name for the ego,
+not the service function. The resulting ego will be returned via a callback and
+will only be valid during that callback. The operation can be cancelled via
+@code{GNUNET_IDENTITY_ego_lookup_cancel} (cancellation is only legal before the
+callback is invoked).
+@node Associating egos with service functions
+@subsubsection Associating egos with service functions
+The @code{GNUNET_IDENTITY_set} function is used to associate a particular ego
+with a service function. The name used by the service and the ego are given as
+arguments. Afterwards, the service can use its name to look up the associated
+ego using @code{GNUNET_IDENTITY_get}.
+@node The IDENTITY Client-Service Protocol
+@subsection The IDENTITY Client-Service Protocol
+@c %**end of header
+A client connecting to the identity service first sends a message with type
+@code{GNUNET_MESSAGE_TYPE_IDENTITY_START} to the service. After that, the
+client will receive information about changes to the egos by receiving messages
+of type @code{GNUNET_MESSAGE_TYPE_IDENTITY_UPDATE}. Those messages contain the
+private key of the ego and the user's name of the ego (or zero bytes for the
+name to indicate that the ego was deleted). A special bit @code{end_of_list} is
+used to indicate the end of the initial iteration over the identity service's
+egos.
+The client can trigger changes to the egos by sending CREATE, RENAME or DELETE
+messages. The CREATE message contains the private key and the desired name. The
+RENAME message contains the old name and the new name. The DELETE message only
+needs to include the name of the ego to delete. The service responds to each of
+these messages with a RESULT_CODE message which indicates success or error of
+the operation, and possibly a human-readable error message.
+Finally, the client can bind the name of a service function to an ego by
+sending a SET_DEFAULT message with the name of the service function and the
+private key of the ego. Such bindings can then be resolved using a GET_DEFAULT
+message, which includes the name of the service function. The identity service
+will respond to a GET_DEFAULT request with a SET_DEFAULT message containing the
+respective information, or with a RESULT_CODE to indicate an error.
+@node GNUnet's NAMESTORE Subsystem
+@section GNUnet's NAMESTORE Subsystem
+@c %**end of header
+The NAMESTORE subsystem provides persistent storage for local GNS zone
+information. All local GNS zone information is managed by NAMESTORE. It
+provides the functionality both to administer local GNS information (e.g.
+delete and add records) and to retrieve GNS information (e.g. to list
+name information in a client). NAMESTORE only manages the persistent
+storage of zone information belonging to the user running the service: GNS
+information from other users obtained from the DHT is stored by the NAMECACHE
+subsystem.
+NAMESTORE uses a plugin-based database backend to store GNS information with
+good performance. sqlite, MySQL and PostgreSQL are the supported database
+backends. NAMESTORE clients interact with the IDENTITY subsystem to obtain
+cryptographic information about zones based on egos, as described for the
+IDENTITY subsystem, but internally NAMESTORE refers to zones using the ECDSA
+private key. In addition, it collaborates with the NAMECACHE subsystem: when
+local information is modified, it stores the zone information in the GNS
+cache to increase look-up performance for local information.
+NAMESTORE provides functionality to look up and store records, to iterate over
+a specific zone or all zones, and to monitor zones for changes. NAMESTORE
+functionality can be accessed using the NAMESTORE API or the NAMESTORE command
+line tool.
+@menu
+* libgnunetnamestore::
+@end menu
+@node libgnunetnamestore
+@subsection libgnunetnamestore
+@c %**end of header
+To interact with NAMESTORE, clients first connect to the NAMESTORE service
+using @code{GNUNET_NAMESTORE_connect}, passing a configuration handle. As a
+result, they obtain a NAMESTORE handle which they can use for operations; NULL
+is returned if the connection failed.
+To disconnect from NAMESTORE, clients use @code{GNUNET_NAMESTORE_disconnect}
+and specify the handle to disconnect.
+NAMESTORE internally uses the ECDSA private key to refer to zones. These
+private keys can be obtained from the IDENTITY subsystem. Here, @emph{egos}
+can be used to refer to zones, or the default ego assigned to the GNS subsystem
+can be used to obtain the master zone's private key.
+@menu
+* Editing Zone Information::
+* Iterating Zone Information::
+* Monitoring Zone Information::
+@end menu
+@node Editing Zone Information
+@subsubsection Editing Zone Information
+@c %**end of header
+NAMESTORE provides functions to look up records stored under a label in a zone
+and to store records under a label in a zone.
+To store (and delete) records, the client uses the
+@code{GNUNET_NAMESTORE_records_store} function and has to provide the namestore
+handle to use, the private key of the zone, the label to store the records
+under, the records and the number of records, plus a callback function. After
+the operation is performed, NAMESTORE will call the provided callback function
+with the result: GNUNET_SYSERR on failure (including timeout/queue
+drop/failure to validate), GNUNET_NO if content was already there or not
+found, and GNUNET_YES (or another positive value) on success. The callback
+also receives an additional error message.
+Records are deleted by using the store command with 0 records to store. It is
+important to note that records are not merged when records already exist under
+the label: the stored records replace the existing ones. So a client that wants
+to add a record first has to retrieve the existing records, merge them with the
+new ones and then store the result.
+To perform a lookup operation, the client uses the
+@code{GNUNET_NAMESTORE_records_lookup} function. Here, the client has to pass
+the namestore handle, the private key of the zone and the label. It also has to
+provide a callback function which will be called with the result of the lookup
+operation: the zone for the records, the label, and the records including the
+number of records included.
+A special operation is used to set the preferred nickname for a zone. This
+nickname is stored with the zone and is automatically merged with all labels
+and records stored in a zone. Here, the client uses the
+@code{GNUNET_NAMESTORE_set_nick} function and passes the private key of the
+zone, the nickname as a string, plus a callback to be called with the result
+of the operation.
+@node Iterating Zone Information
+@subsubsection Iterating Zone Information
+@c %**end of header
+A client can iterate over all information in a zone or all zones managed by
+NAMESTORE. Here, a client uses the @code{GNUNET_NAMESTORE_zone_iteration_start}
+function and passes the namestore handle, the zone to iterate over and a
+callback function to call with the result. If the client wants to iterate over
+all zones, it passes NULL for the zone. A @code{GNUNET_NAMESTORE_ZoneIterator}
+handle is returned to be used to continue iteration.
+NAMESTORE calls the callback for every result and expects the client to call
+@code{GNUNET_NAMESTORE_zone_iterator_next} to continue to iterate or
+@code{GNUNET_NAMESTORE_zone_iterator_stop} to interrupt the iteration. When
+NAMESTORE has reached the last item, it will call the callback with a NULL
+value to indicate the end of the iteration.
+@node Monitoring Zone Information
+@subsubsection Monitoring Zone Information
+@c %**end of header
+Clients can also monitor zones to be notified about changes. Here, the client
+uses the @code{GNUNET_NAMESTORE_zone_monitor_start} function and passes the
+private key of the zone and a callback function to call with updates for a
+zone. The client can specify to obtain zone information first by iterating over
+the zone and specify a synchronization callback to be called when the client
+and the namestore are synced.
+On an update, NAMESTORE will call the callback with the private key of the
+zone, the label and the records and their number.
+To stop monitoring, the client calls @code{GNUNET_NAMESTORE_zone_monitor_stop}
+and passes the handle obtained from the function to start the monitoring.
+@node GNUnet's PEERINFO subsystem
+@section GNUnet's PEERINFO subsystem
+@c %**end of header
+The PEERINFO subsystem is used to store verified (validated) information about
+known peers in a persistent way. It obtains these addresses, for example, from
+the TRANSPORT service, which is in charge of address validation. Validation
+means that the information in the HELLO message is checked by connecting to the
+addresses and performing a cryptographic handshake to authenticate the peer
+instance claiming to be reachable at these addresses. PEERINFO does not
+validate the HELLO messages itself but only stores them and gives them to
+interested clients.
+As future work, we are considering moving from storing just HELLO messages to
+providing a generic persistent per-peer information store. More and more
+subsystems need to store per-peer information in a persistent way. To avoid
+duplicating this functionality, we plan to provide a PEERSTORE service
+providing this functionality.
+@menu
+* Features2::
+* Limitations3::
+* DeveloperPeer Information::
+* Startup::
+* Managing Information::
+* Obtaining Information::
+* The PEERINFO Client-Service Protocol::
+* libgnunetpeerinfo::
+@end menu
+@node Features2
+@subsection Features
+@c %**end of header
+@itemize @bullet
+@item Persistent storage
+@item Client notification mechanism on update
+@item Periodic clean up for expired information
+@item Differentiation between public and friend-only HELLO
+@end itemize
+@node Limitations3
+@subsection Limitations
+@itemize @bullet
+@item Does not perform HELLO validation
+@end itemize
+@node DeveloperPeer Information
+@subsection Peer Information
+@c %**end of header
+The PEERINFO subsystem stores this information in the form of HELLO messages,
+which you can think of as business cards. These HELLO messages contain the
+public key of a peer and the addresses a peer can be reached under. The
+addresses include an expiration date describing how long they are valid. This
+information is updated regularly by the TRANSPORT service by revalidating the
+address. If an address is expired and not renewed, it can be removed from the
+HELLO message.
+Some peers do not want to have their HELLO messages distributed to other peers,
+especially when GNUnet's friend-to-friend mode is enabled. To prevent this
+undesired distribution, PEERINFO distinguishes between @emph{public} and
+@emph{friend-only} HELLO messages. Public HELLO messages can be freely
+distributed to other (possibly unknown) peers (for example using the hostlist,
+gossiping, broadcasting), whereas friend-only HELLO messages may not be
+distributed to other peers. Friend-only HELLO messages have an additional flag
+@code{friend_only} set internally. For public HELLO messages this flag is not
+set. PEERINFO does not and cannot check if a client is allowed to obtain a
+specific HELLO type.
+The HELLO messages can be managed using the GNUnet HELLO library. Other GNUnet
+subsystems can obtain this information from PEERINFO and use it for their
+purposes. Clients are, for example, the HOSTLIST component, which provides this
+information to other peers in the form of a hostlist, or the TRANSPORT
+subsystem, which uses this information to maintain connections to other peers.
+@node Startup
+@subsection Startup
+@c %**end of header
+During startup, the PEERINFO service loads persistent HELLOs from disk. First,
+PEERINFO parses the directory configured in the HOSTS value of the
+@code{PEERINFO} configuration section to store PEERINFO information. For all
+files found in this directory, valid HELLO messages are extracted. In addition,
+it loads HELLO messages shipped with the GNUnet distribution. These HELLOs are
+used to simplify network bootstrapping by providing valid peer information with
+the distribution. The use of these HELLOs can be prevented by setting
+@code{USE_INCLUDED_HELLOS} in the @code{PEERINFO} configuration section to
+@code{NO}. Files containing invalid information are removed.
+@node Managing Information
+@subsection Managing Information
+@c %**end of header
+The PEERINFO service stores information about known peers and a single HELLO
+message for every peer. A peer does not need to have a HELLO if no information
+is available. HELLO information from different sources, for example a HELLO
+obtained from a remote HOSTLIST and a second HELLO stored on disk, is combined
+and merged into one single HELLO message per peer which will be given to
+clients. During this merge process, the HELLO is immediately written to disk to
+ensure persistence.
+In addition, PEERINFO periodically scans the directory where information is
+stored for HELLO messages that are empty or contain expired TRANSPORT
+addresses. This periodic task scans all files in the directory and recreates
+the HELLO messages it finds. Expired TRANSPORT addresses are removed from the
+HELLO and, if the HELLO does not contain any valid addresses, it is discarded
+and removed from disk.
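+The clean-up rule can be sketched as follows. This is a simplified,
+self-contained illustration and not the PEERINFO implementation:
+@code{struct Address} is a hypothetical stand-in for one address inside a
+HELLO, and expiration times are assumed to be absolute values in seconds.
+@example
+#include <stddef.h>
+#include <stdint.h>
+
+/* Hypothetical, simplified view of one address inside a HELLO. */
+struct Address
+@{
+  const char *uri;        /* illustrative only */
+  uint64_t expires_at;    /* absolute expiration time, in seconds */
+@};
+
+/* Compact addrs in place, keeping only addresses that have not
+   expired at time now. Returns the number of addresses kept; a
+   result of 0 means the HELLO can be discarded. */
+static size_t
+prune_expired (struct Address *addrs, size_t n, uint64_t now)
+@{
+  size_t kept = 0;
+
+  for (size_t i = 0; i < n; i++)
+    if (addrs[i].expires_at > now)
+      addrs[kept++] = addrs[i];
+  return kept;
+@}
+@end example
+A periodic task built on this sketch would rewrite the file if the returned
+count is positive and delete it if the count is zero.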
+@node Obtaining Information
+@subsection Obtaining Information
+@c %**end of header
+When a client requests information from PEERINFO, PEERINFO performs a lookup
+for the respective peer or all peers if desired and transmits this information
+to the client. The client can specify if friend-only HELLOs have to be included
+or not and PEERINFO filters the respective HELLO messages before transmitting
+information.
+To notify clients about changes to PEERINFO information, PEERINFO maintains a
+list of clients interested in these notifications. Such a notification occurs
+if a HELLO for a peer was updated (due to a merge, for example) or a new peer
+was added.
+@node The PEERINFO Client-Service Protocol
+@subsection The PEERINFO Client-Service Protocol
+@c %**end of header
+To connect to and disconnect from the PEERINFO service, PEERINFO utilizes
+the util client/server infrastructure, so no special message types are used
+here.
+To add information for a peer, the plain HELLO message is transmitted to the
+service without any wrapping. All required information is stored within the
+HELLO message. The PEERINFO service provides a message handler accepting and
+processing these HELLO messages.
+When obtaining PEERINFO information using the iterate functionality, specific
+messages are used. To obtain information for all peers, a @code{struct
+ListAllPeersMessage} with message type
+@code{GNUNET_MESSAGE_TYPE_PEERINFO_GET_ALL} and a flag include_friend_only to
+indicate if friend-only HELLO messages should be included is transmitted. If
+information for a specific peer is required, a @code{struct ListPeerMessage}
+with @code{GNUNET_MESSAGE_TYPE_PEERINFO_GET} containing the peer identity is
+used.
+For both variants, the PEERINFO service replies for each HELLO message it wants
+to transmit with a @code{struct InfoMessage} with type
+@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO} containing the plain HELLO. The final
+message is a @code{struct GNUNET_MessageHeader} with type
+@code{GNUNET_MESSAGE_TYPE_PEERINFO_INFO_END}. If the client receives this
+message, it can proceed with the next request if any is pending.
+@node libgnunetpeerinfo
+@subsection libgnunetpeerinfo
+@c %**end of header
+The PEERINFO API consists mainly of three different functionalities:
+maintaining a connection to the service, adding new information and retrieving
+information from the PEERINFO service.
+@menu
+* Connecting to the Service::
+* Adding Information::
+* Obtaining Information2::
+@end menu
+@node Connecting to the Service
+@subsubsection Connecting to the Service
+@c %**end of header
+To connect to the PEERINFO service, the function @code{GNUNET_PEERINFO_connect}
+is used, taking a configuration handle as an argument. To disconnect from
+PEERINFO, the function @code{GNUNET_PEERINFO_disconnect} has to be called with
+the PEERINFO handle returned from the connect function.
+@node Adding Information
+@subsubsection Adding Information
+@c %**end of header
+@code{GNUNET_PEERINFO_add_peer} adds a new peer to the PEERINFO subsystem
+storage. This function takes the PEERINFO handle as an argument, the HELLO
+message to store and a continuation with a closure to be called with the result
+of the operation. @code{GNUNET_PEERINFO_add_peer} returns a handle to this
+operation, which allows the operation to be cancelled with the respective
+cancel function @code{GNUNET_PEERINFO_add_peer_cancel}. To retrieve information
+from PEERINFO, you can iterate over all information stored with PEERINFO or you
+can tell PEERINFO to notify you when new peer information is available.
+@node Obtaining Information2
+@subsubsection Obtaining Information
+@c %**end of header
+To iterate over information in PEERINFO, you use
+@code{GNUNET_PEERINFO_iterate}. This function expects the PEERINFO handle, a
+flag indicating whether HELLO messages intended for friend-only mode should be
+included, a timeout specifying how long the operation may take and a callback
+with a callback closure to be called for the results. If you want to obtain
+information for a specific peer, you can specify the peer identity; if this
+identity is NULL, information for all peers is returned. The function returns
+a handle that can be used to cancel the operation using
+@code{GNUNET_PEERINFO_iterate_cancel}.
+To get notified when peer information changes, you can use
+@code{GNUNET_PEERINFO_notify}. This function expects a configuration handle and
+a flag indicating whether friend-only HELLO messages should be included. The
+PEERINFO service will then call the callback function for every change. The
+function returns a handle that can be used to cancel notifications with
+@code{GNUNET_PEERINFO_notify_cancel}.
+@node GNUnet's PEERSTORE subsystem
+@section GNUnet's PEERSTORE subsystem
+@c %**end of header
+GNUnet's PEERSTORE subsystem offers persistent per-peer storage for other
+GNUnet subsystems. GNUnet subsystems can use PEERSTORE to persistently store
+and retrieve arbitrary data. Each data record stored with PEERSTORE contains
+the following fields:
+@itemize @bullet
+@item subsystem: Name of the subsystem responsible for the record.
+@item peerid: Identity of the peer this record is related to.
+@item key: a key string identifying the record.
+@item value: binary record value.
+@item expiry: record expiry date.
+@end itemize
+@menu
+* Functionality::
+* Architecture::
+* libgnunetpeerstore::
+@end menu
+@node Functionality
+@subsection Functionality
+@c %**end of header
+Subsystems can store any type of value under a (subsystem, peerid, key)
+combination. A "replace" flag set during store operations forces PEERSTORE
+to replace any old values stored under the same (subsystem, peerid, key)
+combination with the new value. Additionally, an expiry date is set after which
+the record is @emph{possibly} deleted by PEERSTORE.
+Subsystems can iterate over all values stored under any of the following
+combinations of fields:
+@itemize @bullet
+@item (subsystem)
+@item (subsystem, peerid)
+@item (subsystem, key)
+@item (subsystem, peerid, key)
+@end itemize
+Subsystems can also request to be notified about any new values stored under a
+(subsystem, peerid, key) combination by sending a "watch" request to
+PEERSTORE.
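+The matching rules above can be illustrated with a small, self-contained
+sketch. It does not use the PEERSTORE API: the names @code{struct record},
+@code{record_matches} and @code{count_matches} are hypothetical, and peer
+identities and values are simplified to strings. A @code{NULL} peerid or key
+acts as a wildcard, which yields the four field combinations listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record. Real PEERSTORE records carry a binary
   peer identity, a binary value and an expiry date as well. */
struct record
{
  const char *subsystem;
  const char *peerid;
  const char *key;
};

/* Returns 1 if the record matches the filter, 0 otherwise. The subsystem
   is mandatory; a NULL peerid or key matches any value. */
static int
record_matches (const struct record *r,
                const char *subsystem,
                const char *peerid,
                const char *key)
{
  assert (NULL != r && NULL != subsystem);
  if (0 != strcmp (r->subsystem, subsystem))
    return 0;
  if (NULL != peerid && 0 != strcmp (r->peerid, peerid))
    return 0;
  if (NULL != key && 0 != strcmp (r->key, key))
    return 0;
  return 1;
}

/* Counts the records that an iteration with the given filter would visit. */
static size_t
count_matches (const struct record *records, size_t n,
               const char *subsystem, const char *peerid, const char *key)
{
  size_t hits = 0;

  for (size_t i = 0; i < n; i++)
    hits += (size_t) record_matches (&records[i], subsystem, peerid, key);
  return hits;
}
```

+Under these assumptions, a watch corresponds to the fully specified case in
+which subsystem, peerid and key are all given.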
+@node Architecture
+@subsection Architecture
+@c %**end of header
+PEERSTORE implements the following components:
+@itemize @bullet
+@item PEERSTORE service: Handles store, iterate and watch operations.
+@item PEERSTORE API: API to be used by other subsystems to communicate and
+issue commands to the PEERSTORE service.
+@item PEERSTORE plugins: Handle the persistent storage. At the moment, only an
+"sqlite" plugin is implemented.
+@end itemize
+@node libgnunetpeerstore
+@subsection libgnunetpeerstore
+@c %**end of header
+libgnunetpeerstore is the library containing the PEERSTORE API. Subsystems
+wishing to communicate with the PEERSTORE service use this API to open a
+connection to PEERSTORE. This is done by calling
+@code{GNUNET_PEERSTORE_connect} which returns a handle to the newly created
+connection. This handle has to be used with any further calls to the API.
+To store a new record, use @code{GNUNET_PEERSTORE_store}, which requires the
+record fields and a continuation function that will be
+called by the API after the STORE request is sent to the PEERSTORE service.
+Note that calling the continuation function does not mean that the record is
+successfully stored, only that the STORE request has been successfully sent to
+the PEERSTORE service. @code{GNUNET_PEERSTORE_store_cancel} can be called to
+cancel the STORE request only before the continuation function has been called.
+To iterate over stored records, use @code{GNUNET_PEERSTORE_iterate}.
+@emph{peerid} and @emph{key} can be set to NULL. An iterator
+callback function will be called with each matching record found and with a
+NULL record at the end to signal the end of the result set.
+@code{GNUNET_PEERSTORE_iterate_cancel} can be used to cancel the ITERATE
+request before the iterator callback is called with a NULL record.
+To be notified of new values stored under a (subsystem, peerid, key)
+combination, use @code{GNUNET_PEERSTORE_watch}. This
+registers the watcher with the PEERSTORE service. Any new records matching
+the given combination will trigger the callback function passed to
+@code{GNUNET_PEERSTORE_watch}. This continues until
+@code{GNUNET_PEERSTORE_watch_cancel} is called or the connection to the service
+is destroyed.
+Once the connection is no longer needed, the function
+@code{GNUNET_PEERSTORE_disconnect} can be called to disconnect from the
+PEERSTORE service. Any pending ITERATE or WATCH requests will be destroyed. If
+the @code{sync_first} flag is set to @code{GNUNET_YES}, the API will delay the
+disconnection until all pending STORE requests are sent to the PEERSTORE
+service. Otherwise, the pending STORE requests will be destroyed as well.
+@node GNUnet's SET Subsystem
+@section GNUnet's SET Subsystem
+@c %**end of header
+The SET service implements efficient set operations between two peers over a
+mesh tunnel. Currently, set union and set intersection are the only supported
+operations. Elements of a set consist of an @emph{element type} and arbitrary
+binary @emph{data}. The size of an element's data is limited to around 62
+KB.
+@menu
+* Local Sets::
+* Set Modifications::
+* Set Operations::
+* Result Elements::
+* libgnunetset::
+* The SET Client-Service Protocol::
+* The SET Intersection Peer-to-Peer Protocol::
+* The SET Union Peer-to-Peer Protocol::
+@end menu
+@node Local Sets
+@subsection Local Sets
+@c %**end of header
+Sets created by a local client can be modified and reused for multiple
+operations. As each set operation requires potentially expensive special
+auxiliary data to be computed for each element of a set, a set can only
+participate in one type of set operation (either union or intersection). The
+type of a set is determined upon its creation. If the elements of a set are
+needed for an operation of a different type, all of the set's elements must be
+copied to a new set of the appropriate type.
+@node Set Modifications
+@subsection Set Modifications
+@c %**end of header
+Even when set operations are active, one can add to and remove elements from a
+set. However, these changes will only be visible to operations that have been
+created after the changes have taken place. That is, every set operation only
+sees a snapshot of the set from the time the operation was started. This
+mechanism is @emph{not} implemented by copying the whole set, but by attaching
+@emph{generation information} to each element and operation.
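+The snapshot behaviour described above can be sketched as follows. This is an
+illustration under stated assumptions, not the service's actual bookkeeping:
+the names @code{struct element}, @code{visible_in} and @code{snapshot_size}
+are hypothetical, generations are assumed to be counted from 1, and a removal
+generation of 0 is assumed to mean that the element was never removed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element record: the generation in which the element was
   added and, if applicable, the generation in which it was removed. */
struct element
{
  int id;
  unsigned int added_gen;
  unsigned int removed_gen; /* 0 means "never removed" (assumption) */
};

/* An operation started in generation op_gen sees an element if it was
   added no later than op_gen and had not been removed by then. */
static int
visible_in (const struct element *e, unsigned int op_gen)
{
  assert (NULL != e);
  if (e->added_gen > op_gen)
    return 0;
  if (0 != e->removed_gen && e->removed_gen <= op_gen)
    return 0;
  return 1;
}

/* Number of elements in the snapshot seen by an operation of op_gen. */
static size_t
snapshot_size (const struct element *elems, size_t n, unsigned int op_gen)
{
  size_t count = 0;

  for (size_t i = 0; i < n; i++)
    count += (size_t) visible_in (&elems[i], op_gen);
  return count;
}
```

+No element is copied when an operation starts; only the generation numbers
+are compared, which is what makes the snapshot cheap.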
+@node Set Operations
+@subsection Set Operations
+@c %**end of header
+Set operations can be started in two ways: Either by accepting an operation
+request from a remote peer, or by requesting a set operation from a remote
+peer. Set operations are uniquely identified by the involved @emph{peers}, an
+@emph{application id} and the @emph{operation type}.
+The client is notified of incoming set operations by @emph{set listeners}. A
+set listener listens for incoming operations of a specific operation type and
+application id. Once notified of an incoming set request, the client can
+accept the set request (providing a local set for the operation) or reject
+it.
+@node Result Elements
+@subsection Result Elements
+@c %**end of header
+The SET service has three @emph{result modes} that determine how an operation's
+result set is delivered to the client:
+@itemize @bullet
+@item @strong{Full Result Set.} All elements of the set resulting from the set
+operation are returned to the client.
+@item @strong{Added Elements.} Only elements that result from the operation and
+are not already in the local peer's set are returned. Note that for some
+operations (like set intersection) this result mode will never return any
+elements. This can be useful if only the remote peer is actually interested in
+the result of the set operation.
+@item @strong{Removed Elements.} Only elements that are in the local peer's
+initial set but not in the operation's result set are returned. Note that for
+some operations (like set union) this result mode will never return any
+elements. This can be useful if only the remote peer is actually interested in
+the result of the set operation.
+@end itemize
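+The three result modes can be illustrated with a short sketch that counts the
+elements delivered to the client. It is independent of the SET API: the enum
+@code{result_mode} and the functions @code{contains} and @code{delivered} are
+hypothetical names, and elements are simplified to integers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three result modes described above. */
enum result_mode { RESULT_FULL, RESULT_ADDED, RESULT_REMOVED };

static int
contains (const int *set, size_t n, int x)
{
  for (size_t i = 0; i < n; i++)
    if (set[i] == x)
      return 1;
  return 0;
}

/* Counts the elements that would be delivered to the client, given the
   local peer's initial set and the operation's result set. */
static size_t
delivered (enum result_mode mode,
           const int *local, size_t n_local,
           const int *result, size_t n_result)
{
  size_t count = 0;

  assert (NULL != local && NULL != result);
  switch (mode)
  {
  case RESULT_FULL:
    return n_result;
  case RESULT_ADDED:
    /* in the result, but not in the local peer's initial set */
    for (size_t i = 0; i < n_result; i++)
      count += (size_t) ! contains (local, n_local, result[i]);
    return count;
  case RESULT_REMOVED:
    /* in the local peer's initial set, but not in the result */
    for (size_t i = 0; i < n_local; i++)
      count += (size_t) ! contains (result, n_result, local[i]);
    return count;
  }
  return 0;
}
```

+With a local set of 1, 2, 3 and a remote set of 3, 4, the union is 1, 2, 3, 4
+and the intersection is 3. The "added" mode therefore yields nothing for the
+intersection, and the "removed" mode yields nothing for the union.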
+@node libgnunetset
+@subsection libgnunetset
+@c %**end of header
+@menu
+* Sets::
+* Listeners::
+* Operations::
+* Supplying a Set::
+* The Result Callback::
+@end menu
+@node Sets
+@subsubsection Sets
+@c %**end of header
+New sets are created with @code{GNUNET_SET_create}. Both the local peer's
+configuration (as each set has its own client connection) and the operation
+type must be specified. The set exists until either the client calls
+@code{GNUNET_SET_destroy} or the client's connection to the service is
+disrupted. In the latter case, the client is notified by the return value of
+functions dealing with sets. This return value must always be checked.
+Elements are added and removed with @code{GNUNET_SET_add_element} and
+@code{GNUNET_SET_remove_element}.
+@node Listeners
+@subsubsection Listeners
+@c %**end of header
+Listeners are created with @code{GNUNET_SET_listen}. Each time a remote
+peer suggests a set operation with an application id and operation type
+matching a listener, the listener's callback is invoked. The client then must
+synchronously call either @code{GNUNET_SET_accept} or @code{GNUNET_SET_reject}.
+Note that the operation will not be started until the client calls
+@code{GNUNET_SET_commit} (see Section "Supplying a Set").
+@node Operations
+@subsubsection Operations
+@c %**end of header
+Operations to be initiated by the local peer are created with
+@code{GNUNET_SET_prepare}. Note that the operation will not be started until
+the client calls @code{GNUNET_SET_commit} (see Section "Supplying a
+Set").
+@node Supplying a Set
+@subsubsection Supplying a Set
+@c %**end of header
+To create symmetry between the two ways of starting a set operation (accepting
+and initiating it), the operation handles returned by @code{GNUNET_SET_accept}
+and @code{GNUNET_SET_prepare} do not yet have a set to operate on, thus they
+cannot do any work yet.
+The client must call @code{GNUNET_SET_commit} to specify a set to use for an
+operation. @code{GNUNET_SET_commit} may only be called once per set
+operation.
+@node The Result Callback
+@subsubsection The Result Callback
+@c %**end of header
+Clients must specify both a result mode and a result callback with
+@code{GNUNET_SET_accept} and @code{GNUNET_SET_prepare}. The result callback
+is called with a status indicating either that an element was received, or
+that the operation failed or succeeded. The interpretation of the received
+element depends on the
+result mode. The callback needs to know which result mode it is used in, as the
+arguments do not indicate if an element is part of the full result set, or if
+it is in the difference between the original set and the final set.
+@node The SET Client-Service Protocol
+@subsection The SET Client-Service Protocol
+@c %**end of header
+@menu
+* Creating Sets::
+* Listeners2::
+* Initiating Operations::
+* Modifying Sets::
+* Results and Operation Status::
+* Iterating Sets::
+@end menu
+@node Creating Sets
+@subsubsection Creating Sets
+@c %**end of header
+For each set of a client, there exists a client connection to the service. Sets
+are created by sending the @code{GNUNET_SERVICE_SET_CREATE} message over a new
+client connection. Multiple operations for one set are multiplexed over one
+client connection, using a request id supplied by the client.
+@node Listeners2
+@subsubsection Listeners2
+@c %**end of header
+Each listener also requires a separate client connection. By sending the
+@code{GNUNET_SERVICE_SET_LISTEN} message, the client notifies the service of
+the application id and operation type it is interested in. A client rejects an
+incoming request by sending @code{GNUNET_SERVICE_SET_REJECT} on the listener's
+client connection. In contrast, when accepting an incoming request, a
+@code{GNUNET_SERVICE_SET_ACCEPT} message must be sent over the client
+connection of the set that is supplied for the set operation.
+@node Initiating Operations
+@subsubsection Initiating Operations
+@c %**end of header
+Operations with remote peers are initiated by sending a
+@code{GNUNET_SERVICE_SET_EVALUATE} message to the service. The client
+connection over which this message is sent determines the set to use.
+@node Modifying Sets
+@subsubsection Modifying Sets
+@c %**end of header
+Sets are modified with the @code{GNUNET_SERVICE_SET_ADD} and
+@code{GNUNET_SERVICE_SET_REMOVE} messages.
+@node Results and Operation Status
+@subsubsection Results and Operation Status
+@c %**end of header
+The service notifies the client of result elements and success/failure of a set
+operation with the @code{GNUNET_SERVICE_SET_RESULT} message.
+@node Iterating Sets
+@subsubsection Iterating Sets
+@c %**end of header
+All elements of a set can be requested by sending
+@code{GNUNET_SERVICE_SET_ITER_REQUEST}. The server responds with
+@code{GNUNET_SERVICE_SET_ITER_ELEMENT} and eventually terminates the iteration
+with @code{GNUNET_SERVICE_SET_ITER_DONE}. After each received element, the
+client must send @code{GNUNET_SERVICE_SET_ITER_ACK}. Note that only one set
+iteration may be active for a set at any given time.
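+The acknowledgement-driven iteration can be sketched as a small state
+machine on the service side. This is only an illustration of the message
+flow described above: @code{struct set_iter}, the @code{MSG_*} values and
+the three functions are hypothetical and do not mirror the service's code,
+and elements are simplified to integers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ITER_ELEMENT and ITER_DONE messages;
   MSG_NONE means that nothing may be sent right now. */
enum msg { MSG_ITER_ELEMENT, MSG_ITER_DONE, MSG_NONE };

struct set_iter
{
  const int *elems;  /* elements of the set */
  size_t n;          /* number of elements */
  size_t next;       /* index of the next element to send */
  int active;        /* is an iteration in progress? */
  int awaiting_ack;  /* was an element sent that is not yet acknowledged? */
};

/* Handles an iteration request. Returns 0 if an iteration is already
   active, because only one iteration per set is allowed at a time. */
static int
iter_request (struct set_iter *it)
{
  assert (NULL != it);
  if (it->active)
    return 0;
  it->active = 1;
  it->next = 0;
  it->awaiting_ack = 0;
  return 1;
}

/* Produces the next message for the client. An element is only sent once
   the previous one has been acknowledged. */
static enum msg
iter_send (struct set_iter *it, int *elem)
{
  assert (NULL != it && NULL != elem);
  if (! it->active || it->awaiting_ack)
    return MSG_NONE;
  if (it->next == it->n)
  {
    it->active = 0;
    return MSG_ITER_DONE;
  }
  *elem = it->elems[it->next++];
  it->awaiting_ack = 1;
  return MSG_ITER_ELEMENT;
}

/* Handles the client's acknowledgement of the last element. */
static void
iter_ack (struct set_iter *it)
{
  assert (NULL != it);
  it->awaiting_ack = 0;
}
```

+In this sketch the acknowledgement acts as flow control: the service never
+has more than one unacknowledged element outstanding per set.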
+@node The SET Intersection Peer-to-Peer Protocol
+@subsection The SET Intersection Peer-to-Peer Protocol
+@c %**end of header
+The intersection protocol operates over CADET and starts with a
+GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST being sent by the peer initiating
+the operation to the peer listening for inbound requests. It includes the
+number of elements of the initiating peer, which is used to decide which side
+will send a Bloom filter first.
+The listening peer checks if the operation type and application identifier are
+acceptable for its current state. If not, it responds with a
+GNUNET_MESSAGE_TYPE_SET_RESULT and a status of GNUNET_SET_STATUS_FAILURE (and
+terminates the CADET channel).
+If the application accepts the request, the listener sends back a
+GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO if it has more elements
+in the set than the client. Otherwise, it immediately starts with the Bloom
+filter exchange. If the initiator receives a
+GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO response, it begins the
+Bloom filter exchange, unless the set size is indicated to be zero, in which
+case the intersection is considered finished after just the initial
+handshake.
+@menu
+* The Bloom filter exchange::
+* Salt::
+@end menu
+@node The Bloom filter exchange
+@subsubsection The Bloom filter exchange
+@c %**end of header
+In this phase, each peer transmits a Bloom filter over the remaining keys of
+the local set to the other peer using a
+GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_BF message. This message additionally
+includes the number of elements left in the sender's set, as well as the XOR
+over all of the keys in that set.
+The number of bits 'k' set per element in the Bloom filter is calculated based
+on the relative size of the two sets. Furthermore, the size of the Bloom filter
+is calculated based on 'k' and the number of elements in the set to maximize
+the amount of data filtered per byte transmitted on the wire (while avoiding an
+excessively high number of iterations).
+The receiver of the message removes all elements from its local set that do not
+pass the Bloom filter test. It then checks if the set size of the sender and
+the XOR over the keys match what is left of its own set. If they do, it sends
+a GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_DONE back to indicate that the
+latest set is the final result. Otherwise, the receiver starts another Bloom
+filter exchange, except this time as the sender.
+@node Salt
+@subsubsection Salt
+@c %**end of header
+Bloom filter operations are probabilistic: with some non-zero probability the
+test may incorrectly say an element is in the set, even though it is not.
+To mitigate this problem, the intersection protocol iterates exchanging Bloom
+filters using a different random 32-bit salt in each iteration (the salt is
+also included in the message). With different salts, set operations may fail
+for different elements. Merging the results from the executions, the
+probability of failure drops toward zero.
+The iterations terminate once both peers have established that they have sets
+of the same size, and where the XOR over all keys computes the same 512-bit
+value (leaving a failure probability of 2^-511).
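+One round of the salted Bloom filter exchange, together with the XOR check
+used for termination, can be sketched as follows. This is a toy model under
+several assumptions and not the protocol implementation: keys are 64-bit
+integers instead of 512-bit hashes, the filter size @code{BF_BITS} and the
+number of bits per element @code{BF_K} are fixed rather than computed from
+the set sizes, and @code{mix} is an illustrative mixing function. All names
+in the sketch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BF_BITS 256u /* toy filter size, chosen for illustration */
#define BF_K 3u      /* bits set per element, fixed in this sketch */

struct bloom
{
  uint8_t bits[BF_BITS / 8];
};

/* Toy 64-bit mixing function; stands in for a real hash function. */
static uint64_t
mix (uint64_t x)
{
  x ^= x >> 30;
  x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 27;
  x *= 0x94d049bb133111ebULL;
  x ^= x >> 31;
  return x;
}

/* Maps (key, salt, i) to a bit position; a new salt yields new positions. */
static unsigned int
bit_index (uint64_t key, uint32_t salt, unsigned int i)
{
  return (unsigned int) (mix (key ^ ((uint64_t) salt << 32) ^ i) % BF_BITS);
}

static void
bloom_add (struct bloom *bf, uint64_t key, uint32_t salt)
{
  for (unsigned int i = 0; i < BF_K; i++)
  {
    unsigned int b = bit_index (key, salt, i);

    bf->bits[b / 8] |= (uint8_t) (1u << (b % 8));
  }
}

/* Returns 1 if the key may be in the filter, 0 if it is definitely not. */
static int
bloom_test (const struct bloom *bf, uint64_t key, uint32_t salt)
{
  for (unsigned int i = 0; i < BF_K; i++)
  {
    unsigned int b = bit_index (key, salt, i);

    if (0 == (bf->bits[b / 8] & (1u << (b % 8))))
      return 0;
  }
  return 1;
}

/* Removes all keys that fail the filter test, preserving the order of the
   remaining keys. Returns the new set size. */
static size_t
filter_set (uint64_t *keys, size_t n, const struct bloom *bf, uint32_t salt)
{
  size_t kept = 0;

  assert (NULL != keys && NULL != bf);
  for (size_t i = 0; i < n; i++)
    if (bloom_test (bf, keys[i], salt))
      keys[kept++] = keys[i];
  return kept;
}

/* XOR over all keys, compared between the peers to detect termination. */
static uint64_t
xor_keys (const uint64_t *keys, size_t n)
{
  uint64_t x = 0;

  for (size_t i = 0; i < n; i++)
    x ^= keys[i];
  return x;
}
```

+A key that is in both sets always passes the test, so filtering never removes
+an element of the intersection. A key that passes by accident in one round is
+mapped to different bit positions in the next round, because the salt differs.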
+@node The SET Union Peer-to-Peer Protocol
+@subsection The SET Union Peer-to-Peer Protocol
+@c %**end of header
+The SET union protocol is based on Eppstein's efficient set reconciliation
+without prior context. You should read this paper first if you want to
+understand the protocol.
+The union protocol operates over CADET and starts with a
+GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST being sent by the peer initiating
+the operation to the peer listening for inbound requests. It includes the
+number of elements of the initiating peer, which is currently not used.
+The listening peer checks if the operation type and application identifier are
+acceptable for its current state. If not, it responds with a
+GNUNET_MESSAGE_TYPE_SET_RESULT and a status of GNUNET_SET_STATUS_FAILURE (and
+terminates the CADET channel).
+If the application accepts the request, it sends back a strata estimator using
+a message of type GNUNET_MESSAGE_TYPE_SET_UNION_P2P_SE. The initiator evaluates
+the strata estimator and initiates the exchange of invertible Bloom filters,
+sending a GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF.
+During the IBF exchange, if the receiver cannot invert the Bloom filter or
+detects a cycle, it sends a larger IBF in response (up to a defined maximum
+limit; if that limit is reached, the operation fails). Elements decoded while
+processing the IBF are transmitted to the other peer using
+GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS, or requested from the other peer using
+GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS messages, depending on the sign
+observed during decoding of the IBF. Peers respond to a
+GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS message with the respective
+element in a GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS message. If the IBF fully
+decodes, the peer responds with a GNUNET_MESSAGE_TYPE_SET_UNION_P2P_DONE
+message instead of another GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF.
+All Bloom filter operations use a salt to mingle keys before hashing them into
+buckets, such that future iterations have a fresh chance of succeeding if they
+failed due to collisions before.
+@node GNUnet's STATISTICS subsystem
+@section GNUnet's STATISTICS subsystem
+@c %**end of header
+In GNUnet, the STATISTICS subsystem offers a central place for all subsystems
+to publish unsigned 64-bit integer run-time statistics. Keeping this
+information centrally means that there is a unified way for the user to obtain
+data on all subsystems, and individual subsystems do not have to always include
+a custom data export method for performance metrics and other statistics. For
+example, the TRANSPORT system uses STATISTICS to update information about the
+number of directly connected peers and the bandwidth that has been consumed by
+the various plugins. This information is valuable for diagnosing connectivity
+and performance issues.
+Following the GNUnet service architecture, the STATISTICS subsystem is divided
+into an API which is exposed through the header
+@strong{gnunet_statistics_service.h} and the STATISTICS service
+@strong{gnunet-service-statistics}. The @strong{gnunet-statistics} command-line
+tool can be used to obtain (and change) information about the values stored by
+the STATISTICS service. The STATISTICS service does not communicate with other
+peers.
+Data is stored in the STATISTICS service in the form of tuples
+@strong{(subsystem, name, value, persistence)}. The subsystem identifies the
+GNUnet subsystem to which the data belongs. The name identifies the value and
+uniquely distinguishes the record from other records
+belonging to the same subsystem. In some parts of the code, the pair
+@strong{(subsystem, name)} is called a @strong{statistic} as it identifies the
+values stored in the STATISTICS service. The persistence flag determines if the
+record has to be preserved across service restarts. A record is said to be
+persistent if this flag is set for it; if not, the record is treated as a
+non-persistent record and it is lost after service restart. Persistent records
+are written to and read from the file @strong{statistics.data} before shutdown
+and upon startup. The file is located in the HOME directory of the peer.
+An anomaly of the STATISTICS service is that it does not terminate immediately
+upon receiving a shutdown signal if it has any clients connected to it. It
+waits for all the clients that are not monitors to close their connections
+before terminating itself. This is to prevent the loss of data during peer
+shutdown --- delaying the STATISTICS service shutdown helps other services to
+store important data to STATISTICS during shutdown.
+@menu
+* libgnunetstatistics::
+* The STATISTICS Client-Service Protocol::
+@end menu
+@node libgnunetstatistics
+@subsection libgnunetstatistics
+@c %**end of header
+@strong{libgnunetstatistics} is the library containing the API for the
+STATISTICS subsystem. Any process that needs to use STATISTICS should use this
+API to open a connection to the STATISTICS service. This is done by calling
+the function @code{GNUNET_STATISTICS_create()}. This function takes the name
+of the subsystem that is trying to use STATISTICS and a configuration. All
+values written to STATISTICS with this connection will be placed in the section
+corresponding to the given subsystem's name. The connection to STATISTICS can
+be destroyed with the function @code{GNUNET_STATISTICS_destroy()}. This
+function allows for the connection to be destroyed immediately or upon
+transferring all pending write requests to the service.
+Note: The STATISTICS subsystem can be disabled by setting @code{DISABLE = YES}
+under the @code{[STATISTICS]} section in the configuration. With such a
+configuration all calls to @code{GNUNET_STATISTICS_create()} return @code{NULL}
+as the STATISTICS subsystem is unavailable and no other functions from the API
+can be used.
+@menu
+* Statistics retrieval::
+* Setting statistics and updating them::
+* Watches::
+@end menu
+@node Statistics retrieval
+@subsubsection Statistics retrieval
+@c %**end of header
+Once a connection to the statistics service is obtained, information about any
+other system which uses statistics can be retrieved with the function
+@code{GNUNET_STATISTICS_get()}. This function takes the connection handle, the
+name of the subsystem whose information we are interested in (a @code{NULL}
+value will retrieve information of all available subsystems using STATISTICS),
+the name of the statistic we are interested in (a @code{NULL} value will
+retrieve all available statistics), a continuation callback which is called
+when all of the requested information is retrieved, an iterator callback which
+is called for each parameter in the retrieved information and a closure for the
+aforementioned callbacks. The library then invokes the iterator callback for
+each value matching the request.
+The call to @code{GNUNET_STATISTICS_get()} is asynchronous and can be canceled
+with the function @code{GNUNET_STATISTICS_get_cancel()}. This is helpful when
+retrieving statistics takes too long and especially when we want to shut down
+and clean up everything.
+@node Setting statistics and updating them
+@subsubsection Setting statistics and updating them
+@c %**end of header
+So far we have seen how to retrieve statistics. Here we will learn how to
+set statistics and update them so that other subsystems can retrieve them.
+A new statistic can be set using the function @code{GNUNET_STATISTICS_set()}.
+This function takes the name of the statistic and its value and a flag to make
+the statistic persistent. The value of the statistic should be of the type
+@code{uint64_t}. The function does not take the name of the subsystem; it is
+determined from the previous @code{GNUNET_STATISTICS_create()} invocation. If
+the given statistic is already present, its value is overwritten.
+An existing statistic can be updated, i.e. its value can be increased or
+decreased by a given amount, with the function
+@code{GNUNET_STATISTICS_update()}. The
+parameters to this function are similar to @code{GNUNET_STATISTICS_set()},
+except that it takes the amount to be changed as a type @code{int64_t} instead
+of the value.
+The library will combine multiple set or update operations into one message if
+the client performs requests at a rate that is faster than the available IPC
+with the STATISTICS service. Thus, the client does not have to worry about
+sending requests too quickly.
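+The difference between setting an absolute value and applying a relative
+update can be sketched without the STATISTICS API. The names
+@code{struct statistic}, @code{stat_set} and @code{stat_update} are
+hypothetical. The sketch also assumes that a decrease larger than the current
+value clamps the statistic at zero; whether the service behaves this way is
+not stated in this section and should be checked against its source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory statistic. The real service also tracks the
   subsystem, the name and the persistence flag. */
struct statistic
{
  uint64_t value;
};

/* Absolute set: overwrites whatever value was stored before. */
static void
stat_set (struct statistic *s, uint64_t value)
{
  assert (NULL != s);
  s->value = value;
}

/* Relative update by a signed amount. Assumption of this sketch: the
   value clamps at zero instead of wrapping around on underflow. */
static void
stat_update (struct statistic *s, int64_t delta)
{
  uint64_t dec;

  assert (NULL != s);
  if (delta >= 0)
  {
    s->value += (uint64_t) delta;
    return;
  }
  /* magnitude of delta, computed so that INT64_MIN does not overflow */
  dec = (uint64_t) (-(delta + 1)) + 1;
  s->value = (dec > s->value) ? 0 : s->value - dec;
}
```

+The unsigned value and the signed delta mirror the @code{uint64_t} and
+@code{int64_t} parameter types mentioned above.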
+@node Watches
+@subsubsection Watches
+@c %**end of header
+An interesting feature of STATISTICS lies in serving notifications whenever a
+statistic of our interest is modified. This is achieved by registering a watch
+through the function @code{GNUNET_STATISTICS_watch()}. The parameters of this
+function are similar to those of @code{GNUNET_STATISTICS_get()}. Changes to the
+respective statistic's value will then cause the given iterator callback to be
+called. Note: A watch can only be registered for a specific statistic. Hence
+the subsystem name and the parameter name cannot be @code{NULL} in a call to
+@code{GNUNET_STATISTICS_watch()}.
+A registered watch keeps reporting value changes until
+@code{GNUNET_STATISTICS_watch_cancel()} is called with the same parameters that
+were used for registering the watch.
+@node The STATISTICS Client-Service Protocol
+@subsection The STATISTICS Client-Service Protocol
+@c %**end of header
+@menu
+* Statistics retrieval2::
+* Setting and updating statistics::
+* Watching for updates::
+@end menu
+@node Statistics retrieval2
+@subsubsection Statistics retrieval2
+@c %**end of header
+To retrieve statistics, the client transmits a message of type
+@code{GNUNET_MESSAGE_TYPE_STATISTICS_GET} containing the given subsystem name
+and statistic parameter to the STATISTICS service. The service responds with a
+message of type @code{GNUNET_MESSAGE_TYPE_STATISTICS_VALUE} for each of the
+statistics parameters that match the client request. The end of the
+retrieved information is signaled by the service by sending a message of type
+@code{GNUNET_MESSAGE_TYPE_STATISTICS_END}.
+@node Setting and updating statistics
+@subsubsection Setting and updating statistics
+@c %**end of header
+The subsystem name, parameter name, its value and the persistence flag are
+communicated to the service through the message
+@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}.
+When the service receives a message of type
+@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET}, it retrieves the subsystem name and
+checks for a statistic parameter matching the name given in the message.
+If a statistic parameter is found, the value is overwritten by the new value
+from the message; if not found then a new statistic parameter is created with
+the given name and value.
+In addition to just setting an absolute value, it is possible to perform a
+relative update by sending a message of type
+@code{GNUNET_MESSAGE_TYPE_STATISTICS_SET} with an update flag
+(@code{GNUNET_STATISTICS_SETFLAG_RELATIVE}) signifying that the value in the
+message should be treated as an update value.
+@node Watching for updates
+@subsubsection Watching for updates
+@c %**end of header
+The function @code{GNUNET_STATISTICS_watch()} registers the watch at the
+service by sending a message of type
+@code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH}. The service then sends
+notifications through messages of type
+@code{GNUNET_MESSAGE_TYPE_STATISTICS_WATCH_VALUE} whenever the statistic
+parameter's value is changed.
+@node GNUnet's Distributed Hash Table (DHT)
+@section GNUnet's Distributed Hash Table (DHT)
+@c %**end of header
+GNUnet includes a generic distributed hash table that can be used by developers
+building P2P applications in the framework. This section documents high-level
+features and how developers are expected to use the DHT. We have a research
+paper detailing how the DHT works. Also, Nate's thesis includes a detailed
+description and performance analysis (in chapter 6).
+Key features of GNUnet's DHT include:
+@itemize @bullet
+@item stores key-value pairs with values up to (approximately) 63k in size
+@item works with many underlay network topologies (small-world, random graph),
+underlay does not need to be a full mesh / clique
+@item support for extended queries (more than just a simple 'key'), filtering
+duplicate replies within the network (bloomfilter) and content validation (for
+details, please read the subsection on the block library)
+@item can (optionally) return paths taken by the PUT and GET operations to the
+application
+@item provides content replication to handle churn
+@end itemize
+GNUnet's DHT is randomized and unreliable. Unreliable means that there is no
+strict guarantee that a value stored in the DHT is always found --- values are
+only found with high probability. While this is somewhat true in all P2P DHTs,
+GNUnet developers should be particularly wary of this fact (this will help you
+write secure, fault-tolerant code). Thus, when writing any application using
+the DHT, you should always consider the possibility that a value stored in the
+DHT by you or some other peer might simply not be returned, or returned with a
+significant delay. Your application logic must be written to tolerate this
+(naturally, some loss of performance or quality of service is expected in this
+case).
+@menu
+* Block library and plugins::
+* libgnunetdht::
+* The DHT Client-Service Protocol::
+* The DHT Peer-to-Peer Protocol::
+@end menu
+@node Block library and plugins
+@subsection Block library and plugins
+@c %**end of header
+@menu
+* What is a Block?::
+* The API of libgnunetblock::
+* Queries::
+* Sample Code::
+* Conclusion2::
+@end menu
+@node What is a Block?
+@subsubsection What is a Block?
+@c %**end of header
+Blocks are small (< 63k) pieces of data stored under a key (struct
+GNUNET_HashCode). Blocks have a type (enum GNUNET_BlockType) which defines
+their data format. Blocks are used in GNUnet as units of static data exchanged
+between peers and stored (or cached) locally. Uses of blocks include
+file-sharing (the files are broken up into blocks), the VPN (DNS information is
+stored in blocks) and the DHT (all information in the DHT and meta-information
+for the maintenance of the DHT are both stored using blocks). The block
+subsystem provides a few common functions that must be available for any type
+of block.
+@node The API of libgnunetblock
+@subsubsection The API of libgnunetblock
+@c %**end of header
+The block library requires for each (family of) block type(s) a block plugin
+(implementing gnunet_block_plugin.h) that provides basic functions that are
+needed by the DHT (and possibly other subsystems) to manage the block. These
+block plugins are typically implemented within their respective subsystems.
+The main block library is then used to locate, load and query the appropriate
+block plugin. Which plugin is appropriate is determined by the block type
+(which is just a 32-bit integer). Block plugins contain code that specifies
+which block types are supported by a given plugin. The block library loads all
+block plugins that are installed at the local peer and forwards the application
+request to the respective plugin.
+The central functions of the block APIs (plugin and main library) are to allow
+the mapping of blocks to their respective key (if possible) and the ability to
+check that a block is well-formed and matches a given request (again, if
+possible). This way, GNUnet can avoid storing invalid blocks, storing blocks
+under the wrong key and forwarding blocks in response to a query that they do
+not answer.
+One key function of block plugins is that they allow GNUnet to detect duplicate
+replies (via the Bloom filter). All plugins MUST support detecting duplicate
+replies (by adding the current response to the Bloom filter and rejecting it if
+it is encountered again). If a plugin fails to do this, responses may loop in
+the network.
+@node Queries
+@subsubsection Queries
+@c %**end of header
+The query format for any block in GNUnet consists of four main components.
+First, the type of the desired block must be specified. Second, the query must
+contain a hash code. The hash code is used for lookups in hash tables and
+databases and need not be unique for the block (however, if possible a unique
+hash should be used as this would be best for performance). Third, an optional
+Bloom filter can be specified to exclude known results; replies that hash to
+the bits set in the Bloom filter are considered invalid. False-positives can be
+eliminated by sending the same query again with a different Bloom filter
+mutator value, which parameterizes the hash function that is used. Finally, an
+optional application-specific "eXtended query" (xquery) can be specified to
+further constrain the results. It is entirely up to the type-specific plugin to
+determine whether or not a given block matches a query (type, hash, Bloom
+filter, and xquery). Naturally, not all xqueries are valid and some types of
+blocks may not support Bloom filters either, so the plugin also needs to check
+if the query is valid in the first place.
+Depending on the results from the plugin, the DHT will then discard the
+(invalid) query, forward the query, discard the (invalid) reply, cache the
+(valid) reply, and/or forward the (valid and non-duplicate) reply.
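+The duplicate filtering described above can be illustrated with a small,
+self-contained C sketch. It is not the implementation used by GNUnet: the
+filter size, the number of probes, the mixing function and all @code{toy_}
+names are assumptions made for this example, and a 64-bit integer stands in
+for the hash of a block. The sketch only shows how a mutator value can
+parameterize the bits that a block maps to.
+@example
+#include <stdint.h>
+#include <string.h>
+
+#define TOY_BF_BITS 1024u /* assumed filter size in bits */
+#define TOY_BF_K 4u       /* assumed number of probes per block */
+
+struct ToyBloomFilter @{
+  uint8_t bits[TOY_BF_BITS / 8];
+@};
+
+/* Combine the block hash with the mutator and the probe number.
+   A different mutator yields different bit positions. */
+static uint64_t
+toy_mix (uint64_t block_hash, uint32_t mutator, uint32_t round)
+@{
+  uint64_t x = block_hash ^ (((uint64_t) mutator << 32) | round);
+
+  x ^= x >> 33;
+  x *= 0xff51afd7ed558ccdULL;
+  x ^= x >> 33;
+  x *= 0xc4ceb9fe1a85ec53ULL;
+  x ^= x >> 33;
+  return x;
+@}
+
+/* Returns 1 if the block is (probably) a duplicate, 0 if it is new.
+   New blocks are added to the filter as a side effect. */
+static int
+toy_bf_test_and_add (struct ToyBloomFilter *bf,
+                     uint64_t block_hash,
+                     uint32_t mutator)
+@{
+  int seen = 1;
+
+  for (uint32_t i = 0; i < TOY_BF_K; i++)
+  @{
+    uint32_t bit = (uint32_t) (toy_mix (block_hash, mutator, i)
+                               % TOY_BF_BITS);
+    uint8_t mask = (uint8_t) (1u << (bit % 8));
+
+    if (0 == (bf->bits[bit / 8] & mask))
+      seen = 0;
+    bf->bits[bit / 8] |= mask;
+  @}
+  return seen;
+@}
+@end example
+Calling @code{toy_bf_test_and_add} twice on a zero-initialized filter with the
+same block hash and mutator reports a duplicate on the second call. With a
+different mutator the same block generally maps to other bits, which is why
+repeating a query with a new mutator can let a previous false-positive pass.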
+@node Sample Code
+@subsubsection Sample Code
+@c %**end of header
+The source code in @strong{plugin_block_test.c} is a good starting point for
+new block plugins --- it does the minimal work by implementing a plugin that
+performs no validation at all. The respective @strong{Makefile.am} shows how to
+build and install a block plugin.
+@node Conclusion2
+@subsubsection Conclusion2
+@c %**end of header
+In conclusion, GNUnet subsystems that want to use the DHT need to define a
+block format and write a plugin to match queries and replies. For testing, the
+"GNUNET_BLOCK_TYPE_TEST" block type can be used; it accepts any query as valid
+and any reply as matching any query. This type is also used for the DHT command
+line tools. However, it should NOT be used for normal applications due to the
+lack of error checking that results from this primitive implementation.
+@node libgnunetdht
+@subsection libgnunetdht
+@c %**end of header
+The DHT API itself is pretty simple and offers the usual GET and PUT functions
+that work as expected. The specified block type refers to the block library
+which allows the DHT to run application-specific logic for data stored in the
+network.
+@menu
+* GET::
+* PUT::
+* MONITOR::
+* DHT Routing Options::
+@end menu
+@node GET
+@subsubsection GET
+@c %**end of header
+When using GET, the main consideration for developers (other than the block
+library) should be that after issuing a GET, the DHT will continuously cause
+(small amounts of) network traffic until the operation is explicitly canceled.
+So GET does not simply send out a single network request once; instead, the
+DHT will continue to search for data. This is needed to achieve good success
+rates and also handles the case where the respective PUT operation happens
+after the GET operation was started. Developers should not cancel an existing
+GET operation and then explicitly re-start it to trigger a new round of
+network requests; this is simply inefficient, especially as the internal
+automated version can be more efficient, for example by filtering results in
+the network that have already been returned.
+If an application that performs a GET request has a set of replies that it
+already knows and would like to filter, it can call@
+@code{GNUNET_DHT_get_filter_known_results} with an array of hashes over the
+respective blocks to tell the DHT that these results are not desired (any
+more). This way, the DHT will filter the respective blocks using the block
+library in the network, which may result in a significant reduction in
+bandwidth consumption.
+@node PUT
+@subsubsection PUT
+@c %**end of header
+In contrast to GET operations, developers @strong{must} manually re-run PUT
+operations periodically (if they intend the content to continue to be
+available). Content stored in the DHT expires or might be lost due to churn.
+Furthermore, GNUnet's DHT typically requires multiple rounds of PUT operations
+before a key-value pair is consistently available to all peers (the DHT
+randomizes paths and thus storage locations, and only after multiple rounds of
+PUTs will there be a sufficient number of replicas in large DHTs). An explicit
+PUT operation using the DHT API will only cause network traffic once, so in
+order to ensure basic availability and resistance to churn (and adversaries),
+PUTs must be repeated. While the exact frequency depends on the application, a
+rule of thumb is that there should be at least a dozen PUT operations within
+the content lifetime. Content in the DHT typically expires after one day, so
+DHT PUT operations should be repeated at least every 1-2 hours.
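+The rule of thumb above amounts to simple arithmetic, sketched here in C. The
+function and macro names are hypothetical and not part of the DHT API; the
+sketch only divides the content lifetime by the suggested number of PUTs.
+@example
+#include <stdint.h>
+
+/* Rule of thumb from the text: at least a dozen PUTs per lifetime. */
+#define TOY_PUTS_PER_LIFETIME 12u
+
+/* Returns the longest interval (in seconds) between two PUTs that
+   still yields TOY_PUTS_PER_LIFETIME PUTs within the lifetime. */
+static uint64_t
+toy_put_interval_seconds (uint64_t lifetime_seconds)
+@{
+  return lifetime_seconds / TOY_PUTS_PER_LIFETIME;
+@}
+@end example
+For a lifetime of one day (86400 seconds) this gives 7200 seconds, that is
+two hours, which matches the upper end of the interval recommended above.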
+@node MONITOR
+@subsubsection MONITOR
+@c %**end of header
+The DHT API also allows applications to monitor messages crossing the local
+DHT service. The types of messages used by the DHT are GET, PUT and RESULT
+messages. Using the monitoring API, applications can choose to monitor these
+requests, possibly limiting themselves to requests for a particular block
+type.
+The monitoring API is not only useful for diagnostics; it can also be used
+to trigger application operations based on PUT operations. For example, an
+application may use PUTs to distribute work requests to other peers. The
+workers would then monitor for PUTs that give them work, instead of looking
+for work using GET operations. This can be beneficial, especially if the
+workers have no good way to guess the keys under which work would be stored.
+Naturally, additional protocols might be needed to ensure that the desired
+number of workers will process the distributed workload.
+@node DHT Routing Options
+@subsubsection DHT Routing Options
+@c %**end of header
+The following routing options exist for GET and PUT requests; the first two
+are the important ones for applications:
+@table @asis
+@item GNUNET_DHT_RO_DEMULTIPLEX_EVERYWHERE This option means that all peers
+should process the request, even if their peer ID is not closest to the key.
+For a PUT request, this means that all peers that a request traverses may make
+a copy of the data. Similarly for a GET request, all peers will check their
+local database for a result. Setting this option can thus significantly improve
+caching and reduce bandwidth consumption --- at the expense of a larger DHT
+database. If in doubt, we recommend that this option should be used.
+@item GNUNET_DHT_RO_RECORD_ROUTE This option instructs the DHT to record the path
+that a GET or a PUT request is taking through the overlay network. The
+resulting paths are then returned to the application with the respective
+result. This allows the receiver of a result to construct a path to the
+originator of the data, which might then be used for routing. Naturally,
+setting this option requires additional bandwidth and disk space, so
+applications should only set this if the paths are needed by the application
+logic.
+@item GNUNET_DHT_RO_FIND_PEER This option is an internal option used by
+the DHT's peer discovery mechanism and should not be used by applications.
+@item GNUNET_DHT_RO_BART This option is currently not implemented. It may in
+the future offer performance improvements for clique topologies.
+@end table
+@node The DHT Client-Service Protocol
+@subsection The DHT Client-Service Protocol
+@c %**end of header
+@menu
+* PUTting data into the DHT::
+* GETting data from the DHT::
+* Monitoring the DHT::
+@end menu
+@node PUTting data into the DHT
+@subsubsection PUTting data into the DHT
+@c %**end of header
+To store (PUT) data into the DHT, the client sends a@ @code{struct
+GNUNET_DHT_ClientPutMessage} to the service. This message specifies the block
+type, routing options, the desired replication level, the expiration time, key,
+value and a 64-bit unique ID for the operation. The service responds with a@
+@code{struct GNUNET_DHT_ClientPutConfirmationMessage} with the same 64-bit
+unique ID. Note that the service sends the confirmation as soon as it has
+locally processed the PUT request. The PUT may still be propagating through the
+network at this time.
+In the future, we may want to change this to provide (limited) feedback to the
+client, for example if we detect that the PUT operation had no effect because
+the same key-value pair was already stored in the DHT. However, changing this
+would also require additional state and messages in the P2P
+interaction.
+@node GETting data from the DHT
+@subsubsection GETting data from the DHT
+@c %**end of header
+To retrieve (GET) data from the DHT, the client sends a@ @code{struct
+GNUNET_DHT_ClientGetMessage} to the service. The message specifies routing
+options, a replication level (for replicating the GET, not the content), the
+desired block type, the key, the (optional) extended query and unique 64-bit
+request ID.
+Additionally, the client may send any number of@ @code{struct
+GNUNET_DHT_ClientGetResultSeenMessage}s to notify the service about results
+that the client is already aware of. These messages consist of the key, the
+unique 64-bit ID of the request, and an arbitrary number of hash codes over the
+blocks that the client is already aware of. As messages are restricted to 64k,
+a client that already knows more than about a thousand blocks may need to send
+several of these messages. Naturally, the client should transmit these messages
+as quickly as possible after the original GET request such that the DHT can
+filter those results in the network early on. However, as these messages are
+sent after the original request, it is conceivable that the DHT service may
+return blocks that match those already known to the client anyway.
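+The estimate of "about a thousand" hash codes per message can be reproduced
+with a short C sketch. The numbers are assumptions for illustration only: a
+64 KiB message limit, hash codes of 64 bytes, and 80 bytes for the fixed part
+of the message (header, key and request ID). None of the names below belong
+to the DHT API.
+@example
+#include <stddef.h>
+
+#define TOY_MAX_MESSAGE_SIZE 65536u /* assumed message size limit */
+#define TOY_HASH_SIZE 64u           /* assumed size of one hash code */
+#define TOY_FIXED_SIZE 80u          /* assumed fixed part per message */
+
+/* Number of hash codes that fit into a single message. */
+static size_t
+toy_hashes_per_message (void)
+@{
+  return (TOY_MAX_MESSAGE_SIZE - TOY_FIXED_SIZE) / TOY_HASH_SIZE;
+@}
+
+/* Number of messages needed to report known_results hash codes. */
+static size_t
+toy_messages_needed (size_t known_results)
+@{
+  size_t per_msg = toy_hashes_per_message ();
+
+  return (known_results + per_msg - 1) / per_msg;
+@}
+@end example
+Under these assumptions 1022 hash codes fit into one message, so a client that
+knows 1000 blocks needs one message and a client that knows 1023 needs two.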
+In response to a GET request, the service will send @code{struct
+GNUNET_DHT_ClientResultMessage}s to the client. These messages contain the
+block type, expiration, key, unique ID of the request and of course the value
+(a block). Depending on the options set for the respective operations, the
+replies may also contain the path the GET and/or the PUT took through the
+network.
+A client can stop receiving replies either by disconnecting or by sending a
+@code{struct GNUNET_DHT_ClientGetStopMessage} which must contain the key and
+the 64-bit unique ID of the original request. Using an explicit "stop" message
+is more common as this allows a client to run many concurrent GET operations
+over the same connection with the DHT service --- and to stop them
+individually.
+@node Monitoring the DHT
+@subsubsection Monitoring the DHT
+@c %**end of header
+To begin monitoring, the client sends a @code{struct
+GNUNET_DHT_MonitorStartStop} message to the DHT service. In this message, flags
+can be set to enable (or disable) monitoring of GET, PUT and RESULT messages
+that pass through a peer. The message can also restrict monitoring to a
+particular block type or a particular key. Once monitoring is enabled, the DHT
+service will notify the client about any matching event using @code{struct
+GNUNET_DHT_MonitorGetMessage}s for GET events, @code{struct
+GNUNET_DHT_MonitorPutMessage} for PUT events and@ @code{struct
+GNUNET_DHT_MonitorGetRespMessage} for RESULTs. Each of these messages contains
+all of the information about the event.
+@node The DHT Peer-to-Peer Protocol
+@subsection The DHT Peer-to-Peer Protocol
+@c %**end of header
+@menu
+* Routing GETs or PUTs::
+* PUTting data into the DHT2::
+* GETting data from the DHT2::
+@end menu
+@node Routing GETs or PUTs
+@subsubsection Routing GETs or PUTs
+@c %**end of header
+When routing GETs or PUTs, the DHT service selects a suitable subset of
+neighbours for forwarding. The exact number of neighbours can be zero or more
+and depends on the hop counter of the query (initially zero) in relation to
+(the log of) the network size estimate, the desired replication level and the
+peer's connectivity. Depending on the hop counter and our network size
+estimate, the selection of the peers may be randomized or based on proximity
+to the
+key. Furthermore, requests include a set of peers that a request has already
+traversed; those peers are also excluded from the selection.
+@node PUTting data into the DHT2
+@subsubsection PUTting data into the DHT2
+@c %**end of header
+To PUT data into the DHT, the service sends a @code{struct PeerPutMessage} of
+type @code{GNUNET_MESSAGE_TYPE_DHT_P2P_PUT} to the respective neighbour. In
+addition to the usual information about the content (type, routing options,
+desired replication level for the content, expiration time, key and value), the
+message contains a fixed-size Bloom filter with information about which peers
+(may) have already seen this request. This Bloom filter is used to ensure that
+DHT messages never loop back to a peer that has already processed the request.
+Additionally, the message includes the current hop counter and, depending on
+the routing options, the message may include the full path that the message has
+taken so far. The Bloom filter should already contain the identity of the
+previous hop; however, the path should not include the identity of the previous
+hop and the receiver should append the identity of the sender to the path, not
+its own identity (this is done to reduce bandwidth).
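+The path handling rule above (the receiver appends the sender, not itself)
+can be sketched as follows. This is an illustration only: 32-bit integers
+stand in for peer identities, and the fixed path limit and all names are
+assumptions that do not appear in the DHT code.
+@example
+#include <stddef.h>
+#include <stdint.h>
+
+#define TOY_MAX_PATH 16u /* assumed limit for this sketch */
+
+struct ToyPath @{
+  uint32_t peers[TOY_MAX_PATH]; /* stand-ins for peer identities */
+  size_t len;
+@};
+
+/* Called by the receiver of a PUT. The receiver records the sender,
+   not itself, so a peer never has to transmit its own identity.
+   Returns 0 on success, -1 if the path is full. */
+static int
+toy_path_on_receive (struct ToyPath *path, uint32_t sender)
+@{
+  if (path->len >= TOY_MAX_PATH)
+    return -1;
+  path->peers[path->len++] = sender;
+  return 0;
+@}
+@end example
+If peer 1 sends a PUT to peer 2, which forwards it to peer 3, the path seen
+at peer 3 after processing is (1, 2): each entry was added by the next hop.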
+@node GETting data from the DHT2
+@subsubsection GETting data from the DHT2
+@c %**end of header
+A peer can search the DHT by sending @code{struct PeerGetMessage}s of type
+@code{GNUNET_MESSAGE_TYPE_DHT_P2P_GET} to other peers. In addition to the usual
+information about the request (type, routing options, desired replication level
+for the request, the key and the extended query), a GET request also again
+contains a hop counter, a Bloom filter over the peers that have processed the
+request already and depending on the routing options the full path traversed by
+the GET. Finally, a GET request includes a variable-size second Bloom filter
+and a so-called Bloom filter mutator value which together indicate which
+replies the sender has already seen. During the lookup, each block that matches
+the block type, key and extended query is additionally subjected to a test
+against this Bloom filter. The block plugin is expected to take the hash of the
+block and combine it with the mutator value and check if the result is not yet
+in the Bloom filter. The originator of the query will from time to time modify
+the mutator to (eventually) allow false-positives filtered by the Bloom filter
+to be returned.
+Peers that receive a GET request perform a local lookup (depending on their
+proximity to the key and the query options) and forward the request to other
+peers. They then remember the request (including the Bloom filter for blocking
+duplicate results) and when they obtain a matching, non-filtered response a
+@code{struct PeerResultMessage} of type@
+@code{GNUNET_MESSAGE_TYPE_DHT_P2P_RESULT} is forwarded to the previous hop.
+Whenever a result is forwarded, the block plugin is used to update the Bloom
+filter accordingly, to ensure that the same result is never forwarded more than
+once. The DHT service may also cache forwarded results locally if the
+"CACHE_RESULTS" option is set to "YES" in the configuration.
+@node The GNU Name System (GNS)
+@section The GNU Name System (GNS)
+@c %**end of header
+The GNU Name System (GNS) is a decentralized database that enables users to
+securely resolve names to values. Names can be used to identify other users
+(for example, in social networking), or network services (for example, VPN
+services running at a peer in GNUnet, or purely IP-based services on the
+Internet). Users interact with GNS by typing in a hostname that ends in ".gnu"
+or ".zkey".
+Videos giving an overview of most of GNS and the motivations behind it are
+available here and here. The remainder of this chapter targets developers that
+are familiar with high level concepts of GNS as presented in these talks.
+GNS-aware applications should use the GNS resolver to obtain the respective
+records that are stored under that name in GNS. Each record consists of a type,
+value, expiration time and flags.
+The type specifies the format of the value. Types below 65536 correspond to DNS
+record types, larger values are used for GNS-specific records. Applications can
+define new GNS record types by reserving a number and implementing a plugin
+(which mostly needs to convert the binary value representation to a
+human-readable text format and vice-versa). The expiration time specifies how
+long the record is to be valid. The GNS API ensures that applications are only
+given non-expired values. The flags are typically irrelevant for applications,
+as GNS uses them internally to control visibility and validity of records.
+Records are stored along with a signature. The signature is generated using the
+private key of the authoritative zone. This allows any GNS resolver to verify
+the correctness of a name-value mapping.
+Internally, GNS uses the NAMECACHE to cache information obtained from other
+users, the NAMESTORE to store information specific to the local users, and the
+DHT to exchange data between users. A plugin API is used to enable applications
+to define new GNS record types.
+@menu
+* libgnunetgns::
+* libgnunetgnsrecord::
+* GNS plugins::
+* The GNS Client-Service Protocol::
+* Hijacking the DNS-Traffic using gnunet-service-dns::
+* Serving DNS lookups via GNS on W32::
+@end menu
+@node libgnunetgns
+@subsection libgnunetgns
+@c %**end of header
+The GNS API itself is extremely simple. Clients first connect to the GNS service
+using @code{GNUNET_GNS_connect}. They can then perform lookups using
+@code{GNUNET_GNS_lookup} or cancel pending lookups using
+@code{GNUNET_GNS_lookup_cancel}. Once finished, clients disconnect using
+@code{GNUNET_GNS_disconnect}.
+@menu
+* Looking up records::
+* Accessing the records::
+* Creating records::
+* Future work::
+@end menu
+@node Looking up records
+@subsubsection Looking up records
+@c %**end of header
+@code{GNUNET_GNS_lookup} takes a number of arguments:
+@table @asis
+@item handle This is simply the GNS connection handle from
+@code{GNUNET_GNS_connect}.
+@item name The client needs to specify the name to
+be resolved. This can be any valid DNS or GNS hostname.
+@item zone The client
+needs to specify the public key of the GNS zone against which the resolution
+should be done (the ".gnu" zone). Note that a key must be provided, even if the
+name ends in ".zkey". This should typically be the public key of the
+master-zone of the user.
+@item type This is the desired GNS or DNS record type
+to look for. While all records for the given name will be returned, this can be
+important if the client wants to resolve record types that themselves delegate
+resolution, such as CNAME, PKEY or GNS2DNS. Resolving a record of any of these
+types will only work if the respective record type is specified in the request,
+as the GNS resolver will otherwise follow the delegation and return the records
+from the respective destination, instead of the delegating record.
+@item only_cached This argument should typically be set to @code{GNUNET_NO}. Setting
+it to @code{GNUNET_YES} disables resolution via the overlay network.
+@item shorten_zone_key If GNS encounters new names during resolution, their
+respective zones can automatically be learned and added to the "shorten zone".
+If this is desired, clients must pass the private key of the shorten zone. If
+NULL is passed, shortening is disabled.
+@item proc This argument identifies
+the function to call with the result. It is given proc_cls, the number of
+records found (possibly zero) and the array of the records as arguments. proc
+will only be called once. After proc has been called, the lookup must no
+longer be cancelled.
+@item proc_cls The closure for proc.
+@end table
+@node Accessing the records
+@subsubsection Accessing the records
+@c %**end of header
+The @code{libgnunetgnsrecord} library provides an API to manipulate the GNS
+record array that is given to proc. In particular, it offers functions such as
+converting record values to human-readable strings (and back). However, most
+@code{libgnunetgnsrecord} functions are not interesting to GNS client
+applications.
+For DNS records, the @code{libgnunetdnsparser} library provides functions for
+parsing (and serializing) common types of DNS records.
+@node Creating records
+@subsubsection Creating records
+@c %**end of header
+Creating GNS records is typically done by building the respective record
+information (possibly with the help of @code{libgnunetgnsrecord} and
+@code{libgnunetdnsparser}) and then using the @code{libgnunetnamestore} to
+publish the information. The GNS API is not involved in this
+operation.
+@node Future work
+@subsubsection Future work
+@c %**end of header
+In the future, we want to expand @code{libgnunetgns} to allow applications to
+observe shortening operations performed during GNS resolution, for example so
+that users can receive visual feedback when this happens.
+@node libgnunetgnsrecord
+@subsection libgnunetgnsrecord
+@c %**end of header
+The @code{libgnunetgnsrecord} library is used to manipulate GNS records (in
+plaintext or in their encrypted format). Applications mostly interact with
+@code{libgnunetgnsrecord} by using the functions to convert GNS record values
+to strings or vice-versa, or to look up a GNS record type number by name (or
+vice-versa). The library also provides various other functions that are mostly
+used internally within GNS, such as converting keys to names, checking for
+expiration, encrypting GNS records to GNS blocks, verifying GNS block
+signatures and decrypting GNS records from GNS blocks.
+We will now discuss the four commonly used functions of the API.@
+@code{libgnunetgnsrecord} does not perform these operations itself, but instead
+uses plugins to perform the operation. GNUnet includes plugins to support
+common DNS record types as well as standard GNS record types.
+@menu
+* Value handling::
+* Type handling::
+@end menu
+@node Value handling
+@subsubsection Value handling
+@c %**end of header
+@code{GNUNET_GNSRECORD_value_to_string} can be used to convert the (binary)
+representation of a GNS record value to a human readable, 0-terminated UTF-8
+string. NULL is returned if the specified record type is not supported by any
+available plugin.
+@code{GNUNET_GNSRECORD_string_to_value} can be used to try to convert a human
+readable string to the respective (binary) representation of a GNS record
+value.
+@node Type handling
+@subsubsection Type handling
+@c %**end of header
+@code{GNUNET_GNSRECORD_typename_to_number} can be used to obtain the numeric
+value associated with a given typename. For example, given the typename "A"
+(for DNS A records), the function will return the number 1. A list of common
+DNS record types is
+@uref{http://en.wikipedia.org/wiki/List_of_DNS_record_types, here}. Note that
+not all DNS record types are supported by GNUnet GNSRECORD plugins at this
+time.
+@code{GNUNET_GNSRECORD_number_to_typename} can be used to obtain the typename
+associated with a given numeric value. For example, given the type number 1,
+the function will return the typename "A".
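+The two type conversions can be pictured as lookups in a table of names and
+numbers. The following C sketch is a stand-alone illustration with
+hypothetical @code{toy_} names; it does not use the plugin mechanism of
+@code{libgnunetgnsrecord} and only knows a handful of DNS record types. The
+return values used for unknown inputs are likewise an assumption of the
+sketch.
+@example
+#include <stdint.h>
+#include <string.h>
+
+struct ToyTypeName @{
+  const char *name;
+  uint32_t number;
+@};
+
+/* A few well-known DNS record types with their DNS type numbers. */
+static const struct ToyTypeName toy_types[] = @{
+  @{ "A", 1 @}, @{ "NS", 2 @}, @{ "CNAME", 5 @},
+  @{ "MX", 15 @}, @{ "TXT", 16 @}, @{ "AAAA", 28 @}
+@};
+
+#define TOY_NUM_TYPES (sizeof (toy_types) / sizeof (toy_types[0]))
+
+/* Returns the type number, or UINT32_MAX if the name is unknown. */
+static uint32_t
+toy_typename_to_number (const char *name)
+@{
+  for (size_t i = 0; i < TOY_NUM_TYPES; i++)
+    if (0 == strcmp (name, toy_types[i].name))
+      return toy_types[i].number;
+  return UINT32_MAX;
+@}
+
+/* Returns the type name, or NULL if the number is unknown. */
+static const char *
+toy_number_to_typename (uint32_t number)
+@{
+  for (size_t i = 0; i < TOY_NUM_TYPES; i++)
+    if (number == toy_types[i].number)
+      return toy_types[i].name;
+  return NULL;
+@}
+@end example
+In GNUnet itself, the table is effectively spread over the installed plugins,
+so each plugin only needs to know the record types of its own subsystem.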
+@node GNS plugins
+@subsection GNS plugins
+@c %**end of header
+Adding a new GNS record type typically involves writing (or extending) a
+GNSRECORD plugin. The plugin needs to implement the
+@code{gnunet_gnsrecord_plugin.h} API which provides basic functions that are
+needed by GNSRECORD to convert typenames and values of the respective record
+type to strings (and back). These gnsrecord plugins are typically implemented
+within their respective subsystems. Examples for such plugins can be found in
+the GNSRECORD, GNS and CONVERSATION subsystems.
+The @code{libgnunetgnsrecord} library is then used to locate, load and query
+the appropriate gnsrecord plugin. Which plugin is appropriate is determined by
+the record type (which is just a 32-bit integer). The @code{libgnunetgnsrecord}
+library loads all gnsrecord plugins that are installed at the local peer and
+forwards the application request to the plugins. If the record type is not
+supported by the plugin, it should simply return an error code.
+The central functions of the gnsrecord APIs (plugin and main library) are the same
+four functions for converting between values and strings, and typenames and
+numbers documented in the previous subsection.
+@node The GNS Client-Service Protocol
+@subsection The GNS Client-Service Protocol
+@c %**end of header
+The GNS client-service protocol consists of two simple messages, the
+@code{LOOKUP} message and the @code{LOOKUP_RESULT}. Each @code{LOOKUP} message
+contains a unique 32-bit identifier, which will be included in the
+corresponding response. Thus, clients can send many lookup requests in parallel
+and receive responses out-of-order. A @code{LOOKUP} request also includes the
+public key of the GNS zone, the desired record type and fields specifying
+whether shortening is enabled or networking is disabled. Finally, the
+@code{LOOKUP} message includes the name to be resolved.
+The response includes the number of records and the records themselves in the
+format created by @code{GNUNET_GNSRECORD_records_serialize}. They can thus be
+deserialized using @code{GNUNET_GNSRECORD_records_deserialize}.
+@node Hijacking the DNS-Traffic using gnunet-service-dns
+@subsection Hijacking the DNS-Traffic using gnunet-service-dns
+@c %**end of header
+This section documents how the gnunet-service-dns (and the gnunet-helper-dns)
+intercepts DNS queries from the local system.@ This is merely one method for
+how we can obtain GNS queries. It is also possible to change @code{resolv.conf}
+to point to a machine running @code{gnunet-dns2gns} or to modify libc's name
+system switch (NSS) configuration to include a GNS resolution plugin. The
+method described in this chapter is more of a last-ditch catch-all approach.
+@code{gnunet-service-dns} enables intercepting DNS traffic using policy based
+routing. We MARK every outgoing DNS-packet if it was not sent by our
+application. Using a second routing table in the Linux kernel these marked
+packets are then routed through our virtual network interface and can thus be
+captured unchanged.
+Our application then reads the query and decides how to handle it: A query to
+an address ending in ".gnu" or ".zkey" is hijacked by @code{gnunet-service-gns}
+and resolved internally using GNS. In the future, a reverse query for an
+address of the configured virtual network could be answered with records kept
+about previous forward queries. Queries that are not hijacked by some
+application using the DNS service will be sent to the original recipient. The
+answer to the query will always be sent back through the virtual interface with
+the original nameserver as source address.
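+The decision whether a query is handled by GNS boils down to a suffix test on
+the queried name. The C sketch below shows such a test in isolation. The
+function names are hypothetical, the comparison is case-sensitive and names
+with a trailing dot are not handled; the real service may treat these cases
+differently.
+@example
+#include <string.h>
+
+/* Returns 1 if name ends with suffix, 0 otherwise. */
+static int
+toy_ends_with (const char *name, const char *suffix)
+@{
+  size_t nlen = strlen (name);
+  size_t slen = strlen (suffix);
+
+  if (slen > nlen)
+    return 0;
+  return 0 == strcmp (name + nlen - slen, suffix);
+@}
+
+/* Returns 1 if the name should be resolved via GNS, 0 otherwise. */
+static int
+toy_is_gns_name (const char *name)
+@{
+  return toy_ends_with (name, ".gnu") || toy_ends_with (name, ".zkey");
+@}
+@end example
+With this test, "www.example.gnu" would be resolved using GNS, while
+"www.gnu.org" would be passed on to the original recipient.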
+@menu
+* Network Setup Details::
+@end menu
+@node Network Setup Details
+@subsubsection Network Setup Details
+@c %**end of header
+The DNS interceptor adds the following rules to the Linux kernel:
+@example
+iptables -t mangle -I OUTPUT 1 -p udp --sport $LOCALPORT --dport 53 -j ACCEPT
+iptables -t mangle -I OUTPUT 2 -p udp --dport 53 -j MARK --set-mark 3
+ip rule add fwmark 3 table2
+ip route add default via $VIRTUALDNS table2
+@end example
+The first rule makes sure that all packets coming from a port our application
+opened beforehand (@code{$LOCALPORT}) will be routed normally. The second rule
+marks every other packet to a DNS server with mark 3 (chosen arbitrarily). The
+@code{ip rule} command adds a routing policy that directs packets carrying
+mark 3 to the second routing table, and the @code{ip route} command makes
+@code{$VIRTUALDNS} the default route in that table.
+@node Serving DNS lookups via GNS on W32
+@subsection Serving DNS lookups via GNS on W32
+@c %**end of header
+This section documents how libw32nsp (and gnunet-gns-helper-service-w32)
+resolve DNS queries on the local system. This only applies to GNUnet
+running on W32.
+W32 has a concept of "Namespaces" and "Namespace providers". These are used to
+present various name systems to applications in a generic way. Namespaces
+include DNS, mDNS, NLA and others. For each namespace any number of providers
+could be registered, and they are queried in an order of priority (which is
+adjustable).
+Applications can resolve names by using the WSALookupService*() family of
+functions.
+However, these are WSA-only facilities. Common BSD socket functions for
+namespace resolutions are gethostbyname and getaddrinfo (among others). These
+functions are implemented internally (by default - by mswsock, which also
+implements the default DNS provider) as wrappers around WSALookupService*()
+functions (see "Sample Code for a Service Provider" on MSDN).
+On W32 GNUnet builds a libw32nsp - a namespace provider, which can then be
+installed into the system by using w32nsp-install (and uninstalled by
+w32nsp-uninstall), as described in "Installation Handbook".
+libw32nsp is very simple and has almost no dependencies. As a response to
+NSPLookupServiceBegin(), it only checks that the provider GUID passed to it by
+the caller matches the GNUnet DNS Provider GUID, checks that the name being
+resolved ends in ".gnu" or ".zkey", then connects to gnunet-gns-helper-service-w32 at
+127.0.0.1:5353 (hardcoded) and sends the name resolution request there,
+returning the connected socket to the caller.
+When the caller invokes NSPLookupServiceNext(), libw32nsp reads a completely
+formed reply from that socket, unmarshalls it, then gives it back to the
+caller.
+At the moment gnunet-gns-helper-service-w32 only ever gives
+one reply, and subsequent calls to NSPLookupServiceNext() will fail with
+WSA_NODATA (the first call to NSPLookupServiceNext() might also fail if GNS
+failed to find the name, or if there was an error connecting to it).
+gnunet-gns-helper-service-w32 does most of the processing:
+@itemize @bullet
+@item Maintains a connection to GNS.
+@item Reads GNS config and loads appropriate keys.
+@item Checks the service GUID and decides on the type of record to look up,
+refusing to make a lookup outright when an unsupported service GUID is passed.
+@item Launches the lookup.
+@end itemize
+When the lookup result arrives, gnunet-gns-helper-service-w32 forms a complete
+reply (including filling a WSAQUERYSETW structure and, possibly, a binary blob
+with a hostent structure for a gethostbyname() client), marshalls it, and sends
+it back to libw32nsp. If no records were found, it sends an empty header.
+This works for most normal applications that use gethostbyname() or
+getaddrinfo() to resolve names, but fails to do anything with applications that
+use alternative means of resolving names (such as sending queries to a DNS
+server directly by themselves). This includes some well-known utilities,
+like "ping" and "nslookup".
+@node The GNS Namecache
+@section The GNS Namecache
+@c %**end of header
+The NAMECACHE subsystem is responsible for caching (encrypted) resolution
+results of the GNU Name System (GNS). GNS makes zone information available to
+other users via the DHT. However, as accessing the DHT for every lookup is
+expensive (and as the DHT's local cache is lost whenever the peer is
+restarted), GNS uses the NAMECACHE as a more persistent cache for DHT lookups.
+Thus, instead of always looking up every name in the DHT, GNS first checks if
+the result is already available locally in the NAMECACHE. Only if there is no
+result in the NAMECACHE does GNS query the DHT. The NAMECACHE stores data in the
+same (encrypted) format as the DHT. It thus makes no sense to iterate over all
+items in the NAMECACHE --- the NAMECACHE does not have a way to provide the
+keys required to decrypt the entries.
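+The lookup order described above can be illustrated with a minimal Python
+sketch. It is not GNUnet code: the @code{NameResolver} class and the
+@code{dht_get} callable are hypothetical stand-ins, and blocks are treated as
+opaque bytes because the cache cannot decrypt them.

```python
class NameResolver:
    """Toy model of the lookup order: local cache first, DHT only on a miss."""

    def __init__(self, dht_get):
        # dht_get is a hypothetical callable: query hash -> encrypted block or None.
        self.dht_get = dht_get
        self.namecache = {}      # query hash -> encrypted block (opaque bytes)
        self.dht_queries = 0     # counts how often the expensive path was taken

    def lookup(self, query):
        block = self.namecache.get(query)
        if block is not None:
            return block         # served locally, no DHT traffic
        self.dht_queries += 1
        block = self.dht_get(query)
        if block is not None:
            self.namecache[query] = block   # remember the result for next time
        return block
```

+In this model a repeated lookup of the same name reaches the DHT only once.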
+Blocks in the NAMECACHE share the same expiration mechanism as blocks in the
+DHT: a block expires whenever any of the records in the (encrypted) block
+expires. The expiration time of the block is the only information stored in
+plaintext. The NAMECACHE service internally performs all of the required work
+to expire blocks; clients do not have to worry about this. Also, given that
+NAMECACHE stores only GNS blocks that local users requested, there is no
+configuration option to limit the size of the NAMECACHE. It is assumed to be
+always small enough (a few MB) to fit on the drive.
+The NAMECACHE supports the use of different database backends via a plugin API.
+@menu
+* libgnunetnamecache::
+* The NAMECACHE Client-Service Protocol::
+* The NAMECACHE Plugin API::
+@end menu
+@node libgnunetnamecache
+@subsection libgnunetnamecache
+@c %**end of header
+The NAMECACHE API consists of five simple functions. First, there is
+@code{GNUNET_NAMECACHE_connect} to connect to the NAMECACHE service. This
+returns the handle required for all other operations on the NAMECACHE. Using
+@code{GNUNET_NAMECACHE_block_cache} clients can insert a block into the cache.
+@code{GNUNET_NAMECACHE_lookup_block} can be used to look up blocks that were
+stored in the NAMECACHE. Both operations can be cancelled using
+@code{GNUNET_NAMECACHE_cancel}. Note that cancelling a
+@code{GNUNET_NAMECACHE_block_cache} operation can result in the block being
+stored in the NAMECACHE --- or not. Cancellation primarily ensures that the
+continuation function with the result of the operation will no longer be
+invoked. Finally, @code{GNUNET_NAMECACHE_disconnect} closes the connection to
+the NAMECACHE.
+The maximum size of a block that can be stored in the NAMECACHE is
+@code{GNUNET_NAMECACHE_MAX_VALUE_SIZE}, which is defined to be 63 kB.
+@node The NAMECACHE Client-Service Protocol
+@subsection The NAMECACHE Client-Service Protocol
+@c %**end of header
+All messages in the NAMECACHE IPC protocol start with the @code{struct
+GNUNET_NAMECACHE_Header} which adds a request ID (32-bit integer) to the
+standard message header. The request ID is used to match requests with the
+respective responses from the NAMECACHE, as they are allowed to happen
+out-of-order.
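+The role of the request ID can be sketched in Python as follows. The layout
+assumes the standard GNUnet message header (16-bit size, then 16-bit type, in
+network byte order) followed by the 32-bit request ID, also assumed to be in
+network byte order. The @code{PendingRequests} class and the message type
+numbers used with it are hypothetical and only illustrate the matching.

```python
import struct

# Assumed layout: 16-bit size, 16-bit type, 32-bit request ID, big-endian.
HEADER = struct.Struct("!HHI")


def pack_header(msg_type, request_id, payload=b""):
    """Prefix a payload with the header; size covers header plus payload."""
    size = HEADER.size + len(payload)
    return HEADER.pack(size, msg_type, request_id) + payload


def unpack_header(data):
    """Return (size, type, request ID, payload) for one message."""
    size, msg_type, request_id = HEADER.unpack_from(data)
    return size, msg_type, request_id, data[HEADER.size:size]


class PendingRequests:
    """Matches responses to requests by request ID (hypothetical helper)."""

    def __init__(self):
        self.next_id = 0
        self.callbacks = {}

    def register(self, callback):
        rid = self.next_id
        self.next_id = (self.next_id + 1) & 0xFFFFFFFF   # wrap at 32 bits
        self.callbacks[rid] = callback
        return rid

    def dispatch(self, data):
        _, _, rid, payload = unpack_header(data)
        callback = self.callbacks.pop(rid, None)
        if callback is not None:     # unknown IDs are silently ignored here
            callback(payload)
```

+Because dispatch is keyed on the request ID, responses can arrive in any
+order and still reach the right continuation.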
+@menu
+* Lookup::
+* Store::
+@end menu
+@node Lookup
+@subsubsection Lookup
+@c %**end of header
+The @code{struct LookupBlockMessage} is used to look up a block stored in the
+cache. It contains the query hash. The NAMECACHE always responds with a
+@code{struct LookupBlockResponseMessage}. If the NAMECACHE has no matching
+block, it sets the expiration time in the response to zero. Otherwise, the
+response is
+expected to contain the expiration time, the ECDSA signature, the derived key
+and the (variable-size) encrypted data of the block.
+@node Store
+@subsubsection Store
+@c %**end of header
+The @code{struct BlockCacheMessage} is used to cache a block in the NAMECACHE.
+It has the same structure as the @code{struct LookupBlockResponseMessage}. The
+service responds with a @code{struct BlockCacheResponseMessage} which contains
+the result of the operation (success or failure). In the future, we might want
+to make it possible to provide an error message as well.
+@node The NAMECACHE Plugin API
+@subsection The NAMECACHE Plugin API
+@c %**end of header
+The NAMECACHE plugin API consists of two functions, @code{cache_block} to store
+a block in the database, and @code{lookup_block} to look up a block in the
+database.
+@menu
+* Lookup2::
+* Store2::
+@end menu
+@node Lookup2
+@subsubsection Lookup
+@c %**end of header
+The @code{lookup_block} function is expected to return at most one block to the
+iterator, and return @code{GNUNET_NO} if there were no non-expired results. If
+there are multiple non-expired results in the cache, the lookup is supposed to
+return the result with the largest expiration time.
+@node Store2
+@subsubsection Store
+@c %**end of header
+The @code{cache_block} function is expected to try to store the block in the
+database, and return @code{GNUNET_SYSERR} if this was not possible for any
+reason. Furthermore, @code{cache_block} is expected to implicitly perform cache
+maintenance and purge blocks from the cache that have expired. Note that
+@code{cache_block} might encounter the case where the database already has
+another block stored under the same key. In this case, the plugin must ensure
+that the block with the larger expiration time is preserved. This
+can be done either by simply adding new blocks and selecting for the most recent
+expiration time during lookup, or by checking which block is more recent during
+the store operation.
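+The contract of the two plugin functions can be sketched with an in-memory
+Python model. This is not the C plugin interface: the class name, the
+@code{now} clock callable and the return values are assumptions chosen for
+illustration. The sketch takes the second approach and compares expiration
+times during the store operation.

```python
class MemoryNamecachePlugin:
    """In-memory model of the cache_block / lookup_block contract."""

    def __init__(self, now):
        self.now = now       # hypothetical clock callable returning a number
        self.blocks = {}     # query hash -> (expiration time, encrypted data)

    def cache_block(self, query, expiration, data):
        current = self.now()
        # Implicit cache maintenance: purge everything that has expired.
        self.blocks = {q: b for q, b in self.blocks.items() if b[0] > current}
        old = self.blocks.get(query)
        # Preserve whichever block has the larger expiration time.
        if old is None or expiration > old[0]:
            self.blocks[query] = (expiration, data)
        return True

    def lookup_block(self, query):
        entry = self.blocks.get(query)
        if entry is None or entry[0] <= self.now():
            return None      # stands in for "no non-expired result"
        return entry
```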
+@node The REVOCATION Subsystem
+@section The REVOCATION Subsystem
+@c %**end of header
+The REVOCATION subsystem is responsible for key revocation of Egos. If a user
+learns that his private key has been compromised, or has lost it, he can use
+the REVOCATION system to inform all of the other users that this private key
+is no longer valid. The subsystem thus includes ways to query for the validity of
+keys and to propagate revocation messages.
+@menu
+* Dissemination::
+* Revocation Message Design Requirements::
+* libgnunetrevocation::
+* The REVOCATION Client-Service Protocol::
+* The REVOCATION Peer-to-Peer Protocol::
+@end menu
+@node Dissemination
+@subsection Dissemination
+@c %**end of header
+When a revocation is performed, the revocation is first of all disseminated by
+flooding the overlay network. The goal is to reach every peer, so that when a
+peer needs to check if a key has been revoked, this will be purely a local
+operation where the peer looks at its local revocation list. Flooding the
+network is also the most robust form of key revocation --- an adversary would
+have to control a separator of the overlay graph to restrict the propagation of
+the revocation message. Flooding is also very easy to implement --- peers that
+receive a revocation message for a key that they have never seen before simply
+pass the message to all of their neighbours.
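+The flooding rule can be sketched in Python on a small overlay graph. The
+graph representation (a dictionary mapping each peer to its neighbours) and
+the function name are assumptions made for this illustration.

```python
from collections import deque


def flood(graph, origin):
    """Simulate flooding; return (set of peers reached, messages sent).

    graph maps each peer to a list of its neighbours (undirected overlay).
    """
    seen = {origin}
    queue = deque([origin])
    messages = 0
    while queue:
        peer = queue.popleft()
        # A peer that sees the revocation for the first time passes it to
        # all of its neighbours, including the one it came from.
        for neighbour in graph[peer]:
            messages += 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen, messages
```

+In this simple model every reached peer sends one message per neighbour, so
+a connected overlay with |E| edges sees 2|E| messages. Peers that are not
+connected to the origin are never reached.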
+Flooding can only distribute the revocation message to peers that are online.
+In order to notify peers that join the network later, the revocation service
+performs efficient set reconciliation over the sets of known revocation
+messages whenever two peers (that both support REVOCATION dissemination)
+connect. The SET service is used to perform this operation
+efficiently.
+@node Revocation Message Design Requirements
+@subsection Revocation Message Design Requirements
+@c %**end of header
+Flooding is quite costly, creating O(|E|) messages on a network
+with |E| edges. Thus, revocation messages are required to contain a
+proof-of-work, the result of an expensive computation (which, however, is cheap
+to verify). Only peers that have expended the CPU time necessary to provide
+this proof will be able to flood the network with the revocation message. This
+ensures that an attacker cannot simply flood the network with millions of
+revocation messages. The proof-of-work required by GNUnet is set to take days
+on a typical PC to compute; if the ability to quickly revoke a key is needed,
+users have the option to pre-compute revocation messages to store off-line and
+use instantly once their key has been compromised.
+Revocation messages must also be signed by the private key that is being
+revoked. Thus, they can only be created while the private key is in the
+possession of the respective user. This is another reason to create a
+revocation message ahead of time and store it in a secure location.
+@node libgnunetrevocation
+@subsection libgnunetrevocation
+@c %**end of header
+The REVOCATION API consists of two parts, to query and to issue
+revocations.
+@menu
+* Querying for revoked keys::
+* Preparing revocations::
+* Issuing revocations::
+@end menu
+@node Querying for revoked keys
+@subsubsection Querying for revoked keys
+@c %**end of header
+@code{GNUNET_REVOCATION_query} is used to check if a given ECDSA public key has
+been revoked. The given callback will be invoked with the result of the check.
+The query can be cancelled using @code{GNUNET_REVOCATION_query_cancel} on the
+return value.
+@node Preparing revocations
+@subsubsection Preparing revocations
+@c %**end of header
+It is often desirable to create a revocation record ahead-of-time and store it
+in an off-line location to be used later in an emergency. This is particularly
+true for GNUnet revocations, where performing the revocation operation itself
+is computationally expensive and thus is likely to take some time. Thus, if
+users want the ability to perform revocations quickly in an emergency, they
+must pre-compute the revocation message. The revocation API enables this with
+two functions that are used to compute the revocation message, but not trigger
+the actual revocation operation.
+@code{GNUNET_REVOCATION_check_pow} should be used to calculate the
+proof-of-work required in the revocation message. This function takes the
+public key, the required number of bits for the proof of work (which in GNUnet
+is a network-wide constant) and finally a proof-of-work number as arguments.
+The function then checks if the given proof-of-work number is a valid proof of
+work for the given public key. Clients preparing a revocation are expected to
+call this function repeatedly (typically with a monotonically increasing
+sequence of candidate proof-of-work numbers) until a given number satisfies
+the check. That number should then be saved for later use in the revocation
+operation.
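+The search loop that a client runs around the check can be sketched in
+Python. This only shows the shape of the computation: SHA-512 over the
+number and the key is a stand-in for the hash GNUnet actually uses, the
+"leading zero bits" criterion is an assumption, and the function names are
+hypothetical. A small bit count is used so that the example finishes quickly.

```python
import hashlib


def count_leading_zero_bits(digest):
    """Number of zero bits before the first one bit in a byte string."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def check_pow(public_key, matching_bits, pow_number):
    """Stand-in check: does this number satisfy the proof-of-work?"""
    data = pow_number.to_bytes(8, "big") + public_key
    digest = hashlib.sha512(data).digest()   # assumption: not GNUnet's hash
    return count_leading_zero_bits(digest) >= matching_bits


def find_pow(public_key, matching_bits, start=0):
    """Try a monotonically increasing sequence of candidate numbers."""
    pow_number = start
    while not check_pow(public_key, matching_bits, pow_number):
        pow_number += 1
    return pow_number
```

+Each additional required bit doubles the expected number of candidates, which
+is why the search is expensive while verifying a single number stays cheap.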
+@code{GNUNET_REVOCATION_sign_revocation} is used to generate the signature that
+is required in a revocation message. It takes the private key that (possibly in
+the future) is to be revoked and returns the signature. The signature can again
+be saved to disk for later use, which will then allow performing a revocation
+even without access to the private key.
+@node Issuing revocations
+@subsubsection Issuing revocations
+Given an ECDSA public key, the signature from
+@code{GNUNET_REVOCATION_sign_revocation} and the proof-of-work,
+@code{GNUNET_REVOCATION_revoke} can be used to perform the
+actual revocation. The given callback is called upon completion of the
+operation. @code{GNUNET_REVOCATION_revoke_cancel} can be used to stop the
+library from calling the continuation; however, in that case it is undefined
+whether or not the revocation operation will be executed.
+@node The REVOCATION Client-Service Protocol
+@subsection The REVOCATION Client-Service Protocol
+The REVOCATION protocol consists of four simple messages.
+A @code{QueryMessage} containing a public ECDSA key is used to check if a
+particular key has been revoked. The service responds with a
+@code{QueryResponseMessage} which simply contains a bit that says if the given
+public key is still valid, or if it has been revoked.
+The second possible interaction is for a client to revoke a key by passing a
+@code{RevokeMessage} to the service. The @code{RevokeMessage} contains the
+ECDSA public key to be revoked, a signature by the corresponding private key
+and the proof-of-work. The service responds with a
+@code{RevocationResponseMessage} which can be used to indicate that the
+@code{RevokeMessage} was invalid (e.g. because the proof-of-work is
+incorrect), or otherwise indicates that the revocation has been processed
+successfully.
+@node The REVOCATION Peer-to-Peer Protocol
+@subsection The REVOCATION Peer-to-Peer Protocol
+@c %**end of header
+Revocation uses two disjoint ways to spread revocation information among peers.
+First of all, P2P gossip exchanged via CORE-level neighbours is used to quickly
+spread revocations to all connected peers. Second, whenever two peers (that
+both support revocations) connect, the SET service is used to compute the union
+of the respective revocation sets.
+In both cases, the exchanged messages are @code{RevokeMessage}s which contain
+the public key that is being revoked, a matching ECDSA signature, and a
+proof-of-work. Whenever a peer learns about a new revocation this way, it first
+validates the signature and the proof-of-work, then stores it to disk
+(typically to a file $GNUNET_DATA_HOME/revocation.dat) and finally spreads the
+information to all directly connected neighbours.
+For computing the union using the SET service, the peer with the smaller hashed
+peer identity will connect (as a "client" in the two-party set protocol) to the
+other peer after one second (to reduce traffic spikes on connect) and initiate
+the computation of the set union. All revocation services use a common hash to
+identify the SET operation over revocation sets.
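+The rule for choosing the initiator can be sketched in Python. SHA-512 as the
+hash of the peer identity and a plain byte-wise comparison are assumptions of
+this sketch, and the function names are hypothetical. The second function
+only models the outcome of the set union, not the SET protocol itself.

```python
import hashlib


def hashed_identity(peer_id):
    """Hash a peer identity (bytes); SHA-512 is assumed here."""
    return hashlib.sha512(peer_id).digest()


def initiates_set_union(my_id, other_id):
    """True if this peer has the smaller hashed identity and should
    therefore act as the client of the two-party set protocol."""
    return hashed_identity(my_id) < hashed_identity(other_id)


def set_union(mine, theirs):
    """Outcome of the reconciliation: both sides end up with the union."""
    merged = mine | theirs
    return merged, merged
```

+For any two distinct peers exactly one side initiates, so each new connection
+triggers a single set union operation.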
+The current implementation accepts revocation set union operations from all
+peers at any time; however, well-behaved peers should only initiate this
+operation once after establishing a connection to a peer with a larger hashed
+peer identity.
+@node GNUnet's File-sharing (FS) Subsystem
+@section GNUnet's File-sharing (FS) Subsystem
+@c %**end of header
+This chapter describes the details of how the file-sharing service works. As
+with all services, it is split into an API (libgnunetfs), the service process
+(gnunet-service-fs) and user interface(s). The file-sharing service uses the
+datastore service to store blocks and the DHT (and indirectly datacache) for
+lookups for non-anonymous file-sharing. Furthermore, the file-sharing service
+uses the block library (and the block fs plugin) for validation of DHT
+operations.
+In contrast to many other services, libgnunetfs is rather complex since the
+client library includes a large number of high-level abstractions; this is
+necessary since the FS service itself largely only operates on the block level.
+The FS library is responsible for providing a file-based abstraction to
+applications, including directories, meta data, keyword search, verification,
+and so on.
+The method used by GNUnet to break large files into blocks and to use keyword
+search is called the "Encoding for Censorship Resistant Sharing" (ECRS). ECRS
+is largely implemented in the FS library; block validation is also reflected in
+the block FS plugin and the FS service. ECRS on-demand encoding is implemented
+in the FS service.
+NOTE: The documentation in this chapter is quite incomplete.
+@menu
+* Encoding for Censorship-Resistant Sharing (ECRS)::
+* File-sharing persistence directory structure::
+@end menu
+@node Encoding for Censorship-Resistant Sharing (ECRS)
+@subsection Encoding for Censorship-Resistant Sharing (ECRS)
+@c %**end of header
+When GNUnet shares files, it uses a content encoding that is called ECRS, the
+Encoding for Censorship-Resistant Sharing. Most of ECRS is described in the
+(so far unpublished) research paper attached to this page. ECRS obsoletes the
+previous ESED and ESED II encodings which were used in GNUnet before version
+0.7.0. The rest of this page assumes that the reader is familiar with the
+attached paper. What follows is a description of some minor extensions that
+GNUnet makes over what is described in the paper. The reason why these
+extensions are not in the paper is that we felt that they were obvious or
+trivial extensions to the original scheme and thus did not warrant space in
+the research report.
+@menu
+* Namespace Advertisements::
+* KSBlocks::
+@end menu
+@node Namespace Advertisements
+@subsubsection Namespace Advertisements
+@c %**end of header
+An @code{SBlock} with identifier "all zeros" is a signed
+advertisement for a namespace. This special @code{SBlock} contains metadata
+describing the content of the namespace. Instead of the name of the identifier
+for a potential update, it contains the identifier for the root of the
+namespace. The URI should always be empty. The @code{SBlock} is signed with
+the content provider's RSA private key (just like any other SBlock). Peers
+can search for @code{SBlock}s in order to find out more about a namespace.
+@node KSBlocks
+@subsubsection KSBlocks
+@c %**end of header
+GNUnet implements @code{KSBlocks} which are @code{KBlocks} that, instead of
+encrypting a CHK and metadata, encrypt an @code{SBlock}. In other
+words, @code{KSBlocks} enable GNUnet to find @code{SBlocks} using the global
+keyword search. Usually the encrypted @code{SBlock} is a namespace
+advertisement. The rationale behind @code{KSBlock}s and @code{SBlock}s is to
+enable peers to discover namespaces via keyword searches and to associate
+useful information with namespaces. When GNUnet finds @code{KSBlocks} during a
+normal keyword search, it adds the information to an internal list of
+discovered namespaces. Users looking for interesting namespaces can then
+inspect this list, reducing the need for out-of-band discovery of namespaces.
+Naturally, namespaces (or more specifically, namespace advertisements) can
+also be referenced from directories, but @code{KSBlock}s should make it easier
+to advertise namespaces for the owner of the pseudonym since they eliminate
+the need to first create a directory.
+Collections are also advertised using @code{KSBlock}s.
+The ECRS paper (ecrs.pdf, 270.68 KB) is available at
+@uref{https://gnunet.org/sites/default/files/ecrs.pdf}.
+@node File-sharing persistence directory structure
+@subsection File-sharing persistence directory structure
+@c %**end of header
+This section documents how the file-sharing library implements persistence of
+file-sharing operations and specifically the resulting directory structure.
+This code is only active if the @code{GNUNET_FS_FLAGS_PERSISTENCE} flag was set
+when calling @code{GNUNET_FS_start}. In this case, the file-sharing library
+will try hard to ensure that all major operations (searching, downloading,
+publishing, unindexing) are persistent, that is, can live longer than the
+process itself. More specifically, an operation is supposed to live until it is
+explicitly stopped.
+If @code{GNUNET_FS_stop} is called before an operation has been stopped, a
+@code{SUSPEND} event is generated and then when the process calls
+@code{GNUNET_FS_start} next time, a @code{RESUME} event is generated.
+Additionally, even if an application crashes (segfault, SIGKILL, system crash)
+and hence @code{GNUNET_FS_stop} is never called and no @code{SUSPEND} events
+are generated, operations are still resumed (with @code{RESUME} events). This
+is implemented by constantly writing the current state of the file-sharing
+operations to disk. Specifically, the current state is always written to disk
+whenever anything significant changes (the exception is block-wise progress in
+publishing and unindexing, since those operations would be slowed down
+significantly and can be resumed cheaply even without detailed accounting).
+Note that if the process crashes (or is killed) during a serialization
+operation, FS does not guarantee that this specific operation is recoverable
+(no strict transactional semantics, again for performance reasons). However,
+all other unrelated operations should resume nicely.
+Since we need to serialize the state continuously and want to recover as much
+as possible even after crashing during a serialization operation, we do not use
+one large file for serialization. Instead, several directories are used for the
+various operations. When @code{GNUNET_FS_start} executes, the master
+directories are scanned for files describing operations to resume. Sometimes,
+these operations can refer to related operations in child directories which may
+also be resumed at this point. Note that corrupted files are cleaned up
+automatically. However, dangling files in child directories (those that are not
+referenced by files from the master directories) are not automatically removed.
+Persistence data is kept in a directory that begins with the "STATE_DIR" prefix
+from the configuration file (by default, "$SERVICEHOME/persistence/") followed
+by the name of the client as given to @code{GNUNET_FS_start} (for example,
+"gnunet-gtk") followed by the actual name of the master or child directory.
+The names for the master directories follow the names of the operations:
+@itemize @bullet
+@item "search"
+@item "download"
+@item "publish"
+@item "unindex"
+@end itemize
+Each of the master directories contains names (chosen at random) for each
+active top-level (master) operation. Note that a download that is associated
+with a search result is not a top-level operation.
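+How the directory names are put together can be sketched in Python. The
+sketch assumes that the "STATE_DIR" prefix, the client name and the
+directory name are joined as separate path components, which the text above
+does not state explicitly. The function names are hypothetical.

```python
import os

# Names of the master directories, one per top-level operation type.
MASTER_DIRS = ("search", "download", "publish", "unindex")


def persistence_dir(state_dir, client_name, directory):
    """Build the path of one master or child directory (assumed layout)."""
    return os.path.join(state_dir, client_name, directory)


def master_operation_files(state_dir, client_name):
    """List the files found in each master directory.

    Missing directories are treated as having no operations to resume.
    """
    found = {}
    for name in MASTER_DIRS:
        path = persistence_dir(state_dir, client_name, name)
        found[name] = sorted(os.listdir(path)) if os.path.isdir(path) else []
    return found
```

+A scan of this kind corresponds to what happens when
+@code{GNUNET_FS_start} looks for operations to resume. Child directories are
+deliberately not scanned, since they are only consulted when a master
+operation refers to them.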
+In contrast to the master directories, the child directories are only consulted
+when another operation refers to them. For each search, a subdirectory (named
+after the master search synchronization file) contains the search results.
+Search results can have an associated download, which is then stored in the
+general "download-child" directory. Downloads can be recursive, in which case
+children are stored in subdirectories mirroring the structure of the recursive
+download (either starting in the master "download" directory or in the
+"download-child" directory depending on how the download was initiated). For
+publishing operations, the "publish-file" directory contains information about
+the individual files and directories that are part of the publication. However,
+this directory structure is flat and does not mirror the structure of the
+publishing operation. Note that unindex operations cannot have associated child
+operations.
+@node GNUnet's REGEX Subsystem
+@section GNUnet's REGEX Subsystem
+@c %**end of header
+Using the REGEX subsystem, you can discover peers that offer a particular
+service using regular expressions. The peers that offer a service specify it
+using a regular expression. Peers that want to patronize a service search
+using a string. The REGEX subsystem will then use the DHT to return a set of
+matching offerers to the patrons.
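+The announce and search roles can be illustrated with a purely local Python
+sketch. It uses Python's @code{re} module and an in-memory list in place of
+the DHT, so it shows only the matching semantics, not how REGEX distributes
+the work. The class name and the example expressions are hypothetical.

```python
import re


class RegexDirectory:
    """Local stand-in for the DHT: stores announced regular expressions."""

    def __init__(self):
        self.announcements = []   # list of (peer ID, compiled expression)

    def announce(self, peer_id, regex):
        """An offering peer announces its service as a regular expression."""
        self.announcements.append((peer_id, re.compile(regex)))

    def search(self, string):
        """A patron searches with a string; return the matching offerers."""
        matches = set()
        for peer_id, expression in self.announcements:
            if expression.fullmatch(string):
                matches.add(peer_id)
        return sorted(matches)
```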
+For the technical details, see Max's defense talk and Max's Master's
+thesis. An additional publication is in preparation and available to team
+members (in Git).
+@menu
+* How to run the regex profiler::
+@end menu
+@node How to run the regex profiler
+@subsection How to run the regex profiler
+@c %**end of header
+The gnunet-regex-profiler can be used to profile the usage of mesh/regex for a
+given set of regular expressions and strings. Mesh/regex allows you to announce
+your peer ID under a certain regex and search for peers matching a particular
+regex using a string. See https://gnunet.org/szengel2012ms for a full
+introduction.
+First of all, the regex profiler uses GNUnet testbed, thus all the implications
+for testbed also apply to the regex profiler (for example you need
+password-less ssh login to the machines listed in your hosts file).
+@strong{Configuration}
+Moreover, an appropriate configuration file is needed. Generally you can refer
+to SVN HEAD: contrib/regex_profiler_infiniband.conf for an example
+configuration. In the following paragraph the important details are
+highlighted.
+The regular expressions are announced by gnunet-daemon-regexprofiler, so you
+have to make sure it is started by adding it to the AUTOSTART set of ARM:
+@example
+[regexprofiler]
+AUTOSTART = YES
+@end example
+Furthermore you have to specify the location of the binary:
+@example
+[regexprofiler]
+# Location of the gnunet-daemon-regexprofiler binary.
+BINARY = /home/szengel/gnunet/src/mesh/.libs/gnunet-daemon-regexprofiler
+# Regex prefix that will be applied to all regular expressions and
+# search string.
+REGEX_PREFIX = "GNVPN-0001-PAD"
+@end example
+When running the profiler with a large scale deployment, you probably want to
+reduce the workload of each peer. Use the following options to do this.
+@example
+[dht]
+# Force network size estimation
+FORCE_NSE = 1
+[dhtcache]
+DATABASE = heap
+# Disable RC-file for Bloom filter? (for benchmarking with limited IO
+# availability)
+DISABLE_BF_RC = YES
+# Disable Bloom filter entirely
+DISABLE_BF = YES
+[nse]
+# Minimize proof-of-work CPU consumption by NSE
+WORKBITS = 1
+@end example
+@strong{Options}
+To finally run the profiler, some options and the input data need to be
+specified on the command line.
+@example
+gnunet-regex-profiler -c config-file -d log-file -n num-links \
+  -p path-compression-length -s search-delay -t matching-timeout \
+  -a num-search-strings hosts-file policy-dir search-strings-file
+@end example
+@table @code
+@item config-file
+The configuration file created earlier.
+@item log-file
+The file to which the statistics output is written.
+@item num-links
+The number of random links between started peers.
+@item path-compression-length
+The maximum path compression length in the DFA.
+@item search-delay
+The time to wait between the peers finishing linking and starting to match
+strings.
+@item matching-timeout
+The timeout after which the search is cancelled.
+@item num-search-strings
+The number of strings in the search-strings-file.
+@end table
+The @code{hosts-file} should contain a list of hosts for the testbed, one per
+line, in the following format: @code{user@@host_ip:port}.
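+A hosts file in the @code{user@@host_ip:port} format can be checked before a
+run with a small Python sketch such as the following. The function names are
+hypothetical, and the validation rules (non-empty user and host, numeric
+port) are assumptions of this sketch rather than the rules testbed applies.

```python
def parse_host_line(line):
    """Split one 'user@host_ip:port' line into (user, host, port)."""
    user, at, rest = line.strip().partition("@")
    if not at or not user:
        raise ValueError("missing user or '@' in: %r" % line)
    host, colon, port = rest.rpartition(":")
    if not colon or not host or not port.isdigit():
        raise ValueError("missing host or numeric port in: %r" % line)
    return user, host, int(port)


def parse_hosts(text):
    """Parse a whole hosts file, skipping blank lines."""
    return [parse_host_line(line) for line in text.splitlines() if line.strip()]
```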
+The @code{policy-dir} is a folder of text files, each containing one or more
+regular expressions. A peer is started for each file in that folder and the
+regular expressions in the corresponding file are announced by this peer.
+The @code{search-strings-file} is a text file containing search strings, one
+per line.
+You can create regular expressions and search strings for every AS in the
+Internet using the attached scripts. You need one of the
+@uref{http://data.caida.org/datasets/routing/routeviews-prefix2as/, CAIDA
+routeviews prefix2as} data files for this. Run @code{create_regex.py <filename>
+<output path>} to create the regular expressions and @code{create_strings.py
+<input path> <outfile>} to create a search strings file from the previously
+created regular expressions.