The GNUnet Handbook

commit 06df88eec42f035578bcc7ba9a02e58e1b813b14
parent e7d0874d9b3107577b8014865d06de0a4534ab9e
Author: Martin Schanzenbach <schanzen@gnunet.org>
Date:   Sat,  7 Oct 2023 16:16:32 +0200

Reorder subsystem overview

Diffstat:
Mconf.py | 1+
Mdevelopers/apis/dht.rst | 2--
Mdevelopers/apis/gns.rst | 7++-----
Mdevelopers/apis/identity.rst | 56++++----------------------------------------------------
Mdevelopers/apis/index.rst | 5-----
Mdevelopers/apis/messenger.rst | 82+++----------------------------------------------------------------------------
Mdevelopers/apis/namecache.rst | 38++++----------------------------------
Mdevelopers/apis/namestore.rst | 40++++------------------------------------
Mdevelopers/apis/nse.rst | 159++-----------------------------------------------------------------------------
Ddevelopers/apis/peerinfo.rst | 217-------------------------------------------------------------------------------
Mdevelopers/apis/peerstore.rst | 63+++------------------------------------------------------------
Ddevelopers/apis/regex.rst | 152-------------------------------------------------------------------------------
Mdevelopers/apis/rest.rst | 22+++-------------------
Mdevelopers/apis/revocation.rst | 61+++----------------------------------------------------------
Ddevelopers/apis/rps.rst | 76----------------------------------------------------------------------------
Mdevelopers/apis/set/set.rst | 88+++----------------------------------------------------------------------------
Mdevelopers/apis/seti/seti.rst | 67+++----------------------------------------------------------------
Mdevelopers/apis/setu/setu.rst | 72++++--------------------------------------------------------------------
Mdevelopers/apis/statistics.rst | 53++++-------------------------------------------------
Ddevelopers/apis/transport-ng.rst | 303-------------------------------------------------------------------------------
Mdevelopers/apis/transport.rst | 54+++---------------------------------------------------
Ddevelopers/apis/vpnstack.rst | 6------
Mindex.rst | 1-
Dsubsystems/cadet.rst | 43-------------------------------------------
Dsubsystems/core.rst | 106-------------------------------------------------------------------------------
Dsubsystems/dht.rst | 69---------------------------------------------------------------------
Dsubsystems/fs.rst | 54------------------------------------------------------
Dsubsystems/gns.rst | 47-----------------------------------------------
Dsubsystems/hostlist.rst | 293-------------------------------------------------------------------------------
Dsubsystems/identity.rst | 183-------------------------------------------------------------------------------
Dsubsystems/index.rst | 31-------------------------------
Dsubsystems/messenger.rst | 241-------------------------------------------------------------------------------
Dsubsystems/namecache.rst | 128-------------------------------------------------------------------------------
Dsubsystems/namestore.rst | 168-------------------------------------------------------------------------------
Dsubsystems/nse.rst | 315-------------------------------------------------------------------------------
Dsubsystems/peerinfo.rst | 217-------------------------------------------------------------------------------
Dsubsystems/peerstore.rst | 110-------------------------------------------------------------------------------
Dsubsystems/regex.rst | 152-------------------------------------------------------------------------------
Dsubsystems/rest.rst | 53-----------------------------------------------------
Dsubsystems/revocation.rst | 173-------------------------------------------------------------------------------
Dsubsystems/rps.rst | 76----------------------------------------------------------------------------
Dsubsystems/set/set.rst | 337-------------------------------------------------------------------------------
Dsubsystems/seti/seti.rst | 258-------------------------------------------------------------------------------
Dsubsystems/setops.rst | 11-----------
Dsubsystems/setu/setu.rst | 232-------------------------------------------------------------------------------
Dsubsystems/statistics.rst | 193-------------------------------------------------------------------------------
Dsubsystems/transport-ng.rst | 303-------------------------------------------------------------------------------
Dsubsystems/transport.rst | 843-------------------------------------------------------------------------------
Dsubsystems/vpnstack.rst | 6------
Musers/index.rst | 1+
Ausers/subsystems.rst | 2111+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
51 files changed, 2159 insertions(+), 6220 deletions(-)

diff --git a/conf.py b/conf.py @@ -80,6 +80,7 @@ html_theme_options = { #"navbar_start": ["navbar-logo"], "header_links_before_dropdown": 8, "article_header_start": ["breadcrumbs.html"], + "show_toc_level": 0, #"navbar_center": ["navbar-nav"], "navbar_end": [], "navbar_persistent": [], diff --git a/developers/apis/dht.rst b/developers/apis/dht.rst @@ -350,5 +350,3 @@ Bloom filter accordingly, to ensure that the same result is never forwarded more than once. The DHT service may also cache forwarded results locally if the \"CACHE_RESULTS\" option is set to \"YES\" in the configuration. - -.. [R5N2011] https://bib.gnunet.org/date.html#R5N diff --git a/developers/apis/gns.rst b/developers/apis/gns.rst @@ -4,11 +4,8 @@ .. _GNU-Name-System-Dev: -GNS — the GNU Name System -========================= - -libgnunetgns ------------- +GNS +=== The GNS API itself is extremely simple. Clients first connect to the GNS service using ``GNUNET_GNS_connect``. They can then perform lookups diff --git a/developers/apis/identity.rst b/developers/apis/identity.rst @@ -2,58 +2,10 @@ .. index:: double: IDENTITY; subsystem -.. _IDENTITY-Subsystem: - -IDENTITY — Ego management -========================= - -Identities of \"users\" in GNUnet are called egos. Egos can be used as -pseudonyms (\"fake names\") or be tied to an organization (for example, -\"GNU\") or even the actual identity of a human. GNUnet users are -expected to have many egos. They might have one tied to their real -identity, some for organizations they manage, and more for different -domains where they want to operate under a pseudonym. - -The IDENTITY service allows users to manage their egos. The identity -service manages the private keys egos of the local user; it does not -manage identities of other users (public keys). Public keys for other -users need names to become manageable. GNUnet uses the GNU Name System -(GNS) to give names to other users and manage their public keys -securely. This chapter is about the IDENTITY service, which is about the -management of private keys. - -On the network, an ego corresponds to an ECDSA key (over Curve25519, -using RFC 6979, as required by GNS). Thus, users can perform actions -under a particular ego by using (signing with) a particular private key. -Other users can then confirm that the action was really performed by -that ego by checking the signature against the respective public key. - -The IDENTITY service allows users to associate a human-readable name -with each ego. This way, users can use names that will remind them of -the purpose of a particular ego. The IDENTITY service will store the -respective private keys and allows applications to access key -information by name. Users can change the name that is locally (!) -associated with an ego. Egos can also be deleted, which means that the -private key will be removed and it thus will not be possible to perform -actions with that ego in the future. - -Additionally, the IDENTITY subsystem can associate service functions -with egos. For example, GNS requires the ego that should be used for the -shorten zone. GNS will ask IDENTITY for an ego for the \"gns-short\" -service. The IDENTITY service has a mapping of such service strings to -the name of the ego that the user wants to use for this service, for -example \"my-short-zone-ego\". - -Finally, the IDENTITY API provides access to a special ego, the -anonymous ego. The anonymous ego is special in that its private key is -not really private, but fixed and known to everyone. 
Thus, anyone can -perform actions as anonymous. This can be useful as with this trick, -code does not have to contain a special case to distinguish between -anonymous and pseudonymous egos. - -:index:`libgnunetidentity <single: libgnunet; identity>` -libgnunetidentity ------------------ +.. _IDENTITY-Subsystem-Dev: + +IDENTITY +======== .. _Connecting-to-the-identity-service: diff --git a/developers/apis/index.rst b/developers/apis/index.rst @@ -15,16 +15,11 @@ peer-to-peer applications to use. namecache.rst namestore.rst nse.rst - peerinfo.rst peerstore.rst - regex.rst rest.rst revocation.rst - rps.rst setops.rst statistics.rst - transport-ng.rst transport.rst - vpnstack.rst diff --git a/developers/apis/messenger.rst b/developers/apis/messenger.rst @@ -1,86 +1,10 @@ .. index:: double: subsystem; MESSENGER -.. _MESSENGER-Subsystem: +.. _MESSENGER-Subsystem-Dev: -MESSENGER — Room-based end-to-end messaging -=========================================== - -The MESSENGER subsystem is responsible for secure end-to-end -communication in groups of nodes in the GNUnet overlay network. -MESSENGER builds on the CADET subsystem which provides a reliable and -secure end-to-end communication between the nodes inside of these -groups. - -Additionally to the CADET security benefits, MESSENGER provides -following properties designed for application level usage: - -- MESSENGER provides integrity by signing the messages with the users - provided ego - -- MESSENGER adds (optional) forward secrecy by replacing the key pair - of the used ego and signing the propagation of the new one with old - one (chaining egos) - -- MESSENGER provides verification of a original sender by checking - against all used egos from a member which are currently in active use - (active use depends on the state of a member session) - -- MESSENGER offsers (optional) decentralized message forwarding between - all nodes in a group to improve availability and prevent MITM-attacks - -- MESSENGER handles new connections and disconnections from nodes in - the group by reconnecting them preserving an efficient structure for - message distribution (ensuring availability and accountablity) - -- MESSENGER provides replay protection (messages can be uniquely - identified via SHA-512, include a timestamp and the hash of the last - message) - -- MESSENGER allows detection for dropped messages by chaining them - (messages refer to the last message by their hash) improving - accountability - -- MESSENGER allows requesting messages from other peers explicitly to - ensure availability - -- MESSENGER provides confidentiality by padding messages to few - different sizes (512 bytes, 4096 bytes, 32768 bytes and maximal - message size from CADET) - -- MESSENGER adds (optional) confidentiality with ECDHE to exchange and - use symmetric encryption, encrypting with both AES-256 and Twofish - but allowing only selected members to decrypt (using the receivers - ego for ECDHE) - -Also MESSENGER provides multiple features with privacy in mind: - -- MESSENGER allows deleting messages from all peers in the group by the - original sender (uses the MESSENGER provided verification) - -- MESSENGER allows using the publicly known anonymous ego instead of - any unique identifying ego - -- MESSENGER allows your node to decide between acting as host of the - used messaging room (sharing your peer's identity with all nodes in - the group) or acting as guest (sharing your peer's identity only with - the nodes you explicitly open a connection to) - -- MESSENGER handles members 
independently of the peer's identity making - forwarded messages indistinguishable from directly received ones ( - complicating the tracking of messages and identifying its origin) - -- MESSENGER allows names of members being not unique (also names are - optional) - -- MESSENGER does not include information about the selected receiver of - an explicitly encrypted message in its header, complicating it for - other members to draw conclusions from communication partners - - -:index:`libgnunetmessenger <single: libgnunet; messenger>` -libgnunetmessenger ------------------- +MESSENGER +========= The MESSENGER API (defined in ``gnunet_messenger_service.h``) allows P2P applications built using GNUnet to communicate with specified kinds of diff --git a/developers/apis/namecache.rst b/developers/apis/namecache.rst @@ -3,40 +3,10 @@ single: GNS; name cache double: subsystem; NAMECACHE -.. _GNS-Namecache: - -NAMECACHE — DHT caching of GNS results -====================================== - -The NAMECACHE subsystem is responsible for caching (encrypted) -resolution results of the GNU Name System (GNS). GNS makes zone -information available to other users via the DHT. However, as accessing -the DHT for every lookup is expensive (and as the DHT's local cache is -lost whenever the peer is restarted), GNS uses the NAMECACHE as a more -persistent cache for DHT lookups. Thus, instead of always looking up -every name in the DHT, GNS first checks if the result is already -available locally in the NAMECACHE. Only if there is no result in the -NAMECACHE, GNS queries the DHT. The NAMECACHE stores data in the same -(encrypted) format as the DHT. It thus makes no sense to iterate over -all items in the NAMECACHE – the NAMECACHE does not have a way to -provide the keys required to decrypt the entries. - -Blocks in the NAMECACHE share the same expiration mechanism as blocks in -the DHT – the block expires wheneever any of the records in the -(encrypted) block expires. The expiration time of the block is the only -information stored in plaintext. The NAMECACHE service internally -performs all of the required work to expire blocks, clients do not have -to worry about this. Also, given that NAMECACHE stores only GNS blocks -that local users requested, there is no configuration option to limit -the size of the NAMECACHE. It is assumed to be always small enough (a -few MB) to fit on the drive. - -The NAMECACHE supports the use of different database backends via a -plugin API. - -:index:`libgnunetnamecache <single: libgnunet; namecache>` -libgnunetnamecache ------------------- +.. _GNS-Namecache-Dev: + +NAMECACHE +========= The NAMECACHE API consists of five simple functions. First, there is ``GNUNET_NAMECACHE_connect`` to connect to the NAMECACHE service. This diff --git a/developers/apis/namestore.rst b/developers/apis/namestore.rst @@ -2,42 +2,10 @@ .. index:: double: subsystem; NAMESTORE -.. _NAMESTORE-Subsystem: - -NAMESTORE — Storage of local GNS zones -====================================== - -The NAMESTORE subsystem provides persistent storage for local GNS zone -information. All local GNS zone information are managed by NAMESTORE. It -provides both the functionality to administer local GNS information -(e.g. delete and add records) as well as to retrieve GNS information -(e.g to list name information in a client). 
NAMESTORE does only manage -the persistent storage of zone information belonging to the user running -the service: GNS information from other users obtained from the DHT are -stored by the NAMECACHE subsystem. - -NAMESTORE uses a plugin-based database backend to store GNS information -with good performance. Here sqlite and PostgreSQL are supported -database backends. NAMESTORE clients interact with the IDENTITY -subsystem to obtain cryptographic information about zones based on egos -as described with the IDENTITY subsystem, but internally NAMESTORE -refers to zones using the respective private key. - -NAMESTORE is queried and monitored by the ZONEMASTER service which periodically -publishes public records of GNS zones. ZONEMASTER also -collaborates with the NAMECACHE subsystem and stores zone information -when local information are modified in the NAMECACHE cache to increase look-up -performance for local information and to enable local access to private records -in zones through GNS. - -NAMESTORE provides functionality to look-up and store records, to -iterate over a specific or all zones and to monitor zones for changes. -NAMESTORE functionality can be accessed using the NAMESTORE C API, the NAMESTORE -REST API, or the NAMESTORE command line tool. - -:index:`libgnunetnamestore <single: libgnunet; namestore>` -libgnunetnamestore ------------------- +.. _NAMESTORE-Subsystem-Dev: + +NAMESTORE +========= To interact with NAMESTORE clients first connect to the NAMESTORE service using the ``GNUNET_NAMESTORE_connect`` passing a configuration diff --git a/developers/apis/nse.rst b/developers/apis/nse.rst @@ -2,163 +2,10 @@ single: subsystem; Network size estimation see: NSE; Network size estimation -.. _NSE-Subsystem: - -NSE — Network size estimation -============================= - -NSE stands for Network Size Estimation. The NSE subsystem provides other -subsystems and users with a rough estimate of the number of peers -currently participating in the GNUnet overlay. The computed value is not -a precise number as producing a precise number in a decentralized, -efficient and secure way is impossible. While NSE's estimate is -inherently imprecise, NSE also gives the expected range. For a peer that -has been running in a stable network for a while, the real network size -will typically (99.7% of the time) be in the range of [2/3 estimate, 3/2 -estimate]. We will now give an overview of the algorithm used to -calculate the estimate; all of the details can be found in this -technical report. - -.. todo:: link to the report. - -.. _Motivation: - -Motivation ----------- - -Some subsystems, like DHT, need to know the size of the GNUnet network -to optimize some parameters of their own protocol. The decentralized -nature of GNUnet makes efficient and securely counting the exact number -of peers infeasible. Although there are several decentralized algorithms -to count the number of peers in a system, so far there is none to do so -securely. Other protocols may allow any malicious peer to manipulate the -final result or to take advantage of the system to perform Denial of -Service (DoS) attacks against the network. GNUnet's NSE protocol avoids -these drawbacks. - -NSE security -.. _Security: - -:index:`Security <single: NSE; security>` -Security -^^^^^^^^ - -The NSE subsystem is designed to be resilient against these attacks. 
It -uses `proofs of -work <http://en.wikipedia.org/wiki/Proof-of-work_system>`__ to prevent -one peer from impersonating a large number of participants, which would -otherwise allow an adversary to artificially inflate the estimate. The -DoS protection comes from the time-based nature of the protocol: the -estimates are calculated periodically and out-of-time traffic is either -ignored or stored for later retransmission by benign peers. In -particular, peers cannot trigger global network communication at will. - -.. _Principle: - -:index:`Principle <single: NSE; principle of operation>` -Principle ---------- - -The algorithm calculates the estimate by finding the globally closest -peer ID to a random, time-based value. - -The idea is that the closer the ID is to the random value, the more -\"densely packed\" the ID space is, and therefore, more peers are in the -network. - -.. _Example: - -Example -^^^^^^^ +.. _NSE-Subsystem-Dev: -Suppose all peers have IDs between 0 and 100 (our ID space), and the -random value is 42. If the closest peer has the ID 70 we can imagine -that the average \"distance\" between peers is around 30 and therefore -the are around 3 peers in the whole ID space. On the other hand, if the -closest peer has the ID 44, we can imagine that the space is rather -packed with peers, maybe as much as 50 of them. Naturally, we could have -been rather unlucky, and there is only one peer and happens to have the -ID 44. Thus, the current estimate is calculated as the average over -multiple rounds, and not just a single sample. - -.. _Algorithm: - -Algorithm -^^^^^^^^^ - -Given that example, one can imagine that the job of the subsystem is to -efficiently communicate the ID of the closest peer to the target value -to all the other peers, who will calculate the estimate from it. - -.. _Target-value: - -Target value -^^^^^^^^^^^^ - -The target value itself is generated by hashing the current time, -rounded down to an agreed value. If the rounding amount is 1h (default) -and the time is 12:34:56, the time to hash would be 12:00:00. The -process is repeated each rounding amount (in this example would be every -hour). Every repetition is called a round. - -.. _Timing: - -Timing -^^^^^^ - -The NSE subsystem has some timing control to avoid everybody -broadcasting its ID all at one. Once each peer has the target random -value, it compares its own ID to the target and calculates the -hypothetical size of the network if that peer were to be the closest. -Then it compares the hypothetical size with the estimate from the -previous rounds. For each value there is an associated point in the -period, let's call it \"broadcast time\". If its own hypothetical -estimate is the same as the previous global estimate, its \"broadcast -time\" will be in the middle of the round. If its bigger it will be -earlier and if its smaller (the most likely case) it will be later. This -ensures that the peers closest to the target value start broadcasting -their ID the first. - -.. _Controlled-Flooding: - -Controlled Flooding -^^^^^^^^^^^^^^^^^^^ - -When a peer receives a value, first it verifies that it is closer than -the closest value it had so far, otherwise it answers the incoming -message with a message containing the better value. Then it checks a -proof of work that must be included in the incoming message, to ensure -that the other peer's ID is not made up (otherwise a malicious peer -could claim to have an ID of exactly the target value every round). 
Once -validated, it compares the broadcast time of the received value with the -current time and if it's not too early, sends the received value to its -neighbors. Otherwise it stores the value until the correct broadcast -time comes. This prevents unnecessary traffic of sub-optimal values, -since a better value can come before the broadcast time, rendering the -previous one obsolete and saving the traffic that would have been used -to broadcast it to the neighbors. - -.. _Calculating-the-estimate: - -Calculating the estimate -^^^^^^^^^^^^^^^^^^^^^^^^ - -Once the closest ID has been spread across the network each peer gets -the exact distance between this ID and the target value of the round and -calculates the estimate with a mathematical formula described in the -tech report. The estimate generated with this method for a single round -is not very precise. Remember the case of the example, where the only -peer is the ID 44 and we happen to generate the target value 42, -thinking there are 50 peers in the network. Therefore, the NSE subsystem -remembers the last 64 estimates and calculates an average over them, -giving a result of which usually has one bit of uncertainty (the real -size could be half of the estimate or twice as much). Note that the -actual network size is calculated in powers of two of the raw input, -thus one bit of uncertainty means a factor of two in the size estimate. - -:index:`libgnunetnse <single: libgnunet; nse>` -libgnunetnse ------------- +NSE +=== The NSE subsystem has the simplest API of all services, with only two calls: ``GNUNET_NSE_connect`` and ``GNUNET_NSE_disconnect``. diff --git a/developers/apis/peerinfo.rst b/developers/apis/peerinfo.rst @@ -1,217 +0,0 @@ - -.. index:: - double: subsystem; PEERINFO - -.. _PEERINFO-Subsystem: - -PEERINFO — Persistent HELLO storage -=================================== - -The PEERINFO subsystem is used to store verified (validated) information -about known peers in a persistent way. It obtains these addresses for -example from TRANSPORT service which is in charge of address validation. -Validation means that the information in the HELLO message are checked -by connecting to the addresses and performing a cryptographic handshake -to authenticate the peer instance stating to be reachable with these -addresses. Peerinfo does not validate the HELLO messages itself but only -stores them and gives them to interested clients. - -As future work, we think about moving from storing just HELLO messages -to providing a generic persistent per-peer information store. More and -more subsystems tend to need to store per-peer information in persistent -way. To not duplicate this functionality we plan to provide a PEERSTORE -service providing this functionality. - -.. _PEERINFO-_002d-Features: - -PEERINFO - Features -------------------- - -- Persistent storage - -- Client notification mechanism on update - -- Periodic clean up for expired information - -- Differentiation between public and friend-only HELLO - -.. _PEERINFO-_002d-Limitations: - -PEERINFO - Limitations ----------------------- - -- Does not perform HELLO validation - -.. _DeveloperPeer-Information: - -DeveloperPeer Information -------------------------- - -The PEERINFO subsystem stores these information in the form of HELLO -messages you can think of as business cards. These HELLO messages -contain the public key of a peer and the addresses a peer can be reached -under. The addresses include an expiration date describing how long they -are valid. 
This information is updated regularly by the TRANSPORT -service by revalidating the address. If an address is expired and not -renewed, it can be removed from the HELLO message. - -Some peer do not want to have their HELLO messages distributed to other -peers, especially when GNUnet's friend-to-friend modus is enabled. To -prevent this undesired distribution. PEERINFO distinguishes between -*public* and *friend-only* HELLO messages. Public HELLO messages can be -freely distributed to other (possibly unknown) peers (for example using -the hostlist, gossiping, broadcasting), whereas friend-only HELLO -messages may not be distributed to other peers. Friend-only HELLO -messages have an additional flag ``friend_only`` set internally. For -public HELLO message this flag is not set. PEERINFO does and cannot not -check if a client is allowed to obtain a specific HELLO type. - -The HELLO messages can be managed using the GNUnet HELLO library. Other -GNUnet systems can obtain these information from PEERINFO and use it for -their purposes. Clients are for example the HOSTLIST component providing -these information to other peers in form of a hostlist or the TRANSPORT -subsystem using these information to maintain connections to other -peers. - -.. _Startup: - -Startup -------- - -During startup the PEERINFO services loads persistent HELLOs from disk. -First PEERINFO parses the directory configured in the HOSTS value of the -``PEERINFO`` configuration section to store PEERINFO information. For -all files found in this directory valid HELLO messages are extracted. In -addition it loads HELLO messages shipped with the GNUnet distribution. -These HELLOs are used to simplify network bootstrapping by providing -valid peer information with the distribution. The use of these HELLOs -can be prevented by setting the ``USE_INCLUDED_HELLOS`` in the -``PEERINFO`` configuration section to ``NO``. Files containing invalid -information are removed. - -.. _Managing-Information: - -Managing Information --------------------- - -The PEERINFO services stores information about known PEERS and a single -HELLO message for every peer. A peer does not need to have a HELLO if no -information are available. HELLO information from different sources, for -example a HELLO obtained from a remote HOSTLIST and a second HELLO -stored on disk, are combined and merged into one single HELLO message -per peer which will be given to clients. During this merge process the -HELLO is immediately written to disk to ensure persistence. - -PEERINFO in addition periodically scans the directory where information -are stored for empty HELLO messages with expired TRANSPORT addresses. -This periodic task scans all files in the directory and recreates the -HELLO messages it finds. Expired TRANSPORT addresses are removed from -the HELLO and if the HELLO does not contain any valid addresses, it is -discarded and removed from the disk. - -.. _Obtaining-Information: - -Obtaining Information ---------------------- - -When a client requests information from PEERINFO, PEERINFO performs a -lookup for the respective peer or all peers if desired and transmits -this information to the client. The client can specify if friend-only -HELLOs have to be included or not and PEERINFO filters the respective -HELLO messages before transmitting information. - -To notify clients about changes to PEERINFO information, PEERINFO -maintains a list of clients interested in this notifications. 
Such a -notification occurs if a HELLO for a peer was updated (due to a merge -for example) or a new peer was added. - -.. _The-PEERINFO-Client_002dService-Protocol: - -The PEERINFO Client-Service Protocol ------------------------------------- - -To connect and disconnect to and from the PEERINFO Service PEERINFO -utilizes the util client/server infrastructure, so no special messages -types are used here. - -To add information for a peer, the plain HELLO message is transmitted to -the service without any wrapping. All pieces of information required are -stored within the HELLO message. The PEERINFO service provides a message -handler accepting and processing these HELLO messages. - -When obtaining PEERINFO information using the iterate functionality -specific messages are used. To obtain information for all peers, a -``struct ListAllPeersMessage`` with message type -``GNUNET_MESSAGE_TYPE_PEERINFO_GET_ALL`` and a flag include_friend_only -to indicate if friend-only HELLO messages should be included are -transmitted. If information for a specific peer is required a -``struct ListAllPeersMessage`` with ``GNUNET_MESSAGE_TYPE_PEERINFO_GET`` -containing the peer identity is used. - -For both variants the PEERINFO service replies for each HELLO message it -wants to transmit with a ``struct ListAllPeersMessage`` with type -``GNUNET_MESSAGE_TYPE_PEERINFO_INFO`` containing the plain HELLO. The -final message is ``struct GNUNET_MessageHeader`` with type -``GNUNET_MESSAGE_TYPE_PEERINFO_INFO``. If the client receives this -message, it can proceed with the next request if any is pending. - -:index:`libgnunetpeerinfo <single: libgnunet; peerinfo>` -libgnunetpeerinfo ------------------ - -The PEERINFO API consists mainly of three different functionalities: - -- maintaining a connection to the service - -- adding new information to the PEERINFO service - -- retrieving information from the PEERINFO service - -.. _Connecting-to-the-PEERINFO-Service: - -Connecting to the PEERINFO Service -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -To connect to the PEERINFO service the function -``GNUNET_PEERINFO_connect`` is used, taking a configuration handle as an -argument, and to disconnect from PEERINFO the function -``GNUNET_PEERINFO_disconnect``, taking the PEERINFO handle returned from -the connect function has to be called. - -.. _Adding-Information-to-the-PEERINFO-Service: - -Adding Information to the PEERINFO Service -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -``GNUNET_PEERINFO_add_peer`` adds a new peer to the PEERINFO subsystem -storage. This function takes the PEERINFO handle as an argument, the -HELLO message to store and a continuation with a closure to be called -with the result of the operation. The ``GNUNET_PEERINFO_add_peer`` -returns a handle to this operation allowing to cancel the operation with -the respective cancel function ``GNUNET_PEERINFO_add_peer_cancel``. To -retrieve information from PEERINFO you can iterate over all information -stored with PEERINFO or you can tell PEERINFO to notify if new peer -information are available. - -.. _Obtaining-Information-from-the-PEERINFO-Service: - -Obtaining Information from the PEERINFO Service -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -To iterate over information in PEERINFO you use -``GNUNET_PEERINFO_iterate``. This function expects the PEERINFO handle, -a flag if HELLO messages intended for friend only mode should be -included, a timeout how long the operation should take and a callback -with a callback closure to be called for the results. 
If you want to -obtain information for a specific peer, you can specify the peer -identity, if this identity is NULL, information for all peers are -returned. The function returns a handle to allow to cancel the operation -using ``GNUNET_PEERINFO_iterate_cancel``. - -To get notified when peer information changes, you can use -``GNUNET_PEERINFO_notify``. This function expects a configuration handle -and a flag if friend-only HELLO messages should be included. The -PEERINFO service will notify you about every change and the callback -function will be called to notify you about changes. The function -returns a handle to cancel notifications with -``GNUNET_PEERINFO_notify_cancel``. diff --git a/developers/apis/peerstore.rst b/developers/apis/peerstore.rst @@ -2,68 +2,11 @@ .. index:: double: subsystem; PEERSTORE -.. _PEERSTORE-Subsystem: +.. _PEERSTORE-Subsystem-Dev: -PEERSTORE — Extensible local persistent data storage -==================================================== +PEERSTORE +========= -GNUnet's PEERSTORE subsystem offers persistent per-peer storage for -other GNUnet subsystems. GNUnet subsystems can use PEERSTORE to -persistently store and retrieve arbitrary data. Each data record stored -with PEERSTORE contains the following fields: - -- subsystem: Name of the subsystem responsible for the record. - -- peerid: Identity of the peer this record is related to. - -- key: a key string identifying the record. - -- value: binary record value. - -- expiry: record expiry date. - -.. _Functionality: - -Functionality -------------- - -Subsystems can store any type of value under a (subsystem, peerid, key) -combination. A \"replace\" flag set during store operations forces the -PEERSTORE to replace any old values stored under the same (subsystem, -peerid, key) combination with the new value. Additionally, an expiry -date is set after which the record is \*possibly\* deleted by PEERSTORE. - -Subsystems can iterate over all values stored under any of the following -combination of fields: - -- (subsystem) - -- (subsystem, peerid) - -- (subsystem, key) - -- (subsystem, peerid, key) - -Subsystems can also request to be notified about any new values stored -under a (subsystem, peerid, key) combination by sending a \"watch\" -request to PEERSTORE. - -.. _Architecture: - -Architecture ------------- - -PEERSTORE implements the following components: - -- PEERSTORE service: Handles store, iterate and watch operations. - -- PEERSTORE API: API to be used by other subsystems to communicate and - issue commands to the PEERSTORE service. - -- PEERSTORE plugins: Handles the persistent storage. At the moment, - only an \"sqlite\" plugin is implemented. - -:index:`libgnunetpeerstore <single: libgnunet; peerstore>` libgnunetpeerstore ------------------ diff --git a/developers/apis/regex.rst b/developers/apis/regex.rst @@ -1,152 +0,0 @@ - -.. index:: - double: subsystem; REGEX - -.. _REGEX-Subsystem: - -REGEX — Service discovery using regular expressions -=================================================== - -Using the REGEX subsystem, you can discover peers that offer a -particular service using regular expressions. The peers that offer a -service specify it using a regular expressions. Peers that want to -patronize a service search using a string. The REGEX subsystem will then -use the DHT to return a set of matching offerers to the patrons. - -For the technical details, we have Max's defense talk and Max's Master's -thesis. - -.. 
note:: An additional publication is under preparation and available - to team members (in Git). - -.. todo:: Missing links to Max's talk and Master's thesis - -.. _How-to-run-the-regex-profiler: - -How to run the regex profiler ------------------------------ - -The gnunet-regex-profiler can be used to profile the usage of mesh/regex -for a given set of regular expressions and strings. Mesh/regex allows -you to announce your peer ID under a certain regex and search for peers -matching a particular regex using a string. See -`szengel2012ms <https://bib.gnunet.org/full/date.html#2012_5f2>`__ for a -full introduction. - -First of all, the regex profiler uses GNUnet testbed, thus all the -implications for testbed also apply to the regex profiler (for example -you need password-less ssh login to the machines listed in your hosts -file). - -**Configuration** - -Moreover, an appropriate configuration file is needed. In the following -paragraph the important details are highlighted. - -Announcing of the regular expressions is done by the -gnunet-daemon-regexprofiler, therefore you have to make sure it is -started, by adding it to the START_ON_DEMAND set of ARM: - -:: - - [regexprofiler] - START_ON_DEMAND = YES - -Furthermore you have to specify the location of the binary: - -:: - - [regexprofiler] - # Location of the gnunet-daemon-regexprofiler binary. - BINARY = /home/szengel/gnunet/src/mesh/.libs/gnunet-daemon-regexprofiler - # Regex prefix that will be applied to all regular expressions and - # search string. - REGEX_PREFIX = "GNVPN-0001-PAD" - -When running the profiler with a large scale deployment, you probably -want to reduce the workload of each peer. Use the following options to -do this. - -:: - - [dht] - # Force network size estimation - FORCE_NSE = 1 - - [dhtcache] - DATABASE = heap - # Disable RC-file for Bloom filter? (for benchmarking with limited IO - # availability) - DISABLE_BF_RC = YES - # Disable Bloom filter entirely - DISABLE_BF = YES - - [nse] - # Minimize proof-of-work CPU consumption by NSE - WORKBITS = 1 - -**Options** - -To finally run the profiler some options and the input data need to be -specified on the command line. - -:: - - gnunet-regex-profiler -c config-file -d log-file -n num-links \ - -p path-compression-length -s search-delay -t matching-timeout \ - -a num-search-strings hosts-file policy-dir search-strings-file - -Where\... - -- \... ``config-file`` means the configuration file created earlier. - -- \... ``log-file`` is the file where to write statistics output. - -- \... ``num-links`` indicates the number of random links between - started peers. - -- \... ``path-compression-length`` is the maximum path compression - length in the DFA. - -- \... ``search-delay`` time to wait between peers finished linking and - starting to match strings. - -- \... ``matching-timeout`` timeout after which to cancel the - searching. - -- \... ``num-search-strings`` number of strings in the - search-strings-file. - -- \... the ``hosts-file`` should contain a list of hosts for the - testbed, one per line in the following format: - - - ``user@host_ip:port`` - -- \... the ``policy-dir`` is a folder containing text files containing - one or more regular expressions. A peer is started for each file in - that folder and the regular expressions in the corresponding file are - announced by this peer. - -- \... the ``search-strings-file`` is a text file containing search - strings, one in each line. 
- -You can create regular expressions and search strings for every AS in -the Internet using the attached scripts. You need one of the `CAIDA -routeviews -prefix2as <http://data.caida.org/datasets/routing/routeviews-prefix2as/>`__ -data files for this. Run - -:: - - create_regex.py <filename> <output path> - -to create the regular expressions and - -:: - - create_strings.py <input path> <outfile> - -to create a search strings file from the previously created regular -expressions. - - diff --git a/developers/apis/rest.rst b/developers/apis/rest.rst @@ -2,10 +2,10 @@ .. index:: double: subsystem; REST -.. _REST-Subsystem: +.. _REST-Subsystem-Dev: -REST — RESTful GNUnet Web APIs -============================== +REST +==== .. todo:: Define REST @@ -14,22 +14,6 @@ The REST service is designed as a pluggable architecture. To create a new REST endpoint, simply add a library in the form "plugin_rest_*". The REST service will automatically load all REST plugins on startup. -**Configuration** - -The REST service can be configured in various ways. The reference config -file can be found in ``src/rest/rest.conf``: - -:: - - [rest] - REST_PORT=7776 - REST_ALLOW_HEADERS=Authorization,Accept,Content-Type - REST_ALLOW_ORIGIN=* - REST_ALLOW_CREDENTIALS=true - -The port as well as CORS (cross-origin resource sharing) headers -that are supposed to be advertised by the rest service are configurable. - .. _Namespace-considerations: Namespace considerations diff --git a/developers/apis/revocation.rst b/developers/apis/revocation.rst @@ -2,65 +2,10 @@ .. index:: double: subsystem; REVOCATION -.. _REVOCATION-Subsystem: - -REVOCATION — Ego key revocation -=============================== - -The REVOCATION subsystem is responsible for key revocation of Egos. If a -user learns that their private key has been compromised or has lost it, -they can use the REVOCATION system to inform all of the other users that -their private key is no longer valid. The subsystem thus includes ways -to query for the validity of keys and to propagate revocation messages. - -.. _Dissemination: - -Dissemination -------------- - -When a revocation is performed, the revocation is first of all -disseminated by flooding the overlay network. The goal is to reach every -peer, so that when a peer needs to check if a key has been revoked, this -will be purely a local operation where the peer looks at its local -revocation list. Flooding the network is also the most robust form of -key revocation --- an adversary would have to control a separator of the -overlay graph to restrict the propagation of the revocation message. -Flooding is also very easy to implement --- peers that receive a -revocation message for a key that they have never seen before simply -pass the message to all of their neighbours. - -Flooding can only distribute the revocation message to peers that are -online. In order to notify peers that join the network later, the -revocation service performs efficient set reconciliation over the sets -of known revocation messages whenever two peers (that both support -REVOCATION dissemination) connect. The SET service is used to perform -this operation efficiently. - -.. _Revocation-Message-Design-Requirements: - -Revocation Message Design Requirements --------------------------------------- +.. _REVOCATION-Subsystem-Dev: -However, flooding is also quite costly, creating O(\|E\|) messages on a -network with \|E\| edges. 
Thus, revocation messages are required to -contain a proof-of-work, the result of an expensive computation (which, -however, is cheap to verify). Only peers that have expended the CPU time -necessary to provide this proof will be able to flood the network with -the revocation message. This ensures that an attacker cannot simply -flood the network with millions of revocation messages. The -proof-of-work required by GNUnet is set to take days on a typical PC to -compute; if the ability to quickly revoke a key is needed, users have -the option to pre-compute revocation messages to store off-line and use -instantly after their key has expired. - -Revocation messages must also be signed by the private key that is being -revoked. Thus, they can only be created while the private key is in the -possession of the respective user. This is another reason to create a -revocation message ahead of time and store it in a secure location. - -:index:`libgnunetrevocation <single: libgnunet; revocation>` -libgnunetrevocation -------------------- +REVOCATION +========== The REVOCATION API consists of two parts, to query and to issue revocations. diff --git a/developers/apis/rps.rst b/developers/apis/rps.rst @@ -1,76 +0,0 @@ - -.. index:: - double: subsystems; Random peer sampling - see: RPS; Random peer sampling - -.. _RPS-Subsystem: - -RPS — Random peer sampling -========================== - -In literature, Random Peer Sampling (RPS) refers to the problem of -reliably [1]_ drawing random samples from an unstructured p2p network. - -Doing so in a reliable manner is not only hard because of inherent -problems but also because of possible malicious peers that could try to -bias the selection. - -It is useful for all kind of gossip protocols that require the selection -of random peers in the whole network like gathering statistics, -spreading and aggregating information in the network, load balancing and -overlay topology management. - -The approach chosen in the RPS service implementation in GNUnet follows -the `Brahms <https://bib.gnunet.org/full/date.html\#2009_5f0>`__ design. - -The current state is \"work in progress\". There are a lot of things -that need to be done, primarily finishing the experimental evaluation -and a re-design of the API. - -The abstract idea is to subscribe to connect to/start the RPS service -and request random peers that will be returned when they represent a -random selection from the whole network with high probability. - -An additional feature to the original Brahms-design is the selection of -sub-groups: The GNUnet implementation of RPS enables clients to ask for -random peers from a group that is defined by a common shared secret. -(The secret could of course also be public, depending on the use-case.) - -Another addition to the original protocol was made: The sampler -mechanism that was introduced in Brahms was slightly adapted and used to -actually sample the peers and returned to the client. This is necessary -as the original design only keeps peers connected to random other peers -in the network. In order to return random peers to client requests -independently random, they cannot be drawn from the connected peers. The -adapted sampler makes sure that each request for random peers is -independent from the others. - -.. _Brahms: - -Brahms ------- - -The high-level concept of Brahms is two-fold: Combining push-pull gossip -with locally fixing a assumed bias using cryptographic min-wise -permutations. The central data structure is the view - a peer's current -local sample. 
This view is used to select peers to push to and pull -from. This simple mechanism can be biased easily. For this reason Brahms -'fixes' the bias by using the so-called sampler. A data structure that -takes a list of elements as input and outputs a random one of them -independently of the frequency in the input set. Both an element that -was put into the sampler a single time and an element that was put into -it a million times have the same probability of being the output. This -is achieved with exploiting min-wise independent permutations. In the -RPS service we use HMACs: On the initialisation of a sampler element, a -key is chosen at random. On each input the HMAC with the random key is -computed. The sampler element keeps the element with the minimal HMAC. - -In order to fix the bias in the view, a fraction of the elements in the -view are sampled through the sampler from the random stream of peer IDs. - -According to the theoretical analysis of Bortnikov et al. this suffices -to keep the network connected and having random peers in the view. - -.. [1] - \"Reliable\" in this context means having no bias, neither spatial, - nor temporal, nor through malicious activity. diff --git a/developers/apis/set/set.rst b/developers/apis/set/set.rst @@ -1,92 +1,10 @@ .. index:: double: subsystem; SET -.. _SET-Subsystem: +.. _SET-Subsystem-Dev: -SET — Peer to peer set operations (Deprecated) -============================================== - -.. note:: - - The SET subsystem is in process of being replaced by the SETU and SETI - subsystems, which provide basically the same functionality, just using - two different subsystems. SETI and SETU should be used for new code. - -The SET service implements efficient set operations between two peers -over a CADET tunnel. Currently, set union and set intersection are the -only supported operations. Elements of a set consist of an *element -type* and arbitrary binary *data*. The size of an element's data is -limited to around 62 KB. - -.. _Local-Sets: - -Local Sets ----------- - -Sets created by a local client can be modified and reused for multiple -operations. As each set operation requires potentially expensive special -auxiliary data to be computed for each element of a set, a set can only -participate in one type of set operation (either union or intersection). -The type of a set is determined upon its creation. If a the elements of -a set are needed for an operation of a different type, all of the set's -element must be copied to a new set of appropriate type. - -.. _Set-Modifications: - -Set Modifications ------------------ - -Even when set operations are active, one can add to and remove elements -from a set. However, these changes will only be visible to operations -that have been created after the changes have taken place. That is, -every set operation only sees a snapshot of the set from the time the -operation was started. This mechanism is *not* implemented by copying -the whole set, but by attaching *generation information* to each element -and operation. - -.. _Set-Operations: - -Set Operations --------------- - -Set operations can be started in two ways: Either by accepting an -operation request from a remote peer, or by requesting a set operation -from a remote peer. Set operations are uniquely identified by the -involved *peers*, an *application id* and the *operation type*. - -The client is notified of incoming set operations by *set listeners*. A -set listener listens for incoming operations of a specific operation -type and application id. 
Once notified of an incoming set request, the -client can accept the set request (providing a local set for the -operation) or reject it. - -.. _Result-Elements: - -Result Elements ---------------- - -The SET service has three *result modes* that determine how an -operation's result set is delivered to the client: - -- **Full Result Set.** All elements of set resulting from the set - operation are returned to the client. - -- **Added Elements.** Only elements that result from the operation and - are not already in the local peer's set are returned. Note that for - some operations (like set intersection) this result mode will never - return any elements. This can be useful if only the remove peer is - actually interested in the result of the set operation. - -- **Removed Elements.** Only elements that are in the local peer's - initial set but not in the operation's result set are returned. Note - that for some operations (like set union) this result mode will never - return any elements. This can be useful if only the remove peer is - actually interested in the result of the set operation. - - -:index:`libgnunetset <single: libgnunet; set>` -libgnunetset ------------- +SET +=== .. _Sets: diff --git a/developers/apis/seti/seti.rst b/developers/apis/seti/seti.rst @@ -1,71 +1,10 @@ .. index:: double: subsystem; SETI -.. _SETI-Subsystem: +.. _SETI-Subsystem-Dev: -SETI — Peer to peer set intersections -===================================== - -The SETI service implements efficient set intersection between two peers -over a CADET tunnel. Elements of a set consist of an *element type* and -arbitrary binary *data*. The size of an element's data is limited to -around 62 KB. - -.. _Intersection-Sets: - -Intersection Sets ------------------ - -Sets created by a local client can be modified (by adding additional -elements) and reused for multiple operations. If elements are to be -removed, a fresh set must be created by the client. - -.. _Set-Intersection-Modifications: - -Set Intersection Modifications ------------------------------- - -Even when set operations are active, one can add elements to a set. -However, these changes will only be visible to operations that have been -created after the changes have taken place. That is, every set operation -only sees a snapshot of the set from the time the operation was started. -This mechanism is *not* implemented by copying the whole set, but by -attaching *generation information* to each element and operation. - -.. _Set-Intersection-Operations: - -Set Intersection Operations ---------------------------- - -Set operations can be started in two ways: Either by accepting an -operation request from a remote peer, or by requesting a set operation -from a remote peer. Set operations are uniquely identified by the -involved *peers*, an *application id* and the *operation type*. - -The client is notified of incoming set operations by *set listeners*. A -set listener listens for incoming operations of a specific operation -type and application id. Once notified of an incoming set request, the -client can accept the set request (providing a local set for the -operation) or reject it. - -.. _Intersection-Result-Elements: - -Intersection Result Elements ----------------------------- - -The SET service has two *result modes* that determine how an operation's -result set is delivered to the client: - -- **Return intersection.** All elements of set resulting from the set - intersection are returned to the client. 
- -- **Removed Elements.** Only elements that are in the local peer's - initial set but not in the intersection are returned. - - -:index:`libgnunetseti <single: libgnunet; seti>` -libgnunetseti -------------- +SETI +==== .. _Intersection-Set-API: diff --git a/developers/apis/setu/setu.rst b/developers/apis/setu/setu.rst @@ -2,74 +2,10 @@ .. index:: double: SETU; subsystem -.. _SETU-Subsystem: +.. _SETU-Subsystem-Dev: -SETU — Peer to peer set unions -============================== - -The SETU service implements efficient set union operations between two -peers over a CADET tunnel. Elements of a set consist of an *element -type* and arbitrary binary *data*. The size of an element's data is -limited to around 62 KB. - -.. _Union-Sets: - -Union Sets ----------- - -Sets created by a local client can be modified (by adding additional -elements) and reused for multiple operations. If elements are to be -removed, a fresh set must be created by the client. - -.. _Set-Union-Modifications: - -Set Union Modifications ------------------------ - -Even when set operations are active, one can add elements to a set. -However, these changes will only be visible to operations that have been -created after the changes have taken place. That is, every set operation -only sees a snapshot of the set from the time the operation was started. -This mechanism is *not* implemented by copying the whole set, but by -attaching *generation information* to each element and operation. - -.. _Set-Union-Operations: - -Set Union Operations --------------------- - -Set operations can be started in two ways: Either by accepting an -operation request from a remote peer, or by requesting a set operation -from a remote peer. Set operations are uniquely identified by the -involved *peers*, an *application id* and the *operation type*. - -The client is notified of incoming set operations by *set listeners*. A -set listener listens for incoming operations of a specific operation -type and application id. Once notified of an incoming set request, the -client can accept the set request (providing a local set for the -operation) or reject it. - -.. _Union-Result-Elements: - -Union Result Elements ---------------------- - -The SET service has three *result modes* that determine how an -operation's result set is delivered to the client: - -- **Locally added Elements.** Elements that are in the union but not - already in the local peer's set are returned. - -- **Remote added Elements.** Additionally, notify the client if the - remote peer lacked some elements and thus also return to the local - client those elements that we are sending to the remote peer to be - added to its union. Obtaining these elements requires setting the - ``GNUNET_SETU_OPTION_SYMMETRIC`` option. - - -:index:`libgnunetsetu <single: libgnunet; setu>` -libgnunetsetu -------------- +SETU +==== .. _Union-Set-API: @@ -190,7 +126,7 @@ of a set operation with the ``GNUNET_SERVICE_SETU_RESULT`` message. .. _The-SETU-Union-Peer_002dto_002dPeer-Protocol: The SETU Union Peer-to-Peer Protocol -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The SET union protocol is based on Eppstein's efficient set reconciliation without prior context. You should read this paper first diff --git a/developers/apis/statistics.rst b/developers/apis/statistics.rst @@ -2,55 +2,10 @@ .. index:: double: STATISTICS; subsystem -.. 
_STATISTICS-Subsystem: - -STATISTICS — Runtime statistics publication -=========================================== - -In GNUnet, the STATISTICS subsystem offers a central place for all -subsystems to publish unsigned 64-bit integer run-time statistics. -Keeping this information centrally means that there is a unified way for -the user to obtain data on all subsystems, and individual subsystems do -not have to always include a custom data export method for performance -metrics and other statistics. For example, the TRANSPORT system uses -STATISTICS to update information about the number of directly connected -peers and the bandwidth that has been consumed by the various plugins. -This information is valuable for diagnosing connectivity and performance -issues. - -Following the GNUnet service architecture, the STATISTICS subsystem is -divided into an API which is exposed through the header -**gnunet_statistics_service.h** and the STATISTICS service -**gnunet-service-statistics**. The **gnunet-statistics** command-line -tool can be used to obtain (and change) information about the values -stored by the STATISTICS service. The STATISTICS service does not -communicate with other peers. - -Data is stored in the STATISTICS service in the form of tuples -**(subsystem, name, value, persistence)**. The subsystem determines to -which other GNUnet's subsystem the data belongs. name is the name -through which value is associated. It uniquely identifies the record -from among other records belonging to the same subsystem. In some parts -of the code, the pair **(subsystem, name)** is called a **statistic** as -it identifies the values stored in the STATISTCS service.The persistence -flag determines if the record has to be preserved across service -restarts. A record is said to be persistent if this flag is set for it; -if not, the record is treated as a non-persistent record and it is lost -after service restart. Persistent records are written to and read from -the file **statistics.data** before shutdown and upon startup. The file -is located in the HOME directory of the peer. - -An anomaly of the STATISTICS service is that it does not terminate -immediately upon receiving a shutdown signal if it has any clients -connected to it. It waits for all the clients that are not monitors to -close their connections before terminating itself. This is to prevent -the loss of data during peer shutdown — delaying the STATISTICS -service shutdown helps other services to store important data to -STATISTICS during shutdown. - -:index:`libgnunetstatistics <single: libgnunet; statistics>` -libgnunetstatistics -------------------- +.. _STATISTICS-Subsystem-Dev: + +STATISTICS +========== **libgnunetstatistics** is the library containing the API for the STATISTICS subsystem. Any process requiring to use STATISTICS should use diff --git a/developers/apis/transport-ng.rst b/developers/apis/transport-ng.rst @@ -1,303 +0,0 @@ - -.. index:: - double: TRANSPORT Next Generation; subsystem - -.. _TRANSPORT_002dNG-Subsystem: - -TRANSPORT-NG — Next-generation transport management -=================================================== - -The current GNUnet TRANSPORT architecture is rooted in the GNUnet 0.4 -design of using plugins for the actual transmission operations and the -ATS subsystem to select a plugin and allocate bandwidth. The following -key issues have been identified with this design: - -- Bugs in one plugin can affect the TRANSPORT service and other - plugins. 
There is at least one open bug that affects sockets, where - the origin is difficult to pinpoint due to the large code base. - -- Relevant operating system default configurations often impose a limit - of 1024 file descriptors per process. Thus, one plugin may impact - other plugins' connectivity choices. - -- Plugins are required to offer bi-directional connectivity. However, - firewalls (incl. NAT boxes) and physical environments sometimes only - allow uni-directional connectivity, which then currently cannot be - utilized at all. - -- Distance vector routing was implemented in 2009 but shortly afterwards - broken, and due to the complexity of implementing it as a plugin and - dealing with the resource allocation consequences it was never useful. - -- Most existing plugins communicate completely using cleartext, - exposing metadata (message size) and making it easy to fingerprint - and possibly block GNUnet traffic. - -- Various NAT traversal methods are not supported. - -- The service logic is cluttered with \"manipulation\" support code for - TESTBED to enable faking network characteristics like lossy - connections or firewalls. - -- Bandwidth allocation is done in ATS, requiring the duplication of - state and resulting in much delayed allocation decisions. As a - result, often available bandwidth goes unused. Users are expected to - manually configure bandwidth limits, instead of TRANSPORT using - congestion control to adapt automatically. - -- TRANSPORT is difficult to test and has bad test coverage. - -- HELLOs include an absolute expiration time. Nodes with unsynchronized - clocks cannot connect. - -- Displaying the contents of a HELLO requires the respective plugin as - the plugin-specific data is encoded in binary. This also complicates - logging. - -.. _Design-goals-of-TNG: - -Design goals of TNG -------------------- - -In order to address the above issues, we want to: - -- Move plugins into separate processes which we shall call - *communicators*. Communicators connect as clients to the transport - service. - -- TRANSPORT should be able to utilize any number of communicators to the - same peer at the same time. - -- TRANSPORT should be responsible for fragmentation, retransmission, - flow- and congestion-control. Users should no longer have to - configure bandwidth limits: TRANSPORT should detect what is available - and use it. - -- Communicators should be allowed to be uni-directional and - unreliable. TRANSPORT shall create bi-directional channels from this - whenever possible. - -- DV should no longer be a plugin, but part of TRANSPORT. - -- TRANSPORT should help communicators communicate, for - example in the case of uni-directional communicators or the need for - out-of-band signalling for NAT traversal. We call this functionality - *backchannels*. - -- Transport manipulation should be signalled to CORE on a per-message - basis instead of an approximate bandwidth. - -- CORE should signal performance requirements (reliability, latency, - etc.) on a per-message basis to TRANSPORT. If possible, TRANSPORT - should consider those options when scheduling messages for - transmission. - -- HELLOs should be in a human-readable format with monotonic time - expirations. - -The new architecture is planned as follows: - -.. image:: /images/tng.png - -TRANSPORT's main objective is to establish bi-directional virtual links -using a variety of possibly uni-directional communicators. Links undergo -the following steps: - -1. 
Communicator informs TRANSPORT A that a queue (direct neighbour) is - available, or equivalently TRANSPORT A discovers a (DV) path to a - target B. - -2. TRANSPORT A sends a challenge to the target peer, trying to confirm - that the peer can receive. FIXME: This is not implemented properly - for DV. Here we should really take a validated DVH and send a - challenge exactly down that path! - -3. The other TRANSPORT, TRANSPORT B, receives the challenge, and sends - back a response, possibly using a different path. If TRANSPORT B does - not yet have a virtual link to A, it must try to establish a virtual - link. - -4. Upon receiving the response, TRANSPORT A creates the virtual link. If - the response included a challenge, TRANSPORT A must respond to this - challenge as well, effectively re-creating the TCP 3-way handshake - (just with longer challenge values). - -.. _HELLO_002dNG: - -HELLO-NG --------- - -HELLOs change in three ways. First of all, communicators encode the -respective addresses in a human-readable URL-like string. This way, we -no longer require the communicator to print the contents of a HELLO. -Second, HELLOs no longer contain an expiration time, only a creation -time. The receiver must only compare the respective absolute values. So -given a HELLO from the same sender with a larger creation time, the -old one is no longer valid. This also obsoletes the need for the -gnunet-hello binary to set HELLO expiration times to never. Third, a -peer no longer generates one big HELLO that always contains all of the -addresses. Instead, each address is signed individually and shared only -over the address scopes where it makes sense to share the address. In -particular, care should be taken to not share MACs across the Internet -and confine their use to the LAN. As each address is signed separately, -having multiple addresses valid at the same time (given the new creation -time expiration logic) requires that those addresses must have exactly -the same creation time. Whenever that monotonic time is increased, all -addresses must be re-signed and re-distributed. - -.. _Priorities-and-preferences: - -Priorities and preferences --------------------------- - -In the new design, TRANSPORT adopts a feature (which was previously -already available in CORE) of the MQ API to allow applications to -specify priorities and preferences per message (or rather, per MQ -envelope). The (updated) MQ API allows applications to specify one of -four priority levels as well as desired preferences for transmission by -setting options on an envelope. These preferences currently are: - -- GNUNET_MQ_PREF_UNRELIABLE: Disables TRANSPORT waiting for ACKs on - unreliable channels like UDP. Now it is fire and forget. These - messages then cannot be used for RTT estimates either. - -- GNUNET_MQ_PREF_LOW_LATENCY: Directs TRANSPORT to select the - lowest-latency transmission choices possible. - -- GNUNET_MQ_PREF_CORK_ALLOWED: Allows TRANSPORT to delay transmission - to group the message with other messages into a larger batch to - reduce the number of packets sent. - -- GNUNET_MQ_PREF_GOODPUT: Directs TRANSPORT to select the highest - goodput channel available. - -- GNUNET_MQ_PREF_OUT_OF_ORDER: Allows TRANSPORT to reorder the messages - as it sees fit, otherwise TRANSPORT should attempt to preserve - transmission order. 
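As a rough illustration of how an application might request such per-message preferences, here is a minimal C sketch that combines two of the flags above on an MQ envelope before handing it to a message queue. It is not taken from the GNUnet sources; in particular the option setter ``GNUNET_MQ_env_set_options`` and the message type ``GNUNET_MESSAGE_TYPE_DUMMY`` are assumptions that may differ between GNUnet versions::

   #include "platform.h"
   #include "gnunet_util_lib.h"

   /* Sketch: send one message with advisory low-latency and corking
      preferences over an already established message queue 'mq'
      (e.g. obtained from CORE or CADET). */
   static void
   send_with_preferences (struct GNUNET_MQ_Handle *mq)
   {
     struct GNUNET_MessageHeader *msg;
     struct GNUNET_MQ_Envelope *env;

     /* Wrap an empty message of an assumed test type in an envelope. */
     env = GNUNET_MQ_msg (msg, GNUNET_MESSAGE_TYPE_DUMMY);
     /* Advisory options: prefer low latency, but allow batching ("cork"). */
     GNUNET_MQ_env_set_options (env,
                                GNUNET_MQ_PREF_LOW_LATENCY
                                | GNUNET_MQ_PREF_CORK_ALLOWED);
     /* Hand the envelope to the queue; TRANSPORT and CORE treat the
        preferences as hints, as described in the surrounding text. */
     GNUNET_MQ_send (mq, env);
   }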
- -Each MQ envelope is always able to store those options (and the -priority), and in the future this uniform API will be used by TRANSPORT, -CORE, CADET and possibly other subsystems that send messages (like -LAKE). When CORE sets preferences and priorities, it is supposed to -respect the preferences and priorities it is given from higher layers. -Similarly, CADET also simply passes on the preferences and priorities of -the layer above CADET. When a layer combines multiple smaller messages -into one larger transmission, the ``GNUNET_MQ_env_combine_options()`` function -should be used to calculate options for the combined message. We note -that the exact semantics of the options may differ by layer. For -example, CADET will always strictly implement reliable and in-order -delivery of messages, while the same options are only advisory for -TRANSPORT and CORE: they should try (using ACKs on unreliable -communicators, not changing the message order themselves), but if -messages are lost anyway (e.g. because a TCP connection is dropped in the middle), -or if messages are reordered (e.g. because they took different paths -over the network and arrived in a different order) TRANSPORT and CORE do -not have to correct this. Whether a preference is strict or loose is -thus defined by the respective layer. - -.. _Communicators: - -Communicators -------------- - -The API for communicators is defined in -``gnunet_transport_communication_service.h``. Each communicator must -specify its (global) communication characteristics, which for now only -say whether the communication is reliable (e.g. TCP, HTTPS) or -unreliable (e.g. UDP, WLAN). Each communicator must specify a unique -address prefix, or NULL if the communicator cannot establish outgoing -connections (for example because it is only acting as a TCP server). A -communicator must tell TRANSPORT which addresses it is reachable under. -Addresses may be added or removed at any time. A communicator may have -zero addresses (transmission only). Addresses do not have to match the -address prefix. - -TRANSPORT may ask a communicator to try to connect to another address. -TRANSPORT will only ask for connections where the address matches the -communicator's address prefix that was provided when the connection was -established. Communicators should then attempt to establish a -connection. -It is at the discretion of the communicator whether to honor this request. -Reasons for not honoring such a request may be an already existing connection -or resource limitations. -No response is provided to the TRANSPORT service on failure. -The TRANSPORT service has to ask the communicator explicitly to retry. - -If a communicator succeeds in establishing an outgoing connection for -transmission, or if a communicator receives an incoming bi-directional -connection, the communicator must inform the TRANSPORT service that a -message queue (MQ) for transmission is now available. -For that MQ, the communicator must provide the peer identity claimed by the other end. -It must also provide a human-readable address (for debugging) and a maximum transfer unit -(MTU). An MTU of zero means sending is not supported; SIZE_MAX should be -used if there is no MTU. The communicator should also tell TRANSPORT what -network type is used for the queue. The communicator may tell TRANSPORT -anytime that the queue was deleted and is no longer available. - -The communicator API also provides for flow control. 
First, -communicators exhibit back-pressure on TRANSPORT: the number of messages -TRANSPORT may add to a queue for transmission will be limited. So by not -draining the transmission queue, back-pressure is provided to TRANSPORT. -In the other direction, communicators may allow TRANSPORT to give -back-pressure towards the communicator by providing a non-NULL -``GNUNET_TRANSPORT_MessageCompletedCallback`` argument to the -``GNUNET_TRANSPORT_communicator_receive`` function. In this case, -TRANSPORT will only invoke this function once it has processed the -message and is ready to receive more. Communicators should then limit -how much traffic they receive based on this backpressure. Note that -communicators do not have to provide a -``GNUNET_TRANSPORT_MessageCompletedCallback``; for example, UDP cannot -support back-pressure due to the nature of the UDP protocol. In this -case, TRANSPORT will implement its own TRANSPORT-to-TRANSPORT flow -control to reduce the sender's data rate to acceptable levels. - -TRANSPORT may notify a communicator about backchannel messages TRANSPORT -received from other peers for this communicator. Similarly, -communicators can ask TRANSPORT to try to send a backchannel message to -other communicators of other peers. The semantics of the backchannel -message are up to the communicators which use them. TRANSPORT may fail -transmitting backchannel messages, and TRANSPORT will not attempt to -retransmit them. - -UDP communicator -^^^^^^^^^^^^^^^^ - -The UDP communicator implements a basic encryption layer to protect from -metadata leakage. -The layer tries to establish a shared secret using an Elliptic-Curve Diffie-Hellman -key exchange in which the initiator of a packet creates an ephemeral key pair -to encrypt a message for the target peer identity. -The communicator always offers this kind of transmission queue to a (reachable) -peer in which messages are encrypted with dedicated keys. -The performance of this queue is not suitable for high volume data transfer. - -If the UDP connection is bi-directional, or the TRANSPORT is able to offer a -backchannel connection, the resulting key can be re-used if the recieving peer -is able to ACK the reception. -This will cause the communicator to offer a new queue (with a higher priority -than the default queue) to TRANSPORT with a limited capacity. -The capacity is increased whenever the communicator receives an ACK for a -transmission. -This queue is suitable for high-volume data transfer and TRANSPORT will likely -prioritize this queue (if available). - -Communicators that try to establish a connection to a target peer authenticate -their peer ID (public key) in the first packets by signing a monotonic time -stamp, its peer ID, and the target peerID and send this data as well as the signature -in one of the first packets. -Receivers should keep track (persist) of the monotonic time stamps for each -peer ID to reject possible replay attacks. - -FIXME: Handshake wire format? KX, Flow. - -TCP communicator -^^^^^^^^^^^^^^^^ - -FIXME: Handshake wire format? KX, Flow. - -QUIC communicator -^^^^^^^^^^^^^^^^^ -The QUIC communicator runs over a bi-directional UDP connection. -TLS layer with self-signed certificates (binding/signed with peer ID?). -Single, bi-directional stream? -FIXME: Handshake wire format? KX, Flow. diff --git a/developers/apis/transport.rst b/developers/apis/transport.rst @@ -1,58 +1,10 @@ .. index:: double: TRANSPORT; subsystem -.. _TRANSPORT-Subsystem: +.. 
_TRANSPORT-Subsystem-Dev: -TRANSPORT — Overlay transport management -======================================== - -This chapter documents how the GNUnet transport subsystem works. The -GNUnet transport subsystem consists of three main components: the -transport API (the interface used by the rest of the system to access -the transport service), the transport service itself (most of the -interesting functions, such as choosing transports, happens here) and -the transport plugins. A transport plugin is a concrete implementation -for how two GNUnet peers communicate; many plugins exist, for example -for communication via TCP, UDP, HTTP, HTTPS and others. Finally, the -transport subsystem uses supporting code, especially the NAT/UPnP -library to help with tasks such as NAT traversal. - -Key tasks of the transport service include: - -- Create our HELLO message, notify clients and neighbours if our HELLO - changes (using NAT library as necessary) - -- Validate HELLOs from other peers (send PING), allow other peers to - validate our HELLO's addresses (send PONG) - -- Upon request, establish connections to other peers (using address - selection from ATS subsystem) and maintain them (again using PINGs - and PONGs) as long as desired - -- Accept incoming connections, give ATS service the opportunity to - switch communication channels - -- Notify clients about peers that have connected to us or that have - been disconnected from us - -- If a (stateful) connection goes down unexpectedly (without explicit - DISCONNECT), quickly attempt to recover (without notifying clients) - but do notify clients quickly if reconnecting fails - -- Send (payload) messages arriving from clients to other peers via - transport plugins and receive messages from other peers, forwarding - those to clients - -- Enforce inbound traffic limits (using flow-control if it is - applicable); outbound traffic limits are enforced by CORE, not by us - (!) - -- Enforce restrictions on P2P connection as specified by the blacklist - configuration and blacklisting clients - -Note that the term \"clients\" in the list above really refers to the -GNUnet-CORE service, as CORE is typically the only client of the -transport service. +TRANSPORT +========= .. _Address-validation-protocol: diff --git a/developers/apis/vpnstack.rst b/developers/apis/vpnstack.rst @@ -1,6 +0,0 @@ - -VPN and VPN Support -=================== - -.. toctree:: - diff --git a/index.rst b/index.rst @@ -7,7 +7,6 @@ Welcome to GNUnet’s documentation! about installing - subsystems/index users/index developers/index guis/index diff --git a/subsystems/cadet.rst b/subsystems/cadet.rst @@ -1,43 +0,0 @@ -.. _CADET-Subsystem: - -CADET ------ - -The Confidential Ad-hoc Decentralized End-to-end Transport (CADET) subsystem -in GNUnet is responsible for secure end-to-end -communications between nodes in the GNUnet overlay network. CADET builds -on the CORE subsystem, which provides for the link-layer communication, -by adding routing, forwarding, and additional security to the -connections. CADET offers the same cryptographic services as CORE, but -on an end-to-end level. This is done so peers retransmitting traffic on -behalf of other peers cannot access the payload data. 
- -- CADET provides confidentiality with so-called perfect forward - secrecy; we use ECDHE powered by Curve25519 for the key exchange and - then use symmetric encryption, encrypting with both AES-256 and - Twofish - -- authentication is achieved by signing the ephemeral keys using - Ed25519, a deterministic variant of ECDSA - -- integrity protection (using SHA-512 to do encrypt-then-MAC, although - only 256 bits are sent to reduce overhead) - -- replay protection (using nonces, timestamps, challenge-response, - message counters and ephemeral keys) - -- liveness (keep-alive messages, timeout) - -Additional to the CORE-like security benefits, CADET offers other -properties that make it a more universal service than CORE. - -- CADET can establish channels to arbitrary peers in GNUnet. If a peer - is not immediately reachable, CADET will find a path through the - network and ask other peers to retransmit the traffic on its behalf. - -- CADET offers (optional) reliability mechanisms. In a reliable channel - traffic is guaranteed to arrive complete, unchanged and in-order. - -- CADET takes care of flow and congestion control mechanisms, not - allowing the sender to send more traffic than the receiver or the - network are able to process. diff --git a/subsystems/core.rst b/subsystems/core.rst @@ -1,106 +0,0 @@ -.. _CORE-Subsystem: - -.. index:: - double: CORE; subsystem - -CORE — GNUnet link layer -======================== - -The CORE subsystem in GNUnet is responsible for securing link-layer -communications between nodes in the GNUnet overlay network. CORE builds -on the TRANSPORT subsystem which provides for the actual, insecure, -unreliable link-layer communication (for example, via UDP or WLAN), and -then adds fundamental security to the connections: - -- confidentiality with so-called perfect forward secrecy; we use ECDHE - (`Elliptic-curve - Diffie—Hellman <http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman>`__) - powered by Curve25519 (`Curve25519 <http://cr.yp.to/ecdh.html>`__) - for the key exchange and then use symmetric encryption, encrypting - with both AES-256 - (`AES-256 <http://en.wikipedia.org/wiki/Rijndael>`__) and Twofish - (`Twofish <http://en.wikipedia.org/wiki/Twofish>`__) - -- `authentication <http://en.wikipedia.org/wiki/Authentication>`__ is - achieved by signing the ephemeral keys using Ed25519 - (`Ed25519 <http://ed25519.cr.yp.to/>`__), a deterministic variant of - ECDSA (`ECDSA <http://en.wikipedia.org/wiki/ECDSA>`__) - -- integrity protection (using SHA-512 - (`SHA-512 <http://en.wikipedia.org/wiki/SHA-2>`__) to do - encrypt-then-MAC - (`encrypt-then-MAC <http://en.wikipedia.org/wiki/Authenticated_encryption>`__)) - -- Replay (`replay <http://en.wikipedia.org/wiki/Replay_attack>`__) - protection (using nonces, timestamps, challenge-response, message - counters and ephemeral keys) - -- liveness (keep-alive messages, timeout) - -.. _Limitations: - -:index:`Limitations <CORE; limitations>` -Limitations ------------ - -CORE does not perform -`routing <http://en.wikipedia.org/wiki/Routing>`__; using CORE it is -only possible to communicate with peers that happen to already be -\"directly\" connected with each other. CORE also does not have an API -to allow applications to establish such \"direct\" connections --- for -this, applications can ask TRANSPORT, but TRANSPORT might not be able to -establish a \"direct\" connection. The TOPOLOGY subsystem is responsible -for trying to keep a few \"direct\" connections open at all times. 
-Applications that need to talk to particular peers should use the CADET -subsystem, as it can establish arbitrary \"indirect\" connections. - -Because CORE does not perform routing, CORE must only be used directly -by applications that either perform their own routing logic (such as -anonymous file-sharing) or that do not require routing, for example -because they are based on flooding the network. CORE communication is -unreliable and delivery is possibly out-of-order. Applications that -require reliable communication should use the CADET service. Each -application can only queue one message per target peer with the CORE -service at any time; messages cannot be larger than approximately 63 -kilobytes. If messages are small, CORE may group multiple messages -(possibly from different applications) prior to encryption. If permitted -by the application (using the `cork <http://baus.net/on-tcp_cork/>`__ -option), CORE may delay transmissions to facilitate grouping of multiple -small messages. If cork is not enabled, CORE will transmit the message -as soon as TRANSPORT allows it (TRANSPORT is responsible for limiting -bandwidth and congestion control). CORE does not allow flow control; -applications are expected to process messages at line-speed. If flow -control is needed, applications should use the CADET service. - -.. when is a peer connected -.. _When-is-a-peer-_0022connected_0022_003f: - -When is a peer \"connected\"? ------------------------------ - -In addition to the security features mentioned above, CORE also provides -one additional key feature to applications using it, and that is a -limited form of protocol-compatibility checking. CORE distinguishes -between TRANSPORT-level connections (which enable communication with -other peers) and application-level connections. Applications using the -CORE API will (typically) learn about application-level connections from -CORE, and not about TRANSPORT-level connections. When a typical -application uses CORE, it will specify a set of message types (from -``gnunet_protocols.h``) that it understands. CORE will then notify the -application about connections it has with other peers if and only if -those applications registered an intersecting set of message types with -their CORE service. Thus, it is quite possible that CORE only exposes a -subset of the established direct connections to a particular application ---- and different applications running above CORE might see different -sets of connections at the same time. - -A special case are applications that do not register a handler for any -message type. CORE assumes that these applications merely want to -monitor connections (or \"all\" messages via other callbacks) and will -notify those applications about all connections. This is used, for -example, by the ``gnunet-core`` command-line tool to display the active -connections. Note that it is also possible that the TRANSPORT service -has more active connections than the CORE service, as the CORE service -first has to perform a key exchange with connecting peers before -exchanging information about supported message types and notifying -applications about the new connection. diff --git a/subsystems/dht.rst b/subsystems/dht.rst @@ -1,69 +0,0 @@ -.. _Distributed-Hash-Table-_0028DHT_0029: - -.. 
index:: - double: Distributed hash table; subsystem - see: DHT; Distributed hash table - -DHT — Distributed Hash Table -============================ - -GNUnet includes a generic distributed hash table that can be used by -developers building P2P applications in the framework. This section -documents high-level features and how developers are expected to use the -DHT. We have a research paper detailing how the DHT works. Also, Nate's -thesis includes a detailed description and performance analysis (in -chapter 6). [R5N2011]_ - -.. todo:: Confirm: Are "Nate's thesis" and the "research paper" separate - entities? - -Key features of GNUnet's DHT include: - -- stores key-value pairs with values up to (approximately) 63k in size - -- works with many underlay network topologies (small-world, random - graph), underlay does not need to be a full mesh / clique - -- support for extended queries (more than just a simple 'key'), - filtering duplicate replies within the network (bloomfilter) and - content validation (for details, please read the subsection on the - block library) - -- can (optionally) return paths taken by the PUT and GET operations to - the application - -- provides content replication to handle churn - -GNUnet's DHT is randomized and unreliable. Unreliable means that there -is no strict guarantee that a value stored in the DHT is always found -— values are only found with high probability. While this is somewhat -true in all P2P DHTs, GNUnet developers should be particularly wary of -this fact (this will help you write secure, fault-tolerant code). Thus, -when writing any application using the DHT, you should always consider -the possibility that a value stored in the DHT by you or some other peer -might simply not be returned, or returned with a significant delay. Your -application logic must be written to tolerate this (naturally, some loss -of performance or quality of service is expected in this case). - -.. _Block-library-and-plugins: - -Block library and plugins -------------------------- - -.. _What-is-a-Block_003f: - -What is a Block? -^^^^^^^^^^^^^^^^ - -Blocks are small (< 63k) pieces of data stored under a key (struct -GNUNET_HashCode). Blocks have a type (enum GNUNET_BlockType) which -defines their data format. Blocks are used in GNUnet as units of static -data exchanged between peers and stored (or cached) locally. Uses of -blocks include file-sharing (the files are broken up into blocks), the -VPN (DNS information is stored in blocks) and the DHT (all information -in the DHT and meta-information for the maintenance of the DHT are both -stored using blocks). The block subsystem provides a few common -functions that must be available for any type of block. - - -.. [R5N2011] https://bib.gnunet.org/date.html#R5N diff --git a/subsystems/fs.rst b/subsystems/fs.rst @@ -1,54 +0,0 @@ -.. index:: - double: File sharing; subsystem - see: FS; File sharing - -.. _File_002dsharing-_0028FS_0029-Subsystem: - -FS — File sharing over GNUnet -============================= - -This chapter describes the details of how the file-sharing service -works. As with all services, it is split into an API (libgnunetfs), the -service process (gnunet-service-fs) and user interface(s). The -file-sharing service uses the datastore service to store blocks and the -DHT (and indirectly datacache) for lookups for non-anonymous -file-sharing. Furthermore, the file-sharing service uses the block -library (and the block fs plugin) for validation of DHT operations. 
- -In contrast to many other services, libgnunetfs is rather complex since -the client library includes a large number of high-level abstractions; -this is necessary since the FS service itself largely only operates on -the block level. The FS library is responsible for providing a -file-based abstraction to applications, including directories, meta -data, keyword search, verification, and so on. - -The method used by GNUnet to break large files into blocks and to use -keyword search is called the \"Encoding for Censorship Resistant -Sharing\" (ECRS). ECRS is largely implemented in the fs library; block -validation is also reflected in the block FS plugin and the FS service. -ECRS on-demand encoding is implemented in the FS service. - -.. note:: The documentation in this chapter is quite incomplete. - -.. _Encoding-for-Censorship_002dResistant-Sharing-_0028ECRS_0029: - -.. index:: - see: Encoding for Censorship-Resistant Sharing; ECRS - -:index:`ECRS — Encoding for Censorship-Resistant Sharing <single: ECRS>` -ECRS — Encoding for Censorship-Resistant Sharing ------------------------------------------------- - -When GNUnet shares files, it uses a content encoding that is called -ECRS, the Encoding for Censorship-Resistant Sharing. Most of ECRS is -described in the (so far unpublished) research paper attached to this -page. ECRS obsoletes the previous ESED and ESED II encodings which were -used in GNUnet before version 0.7.0. The rest of this page assumes that -the reader is familiar with the attached paper. What follows is a -description of some minor extensions that GNUnet makes over what is -described in the paper. The reason why these extensions are not in the -paper is that we felt that they were obvious or trivial extensions to -the original scheme and thus did not warrant space in the research -report. - -.. todo:: Find missing link to file system paper. diff --git a/subsystems/gns.rst b/subsystems/gns.rst @@ -1,47 +0,0 @@ -.. index:: - double: GNU Name System; subsystem - see: GNS; GNU Name System - -.. _GNU-Name-System-_0028GNS_0029: - -GNS — the GNU Name System -========================= - -The GNU Name System (GNS) is a decentralized database that enables users -to securely resolve names to values. Names can be used to identify other -users (for example, in social networking), or network services (for -example, VPN services running at a peer in GNUnet, or purely IP-based -services on the Internet). Users interact with GNS by typing in a -hostname that ends in a top-level domain that is configured in the "GNS" -section, matches an identity of the user or ends in a Base32-encoded -public key. - -Videos giving an overview of most of the GNS and the motivations behind -it is available here and here. The remainder of this chapter targets -developers that are familiar with high level concepts of GNS as -presented in these talks. - -.. todo:: Link to videos and GNS talks? - -GNS-aware applications should use the GNS resolver to obtain the -respective records that are stored under that name in GNS. Each record -consists of a type, value, expiration time and flags. - -The type specifies the format of the value. Types below 65536 correspond -to DNS record types, larger values are used for GNS-specific records. -Applications can define new GNS record types by reserving a number and -implementing a plugin (which mostly needs to convert the binary value -representation to a human-readable text format and vice-versa). The -expiration time specifies how long the record is to be valid. 
The GNS -API ensures that applications are only given non-expired values. The -flags are typically irrelevant for applications, as GNS uses them -internally to control visibility and validity of records. - -Records are stored along with a signature. The signature is generated -using the private key of the authoritative zone. This allows any GNS -resolver to verify the correctness of a name-value mapping. - -Internally, GNS uses the NAMECACHE to cache information obtained from -other users, the NAMESTORE to store information specific to the local -users, and the DHT to exchange data between users. A plugin API is used -to enable applications to define new GNS record types. diff --git a/subsystems/hostlist.rst b/subsystems/hostlist.rst @@ -1,293 +0,0 @@ - -.. index:: - double: HOSTLIST; subsystem - -.. _HOSTLIST-Subsystem: - -HOSTLIST — HELLO bootstrapping and gossip -========================================= - -Peers in the GNUnet overlay network need address information so that -they can connect with other peers. GNUnet uses so called HELLO messages -to store and exchange peer addresses. GNUnet provides several methods -for peers to obtain this information: - -- out-of-band exchange of HELLO messages (manually, using for example - gnunet-peerinfo) - -- HELLO messages shipped with GNUnet (automatic with distribution) - -- UDP neighbor discovery in LAN (IPv4 broadcast, IPv6 multicast) - -- topology gossiping (learning from other peers we already connected - to), and - -- the HOSTLIST daemon covered in this section, which is particularly - relevant for bootstrapping new peers. - -New peers have no existing connections (and thus cannot learn from -gossip among peers), may not have other peers in their LAN and might be -started with an outdated set of HELLO messages from the distribution. In -this case, getting new peers to connect to the network requires either -manual effort or the use of a HOSTLIST to obtain HELLOs. - -.. _HELLOs: - -HELLOs ------- - -The basic information peers require to connect to other peers are -contained in so called HELLO messages you can think of as a business -card. Besides the identity of the peer (based on the cryptographic -public key) a HELLO message may contain address information that -specifies ways to contact a peer. By obtaining HELLO messages, a peer -can learn how to contact other peers. - -.. _Overview-for-the-HOSTLIST-subsystem: - -Overview for the HOSTLIST subsystem ------------------------------------ - -The HOSTLIST subsystem provides a way to distribute and obtain contact -information to connect to other peers using a simple HTTP GET request. -Its implementation is split in three parts, the main file for the -daemon itself (``gnunet-daemon-hostlist.c``), the HTTP client used to -download peer information (``hostlist-client.c``) and the server -component used to provide this information to other peers -(``hostlist-server.c``). The server is basically a small HTTP web server -(based on GNU libmicrohttpd) which provides a list of HELLOs known to -the local peer for download. The client component is basically a HTTP -client (based on libcurl) which can download hostlists from one or more -websites. The hostlist format is a binary blob containing a sequence of -HELLO messages. Note that any HTTP server can theoretically serve a -hostlist, the built-in hostlist server makes it simply convenient to -offer this service. - -.. 
_Features: - -Features -^^^^^^^^ - -The HOSTLIST daemon can: - -- provide HELLO messages with validated addresses obtained from - PEERINFO to download for other peers - -- download HELLO messages and forward these message to the TRANSPORT - subsystem for validation - -- advertises the URL of this peer's hostlist address to other peers via - gossip - -- automatically learn about hostlist servers from the gossip of other - peers - -.. _HOSTLIST-_002d-Limitations: - -HOSTLIST - Limitations -^^^^^^^^^^^^^^^^^^^^^^ - -The HOSTLIST daemon does not: - -- verify the cryptographic information in the HELLO messages - -- verify the address information in the HELLO messages - -.. _Interacting-with-the-HOSTLIST-daemon: - -Interacting with the HOSTLIST daemon ------------------------------------- - -The HOSTLIST subsystem is currently implemented as a daemon, so there is -no need for the user to interact with it and therefore there is no -command line tool and no API to communicate with the daemon. In the -future, we can envision changing this to allow users to manually trigger -the download of a hostlist. - -Since there is no command line interface to interact with HOSTLIST, the -only way to interact with the hostlist is to use STATISTICS to obtain or -modify information about the status of HOSTLIST: - -:: - - $ gnunet-statistics -s hostlist - -In particular, HOSTLIST includes a **persistent** value in statistics -that specifies when the hostlist server might be queried next. As this -value is exponentially increasing during runtime, developers may want to -reset or manually adjust it. Note that HOSTLIST (but not STATISTICS) -needs to be shutdown if changes to this value are to have any effect on -the daemon (as HOSTLIST does not monitor STATISTICS for changes to the -download frequency). - -.. _Hostlist-security-address-validation: - -Hostlist security address validation ------------------------------------- - -Since information obtained from other parties cannot be trusted without -validation, we have to distinguish between *validated* and *not -validated* addresses. Before using (and so trusting) information from -other parties, this information has to be double-checked (validated). -Address validation is not done by HOSTLIST but by the TRANSPORT service. - -The HOSTLIST component is functionally located between the PEERINFO and -the TRANSPORT subsystem. When acting as a server, the daemon obtains -valid (*validated*) peer information (HELLO messages) from the PEERINFO -service and provides it to other peers. When acting as a client, it -contacts the HOSTLIST servers specified in the configuration, downloads -the (unvalidated) list of HELLO messages and forwards these information -to the TRANSPORT server to validate the addresses. - -.. _The-HOSTLIST-daemon: - -:index:`The HOSTLIST daemon <double: daemon; HOSTLIST>` -The HOSTLIST daemon -------------------- - -The hostlist daemon is the main component of the HOSTLIST subsystem. It -is started by the ARM service and (if configured) starts the HOSTLIST -client and server components. - -GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT -If the daemon provides a hostlist itself it can advertise it's own -hostlist to other peers. To do so it sends a -``GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT`` message to other peers -when they connect to this peer on the CORE level. This hostlist -advertisement message contains the URL to access the HOSTLIST HTTP -server of the sender. 
The daemon may also subscribe to this type of -message from CORE service, and then forward these kind of message to the -HOSTLIST client. The client then uses all available URLs to download -peer information when necessary. - -When starting, the HOSTLIST daemon first connects to the CORE subsystem -and if hostlist learning is enabled, registers a CORE handler to receive -this kind of messages. Next it starts (if configured) the client and -server. It passes pointers to CORE connect and disconnect and receive -handlers where the client and server store their functions, so the -daemon can notify them about CORE events. - -To clean up on shutdown, the daemon has a cleaning task, shutting down -all subsystems and disconnecting from CORE. - -.. _The-HOSTLIST-server: - -:index:`The HOSTLIST server <single: HOSTLIST; server>` -The HOSTLIST server -------------------- - -The server provides a way for other peers to obtain HELLOs. Basically it -is a small web server other peers can connect to and download a list of -HELLOs using standard HTTP; it may also advertise the URL of the -hostlist to other peers connecting on CORE level. - -.. _The-HTTP-Server: - -The HTTP Server -^^^^^^^^^^^^^^^ - -During startup, the server starts a web server listening on the port -specified with the HTTPPORT value (default 8080). In addition it -connects to the PEERINFO service to obtain peer information. The -HOSTLIST server uses the GNUNET_PEERINFO_iterate function to request -HELLO information for all peers and adds their information to a new -hostlist if they are suitable (expired addresses and HELLOs without -addresses are both not suitable) and the maximum size for a hostlist is -not exceeded (MAX_BYTES_PER_HOSTLISTS = 500000). When PEERINFO finishes -(with a last NULL callback), the server destroys the previous hostlist -response available for download on the web server and replaces it with -the updated hostlist. The hostlist format is basically a sequence of -HELLO messages (as obtained from PEERINFO) without any special -tokenization. Since each HELLO message contains a size field, the -response can easily be split into separate HELLO messages by the client. - -A HOSTLIST client connecting to the HOSTLIST server will receive the -hostlist as an HTTP response and the server will terminate the -connection with the result code ``HTTP 200 OK``. The connection will be -closed immediately if no hostlist is available. - -.. _Advertising-the-URL: - -Advertising the URL -^^^^^^^^^^^^^^^^^^^ - -The server also advertises the URL to download the hostlist to other -peers if hostlist advertisement is enabled. When a new peer connects and -has hostlist learning enabled, the server sends a -``GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT`` message to this peer -using the CORE service. - -HOSTLIST client -.. _The-HOSTLIST-client: - -The HOSTLIST client -------------------- - -The client provides the functionality to download the list of HELLOs -from a set of URLs. It performs a standard HTTP request to the URLs -configured and learned from advertisement messages received from other -peers. When a HELLO is downloaded, the HOSTLIST client forwards the -HELLO to the TRANSPORT service for validation. - -The client supports two modes of operation: - -- download of HELLOs (bootstrapping) - -- learning of URLs - -.. _Bootstrapping: - -Bootstrapping -^^^^^^^^^^^^^ - -For bootstrapping, it schedules a task to download the hostlist from the -set of known URLs. 
The downloads are only performed if the number of -current connections is smaller than a minimum number of connections (at -the moment 4). The interval between downloads increases exponentially; -however, the exponential growth is limited if it becomes longer than an -hour. At that point, the frequency growth is capped at (#number of -connections \* 1h). - -Once the decision has been taken to download HELLOs, the daemon chooses -a random URL from the list of known URLs. URLs can be configured in the -configuration or be learned from advertisement messages. The client uses -a HTTP client library (libcurl) to initiate the download using the -libcurl multi interface. Libcurl passes the data to the -callback_download function which stores the data in a buffer if space is -available and the maximum size for a hostlist download is not exceeded -(MAX_BYTES_PER_HOSTLISTS = 500000). When a full HELLO was downloaded, -the HOSTLIST client offers this HELLO message to the TRANSPORT service -for validation. When the download is finished or failed, statistical -information about the quality of this URL is updated. - -.. _Learning: - -:index:`Learning <single: HOSTLIST; learning>` -Learning -^^^^^^^^ - -The client also manages hostlist advertisements from other peers. The -HOSTLIST daemon forwards ``GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT`` -messages to the client subsystem, which extracts the URL from the -message. Next, a test of the newly obtained URL is performed by -triggering a download from the new URL. If the URL works correctly, it -is added to the list of working URLs. - -The size of the list of URLs is restricted, so if an additional server -is added and the list is full, the URL with the worst quality ranking -(determined through successful downloads and number of HELLOs e.g.) is -discarded. During shutdown the list of URLs is saved to a file for -persistence and loaded on startup. URLs from the configuration file are -never discarded. - -.. _Usage: - -Usage ------ - -To start HOSTLIST by default, it has to be added to the DEFAULTSERVICES -section for the ARM services. This is done in the default configuration. - -For more information on how to configure the HOSTLIST subsystem see the -installation handbook: Configuring the hostlist to bootstrap Configuring -your peer to provide a hostlist diff --git a/subsystems/identity.rst b/subsystems/identity.rst @@ -1,183 +0,0 @@ - -.. index:: - double: IDENTITY; subsystem - -.. _IDENTITY-Subsystem: - -IDENTITY — Ego management -========================= - -Identities of \"users\" in GNUnet are called egos. Egos can be used as -pseudonyms (\"fake names\") or be tied to an organization (for example, -\"GNU\") or even the actual identity of a human. GNUnet users are -expected to have many egos. They might have one tied to their real -identity, some for organizations they manage, and more for different -domains where they want to operate under a pseudonym. - -The IDENTITY service allows users to manage their egos. The identity -service manages the private keys egos of the local user; it does not -manage identities of other users (public keys). Public keys for other -users need names to become manageable. GNUnet uses the GNU Name System -(GNS) to give names to other users and manage their public keys -securely. This chapter is about the IDENTITY service, which is about the -management of private keys. - -On the network, an ego corresponds to an ECDSA key (over Curve25519, -using RFC 6979, as required by GNS). 
Thus, users can perform actions -under a particular ego by using (signing with) a particular private key. -Other users can then confirm that the action was really performed by -that ego by checking the signature against the respective public key. - -The IDENTITY service allows users to associate a human-readable name -with each ego. This way, users can use names that will remind them of -the purpose of a particular ego. The IDENTITY service will store the -respective private keys and allows applications to access key -information by name. Users can change the name that is locally (!) -associated with an ego. Egos can also be deleted, which means that the -private key will be removed and it thus will not be possible to perform -actions with that ego in the future. - -Additionally, the IDENTITY subsystem can associate service functions -with egos. For example, GNS requires the ego that should be used for the -shorten zone. GNS will ask IDENTITY for an ego for the \"gns-short\" -service. The IDENTITY service has a mapping of such service strings to -the name of the ego that the user wants to use for this service, for -example \"my-short-zone-ego\". - -Finally, the IDENTITY API provides access to a special ego, the -anonymous ego. The anonymous ego is special in that its private key is -not really private, but fixed and known to everyone. Thus, anyone can -perform actions as anonymous. This can be useful as with this trick, -code does not have to contain a special case to distinguish between -anonymous and pseudonymous egos. - -:index:`libgnunetidentity <single: libgnunet; identity>` -libgnunetidentity ------------------ - -.. _Connecting-to-the-identity-service: - -Connecting to the service -^^^^^^^^^^^^^^^^^^^^^^^^^ - -First, typical clients connect to the identity service using -``GNUNET_IDENTITY_connect``. This function takes a callback as a -parameter. If the given callback parameter is non-null, it will be -invoked to notify the application about the current state of the -identities in the system. - -- First, it will be invoked on all known egos at the time of the - connection. For each ego, a handle to the ego and the user's name for - the ego will be passed to the callback. Furthermore, a ``void **`` - context argument will be provided which gives the client the - opportunity to associate some state with the ego. - -- Second, the callback will be invoked with NULL for the ego, the name - and the context. This signals that the (initial) iteration over all - egos has completed. - -- Then, the callback will be invoked whenever something changes about - an ego. If an ego is renamed, the callback is invoked with the ego - handle of the ego that was renamed, and the new name. If an ego is - deleted, the callback is invoked with the ego handle and a name of - NULL. In the deletion case, the application should also release - resources stored in the context. - -- When the application destroys the connection to the identity service - using ``GNUNET_IDENTITY_disconnect``, the callback is again invoked - with the ego and a name of NULL (equivalent to deletion of the egos). - This should again be used to clean up the per-ego context. - -The ego handle passed to the callback remains valid until the callback -is invoked with a name of NULL, so it is safe to store a reference to -the ego's handle. - -.. 
_Operations-on-Egos: - -Operations on Egos -^^^^^^^^^^^^^^^^^^ - -Given an ego handle, the main operations are to get its associated -private key using ``GNUNET_IDENTITY_ego_get_private_key`` or its -associated public key using ``GNUNET_IDENTITY_ego_get_public_key``. - -The other operations on egos are pretty straightforward. Using -``GNUNET_IDENTITY_create``, an application can request the creation of -an ego by specifying the desired name. The operation will fail if that -name is already in use. Using ``GNUNET_IDENTITY_rename`` the name of an -existing ego can be changed. Finally, egos can be deleted using -``GNUNET_IDENTITY_delete``. All of these operations will trigger updates -to the callback given to the ``GNUNET_IDENTITY_connect`` function of all -applications that are connected with the identity service at the time. -``GNUNET_IDENTITY_cancel`` can be used to cancel the operations before -the respective continuations would be called. It is not guaranteed that -the operation will not be completed anyway, only the continuation will -no longer be called. - -.. _The-anonymous-Ego: - -The anonymous Ego -^^^^^^^^^^^^^^^^^ - -A special way to obtain an ego handle is to call -``GNUNET_IDENTITY_ego_get_anonymous``, which returns an ego for the -\"anonymous\" user --- anyone knows and can get the private key for this -user, so it is suitable for operations that are supposed to be anonymous -but require signatures (for example, to avoid a special path in the -code). The anonymous ego is always valid and accessing it does not -require a connection to the identity service. - -.. _Convenience-API-to-lookup-a-single-ego: - -Convenience API to lookup a single ego -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -As applications commonly simply have to lookup a single ego, there is a -convenience API to do just that. Use ``GNUNET_IDENTITY_ego_lookup`` to -lookup a single ego by name. Note that this is the user's name for the -ego, not the service function. The resulting ego will be returned via a -callback and will only be valid during that callback. The operation can -be canceled via ``GNUNET_IDENTITY_ego_lookup_cancel`` (cancellation is -only legal before the callback is invoked). - -.. _Associating-egos-with-service-functions: - -Associating egos with service functions -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The ``GNUNET_IDENTITY_set`` function is used to associate a particular -ego with a service function. The name used by the service and the ego -are given as arguments. Afterwards, the service can use its name to -lookup the associated ego using ``GNUNET_IDENTITY_get``. - -.. _The-IDENTITY-Client_002dService-Protocol: - -The IDENTITY Client-Service Protocol ------------------------------------- - -A client connecting to the identity service first sends a message with -type ``GNUNET_MESSAGE_TYPE_IDENTITY_START`` to the service. After that, -the client will receive information about changes to the egos by -receiving messages of type ``GNUNET_MESSAGE_TYPE_IDENTITY_UPDATE``. -Those messages contain the private key of the ego and the user's name of -the ego (or zero bytes for the name to indicate that the ego was -deleted). A special bit ``end_of_list`` is used to indicate the end of -the initial iteration over the identity service's egos. - -The client can trigger changes to the egos by sending ``CREATE``, -``RENAME`` or ``DELETE`` messages. The CREATE message contains the -private key and the desired name. The RENAME message contains the old -name and the new name. 
The DELETE message only needs to include the name -of the ego to delete. The service responds to each of these messages -with a ``RESULT_CODE`` message which indicates success or error of the -operation, and possibly a human-readable error message. - -Finally, the client can bind the name of a service function to an ego by -sending a ``SET_DEFAULT`` message with the name of the service function -and the private key of the ego. Such bindings can then be resolved using -a ``GET_DEFAULT`` message, which includes the name of the service -function. The identity service will respond to a GET_DEFAULT request -with a SET_DEFAULT message containing the respective information, or -with a RESULT_CODE to indicate an error. - - diff --git a/subsystems/index.rst b/subsystems/index.rst @@ -1,31 +0,0 @@ -Subsystems -========== - -These services comprise a backbone of core services for -peer-to-peer applications to use. - -.. toctree:: - cadet.rst - core.rst - dht.rst - fs.rst - gns.rst - hostlist.rst - identity.rst - messenger.rst - namecache.rst - namestore.rst - nse.rst - peerinfo.rst - peerstore.rst - regex.rst - rest.rst - revocation.rst - rps.rst - setops.rst - statistics.rst - transport-ng.rst - transport.rst - vpnstack.rst - - diff --git a/subsystems/messenger.rst b/subsystems/messenger.rst @@ -1,241 +0,0 @@ -.. index:: - double: subsystem; MESSENGER - -.. _MESSENGER-Subsystem: - -MESSENGER — Room-based end-to-end messaging -=========================================== - -The MESSENGER subsystem is responsible for secure end-to-end -communication in groups of nodes in the GNUnet overlay network. -MESSENGER builds on the CADET subsystem which provides reliable and -secure end-to-end communication between the nodes inside of these -groups. - -In addition to the CADET security benefits, MESSENGER provides the -following properties designed for application-level usage: - -- MESSENGER provides integrity by signing the messages with the user's - provided ego - -- MESSENGER adds (optional) forward secrecy by replacing the key pair - of the used ego and signing the propagation of the new one with the old - one (chaining egos) - -- MESSENGER provides verification of the original sender by checking - against all used egos from a member which are currently in active use - (active use depends on the state of a member session) - -- MESSENGER offers (optional) decentralized message forwarding between - all nodes in a group to improve availability and prevent MITM-attacks - -- MESSENGER handles new connections and disconnections from nodes in - the group by reconnecting them, preserving an efficient structure for - message distribution (ensuring availability and accountability) - -- MESSENGER provides replay protection (messages can be uniquely - identified via SHA-512, include a timestamp and the hash of the last - message) - -- MESSENGER allows detection of dropped messages by chaining them - (messages refer to the last message by their hash) improving - accountability - -- MESSENGER allows requesting messages from other peers explicitly to - ensure availability - -- MESSENGER provides confidentiality by padding messages to a few - different sizes (512 bytes, 4096 bytes, 32768 bytes and the maximal - message size from CADET) - -- MESSENGER adds (optional) confidentiality with ECDHE to exchange and - use symmetric encryption, encrypting with both AES-256 and Twofish - but allowing only selected members to decrypt (using the receiver's - ego for ECDHE) - -MESSENGER also provides multiple features with privacy in mind: - -- 
-- MESSENGER allows deleting messages from all peers in the group by the
-  original sender (using the MESSENGER-provided verification)
-
-- MESSENGER allows using the publicly known anonymous ego instead of
-  any unique identifying ego
-
-- MESSENGER allows your node to decide between acting as host of the
-  used messaging room (sharing your peer's identity with all nodes in
-  the group) or acting as guest (sharing your peer's identity only with
-  the nodes you explicitly open a connection to)
-
-- MESSENGER handles members independently of the peer's identity,
-  making forwarded messages indistinguishable from directly received
-  ones (complicating the tracking of messages and the identification of
-  their origin)
-
-- MESSENGER allows member names to be non-unique (names are also
-  optional)
-
-- MESSENGER does not include information about the selected receiver of
-  an explicitly encrypted message in its header, making it harder for
-  other members to draw conclusions about communication partners
-
-
-:index:`libgnunetmessenger <single: libgnunet; messenger>`
-libgnunetmessenger
-------------------
-
-The MESSENGER API (defined in ``gnunet_messenger_service.h``) allows P2P
-applications built using GNUnet to communicate with specified kinds of
-messages in a group. It provides applications the ability to send and
-receive encrypted messages to any group of peers participating in GNUnet
-in a decentralized way (without even knowing all peers' identities).
-
-MESSENGER delivers messages to other peers in \"rooms\". A room uses a
-variable number of CADET \"channels\" which will all be used for message
-distribution. Each channel can represent an outgoing connection opened
-by entering a room with ``GNUNET_MESSENGER_enter_room`` or an incoming
-connection if the room was opened before via
-``GNUNET_MESSENGER_open_room``.
-
-|messenger_room|
-
-To enter a room you have to specify the \"door\" (the peer identity of a
-peer which has opened the room) and the key of the room (which is
-identical to a CADET \"port\"). To open a room you have to specify only
-the key to use. When opening a room you automatically distribute a
-PEER-message sharing your peer's identity in the room.
-
-Entering or opening a room can also be combined in any order. In any
-case you will automatically get a unique member ID and send a
-JOIN-message notifying others about your entry and the public key of
-your selected ego.
-
-The ego can be selected by name with the initial
-``GNUNET_MESSENGER_connect`` call, besides setting an (identity)
-callback for each change/confirmation of the used ego and a (message)
-callback which gets called every time a message is sent or received in
-the room. Once the identity callback has been called you can check your
-used ego with ``GNUNET_MESSENGER_get_key``, which provides only its
-public key. The function returns NULL if the anonymous ego is used. If
-the ego should be replaced with a newly generated one, you can use
-``GNUNET_MESSENGER_update`` to ensure proper chaining of used egos.
-
-Once the identity callback has been called you can also check your used
-name with ``GNUNET_MESSENGER_get_name`` and potentially change or set a
-name via ``GNUNET_MESSENGER_set_name``. A name is for example required
-to create a new ego with ``GNUNET_MESSENGER_update``. Any change in ego
-or name will automatically be distributed in the room with a
-corresponding KEY- or NAME-message.
-
-To send a message inside of a room you can use
-``GNUNET_MESSENGER_send_message``.
If you specify a selected contact as -receiver, the message gets encrypted automatically and will be sent as -PRIVATE- message instead. - -To request a potentially missed message or to get a specific message -after its original call of the message-callback, you can use -``GNUNET_MESSENGER_get_message``. Additionally once a message was -distributed to application level and the message-callback got called, -you can get the contact respresenting a message's sender respectively -with ``GNUNET_MESSENGER_get_sender``. This allows getting name and the -public key of any sender currently in use with -``GNUNET_MESSENGER_contact_get_name`` and -``GNUNET_MESSENGER_contact_get_key``. It is also possible to iterate -through all current members of a room with -``GNUNET_MESSENGER_iterate_members`` using a callback. - -To leave a room you can use ``GNUNET_MESSENGER_close_room`` which will -also close the rooms connections once all applications on the same peer -have left the room. Leaving a room will also send a LEAVE-message -closing a member session on all connected peers before any connection -will be closed. Leaving a room is however not required for any -application to keep your member session open between multiple sessions -of the actual application. - -Finally, when an application no longer wants to use CADET, it should -call ``GNUNET_MESSENGER_disconnect``. You don't have to explicitly close -the used rooms or leave them. - -Here is a little summary to the kinds of messages you can send manually: - -.. _MERGE_002dmessage: - -MERGE-message -^^^^^^^^^^^^^ - -MERGE-messages will generally be sent automatically to reduce the amount -of parallel chained messages. This is necessary to close a member -session for example. You can also send MERGE-messages manually if -required to merge two chains of messages. - -.. _INVITE_002dmessage: - -INVITE-message -^^^^^^^^^^^^^^ - -INVITE-messages can be used to invite other members in a room to a -different room, sharing one potential door and the required key to enter -the room. This kind of message is typically sent as encrypted -PRIVATE-message to selected members because it doesn't make much sense -to invite all members from one room to another considering a rooms key -doesn't specify its usage. - -.. _TEXT_002dmessage: - -TEXT-message -^^^^^^^^^^^^ - -TEXT-messages can be used to send simple text-based messages and should -be considered as being in readable form without complex decoding. The -text has to end with a NULL-terminator character and should be in UTF-8 -encoding for most compatibility. - -.. _FILE_002dmessage: - -FILE-message -^^^^^^^^^^^^ - -FILE-messages can be used to share files inside of a room. They do not -contain the actual file being shared but its original hash, filename, -URI to download the file and a symmetric key to decrypt the downloaded -file. - -It is recommended to use the FS subsystem and the FILE-messages in -combination. - -.. _DELETE_002dmessage: - -DELETE-message -^^^^^^^^^^^^^^ - -DELETE-messages can be used to delete messages selected with its hash. -You can also select any custom delay relative to the time of sending the -DELETE-message. Deletion will only be processed on each peer in a room -if the sender is authorized. - -The only information of a deleted message which being kept will be the -chained hashes connecting the message graph for potential traversion. -For example the check for completion of a member session requires this -information. - -.. 
_Member-sessions: - -Member sessions ---------------- - -A member session is a triple of the room key, the member ID and the -public key of the member's ego. Member sessions allow that a member can -change their ID or their ego once at a time without losing the ability -to delete old messages or identifying the original sender of a message. -On every change of ID or EGO a session will be marked as closed. So -every session chain will only contain one open session with the current -ID and public key. - -If a session is marked as closed the MESSENGER service will check from -the first message opening a session to its last one closing the session -for completion. If a the service can confirm that there is no message -still missing which was sent from the closed member session, it will be -marked as completed. - -A completed member session is not able to verify any incoming message to -ensure forward secrecy preventing others from using old stolen egos. - -.. |messenger_room| image:: /images/messenger_room.png diff --git a/subsystems/namecache.rst b/subsystems/namecache.rst @@ -1,128 +0,0 @@ - -.. index:: - single: GNS; name cache - double: subsystem; NAMECACHE - -.. _GNS-Namecache: - -NAMECACHE — DHT caching of GNS results -====================================== - -The NAMECACHE subsystem is responsible for caching (encrypted) -resolution results of the GNU Name System (GNS). GNS makes zone -information available to other users via the DHT. However, as accessing -the DHT for every lookup is expensive (and as the DHT's local cache is -lost whenever the peer is restarted), GNS uses the NAMECACHE as a more -persistent cache for DHT lookups. Thus, instead of always looking up -every name in the DHT, GNS first checks if the result is already -available locally in the NAMECACHE. Only if there is no result in the -NAMECACHE, GNS queries the DHT. The NAMECACHE stores data in the same -(encrypted) format as the DHT. It thus makes no sense to iterate over -all items in the NAMECACHE – the NAMECACHE does not have a way to -provide the keys required to decrypt the entries. - -Blocks in the NAMECACHE share the same expiration mechanism as blocks in -the DHT – the block expires wheneever any of the records in the -(encrypted) block expires. The expiration time of the block is the only -information stored in plaintext. The NAMECACHE service internally -performs all of the required work to expire blocks, clients do not have -to worry about this. Also, given that NAMECACHE stores only GNS blocks -that local users requested, there is no configuration option to limit -the size of the NAMECACHE. It is assumed to be always small enough (a -few MB) to fit on the drive. - -The NAMECACHE supports the use of different database backends via a -plugin API. - -:index:`libgnunetnamecache <single: libgnunet; namecache>` -libgnunetnamecache ------------------- - -The NAMECACHE API consists of five simple functions. First, there is -``GNUNET_NAMECACHE_connect`` to connect to the NAMECACHE service. This -returns the handle required for all other operations on the NAMECACHE. -Using ``GNUNET_NAMECACHE_block_cache`` clients can insert a block into -the cache. ``GNUNET_NAMECACHE_lookup_block`` can be used to lookup -blocks that were stored in the NAMECACHE. Both operations can be -canceled using ``GNUNET_NAMECACHE_cancel``. Note that canceling a -``GNUNET_NAMECACHE_block_cache`` operation can result in the block being -stored in the NAMECACHE --- or not. 
Cancellation primarily ensures that -the continuation function with the result of the operation will no -longer be invoked. Finally, ``GNUNET_NAMECACHE_disconnect`` closes the -connection to the NAMECACHE. - -The maximum size of a block that can be stored in the NAMECACHE is -``GNUNET_NAMECACHE_MAX_VALUE_SIZE``, which is defined to be 63 kB. - -.. _The-NAMECACHE-Client_002dService-Protocol: - -The NAMECACHE Client-Service Protocol -------------------------------------- - -All messages in the NAMECACHE IPC protocol start with the -``struct GNUNET_NAMECACHE_Header`` which adds a request ID (32-bit -integer) to the standard message header. The request ID is used to match -requests with the respective responses from the NAMECACHE, as they are -allowed to happen out-of-order. - -.. _Lookup: - -Lookup -^^^^^^ - -The ``struct LookupBlockMessage`` is used to lookup a block stored in -the cache. It contains the query hash. The NAMECACHE always responds -with a ``struct LookupBlockResponseMessage``. If the NAMECACHE has no -response, it sets the expiration time in the response to zero. -Otherwise, the response is expected to contain the expiration time, the -ECDSA signature, the derived key and the (variable-size) encrypted data -of the block. - -.. _Store: - -Store -^^^^^ - -The ``struct BlockCacheMessage`` is used to cache a block in the -NAMECACHE. It has the same structure as the -``struct LookupBlockResponseMessage``. The service responds with a -``struct BlockCacheResponseMessage`` which contains the result of the -operation (success or failure). In the future, we might want to make it -possible to provide an error message as well. - -.. _The-NAMECACHE-Plugin-API: - -The NAMECACHE Plugin API ------------------------- - -The NAMECACHE plugin API consists of two functions, ``cache_block`` to -store a block in the database, and ``lookup_block`` to lookup a block in -the database. - -.. _Lookup2: - -Lookup2 -^^^^^^^ - -The ``lookup_block`` function is expected to return at most one block to -the iterator, and return ``GNUNET_NO`` if there were no non-expired -results. If there are multiple non-expired results in the cache, the -lookup is supposed to return the result with the largest expiration -time. - -.. _Store2: - -Store2 -^^^^^^ - -The ``cache_block`` function is expected to try to store the block in -the database, and return ``GNUNET_SYSERR`` if this was not possible for -any reason. Furthermore, ``cache_block`` is expected to implicitly -perform cache maintenance and purge blocks from the cache that have -expired. Note that ``cache_block`` might encounter the case where the -database already has another block stored under the same key. In this -case, the plugin must ensure that the block with the larger expiration -time is preserved. Obviously, this can done either by simply adding new -blocks and selecting for the most recent expiration time during lookup, -or by checking which block is more recent during the store operation. - diff --git a/subsystems/namestore.rst b/subsystems/namestore.rst @@ -1,168 +0,0 @@ - -.. index:: - double: subsystem; NAMESTORE - -.. _NAMESTORE-Subsystem: - -NAMESTORE — Storage of local GNS zones -====================================== - -The NAMESTORE subsystem provides persistent storage for local GNS zone -information. All local GNS zone information are managed by NAMESTORE. It -provides both the functionality to administer local GNS information -(e.g. delete and add records) as well as to retrieve GNS information -(e.g to list name information in a client). 
NAMESTORE only manages
-the persistent storage of zone information belonging to the user running
-the service: GNS information from other users obtained from the DHT is
-stored by the NAMECACHE subsystem.
-
-NAMESTORE uses a plugin-based database backend to store GNS information
-with good performance. Currently, sqlite and PostgreSQL are supported as
-database backends. NAMESTORE clients interact with the IDENTITY
-subsystem to obtain cryptographic information about zones based on egos
-as described in the IDENTITY subsystem, but internally NAMESTORE
-refers to zones using the respective private key.
-
-NAMESTORE is queried and monitored by the ZONEMASTER service which
-periodically publishes public records of GNS zones. ZONEMASTER also
-collaborates with the NAMECACHE subsystem and stores zone information
-in the NAMECACHE cache when local information is modified, to increase
-look-up performance for local information and to enable local access to
-private records in zones through GNS.
-
-NAMESTORE provides functionality to look up and store records, to
-iterate over a specific or all zones and to monitor zones for changes.
-NAMESTORE functionality can be accessed using the NAMESTORE C API, the
-NAMESTORE REST API, or the NAMESTORE command line tool.
-
-:index:`libgnunetnamestore <single: libgnunet; namestore>`
-libgnunetnamestore
-------------------
-
-To interact with NAMESTORE, clients first connect to the NAMESTORE
-service using ``GNUNET_NAMESTORE_connect``, passing a configuration
-handle. As a result they obtain a NAMESTORE handle which they can use
-for operations, or NULL if the connection failed.
-
-To disconnect from NAMESTORE, clients use
-``GNUNET_NAMESTORE_disconnect`` and specify the handle to disconnect.
-
-NAMESTORE internally uses the private key to refer to zones. These
-private keys can be obtained from the IDENTITY subsystem. Here *egos*
-can be used to refer to zones, or the default ego assigned to the GNS
-subsystem can be used to obtain the master zone's private key.
-
-.. _Editing-Zone-Information:
-
-Editing Zone Information
-^^^^^^^^^^^^^^^^^^^^^^^^
-
-NAMESTORE provides functions to look up records stored under a label in
-a zone and to store records under a label in a zone.
-
-To store (and delete) records, the client uses the
-``GNUNET_NAMESTORE_records_store`` function and has to provide the
-namestore handle to use, the private key of the zone, the label to store
-the records under, the records and the number of records, plus a
-callback function. After the operation is performed NAMESTORE will call
-the provided callback function with the result: GNUNET_SYSERR on failure
-(including timeout/queue drop/failure to validate), GNUNET_NO if the
-content was already there, or GNUNET_YES (or another positive value) on
-success, plus an additional error message.
-
-In addition, ``GNUNET_NAMESTORE_records_store2`` can be used to store
-multiple record sets using a single API call. This allows the caller to
-import a large number of (label, records) tuples in a single database
-transaction. This is useful for large zone imports.
-
-Records are deleted by using the store command with 0 records to store.
-It is important to note that records are not merged when records already
-exist under the label. So a client first has to retrieve the records,
-merge them with the existing records and then store the result.
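-
-The following is a minimal sketch of such a store operation. The exact
-types and parameter order are assumptions here (they have changed
-between GNUnet releases); consult ``gnunet_namestore_service.h`` for the
-authoritative signatures::
-
-   /* Sketch: store a single TXT record under the label "www".
-    * Assumes `ns` was obtained via GNUNET_NAMESTORE_connect () and
-    * `zone_key` is the zone's private key obtained via IDENTITY. */
-   static void
-   store_done (void *cls, int32_t result, const char *emsg)
-   {
-     /* result: GNUNET_YES on success, GNUNET_NO if the content was
-      * already there, GNUNET_SYSERR on failure (emsg may give details) */
-   }
-
-   static void
-   store_example (struct GNUNET_NAMESTORE_Handle *ns,
-                  const struct GNUNET_IDENTITY_PrivateKey *zone_key)
-   {
-     struct GNUNET_GNSRECORD_Data rd = { 0 };
-
-     rd.data = "hello";                           /* record payload */
-     rd.data_size = strlen ("hello");
-     rd.record_type = GNUNET_DNSPARSER_TYPE_TXT;
-     rd.expiration_time =
-       GNUNET_TIME_relative_to_absolute (GNUNET_TIME_UNIT_HOURS).abs_value_us;
-     GNUNET_NAMESTORE_records_store (ns, zone_key, "www",
-                                     1, &rd,      /* number of records, records */
-                                     &store_done, NULL);
-   }
-
-Remember that storing under an existing label replaces the stored record
-set, so the lookup/merge/store pattern described above applies when
-adding records.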
-
-To perform a lookup operation, the client uses the
-``GNUNET_NAMESTORE_records_lookup`` function. Here it has to pass the
-namestore handle, the private key of the zone and the label. It also has
-to provide a callback function which will be called with the result of
-the lookup operation: the zone of the records, the label, and the
-records including their number.
-
-A special operation is used to set the preferred nickname for a zone.
-This nickname is stored with the zone and is automatically merged with
-all labels and records stored in a zone. Here the client uses the
-``GNUNET_NAMESTORE_set_nick`` function and passes the private key of the
-zone, the nickname as a string, plus a callback to receive the result of
-the operation.
-
-.. _Transactional-Namestore-API:
-
-Transactional operations
-^^^^^^^^^^^^^^^^^^^^^^^^
-
-By default, all API calls are mapped to implicit single transactions in
-the database backends.
-This happens automatically, as most databases support implicit
-transactions, including the databases supported by NAMESTORE.
-
-However, implicit transactions have two drawbacks:
-
-  1. When storing or deleting a lot of records, individual transactions
-     cause significant overhead in the database.
-  2. Storage and deletion of records by multiple clients concurrently
-     can lead to inconsistencies.
-
-This is why NAMESTORE supports explicit transactions in order to
-efficiently handle large amounts of zone data as well as keep the
-NAMESTORE consistent when the client deems this necessary.
-
-When the client wants to start a transaction,
-``GNUNET_NAMESTORE_transaction_begin`` is called.
-After this call, ``GNUNET_NAMESTORE_records_lookup`` or
-``GNUNET_NAMESTORE_records_store(2)`` can be called successively.
-The operations will only be committed to the database (and monitors such
-as ZONEMASTER notified of the changes) when
-``GNUNET_NAMESTORE_transaction_commit`` is used to finalize the
-transaction.
-Alternatively, the transaction can be aborted using
-``GNUNET_NAMESTORE_transaction_rollback``.
-Should the client disconnect after calling
-``GNUNET_NAMESTORE_transaction_begin``, any running transaction will
-automatically be rolled back.
-
-
-
-.. _Iterating-Zone-Information:
-
-Iterating Zone Information
-^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-A client can iterate over all information in a zone or all zones managed
-by NAMESTORE. Here a client uses one of the
-``GNUNET_NAMESTORE_zone_iteration_start(2)`` functions and passes the
-namestore handle, the zone to iterate over and a callback function to
-call with the result. To iterate over all the zones, it is possible to
-pass NULL for the zone. A ``GNUNET_NAMESTORE_ZoneIterator`` handle is
-returned to be used to continue iteration.
-
-NAMESTORE calls the callback for every result and expects the client to
-call ``GNUNET_NAMESTORE_zone_iterator_next`` to continue to iterate or
-``GNUNET_NAMESTORE_zone_iterator_stop`` to interrupt the iteration. When
-NAMESTORE has reached the last item, it will call the callback with a
-NULL value to indicate the end of the iteration.
-
-.. _Monitoring-Zone-Information:
-
-Monitoring Zone Information
-^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-Clients can also monitor zones to be notified about changes. Here the
-client uses one of the ``GNUNET_NAMESTORE_zone_monitor_start(2)``
-functions and passes the private key of the zone and a callback function
-to call with updates for the zone. The client can request to first
-obtain the existing zone information by iterating over the zone, and can
-specify a synchronization callback to be called when the client and the
-namestore are synced. A rough sketch of such a monitor setup follows.
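-
-The exact parameter list of ``GNUNET_NAMESTORE_zone_monitor_start`` is
-an assumption in the sketch below; check ``gnunet_namestore_service.h``
-for the real signature::
-
-   /* Sketch: watch a zone and learn about every record change. */
-   static void
-   on_change (void *cls,
-              const struct GNUNET_IDENTITY_PrivateKey *zone,
-              const char *label,
-              unsigned int rd_count,
-              const struct GNUNET_GNSRECORD_Data *rd)
-   {
-     /* called for the initial iteration (if requested) and for
-      * every later change to the zone */
-   }
-
-   static void
-   on_sync (void *cls)
-   {
-     /* client and namestore are now in sync */
-   }
-
-   static void
-   start_monitor (const struct GNUNET_CONFIGURATION_Handle *cfg,
-                  const struct GNUNET_IDENTITY_PrivateKey *zone_key)
-   {
-     struct GNUNET_NAMESTORE_ZoneMonitor *zm;
-
-     zm = GNUNET_NAMESTORE_zone_monitor_start (cfg, zone_key,
-                                               GNUNET_YES,  /* iterate first */
-                                               NULL, NULL,  /* error callback */
-                                               &on_change, NULL,
-                                               &on_sync, NULL);
-     /* ... later: GNUNET_NAMESTORE_zone_monitor_stop (zm); */
-   }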
- -On an update, NAMESTORE will call the callback with the private key of -the zone, the label and the records and their number. - -To stop monitoring, the client calls -``GNUNET_NAMESTORE_zone_monitor_stop`` and passes the handle obtained -from the function to start the monitoring. diff --git a/subsystems/nse.rst b/subsystems/nse.rst @@ -1,315 +0,0 @@ -.. index:: - single: subsystem; Network size estimation - see: NSE; Network size estimation - -.. _NSE-Subsystem: - -NSE — Network size estimation -============================= - -NSE stands for Network Size Estimation. The NSE subsystem provides other -subsystems and users with a rough estimate of the number of peers -currently participating in the GNUnet overlay. The computed value is not -a precise number as producing a precise number in a decentralized, -efficient and secure way is impossible. While NSE's estimate is -inherently imprecise, NSE also gives the expected range. For a peer that -has been running in a stable network for a while, the real network size -will typically (99.7% of the time) be in the range of [2/3 estimate, 3/2 -estimate]. We will now give an overview of the algorithm used to -calculate the estimate; all of the details can be found in this -technical report. - -.. todo:: link to the report. - -.. _Motivation: - -Motivation ----------- - -Some subsystems, like DHT, need to know the size of the GNUnet network -to optimize some parameters of their own protocol. The decentralized -nature of GNUnet makes efficient and securely counting the exact number -of peers infeasible. Although there are several decentralized algorithms -to count the number of peers in a system, so far there is none to do so -securely. Other protocols may allow any malicious peer to manipulate the -final result or to take advantage of the system to perform Denial of -Service (DoS) attacks against the network. GNUnet's NSE protocol avoids -these drawbacks. - -NSE security -.. _Security: - -:index:`Security <single: NSE; security>` -Security -^^^^^^^^ - -The NSE subsystem is designed to be resilient against these attacks. It -uses `proofs of -work <http://en.wikipedia.org/wiki/Proof-of-work_system>`__ to prevent -one peer from impersonating a large number of participants, which would -otherwise allow an adversary to artificially inflate the estimate. The -DoS protection comes from the time-based nature of the protocol: the -estimates are calculated periodically and out-of-time traffic is either -ignored or stored for later retransmission by benign peers. In -particular, peers cannot trigger global network communication at will. - -.. _Principle: - -:index:`Principle <single: NSE; principle of operation>` -Principle ---------- - -The algorithm calculates the estimate by finding the globally closest -peer ID to a random, time-based value. - -The idea is that the closer the ID is to the random value, the more -\"densely packed\" the ID space is, and therefore, more peers are in the -network. - -.. _Example: - -Example -^^^^^^^ - -Suppose all peers have IDs between 0 and 100 (our ID space), and the -random value is 42. If the closest peer has the ID 70 we can imagine -that the average \"distance\" between peers is around 30 and therefore -the are around 3 peers in the whole ID space. On the other hand, if the -closest peer has the ID 44, we can imagine that the space is rather -packed with peers, maybe as much as 50 of them. Naturally, we could have -been rather unlucky, and there is only one peer and happens to have the -ID 44. 
Thus, the current estimate is calculated as the average over -multiple rounds, and not just a single sample. - -.. _Algorithm: - -Algorithm -^^^^^^^^^ - -Given that example, one can imagine that the job of the subsystem is to -efficiently communicate the ID of the closest peer to the target value -to all the other peers, who will calculate the estimate from it. - -.. _Target-value: - -Target value -^^^^^^^^^^^^ - -The target value itself is generated by hashing the current time, -rounded down to an agreed value. If the rounding amount is 1h (default) -and the time is 12:34:56, the time to hash would be 12:00:00. The -process is repeated each rounding amount (in this example would be every -hour). Every repetition is called a round. - -.. _Timing: - -Timing -^^^^^^ - -The NSE subsystem has some timing control to avoid everybody -broadcasting its ID all at one. Once each peer has the target random -value, it compares its own ID to the target and calculates the -hypothetical size of the network if that peer were to be the closest. -Then it compares the hypothetical size with the estimate from the -previous rounds. For each value there is an associated point in the -period, let's call it \"broadcast time\". If its own hypothetical -estimate is the same as the previous global estimate, its \"broadcast -time\" will be in the middle of the round. If its bigger it will be -earlier and if its smaller (the most likely case) it will be later. This -ensures that the peers closest to the target value start broadcasting -their ID the first. - -.. _Controlled-Flooding: - -Controlled Flooding -^^^^^^^^^^^^^^^^^^^ - -When a peer receives a value, first it verifies that it is closer than -the closest value it had so far, otherwise it answers the incoming -message with a message containing the better value. Then it checks a -proof of work that must be included in the incoming message, to ensure -that the other peer's ID is not made up (otherwise a malicious peer -could claim to have an ID of exactly the target value every round). Once -validated, it compares the broadcast time of the received value with the -current time and if it's not too early, sends the received value to its -neighbors. Otherwise it stores the value until the correct broadcast -time comes. This prevents unnecessary traffic of sub-optimal values, -since a better value can come before the broadcast time, rendering the -previous one obsolete and saving the traffic that would have been used -to broadcast it to the neighbors. - -.. _Calculating-the-estimate: - -Calculating the estimate -^^^^^^^^^^^^^^^^^^^^^^^^ - -Once the closest ID has been spread across the network each peer gets -the exact distance between this ID and the target value of the round and -calculates the estimate with a mathematical formula described in the -tech report. The estimate generated with this method for a single round -is not very precise. Remember the case of the example, where the only -peer is the ID 44 and we happen to generate the target value 42, -thinking there are 50 peers in the network. Therefore, the NSE subsystem -remembers the last 64 estimates and calculates an average over them, -giving a result of which usually has one bit of uncertainty (the real -size could be half of the estimate or twice as much). Note that the -actual network size is calculated in powers of two of the raw input, -thus one bit of uncertainty means a factor of two in the size estimate. 
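-
-To make the \"powers of two\" arithmetic concrete, here is a small,
-purely illustrative helper (plain C, not part of the NSE API) that turns
-a logarithmic average and standard deviation, as delivered by the NSE
-callback described below, into an absolute size and a 95% range::
-
-   #include <math.h>
-   #include <stdio.h>
-
-   /* `average` and `std_dev` are base-2 logarithmic values. */
-   static void
-   print_size_estimate (double average, double std_dev)
-   {
-     double size = pow (2.0, average);
-     double low = pow (2.0, average - 2.0 * std_dev);   /* ~95% interval */
-     double high = pow (2.0, average + 2.0 * std_dev);
-
-     printf ("network size ~%.0f peers (95%% range: %.0f ... %.0f)\n",
-             size, low, high);
-   }
-
-   /* print_size_estimate (10.0, 1.0) -> ~1024 peers, range ~256 ... ~4096
-      print_size_estimate (22.0, 0.2) -> ~4.2M peers, range ~3.2M ... ~5.5M */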
-
-:index:`libgnunetnse <single: libgnunet; nse>`
-libgnunetnse
-------------
-
-The NSE subsystem has the simplest API of all services, with only two
-calls: ``GNUNET_NSE_connect`` and ``GNUNET_NSE_disconnect``.
-
-The connect call gets a callback function as a parameter and this
-function is called each time the network agrees on an estimate. This
-usually is once per round, with some exceptions: if the closest peer has
-a late local clock and starts spreading its ID after everyone else
-agreed on a value, the callback might be activated twice in a round, the
-second value being always bigger than the first. The default round time
-is set to 1 hour.
-
-The disconnect call disconnects from the NSE subsystem and the callback
-is no longer called with new estimates.
-
-.. _Results:
-
-Results
-^^^^^^^
-
-The callback provides two values: the average and the `standard
-deviation <http://en.wikipedia.org/wiki/Standard_deviation>`__ of the
-last 64 rounds. The values provided by the callback function are
-logarithmic; this means that the real estimate can be obtained by
-calculating 2 to the power of the given value (2^average). From a
-statistics point of view this means that:
-
-- 68% of the time the real size is included in the interval
-  [2^(average-stddev), 2^(average+stddev)]
-
-- 95% of the time the real size is included in the interval
-  [2^(average-2*stddev), 2^(average+2*stddev)]
-
-- 99.7% of the time the real size is included in the interval
-  [2^(average-3*stddev), 2^(average+3*stddev)]
-
-The expected standard deviation for 64 rounds in a network of stable
-size is 0.2. Thus, we can say that normally:
-
-- 68% of the time the real size is in the range [-13%, +15%]
-
-- 95% of the time the real size is in the range [-24%, +32%]
-
-- 99.7% of the time the real size is in the range [-34%, +52%]
-
-As said in the introduction, we can be quite sure that usually the real
-size is between one third and three times the estimate. This can of
-course vary with network conditions. Thus, applications may want to also
-consider the provided standard deviation value, not only the average (in
-particular, if the standard deviation is very high, the average may be
-meaningless: the network size is changing rapidly).
-
-.. _libgnunetnse-_002d-Examples:
-
-Examples
-^^^^^^^^
-
-Let's close with a couple of examples.
-
-Average: 10, std dev: 1
-   Here the estimate would be 2^10 = 1024 peers. (The range in which we
-   can be 95% sure is: [2^8, 2^12] = [256, 4096]. We can be very
-   (>99.7%) sure that the network is not a hundred peers and absolutely
-   sure that it is not a million peers, but somewhere around a
-   thousand.)
-
-Average: 22, std dev: 0.2
-   Here the estimate would be 2^22 = 4 million peers. (The range in
-   which we can be 99.7% sure is: [2^21.4, 2^22.6] = [2.8M, 6.3M]. We
-   can be sure that the network size is around four million, with
-   absolutely no way of it being 1 million.)
-
-To put this in perspective: the LHC Higgs boson results were announced
-with \"5 sigma\" and \"6 sigma\" certainties. In this case a 5 sigma
-minimum would be 2 million and a 6 sigma minimum 1.8 million.
-
-.. _The-NSE-Client_002dService-Protocol:
-
-The NSE Client-Service Protocol
--------------------------------
-
-As with the API, the client-service protocol is very simple; it has only
-2 different messages, defined in ``src/nse/nse.h``:
-
-- ``GNUNET_MESSAGE_TYPE_NSE_START`` This message has no parameters and
-  is sent from the client to the service upon connection.
- -- ``GNUNET_MESSAGE_TYPE_NSE_ESTIMATE`` This message is sent from the - service to the client for every new estimate and upon connection. - Contains a timestamp for the estimate, the average and the standard - deviation for the respective round. - -When the ``GNUNET_NSE_disconnect`` API call is executed, the client -simply disconnects from the service, with no message involved. - -NSE Peer-to-Peer Protocol -.. _The-NSE-Peer_002dto_002dPeer-Protocol: - -The NSE Peer-to-Peer Protocol ------------------------------ - -GNUNET_MESSAGE_TYPE_NSE_P2P_FLOOD -The NSE subsystem only has one message in the P2P protocol, the -``GNUNET_MESSAGE_TYPE_NSE_P2P_FLOOD`` message. - -This message key contents are the timestamp to identify the round -(differences in system clocks may cause some peers to send messages way -too early or way too late, so the timestamp allows other peers to -identify such messages easily), the `proof of -work <http://en.wikipedia.org/wiki/Proof-of-work_system>`__ used to make -it difficult to mount a `Sybil -attack <http://en.wikipedia.org/wiki/Sybil_attack>`__, and the public -key, which is used to verify the signature on the message. - -Every peer stores a message for the previous, current and next round. -The messages for the previous and current round are given to peers that -connect to us. The message for the next round is simply stored until our -system clock advances to the next round. The message for the current -round is what we are flooding the network with right now. At the -beginning of each round the peer does the following: - -- calculates its own distance to the target value - -- creates, signs and stores the message for the current round (unless - it has a better message in the \"next round\" slot which came early - in the previous round) - -- calculates, based on the stored round message (own or received) when - to start flooding it to its neighbors - -Upon receiving a message the peer checks the validity of the message -(round, proof of work, signature). The next action depends on the -contents of the incoming message: - -- if the message is worse than the current stored message, the peer - sends the current message back immediately, to stop the other peer - from spreading suboptimal results - -- if the message is better than the current stored message, the peer - stores the new message and calculates the new target time to start - spreading it to its neighbors (excluding the one the message came - from) - -- if the message is for the previous round, it is compared to the - message stored in the \"previous round slot\", which may then be - updated - -- if the message is for the next round, it is compared to the message - stored in the \"next round slot\", which again may then be updated - -Finally, when it comes to send the stored message for the current round -to the neighbors there is a random delay added for each neighbor, to -avoid traffic spikes and minimize cross-messages. - - diff --git a/subsystems/peerinfo.rst b/subsystems/peerinfo.rst @@ -1,217 +0,0 @@ - -.. index:: - double: subsystem; PEERINFO - -.. _PEERINFO-Subsystem: - -PEERINFO — Persistent HELLO storage -=================================== - -The PEERINFO subsystem is used to store verified (validated) information -about known peers in a persistent way. It obtains these addresses for -example from TRANSPORT service which is in charge of address validation. 
-Validation means that the information in the HELLO message is checked
-by connecting to the addresses and performing a cryptographic handshake
-to authenticate the peer instance claiming to be reachable at these
-addresses. Peerinfo does not validate the HELLO messages itself but only
-stores them and gives them to interested clients.
-
-As future work, we think about moving from storing just HELLO messages
-to providing a generic persistent per-peer information store. More and
-more subsystems tend to need to store per-peer information in a
-persistent way. To avoid duplicating this functionality we plan to
-provide a PEERSTORE service for this purpose.
-
-.. _PEERINFO-_002d-Features:
-
-PEERINFO - Features
--------------------
-
-- Persistent storage
-
-- Client notification mechanism on update
-
-- Periodic clean up for expired information
-
-- Differentiation between public and friend-only HELLO
-
-.. _PEERINFO-_002d-Limitations:
-
-PEERINFO - Limitations
-----------------------
-
-- Does not perform HELLO validation
-
-.. _DeveloperPeer-Information:
-
-Peer Information
-----------------
-
-The PEERINFO subsystem stores this information in the form of HELLO
-messages, which you can think of as business cards. These HELLO messages
-contain the public key of a peer and the addresses a peer can be reached
-under. The addresses include an expiration date describing how long they
-are valid. This information is updated regularly by the TRANSPORT
-service by revalidating the address. If an address is expired and not
-renewed, it can be removed from the HELLO message.
-
-Some peers do not want to have their HELLO messages distributed to other
-peers, especially when GNUnet's friend-to-friend mode is enabled. To
-prevent this undesired distribution, PEERINFO distinguishes between
-*public* and *friend-only* HELLO messages. Public HELLO messages can be
-freely distributed to other (possibly unknown) peers (for example using
-the hostlist, gossiping, broadcasting), whereas friend-only HELLO
-messages may not be distributed to other peers. Friend-only HELLO
-messages have an additional flag ``friend_only`` set internally. For
-public HELLO messages this flag is not set. PEERINFO does not and cannot
-check whether a client is allowed to obtain a specific HELLO type.
-
-The HELLO messages can be managed using the GNUnet HELLO library. Other
-GNUnet subsystems can obtain this information from PEERINFO and use it
-for their purposes. Clients are, for example, the HOSTLIST component,
-which provides this information to other peers in the form of a
-hostlist, or the TRANSPORT subsystem, which uses this information to
-maintain connections to other peers.
-
-.. _Startup:
-
-Startup
--------
-
-During startup the PEERINFO service loads persistent HELLOs from disk.
-First PEERINFO parses the directory configured in the HOSTS value of the
-``PEERINFO`` configuration section, which is used to store PEERINFO
-information. For all files found in this directory valid HELLO messages
-are extracted. In addition it loads HELLO messages shipped with the
-GNUnet distribution. These HELLOs are used to simplify network
-bootstrapping by providing valid peer information with the distribution.
-The use of these HELLOs can be prevented by setting
-``USE_INCLUDED_HELLOS`` in the ``PEERINFO`` configuration section to
-``NO``. Files containing invalid information are removed.
-
-..
_Managing-Information: - -Managing Information --------------------- - -The PEERINFO services stores information about known PEERS and a single -HELLO message for every peer. A peer does not need to have a HELLO if no -information are available. HELLO information from different sources, for -example a HELLO obtained from a remote HOSTLIST and a second HELLO -stored on disk, are combined and merged into one single HELLO message -per peer which will be given to clients. During this merge process the -HELLO is immediately written to disk to ensure persistence. - -PEERINFO in addition periodically scans the directory where information -are stored for empty HELLO messages with expired TRANSPORT addresses. -This periodic task scans all files in the directory and recreates the -HELLO messages it finds. Expired TRANSPORT addresses are removed from -the HELLO and if the HELLO does not contain any valid addresses, it is -discarded and removed from the disk. - -.. _Obtaining-Information: - -Obtaining Information ---------------------- - -When a client requests information from PEERINFO, PEERINFO performs a -lookup for the respective peer or all peers if desired and transmits -this information to the client. The client can specify if friend-only -HELLOs have to be included or not and PEERINFO filters the respective -HELLO messages before transmitting information. - -To notify clients about changes to PEERINFO information, PEERINFO -maintains a list of clients interested in this notifications. Such a -notification occurs if a HELLO for a peer was updated (due to a merge -for example) or a new peer was added. - -.. _The-PEERINFO-Client_002dService-Protocol: - -The PEERINFO Client-Service Protocol ------------------------------------- - -To connect and disconnect to and from the PEERINFO Service PEERINFO -utilizes the util client/server infrastructure, so no special messages -types are used here. - -To add information for a peer, the plain HELLO message is transmitted to -the service without any wrapping. All pieces of information required are -stored within the HELLO message. The PEERINFO service provides a message -handler accepting and processing these HELLO messages. - -When obtaining PEERINFO information using the iterate functionality -specific messages are used. To obtain information for all peers, a -``struct ListAllPeersMessage`` with message type -``GNUNET_MESSAGE_TYPE_PEERINFO_GET_ALL`` and a flag include_friend_only -to indicate if friend-only HELLO messages should be included are -transmitted. If information for a specific peer is required a -``struct ListAllPeersMessage`` with ``GNUNET_MESSAGE_TYPE_PEERINFO_GET`` -containing the peer identity is used. - -For both variants the PEERINFO service replies for each HELLO message it -wants to transmit with a ``struct ListAllPeersMessage`` with type -``GNUNET_MESSAGE_TYPE_PEERINFO_INFO`` containing the plain HELLO. The -final message is ``struct GNUNET_MessageHeader`` with type -``GNUNET_MESSAGE_TYPE_PEERINFO_INFO``. If the client receives this -message, it can proceed with the next request if any is pending. - -:index:`libgnunetpeerinfo <single: libgnunet; peerinfo>` -libgnunetpeerinfo ------------------ - -The PEERINFO API consists mainly of three different functionalities: - -- maintaining a connection to the service - -- adding new information to the PEERINFO service - -- retrieving information from the PEERINFO service - -.. 
_Connecting-to-the-PEERINFO-Service: - -Connecting to the PEERINFO Service -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -To connect to the PEERINFO service the function -``GNUNET_PEERINFO_connect`` is used, taking a configuration handle as an -argument, and to disconnect from PEERINFO the function -``GNUNET_PEERINFO_disconnect``, taking the PEERINFO handle returned from -the connect function has to be called. - -.. _Adding-Information-to-the-PEERINFO-Service: - -Adding Information to the PEERINFO Service -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -``GNUNET_PEERINFO_add_peer`` adds a new peer to the PEERINFO subsystem -storage. This function takes the PEERINFO handle as an argument, the -HELLO message to store and a continuation with a closure to be called -with the result of the operation. The ``GNUNET_PEERINFO_add_peer`` -returns a handle to this operation allowing to cancel the operation with -the respective cancel function ``GNUNET_PEERINFO_add_peer_cancel``. To -retrieve information from PEERINFO you can iterate over all information -stored with PEERINFO or you can tell PEERINFO to notify if new peer -information are available. - -.. _Obtaining-Information-from-the-PEERINFO-Service: - -Obtaining Information from the PEERINFO Service -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -To iterate over information in PEERINFO you use -``GNUNET_PEERINFO_iterate``. This function expects the PEERINFO handle, -a flag if HELLO messages intended for friend only mode should be -included, a timeout how long the operation should take and a callback -with a callback closure to be called for the results. If you want to -obtain information for a specific peer, you can specify the peer -identity, if this identity is NULL, information for all peers are -returned. The function returns a handle to allow to cancel the operation -using ``GNUNET_PEERINFO_iterate_cancel``. - -To get notified when peer information changes, you can use -``GNUNET_PEERINFO_notify``. This function expects a configuration handle -and a flag if friend-only HELLO messages should be included. The -PEERINFO service will notify you about every change and the callback -function will be called to notify you about changes. The function -returns a handle to cancel notifications with -``GNUNET_PEERINFO_notify_cancel``. diff --git a/subsystems/peerstore.rst b/subsystems/peerstore.rst @@ -1,110 +0,0 @@ - -.. index:: - double: subsystem; PEERSTORE - -.. _PEERSTORE-Subsystem: - -PEERSTORE — Extensible local persistent data storage -==================================================== - -GNUnet's PEERSTORE subsystem offers persistent per-peer storage for -other GNUnet subsystems. GNUnet subsystems can use PEERSTORE to -persistently store and retrieve arbitrary data. Each data record stored -with PEERSTORE contains the following fields: - -- subsystem: Name of the subsystem responsible for the record. - -- peerid: Identity of the peer this record is related to. - -- key: a key string identifying the record. - -- value: binary record value. - -- expiry: record expiry date. - -.. _Functionality: - -Functionality -------------- - -Subsystems can store any type of value under a (subsystem, peerid, key) -combination. A \"replace\" flag set during store operations forces the -PEERSTORE to replace any old values stored under the same (subsystem, -peerid, key) combination with the new value. Additionally, an expiry -date is set after which the record is \*possibly\* deleted by PEERSTORE. 
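-
-Jumping ahead to the client API described under libgnunetpeerstore
-below, a store operation for such a record could look roughly as
-follows. The subsystem name and key are made up for the example, and the
-exact parameter list of ``GNUNET_PEERSTORE_store`` is an assumption;
-consult ``gnunet_peerstore_service.h`` for the authoritative version::
-
-   /* Sketch: store a value under ("my-subsystem", peer, "my-key").
-    * Assumes `h` comes from GNUNET_PEERSTORE_connect (cfg). */
-   static void
-   store_cont (void *cls, int success)
-   {
-     /* the STORE request was sent to the service; this does not yet
-      * guarantee that the record was stored */
-   }
-
-   static void
-   store_example (struct GNUNET_PEERSTORE_Handle *h,
-                  const struct GNUNET_PeerIdentity *peer)
-   {
-     const char value[] = "some value";
-
-     GNUNET_PEERSTORE_store (h,
-                             "my-subsystem",          /* subsystem */
-                             peer,                    /* peer identity */
-                             "my-key",                /* key */
-                             value, sizeof (value),   /* value, size */
-                             GNUNET_TIME_UNIT_FOREVER_ABS,  /* expiry */
-                             GNUNET_PEERSTORE_STOREOPTION_REPLACE,
-                             &store_cont, NULL);
-   }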
- -Subsystems can iterate over all values stored under any of the following -combination of fields: - -- (subsystem) - -- (subsystem, peerid) - -- (subsystem, key) - -- (subsystem, peerid, key) - -Subsystems can also request to be notified about any new values stored -under a (subsystem, peerid, key) combination by sending a \"watch\" -request to PEERSTORE. - -.. _Architecture: - -Architecture ------------- - -PEERSTORE implements the following components: - -- PEERSTORE service: Handles store, iterate and watch operations. - -- PEERSTORE API: API to be used by other subsystems to communicate and - issue commands to the PEERSTORE service. - -- PEERSTORE plugins: Handles the persistent storage. At the moment, - only an \"sqlite\" plugin is implemented. - -:index:`libgnunetpeerstore <single: libgnunet; peerstore>` -libgnunetpeerstore ------------------- - -libgnunetpeerstore is the library containing the PEERSTORE API. -Subsystems wishing to communicate with the PEERSTORE service use this -API to open a connection to PEERSTORE. This is done by calling -``GNUNET_PEERSTORE_connect`` which returns a handle to the newly created -connection. This handle has to be used with any further calls to the -API. - -To store a new record, the function ``GNUNET_PEERSTORE_store`` is to be -used which requires the record fields and a continuation function that -will be called by the API after the STORE request is sent to the -PEERSTORE service. Note that calling the continuation function does not -mean that the record is successfully stored, only that the STORE request -has been successfully sent to the PEERSTORE service. -``GNUNET_PEERSTORE_store_cancel`` can be called to cancel the STORE -request only before the continuation function has been called. - -To iterate over stored records, the function -``GNUNET_PEERSTORE_iterate`` is to be used. *peerid* and *key* can be -set to NULL. An iterator callback function will be called with each -matching record found and a NULL record at the end to signal the end of -result set. ``GNUNET_PEERSTORE_iterate_cancel`` can be used to cancel -the ITERATE request before the iterator callback is called with a NULL -record. - -To be notified with new values stored under a (subsystem, peerid, key) -combination, the function ``GNUNET_PEERSTORE_watch`` is to be used. This -will register the watcher with the PEERSTORE service, any new records -matching the given combination will trigger the callback function passed -to ``GNUNET_PEERSTORE_watch``. This continues until -``GNUNET_PEERSTORE_watch_cancel`` is called or the connection to the -service is destroyed. - -After the connection is no longer needed, the function -``GNUNET_PEERSTORE_disconnect`` can be called to disconnect from the -PEERSTORE service. Any pending ITERATE or WATCH requests will be -destroyed. If the ``sync_first`` flag is set to ``GNUNET_YES``, the API -will delay the disconnection until all pending STORE requests are sent -to the PEERSTORE service, otherwise, the pending STORE requests will be -destroyed as well. - - diff --git a/subsystems/regex.rst b/subsystems/regex.rst @@ -1,152 +0,0 @@ - -.. index:: - double: subsystem; REGEX - -.. _REGEX-Subsystem: - -REGEX — Service discovery using regular expressions -=================================================== - -Using the REGEX subsystem, you can discover peers that offer a -particular service using regular expressions. The peers that offer a -service specify it using a regular expressions. Peers that want to -patronize a service search using a string. 
The REGEX subsystem will then -use the DHT to return a set of matching offerers to the patrons. - -For the technical details, we have Max's defense talk and Max's Master's -thesis. - -.. note:: An additional publication is under preparation and available - to team members (in Git). - -.. todo:: Missing links to Max's talk and Master's thesis - -.. _How-to-run-the-regex-profiler: - -How to run the regex profiler ------------------------------ - -The gnunet-regex-profiler can be used to profile the usage of mesh/regex -for a given set of regular expressions and strings. Mesh/regex allows -you to announce your peer ID under a certain regex and search for peers -matching a particular regex using a string. See -`szengel2012ms <https://bib.gnunet.org/full/date.html#2012_5f2>`__ for a -full introduction. - -First of all, the regex profiler uses GNUnet testbed, thus all the -implications for testbed also apply to the regex profiler (for example -you need password-less ssh login to the machines listed in your hosts -file). - -**Configuration** - -Moreover, an appropriate configuration file is needed. In the following -paragraph the important details are highlighted. - -Announcing of the regular expressions is done by the -gnunet-daemon-regexprofiler, therefore you have to make sure it is -started, by adding it to the START_ON_DEMAND set of ARM: - -:: - - [regexprofiler] - START_ON_DEMAND = YES - -Furthermore you have to specify the location of the binary: - -:: - - [regexprofiler] - # Location of the gnunet-daemon-regexprofiler binary. - BINARY = /home/szengel/gnunet/src/mesh/.libs/gnunet-daemon-regexprofiler - # Regex prefix that will be applied to all regular expressions and - # search string. - REGEX_PREFIX = "GNVPN-0001-PAD" - -When running the profiler with a large scale deployment, you probably -want to reduce the workload of each peer. Use the following options to -do this. - -:: - - [dht] - # Force network size estimation - FORCE_NSE = 1 - - [dhtcache] - DATABASE = heap - # Disable RC-file for Bloom filter? (for benchmarking with limited IO - # availability) - DISABLE_BF_RC = YES - # Disable Bloom filter entirely - DISABLE_BF = YES - - [nse] - # Minimize proof-of-work CPU consumption by NSE - WORKBITS = 1 - -**Options** - -To finally run the profiler some options and the input data need to be -specified on the command line. - -:: - - gnunet-regex-profiler -c config-file -d log-file -n num-links \ - -p path-compression-length -s search-delay -t matching-timeout \ - -a num-search-strings hosts-file policy-dir search-strings-file - -Where\... - -- \... ``config-file`` means the configuration file created earlier. - -- \... ``log-file`` is the file where to write statistics output. - -- \... ``num-links`` indicates the number of random links between - started peers. - -- \... ``path-compression-length`` is the maximum path compression - length in the DFA. - -- \... ``search-delay`` time to wait between peers finished linking and - starting to match strings. - -- \... ``matching-timeout`` timeout after which to cancel the - searching. - -- \... ``num-search-strings`` number of strings in the - search-strings-file. - -- \... the ``hosts-file`` should contain a list of hosts for the - testbed, one per line in the following format: - - - ``user@host_ip:port`` - -- \... the ``policy-dir`` is a folder containing text files containing - one or more regular expressions. A peer is started for each file in - that folder and the regular expressions in the corresponding file are - announced by this peer. 
- -- \... the ``search-strings-file`` is a text file containing search - strings, one in each line. - -You can create regular expressions and search strings for every AS in -the Internet using the attached scripts. You need one of the `CAIDA -routeviews -prefix2as <http://data.caida.org/datasets/routing/routeviews-prefix2as/>`__ -data files for this. Run - -:: - - create_regex.py <filename> <output path> - -to create the regular expressions and - -:: - - create_strings.py <input path> <outfile> - -to create a search strings file from the previously created regular -expressions. - - diff --git a/subsystems/rest.rst b/subsystems/rest.rst @@ -1,53 +0,0 @@ - -.. index:: - double: subsystem; REST - -.. _REST-Subsystem: - -REST — RESTful GNUnet Web APIs -============================== - -.. todo:: Define REST - -Using the REST subsystem, you can expose REST-based APIs or services. -The REST service is designed as a pluggable architecture. To create a -new REST endpoint, simply add a library in the form "plugin_rest_*". The -REST service will automatically load all REST plugins on startup. - -**Configuration** - -The REST service can be configured in various ways. The reference config -file can be found in ``src/rest/rest.conf``: - -:: - - [rest] - REST_PORT=7776 - REST_ALLOW_HEADERS=Authorization,Accept,Content-Type - REST_ALLOW_ORIGIN=* - REST_ALLOW_CREDENTIALS=true - -The port as well as CORS (cross-origin resource sharing) headers -that are supposed to be advertised by the rest service are configurable. - -.. _Namespace-considerations: - -Namespace considerations ------------------------- - -The ``gnunet-rest-service`` will load all plugins that are installed. As -such it is important that the endpoint namespaces do not clash. - -For example, plugin X might expose the endpoint "/xxx" while plugin Y -exposes endpoint "/xxx/yyy". This is a problem if plugin X is also -supposed to handle a call to "/xxx/yyy". Currently the REST service will -not complain or warn about such clashes, so please make sure that -endpoints are unambiguous. - -.. _Endpoint-documentation: - -Endpoint documentation ----------------------- - -This is WIP. Endpoints should be documented appropriately. Preferably -using annotations. diff --git a/subsystems/revocation.rst b/subsystems/revocation.rst @@ -1,173 +0,0 @@ - -.. index:: - double: subsystem; REVOCATION - -.. _REVOCATION-Subsystem: - -REVOCATION — Ego key revocation -=============================== - -The REVOCATION subsystem is responsible for key revocation of Egos. If a -user learns that their private key has been compromised or has lost it, -they can use the REVOCATION system to inform all of the other users that -their private key is no longer valid. The subsystem thus includes ways -to query for the validity of keys and to propagate revocation messages. - -.. _Dissemination: - -Dissemination -------------- - -When a revocation is performed, the revocation is first of all -disseminated by flooding the overlay network. The goal is to reach every -peer, so that when a peer needs to check if a key has been revoked, this -will be purely a local operation where the peer looks at its local -revocation list. Flooding the network is also the most robust form of -key revocation --- an adversary would have to control a separator of the -overlay graph to restrict the propagation of the revocation message. 
-Flooding is also very easy to implement --- peers that receive a -revocation message for a key that they have never seen before simply -pass the message to all of their neighbours. - -Flooding can only distribute the revocation message to peers that are -online. In order to notify peers that join the network later, the -revocation service performs efficient set reconciliation over the sets -of known revocation messages whenever two peers (that both support -REVOCATION dissemination) connect. The SET service is used to perform -this operation efficiently. - -.. _Revocation-Message-Design-Requirements: - -Revocation Message Design Requirements --------------------------------------- - -However, flooding is also quite costly, creating O(\|E\|) messages on a -network with \|E\| edges. Thus, revocation messages are required to -contain a proof-of-work, the result of an expensive computation (which, -however, is cheap to verify). Only peers that have expended the CPU time -necessary to provide this proof will be able to flood the network with -the revocation message. This ensures that an attacker cannot simply -flood the network with millions of revocation messages. The -proof-of-work required by GNUnet is set to take days on a typical PC to -compute; if the ability to quickly revoke a key is needed, users have -the option to pre-compute revocation messages to store off-line and use -instantly after their key has expired. - -Revocation messages must also be signed by the private key that is being -revoked. Thus, they can only be created while the private key is in the -possession of the respective user. This is another reason to create a -revocation message ahead of time and store it in a secure location. - -:index:`libgnunetrevocation <single: libgnunet; revocation>` -libgnunetrevocation -------------------- - -The REVOCATION API consists of two parts, to query and to issue -revocations. - -.. _Querying-for-revoked-keys: - -Querying for revoked keys -^^^^^^^^^^^^^^^^^^^^^^^^^ - -``GNUNET_REVOCATION_query`` is used to check if a given ECDSA public key -has been revoked. The given callback will be invoked with the result of -the check. The query can be canceled using -``GNUNET_REVOCATION_query_cancel`` on the return value. - -.. _Preparing-revocations: - -Preparing revocations -^^^^^^^^^^^^^^^^^^^^^ - -It is often desirable to create a revocation record ahead-of-time and -store it in an off-line location to be used later in an emergency. This -is particularly true for GNUnet revocations, where performing the -revocation operation itself is computationally expensive and thus is -likely to take some time. Thus, if users want the ability to perform -revocations quickly in an emergency, they must pre-compute the -revocation message. The revocation API enables this with two functions -that are used to compute the revocation message, but not trigger the -actual revocation operation. - -``GNUNET_REVOCATION_check_pow`` should be used to calculate the -proof-of-work required in the revocation message. This function takes -the public key, the required number of bits for the proof of work (which -in GNUnet is a network-wide constant) and finally a proof-of-work number -as arguments. The function then checks if the given proof-of-work number -is a valid proof of work for the given public key. Clients preparing a -revocation are expected to call this function repeatedly (typically with -a monotonically increasing sequence of numbers of the proof-of-work -number) until a given number satisfies the check. 
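-
-A rough sketch of this search loop follows; the calling convention of
-``GNUNET_REVOCATION_check_pow`` is assumed from the description above
-and may differ between releases::
-
-   /* Sketch: find a proof-of-work number for the given public key that
-    * satisfies the network-wide difficulty (matching_bits). */
-   static uint64_t
-   find_revocation_pow (const struct GNUNET_CRYPTO_EcdsaPublicKey *pk,
-                        unsigned int matching_bits)
-   {
-     uint64_t pow_nr = 0;
-
-     while (GNUNET_YES !=
-            GNUNET_REVOCATION_check_pow (pk, pow_nr, matching_bits))
-       pow_nr++;          /* monotonically increasing candidates */
-     return pow_nr;       /* keep this number for the actual revocation */
-   }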
That number should -then be saved for later use in the revocation operation. - -``GNUNET_REVOCATION_sign_revocation`` is used to generate the signature -that is required in a revocation message. It takes the private key that -(possibly in the future) is to be revoked and returns the signature. The -signature can again be saved to disk for later use, which will then -allow performing a revocation even without access to the private key. - -.. _Issuing-revocations: - -Issuing revocations -^^^^^^^^^^^^^^^^^^^ - -Given a ECDSA public key, the signature from ``GNUNET_REVOCATION_sign`` -and the proof-of-work, ``GNUNET_REVOCATION_revoke`` can be used to -perform the actual revocation. The given callback is called upon -completion of the operation. ``GNUNET_REVOCATION_revoke_cancel`` can be -used to stop the library from calling the continuation; however, in that -case it is undefined whether or not the revocation operation will be -executed. - -.. _The-REVOCATION-Client_002dService-Protocol: - -The REVOCATION Client-Service Protocol --------------------------------------- - -The REVOCATION protocol consists of four simple messages. - -A ``QueryMessage`` containing a public ECDSA key is used to check if a -particular key has been revoked. The service responds with a -``QueryResponseMessage`` which simply contains a bit that says if the -given public key is still valid, or if it has been revoked. - -The second possible interaction is for a client to revoke a key by -passing a ``RevokeMessage`` to the service. The ``RevokeMessage`` -contains the ECDSA public key to be revoked, a signature by the -corresponding private key and the proof-of-work. The service responds -with a ``RevocationResponseMessage`` which can be used to indicate that -the ``RevokeMessage`` was invalid (e.g. the proof of work is incorrect), -or otherwise to indicate that the revocation has been processed -successfully. - -.. _The-REVOCATION-Peer_002dto_002dPeer-Protocol: - -The REVOCATION Peer-to-Peer Protocol ------------------------------------- - -Revocation uses two disjoint ways to spread revocation information among -peers. First of all, P2P gossip exchanged via CORE-level neighbours is -used to quickly spread revocations to all connected peers. Second, -whenever two peers (that both support revocations) connect, the SET -service is used to compute the union of the respective revocation sets. - -In both cases, the exchanged messages are ``RevokeMessage``\ s which -contain the public key that is being revoked, a matching ECDSA -signature, and a proof-of-work. Whenever a peer learns about a new -revocation this way, it first validates the signature and the -proof-of-work, then stores it to disk (typically to a file -$GNUNET_DATA_HOME/revocation.dat) and finally spreads the information to -all directly connected neighbours. - -For computing the union using the SET service, the peer with the smaller -hashed peer identity will connect (as a \"client\" in the two-party set -protocol) to the other peer after one second (to reduce traffic spikes -on connect) and initiate the computation of the set union. All -revocation services use a common hash to identify the SET operation over -revocation sets. - -The current implementation accepts revocation set union operations from -all peers at any time; however, well-behaved peers should only initiate -this operation once after establishing a connection to a peer with a -larger hashed peer identity. diff --git a/subsystems/rps.rst b/subsystems/rps.rst @@ -1,76 +0,0 @@ - -.. 
index:: - double: subsystems; Random peer sampling - see: RPS; Random peer sampling - -.. _RPS-Subsystem: - -RPS — Random peer sampling -========================== - -In literature, Random Peer Sampling (RPS) refers to the problem of -reliably [1]_ drawing random samples from an unstructured p2p network. - -Doing so in a reliable manner is not only hard because of inherent -problems but also because of possible malicious peers that could try to -bias the selection. - -It is useful for all kind of gossip protocols that require the selection -of random peers in the whole network like gathering statistics, -spreading and aggregating information in the network, load balancing and -overlay topology management. - -The approach chosen in the RPS service implementation in GNUnet follows -the `Brahms <https://bib.gnunet.org/full/date.html\#2009_5f0>`__ design. - -The current state is \"work in progress\". There are a lot of things -that need to be done, primarily finishing the experimental evaluation -and a re-design of the API. - -The abstract idea is to subscribe to connect to/start the RPS service -and request random peers that will be returned when they represent a -random selection from the whole network with high probability. - -An additional feature to the original Brahms-design is the selection of -sub-groups: The GNUnet implementation of RPS enables clients to ask for -random peers from a group that is defined by a common shared secret. -(The secret could of course also be public, depending on the use-case.) - -Another addition to the original protocol was made: The sampler -mechanism that was introduced in Brahms was slightly adapted and used to -actually sample the peers and returned to the client. This is necessary -as the original design only keeps peers connected to random other peers -in the network. In order to return random peers to client requests -independently random, they cannot be drawn from the connected peers. The -adapted sampler makes sure that each request for random peers is -independent from the others. - -.. _Brahms: - -Brahms ------- - -The high-level concept of Brahms is two-fold: Combining push-pull gossip -with locally fixing a assumed bias using cryptographic min-wise -permutations. The central data structure is the view - a peer's current -local sample. This view is used to select peers to push to and pull -from. This simple mechanism can be biased easily. For this reason Brahms -'fixes' the bias by using the so-called sampler. A data structure that -takes a list of elements as input and outputs a random one of them -independently of the frequency in the input set. Both an element that -was put into the sampler a single time and an element that was put into -it a million times have the same probability of being the output. This -is achieved with exploiting min-wise independent permutations. In the -RPS service we use HMACs: On the initialisation of a sampler element, a -key is chosen at random. On each input the HMAC with the random key is -computed. The sampler element keeps the element with the minimal HMAC. - -In order to fix the bias in the view, a fraction of the elements in the -view are sampled through the sampler from the random stream of peer IDs. - -According to the theoretical analysis of Bortnikov et al. this suffices -to keep the network connected and having random peers in the view. - -.. [1] - \"Reliable\" in this context means having no bias, neither spatial, - nor temporal, nor through malicious activity. 
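The sampler element described in the Brahms section above can be made concrete with a short, self-contained toy model. This is illustrative code only, not part of the RPS service: a simple keyed 64-bit hash stands in for the HMAC with a per-element random key, and all names are made up. It shows the key property of the min-wise sampler — the frequency of an input does not matter, only the set of distinct inputs determines which one ends up holding the minimal hash.

::

   #include <stdint.h>
   #include <stdio.h>
   #include <stdlib.h>

   /* Toy stand-in for the keyed HMAC used by the real sampler:
    * a 64-bit FNV-1a hash seeded with the element's random key. */
   static uint64_t
   keyed_hash (uint64_t key, const char *data)
   {
     uint64_t h = 1469598103934665603ULL ^ key;
     for (; *data != '\0'; data++)
     {
       h ^= (unsigned char) *data;
       h *= 1099511628211ULL;
     }
     return h;
   }

   /* One sampler element: keeps the input whose keyed hash is minimal. */
   struct SamplerElement
   {
     uint64_t key;       /* chosen at random on initialisation */
     uint64_t min_hash;  /* smallest hash seen so far */
     char sample[64];    /* the element currently held */
     int empty;
   };

   static void
   sampler_init (struct SamplerElement *s)
   {
     s->key = ((uint64_t) rand () << 32) | (uint64_t) rand ();
     s->empty = 1;
   }

   static void
   sampler_next (struct SamplerElement *s, const char *peer_id)
   {
     uint64_t h = keyed_hash (s->key, peer_id);
     if (s->empty || (h < s->min_hash))
     {
       s->min_hash = h;
       snprintf (s->sample, sizeof (s->sample), "%s", peer_id);
       s->empty = 0;
     }
   }

   int
   main (void)
   {
     struct SamplerElement s;
     sampler_init (&s);
     /* Feeding "A" a million times and "B" once gives both roughly the
      * same chance of ending up as the sample, since only the minimal
      * keyed hash over the distinct inputs decides. */
     for (int i = 0; i < 1000000; i++)
       sampler_next (&s, "A");
     sampler_next (&s, "B");
     printf ("sampled peer: %s\n", s.sample);
     return 0;
   }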
diff --git a/subsystems/set/set.rst b/subsystems/set/set.rst @@ -1,337 +0,0 @@ -.. index:: - double: subsystem; SET - -.. _SET-Subsystem: - -SET — Peer to peer set operations (Deprecated) -============================================== - -.. note:: - - The SET subsystem is in process of being replaced by the SETU and SETI - subsystems, which provide basically the same functionality, just using - two different subsystems. SETI and SETU should be used for new code. - -The SET service implements efficient set operations between two peers -over a CADET tunnel. Currently, set union and set intersection are the -only supported operations. Elements of a set consist of an *element -type* and arbitrary binary *data*. The size of an element's data is -limited to around 62 KB. - -.. _Local-Sets: - -Local Sets ----------- - -Sets created by a local client can be modified and reused for multiple -operations. As each set operation requires potentially expensive special -auxiliary data to be computed for each element of a set, a set can only -participate in one type of set operation (either union or intersection). -The type of a set is determined upon its creation. If a the elements of -a set are needed for an operation of a different type, all of the set's -element must be copied to a new set of appropriate type. - -.. _Set-Modifications: - -Set Modifications ------------------ - -Even when set operations are active, one can add to and remove elements -from a set. However, these changes will only be visible to operations -that have been created after the changes have taken place. That is, -every set operation only sees a snapshot of the set from the time the -operation was started. This mechanism is *not* implemented by copying -the whole set, but by attaching *generation information* to each element -and operation. - -.. _Set-Operations: - -Set Operations --------------- - -Set operations can be started in two ways: Either by accepting an -operation request from a remote peer, or by requesting a set operation -from a remote peer. Set operations are uniquely identified by the -involved *peers*, an *application id* and the *operation type*. - -The client is notified of incoming set operations by *set listeners*. A -set listener listens for incoming operations of a specific operation -type and application id. Once notified of an incoming set request, the -client can accept the set request (providing a local set for the -operation) or reject it. - -.. _Result-Elements: - -Result Elements ---------------- - -The SET service has three *result modes* that determine how an -operation's result set is delivered to the client: - -- **Full Result Set.** All elements of set resulting from the set - operation are returned to the client. - -- **Added Elements.** Only elements that result from the operation and - are not already in the local peer's set are returned. Note that for - some operations (like set intersection) this result mode will never - return any elements. This can be useful if only the remove peer is - actually interested in the result of the set operation. - -- **Removed Elements.** Only elements that are in the local peer's - initial set but not in the operation's result set are returned. Note - that for some operations (like set union) this result mode will never - return any elements. This can be useful if only the remove peer is - actually interested in the result of the set operation. - - -:index:`libgnunetset <single: libgnunet; set>` -libgnunetset ------------- - -.. 
_Sets: - -Sets -^^^^ - -New sets are created with ``GNUNET_SET_create``. Both the local peer's -configuration (as each set has its own client connection) and the -operation type must be specified. The set exists until either the client -calls ``GNUNET_SET_destroy`` or the client's connection to the service -is disrupted. In the latter case, the client is notified by the return -value of functions dealing with sets. This return value must always be -checked. - -Elements are added and removed with ``GNUNET_SET_add_element`` and -``GNUNET_SET_remove_element``. - -.. _Listeners: - -Listeners -^^^^^^^^^ - -Listeners are created with ``GNUNET_SET_listen``. Each time time a -remote peer suggests a set operation with an application id and -operation type matching a listener, the listener's callback is invoked. -The client then must synchronously call either ``GNUNET_SET_accept`` or -``GNUNET_SET_reject``. Note that the operation will not be started until -the client calls ``GNUNET_SET_commit`` (see Section \"Supplying a -Set\"). - -.. _Operations: - -Operations -^^^^^^^^^^ - -Operations to be initiated by the local peer are created with -``GNUNET_SET_prepare``. Note that the operation will not be started -until the client calls ``GNUNET_SET_commit`` (see Section \"Supplying a -Set\"). - -.. _Supplying-a-Set: - -Supplying a Set -^^^^^^^^^^^^^^^ - -To create symmetry between the two ways of starting a set operation -(accepting and initiating it), the operation handles returned by -``GNUNET_SET_accept`` and ``GNUNET_SET_prepare`` do not yet have a set -to operate on, thus they can not do any work yet. - -The client must call ``GNUNET_SET_commit`` to specify a set to use for -an operation. ``GNUNET_SET_commit`` may only be called once per set -operation. - -.. _The-Result-Callback: - -The Result Callback -^^^^^^^^^^^^^^^^^^^ - -Clients must specify both a result mode and a result callback with -``GNUNET_SET_accept`` and ``GNUNET_SET_prepare``. The result callback -with a status indicating either that an element was received, or the -operation failed or succeeded. The interpretation of the received -element depends on the result mode. The callback needs to know which -result mode it is used in, as the arguments do not indicate if an -element is part of the full result set, or if it is in the difference -between the original set and the final set. - -.. _The-SET-Client_002dService-Protocol: - -The SET Client-Service Protocol -------------------------------- - -.. _Creating-Sets: - -Creating Sets -^^^^^^^^^^^^^ - -For each set of a client, there exists a client connection to the -service. Sets are created by sending the ``GNUNET_SERVICE_SET_CREATE`` -message over a new client connection. Multiple operations for one set -are multiplexed over one client connection, using a request id supplied -by the client. - -.. _Listeners2: - -Listeners -^^^^^^^^^ - -Each listener also requires a separate client connection. By sending the -``GNUNET_SERVICE_SET_LISTEN`` message, the client notifies the service -of the application id and operation type it is interested in. A client -rejects an incoming request by sending ``GNUNET_SERVICE_SET_REJECT`` on -the listener's client connection. In contrast, when accepting an -incoming request, a ``GNUNET_SERVICE_SET_ACCEPT`` message must be sent -over the set that is supplied for the set operation. - -.. 
_Initiating-Operations: - -Initiating Operations -^^^^^^^^^^^^^^^^^^^^^ - -Operations with remote peers are initiated by sending a -``GNUNET_SERVICE_SET_EVALUATE`` message to the service. The client -connection that this message is sent by determines the set to use. - -.. _Modifying-Sets: - -Modifying Sets -^^^^^^^^^^^^^^ - -Sets are modified with the ``GNUNET_SERVICE_SET_ADD`` and -``GNUNET_SERVICE_SET_REMOVE`` messages. - -.. _Results-and-Operation-Status: - -Results and Operation Status -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The service notifies the client of result elements and success/failure -of a set operation with the ``GNUNET_SERVICE_SET_RESULT`` message. - -.. _Iterating-Sets: - -Iterating Sets -^^^^^^^^^^^^^^ - -All elements of a set can be requested by sending -``GNUNET_SERVICE_SET_ITER_REQUEST``. The server responds with -``GNUNET_SERVICE_SET_ITER_ELEMENT`` and eventually terminates the -iteration with ``GNUNET_SERVICE_SET_ITER_DONE``. After each received -element, the client must send ``GNUNET_SERVICE_SET_ITER_ACK``. Note that -only one set iteration may be active for a set at any given time. - -.. _The-SET-Intersection-Peer_002dto_002dPeer-Protocol: - -The SET Intersection Peer-to-Peer Protocol ------------------------------------------- - -The intersection protocol operates over CADET and starts with a -GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST being sent by the peer -initiating the operation to the peer listening for inbound requests. It -includes the number of elements of the initiating peer, which is used to -decide which side will send a Bloom filter first. - -The listening peer checks if the operation type and application -identifier are acceptable for its current state. If not, it responds -with a GNUNET_MESSAGE_TYPE_SET_RESULT and a status of -GNUNET_SET_STATUS_FAILURE (and terminates the CADET channel). - -If the application accepts the request, the listener sends back a -``GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO`` if it has more -elements in the set than the client. Otherwise, it immediately starts -with the Bloom filter exchange. If the initiator receives a -``GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_ELEMENT_INFO`` response, it -beings the Bloom filter exchange, unless the set size is indicated to be -zero, in which case the intersection is considered finished after just -the initial handshake. - -.. _The-Bloom-filter-exchange: - -The Bloom filter exchange -^^^^^^^^^^^^^^^^^^^^^^^^^ - -In this phase, each peer transmits a Bloom filter over the remaining -keys of the local set to the other peer using a -``GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_BF`` message. This message -additionally includes the number of elements left in the sender's set, -as well as the XOR over all of the keys in that set. - -The number of bits 'k' set per element in the Bloom filter is calculated -based on the relative size of the two sets. Furthermore, the size of the -Bloom filter is calculated based on 'k' and the number of elements in -the set to maximize the amount of data filtered per byte transmitted on -the wire (while avoiding an excessively high number of iterations). - -The receiver of the message removes all elements from its local set that -do not pass the Bloom filter test. It then checks if the set size of the -sender and the XOR over the keys match what is left of its own set. If -they do, it sends a ``GNUNET_MESSAGE_TYPE_SET_INTERSECTION_P2P_DONE`` -back to indicate that the latest set is the final result. 
Otherwise, the -receiver starts another Bloom filter exchange, except this time as the -sender. - -.. _Salt: - -Salt -^^^^ - -Bloomfilter operations are probabilistic: With some non-zero probability -the test may incorrectly say an element is in the set, even though it is -not. - -To mitigate this problem, the intersection protocol iterates exchanging -Bloom filters using a different random 32-bit salt in each iteration -(the salt is also included in the message). With different salts, set -operations may fail for different elements. Merging the results from the -executions, the probability of failure drops to zero. - -The iterations terminate once both peers have established that they have -sets of the same size, and where the XOR over all keys computes the same -512-bit value (leaving a failure probability of 2\ :superscript:`-511`\ ). - -.. _The-SET-Union-Peer_002dto_002dPeer-Protocol: - -The SET Union Peer-to-Peer Protocol ------------------------------------ - -The SET union protocol is based on Eppstein's efficient set -reconciliation without prior context. You should read this paper first -if you want to understand the protocol. - -.. todo:: Link to Eppstein's paper! - -The union protocol operates over CADET and starts with a -GNUNET_MESSAGE_TYPE_SET_P2P_OPERATION_REQUEST being sent by the peer -initiating the operation to the peer listening for inbound requests. It -includes the number of elements of the initiating peer, which is -currently not used. - -The listening peer checks if the operation type and application -identifier are acceptable for its current state. If not, it responds -with a ``GNUNET_MESSAGE_TYPE_SET_RESULT`` and a status of -``GNUNET_SET_STATUS_FAILURE`` (and terminates the CADET channel). - -If the application accepts the request, it sends back a strata estimator -using a message of type GNUNET_MESSAGE_TYPE_SET_UNION_P2P_SE. The -initiator evaluates the strata estimator and initiates the exchange of -invertible Bloom filters, sending a -GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF. - -During the IBF exchange, if the receiver cannot invert the Bloom filter -or detects a cycle, it sends a larger IBF in response (up to a defined -maximum limit; if that limit is reached, the operation fails). Elements -decoded while processing the IBF are transmitted to the other peer using -GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS, or requested from the other peer -using GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS messages, depending -on the sign observed during decoding of the IBF. Peers respond to a -GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENT_REQUESTS message with the respective -element in a GNUNET_MESSAGE_TYPE_SET_P2P_ELEMENTS message. If the IBF -fully decodes, the peer responds with a -GNUNET_MESSAGE_TYPE_SET_UNION_P2P_DONE message instead of another -GNUNET_MESSAGE_TYPE_SET_UNION_P2P_IBF. - -All Bloom filter operations use a salt to mingle keys before hashing -them into buckets, such that future iterations have a fresh chance of -succeeding if they failed due to collisions before. - diff --git a/subsystems/seti/seti.rst b/subsystems/seti/seti.rst @@ -1,258 +0,0 @@ -.. index:: - double: subsystem; SETI - -.. _SETI-Subsystem: - -SETI — Peer to peer set intersections -===================================== - -The SETI service implements efficient set intersection between two peers -over a CADET tunnel. Elements of a set consist of an *element type* and -arbitrary binary *data*. The size of an element's data is limited to -around 62 KB. - -.. 
_Intersection-Sets: - -Intersection Sets ------------------ - -Sets created by a local client can be modified (by adding additional -elements) and reused for multiple operations. If elements are to be -removed, a fresh set must be created by the client. - -.. _Set-Intersection-Modifications: - -Set Intersection Modifications ------------------------------- - -Even when set operations are active, one can add elements to a set. -However, these changes will only be visible to operations that have been -created after the changes have taken place. That is, every set operation -only sees a snapshot of the set from the time the operation was started. -This mechanism is *not* implemented by copying the whole set, but by -attaching *generation information* to each element and operation. - -.. _Set-Intersection-Operations: - -Set Intersection Operations ---------------------------- - -Set operations can be started in two ways: Either by accepting an -operation request from a remote peer, or by requesting a set operation -from a remote peer. Set operations are uniquely identified by the -involved *peers*, an *application id* and the *operation type*. - -The client is notified of incoming set operations by *set listeners*. A -set listener listens for incoming operations of a specific operation -type and application id. Once notified of an incoming set request, the -client can accept the set request (providing a local set for the -operation) or reject it. - -.. _Intersection-Result-Elements: - -Intersection Result Elements ----------------------------- - -The SET service has two *result modes* that determine how an operation's -result set is delivered to the client: - -- **Return intersection.** All elements of set resulting from the set - intersection are returned to the client. - -- **Removed Elements.** Only elements that are in the local peer's - initial set but not in the intersection are returned. - - -:index:`libgnunetseti <single: libgnunet; seti>` -libgnunetseti -------------- - -.. _Intersection-Set-API: - -Intersection Set API -^^^^^^^^^^^^^^^^^^^^ - -New sets are created with ``GNUNET_SETI_create``. Only the local peer's -configuration (as each set has its own client connection) must be -provided. The set exists until either the client calls -``GNUNET_SET_destroy`` or the client's connection to the service is -disrupted. In the latter case, the client is notified by the return -value of functions dealing with sets. This return value must always be -checked. - -Elements are added with ``GNUNET_SET_add_element``. - -.. _Intersection-Listeners: - -Intersection Listeners -^^^^^^^^^^^^^^^^^^^^^^ - -Listeners are created with ``GNUNET_SET_listen``. Each time time a -remote peer suggests a set operation with an application id and -operation type matching a listener, the listener's callback is invoked. -The client then must synchronously call either ``GNUNET_SET_accept`` or -``GNUNET_SET_reject``. Note that the operation will not be started until -the client calls ``GNUNET_SET_commit`` (see Section \"Supplying a -Set\"). - -.. _Intersection-Operations: - -Intersection Operations -^^^^^^^^^^^^^^^^^^^^^^^ - -Operations to be initiated by the local peer are created with -``GNUNET_SET_prepare``. Note that the operation will not be started -until the client calls ``GNUNET_SET_commit`` (see Section \"Supplying a -Set\"). - -.. 
_Supplying-a-Set-for-Intersection: - -Supplying a Set for Intersection -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -To create symmetry between the two ways of starting a set operation -(accepting and initiating it), the operation handles returned by -``GNUNET_SET_accept`` and ``GNUNET_SET_prepare`` do not yet have a set -to operate on, thus they can not do any work yet. - -The client must call ``GNUNET_SET_commit`` to specify a set to use for -an operation. ``GNUNET_SET_commit`` may only be called once per set -operation. - -.. _The-Intersection-Result-Callback: - -The Intersection Result Callback -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Clients must specify both a result mode and a result callback with -``GNUNET_SET_accept`` and ``GNUNET_SET_prepare``. The result callback -with a status indicating either that an element was received, or the -operation failed or succeeded. The interpretation of the received -element depends on the result mode. The callback needs to know which -result mode it is used in, as the arguments do not indicate if an -element is part of the full result set, or if it is in the difference -between the original set and the final set. - -.. _The-SETI-Client_002dService-Protocol: - -The SETI Client-Service Protocol --------------------------------- - -.. _Creating-Intersection-Sets: - -Creating Intersection Sets -^^^^^^^^^^^^^^^^^^^^^^^^^^ - -For each set of a client, there exists a client connection to the -service. Sets are created by sending the ``GNUNET_SERVICE_SETI_CREATE`` -message over a new client connection. Multiple operations for one set -are multiplexed over one client connection, using a request id supplied -by the client. - -.. _Listeners-for-Intersection: - -Listeners for Intersection -^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Each listener also requires a separate client connection. By sending the -``GNUNET_SERVICE_SETI_LISTEN`` message, the client notifies the service -of the application id and operation type it is interested in. A client -rejects an incoming request by sending ``GNUNET_SERVICE_SETI_REJECT`` on -the listener's client connection. In contrast, when accepting an -incoming request, a ``GNUNET_SERVICE_SETI_ACCEPT`` message must be sent -over the set that is supplied for the set operation. - -.. _Initiating-Intersection-Operations: - -Initiating Intersection Operations -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Operations with remote peers are initiated by sending a -``GNUNET_SERVICE_SETI_EVALUATE`` message to the service. The client -connection that this message is sent by determines the set to use. - -.. _Modifying-Intersection-Sets: - -Modifying Intersection Sets -^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Sets are modified with the ``GNUNET_SERVICE_SETI_ADD`` message. - -.. _Intersection-Results-and-Operation-Status: - -Intersection Results and Operation Status -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The service notifies the client of result elements and success/failure -of a set operation with the ``GNUNET_SERVICE_SETI_RESULT`` message. - -.. _The-SETI-Intersection-Peer_002dto_002dPeer-Protocol: - -The SETI Intersection Peer-to-Peer Protocol -------------------------------------------- - -The intersection protocol operates over CADET and starts with a -GNUNET_MESSAGE_TYPE_SETI_P2P_OPERATION_REQUEST being sent by the peer -initiating the operation to the peer listening for inbound requests. It -includes the number of elements of the initiating peer, which is used to -decide which side will send a Bloom filter first. 
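The Bloom-filter rounds used here are the same as in the SET intersection protocol described earlier: the receiver of a filter drops all local elements that fail the membership test and then compares what remains against the sender's advertised element count and XOR over all keys. The fragment below is a schematic toy model of one such receiver-side round, not GNUnet code or the actual wire format; the element keys, the filter callback and all names are illustrative assumptions.

::

   #include <stdint.h>
   #include <stddef.h>

   /* Toy view of the local set: each element is reduced to a 64-bit key
    * (the real protocol hashes element data with a per-round salt). */
   struct LocalSet
   {
     uint64_t *keys;
     size_t count;
   };

   /* Membership test against the remote peer's Bloom filter; may return
    * false positives but never false negatives. Stub signature only. */
   typedef int (*bf_test_fn) (const void *bf, uint64_t key);

   /* One receiver-side round: remove every element that fails the filter,
    * then check whether the remainder matches the sender's advertised
    * element count and XOR over all keys. Returns 1 if the sets agree
    * (send DONE), 0 if another filter exchange is needed. */
   static int
   intersection_round (struct LocalSet *set,
                       const void *remote_bf, bf_test_fn test,
                       uint64_t remote_count, uint64_t remote_xor)
   {
     size_t kept = 0;
     uint64_t xor_of_keys = 0;

     for (size_t i = 0; i < set->count; i++)
       if (test (remote_bf, set->keys[i]))
       {
         set->keys[kept++] = set->keys[i];
         xor_of_keys ^= set->keys[i];
       }
     set->count = kept;
     return (kept == remote_count) && (xor_of_keys == remote_xor);
   }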
- -The listening peer checks if the operation type and application -identifier are acceptable for its current state. If not, it responds -with a GNUNET_MESSAGE_TYPE_SETI_RESULT and a status of -GNUNET_SETI_STATUS_FAILURE (and terminates the CADET channel). - -If the application accepts the request, the listener sends back a -``GNUNET_MESSAGE_TYPE_SETI_P2P_ELEMENT_INFO`` if it has more elements in -the set than the client. Otherwise, it immediately starts with the Bloom -filter exchange. If the initiator receives a -``GNUNET_MESSAGE_TYPE_SETI_P2P_ELEMENT_INFO`` response, it beings the -Bloom filter exchange, unless the set size is indicated to be zero, in -which case the intersection is considered finished after just the -initial handshake. - -.. _The-Bloom-filter-exchange-in-SETI: - -The Bloom filter exchange in SETI -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -In this phase, each peer transmits a Bloom filter over the remaining -keys of the local set to the other peer using a -``GNUNET_MESSAGE_TYPE_SETI_P2P_BF`` message. This message additionally -includes the number of elements left in the sender's set, as well as the -XOR over all of the keys in that set. - -The number of bits 'k' set per element in the Bloom filter is calculated -based on the relative size of the two sets. Furthermore, the size of the -Bloom filter is calculated based on 'k' and the number of elements in -the set to maximize the amount of data filtered per byte transmitted on -the wire (while avoiding an excessively high number of iterations). - -The receiver of the message removes all elements from its local set that -do not pass the Bloom filter test. It then checks if the set size of the -sender and the XOR over the keys match what is left of its own set. If -they do, it sends a ``GNUNET_MESSAGE_TYPE_SETI_P2P_DONE`` back to -indicate that the latest set is the final result. Otherwise, the -receiver starts another Bloom filter exchange, except this time as the -sender. - -.. _Intersection-Salt: - -Intersection Salt -^^^^^^^^^^^^^^^^^ - -Bloom filter operations are probabilistic: With some non-zero -probability the test may incorrectly say an element is in the set, even -though it is not. - -To mitigate this problem, the intersection protocol iterates exchanging -Bloom filters using a different random 32-bit salt in each iteration -(the salt is also included in the message). With different salts, set -operations may fail for different elements. Merging the results from the -executions, the probability of failure drops to zero. - -The iterations terminate once both peers have established that they have -sets of the same size, and where the XOR over all keys computes the same -512-bit value (leaving a failure probability of 2-511). - - diff --git a/subsystems/setops.rst b/subsystems/setops.rst @@ -1,11 +0,0 @@ - -Peer-to-Peer Set Operations -=========================== - -Many applications - -.. toctree:: - set/set.rst - seti/seti.rst - setu/setu.rst - diff --git a/subsystems/setu/setu.rst b/subsystems/setu/setu.rst @@ -1,232 +0,0 @@ - -.. index:: - double: SETU; subsystem - -.. _SETU-Subsystem: - -SETU — Peer to peer set unions -============================== - -The SETU service implements efficient set union operations between two -peers over a CADET tunnel. Elements of a set consist of an *element -type* and arbitrary binary *data*. The size of an element's data is -limited to around 62 KB. - -.. 
_Union-Sets: - -Union Sets ----------- - -Sets created by a local client can be modified (by adding additional -elements) and reused for multiple operations. If elements are to be -removed, a fresh set must be created by the client. - -.. _Set-Union-Modifications: - -Set Union Modifications ------------------------ - -Even when set operations are active, one can add elements to a set. -However, these changes will only be visible to operations that have been -created after the changes have taken place. That is, every set operation -only sees a snapshot of the set from the time the operation was started. -This mechanism is *not* implemented by copying the whole set, but by -attaching *generation information* to each element and operation. - -.. _Set-Union-Operations: - -Set Union Operations --------------------- - -Set operations can be started in two ways: Either by accepting an -operation request from a remote peer, or by requesting a set operation -from a remote peer. Set operations are uniquely identified by the -involved *peers*, an *application id* and the *operation type*. - -The client is notified of incoming set operations by *set listeners*. A -set listener listens for incoming operations of a specific operation -type and application id. Once notified of an incoming set request, the -client can accept the set request (providing a local set for the -operation) or reject it. - -.. _Union-Result-Elements: - -Union Result Elements ---------------------- - -The SET service has three *result modes* that determine how an -operation's result set is delivered to the client: - -- **Locally added Elements.** Elements that are in the union but not - already in the local peer's set are returned. - -- **Remote added Elements.** Additionally, notify the client if the - remote peer lacked some elements and thus also return to the local - client those elements that we are sending to the remote peer to be - added to its union. Obtaining these elements requires setting the - ``GNUNET_SETU_OPTION_SYMMETRIC`` option. - - -:index:`libgnunetsetu <single: libgnunet; setu>` -libgnunetsetu -------------- - -.. _Union-Set-API: - -Union Set API -^^^^^^^^^^^^^ - -New sets are created with ``GNUNET_SETU_create``. Only the local peer's -configuration (as each set has its own client connection) must be -provided. The set exists until either the client calls -``GNUNET_SETU_destroy`` or the client's connection to the service is -disrupted. In the latter case, the client is notified by the return -value of functions dealing with sets. This return value must always be -checked. - -Elements are added with ``GNUNET_SETU_add_element``. - -.. _Union-Listeners: - -Union Listeners -^^^^^^^^^^^^^^^ - -Listeners are created with ``GNUNET_SETU_listen``. Each time time a -remote peer suggests a set operation with an application id and -operation type matching a listener, the listener's callback is invoked. -The client then must synchronously call either ``GNUNET_SETU_accept`` or -``GNUNET_SETU_reject``. Note that the operation will not be started -until the client calls ``GNUNET_SETU_commit`` (see Section \"Supplying a -Set\"). - -.. _Union-Operations: - -Union Operations -^^^^^^^^^^^^^^^^ - -Operations to be initiated by the local peer are created with -``GNUNET_SETU_prepare``. Note that the operation will not be started -until the client calls ``GNUNET_SETU_commit`` (see Section \"Supplying a -Set\"). - -.. 
_Supplying-a-Set-for-Union: - -Supplying a Set for Union -^^^^^^^^^^^^^^^^^^^^^^^^^ - -To create symmetry between the two ways of starting a set operation -(accepting and initiating it), the operation handles returned by -``GNUNET_SETU_accept`` and ``GNUNET_SETU_prepare`` do not yet have a set -to operate on, thus they can not do any work yet. - -The client must call ``GNUNET_SETU_commit`` to specify a set to use for -an operation. ``GNUNET_SETU_commit`` may only be called once per set -operation. - -.. _The-Union-Result-Callback: - -The Union Result Callback -^^^^^^^^^^^^^^^^^^^^^^^^^ - -Clients must specify both a result mode and a result callback with -``GNUNET_SETU_accept`` and ``GNUNET_SETU_prepare``. The result callback -with a status indicating either that an element was received, -transmitted to the other peer (if this information was requested), or if -the operation failed or ultimately succeeded. - -.. _The-SETU-Client_002dService-Protocol: - -The SETU Client-Service Protocol --------------------------------- - -.. _Creating-Union-Sets: - -Creating Union Sets -^^^^^^^^^^^^^^^^^^^ - -For each set of a client, there exists a client connection to the -service. Sets are created by sending the ``GNUNET_SERVICE_SETU_CREATE`` -message over a new client connection. Multiple operations for one set -are multiplexed over one client connection, using a request id supplied -by the client. - -.. _Listeners-for-Union: - -Listeners for Union -^^^^^^^^^^^^^^^^^^^ - -Each listener also requires a separate client connection. By sending the -``GNUNET_SERVICE_SETU_LISTEN`` message, the client notifies the service -of the application id and operation type it is interested in. A client -rejects an incoming request by sending ``GNUNET_SERVICE_SETU_REJECT`` on -the listener's client connection. In contrast, when accepting an -incoming request, a ``GNUNET_SERVICE_SETU_ACCEPT`` message must be sent -over the set that is supplied for the set operation. - -.. _Initiating-Union-Operations: - -Initiating Union Operations -^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Operations with remote peers are initiated by sending a -``GNUNET_SERVICE_SETU_EVALUATE`` message to the service. The client -connection that this message is sent by determines the set to use. - -.. _Modifying-Union-Sets: - -Modifying Union Sets -^^^^^^^^^^^^^^^^^^^^ - -Sets are modified with the ``GNUNET_SERVICE_SETU_ADD`` message. - -.. _Union-Results-and-Operation-Status: - -Union Results and Operation Status -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The service notifies the client of result elements and success/failure -of a set operation with the ``GNUNET_SERVICE_SETU_RESULT`` message. - -.. _The-SETU-Union-Peer_002dto_002dPeer-Protocol: - -The SETU Union Peer-to-Peer Protocol -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The SET union protocol is based on Eppstein's efficient set -reconciliation without prior context. You should read this paper first -if you want to understand the protocol. - -.. todo:: Link to Eppstein's paper! - -The union protocol operates over CADET and starts with a -GNUNET_MESSAGE_TYPE_SETU_P2P_OPERATION_REQUEST being sent by the peer -initiating the operation to the peer listening for inbound requests. It -includes the number of elements of the initiating peer, which is -currently not used. - -The listening peer checks if the operation type and application -identifier are acceptable for its current state. 
If not, it responds -with a ``GNUNET_MESSAGE_TYPE_SETU_RESULT`` and a status of -``GNUNET_SETU_STATUS_FAILURE`` (and terminates the CADET channel). - -If the application accepts the request, it sends back a strata estimator -using a message of type GNUNET_MESSAGE_TYPE_SETU_P2P_SE. The initiator -evaluates the strata estimator and initiates the exchange of invertible -Bloom filters, sending a GNUNET_MESSAGE_TYPE_SETU_P2P_IBF. - -During the IBF exchange, if the receiver cannot invert the Bloom filter -or detects a cycle, it sends a larger IBF in response (up to a defined -maximum limit; if that limit is reached, the operation fails). Elements -decoded while processing the IBF are transmitted to the other peer using -GNUNET_MESSAGE_TYPE_SETU_P2P_ELEMENTS, or requested from the other peer -using GNUNET_MESSAGE_TYPE_SETU_P2P_ELEMENT_REQUESTS messages, depending -on the sign observed during decoding of the IBF. Peers respond to a -GNUNET_MESSAGE_TYPE_SETU_P2P_ELEMENT_REQUESTS message with the -respective element in a GNUNET_MESSAGE_TYPE_SETU_P2P_ELEMENTS message. -If the IBF fully decodes, the peer responds with a -GNUNET_MESSAGE_TYPE_SETU_P2P_DONE message instead of another -GNUNET_MESSAGE_TYPE_SETU_P2P_IBF. - -All Bloom filter operations use a salt to mingle keys before hashing -them into buckets, such that future iterations have a fresh chance of -succeeding if they failed due to collisions before. diff --git a/subsystems/statistics.rst b/subsystems/statistics.rst @@ -1,193 +0,0 @@ - -.. index:: - double: STATISTICS; subsystem - -.. _STATISTICS-Subsystem: - -STATISTICS — Runtime statistics publication -=========================================== - -In GNUnet, the STATISTICS subsystem offers a central place for all -subsystems to publish unsigned 64-bit integer run-time statistics. -Keeping this information centrally means that there is a unified way for -the user to obtain data on all subsystems, and individual subsystems do -not have to always include a custom data export method for performance -metrics and other statistics. For example, the TRANSPORT system uses -STATISTICS to update information about the number of directly connected -peers and the bandwidth that has been consumed by the various plugins. -This information is valuable for diagnosing connectivity and performance -issues. - -Following the GNUnet service architecture, the STATISTICS subsystem is -divided into an API which is exposed through the header -**gnunet_statistics_service.h** and the STATISTICS service -**gnunet-service-statistics**. The **gnunet-statistics** command-line -tool can be used to obtain (and change) information about the values -stored by the STATISTICS service. The STATISTICS service does not -communicate with other peers. - -Data is stored in the STATISTICS service in the form of tuples -**(subsystem, name, value, persistence)**. The subsystem determines to -which other GNUnet's subsystem the data belongs. name is the name -through which value is associated. It uniquely identifies the record -from among other records belonging to the same subsystem. In some parts -of the code, the pair **(subsystem, name)** is called a **statistic** as -it identifies the values stored in the STATISTCS service.The persistence -flag determines if the record has to be preserved across service -restarts. A record is said to be persistent if this flag is set for it; -if not, the record is treated as a non-persistent record and it is lost -after service restart. 
Persistent records are written to and read from -the file **statistics.data** before shutdown and upon startup. The file -is located in the HOME directory of the peer. - -An anomaly of the STATISTICS service is that it does not terminate -immediately upon receiving a shutdown signal if it has any clients -connected to it. It waits for all the clients that are not monitors to -close their connections before terminating itself. This is to prevent -the loss of data during peer shutdown — delaying the STATISTICS -service shutdown helps other services to store important data to -STATISTICS during shutdown. - -:index:`libgnunetstatistics <single: libgnunet; statistics>` -libgnunetstatistics -------------------- - -**libgnunetstatistics** is the library containing the API for the -STATISTICS subsystem. Any process requiring to use STATISTICS should use -this API by to open a connection to the STATISTICS service. This is done -by calling the function ``GNUNET_STATISTICS_create()``. This function -takes the subsystem's name which is trying to use STATISTICS and a -configuration. All values written to STATISTICS with this connection -will be placed in the section corresponding to the given subsystem's -name. The connection to STATISTICS can be destroyed with the function -``GNUNET_STATISTICS_destroy()``. This function allows for the connection -to be destroyed immediately or upon transferring all pending write -requests to the service. - -Note: STATISTICS subsystem can be disabled by setting ``DISABLE = YES`` -under the ``[STATISTICS]`` section in the configuration. With such a -configuration all calls to ``GNUNET_STATISTICS_create()`` return -``NULL`` as the STATISTICS subsystem is unavailable and no other -functions from the API can be used. - -.. _Statistics-retrieval: - -Statistics retrieval -^^^^^^^^^^^^^^^^^^^^ - -Once a connection to the statistics service is obtained, information -about any other system which uses statistics can be retrieved with the -function GNUNET_STATISTICS_get(). This function takes the connection -handle, the name of the subsystem whose information we are interested in -(a ``NULL`` value will retrieve information of all available subsystems -using STATISTICS), the name of the statistic we are interested in (a -``NULL`` value will retrieve all available statistics), a continuation -callback which is called when all of requested information is retrieved, -an iterator callback which is called for each parameter in the retrieved -information and a closure for the aforementioned callbacks. The library -then invokes the iterator callback for each value matching the request. - -Call to ``GNUNET_STATISTICS_get()`` is asynchronous and can be canceled -with the function ``GNUNET_STATISTICS_get_cancel()``. This is helpful -when retrieving statistics takes too long and especially when we want to -shutdown and cleanup everything. - -.. _Setting-statistics-and-updating-them: - -Setting statistics and updating them -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -So far we have seen how to retrieve statistics, here we will learn how -we can set statistics and update them so that other subsystems can -retrieve them. - -A new statistic can be set using the function -``GNUNET_STATISTICS_set()``. This function takes the name of the -statistic and its value and a flag to make the statistic persistent. The -value of the statistic should be of the type ``uint64_t``. The function -does not take the name of the subsystem; it is determined from the -previous ``GNUNET_STATISTICS_create()`` invocation. 
If the given -statistic is already present, its value is overwritten. - -An existing statistics can be updated, i.e its value can be increased or -decreased by an amount with the function ``GNUNET_STATISTICS_update()``. -The parameters to this function are similar to -``GNUNET_STATISTICS_set()``, except that it takes the amount to be -changed as a type ``int64_t`` instead of the value. - -The library will combine multiple set or update operations into one -message if the client performs requests at a rate that is faster than -the available IPC with the STATISTICS service. Thus, the client does not -have to worry about sending requests too quickly. - -.. _Watches: - -Watches -^^^^^^^ - -As interesting feature of STATISTICS lies in serving notifications -whenever a statistic of our interest is modified. This is achieved by -registering a watch through the function ``GNUNET_STATISTICS_watch()``. -The parameters of this function are similar to those of -``GNUNET_STATISTICS_get()``. Changes to the respective statistic's value -will then cause the given iterator callback to be called. Note: A watch -can only be registered for a specific statistic. Hence the subsystem -name and the parameter name cannot be ``NULL`` in a call to -``GNUNET_STATISTICS_watch()``. - -A registered watch will keep notifying any value changes until -``GNUNET_STATISTICS_watch_cancel()`` is called with the same parameters -that are used for registering the watch. - -.. _The-STATISTICS-Client_002dService-Protocol: - -The STATISTICS Client-Service Protocol --------------------------------------- - -.. _Statistics-retrieval2: - -Statistics retrieval -^^^^^^^^^^^^^^^^^^^^ - -To retrieve statistics, the client transmits a message of type -``GNUNET_MESSAGE_TYPE_STATISTICS_GET`` containing the given subsystem -name and statistic parameter to the STATISTICS service. The service -responds with a message of type ``GNUNET_MESSAGE_TYPE_STATISTICS_VALUE`` -for each of the statistics parameters that match the client request for -the client. The end of information retrieved is signaled by the service -by sending a message of type ``GNUNET_MESSAGE_TYPE_STATISTICS_END``. - -.. _Setting-and-updating-statistics: - -Setting and updating statistics -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The subsystem name, parameter name, its value and the persistence flag -are communicated to the service through the message -``GNUNET_MESSAGE_TYPE_STATISTICS_SET``. - -When the service receives a message of type -``GNUNET_MESSAGE_TYPE_STATISTICS_SET``, it retrieves the subsystem name -and checks for a statistic parameter with matching the name given in the -message. If a statistic parameter is found, the value is overwritten by -the new value from the message; if not found then a new statistic -parameter is created with the given name and value. - -In addition to just setting an absolute value, it is possible to perform -a relative update by sending a message of type -``GNUNET_MESSAGE_TYPE_STATISTICS_SET`` with an update flag -(``GNUNET_STATISTICS_SETFLAG_RELATIVE``) signifying that the value in -the message should be treated as an update value. - -.. _Watching-for-updates: - -Watching for updates -^^^^^^^^^^^^^^^^^^^^ - -The function registers the watch at the service by sending a message of -type ``GNUNET_MESSAGE_TYPE_STATISTICS_WATCH``. The service then sends -notifications through messages of type -``GNUNET_MESSAGE_TYPE_STATISTICS_WATCH_VALUE`` whenever the statistic -parameter's value is changed. 
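A minimal usage sketch of the libgnunetstatistics client API described above follows. It is an illustration only: the subsystem name, the metric names and the surrounding function are made up, the code assumes it runs inside a task under the GNUnet scheduler with a configuration handle available, and the exact prototypes should be checked against ``gnunet_statistics_service.h`` for the GNUnet release in use.

::

   #include <gnunet/gnunet_util_lib.h>
   #include <gnunet/gnunet_statistics_service.h>

   /* Called from a task running under the GNUnet scheduler; 'cfg' is the
    * peer's configuration handle. */
   static void
   publish_example_metrics (const struct GNUNET_CONFIGURATION_Handle *cfg)
   {
     struct GNUNET_STATISTICS_Handle *stats;

     /* All values written over this handle land in the "example" subsystem. */
     stats = GNUNET_STATISTICS_create ("example", cfg);
     if (NULL == stats)
       return; /* STATISTICS disabled via DISABLE = YES in [STATISTICS] */

     /* Set an absolute value; GNUNET_YES marks the record as persistent,
      * so it survives restarts of the statistics service. */
     GNUNET_STATISTICS_set (stats, "# example records stored", 42, GNUNET_YES);

     /* Apply a relative update (+1) to a non-persistent counter. */
     GNUNET_STATISTICS_update (stats, "# example requests received", 1, GNUNET_NO);

     /* Destroy the handle; GNUNET_YES asks the library to flush pending
      * writes to the service before tearing down the connection. */
     GNUNET_STATISTICS_destroy (stats, GNUNET_YES);
   }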
- - diff --git a/subsystems/transport-ng.rst b/subsystems/transport-ng.rst @@ -1,303 +0,0 @@ - -.. index:: - double: TRANSPORT Next Generation; subsystem - -.. _TRANSPORT_002dNG-Subsystem: - -TRANSPORT-NG — Next-generation transport management -=================================================== - -The current GNUnet TRANSPORT architecture is rooted in the GNUnet 0.4 -design of using plugins for the actual transmission operations and the -ATS subsystem to select a plugin and allocate bandwidth. The following -key issues have been identified with this design: - -- Bugs in one plugin can affect the TRANSPORT service and other - plugins. There is at least one open bug that affects sockets, where - the origin is difficult to pinpoint due to the large code base. - -- Relevant operating system default configurations often impose a limit - of 1024 file descriptors per process. Thus, one plugin may impact - other plugin's connectivity choices. - -- Plugins are required to offer bi-directional connectivity. However, - firewalls (incl. NAT boxes) and physical environments sometimes only - allow uni-directional connectivity, which then currently cannot be - utilized at all. - -- Distance vector routing was implemented in 209 but shortly afterwards - broken and due to the complexity of implementing it as a plugin and - dealing with the resource allocation consequences was never useful. - -- Most existing plugins communicate completely using cleartext, - exposing metad data (message size) and making it easy to fingerprint - and possibly block GNUnet traffic. - -- Various NAT traversal methods are not supported. - -- The service logic is cluttered with \"manipulation\" support code for - TESTBED to enable faking network characteristics like lossy - connections or firewewalls. - -- Bandwidth allocation is done in ATS, requiring the duplication of - state and resulting in much delayed allocation decisions. As a - result, often available bandwidth goes unused. Users are expected to - manually configure bandwidth limits, instead of TRANSPORT using - congestion control to adapt automatically. - -- TRANSPORT is difficult to test and has bad test coverage. - -- HELLOs include an absolute expiration time. Nodes with unsynchronized - clocks cannot connect. - -- Displaying the contents of a HELLO requires the respective plugin as - the plugin-specific data is encoded in binary. This also complicates - logging. - -.. _Design-goals-of-TNG: - -Design goals of TNG -------------------- - -In order to address the above issues, we want to: - -- Move plugins into separate processes which we shall call - *communicators*. Communicators connect as clients to the transport - service. - -- TRANSPORT should be able to utilize any number of communicators to the - same peer at the same time. - -- TRANSPORT should be responsible for fragmentation, retransmission, - flow- and congestion-control. Users should no longer have to - configure bandwidth limits: TRANSPORT should detect what is available - and use it. - -- Communicators should be allowed to be uni-directional and - unreliable. TRANSPORT shall create bi-directional channels from this - whenever possible. - -- DV should no longer be a plugin, but part of TRANSPORT. - -- TRANSPORT should provide communicators help communicating, for - example in the case of uni-directional communicators or the need for - out-of-band signalling for NAT traversal. We call this functionality - *backchannels*. 
- -- Transport manipulation should be signalled to CORE on a per-message - basis instead of an approximate bandwidth. - -- CORE should signal performance requirements (reliability, latency, - etc.) on a per-message basis to TRANSPORT. If possible, TRANSPORT - should consider those options when scheduling messages for - transmission. - -- HELLOs should be in a human-readable format with monotonic time - expirations. - -The new architecture is planned as follows: - -.. image:: /images/tng.png - -TRANSPORT's main objective is to establish bi-directional virtual links -using a variety of possibly uni-directional communicators. Links undergo -the following steps: - -1. Communicator informs TRANSPORT A that a queue (direct neighbour) is - available, or equivalently TRANSPORT A discovers a (DV) path to a - target B. - -2. TRANSPORT A sends a challenge to the target peer, trying to confirm - that the peer can receive. FIXME: This is not implemented properly - for DV. Here we should really take a validated DVH and send a - challenge exactly down that path! - -3. The other TRANSPORT, TRANSPORT B, receives the challenge, and sends - back a response, possibly using a dierent path. If TRANSPORT B does - not yet have a virtual link to A, it must try to establish a virtual - link. - -4. Upon receiving the response, TRANSPORT A creates the virtual link. If - the response included a challenge, TRANSPORT A must respond to this - challenge as well, eectively re-creating the TCP 3-way handshake - (just with longer challenge values). - -.. _HELLO_002dNG: - -HELLO-NG --------- - -HELLOs change in three ways. First of all, communicators encode the -respective addresses in a human-readable URL-like string. This way, we -do no longer require the communicator to print the contents of a HELLO. -Second, HELLOs no longer contain an expiration time, only a creation -time. The receiver must only compare the respective absolute values. So -given a HELLO from the same sender with a larger creation time, then the -old one is no longer valid. This also obsoletes the need for the -gnunet-hello binary to set HELLO expiration times to never. Third, a -peer no longer generates one big HELLO that always contains all of the -addresses. Instead, each address is signed individually and shared only -over the address scopes where it makes sense to share the address. In -particular, care should be taken to not share MACs across the Internet -and confine their use to the LAN. As each address is signed separately, -having multiple addresses valid at the same time (given the new creation -time expiration logic) requires that those addresses must have exactly -the same creation time. Whenever that monotonic time is increased, all -addresses must be re-signed and re-distributed. - -.. _Priorities-and-preferences: - -Priorities and preferences --------------------------- - -In the new design, TRANSPORT adopts a feature (which was previously -already available in CORE) of the MQ API to allow applications to -specify priorities and preferences per message (or rather, per MQ -envelope). The (updated) MQ API allows applications to specify one of -four priority levels as well as desired preferences for transmission by -setting options on an envelope. These preferences currently are: - -- GNUNET_MQ_PREF_UNRELIABLE: Disables TRANSPORT waiting for ACKS on - unreliable channels like UDP. Now it is fire and forget. These - messages then cannot be used for RTT estimates either. 
- -- GNUNET_MQ_PREF_LOW_LATENCY: Directs TRANSPORT to select the - lowest-latency transmission choices possible. - -- GNUNET_MQ_PREF_CORK_ALLOWED: Allows TRANSPORT to delay transmission - to group the message with other messages into a larger batch to - reduce the number of packets sent. - -- GNUNET_MQ_PREF_GOODPUT: Directs TRANSPORT to select the highest - goodput channel available. - -- GNUNET_MQ_PREF_OUT_OF_ORDER: Allows TRANSPORT to reorder the messages - as it sees fit, otherwise TRANSPORT should attempt to preserve - transmission order. - -Each MQ envelope is always able to store those options (and the -priority), and in the future this uniform API will be used by TRANSPORT, -CORE, CADET and possibly other subsystems that send messages (like -LAKE). When CORE sets preferences and priorities, it is supposed to -respect the preferences and priorities it is given from higher layers. -Similarly, CADET also simply passes on the preferences and priorities of -the layer above CADET. When a layer combines multiple smaller messages -into one larger transmission, the ``GNUNET_MQ_env_combine_options()`` -should be used to calculate options for the combined message. We note -that the exact semantics of the options may differ by layer. For -example, CADET will always strictly implement reliable and in-order -delivery of messages, while the same options are only advisory for -TRANSPORT and CORE: they should try (using ACKs on unreliable -communicators, not changing the message order themselves), but if -messages are lost anyway (e.g. because a TCP is dropped in the middle), -or if messages are reordered (e.g. because they took different paths -over the network and arrived in a different order) TRANSPORT and CORE do -not have to correct this. Whether a preference is strict or loose is -thus dened by the respective layer. - -.. _Communicators: - -Communicators -------------- - -The API for communicators is defined in -``gnunet_transport_communication_service.h``. Each communicator must -specify its (global) communication characteristics, which for now only -say whether the communication is reliable (e.g. TCP, HTTPS) or -unreliable (e.g. UDP, WLAN). Each communicator must specify a unique -address prex, or NULL if the communicator cannot establish outgoing -connections (for example because it is only acting as a TCP server). A -communicator must tell TRANSPORT which addresses it is reachable under. -Addresses may be added or removed at any time. A communicator may have -zero addresses (transmission only). Addresses do not have to match the -address prefix. - -TRANSPORT may ask a communicator to try to connect to another address. -TRANSPORT will only ask for connections where the address matches the -communicator's address prefix that was provided when the connection was -established. Communicators should then attempt to establish a -connection. -It is under the discretion of the communicator whether to honor this request. -Reasons for not honoring such a request may be that an existing connection exists -or resource limitations. -No response is provided to TRANSPORT service on failure. -The TRANSPORT service has to ask the communicator explicitly to retry. - -If a communicator succeeds in establishing an outgoing connection for -transmission, or if a communicator receives an incoming bi-directional -connection, the communicator must inform the TRANSPORT service that a -message queue (MQ) for transmission is now available. 
-For that MQ, the communicator must provide the peer identity claimed by the other end. -It must also provide a human-readable address (for debugging) and a maximum transfer unit -(MTU). An MTU of zero means sending is not supported; SIZE_MAX should be -used to indicate that there is no MTU limit. The communicator should also tell TRANSPORT what -network type is used for the queue. The communicator may tell TRANSPORT -at any time that the queue was deleted and is no longer available. - -The communicator API also provides for flow control. First, -communicators exhibit back-pressure on TRANSPORT: the number of messages -TRANSPORT may add to a queue for transmission will be limited. So by not -draining the transmission queue, back-pressure is provided to TRANSPORT. -In the other direction, communicators may allow TRANSPORT to give -back-pressure towards the communicator by providing a non-NULL -``GNUNET_TRANSPORT_MessageCompletedCallback`` argument to the -``GNUNET_TRANSPORT_communicator_receive`` function. In this case, -TRANSPORT will only invoke this function once it has processed the -message and is ready to receive more. Communicators should then limit -how much traffic they receive based on this back-pressure. Note that -communicators do not have to provide a -``GNUNET_TRANSPORT_MessageCompletedCallback``; for example, UDP cannot -support back-pressure due to the nature of the UDP protocol. In this -case, TRANSPORT will implement its own TRANSPORT-to-TRANSPORT flow -control to reduce the sender's data rate to acceptable levels. - -TRANSPORT may notify a communicator about backchannel messages TRANSPORT -received from other peers for this communicator. Similarly, -communicators can ask TRANSPORT to try to send a backchannel message to -other communicators of other peers. The semantics of the backchannel -message are up to the communicators which use them. TRANSPORT may fail -transmitting backchannel messages, and TRANSPORT will not attempt to -retransmit them. - -UDP communicator -^^^^^^^^^^^^^^^^ - -The UDP communicator implements a basic encryption layer to protect from -metadata leakage. -The layer tries to establish a shared secret using an Elliptic-Curve Diffie-Hellman -key exchange in which the initiator of a packet creates an ephemeral key pair -to encrypt a message for the target peer identity. -The communicator always offers this kind of transmission queue to a (reachable) -peer in which messages are encrypted with dedicated keys. -The performance of this queue is not suitable for high-volume data transfer. - -If the UDP connection is bi-directional, or the TRANSPORT is able to offer a -backchannel connection, the resulting key can be re-used if the receiving peer -is able to ACK the reception. -This will cause the communicator to offer a new queue (with a higher priority -than the default queue) to TRANSPORT with a limited capacity. -The capacity is increased whenever the communicator receives an ACK for a -transmission. -This queue is suitable for high-volume data transfer and TRANSPORT will likely -prioritize this queue (if available). - -A communicator that tries to establish a connection to a target peer authenticates -its peer ID (public key) by signing a monotonic timestamp, its own peer ID and the -target peer ID, and sends this data together with the signature -in one of the first packets. -Receivers should keep track of (persist) the monotonic timestamps for each -peer ID to reject possible replay attacks. - -FIXME: Handshake wire format? KX, Flow. 
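-
-The replay-protection rule above (receivers persist the newest monotonic
-timestamp seen per peer ID and reject anything that is not strictly newer)
-can be illustrated with a small sketch. The structure and function names
-below are made up for this illustration and are not part of the actual
-communicator code:
-
-.. code-block:: c
-
-   #include <stdbool.h>
-   #include <stdint.h>
-
-   /* Hypothetical receiver-side state, persisted per sender peer ID. */
-   struct SenderReplayState
-   {
-     uint64_t last_monotonic_ts;   /* newest timestamp accepted so far */
-   };
-
-   /* Accept a signed handshake only if its monotonic timestamp is strictly
-      newer than what we have on record; otherwise treat it as a replay.
-      (Signature verification is assumed to have happened already.) */
-   static bool
-   accept_handshake (struct SenderReplayState *state,
-                     uint64_t monotonic_ts)
-   {
-     if (monotonic_ts <= state->last_monotonic_ts)
-       return false;                            /* possible replay: reject */
-     state->last_monotonic_ts = monotonic_ts;   /* persist new high-water mark */
-     return true;
-   }
-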
- -TCP communicator -^^^^^^^^^^^^^^^^ - -FIXME: Handshake wire format? KX, Flow. - -QUIC communicator -^^^^^^^^^^^^^^^^^ -The QUIC communicator runs over a bi-directional UDP connection. -TLS layer with self-signed certificates (binding/signed with peer ID?). -Single, bi-directional stream? -FIXME: Handshake wire format? KX, Flow. diff --git a/subsystems/transport.rst b/subsystems/transport.rst @@ -1,843 +0,0 @@ -.. index:: - double: TRANSPORT; subsystem - -.. _TRANSPORT-Subsystem: - -TRANSPORT — Overlay transport management -======================================== - -This chapter documents how the GNUnet transport subsystem works. The -GNUnet transport subsystem consists of three main components: the -transport API (the interface used by the rest of the system to access -the transport service), the transport service itself (most of the -interesting functions, such as choosing transports, happens here) and -the transport plugins. A transport plugin is a concrete implementation -for how two GNUnet peers communicate; many plugins exist, for example -for communication via TCP, UDP, HTTP, HTTPS and others. Finally, the -transport subsystem uses supporting code, especially the NAT/UPnP -library to help with tasks such as NAT traversal. - -Key tasks of the transport service include: - -- Create our HELLO message, notify clients and neighbours if our HELLO - changes (using NAT library as necessary) - -- Validate HELLOs from other peers (send PING), allow other peers to - validate our HELLO's addresses (send PONG) - -- Upon request, establish connections to other peers (using address - selection from ATS subsystem) and maintain them (again using PINGs - and PONGs) as long as desired - -- Accept incoming connections, give ATS service the opportunity to - switch communication channels - -- Notify clients about peers that have connected to us or that have - been disconnected from us - -- If a (stateful) connection goes down unexpectedly (without explicit - DISCONNECT), quickly attempt to recover (without notifying clients) - but do notify clients quickly if reconnecting fails - -- Send (payload) messages arriving from clients to other peers via - transport plugins and receive messages from other peers, forwarding - those to clients - -- Enforce inbound traffic limits (using flow-control if it is - applicable); outbound traffic limits are enforced by CORE, not by us - (!) - -- Enforce restrictions on P2P connection as specified by the blacklist - configuration and blacklisting clients - -Note that the term \"clients\" in the list above really refers to the -GNUnet-CORE service, as CORE is typically the only client of the -transport service. - -.. _Address-validation-protocol: - -Address validation protocol ---------------------------- - -This section documents how the GNUnet transport service validates -connections with other peers. It is a high-level description of the -protocol necessary to understand the details of the implementation. It -should be noted that when we talk about PING and PONG messages in this -section, we refer to transport-level PING and PONG messages, which are -different from core-level PING and PONG messages (both in implementation -and function). - -The goal of transport-level address validation is to minimize the -chances of a successful man-in-the-middle attack against GNUnet peers on -the transport level. 
Such an attack would not allow the adversary to -decrypt the P2P transmissions, but a successful attacker could at least -measure traffic volumes and latencies (raising the adversary's -capabilities by those of a global passive adversary in the worst case). -The scenario we are concerned about is an attacker, Mallory, giving a -``HELLO`` to Alice that claims to be for Bob, but contains Mallory's IP -address instead of Bob's (for some transport). Mallory would then forward -the traffic to Bob (by initiating a connection to Bob and claiming to be -Alice). As a further complication, the scheme has to work even if, say, -Alice is behind a NAT without traversal support and hence has no address -of her own (and thus Alice must always initiate the connection to Bob). - -An additional constraint is that ``HELLO`` messages do not contain a -cryptographic signature since other peers must be able to edit (i.e. -remove) addresses from the ``HELLO`` at any time (this was not true in -GNUnet 0.8.x). A basic **assumption** is that each peer knows the set of -possible network addresses that it **might** be reachable under (so for -example, the external IP address of the NAT plus the LAN address(es) -with the respective ports). - -The solution is the following. If Alice wants to validate that a given -address for Bob is valid (i.e. is actually established **directly** with -the intended target), she sends a PING message over that connection to -Bob. Note that in this case, Alice initiated the connection so only -Alice knows which address was used for sure (Alice may be behind NAT, so -whatever address Bob sees may not be an address Alice knows she has). -Bob checks that the address given in the ``PING`` is actually one of -Bob's addresses (i.e. it does not belong to Mallory), and if it is, sends -back a ``PONG`` (with a signature that says that Bob owns/uses the -address from the ``PING``). Alice checks the signature and is happy if -it is valid and the address in the ``PONG`` is the address Alice used. -This is similar to the 0.8.x protocol where the ``HELLO`` contained a -signature from Bob for each address used by Bob. Here, the purpose code -for the signature is ``GNUNET_SIGNATURE_PURPOSE_TRANSPORT_PONG_OWN``. -After this, Alice will remember Bob's address and consider the address -valid for a while (12h in the current implementation). Note that after -this exchange, Alice only considers Bob's address to be valid; the -connection itself is not considered 'established'. In particular, Alice -may have many addresses for Bob that Alice considers valid. - -The ``PONG`` message is protected with a nonce/challenge against replay -attacks (`replay <http://en.wikipedia.org/wiki/Replay_attack>`__) and -uses an expiration time for the signature (but those are almost -implementation details). - -.. _NAT-library: - -NAT library ----------- - -The goal of the GNUnet NAT library is to provide a general-purpose API -for NAT traversal **without** third-party support. So protocols that -involve contacting a third peer to help establish a connection between -two peers are outside of the scope of this API. That does not mean that -GNUnet doesn't support involving a third peer (we can do this with the -distance-vector transport or using application-level protocols), it just -means that the NAT API is not concerned with this possibility. The API -is written so that it will work for IPv6-NAT in the future as well as -current IPv4-NAT. 
Furthermore, the NAT API is always used, even for -peers that are not behind NAT --- in that case, the mapping provided is -simply the identity. - -NAT traversal is initiated by calling ``GNUNET_NAT_register``. Given a -set of addresses that the peer has locally bound to (TCP or UDP), the -NAT library will return (via callback) a (possibly longer) list of -addresses the peer **might** be reachable under. Internally, depending -on the configuration, the NAT library will try to punch a hole (using -UPnP) or just \"know\" that the NAT was manually punched and generate -the respective external IP address (the one that should be globally -visible) based on the given information. - -The NAT library also supports ICMP-based NAT traversal. Here, the other -peer can request connection-reversal by this peer (in this special case, -the peer is even allowed to configure a port number of zero). If the NAT -library detects a connection-reversal request, it returns the respective -target address to the client as well. It should be noted that -connection-reversal is currently only intended for TCP, so other plugins -**must** pass ``NULL`` for the reversal callback. Naturally, the NAT -library also supports requesting connection reversal from a remote peer -(``GNUNET_NAT_run_client``). - -Once initialized, the NAT handle can be used to test if a given address -is possibly a valid address for this peer (``GNUNET_NAT_test_address``). -This is used for validating our addresses when generating PONGs. - -Finally, the NAT library contains an API to test if our NAT -configuration is correct. Using ``GNUNET_NAT_test_start`` **before** -binding to the respective port, the NAT library can be used to test if -the configuration works. The test function act as a local client, -initialize the NAT traversal and then contact a ``gnunet-nat-server`` -(running by default on ``gnunet.org``) and ask for a connection to be -established. This way, it is easy to test if the current NAT -configuration is valid. - -.. _Distance_002dVector-plugin: - -Distance-Vector plugin ----------------------- - -The Distance Vector (DV) transport is a transport mechanism that allows -peers to act as relays for each other, thereby connecting peers that -would otherwise be unable to connect. This gives a larger connection set -to applications that may work better with more peers to choose from (for -example, File Sharing and/or DHT). - -The Distance Vector transport essentially has two functions. The first -is \"gossiping\" connection information about more distant peers to -directly connected peers. The second is taking messages intended for -non-directly connected peers and encapsulating them in a DV wrapper that -contains the required information for routing the message through -forwarding peers. Via gossiping, optimal routes through the known DV -neighborhood are discovered and utilized and the message encapsulation -provides some benefits in addition to simply getting the message from -the correct source to the proper destination. - -The gossiping function of DV provides an up to date routing table of -peers that are available up to some number of hops. We call this a -fisheye view of the network (like a fish, nearby objects are known while -more distant ones unknown). Gossip messages are sent only to directly -connected peers, but they are sent about other knowns peers within the -\"fisheye distance\". Whenever two peers connect, they immediately -gossip to each other about their appropriate other neighbors. 
They also -gossip about the newly connected peer to previously connected neighbors. -In order to keep the routing tables up to date, disconnect notifications -are propagated as gossip as well (because disconnects may not be -sent/received, timeouts are also used to remove stagnant routing table -entries). - -Routing of messages via DV is straightforward. When the DV transport is -notified of a message destined for a non-direct neighbor, the -appropriate forwarding peer is selected, and the base message is -encapsulated in a DV message which contains information about the -initial peer and the intended recipient. At each forwarding hop, the -initial peer is validated (the forwarding peer ensures that it has the -initial peer in its neighborhood, otherwise the message is dropped). -Next the base message is re-encapsulated in a new DV message for the -next hop in the forwarding chain (or delivered to the current peer, if -it has arrived at the destination). - -Assume a three-peer network with peers Alice, Bob and Carol. Assume that - -:: - - Alice <-> Bob and Bob <-> Carol - -are direct (e.g. over TCP or UDP transports) connections, but that Alice -cannot directly connect to Carol. This may be the case due to NAT or -firewall restrictions, or perhaps based on one of the peers' respective -configurations. If the Distance Vector transport is enabled on all three -peers, it will automatically discover (from the gossip protocol) that -Alice and Carol can connect via Bob and provide a \"virtual\" Alice <-> -Carol connection. Routing between Alice and Carol happens as follows: -Alice creates a message destined for Carol and notifies the DV transport -about it. The DV transport at Alice looks up Carol in the routing table -and finds that the message must be sent through Bob to reach Carol. The -message is encapsulated, setting Alice as the initiator and Carol as the -destination, and sent to Bob. Bob receives the message, verifies that -both Alice and Carol are known to Bob, and re-wraps the message in a new -DV message for Carol. The DV transport at Carol receives this message, -unwraps the original message, and delivers it to Carol as though it came -directly from Alice. - -.. _SMTP-plugin: - -SMTP plugin ----------- - -.. todo:: Update? - -This section describes the new SMTP transport plugin for GNUnet as it -exists in the 0.7.x and 0.8.x branch. SMTP support is currently not -available in GNUnet 0.9.x. This page also describes the transport layer -abstraction (as it existed in 0.7.x and 0.8.x) in more detail and gives -some benchmarking results. The performance results presented are quite -old and may be outdated at this point. For the readers in the year 2019, -you will notice by the mention of version 0.7, 0.8, and 0.9 that this -section has to be taken with your usual grain of salt and be updated -eventually. - -- Why use SMTP for a peer-to-peer transport? - -- How does it work? - -- How do I configure my peer? - -- How do I test if it works? - -- How fast is it? - -- Is there any additional documentation? - -.. _Why-use-SMTP-for-a-peer_002dto_002dpeer-transport_003f: - -Why use SMTP for a peer-to-peer transport? ------------------------------------------- - -There are many reasons why one would not want to use SMTP: - -- SMTP uses more bandwidth than TCP, UDP or HTTP - -- SMTP has a much higher latency. - -- SMTP requires significantly more computation (encoding and decoding - time) for the peers. - -- SMTP is significantly more complicated to configure. 
- -- SMTP may be abused by tricking GNUnet into sending mail - to non-participating third parties. - -So why would anybody want to use SMTP? - -- SMTP can be used to contact peers behind NAT boxes (in virtual - private networks). - -- SMTP can be used to circumvent policies that limit or prohibit - peer-to-peer traffic by masking as \"legitimate\" traffic. - -- SMTP uses E-mail addresses which are independent of a specific IP, - which can be useful to address peers that use dynamic IP addresses. - -- SMTP can be used to initiate a connection (e.g. initial address - exchange) and peers can then negotiate the use of a more efficient - protocol (e.g. TCP) for the actual communication. - -In summary, SMTP can for example be used to send a message to a peer -behind a NAT box that has a dynamic IP to tell the peer to establish a -TCP connection to a peer outside of the private network. Even an -extraordinary overhead for this first message would be irrelevant in -this type of situation. - -.. _How-does-it-work_003f: - -How does it work? ------------------ - -When a GNUnet peer needs to send a message to another GNUnet peer that -has advertised (only) an SMTP transport address, GNUnet base64-encodes -the message and sends it in an E-mail to the advertised address. The -advertisement contains a filter which is placed in the E-mail header, -such that the receiving host can filter the tagged E-mails and forward -it to the GNUnet peer process. The filter can be specified individually -by each peer and be changed over time. This makes it impossible to -censor GNUnet E-mail messages by searching for a generic filter. - -.. _How-do-I-configure-my-peer_003f: - -How do I configure my peer? ---------------------------- - -First, you need to configure ``procmail`` to filter your inbound E-mail -for GNUnet traffic. The GNUnet messages must be delivered into a pipe, -for example ``/tmp/gnunet.smtp``. You also need to define a filter that -is used by ``procmail`` to detect GNUnet messages. You are free to -choose whichever filter you like, but you should make sure that it does -not occur in your other E-mail. In our example, we will use -``X-mailer: GNUnet``. The ``~/.procmailrc`` configuration file then -looks like this: - -:: - - :0: - * ^X-mailer: GNUnet - /tmp/gnunet.smtp - # where do you want your other e-mail delivered to - # (default: /var/spool/mail/) - :0: /var/spool/mail/ - -After adding this file, first make sure that your regular E-mail still -works (e.g. by sending an E-mail to yourself). Then edit the GNUnet -configuration. In the section ``SMTP`` you need to specify your E-mail -address under ``EMAIL``, your mail server (for outgoing mail) under -``SERVER``, the filter (X-mailer: GNUnet in the example) under -``FILTER`` and the name of the pipe under ``PIPE``. The completed -section could then look like this: - -.. code-block:: text - - EMAIL = me@mail.gnu.org MTU = 65000 SERVER = mail.gnu.org:25 FILTER = - "X-mailer: GNUnet" PIPE = /tmp/gnunet.smtp - -.. todo:: set highlighting for this code block properly. - -Finally, you need to add ``smtp`` to the list of ``TRANSPORTS`` in the -``GNUNETD`` section. GNUnet peers will use the E-mail address that you -specified to contact your peer until the advertisement times out. Thus, -if you are not sure if everything works properly or if you are not -planning to be online for a long time, you may want to configure this -timeout to be short, e.g. just one hour. For this, set ``HELLOEXPIRES`` -to ``1`` in the ``GNUNETD`` section. 
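-
-With this configuration in place, a GNUnet message delivered over SMTP is
-simply an e-mail that carries the configured filter in its headers and the
-base64-encoded message in its body. A shortened, made-up example of what
-``procmail`` would match and deliver to the pipe (the body here is just a
-dummy base64 payload, not a real GNUnet message):
-
-.. code-block:: text
-
-   From: peer@example.org
-   To: me@mail.gnu.org
-   X-mailer: GNUnet
-   Subject: (ignored)
-
-   VGhpcyBpcyBqdXN0IGFuIGlsbHVzdHJhdGlvbg==
-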
- -This should be it, but you may probably want to test it first. - -.. _How-do-I-test-if-it-works_003f: - -How do I test if it works? --------------------------- - -Any transport can be subjected to some rudimentary tests using the -``gnunet-transport-check`` tool. The tool sends a message to the local -node via the transport and checks that a valid message is received. -While this test does not involve other peers and can not check if -firewalls or other network obstacles prohibit proper operation, this is -a great testcase for the SMTP transport since it tests pretty much -nearly all of the functionality. - -``gnunet-transport-check`` should only be used without running -``gnunetd`` at the same time. By default, ``gnunet-transport-check`` -tests all transports that are specified in the configuration file. But -you can specifically test SMTP by giving the option -``--transport=smtp``. - -Note that this test always checks if a transport can receive and send. -While you can configure most transports to only receive or only send -messages, this test will only work if you have configured the transport -to send and receive messages. - -.. _How-fast-is-it_003f: - -How fast is it? ---------------- - -We have measured the performance of the UDP, TCP and SMTP transport -layer directly and when used from an application using the GNUnet core. -Measuring just the transport layer gives the better view of the actual -overhead of the protocol, whereas evaluating the transport from the -application puts the overhead into perspective from a practical point of -view. - -The loopback measurements of the SMTP transport were performed on three -different machines spanning a range of modern SMTP configurations. We -used a PIII-800 running RedHat 7.3 with the Purdue Computer Science -configuration which includes filters for spam. We also used a Xenon 2 -GHZ with a vanilla RedHat 8.0 sendmail configuration. Furthermore, we -used qmail on a PIII-1000 running Sorcerer GNU Linux (SGL). The numbers -for UDP and TCP are provided using the SGL configuration. The qmail -benchmark uses qmail's internal filtering whereas the sendmail -benchmarks relies on procmail to filter and deliver the mail. We used -the transport layer to send a message of b bytes (excluding transport -protocol headers) directly to the local machine. This way, network -latency and packet loss on the wire have no impact on the timings. n -messages were sent sequentially over the transport layer, sending -message i+1 after the i-th message was received. All messages were sent -over the same connection and the time to establish the connection was -not taken into account since this overhead is minuscule in practice --- -as long as a connection is used for a significant number of messages. 
- -+--------------+----------+----------+----------+----------+----------+ -| Transport | UDP | TCP | SMTP | SMTP (RH | SMTP | -| | | | (Purdue | 8.0) | (SGL | -| | | | s | | qmail) | -| | | | endmail) | | | -+==============+==========+==========+==========+==========+==========+ -| 11 bytes | 31 ms | 55 ms | 781 s | 77 s | 24 s | -+--------------+----------+----------+----------+----------+----------+ -| 407 bytes | 37 ms | 62 ms | 789 s | 78 s | 25 s | -+--------------+----------+----------+----------+----------+----------+ -| 1,221 bytes | 46 ms | 73 ms | 804 s | 78 s | 25 s | -+--------------+----------+----------+----------+----------+----------+ - -The benchmarks show that UDP and TCP are, as expected, both -significantly faster compared with any of the SMTP services. Among the -SMTP implementations, there can be significant differences depending on -the SMTP configuration. Filtering with an external tool like procmail -that needs to re-parse its configuration for each mail can be very -expensive. Applying spam filters can also significantly impact the -performance of the underlying SMTP implementation. The microbenchmark -shows that SMTP can be a viable solution for initiating peer-to-peer -sessions: a couple of seconds to connect to a peer are probably not even -going to be noticed by users. The next benchmark measures the possible -throughput for a transport. Throughput can be measured by sending -multiple messages in parallel and measuring packet loss. Note that not -only UDP but also the TCP transport can actually lose messages since -the TCP implementation drops messages if the ``write`` to the socket -would block. While the SMTP protocol never drops messages itself, it is -often so slow that only a fraction of the messages can be sent and -received in the given time-bounds. For this benchmark we report the -message loss after allowing t time for sending m messages. If messages -were not sent (or received) after an overall timeout of t, they were -considered lost. The benchmark was performed using two Xeon 2 GHz -machines running RedHat 8.0 with sendmail. The machines were connected -with a direct 100 MBit Ethernet connection. Figures udp1200, tcp1200 and -smtp-MTUs show that the throughput for messages of size 1,200 octets is -2,343 kbps, 3,310 kbps and 6 kbps for UDP, TCP and SMTP respectively. -The high per-message overhead of SMTP can be improved by increasing the -MTU, for example, an MTU of 12,000 octets improves the throughput to 13 -kbps as figure smtp-MTUs shows. Our research paper [Transport2014]_ has -some more details on the benchmarking results. - -.. _Bluetooth-plugin: - -Bluetooth plugin ---------------- - -This page describes the new Bluetooth transport plugin for GNUnet. The -plugin is still in the testing stage, so don't expect it to work -perfectly. If you have any questions or problems just post them here or -ask on the IRC channel. - -- What do I need to use the Bluetooth plugin transport? - -- How does it work? - -- What possible errors should I be aware of? - -- How do I configure my peer? - -- How can I test it? - -.. _What-do-I-need-to-use-the-Bluetooth-plugin-transport_003f: - -What do I need to use the Bluetooth plugin transport? ------------------------------------------------------ - -If you are a GNU/Linux user and you want to use the Bluetooth transport -plugin you should install the ``BlueZ`` development libraries (if they -aren't already installed). 
For instructions about how to install the -libraries you should check out the BlueZ site -(`http://www.bluez.org <http://www.bluez.org/>`__). If you don't know if -you have the necessary libraries, don't worry, just run the GNUnet -configure script and you will be able to see a notification at the end -which will warn you if you don't have the necessary libraries. - -.. _How-does-it-work2_003f: - -.. todo:: Change to unique title? - -How does it work2? ------------------- - -The Bluetooth transport plugin uses virtually the same code as the WLAN -plugin and only the helper binary is different. The helper takes a -single argument, which represents the interface name and is specified in -the configuration file. Here are the basic steps that are followed by -the helper binary used on GNU/Linux: - -- it verifies if the name corresponds to a Bluetooth interface name - -- it verifies if the interface is up (if it is not, it tries to bring - it up) - -- it tries to enable the page and inquiry scan in order to make the - device discoverable and to accept incoming connection requests *The - above operations require root access so you should start the - transport plugin with root privileges.* - -- it finds an available port number and registers a SDP service which - will be used to find out on which port number is the server listening - on and switch the socket in listening mode - -- it sends a HELLO message with its address - -- finally it forwards traffic from the reading sockets to the STDOUT - and from the STDIN to the writing socket - -Once in a while the device will make an inquiry scan to discover the -nearby devices and it will send them randomly HELLO messages for peer -discovery. - -.. _What-possible-errors-should-I-be-aware-of_003f: - -What possible errors should I be aware of? ------------------------------------------- - -*This section is dedicated for GNU/Linux users* - -Well there are many ways in which things could go wrong but I will try -to present some tools that you could use to debug and some scenarios. - -- ``bluetoothd -n -d`` : use this command to enable logging in the - foreground and to print the logging messages - -- ``hciconfig``: can be used to configure the Bluetooth devices. If you - run it without any arguments it will print information about the - state of the interfaces. So if you receive an error that the device - couldn't be brought up you should try to bring it manually and to see - if it works (use ``hciconfig -a hciX up``). If you can't and the - Bluetooth address has the form 00:00:00:00:00:00 it means that there - is something wrong with the D-Bus daemon or with the Bluetooth - daemon. Use ``bluetoothd`` tool to see the logs - -- ``sdptool`` can be used to control and interrogate SDP servers. If - you encounter problems regarding the SDP server (like the SDP server - is down) you should check out if the D-Bus daemon is running - correctly and to see if the Bluetooth daemon started correctly(use - ``bluetoothd`` tool). Also, sometimes the SDP service could work but - somehow the device couldn't register its service. Use - ``sdptool browse [dev-address]`` to see if the service is registered. - There should be a service with the name of the interface and GNUnet - as provider. - -- ``hcitool`` : another useful tool which can be used to configure the - device and to send some particular commands to it. - -- ``hcidump`` : could be used for low level debugging - -.. _How-do-I-configure-my-peer2_003f: - -.. todo:: Fix name/referencing now that we're using Sphinx. 
- -How do I configure my peer2? ----------------------------- - -On GNU/Linux, you just have to be sure that the interface name -corresponds to the one that you want to use. Use the ``hciconfig`` tool -to check that. By default it is set to hci0 but you can change it. - -A basic configuration looks like this: - -:: - - [transport-bluetooth] - # Name of the interface (typically hciX) - INTERFACE = hci0 - # Real hardware, no testing - TESTMODE = 0 TESTING_IGNORE_KEYS = ACCEPT_FROM; - -In order to use the Bluetooth transport plugin when the transport -service is started, you must add the plugin name to the default -transport service plugins list. For example: - -:: - - [transport] ... PLUGINS = dns bluetooth ... - -If you want to use only the Bluetooth plugin set *PLUGINS = bluetooth* - -On Windows, you cannot specify which device to use. The only thing that -you should do is to add *bluetooth* on the plugins list of the transport -service. - -.. _How-can-I-test-it_003f: - -How can I test it? ------------------- - -If you have two Bluetooth devices on the same machine and you are using -GNU/Linux you must: - -- create two different file configuration (one which will use the first - interface (*hci0*) and the other which will use the second interface - (*hci1*)). Let's name them *peer1.conf* and *peer2.conf*. - -- run *gnunet-peerinfo -c peerX.conf -s* in order to generate the peers - private keys. The **X** must be replace with 1 or 2. - -- run *gnunet-arm -c peerX.conf -s -i=transport* in order to start the - transport service. (Make sure that you have \"bluetooth\" on the - transport plugins list if the Bluetooth transport service doesn't - start.) - -- run *gnunet-peerinfo -c peer1.conf -s* to get the first peer's ID. If - you already know your peer ID (you saved it from the first command), - this can be skipped. - -- run *gnunet-transport -c peer2.conf -p=PEER1_ID -s* to start sending - data for benchmarking to the other peer. - -This scenario will try to connect the second peer to the first one and -then start sending data for benchmarking. - -If you have two different machines and your configuration files are good -you can use the same scenario presented on the beginning of this -section. - -Another way to test the plugin functionality is to create your own -application which will use the GNUnet framework with the Bluetooth -transport service. - -.. _The-implementation-of-the-Bluetooth-transport-plugin: - -The implementation of the Bluetooth transport plugin ----------------------------------------------------- - -This page describes the implementation of the Bluetooth transport -plugin. - -First I want to remind you that the Bluetooth transport plugin uses -virtually the same code as the WLAN plugin and only the helper binary is -different. Also the scope of the helper binary from the Bluetooth -transport plugin is the same as the one used for the WLAN transport -plugin: it accesses the interface and then it forwards traffic in both -directions between the Bluetooth interface and stdin/stdout of the -process involved. - -The Bluetooth plugin transport could be used both on GNU/Linux and -Windows platforms. - -- Linux functionality - -- Pending Features - -.. _Linux-functionality: - -Linux functionality -^^^^^^^^^^^^^^^^^^^ - -In order to implement the plugin functionality on GNU/Linux I used the -BlueZ stack. For the communication with the other devices I used the -RFCOMM protocol. Also I used the HCI protocol to gain some control over -the device. 
The helper binary takes a single argument (the name of the -Bluetooth interface) and is separated in two stages: - -.. _THE-INITIALIZATION: - -.. todo:: 'THE INITIALIZATION' should be in bigger letters or stand out, not - starting a new section? - -THE INITIALIZATION -^^^^^^^^^^^^^^^^^^ - -- first, it checks if we have root privileges (*Remember that we need - to have root privileges in order to be able to bring the interface up - if it is down or to change its state.*). - -- second, it verifies if the interface with the given name exists. - - **If the interface with that name exists and it is a Bluetooth - interface:** - -- it creates a RFCOMM socket which will be used for listening and call - the *open_device* method - - On the *open_device* method: - - - creates a HCI socket used to send control events to the device - - - searches for the device ID using the interface name - - - saves the device MAC address - - - checks if the interface is down and tries to bring it UP - - - checks if the interface is in discoverable mode and tries to make - it discoverable - - - closes the HCI socket and binds the RFCOMM one - - - switches the RFCOMM socket in listening mode - - - registers the SDP service (the service will be used by the other - devices to get the port on which this device is listening on) - -- drops the root privileges - - **If the interface is not a Bluetooth interface the helper exits with - a suitable error** - -.. _THE-LOOP: - -THE LOOP -^^^^^^^^ - -The helper binary uses a list where it saves all the connected neighbour -devices (*neighbours.devices*) and two buffers (*write_pout* and -*write_std*). The first message which is send is a control message with -the device's MAC address in order to announce the peer presence to the -neighbours. Here are a short description of what happens in the main -loop: - -- Every time when it receives something from the STDIN it processes the - data and saves the message in the first buffer (*write_pout*). When - it has something in the buffer, it gets the destination address from - the buffer, searches the destination address in the list (if there is - no connection with that device, it creates a new one and saves it to - the list) and sends the message. - -- Every time when it receives something on the listening socket it - accepts the connection and saves the socket on a list with the - reading sockets. - -- Every time when it receives something from a reading socket it parses - the message, verifies the CRC and saves it in the *write_std* buffer - in order to be sent later to the STDOUT. - -So in the main loop we use the select function to wait until one of the -file descriptor saved in one of the two file descriptors sets used is -ready to use. The first set (*rfds*) represents the reading set and it -could contain the list with the reading sockets, the STDIN file -descriptor or the listening socket. The second set (*wfds*) is the -writing set and it could contain the sending socket or the STDOUT file -descriptor. After the select function returns, we check which file -descriptor is ready to use and we do what is supposed to do on that kind -of event. *For example:* if it is the listening socket then we accept a -new connection and save the socket in the reading list; if it is the -STDOUT file descriptor, then we write to STDOUT the message from the -*write_std* buffer. - -To find out on which port a device is listening on we connect to the -local SDP server and search the registered service for that device. 
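-
-The select()-based structure described above can be sketched as follows.
-This is only an illustration of the control flow: the real helper keeps
-richer state, performs GNUnet framing and CRC checks, and the names used
-here (``listening_sock``, ``reading_socks``, ``have_data_for_stdout``)
-are made up for this sketch.
-
-.. code-block:: c
-
-   #include <sys/select.h>
-   #include <unistd.h>
-
-   extern int listening_sock;        /* RFCOMM socket in listening mode   */
-   extern int reading_socks[32];     /* sockets of connected neighbours   */
-   extern int num_reading;
-   extern int have_data_for_stdout;  /* the write_std buffer is non-empty */
-
-   static void
-   helper_loop (void)
-   {
-     for (;;)
-     {
-       fd_set rfds;
-       fd_set wfds;
-       int maxfd = listening_sock;
-
-       FD_ZERO (&rfds);
-       FD_ZERO (&wfds);
-       FD_SET (STDIN_FILENO, &rfds);        /* messages from the plugin       */
-       FD_SET (listening_sock, &rfds);      /* incoming neighbour connections */
-       for (int i = 0; i < num_reading; i++)
-       {
-         FD_SET (reading_socks[i], &rfds);  /* data from neighbours           */
-         if (reading_socks[i] > maxfd)
-           maxfd = reading_socks[i];
-       }
-       if (have_data_for_stdout)
-         FD_SET (STDOUT_FILENO, &wfds);     /* flush write_std to the plugin  */
-       if (STDOUT_FILENO > maxfd)
-         maxfd = STDOUT_FILENO;
-
-       if (select (maxfd + 1, &rfds, &wfds, NULL, NULL) < 0)
-         break;                             /* the real code retries on EINTR */
-
-       /* Dispatch: accept () on listening_sock, read STDIN into write_pout,
-          read neighbour sockets into write_std, write write_std to STDOUT. */
-     }
-   }
-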
- -*You should be aware of the fact that if the device fails to connect to -another one when trying to send a message it will attempt one more time. -If it fails again, then it skips the message.* *Also you should know -that the transport Bluetooth plugin has support for*\ **broadcast -messages**\ *.* - -.. _Details-about-the-broadcast-implementation: - -Details about the broadcast implementation -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -First I want to point out that the broadcast functionality for the -CONTROL messages is not implemented in a conventional way. Since the -inquiry scan time is too big and it will take some time to send a -message to all the discoverable devices I decided to tackle the problem -in a different way. Here is how I did it: - -- If it is the first time when I have to broadcast a message I make an - inquiry scan and save all the devices' addresses to a vector. - -- After the inquiry scan ends I take the first address from the list - and I try to connect to it. If it fails, I try to connect to the next - one. If it succeeds, I save the socket to a list and send the message - to the device. - -- When I have to broadcast another message, first I search on the list - for a new device which I'm not connected to. If there is no new - device on the list I go to the beginning of the list and send the - message to the old devices. After 5 cycles I make a new inquiry scan - to check out if there are new discoverable devices and save them to - the list. If there are no new discoverable devices I reset the - cycling counter and go again through the old list and send messages - to the devices saved in it. - -**Therefore**: - -- every time when I have a broadcast message I look up on the list for - a new device and send the message to it - -- if I reached the end of the list for 5 times and I'm connected to all - the devices from the list I make a new inquiry scan. *The number of - the list's cycles after an inquiry scan could be increased by - redefining the MAX_LOOPS variable* - -- when there are no new devices I send messages to the old ones. - -Doing so, the broadcast control messages will reach the devices but with -delay. - -*NOTICE:* When I have to send a message to a certain device first I -check on the broadcast list to see if we are connected to that device. -If not we try to connect to it and in case of success we save the -address and the socket on the list. If we are already connected to that -device we simply use the socket. - -.. _Pending-features: - -Pending features -^^^^^^^^^^^^^^^^ - -- Implement a testcase for the helper : *The testcase consists of a - program which emulates the plugin and uses the helper. It will - simulate connections, disconnections and data transfers.* - -If you have a new idea about a feature of the plugin or suggestions -about how I could improve the implementation you are welcome to comment -or to contact me. - -.. _WLAN-plugin: - -WLAN plugin ------------ - -This section documents how the wlan transport plugin works. Parts which -are not implemented yet or could be better implemented are described at -the end. - -.. [Transport2014] https://bib.gnunet.org/date.html#paper_5fshort2014 diff --git a/subsystems/vpnstack.rst b/subsystems/vpnstack.rst @@ -1,6 +0,0 @@ - -VPN and VPN Support -=================== - -.. toctree:: - diff --git a/users/index.rst b/users/index.rst @@ -21,6 +21,7 @@ welcome. 
:maxdepth: 2 start + subsystems gns reclaim fs diff --git a/users/subsystems.rst b/users/subsystems.rst @@ -0,0 +1,2111 @@ +Subsystems +========== + +These services comprise a backbone of core services for +peer-to-peer applications to use. + +CADET - Decentralized End-to-end Transport +------------------------------------------ + +The Confidential Ad-hoc Decentralized End-to-end Transport (CADET) subsystem +in GNUnet is responsible for secure end-to-end +communications between nodes in the GNUnet overlay network. CADET builds +on the CORE subsystem, which provides for the link-layer communication, +by adding routing, forwarding, and additional security to the +connections. CADET offers the same cryptographic services as CORE, but +on an end-to-end level. This is done so peers retransmitting traffic on +behalf of other peers cannot access the payload data. + +- CADET provides confidentiality with so-called perfect forward + secrecy; we use ECDHE powered by Curve25519 for the key exchange and + then use symmetric encryption, encrypting with both AES-256 and + Twofish + +- authentication is achieved by signing the ephemeral keys using + Ed25519, a deterministic variant of ECDSA + +- integrity protection (using SHA-512 to do encrypt-then-MAC, although + only 256 bits are sent to reduce overhead) + +- replay protection (using nonces, timestamps, challenge-response, + message counters and ephemeral keys) + +- liveness (keep-alive messages, timeout) + +Additional to the CORE-like security benefits, CADET offers other +properties that make it a more universal service than CORE. + +- CADET can establish channels to arbitrary peers in GNUnet. If a peer + is not immediately reachable, CADET will find a path through the + network and ask other peers to retransmit the traffic on its behalf. + +- CADET offers (optional) reliability mechanisms. In a reliable channel + traffic is guaranteed to arrive complete, unchanged and in-order. + +- CADET takes care of flow and congestion control mechanisms, not + allowing the sender to send more traffic than the receiver or the + network are able to process. + +.. _CORE-Subsystem: + +.. index:: + double: CORE; subsystem + +CORE - GNUnet link layer +======================== + +The CORE subsystem in GNUnet is responsible for securing link-layer +communications between nodes in the GNUnet overlay network. 
CORE builds +on the TRANSPORT subsystem which provides for the actual, insecure, +unreliable link-layer communication (for example, via UDP or WLAN), and +then adds fundamental security to the connections: + +- confidentiality with so-called perfect forward secrecy; we use ECDHE + (`Elliptic-curve + Diffie—Hellman <http://en.wikipedia.org/wiki/Elliptic_curve_Diffie%E2%80%93Hellman>`__) + powered by Curve25519 (`Curve25519 <http://cr.yp.to/ecdh.html>`__) + for the key exchange and then use symmetric encryption, encrypting + with both AES-256 + (`AES-256 <http://en.wikipedia.org/wiki/Rijndael>`__) and Twofish + (`Twofish <http://en.wikipedia.org/wiki/Twofish>`__) + +- `authentication <http://en.wikipedia.org/wiki/Authentication>`__ is + achieved by signing the ephemeral keys using Ed25519 + (`Ed25519 <http://ed25519.cr.yp.to/>`__), a deterministic variant of + ECDSA (`ECDSA <http://en.wikipedia.org/wiki/ECDSA>`__) + +- integrity protection (using SHA-512 + (`SHA-512 <http://en.wikipedia.org/wiki/SHA-2>`__) to do + encrypt-then-MAC + (`encrypt-then-MAC <http://en.wikipedia.org/wiki/Authenticated_encryption>`__)) + +- Replay (`replay <http://en.wikipedia.org/wiki/Replay_attack>`__) + protection (using nonces, timestamps, challenge-response, message + counters and ephemeral keys) + +- liveness (keep-alive messages, timeout) + +.. _Limitations: + +:index:`Limitations <CORE; limitations>` +Limitations +----------- + +CORE does not perform +`routing <http://en.wikipedia.org/wiki/Routing>`__; using CORE it is +only possible to communicate with peers that happen to already be +\"directly\" connected with each other. CORE also does not have an API +to allow applications to establish such \"direct\" connections --- for +this, applications can ask TRANSPORT, but TRANSPORT might not be able to +establish a \"direct\" connection. The TOPOLOGY subsystem is responsible +for trying to keep a few \"direct\" connections open at all times. +Applications that need to talk to particular peers should use the CADET +subsystem, as it can establish arbitrary \"indirect\" connections. + +Because CORE does not perform routing, CORE must only be used directly +by applications that either perform their own routing logic (such as +anonymous file-sharing) or that do not require routing, for example +because they are based on flooding the network. CORE communication is +unreliable and delivery is possibly out-of-order. Applications that +require reliable communication should use the CADET service. Each +application can only queue one message per target peer with the CORE +service at any time; messages cannot be larger than approximately 63 +kilobytes. If messages are small, CORE may group multiple messages +(possibly from different applications) prior to encryption. If permitted +by the application (using the `cork <http://baus.net/on-tcp_cork/>`__ +option), CORE may delay transmissions to facilitate grouping of multiple +small messages. If cork is not enabled, CORE will transmit the message +as soon as TRANSPORT allows it (TRANSPORT is responsible for limiting +bandwidth and congestion control). CORE does not allow flow control; +applications are expected to process messages at line-speed. If flow +control is needed, applications should use the CADET service. + +.. when is a peer connected +.. _When-is-a-peer-_0022connected_0022_003f: + +When is a peer \"connected\"? 
+----------------------------- + +In addition to the security features mentioned above, CORE also provides +one additional key feature to applications using it, and that is a +limited form of protocol-compatibility checking. CORE distinguishes +between TRANSPORT-level connections (which enable communication with +other peers) and application-level connections. Applications using the +CORE API will (typically) learn about application-level connections from +CORE, and not about TRANSPORT-level connections. When a typical +application uses CORE, it will specify a set of message types (from +``gnunet_protocols.h``) that it understands. CORE will then notify the +application about connections it has with other peers if and only if +those applications registered an intersecting set of message types with +their CORE service. Thus, it is quite possible that CORE only exposes a +subset of the established direct connections to a particular application +--- and different applications running above CORE might see different +sets of connections at the same time. + +A special case are applications that do not register a handler for any +message type. CORE assumes that these applications merely want to +monitor connections (or \"all\" messages via other callbacks) and will +notify those applications about all connections. This is used, for +example, by the ``gnunet-core`` command-line tool to display the active +connections. Note that it is also possible that the TRANSPORT service +has more active connections than the CORE service, as the CORE service +first has to perform a key exchange with connecting peers before +exchanging information about supported message types and notifying +applications about the new connection. +.. _Distributed-Hash-Table-_0028DHT_0029: + +.. index:: + double: Distributed hash table; subsystem + see: DHT; Distributed hash table + +DHT - Distributed Hash Table +============================ + +GNUnet includes a generic distributed hash table that can be used by +developers building P2P applications in the framework. This section +documents high-level features and how developers are expected to use the +DHT. We have a research paper detailing how the DHT works. Also, Nate's +thesis includes a detailed description and performance analysis (in +chapter 6). [R5N2011]_ + +.. todo:: Confirm: Are "Nate's thesis" and the "research paper" separate + entities? + +Key features of GNUnet's DHT include: + +- stores key-value pairs with values up to (approximately) 63k in size + +- works with many underlay network topologies (small-world, random + graph), underlay does not need to be a full mesh / clique + +- support for extended queries (more than just a simple 'key'), + filtering duplicate replies within the network (bloomfilter) and + content validation (for details, please read the subsection on the + block library) + +- can (optionally) return paths taken by the PUT and GET operations to + the application + +- provides content replication to handle churn + +GNUnet's DHT is randomized and unreliable. Unreliable means that there +is no strict guarantee that a value stored in the DHT is always found +— values are only found with high probability. While this is somewhat +true in all P2P DHTs, GNUnet developers should be particularly wary of +this fact (this will help you write secure, fault-tolerant code). 
Thus, +when writing any application using the DHT, you should always consider +the possibility that a value stored in the DHT by you or some other peer +might simply not be returned, or returned with a significant delay. Your +application logic must be written to tolerate this (naturally, some loss +of performance or quality of service is expected in this case). + +.. _Block-library-and-plugins: + +Block library and plugins +------------------------- + +.. _What-is-a-Block_003f: + +What is a Block? +^^^^^^^^^^^^^^^^ + +Blocks are small (< 63k) pieces of data stored under a key (struct +GNUNET_HashCode). Blocks have a type (enum GNUNET_BlockType) which +defines their data format. Blocks are used in GNUnet as units of static +data exchanged between peers and stored (or cached) locally. Uses of +blocks include file-sharing (the files are broken up into blocks), the +VPN (DNS information is stored in blocks) and the DHT (all information +in the DHT and meta-information for the maintenance of the DHT are both +stored using blocks). The block subsystem provides a few common +functions that must be available for any type of block. + + +.. [R5N2011] https://bib.gnunet.org/date.html#R5N +.. index:: + double: File sharing; subsystem + see: FS; File sharing + +.. _File_002dsharing-_0028FS_0029-Subsystem: + +FS — File sharing over GNUnet +============================= + +This chapter describes the details of how the file-sharing service +works. As with all services, it is split into an API (libgnunetfs), the +service process (gnunet-service-fs) and user interface(s). The +file-sharing service uses the datastore service to store blocks and the +DHT (and indirectly datacache) for lookups for non-anonymous +file-sharing. Furthermore, the file-sharing service uses the block +library (and the block fs plugin) for validation of DHT operations. + +In contrast to many other services, libgnunetfs is rather complex since +the client library includes a large number of high-level abstractions; +this is necessary since the FS service itself largely only operates on +the block level. The FS library is responsible for providing a +file-based abstraction to applications, including directories, meta +data, keyword search, verification, and so on. + +The method used by GNUnet to break large files into blocks and to use +keyword search is called the \"Encoding for Censorship Resistant +Sharing\" (ECRS). ECRS is largely implemented in the fs library; block +validation is also reflected in the block FS plugin and the FS service. +ECRS on-demand encoding is implemented in the FS service. + +.. note:: The documentation in this chapter is quite incomplete. + +.. _Encoding-for-Censorship_002dResistant-Sharing-_0028ECRS_0029: + +.. index:: + see: Encoding for Censorship-Resistant Sharing; ECRS + +:index:`ECRS — Encoding for Censorship-Resistant Sharing <single: ECRS>` +ECRS — Encoding for Censorship-Resistant Sharing +------------------------------------------------ + +When GNUnet shares files, it uses a content encoding that is called +ECRS, the Encoding for Censorship-Resistant Sharing. Most of ECRS is +described in the (so far unpublished) research paper attached to this +page. ECRS obsoletes the previous ESED and ESED II encodings which were +used in GNUnet before version 0.7.0. The rest of this page assumes that +the reader is familiar with the attached paper. What follows is a +description of some minor extensions that GNUnet makes over what is +described in the paper. 
The reason why these extensions are not in the +paper is that we felt that they were obvious or trivial extensions to +the original scheme and thus did not warrant space in the research +report. + +.. todo:: Find missing link to file system paper. +.. index:: + double: GNU Name System; subsystem + see: GNS; GNU Name System + +.. _GNU-Name-System-_0028GNS_0029: + +GNS - the GNU Name System +========================= + +The GNU Name System (GNS) is a decentralized database that enables users +to securely resolve names to values. Names can be used to identify other +users (for example, in social networking), or network services (for +example, VPN services running at a peer in GNUnet, or purely IP-based +services on the Internet). Users interact with GNS by typing in a +hostname that ends in a top-level domain that is configured in the "GNS" +section, matches an identity of the user or ends in a Base32-encoded +public key. + +Videos giving an overview of most of the GNS and the motivations behind +it is available here and here. The remainder of this chapter targets +developers that are familiar with high level concepts of GNS as +presented in these talks. + +.. todo:: Link to videos and GNS talks? + +GNS-aware applications should use the GNS resolver to obtain the +respective records that are stored under that name in GNS. Each record +consists of a type, value, expiration time and flags. + +The type specifies the format of the value. Types below 65536 correspond +to DNS record types, larger values are used for GNS-specific records. +Applications can define new GNS record types by reserving a number and +implementing a plugin (which mostly needs to convert the binary value +representation to a human-readable text format and vice-versa). The +expiration time specifies how long the record is to be valid. The GNS +API ensures that applications are only given non-expired values. The +flags are typically irrelevant for applications, as GNS uses them +internally to control visibility and validity of records. + +Records are stored along with a signature. The signature is generated +using the private key of the authoritative zone. This allows any GNS +resolver to verify the correctness of a name-value mapping. + +Internally, GNS uses the NAMECACHE to cache information obtained from +other users, the NAMESTORE to store information specific to the local +users, and the DHT to exchange data between users. A plugin API is used +to enable applications to define new GNS record types. + +.. index:: + double: HOSTLIST; subsystem + +.. _HOSTLIST-Subsystem: + +HOSTLIST — HELLO bootstrapping and gossip +========================================= + +Peers in the GNUnet overlay network need address information so that +they can connect with other peers. GNUnet uses so called HELLO messages +to store and exchange peer addresses. GNUnet provides several methods +for peers to obtain this information: + +- out-of-band exchange of HELLO messages (manually, using for example + gnunet-peerinfo) + +- HELLO messages shipped with GNUnet (automatic with distribution) + +- UDP neighbor discovery in LAN (IPv4 broadcast, IPv6 multicast) + +- topology gossiping (learning from other peers we already connected + to), and + +- the HOSTLIST daemon covered in this section, which is particularly + relevant for bootstrapping new peers. 
+ +New peers have no existing connections (and thus cannot learn from +gossip among peers), may not have other peers in their LAN and might be +started with an outdated set of HELLO messages from the distribution. In +this case, getting new peers to connect to the network requires either +manual effort or the use of a HOSTLIST to obtain HELLOs. + +.. _HELLOs: + +HELLOs +------ + +The basic information peers require to connect to other peers are +contained in so called HELLO messages you can think of as a business +card. Besides the identity of the peer (based on the cryptographic +public key) a HELLO message may contain address information that +specifies ways to contact a peer. By obtaining HELLO messages, a peer +can learn how to contact other peers. + +.. _Overview-for-the-HOSTLIST-subsystem: + +Overview for the HOSTLIST subsystem +----------------------------------- + +The HOSTLIST subsystem provides a way to distribute and obtain contact +information to connect to other peers using a simple HTTP GET request. +Its implementation is split in three parts, the main file for the +daemon itself (``gnunet-daemon-hostlist.c``), the HTTP client used to +download peer information (``hostlist-client.c``) and the server +component used to provide this information to other peers +(``hostlist-server.c``). The server is basically a small HTTP web server +(based on GNU libmicrohttpd) which provides a list of HELLOs known to +the local peer for download. The client component is basically a HTTP +client (based on libcurl) which can download hostlists from one or more +websites. The hostlist format is a binary blob containing a sequence of +HELLO messages. Note that any HTTP server can theoretically serve a +hostlist, the built-in hostlist server makes it simply convenient to +offer this service. + +.. _Features: + +Features +^^^^^^^^ + +The HOSTLIST daemon can: + +- provide HELLO messages with validated addresses obtained from + PEERINFO to download for other peers + +- download HELLO messages and forward these message to the TRANSPORT + subsystem for validation + +- advertises the URL of this peer's hostlist address to other peers via + gossip + +- automatically learn about hostlist servers from the gossip of other + peers + +.. _HOSTLIST-_002d-Limitations: + +HOSTLIST - Limitations +^^^^^^^^^^^^^^^^^^^^^^ + +The HOSTLIST daemon does not: + +- verify the cryptographic information in the HELLO messages + +- verify the address information in the HELLO messages + +.. _Interacting-with-the-HOSTLIST-daemon: + +Interacting with the HOSTLIST daemon +------------------------------------ + +The HOSTLIST subsystem is currently implemented as a daemon, so there is +no need for the user to interact with it and therefore there is no +command line tool and no API to communicate with the daemon. In the +future, we can envision changing this to allow users to manually trigger +the download of a hostlist. + +Since there is no command line interface to interact with HOSTLIST, the +only way to interact with the hostlist is to use STATISTICS to obtain or +modify information about the status of HOSTLIST: + +:: + + $ gnunet-statistics -s hostlist + +In particular, HOSTLIST includes a **persistent** value in statistics +that specifies when the hostlist server might be queried next. As this +value is exponentially increasing during runtime, developers may want to +reset or manually adjust it. 
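For example, the value could be overwritten with ``gnunet-statistics``
(a sketch; the exact statistic name is not reproduced here, and the
syntax for setting values may differ between versions):

::

   # Overwrite a persistent hostlist value (hypothetical statistic name)
   $ gnunet-statistics -s hostlist -n "# next hostlist download (ms)" 0
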
Note that HOSTLIST (but not STATISTICS) +needs to be shutdown if changes to this value are to have any effect on +the daemon (as HOSTLIST does not monitor STATISTICS for changes to the +download frequency). + +.. _Hostlist-security-address-validation: + +Hostlist security address validation +------------------------------------ + +Since information obtained from other parties cannot be trusted without +validation, we have to distinguish between *validated* and *not +validated* addresses. Before using (and so trusting) information from +other parties, this information has to be double-checked (validated). +Address validation is not done by HOSTLIST but by the TRANSPORT service. + +The HOSTLIST component is functionally located between the PEERINFO and +the TRANSPORT subsystem. When acting as a server, the daemon obtains +valid (*validated*) peer information (HELLO messages) from the PEERINFO +service and provides it to other peers. When acting as a client, it +contacts the HOSTLIST servers specified in the configuration, downloads +the (unvalidated) list of HELLO messages and forwards these information +to the TRANSPORT server to validate the addresses. + +.. _The-HOSTLIST-daemon: + +:index:`The HOSTLIST daemon <double: daemon; HOSTLIST>` +The HOSTLIST daemon +------------------- + +The hostlist daemon is the main component of the HOSTLIST subsystem. It +is started by the ARM service and (if configured) starts the HOSTLIST +client and server components. + +GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT +If the daemon provides a hostlist itself it can advertise it's own +hostlist to other peers. To do so it sends a +``GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT`` message to other peers +when they connect to this peer on the CORE level. This hostlist +advertisement message contains the URL to access the HOSTLIST HTTP +server of the sender. The daemon may also subscribe to this type of +message from CORE service, and then forward these kind of message to the +HOSTLIST client. The client then uses all available URLs to download +peer information when necessary. + +When starting, the HOSTLIST daemon first connects to the CORE subsystem +and if hostlist learning is enabled, registers a CORE handler to receive +this kind of messages. Next it starts (if configured) the client and +server. It passes pointers to CORE connect and disconnect and receive +handlers where the client and server store their functions, so the +daemon can notify them about CORE events. + +To clean up on shutdown, the daemon has a cleaning task, shutting down +all subsystems and disconnecting from CORE. + +.. _The-HOSTLIST-server: + +:index:`The HOSTLIST server <single: HOSTLIST; server>` +The HOSTLIST server +------------------- + +The server provides a way for other peers to obtain HELLOs. Basically it +is a small web server other peers can connect to and download a list of +HELLOs using standard HTTP; it may also advertise the URL of the +hostlist to other peers connecting on CORE level. + +.. _The-HTTP-Server: + +The HTTP Server +^^^^^^^^^^^^^^^ + +During startup, the server starts a web server listening on the port +specified with the HTTPPORT value (default 8080). In addition it +connects to the PEERINFO service to obtain peer information. 
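Both the listening port and the rest of the daemon's behaviour are
configured in the ``[hostlist]`` section of the peer's configuration; a
minimal sketch showing only the port mentioned above:

::

   [hostlist]
   HTTPPORT = 8080
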
The +HOSTLIST server uses the GNUNET_PEERINFO_iterate function to request +HELLO information for all peers and adds their information to a new +hostlist if they are suitable (expired addresses and HELLOs without +addresses are both not suitable) and the maximum size for a hostlist is +not exceeded (MAX_BYTES_PER_HOSTLISTS = 500000). When PEERINFO finishes +(with a last NULL callback), the server destroys the previous hostlist +response available for download on the web server and replaces it with +the updated hostlist. The hostlist format is basically a sequence of +HELLO messages (as obtained from PEERINFO) without any special +tokenization. Since each HELLO message contains a size field, the +response can easily be split into separate HELLO messages by the client. + +A HOSTLIST client connecting to the HOSTLIST server will receive the +hostlist as an HTTP response and the server will terminate the +connection with the result code ``HTTP 200 OK``. The connection will be +closed immediately if no hostlist is available. + +.. _Advertising-the-URL: + +Advertising the URL +^^^^^^^^^^^^^^^^^^^ + +The server also advertises the URL to download the hostlist to other +peers if hostlist advertisement is enabled. When a new peer connects and +has hostlist learning enabled, the server sends a +``GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT`` message to this peer +using the CORE service. + +HOSTLIST client +.. _The-HOSTLIST-client: + +The HOSTLIST client +------------------- + +The client provides the functionality to download the list of HELLOs +from a set of URLs. It performs a standard HTTP request to the URLs +configured and learned from advertisement messages received from other +peers. When a HELLO is downloaded, the HOSTLIST client forwards the +HELLO to the TRANSPORT service for validation. + +The client supports two modes of operation: + +- download of HELLOs (bootstrapping) + +- learning of URLs + +.. _Bootstrapping: + +Bootstrapping +^^^^^^^^^^^^^ + +For bootstrapping, it schedules a task to download the hostlist from the +set of known URLs. The downloads are only performed if the number of +current connections is smaller than a minimum number of connections (at +the moment 4). The interval between downloads increases exponentially; +however, the exponential growth is limited if it becomes longer than an +hour. At that point, the frequency growth is capped at (#number of +connections \* 1h). + +Once the decision has been taken to download HELLOs, the daemon chooses +a random URL from the list of known URLs. URLs can be configured in the +configuration or be learned from advertisement messages. The client uses +a HTTP client library (libcurl) to initiate the download using the +libcurl multi interface. Libcurl passes the data to the +callback_download function which stores the data in a buffer if space is +available and the maximum size for a hostlist download is not exceeded +(MAX_BYTES_PER_HOSTLISTS = 500000). When a full HELLO was downloaded, +the HOSTLIST client offers this HELLO message to the TRANSPORT service +for validation. When the download is finished or failed, statistical +information about the quality of this URL is updated. + +.. _Learning: + +:index:`Learning <single: HOSTLIST; learning>` +Learning +^^^^^^^^ + +The client also manages hostlist advertisements from other peers. The +HOSTLIST daemon forwards ``GNUNET_MESSAGE_TYPE_HOSTLIST_ADVERTISEMENT`` +messages to the client subsystem, which extracts the URL from the +message. 
Next, a test of the newly obtained URL is performed by +triggering a download from the new URL. If the URL works correctly, it +is added to the list of working URLs. + +The size of the list of URLs is restricted, so if an additional server +is added and the list is full, the URL with the worst quality ranking +(determined through successful downloads and number of HELLOs e.g.) is +discarded. During shutdown the list of URLs is saved to a file for +persistence and loaded on startup. URLs from the configuration file are +never discarded. + +.. _Usage: + +Usage +----- + +To start HOSTLIST by default, it has to be added to the DEFAULTSERVICES +section for the ARM services. This is done in the default configuration. + +For more information on how to configure the HOSTLIST subsystem see the +installation handbook: Configuring the hostlist to bootstrap Configuring +your peer to provide a hostlist + +.. index:: + double: IDENTITY; subsystem + +.. _IDENTITY-Subsystem: + +IDENTITY — Ego management +========================= + +Identities of \"users\" in GNUnet are called egos. Egos can be used as +pseudonyms (\"fake names\") or be tied to an organization (for example, +\"GNU\") or even the actual identity of a human. GNUnet users are +expected to have many egos. They might have one tied to their real +identity, some for organizations they manage, and more for different +domains where they want to operate under a pseudonym. + +The IDENTITY service allows users to manage their egos. The identity +service manages the private keys egos of the local user; it does not +manage identities of other users (public keys). Public keys for other +users need names to become manageable. GNUnet uses the GNU Name System +(GNS) to give names to other users and manage their public keys +securely. This chapter is about the IDENTITY service, which is about the +management of private keys. + +On the network, an ego corresponds to an ECDSA key (over Curve25519, +using RFC 6979, as required by GNS). Thus, users can perform actions +under a particular ego by using (signing with) a particular private key. +Other users can then confirm that the action was really performed by +that ego by checking the signature against the respective public key. + +The IDENTITY service allows users to associate a human-readable name +with each ego. This way, users can use names that will remind them of +the purpose of a particular ego. The IDENTITY service will store the +respective private keys and allows applications to access key +information by name. Users can change the name that is locally (!) +associated with an ego. Egos can also be deleted, which means that the +private key will be removed and it thus will not be possible to perform +actions with that ego in the future. + +Additionally, the IDENTITY subsystem can associate service functions +with egos. For example, GNS requires the ego that should be used for the +shorten zone. GNS will ask IDENTITY for an ego for the \"gns-short\" +service. The IDENTITY service has a mapping of such service strings to +the name of the ego that the user wants to use for this service, for +example \"my-short-zone-ego\". + +Finally, the IDENTITY API provides access to a special ego, the +anonymous ego. The anonymous ego is special in that its private key is +not really private, but fixed and known to everyone. Thus, anyone can +perform actions as anonymous. This can be useful as with this trick, +code does not have to contain a special case to distinguish between +anonymous and pseudonymous egos. + +.. 
index:: + double: subsystem; MESSENGER + +.. _MESSENGER-Subsystem: + +MESSENGER — Room-based end-to-end messaging +=========================================== + +The MESSENGER subsystem is responsible for secure end-to-end +communication in groups of nodes in the GNUnet overlay network. +MESSENGER builds on the CADET subsystem which provides a reliable and +secure end-to-end communication between the nodes inside of these +groups. + +Additionally to the CADET security benefits, MESSENGER provides +following properties designed for application level usage: + +- MESSENGER provides integrity by signing the messages with the users + provided ego + +- MESSENGER adds (optional) forward secrecy by replacing the key pair + of the used ego and signing the propagation of the new one with old + one (chaining egos) + +- MESSENGER provides verification of a original sender by checking + against all used egos from a member which are currently in active use + (active use depends on the state of a member session) + +- MESSENGER offsers (optional) decentralized message forwarding between + all nodes in a group to improve availability and prevent MITM-attacks + +- MESSENGER handles new connections and disconnections from nodes in + the group by reconnecting them preserving an efficient structure for + message distribution (ensuring availability and accountablity) + +- MESSENGER provides replay protection (messages can be uniquely + identified via SHA-512, include a timestamp and the hash of the last + message) + +- MESSENGER allows detection for dropped messages by chaining them + (messages refer to the last message by their hash) improving + accountability + +- MESSENGER allows requesting messages from other peers explicitly to + ensure availability + +- MESSENGER provides confidentiality by padding messages to few + different sizes (512 bytes, 4096 bytes, 32768 bytes and maximal + message size from CADET) + +- MESSENGER adds (optional) confidentiality with ECDHE to exchange and + use symmetric encryption, encrypting with both AES-256 and Twofish + but allowing only selected members to decrypt (using the receivers + ego for ECDHE) + +Also MESSENGER provides multiple features with privacy in mind: + +- MESSENGER allows deleting messages from all peers in the group by the + original sender (uses the MESSENGER provided verification) + +- MESSENGER allows using the publicly known anonymous ego instead of + any unique identifying ego + +- MESSENGER allows your node to decide between acting as host of the + used messaging room (sharing your peer's identity with all nodes in + the group) or acting as guest (sharing your peer's identity only with + the nodes you explicitly open a connection to) + +- MESSENGER handles members independently of the peer's identity making + forwarded messages indistinguishable from directly received ones ( + complicating the tracking of messages and identifying its origin) + +- MESSENGER allows names of members being not unique (also names are + optional) + +- MESSENGER does not include information about the selected receiver of + an explicitly encrypted message in its header, complicating it for + other members to draw conclusions from communication partners + + +.. index:: + single: GNS; name cache + double: subsystem; NAMECACHE + +.. _GNS-Namecache: + +NAMECACHE — DHT caching of GNS results +====================================== + +The NAMECACHE subsystem is responsible for caching (encrypted) +resolution results of the GNU Name System (GNS). 
GNS makes zone +information available to other users via the DHT. However, as accessing +the DHT for every lookup is expensive (and as the DHT's local cache is +lost whenever the peer is restarted), GNS uses the NAMECACHE as a more +persistent cache for DHT lookups. Thus, instead of always looking up +every name in the DHT, GNS first checks if the result is already +available locally in the NAMECACHE. Only if there is no result in the +NAMECACHE, GNS queries the DHT. The NAMECACHE stores data in the same +(encrypted) format as the DHT. It thus makes no sense to iterate over +all items in the NAMECACHE – the NAMECACHE does not have a way to +provide the keys required to decrypt the entries. + +Blocks in the NAMECACHE share the same expiration mechanism as blocks in +the DHT – the block expires wheneever any of the records in the +(encrypted) block expires. The expiration time of the block is the only +information stored in plaintext. The NAMECACHE service internally +performs all of the required work to expire blocks, clients do not have +to worry about this. Also, given that NAMECACHE stores only GNS blocks +that local users requested, there is no configuration option to limit +the size of the NAMECACHE. It is assumed to be always small enough (a +few MB) to fit on the drive. + +The NAMECACHE supports the use of different database backends via a +plugin API. + +.. index:: + double: subsystem; NAMESTORE + +.. _NAMESTORE-Subsystem: + +NAMESTORE — Storage of local GNS zones +====================================== + +The NAMESTORE subsystem provides persistent storage for local GNS zone +information. All local GNS zone information are managed by NAMESTORE. It +provides both the functionality to administer local GNS information +(e.g. delete and add records) as well as to retrieve GNS information +(e.g to list name information in a client). NAMESTORE does only manage +the persistent storage of zone information belonging to the user running +the service: GNS information from other users obtained from the DHT are +stored by the NAMECACHE subsystem. + +NAMESTORE uses a plugin-based database backend to store GNS information +with good performance. Here sqlite and PostgreSQL are supported +database backends. NAMESTORE clients interact with the IDENTITY +subsystem to obtain cryptographic information about zones based on egos +as described with the IDENTITY subsystem, but internally NAMESTORE +refers to zones using the respective private key. + +NAMESTORE is queried and monitored by the ZONEMASTER service which periodically +publishes public records of GNS zones. ZONEMASTER also +collaborates with the NAMECACHE subsystem and stores zone information +when local information are modified in the NAMECACHE cache to increase look-up +performance for local information and to enable local access to private records +in zones through GNS. + +NAMESTORE provides functionality to look-up and store records, to +iterate over a specific or all zones and to monitor zones for changes. +NAMESTORE functionality can be accessed using the NAMESTORE C API, the NAMESTORE +REST API, or the NAMESTORE command line tool. + +.. index:: + single: subsystem; Network size estimation + see: NSE; Network size estimation + +.. _NSE-Subsystem: + +NSE — Network size estimation +============================= + +NSE stands for Network Size Estimation. The NSE subsystem provides other +subsystems and users with a rough estimate of the number of peers +currently participating in the GNUnet overlay. 
The computed value is not a precise number as producing a precise
number in a decentralized, efficient and secure way is impossible.
While NSE's estimate is inherently imprecise, NSE also gives the
expected range. For a peer that has been running in a stable network
for a while, the real network size will typically (99.7% of the time)
be in the range of [2/3 estimate, 3/2 estimate]. We will now give an
overview of the algorithm used to calculate the estimate; all of the
details can be found in this technical report.

.. todo:: link to the report.

.. _Motivation:

Motivation
----------

Some subsystems, like DHT, need to know the size of the GNUnet network
to optimize some parameters of their own protocol. The decentralized
nature of GNUnet makes efficiently and securely counting the exact
number of peers infeasible. Although there are several decentralized
algorithms to count the number of peers in a system, so far there is
none to do so securely. Other protocols may allow any malicious peer to
manipulate the final result or to take advantage of the system to
perform Denial of Service (DoS) attacks against the network. GNUnet's
NSE protocol avoids these drawbacks.

.. _Security:

:index:`Security <single: NSE; security>`
Security
^^^^^^^^

The NSE subsystem is designed to be resilient against these attacks. It
uses `proofs of
work <http://en.wikipedia.org/wiki/Proof-of-work_system>`__ to prevent
one peer from impersonating a large number of participants, which would
otherwise allow an adversary to artificially inflate the estimate. The
DoS protection comes from the time-based nature of the protocol: the
estimates are calculated periodically and out-of-time traffic is either
ignored or stored for later retransmission by benign peers. In
particular, peers cannot trigger global network communication at will.

.. _Principle:

:index:`Principle <single: NSE; principle of operation>`
Principle
---------

The algorithm calculates the estimate by finding the globally closest
peer ID to a random, time-based value.

The idea is that the closer the ID is to the random value, the more
\"densely packed\" the ID space is, and therefore, more peers are in the
network.

.. _Example:

Example
^^^^^^^

Suppose all peers have IDs between 0 and 100 (our ID space), and the
random value is 42. If the closest peer has the ID 70 we can imagine
that the average \"distance\" between peers is around 30 and therefore
there are around 3 peers in the whole ID space. On the other hand, if
the closest peer has the ID 44, we can imagine that the space is rather
packed with peers, maybe as many as 50 of them. Naturally, we could
have been rather unlucky, and there is only one peer which happens to
have the ID 44. Thus, the current estimate is calculated as the average
over multiple rounds, and not just a single sample.

.. _Algorithm:

Algorithm
^^^^^^^^^

Given that example, one can imagine that the job of the subsystem is to
efficiently communicate the ID of the closest peer to the target value
to all the other peers, who will calculate the estimate from it.

.. _Target-value:

Target value
^^^^^^^^^^^^

The target value itself is generated by hashing the current time,
rounded down to an agreed value. If the rounding amount is 1h (default)
and the time is 12:34:56, the time to hash would be 12:00:00. The
process is repeated each rounding amount (in this example, every hour).
Every repetition is called a round.

..
_Timing: + +Timing +^^^^^^ + +The NSE subsystem has some timing control to avoid everybody +broadcasting its ID all at one. Once each peer has the target random +value, it compares its own ID to the target and calculates the +hypothetical size of the network if that peer were to be the closest. +Then it compares the hypothetical size with the estimate from the +previous rounds. For each value there is an associated point in the +period, let's call it \"broadcast time\". If its own hypothetical +estimate is the same as the previous global estimate, its \"broadcast +time\" will be in the middle of the round. If its bigger it will be +earlier and if its smaller (the most likely case) it will be later. This +ensures that the peers closest to the target value start broadcasting +their ID the first. + +.. _Controlled-Flooding: + +Controlled Flooding +^^^^^^^^^^^^^^^^^^^ + +When a peer receives a value, first it verifies that it is closer than +the closest value it had so far, otherwise it answers the incoming +message with a message containing the better value. Then it checks a +proof of work that must be included in the incoming message, to ensure +that the other peer's ID is not made up (otherwise a malicious peer +could claim to have an ID of exactly the target value every round). Once +validated, it compares the broadcast time of the received value with the +current time and if it's not too early, sends the received value to its +neighbors. Otherwise it stores the value until the correct broadcast +time comes. This prevents unnecessary traffic of sub-optimal values, +since a better value can come before the broadcast time, rendering the +previous one obsolete and saving the traffic that would have been used +to broadcast it to the neighbors. + +.. _Calculating-the-estimate: + +Calculating the estimate +^^^^^^^^^^^^^^^^^^^^^^^^ + +Once the closest ID has been spread across the network each peer gets +the exact distance between this ID and the target value of the round and +calculates the estimate with a mathematical formula described in the +tech report. The estimate generated with this method for a single round +is not very precise. Remember the case of the example, where the only +peer is the ID 44 and we happen to generate the target value 42, +thinking there are 50 peers in the network. Therefore, the NSE subsystem +remembers the last 64 estimates and calculates an average over them, +giving a result of which usually has one bit of uncertainty (the real +size could be half of the estimate or twice as much). Note that the +actual network size is calculated in powers of two of the raw input, +thus one bit of uncertainty means a factor of two in the size estimate. + +.. index:: + double: subsystem; PEERINFO + +.. _PEERINFO-Subsystem: + +PEERINFO — Persistent HELLO storage +=================================== + +The PEERINFO subsystem is used to store verified (validated) information +about known peers in a persistent way. It obtains these addresses for +example from TRANSPORT service which is in charge of address validation. +Validation means that the information in the HELLO message are checked +by connecting to the addresses and performing a cryptographic handshake +to authenticate the peer instance stating to be reachable with these +addresses. Peerinfo does not validate the HELLO messages itself but only +stores them and gives them to interested clients. 
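
The stored information can be inspected from the command line (a
sketch; option letters may vary between GNUnet versions):

::

   # List all known peers together with their (validated) addresses
   $ gnunet-peerinfo

   # Print only our own peer identity
   $ gnunet-peerinfo -s
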

As future work, we think about moving from storing just HELLO messages
to providing a generic persistent per-peer information store. More and
more subsystems tend to need to store per-peer information in a
persistent way. To avoid duplicating this functionality, we plan to
provide a PEERSTORE service for this purpose.

.. _PEERINFO-_002d-Features:

PEERINFO - Features
-------------------

- Persistent storage

- Client notification mechanism on update

- Periodic clean-up of expired information

- Differentiation between public and friend-only HELLOs

.. _PEERINFO-_002d-Limitations:

PEERINFO - Limitations
----------------------

- Does not perform HELLO validation

.. _DeveloperPeer-Information:

Peer Information
----------------

The PEERINFO subsystem stores this information in the form of HELLO
messages, which you can think of as business cards. These HELLO
messages contain the public key of a peer and the addresses a peer can
be reached under. The addresses include an expiration date describing
how long they are valid. This information is updated regularly by the
TRANSPORT service by revalidating the address. If an address is expired
and not renewed, it can be removed from the HELLO message.

Some peers do not want to have their HELLO messages distributed to
other peers, especially when GNUnet's friend-to-friend mode is enabled.
To prevent this undesired distribution, PEERINFO distinguishes between
*public* and *friend-only* HELLO messages. Public HELLO messages can be
freely distributed to other (possibly unknown) peers (for example using
the hostlist, gossiping, broadcasting), whereas friend-only HELLO
messages may not be distributed to other peers. Friend-only HELLO
messages have an additional flag ``friend_only`` set internally. For
public HELLO messages this flag is not set. PEERINFO does not and
cannot check whether a client is allowed to obtain a specific HELLO
type.

The HELLO messages can be managed using the GNUnet HELLO library. Other
GNUnet subsystems can obtain this information from PEERINFO and use it
for their purposes. Example clients are the HOSTLIST component, which
provides this information to other peers in the form of a hostlist, and
the TRANSPORT subsystem, which uses it to maintain connections to other
peers.

.. _Startup:

Startup
-------

During startup the PEERINFO service loads persistent HELLOs from disk.
First, PEERINFO parses the directory configured in the HOSTS value of
the ``PEERINFO`` configuration section, which is used to store PEERINFO
information. For all files found in this directory, valid HELLO
messages are extracted. In addition it loads HELLO messages shipped
with the GNUnet distribution. These HELLOs are used to simplify network
bootstrapping by providing valid peer information with the
distribution. The use of these HELLOs can be prevented by setting
``USE_INCLUDED_HELLOS`` in the ``PEERINFO`` configuration section to
``NO``. Files containing invalid information are removed.

.. _Managing-Information:

Managing Information
--------------------

The PEERINFO service stores information about known peers and a single
HELLO message for every peer. A peer does not need to have a HELLO if
no information is available. HELLO information from different sources,
for example a HELLO obtained from a remote HOSTLIST and a second HELLO
stored on disk, is combined and merged into one single HELLO message
per peer, which is then given to clients.
During this merge process the +HELLO is immediately written to disk to ensure persistence. + +PEERINFO in addition periodically scans the directory where information +are stored for empty HELLO messages with expired TRANSPORT addresses. +This periodic task scans all files in the directory and recreates the +HELLO messages it finds. Expired TRANSPORT addresses are removed from +the HELLO and if the HELLO does not contain any valid addresses, it is +discarded and removed from the disk. + +.. _Obtaining-Information: + +Obtaining Information +--------------------- + +When a client requests information from PEERINFO, PEERINFO performs a +lookup for the respective peer or all peers if desired and transmits +this information to the client. The client can specify if friend-only +HELLOs have to be included or not and PEERINFO filters the respective +HELLO messages before transmitting information. + +To notify clients about changes to PEERINFO information, PEERINFO +maintains a list of clients interested in this notifications. Such a +notification occurs if a HELLO for a peer was updated (due to a merge +for example) or a new peer was added. + +.. index:: + double: subsystem; PEERSTORE + +.. _PEERSTORE-Subsystem: + +PEERSTORE — Extensible local persistent data storage +==================================================== + +GNUnet's PEERSTORE subsystem offers persistent per-peer storage for +other GNUnet subsystems. GNUnet subsystems can use PEERSTORE to +persistently store and retrieve arbitrary data. Each data record stored +with PEERSTORE contains the following fields: + +- subsystem: Name of the subsystem responsible for the record. + +- peerid: Identity of the peer this record is related to. + +- key: a key string identifying the record. + +- value: binary record value. + +- expiry: record expiry date. + +.. _Functionality: + +Functionality +------------- + +Subsystems can store any type of value under a (subsystem, peerid, key) +combination. A \"replace\" flag set during store operations forces the +PEERSTORE to replace any old values stored under the same (subsystem, +peerid, key) combination with the new value. Additionally, an expiry +date is set after which the record is \*possibly\* deleted by PEERSTORE. + +Subsystems can iterate over all values stored under any of the following +combination of fields: + +- (subsystem) + +- (subsystem, peerid) + +- (subsystem, key) + +- (subsystem, peerid, key) + +Subsystems can also request to be notified about any new values stored +under a (subsystem, peerid, key) combination by sending a \"watch\" +request to PEERSTORE. + +.. _Architecture: + +Architecture +------------ + +PEERSTORE implements the following components: + +- PEERSTORE service: Handles store, iterate and watch operations. + +- PEERSTORE API: API to be used by other subsystems to communicate and + issue commands to the PEERSTORE service. + +- PEERSTORE plugins: Handles the persistent storage. At the moment, + only an \"sqlite\" plugin is implemented. + +.. index:: + double: subsystem; REGEX + +.. _REGEX-Subsystem: + +REGEX — Service discovery using regular expressions +=================================================== + +Using the REGEX subsystem, you can discover peers that offer a +particular service using regular expressions. The peers that offer a +service specify it using a regular expressions. Peers that want to +patronize a service search using a string. The REGEX subsystem will then +use the DHT to return a set of matching offerers to the patrons. 
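
The pattern is: an offering peer announces a policy regex, a searching
peer looks up a concrete string, and a callback fires for every
matching offerer found via the DHT. The fragment below sketches this
using ``GNUNET_REGEX_announce`` and ``GNUNET_REGEX_search`` from
``gnunet_regex_service.h``; the exact signatures, the example policy
string and the omitted service boilerplate are assumptions, so consult
the header before relying on it:

::

   #include <gnunet/gnunet_util_lib.h>
   #include <gnunet/gnunet_regex_service.h>

   static struct GNUNET_REGEX_Announcement *ann;
   static struct GNUNET_REGEX_Search *srch;

   /* Called once for every peer whose announced regex matches our string. */
   static void
   found_cb (void *cls,
             const struct GNUNET_PeerIdentity *id,
             const struct GNUNET_PeerIdentity *get_path,
             unsigned int get_path_length,
             const struct GNUNET_PeerIdentity *put_path,
             unsigned int put_path_length)
   {
     /* 'id' offers a service whose regex matches the searched string. */
   }

   static void
   run (const struct GNUNET_CONFIGURATION_Handle *cfg)
   {
     /* Offerer: announce this peer's service policy as a regex
        (hypothetical policy using the profiler's prefix). */
     ann = GNUNET_REGEX_announce (cfg,
                                  "GNVPN-0001-PAD(0|1)*",
                                  GNUNET_TIME_UNIT_MINUTES,
                                  1 /* DFA path compression */);
     /* Patron: search for a peer whose announced regex matches this string. */
     srch = GNUNET_REGEX_search (cfg,
                                 "GNVPN-0001-PAD0101",
                                 &found_cb,
                                 NULL);
   }
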
+ +For the technical details, we have Max's defense talk and Max's Master's +thesis. + +.. note:: An additional publication is under preparation and available + to team members (in Git). + +.. todo:: Missing links to Max's talk and Master's thesis + +.. _How-to-run-the-regex-profiler: + +How to run the regex profiler +----------------------------- + +The gnunet-regex-profiler can be used to profile the usage of mesh/regex +for a given set of regular expressions and strings. Mesh/regex allows +you to announce your peer ID under a certain regex and search for peers +matching a particular regex using a string. See +`szengel2012ms <https://bib.gnunet.org/full/date.html#2012_5f2>`__ for a +full introduction. + +First of all, the regex profiler uses GNUnet testbed, thus all the +implications for testbed also apply to the regex profiler (for example +you need password-less ssh login to the machines listed in your hosts +file). + +**Configuration** + +Moreover, an appropriate configuration file is needed. In the following +paragraph the important details are highlighted. + +Announcing of the regular expressions is done by the +gnunet-daemon-regexprofiler, therefore you have to make sure it is +started, by adding it to the START_ON_DEMAND set of ARM: + +:: + + [regexprofiler] + START_ON_DEMAND = YES + +Furthermore you have to specify the location of the binary: + +:: + + [regexprofiler] + # Location of the gnunet-daemon-regexprofiler binary. + BINARY = /home/szengel/gnunet/src/mesh/.libs/gnunet-daemon-regexprofiler + # Regex prefix that will be applied to all regular expressions and + # search string. + REGEX_PREFIX = "GNVPN-0001-PAD" + +When running the profiler with a large scale deployment, you probably +want to reduce the workload of each peer. Use the following options to +do this. + +:: + + [dht] + # Force network size estimation + FORCE_NSE = 1 + + [dhtcache] + DATABASE = heap + # Disable RC-file for Bloom filter? (for benchmarking with limited IO + # availability) + DISABLE_BF_RC = YES + # Disable Bloom filter entirely + DISABLE_BF = YES + + [nse] + # Minimize proof-of-work CPU consumption by NSE + WORKBITS = 1 + +**Options** + +To finally run the profiler some options and the input data need to be +specified on the command line. + +:: + + gnunet-regex-profiler -c config-file -d log-file -n num-links \ + -p path-compression-length -s search-delay -t matching-timeout \ + -a num-search-strings hosts-file policy-dir search-strings-file + +Where\... + +- \... ``config-file`` means the configuration file created earlier. + +- \... ``log-file`` is the file where to write statistics output. + +- \... ``num-links`` indicates the number of random links between + started peers. + +- \... ``path-compression-length`` is the maximum path compression + length in the DFA. + +- \... ``search-delay`` time to wait between peers finished linking and + starting to match strings. + +- \... ``matching-timeout`` timeout after which to cancel the + searching. + +- \... ``num-search-strings`` number of strings in the + search-strings-file. + +- \... the ``hosts-file`` should contain a list of hosts for the + testbed, one per line in the following format: + + - ``user@host_ip:port`` + +- \... the ``policy-dir`` is a folder containing text files containing + one or more regular expressions. A peer is started for each file in + that folder and the regular expressions in the corresponding file are + announced by this peer. + +- \... 
the ``search-strings-file`` is a text file containing search + strings, one in each line. + +You can create regular expressions and search strings for every AS in +the Internet using the attached scripts. You need one of the `CAIDA +routeviews +prefix2as <http://data.caida.org/datasets/routing/routeviews-prefix2as/>`__ +data files for this. Run + +:: + + create_regex.py <filename> <output path> + +to create the regular expressions and + +:: + + create_strings.py <input path> <outfile> + +to create a search strings file from the previously created regular +expressions. + + + +.. index:: + double: subsystem; REST + +.. _REST-Subsystem: + +REST — RESTful GNUnet Web APIs +============================== + +.. todo:: Define REST + +Using the REST subsystem, you can expose REST-based APIs or services. +The REST service is designed as a pluggable architecture. + +**Configuration** + +The REST service can be configured in various ways. The reference config +file can be found in ``src/rest/rest.conf``: + +:: + + [rest] + REST_PORT=7776 + REST_ALLOW_HEADERS=Authorization,Accept,Content-Type + REST_ALLOW_ORIGIN=* + REST_ALLOW_CREDENTIALS=true + +The port as well as CORS (cross-origin resource sharing) headers +that are supposed to be advertised by the rest service are configurable. + +.. index:: + double: subsystem; REVOCATION + +.. _REVOCATION-Subsystem: + +REVOCATION — Ego key revocation +=============================== + +The REVOCATION subsystem is responsible for key revocation of Egos. If a +user learns that their private key has been compromised or has lost it, +they can use the REVOCATION system to inform all of the other users that +their private key is no longer valid. The subsystem thus includes ways +to query for the validity of keys and to propagate revocation messages. + +.. _Dissemination: + +Dissemination +------------- + +When a revocation is performed, the revocation is first of all +disseminated by flooding the overlay network. The goal is to reach every +peer, so that when a peer needs to check if a key has been revoked, this +will be purely a local operation where the peer looks at its local +revocation list. Flooding the network is also the most robust form of +key revocation --- an adversary would have to control a separator of the +overlay graph to restrict the propagation of the revocation message. +Flooding is also very easy to implement --- peers that receive a +revocation message for a key that they have never seen before simply +pass the message to all of their neighbours. + +Flooding can only distribute the revocation message to peers that are +online. In order to notify peers that join the network later, the +revocation service performs efficient set reconciliation over the sets +of known revocation messages whenever two peers (that both support +REVOCATION dissemination) connect. The SET service is used to perform +this operation efficiently. + +.. _Revocation-Message-Design-Requirements: + +Revocation Message Design Requirements +-------------------------------------- + +However, flooding is also quite costly, creating O(\|E\|) messages on a +network with \|E\| edges. Thus, revocation messages are required to +contain a proof-of-work, the result of an expensive computation (which, +however, is cheap to verify). Only peers that have expended the CPU time +necessary to provide this proof will be able to flood the network with +the revocation message. This ensures that an attacker cannot simply +flood the network with millions of revocation messages. 
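The principle can be illustrated with a deliberately simplified sketch:
find a nonce such that hashing it together with the public key yields a
digest with a required number of leading zero bits. Producing the proof
takes many hash attempts, while verifying it takes exactly one. Note
that GNUnet's actual revocation proof-of-work is a different, more
elaborate construction; only the create/verify asymmetry matters here:

::

   #include <gnunet/gnunet_util_lib.h>
   #include <stdint.h>
   #include <string.h>

   /* Count the number of leading zero bits of a hash code. */
   static unsigned int
   leading_zeros (const struct GNUNET_HashCode *hc)
   {
     const uint8_t *p = (const uint8_t *) hc;
     unsigned int bits = 0;

     for (size_t i = 0; i < sizeof (*hc); i++)
       for (int b = 7; b >= 0; b--)
       {
         if (0 != (p[i] & (1 << b)))
           return bits;
         bits++;
       }
     return bits;
   }

   /* Expensive: try nonces until H(nonce || key) has 'difficulty'
      leading zero bits.  Verification repeats the hash exactly once. */
   static uint64_t
   compute_pow (const struct GNUNET_CRYPTO_EcdsaPublicKey *key,
                unsigned int difficulty)
   {
     uint8_t buf[sizeof (uint64_t) + sizeof (*key)];
     struct GNUNET_HashCode hc;

     memcpy (&buf[sizeof (uint64_t)], key, sizeof (*key));
     for (uint64_t nonce = 0; ; nonce++)
     {
       memcpy (buf, &nonce, sizeof (nonce));
       GNUNET_CRYPTO_hash (buf, sizeof (buf), &hc);
       if (leading_zeros (&hc) >= difficulty)
         return nonce;
     }
   }
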
The +proof-of-work required by GNUnet is set to take days on a typical PC to +compute; if the ability to quickly revoke a key is needed, users have +the option to pre-compute revocation messages to store off-line and use +instantly after their key has expired. + +Revocation messages must also be signed by the private key that is being +revoked. Thus, they can only be created while the private key is in the +possession of the respective user. This is another reason to create a +revocation message ahead of time and store it in a secure location. + +.. index:: + double: subsystems; Random peer sampling + see: RPS; Random peer sampling + +.. _RPS-Subsystem: + +RPS — Random peer sampling +========================== + +In literature, Random Peer Sampling (RPS) refers to the problem of +reliably [1]_ drawing random samples from an unstructured p2p network. + +Doing so in a reliable manner is not only hard because of inherent +problems but also because of possible malicious peers that could try to +bias the selection. + +It is useful for all kind of gossip protocols that require the selection +of random peers in the whole network like gathering statistics, +spreading and aggregating information in the network, load balancing and +overlay topology management. + +The approach chosen in the RPS service implementation in GNUnet follows +the `Brahms <https://bib.gnunet.org/full/date.html\#2009_5f0>`__ design. + +The current state is \"work in progress\". There are a lot of things +that need to be done, primarily finishing the experimental evaluation +and a re-design of the API. + +The abstract idea is to subscribe to connect to/start the RPS service +and request random peers that will be returned when they represent a +random selection from the whole network with high probability. + +An additional feature to the original Brahms-design is the selection of +sub-groups: The GNUnet implementation of RPS enables clients to ask for +random peers from a group that is defined by a common shared secret. +(The secret could of course also be public, depending on the use-case.) + +Another addition to the original protocol was made: The sampler +mechanism that was introduced in Brahms was slightly adapted and used to +actually sample the peers and returned to the client. This is necessary +as the original design only keeps peers connected to random other peers +in the network. In order to return random peers to client requests +independently random, they cannot be drawn from the connected peers. The +adapted sampler makes sure that each request for random peers is +independent from the others. + +.. _Brahms: + +Brahms +------ + +The high-level concept of Brahms is two-fold: Combining push-pull gossip +with locally fixing a assumed bias using cryptographic min-wise +permutations. The central data structure is the view - a peer's current +local sample. This view is used to select peers to push to and pull +from. This simple mechanism can be biased easily. For this reason Brahms +'fixes' the bias by using the so-called sampler. A data structure that +takes a list of elements as input and outputs a random one of them +independently of the frequency in the input set. Both an element that +was put into the sampler a single time and an element that was put into +it a million times have the same probability of being the output. This +is achieved with exploiting min-wise independent permutations. In the +RPS service we use HMACs: On the initialisation of a sampler element, a +key is chosen at random. 
On each input the HMAC with the random key is +computed. The sampler element keeps the element with the minimal HMAC. + +In order to fix the bias in the view, a fraction of the elements in the +view are sampled through the sampler from the random stream of peer IDs. + +According to the theoretical analysis of Bortnikov et al. this suffices +to keep the network connected and having random peers in the view. + +.. [1] + \"Reliable\" in this context means having no bias, neither spatial, + nor temporal, nor through malicious activity. + +.. index:: + double: STATISTICS; subsystem + +.. _STATISTICS-Subsystem: + +STATISTICS — Runtime statistics publication +=========================================== + +In GNUnet, the STATISTICS subsystem offers a central place for all +subsystems to publish unsigned 64-bit integer run-time statistics. +Keeping this information centrally means that there is a unified way for +the user to obtain data on all subsystems, and individual subsystems do +not have to always include a custom data export method for performance +metrics and other statistics. For example, the TRANSPORT system uses +STATISTICS to update information about the number of directly connected +peers and the bandwidth that has been consumed by the various plugins. +This information is valuable for diagnosing connectivity and performance +issues. + +Following the GNUnet service architecture, the STATISTICS subsystem is +divided into an API which is exposed through the header +**gnunet_statistics_service.h** and the STATISTICS service +**gnunet-service-statistics**. The **gnunet-statistics** command-line +tool can be used to obtain (and change) information about the values +stored by the STATISTICS service. The STATISTICS service does not +communicate with other peers. + +Data is stored in the STATISTICS service in the form of tuples +**(subsystem, name, value, persistence)**. The subsystem determines to +which other GNUnet's subsystem the data belongs. name is the name +through which value is associated. It uniquely identifies the record +from among other records belonging to the same subsystem. In some parts +of the code, the pair **(subsystem, name)** is called a **statistic** as +it identifies the values stored in the STATISTCS service.The persistence +flag determines if the record has to be preserved across service +restarts. A record is said to be persistent if this flag is set for it; +if not, the record is treated as a non-persistent record and it is lost +after service restart. Persistent records are written to and read from +the file **statistics.data** before shutdown and upon startup. The file +is located in the HOME directory of the peer. + +An anomaly of the STATISTICS service is that it does not terminate +immediately upon receiving a shutdown signal if it has any clients +connected to it. It waits for all the clients that are not monitors to +close their connections before terminating itself. This is to prevent +the loss of data during peer shutdown — delaying the STATISTICS +service shutdown helps other services to store important data to +STATISTICS during shutdown. + +.. index:: + double: TRANSPORT Next Generation; subsystem + +.. _TRANSPORT_002dNG-Subsystem: + +TRANSPORT-NG — Next-generation transport management +=================================================== + +The current GNUnet TRANSPORT architecture is rooted in the GNUnet 0.4 +design of using plugins for the actual transmission operations and the +ATS subsystem to select a plugin and allocate bandwidth. 
The following key issues have been identified with this design:

- Bugs in one plugin can affect the TRANSPORT service and other
  plugins. There is at least one open bug that affects sockets, where
  the origin is difficult to pinpoint due to the large code base.

- Relevant operating system default configurations often impose a limit
  of 1024 file descriptors per process. Thus, one plugin may impact
  other plugins' connectivity choices.

- Plugins are required to offer bi-directional connectivity. However,
  firewalls (incl. NAT boxes) and physical environments sometimes only
  allow uni-directional connectivity, which then currently cannot be
  utilized at all.

- Distance vector routing was implemented in 2009 but broke shortly
  afterwards; due to the complexity of implementing it as a plugin and
  dealing with the resource allocation consequences, it was never
  useful.

- Most existing plugins communicate completely using cleartext,
  exposing metadata (message size) and making it easy to fingerprint
  and possibly block GNUnet traffic.

- Various NAT traversal methods are not supported.

- The service logic is cluttered with \"manipulation\" support code for
  TESTBED to enable faking network characteristics like lossy
  connections or firewalls.

- Bandwidth allocation is done in ATS, requiring the duplication of
  state and resulting in much delayed allocation decisions. As a
  result, available bandwidth often goes unused. Users are expected to
  manually configure bandwidth limits, instead of TRANSPORT using
  congestion control to adapt automatically.

- TRANSPORT is difficult to test and has poor test coverage.

- HELLOs include an absolute expiration time. Nodes with unsynchronized
  clocks cannot connect.

- Displaying the contents of a HELLO requires the respective plugin as
  the plugin-specific data is encoded in binary. This also complicates
  logging.

.. _Design-goals-of-TNG:

Design goals of TNG
-------------------

In order to address the above issues, we want to:

- Move plugins into separate processes, which we shall call
  *communicators*. Communicators connect as clients to the transport
  service.

- TRANSPORT should be able to utilize any number of communicators to
  the same peer at the same time.

- TRANSPORT should be responsible for fragmentation, retransmission,
  flow- and congestion-control. Users should no longer have to
  configure bandwidth limits: TRANSPORT should detect what is available
  and use it.

- Communicators should be allowed to be uni-directional and unreliable.
  TRANSPORT shall create bi-directional channels from this whenever
  possible.

- DV should no longer be a plugin, but part of TRANSPORT.

- TRANSPORT should help communicators communicate, for example in the
  case of uni-directional communicators or the need for out-of-band
  signalling for NAT traversal. We call this functionality
  *backchannels*.

- Transport manipulation should be signalled to CORE on a per-message
  basis instead of an approximate bandwidth.

- CORE should signal performance requirements (reliability, latency,
  etc.) on a per-message basis to TRANSPORT. If possible, TRANSPORT
  should consider those options when scheduling messages for
  transmission.

- HELLOs should be in a human-readable format with monotonic time
  expirations.

The new architecture is planned as follows:

..
image:: /images/tng.png + +TRANSPORT's main objective is to establish bi-directional virtual links +using a variety of possibly uni-directional communicators. Links undergo +the following steps: + +1. Communicator informs TRANSPORT A that a queue (direct neighbour) is + available, or equivalently TRANSPORT A discovers a (DV) path to a + target B. + +2. TRANSPORT A sends a challenge to the target peer, trying to confirm + that the peer can receive. FIXME: This is not implemented properly + for DV. Here we should really take a validated DVH and send a + challenge exactly down that path! + +3. The other TRANSPORT, TRANSPORT B, receives the challenge, and sends + back a response, possibly using a dierent path. If TRANSPORT B does + not yet have a virtual link to A, it must try to establish a virtual + link. + +4. Upon receiving the response, TRANSPORT A creates the virtual link. If + the response included a challenge, TRANSPORT A must respond to this + challenge as well, eectively re-creating the TCP 3-way handshake + (just with longer challenge values). + +.. _HELLO_002dNG: + +HELLO-NG +-------- + +HELLOs change in three ways. First of all, communicators encode the +respective addresses in a human-readable URL-like string. This way, we +do no longer require the communicator to print the contents of a HELLO. +Second, HELLOs no longer contain an expiration time, only a creation +time. The receiver must only compare the respective absolute values. So +given a HELLO from the same sender with a larger creation time, then the +old one is no longer valid. This also obsoletes the need for the +gnunet-hello binary to set HELLO expiration times to never. Third, a +peer no longer generates one big HELLO that always contains all of the +addresses. Instead, each address is signed individually and shared only +over the address scopes where it makes sense to share the address. In +particular, care should be taken to not share MACs across the Internet +and confine their use to the LAN. As each address is signed separately, +having multiple addresses valid at the same time (given the new creation +time expiration logic) requires that those addresses must have exactly +the same creation time. Whenever that monotonic time is increased, all +addresses must be re-signed and re-distributed. + +.. _Priorities-and-preferences: + +Priorities and preferences +-------------------------- + +In the new design, TRANSPORT adopts a feature (which was previously +already available in CORE) of the MQ API to allow applications to +specify priorities and preferences per message (or rather, per MQ +envelope). The (updated) MQ API allows applications to specify one of +four priority levels as well as desired preferences for transmission by +setting options on an envelope. These preferences currently are: + +- GNUNET_MQ_PREF_UNRELIABLE: Disables TRANSPORT waiting for ACKS on + unreliable channels like UDP. Now it is fire and forget. These + messages then cannot be used for RTT estimates either. + +- GNUNET_MQ_PREF_LOW_LATENCY: Directs TRANSPORT to select the + lowest-latency transmission choices possible. + +- GNUNET_MQ_PREF_CORK_ALLOWED: Allows TRANSPORT to delay transmission + to group the message with other messages into a larger batch to + reduce the number of packets sent. + +- GNUNET_MQ_PREF_GOODPUT: Directs TRANSPORT to select the highest + goodput channel available. 
+ +- GNUNET_MQ_PREF_OUT_OF_ORDER: Allows TRANSPORT to reorder the messages + as it sees fit, otherwise TRANSPORT should attempt to preserve + transmission order. + +Each MQ envelope is always able to store those options (and the +priority), and in the future this uniform API will be used by TRANSPORT, +CORE, CADET and possibly other subsystems that send messages (like +LAKE). When CORE sets preferences and priorities, it is supposed to +respect the preferences and priorities it is given from higher layers. +Similarly, CADET also simply passes on the preferences and priorities of +the layer above CADET. When a layer combines multiple smaller messages +into one larger transmission, the ``GNUNET_MQ_env_combine_options()`` +should be used to calculate options for the combined message. We note +that the exact semantics of the options may differ by layer. For +example, CADET will always strictly implement reliable and in-order +delivery of messages, while the same options are only advisory for +TRANSPORT and CORE: they should try (using ACKs on unreliable +communicators, not changing the message order themselves), but if +messages are lost anyway (e.g. because a TCP is dropped in the middle), +or if messages are reordered (e.g. because they took different paths +over the network and arrived in a different order) TRANSPORT and CORE do +not have to correct this. Whether a preference is strict or loose is +thus dened by the respective layer. + +.. _Communicators: + +Communicators +------------- + +The API for communicators is defined in +``gnunet_transport_communication_service.h``. Each communicator must +specify its (global) communication characteristics, which for now only +say whether the communication is reliable (e.g. TCP, HTTPS) or +unreliable (e.g. UDP, WLAN). Each communicator must specify a unique +address prex, or NULL if the communicator cannot establish outgoing +connections (for example because it is only acting as a TCP server). A +communicator must tell TRANSPORT which addresses it is reachable under. +Addresses may be added or removed at any time. A communicator may have +zero addresses (transmission only). Addresses do not have to match the +address prefix. + +TRANSPORT may ask a communicator to try to connect to another address. +TRANSPORT will only ask for connections where the address matches the +communicator's address prefix that was provided when the connection was +established. Communicators should then attempt to establish a +connection. +It is under the discretion of the communicator whether to honor this request. +Reasons for not honoring such a request may be that an existing connection exists +or resource limitations. +No response is provided to TRANSPORT service on failure. +The TRANSPORT service has to ask the communicator explicitly to retry. + +If a communicator succeeds in establishing an outgoing connection for +transmission, or if a communicator receives an incoming bi-directional +connection, the communicator must inform the TRANSPORT service that a +message queue (MQ) for transmission is now available. +For that MQ, the communicator must provide the peer identity claimed by the other end. +It must also provide a human-readable address (for debugging) and a maximum transfer unit +(MTU). A MTU of zero means sending is not supported, SIZE_MAX should be +used for no MTU. The communicator should also tell TRANSPORT what +network type is used for the queue. The communicator may tell TRANSPORT +anytime that the queue was deleted and is no longer available. 
The communicator API also provides for flow control. First,
communicators exert back-pressure on TRANSPORT: the number of messages
TRANSPORT may add to a queue for transmission is limited, so by not
draining the transmission queue, back-pressure is applied to TRANSPORT.
In the other direction, communicators may allow TRANSPORT to apply
back-pressure towards the communicator by providing a non-NULL
``GNUNET_TRANSPORT_MessageCompletedCallback`` argument to the
``GNUNET_TRANSPORT_communicator_receive`` function. In this case,
TRANSPORT will only invoke this function once it has processed the
message and is ready to receive more. Communicators should then limit
how much traffic they receive based on this back-pressure. Note that
communicators do not have to provide a
``GNUNET_TRANSPORT_MessageCompletedCallback``; for example, UDP cannot
support back-pressure due to the nature of the UDP protocol. In this
case, TRANSPORT will implement its own TRANSPORT-to-TRANSPORT flow
control to reduce the sender's data rate to acceptable levels.

TRANSPORT may notify a communicator about backchannel messages TRANSPORT
received from other peers for this communicator. Similarly,
communicators can ask TRANSPORT to try to send a backchannel message to
communicators of other peers. The semantics of backchannel messages are
up to the communicators that use them. TRANSPORT may fail to transmit
backchannel messages, and TRANSPORT will not attempt to retransmit them.

UDP communicator
^^^^^^^^^^^^^^^^

The UDP communicator implements a basic encryption layer to protect
against metadata leakage. The layer tries to establish a shared secret
using an Elliptic-Curve Diffie-Hellman key exchange in which the
initiator of a packet creates an ephemeral key pair to encrypt a message
for the target peer identity. The communicator always offers this kind
of transmission queue to a (reachable) peer, in which messages are
encrypted with dedicated keys. The performance of this queue is not
suitable for high-volume data transfer.

If the UDP connection is bi-directional, or TRANSPORT is able to offer a
backchannel connection, the resulting key can be re-used if the
receiving peer is able to ACK the reception. This causes the
communicator to offer a new queue (with a higher priority than the
default queue) to TRANSPORT with a limited capacity. The capacity is
increased whenever the communicator receives an ACK for a transmission.
This queue is suitable for high-volume data transfer and TRANSPORT will
likely prioritize it (if available).

Communicators that try to establish a connection to a target peer
authenticate their peer ID (public key) in the first packets by signing
a monotonic time stamp, their own peer ID and the target peer ID, and by
sending this data together with the signature in one of the first
packets. Receivers should keep track of (persist) the monotonic time
stamps for each peer ID in order to reject possible replay attacks.

FIXME: Handshake wire format? KX, Flow.
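The replay protection described above boils down to remembering the
largest monotonic time stamp seen from each peer and rejecting anything
older. The following fragment is a minimal, hypothetical illustration of
that check; it is not taken from the UDP communicator's code and only
uses generic GNUnet utility types.

.. code-block:: c

   #include "gnunet_util_lib.h"

   /* Hypothetical per-sender state kept (and persisted) by a receiver. */
   struct SenderState
   {
     struct GNUNET_PeerIdentity peer;     /* claimed sender identity */
     struct GNUNET_TIME_Absolute last_ts; /* largest monotonic time seen */
   };

   /* Accept a handshake only if its monotonic time stamp is not older
      than the last one recorded for this sender.  Verifying the
      signature over the time stamp, sender and target peer IDs is
      assumed to happen before this check; how a time stamp equal to
      the last one is treated is a policy choice not shown here. */
   static enum GNUNET_GenericReturnValue
   check_monotonic_time (struct SenderState *state,
                         struct GNUNET_TIME_Absolute handshake_ts)
   {
     if (handshake_ts.abs_value_us < state->last_ts.abs_value_us)
       return GNUNET_SYSERR;          /* stale: likely a replay */
     state->last_ts = handshake_ts;   /* remember (and persist) it */
     return GNUNET_OK;
   }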
TCP communicator
^^^^^^^^^^^^^^^^

FIXME: Handshake wire format? KX, Flow.

QUIC communicator
^^^^^^^^^^^^^^^^^

The QUIC communicator runs over a bi-directional UDP connection.
TLS layer with self-signed certificates (bound to/signed with the peer ID?).
Single, bi-directional stream?
FIXME: Handshake wire format? KX, Flow.

.. index::
   double: TRANSPORT; subsystem

.. _TRANSPORT-Subsystem:

TRANSPORT — Overlay transport management
========================================

This chapter documents how the GNUnet transport subsystem works. The
GNUnet transport subsystem consists of three main components: the
transport API (the interface used by the rest of the system to access
the transport service), the transport service itself (most of the
interesting functions, such as choosing transports, happen here) and
the transport plugins. A transport plugin is a concrete implementation
of how two GNUnet peers communicate; many plugins exist, for example
for communication via TCP, UDP, HTTP, HTTPS and others. Finally, the
transport subsystem uses supporting code, especially the NAT/UPnP
library, to help with tasks such as NAT traversal.

Key tasks of the transport service include:

- Create our HELLO message and notify clients and neighbours if our
  HELLO changes (using the NAT library as necessary)

- Validate HELLOs from other peers (send PING) and allow other peers to
  validate our HELLO's addresses (send PONG)

- Upon request, establish connections to other peers (using address
  selection from the ATS subsystem) and maintain them (again using
  PINGs and PONGs) as long as desired

- Accept incoming connections and give the ATS service the opportunity
  to switch communication channels

- Notify clients about peers that have connected to us or that have
  been disconnected from us

- If a (stateful) connection goes down unexpectedly (without an
  explicit DISCONNECT), quickly attempt to recover (without notifying
  clients), but do notify clients quickly if reconnecting fails

- Send (payload) messages arriving from clients to other peers via
  transport plugins and receive messages from other peers, forwarding
  those to clients

- Enforce inbound traffic limits (using flow control if it is
  applicable); outbound traffic limits are enforced by CORE, not by
  us (!)

- Enforce restrictions on P2P connections as specified by the blacklist
  configuration and blacklisting clients

Note that the term "clients" in the list above really refers to the
GNUnet CORE service, as CORE is typically the only client of the
transport service.

.. index::
   double: subsystem; SET

.. _SET-Subsystem:

SET — Peer to peer set operations (Deprecated)
==============================================

.. note::

   The SET subsystem is in the process of being replaced by the SETU
   and SETI subsystems, which provide basically the same functionality,
   just split across two different subsystems. SETI and SETU should be
   used for new code.

The SET service implements efficient set operations between two peers
over a CADET tunnel. Currently, set union and set intersection are the
only supported operations. Elements of a set consist of an *element
type* and arbitrary binary *data*. The size of an element's data is
limited to around 62 KB.

.. _Local-Sets:

Local Sets
----------

Sets created by a local client can be modified and reused for multiple
operations. As each set operation requires potentially expensive
auxiliary data to be computed for each element of a set, a set can only
participate in one type of set operation (either union or intersection).
The type of a set is determined upon its creation. If the elements of a
set are needed for an operation of a different type, all of the set's
elements must be copied to a new set of the appropriate type.
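A minimal sketch of creating a union-typed set and adding an element to
it, using the (deprecated) client API from ``gnunet_set_service.h``. The
function and field names are given from memory and should be verified
against the header; the element type and payload are hypothetical.

.. code-block:: c

   #include "gnunet_util_lib.h"
   #include "gnunet_set_service.h"

   /* Create a set that can (only) participate in union operations and
      add a single element to it.  'cfg' is the peer's configuration. */
   static struct GNUNET_SET_Handle *
   make_union_set (const struct GNUNET_CONFIGURATION_Handle *cfg)
   {
     static const char payload[] = "example element";
     struct GNUNET_SET_Element element = {
       .size = sizeof (payload),
       .element_type = 0,     /* application-defined element type */
       .data = payload
     };
     struct GNUNET_SET_Handle *set;

     /* The operation type is fixed at creation time; this set can only
        be used for union operations. */
     set = GNUNET_SET_create (cfg, GNUNET_SET_OPERATION_UNION);
     GNUNET_SET_add_element (set, &element, NULL, NULL);
     return set;
   }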
.. _Set-Modifications:

Set Modifications
-----------------

Even when set operations are active, elements can be added to and
removed from a set. However, these changes will only be visible to
operations that have been created after the changes have taken place.
That is, every set operation only sees a snapshot of the set from the
time the operation was started. This mechanism is *not* implemented by
copying the whole set, but by attaching *generation information* to each
element and operation.

.. _Set-Operations:

Set Operations
--------------

Set operations can be started in two ways: either by accepting an
operation request from a remote peer, or by requesting a set operation
from a remote peer. Set operations are uniquely identified by the
involved *peers*, an *application id* and the *operation type*.

The client is notified of incoming set operations by *set listeners*. A
set listener listens for incoming operations of a specific operation
type and application id. Once notified of an incoming set request, the
client can accept the set request (providing a local set for the
operation) or reject it.

.. _Result-Elements:

Result Elements
---------------

The SET service has three *result modes* that determine how an
operation's result set is delivered to the client:

- **Full Result Set.** All elements of the set resulting from the set
  operation are returned to the client.

- **Added Elements.** Only elements that result from the operation and
  are not already in the local peer's set are returned. Note that for
  some operations (like set intersection) this result mode will never
  return any elements. This can be useful if only the remote peer is
  actually interested in the result of the set operation.

- **Removed Elements.** Only elements that are in the local peer's
  initial set but not in the operation's result set are returned. Note
  that for some operations (like set union) this result mode will never
  return any elements. This can be useful if only the remote peer is
  actually interested in the result of the set operation.

.. index::
   double: subsystem; SETI

.. _SETI-Subsystem:

SETI — Peer to peer set intersections
=====================================

The SETI service implements efficient set intersection between two peers
over a CADET tunnel. Elements of a set consist of an *element type* and
arbitrary binary *data*. The size of an element's data is limited to
around 62 KB.

.. _Intersection-Sets:

Intersection Sets
-----------------

Sets created by a local client can be modified (by adding additional
elements) and reused for multiple operations. If elements are to be
removed, a fresh set must be created by the client.

.. _Set-Intersection-Modifications:

Set Intersection Modifications
------------------------------

Even when set operations are active, elements can be added to a set.
However, these changes will only be visible to operations that have been
created after the changes have taken place. That is, every set operation
only sees a snapshot of the set from the time the operation was started.
This mechanism is *not* implemented by copying the whole set, but by
attaching *generation information* to each element and operation.
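The generation mechanism can be pictured as each element recording the
generation in which it was added (and possibly removed), while each
operation remembers the generation at which it was started. The
following fragment is purely illustrative and not taken from the actual
SET/SETI implementation.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Hypothetical bookkeeping for generation-based snapshots. */
   struct ElementInfo
   {
     uint64_t added_in;    /* generation in which the element was added */
     uint64_t removed_in;  /* generation of removal, UINT64_MAX if present */
   };

   struct OperationInfo
   {
     uint64_t generation;  /* set generation when the operation started */
   };

   /* An operation sees an element iff the element was added no later
      than the operation's start generation and not removed before it. */
   static bool
   element_visible (const struct ElementInfo *e,
                    const struct OperationInfo *op)
   {
     return (e->added_in <= op->generation) &&
            (e->removed_in > op->generation);
   }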
.. _Set-Intersection-Operations:

Set Intersection Operations
---------------------------

Set operations can be started in two ways: either by accepting an
operation request from a remote peer, or by requesting a set operation
from a remote peer. Set operations are uniquely identified by the
involved *peers*, an *application id* and the *operation type*.

The client is notified of incoming set operations by *set listeners*. A
set listener listens for incoming operations of a specific operation
type and application id. Once notified of an incoming set request, the
client can accept the set request (providing a local set for the
operation) or reject it.

.. _Intersection-Result-Elements:

Intersection Result Elements
----------------------------

The SETI service has two *result modes* that determine how an
operation's result set is delivered to the client:

- **Return intersection.** All elements of the set resulting from the
  set intersection are returned to the client.

- **Removed Elements.** Only elements that are in the local peer's
  initial set but not in the intersection are returned.

.. index::
   double: SETU; subsystem

.. _SETU-Subsystem:

SETU — Peer to peer set unions
==============================

The SETU service implements efficient set union operations between two
peers over a CADET tunnel. Elements of a set consist of an *element
type* and arbitrary binary *data*. The size of an element's data is
limited to around 62 KB.

.. _Union-Sets:

Union Sets
----------

Sets created by a local client can be modified (by adding additional
elements) and reused for multiple operations. If elements are to be
removed, a fresh set must be created by the client.

.. _Set-Union-Modifications:

Set Union Modifications
-----------------------

Even when set operations are active, elements can be added to a set.
However, these changes will only be visible to operations that have been
created after the changes have taken place. That is, every set operation
only sees a snapshot of the set from the time the operation was started.
This mechanism is *not* implemented by copying the whole set, but by
attaching *generation information* to each element and operation.

.. _Set-Union-Operations:

Set Union Operations
--------------------

Set operations can be started in two ways: either by accepting an
operation request from a remote peer, or by requesting a set operation
from a remote peer. Set operations are uniquely identified by the
involved *peers*, an *application id* and the *operation type*.

The client is notified of incoming set operations by *set listeners*. A
set listener listens for incoming operations of a specific operation
type and application id. Once notified of an incoming set request, the
client can accept the set request (providing a local set for the
operation) or reject it.

.. _Union-Result-Elements:

Union Result Elements
---------------------

The SETU service has three *result modes* that determine how an
operation's result set is delivered to the client:

- **Locally added Elements.** Elements that are in the union but not
  already in the local peer's set are returned.

- **Remote added Elements.** Additionally, notify the client if the
  remote peer lacked some elements, and thus also return to the local
  client those elements that we are sending to the remote peer to be
  added to its union. Obtaining these elements requires setting the
  ``GNUNET_SETU_OPTION_SYMMETRIC`` option.