* Architecture

Secure Share intends to implement a scalable P2P social network enabling
real-time one-to-one, one-to-many and many-to-many message distribution for
applications using the network while fulfilling the privacy requirements
described in the previous chapter.

It provides private and group messaging, status updates and profiles in the
first prototype version, while keeping the protocol extensible so that various
social applications can be built on top later.

By combining PSYC with a P2P network architecture we get an efficient and
extensible protocol provided by PSYC and security and privacy properties
provided by the underlying P2P network.

** P2P network architecture

Many P2P networks use an architecture where nodes connect to arbitrary peers
and no trust relation exists between them. A problem with this approach is that
some nodes could use more resources of the network than they contribute to it
(freeloaders), which can be alleviated by applying an economic model in the
network. For instance, GNUnet uses an excess-based economy: an idle node does
favors for free, but a busy node only works for nodes it likes and charges them
for the favors they request, which they can pay back by doing a favor in return.

Another problem that can arise in this architecture is malicious nodes, which
can perform various active attacks, including blocking access to parts of the
network or returning false information in response to certain requests. These
attacks can be mitigated to some extent by randomized routing and by making it
harder to create new identities in the network.

The different approach we use is a friend-to-friend (F2F) architecture, where
nodes only connect to friendly peers whom they trust. This has the advantage
that it avoids many attacks involving malicious nodes in the network.
An attacker has to infiltrate a user's social circle to perform a successful
attack, which is much harder. By adding a trust level metric to social
connections we can further differentiate between more and less trusted nodes
in the network.

Also, an F2F architecture gives better incentives to participants in the
network: users help their friends by forwarding packets for them instead of for
random strangers. Nodes with high bandwidth and no connection restrictions --
e.g. server machines in data centers -- can improve throughput and connectivity
in the network by serving their owner's social circle.

Other systems based on an F2F architecture include Freenet \cite{dark-freenet},
Drac \cite{drac} and Tonika; GNUnet has an F2F mode as well.

** Structure of the network

Another aspect of P2P networks is whether they are structured or not. In
structured networks the structure of the network is predefined: the node ID
determines the position of the node in the network, and this information is
enough to route packets to their destination. Structured P2P networks often use
a distributed hash table (DHT), which provides hash table functionality
distributed over many nodes in the network.

A different approach is an unstructured network like the Internet, where
arbitrary nodes can connect and no structure is imposed upon the nodes. In this
case a routing table is needed to route a packet to its destination.

A social network could be built purely using a DHT; LifeSocial \cite{lifesocial}
is an example of such a network. In this case every shared status message, image
or document would become an entry in the DHT, and a profile would consist of a
collection of links to other DHT entries.
To ensure only the intended recipients have access to private data, DHT entries
are encrypted with a symmetric key, which is attached to the entry, encrypted
with the public key of every user who should have access to it. This means that
there is no forward secrecy in this network: if a user's private key is
compromised, all these entries can still be decrypted with that key. Even if
the compromise is noticed in time, re-encrypting all entries affected by a
compromised key is a costly operation once the number of entries has grown over
years of use.

For our case either an unstructured network is suitable, or a structured network
where the structure is only used for routing and not for storing user data in a
DHT. In our architecture data is pushed once to recipients, who store it locally
as long as they need it. This means all profile data, messages and received
files are available locally -- even offline -- and can be viewed and searched
using local tools on the personal device.

** Software components

In a P2P network every user runs the P2P software on their devices, so it is
important that it is multi-platform, lightweight, and written in a compiled
language, so that we can easily run it on all popular desktop platforms and on
small devices as well, including plug computers, home routers, and even
smartphones.

In our case the P2P software runs as a daemon -- a background process -- on the
local machine or on another device on the network. Client applications connect
to this daemon and integrate into the desktop or mobile GUI environment running
on the system.

Server machines, home routers and plug computers act as intermediary nodes in
the system, helping their owners' social network by forwarding packets for them.

Mobile phones require a different approach.
Continuous network usage would drain the battery quite fast, so we have to
minimize it by disabling packet forwarding for mobile nodes and connecting only
to a trusted node with good connectivity -- e.g. a server machine or a plug
computer at home -- which forwards the necessary packets for the mobile node.

** Peer-to-peer framework

We have examined various P2P systems looking for an implementation that can
serve as a basis for our social messaging platform. The criteria for a suitable
P2P framework were:

- Free/libre/open-source software.
- Multi-platform, lightweight and written in a compiled language.
- Implements and provides an API for essential P2P features such as
  bootstrapping, addressing, routing, encryption and NAT traversal.

We have found GNUnet to be the most promising implementation satisfying these
requirements. It is a modular P2P framework written in C, providing an API for
essential P2P functionality. It supports advanced NAT (Network Address
Translation) traversal, which enables contacting nodes without a public IP
address, typically found in home or corporate networks. Furthermore, it has
several transport mechanisms with automatic transport selection, including TCP,
UDP, HTTP(S), SMTP and ad-hoc WiFi mesh networks. It also provides various
routing schemes and a distributed hash table.

It has three operation modes: in P2P mode it makes connections with any peer in
the network, in friend-to-friend (F2F) mode it only connects to trusted nodes,
and in mixed mode a minimum number of trusted nodes are required to be connected
at all times.

GNUnet currently has two options for routing packets in the network: the
distance vector and the mesh service.
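
As an aside, the three operation modes described above can be read as a simple
connection policy. The following is a minimal sketch of that reading, not
GNUnet's actual implementation: the names =Mode=, =may_connect=,
=connected_friends= and =min_friends= are hypothetical, and the rule used for
mixed mode (strangers are only accepted once enough friends are connected) is
an assumption about how the minimum could be enforced.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical labels for the three operation modes."""
    P2P = "p2p"
    F2F = "f2f"
    MIXED = "mixed"


def may_connect(mode, peer_is_friend, connected_friends=0, min_friends=1):
    """Decide whether a new connection is acceptable under the given mode."""
    if peer_is_friend:
        # Trusted peers are acceptable in every mode.
        return True
    if mode is Mode.P2P:
        # Any peer in the network may connect.
        return True
    if mode is Mode.F2F:
        # Only trusted peers may connect.
        return False
    # Mixed mode (assumed rule): accept strangers only while the required
    # minimum number of trusted peers is already connected.
    return connected_friends >= min_friends
```

With this policy a node in mixed mode keeps refusing strangers until its
trusted connections are in place, which is one way to keep the friend minimum
satisfied at all times.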

The distance vector (DV) service uses a fish-eye bounded distance vector
protocol \cite{gnunet-decrouting}, which builds a routing table by gossiping
about neighboring peers within a limited number of hops. It is a link-state
routing protocol with improved efficiency: nodes only know about the state of a
local neighborhood, and the link state of nearby nodes is updated more often
than that of nodes multiple hops away. The DV service also provides onion
routing of packets through multiple hops, which improves network connectivity
by connecting two peers behind NAT through an intermediary hop, and makes it
harder for an observer to determine who is talking to whom.

The mesh service creates tunnels through several hops and supports multicast as
well. Initial routes to recipients are discovered using the DHT. It is still
under heavy development by the GNUnet team; for instance, encryption is missing
and has to be implemented for the multicast groups in order to make it useful
for our purpose.

These routing methods only support delivery of packets to connected nodes. In
order to provide offline messaging, we need a store-and-forward mechanism in
the network. This can be implemented by storing encrypted packets on more stable
nodes in the network until the recipient comes back online.

#+BEGIN_COMMENT
GNUnet's DHT component can be used for facilitating the bootstrapping process by
storing user public key to current node ID mappings in the DHT. This allows
peers offline for a longer period to look up the current node of a contact
in order to re-establish connection to the network, or it can be used to publish
addresses of nodes hosting public groups or providing a public news feed.
#+END_COMMENT

GNUnet also has an anonymous file sharing component which uses a DHT together
with the GNUnet Anonymity Protocol (GAP).
For our use case -- transferring files between friends -- this is not needed;
instead we transfer files just like other messages, using PSYC's multicast
distribution channels. As the PSYC packet syntax supports binary data without
any encoding, this causes no additional overhead. In order to transfer files,
we have to split them up into smaller fragments, as the maximum packet size
supported by GNUnet is 64KB.

#+CAPTION: Components and message flow in GNUnet
#+LABEL: fig:arch
#+ATTR_LaTeX: width=8.2cm placement=[h!]
[[./gnunet.png]]

** Messaging daemon

GNUnet's modular architecture allows us to extend it with a service that
implements a messaging protocol, manages the connections between people, and
provides a local client interface. This service -- called psycd -- uses the PSYC
protocol for communication with both other peers and local clients.

Psycd sends messages through GNUnet core, which encrypts the message and passes
it to the modular transport system, sending packets through one of its transport
plugins.

In our prototype we use direct connections to peers. Users manually add their
friends by exchanging hello messages, which contain their public key and current
addresses. For the prototype version the focus was on the implementation of the
messaging daemon, and we intend to work on the underlying routing mechanism in
future versions.

See figure \ref{fig:arch} for an illustration of the components used in the
system. Dotted parts do not exist yet; they are only planned. The arrows depict
the flow of messages between components.

** Functionality

One of the core concepts of PSYC is programmable channels with their own
subscription lists.
Combining this with custom user interfaces makes it possible to implement the
usual functionality found in centralized and federated social networks, like
private and group messages, status updates, photo and link sharing, as well as
features not found in those networks, like sharing of files and custom content,
or real-time notifications for custom events.

As Secure Share runs on the users' own devices and stores all incoming messages
and data locally, it enables offline usage and local search in the data
received from subscribed friends or groups.
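
The combination of channels with subscription lists and local storage can be
illustrated with a small model. This is only a sketch under stated assumptions,
not psycd's implementation or the PSYC API: the classes =Channel= and =Node=,
the function =publish= and the channel name used in the note below are all
hypothetical, and delivery is modelled as a direct function call instead of
network transport.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    """Hypothetical channel that owns its subscription list."""
    name: str
    subscribers: set = field(default_factory=set)

    def subscribe(self, peer):
        self.subscribers.add(peer)

    def unsubscribe(self, peer):
        self.subscribers.discard(peer)


@dataclass
class Node:
    """Hypothetical node that stores every received message locally."""
    name: str
    inbox: list = field(default_factory=list)

    def receive(self, channel, body):
        self.inbox.append((channel, body))

    def search(self, term):
        # Searches only the local store, so it needs no network access.
        return [(c, b) for c, b in self.inbox if term in b]


def publish(channel, nodes, body):
    """Push a message once to each current subscriber; return the count."""
    delivered = 0
    for peer in sorted(channel.subscribers):
        nodes[peer].receive(channel.name, body)
        delivered += 1
    return delivered
```

For example, if bob and carol subscribe to a channel named =alice#status= and
carol later unsubscribes, subsequent updates reach only bob, while both keep
what they already received and can search it locally without a connection.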