* Requirements and Related Work

This chapter describes our requirements for a system that we can use to build a
secure social network and introduces currently available alternatives to
centralized social networks. This chapter is partly based on \cite{fsw-paranoia}.

** Privacy

Our goal is to provide a system for social interaction in a privacy-protecting
and scalable manner. The truly private communication system we are aiming for
should have the following properties:

- End-to-end encryption: only the intended recipients can read the messages,
  not the server or network operators along the way between the communicating
  parties. Link-level encryption between a client and a server is not enough
  to ensure this; end-to-end encryption is needed, which means that every
  participant in the system has to manage their own cryptographic keys on
  their own systems.
- Perfect forward secrecy: messages transmitted over the network cannot be
  decrypted later if a user's private key is compromised. To achieve this,
  temporary session keys need to be used when encrypting messages.
- A message logged to disk should not contain a cryptographic signature of the
  sender, so that if someone gains access to the log, it does not provide
  proof that someone actually transmitted the messages.
- An observer cannot determine with certainty when two parties are
  communicating or how much data they exchange with each other. This requires
  a trade-off: sending packets through other participants in the network would
  ensure this, but it also increases message delay.
- Padding of packets is necessary to prevent attacks based on statistical
  analysis of packet lengths. This is essential when sending messages through
  multiple hops, because otherwise monitoring packet lengths would be enough
  to determine where a packet is forwarded to.
- Delayed forwarding is also necessary to prevent correlation of received and
  transmitted packets when forwarding. Sending multiple packets at once at
  certain intervals helps to prevent this correlation.
- Private contact list: the contact list is visible only to those who need to
  see it -- typically other friends -- and is neither publicly available nor
  managed on servers where server operators have access to it.
- Every component of the system should be open source, so that anyone can
  verify that it really works as advertised. A closed component would be a
  security risk, as it could leak information or otherwise weaken the security
  of the system, which is harder to detect when no source code is available.
  This can be enforced with a copyleft license, such as the Affero General
  Public License (AGPL).
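
The padding requirement above can be illustrated with a minimal sketch. It is
not taken from any of the systems discussed in this chapter: the bucket sizes,
the 4-byte length field, and the function names =pad= and =unpad= are
assumptions made for the example. We also assume that padding is applied before
encryption, so that the length field itself is hidden from observers.

```python
import os
import struct

# Hypothetical bucket sizes in bytes; a real system would choose these
# based on its traffic profile.
BUCKET_SIZES = (256, 1024, 4096)


def pad(payload: bytes) -> bytes:
    """Prefix the payload with its length and pad it with random bytes
    up to the smallest bucket size that fits."""
    framed = struct.pack("!I", len(payload)) + payload
    for size in BUCKET_SIZES:
        if len(framed) <= size:
            return framed + os.urandom(size - len(framed))
    raise ValueError("payload too large for the largest bucket")


def unpad(packet: bytes) -> bytes:
    """Recover the original payload from a padded packet."""
    (length,) = struct.unpack("!I", packet[:4])
    if length > len(packet) - 4:
        raise ValueError("corrupt length field")
    return packet[4:4 + length]
```

With this scheme every packet on the wire has one of only three lengths, so an
observer learns the bucket a message falls into but not its exact size. Fewer
and larger buckets leak less at the cost of more wasted bandwidth.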

Currently available alternatives to centralized social network services are in
most cases federated networks, which use a standardized protocol between
servers, enabling many service providers to take part in the network and
communicate with each other. Examples of such systems include web-based
platforms like Diaspora or Friendica, and others that use a messaging protocol
extended with social network functionality -- friendship establishment, status
messages to friends -- like OneSocialWeb, which is based on XMPP (Extensible
Messaging and Presence Protocol), or PSYC (Protocol for SYnchronous
Conferencing).

These federated systems intend to offer more privacy than centralized systems,
but they still do not fulfill most of the requirements above; in most cases
they only provide link-level encryption. They still store personal data on
servers unencrypted, just like centralized systems. Users can run a server
themselves, but that requires server administration skills which average users
do not have, so we end up with a few larger servers and several smaller ones,
just like in the case of email. Privacy is an even more serious issue in this
case, as it is no longer enough to trust one company: there are several server
operators in this architecture sharing personal data with each other -- users'
messages and profile data are transmitted to and stored unencrypted on the
servers of their friends as well. Even users who run their own server would
still communicate with people who do not, exposing personal data to even more
server operators this way.

It is possible to enhance the privacy of these federated protocols by adding
end-to-end encryption on top of them; this is what PGP (Pretty Good Privacy)
does for email and OTR (Off-The-Record Messaging) does for instant messaging
protocols. While this prevents servers from reading the content of messages,
they still know everything else about a message, e.g. its sender, recipient,
and size. There is an additional overhead of Base64 encoding, which is needed
because the underlying messaging protocols often do not support binary data
transfer. Furthermore, PGP and OTR can only be used for one-to-one messaging;
one-to-many and many-to-many messaging are not supported by them.
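
The size of the Base64 overhead mentioned above can be checked with a few lines
of Python. This is only an illustration using the standard library; the helper
name =encoded_size= is ours, and the figures do not include any line breaks or
armor headers that a particular protocol may add on top of the encoding.

```python
import base64


def encoded_size(n_bytes: int) -> int:
    """Return the length of the Base64 encoding of n_bytes of binary data."""
    return len(base64.b64encode(bytes(n_bytes)))


for n in (1, 100, 3000):
    size = encoded_size(n)
    print(f"{n} bytes -> {size} characters ({size / n:.2f}x)")
```

Base64 maps every 3 input bytes to 4 output characters, so for anything but
very short messages the encoded form is about one third larger than the binary
ciphertext.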

** Scalability

Efficient message distribution is crucial in social networks: one of their
most prevalent features is sending one-to-many status updates, and
many-to-many group messaging is frequently used as well. To deliver these
messages most efficiently, multicast message distribution is necessary. IP
multicast does not scale to a large number of channels, as multicast routing
tables would fill up very fast -- at least one channel would be needed for
each user's status updates, and similarly, at least one for each group -- thus
multicast has to be implemented on the application layer to make it work.

XMPP has a simple distribution strategy: it sends one message per recipient
server, which is only efficient if there are many large sites. XMPP's
scalability is also limited by the way it handles presence updates: the
majority of inter-server traffic in the XMPP network consists of this type of
message.
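
The effect of the one-message-per-recipient-server strategy can be sketched as
follows. This is not code from any XMPP implementation; the function name
=group_by_server= and the example addresses are hypothetical, and we assume
addresses of the form =user@server=.

```python
from collections import defaultdict


def group_by_server(recipients):
    """Group recipient addresses of the form user@server by their server.

    With one message sent per recipient server, the number of keys in the
    result is the number of inter-server messages needed.
    """
    by_server = defaultdict(list)
    for address in recipients:
        user, _, server = address.partition("@")
        by_server[server].append(user)
    return dict(by_server)


# Hypothetical recipients of a single status update.
recipients = [
    "alice@big.example",
    "bob@big.example",
    "carol@big.example",
    "dave@small.example",
]
groups = group_by_server(recipients)
print(len(recipients), "recipients ->", len(groups), "inter-server messages")
```

When recipients are concentrated on a few large servers, the sender's server
transmits far fewer messages than there are recipients. When every recipient
is on a different server, the strategy degenerates to one message per
recipient.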

XMPP's use of an XML stream as its network protocol without any framing makes
it less efficient, as it complicates parsing and makes it impossible to
transport binary data without Base64 or a similar encoding. Also, protocol
extensions described in XML add a large amount of unnecessary verbosity to the
protocol.

PSYC is another federated messaging protocol with a compact but extensible
syntax, which enables fast parsing and low bandwidth usage. It is a text-based
protocol with length prefixes for binary data. Our benchmarks show that it
outperforms XMPP and JSON in parsing speed \cite{psyc-bench}.
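
To illustrate why length prefixes help, the following sketch implements a
simple length-prefixed framing. The format used here (a decimal length, a
newline, then the raw body) is a hypothetical one chosen for the example and is
not PSYC's actual syntax; the function names are ours as well.

```python
def encode_frame(body: bytes) -> bytes:
    """Encode a body as '<decimal length>\\n' followed by the raw bytes."""
    return str(len(body)).encode("ascii") + b"\n" + body


def decode_frames(stream: bytes) -> list:
    """Split a byte stream into bodies by reading each length prefix."""
    frames = []
    pos = 0
    while pos < len(stream):
        newline = stream.index(b"\n", pos)
        length = int(stream[pos:newline].decode("ascii"))
        start = newline + 1
        end = start + length
        if end > len(stream):
            raise ValueError("truncated frame")
        # The body is copied as-is; its content is never inspected.
        frames.append(stream[start:end])
        pos = end
    return frames


# Binary bodies may contain any byte, including newlines and markup.
bodies = [b"\x00\xff\n<tag>", b"", b"hello"]
stream = b"".join(encode_frame(b) for b in bodies)
assert decode_frames(stream) == bodies
```

Because the receiver knows the body length in advance, it can skip over the
body without scanning it for delimiters, and arbitrary binary data can be
carried without Base64 or escaping.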

PSYC sends out one message per recipient server when distributing messages,
but it also supports manually configured multicast trees.

** Peer-to-peer networks

Peer-to-peer (P2P) networks come closer to fulfilling these privacy
requirements, as in many cases they are designed with security and privacy in
mind from the ground up.

Projects such as Tor and I2P aim to create an anonymous overlay network, while
Freenet and GNUnet focus on anonymous information storage and retrieval. GNUnet
also provides an extensive framework for writing P2P applications, including
packet-based communication over different transport mechanisms.

In a P2P network every user runs the P2P software on their own computer (a
computer in the P2P network is referred to as a node). This allows for a
network architecture in which servers are not needed to store and manage user
data: every user can do so on their own node, giving them more control over
their data. The high-capacity servers of federated networks would still be
useful in a P2P network: they can forward (and store when needed) encrypted
data without being able to decrypt it, thereby improving the throughput,
connectivity and stability of the network.

Combining peer-to-peer network technology with social network semantics allows
for creating a scalable, privacy-protecting social network based on connections
of trusted peers. The next section describes the architecture of such a network.