lsd0003

LSD0003: Set Union
Log | Files | Refs | README

commit 4700a3757eb6d158c32f8e457693b0702d62fa58
parent 17ed015d6a636ea650c965a06e3c9a90e5232b34
Author: Christian Grothoff <christian@grothoff.org>
Date:   Sun, 13 Jun 2021 15:02:32 +0200

finish chapter 6

Diffstat:
Mdraft-summermatter-set-union.xml | 646+++++++++++++++++++++++++++++++++++++++++--------------------------------------
1 file changed, 338 insertions(+), 308 deletions(-)

diff --git a/draft-summermatter-set-union.xml b/draft-summermatter-set-union.xml @@ -952,225 +952,235 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | </dl> </section> <section anchor="modeofoperation_individual-elements" numbered="true" toc="default"> - <name>Differential Synchronisation Mode</name> - <t> - The message format used by the protocol limits the maximum message size to - 64 kb. Given that L can be large, an IBF will not always fit within that - size limit. To deal with this, larger IBFs are split into multiple messages. - </t> - <t> - When the initiating peer in the <strong>Expected SE</strong> state decides to use the differential synchronisation mode, it - sends an IBF, which may - consist of several <em><xref target="messages_ibf" format="title" /></em> messages, - to the receiving peer and transitions into the <strong>Passive Decoding</strong> state. - </t> - <t> - The receiving peer in the <strong>Expecting IBF</strong> state receives the - first <em><xref target="messages_ibf" format="title" /></em> message from - the initiating peer, and transitions into the <strong>Expecting IBF Last</strong> state - if the IBF was split into multiple <em><xref target="messages_ibf" format="title" /></em> - messages. If there is just a single <em><xref target="messages_ibf" format="title" /></em> - message, the receiving peer - transitions directly to the <strong>Active Decoding</strong> state. - </t> - <t> - The peer that is in the <strong>Active Decoding</strong>, <strong>Finish Closing</strong> or in the <strong>Expecting IBF Last</strong> - state is called the active peer, and the peer that is in either the <strong>Passive Decoding</strong> or the <strong>Finish Waiting</strong> state - is called the passive peer. - </t> - <t> - A state diagram illustrating the state machine used during differential synchronization - is provided + <name>Differential Synchronisation Mode</name> + <t> + The message format used by the protocol limits the maximum message size to + 64 kb. Given that L can be large, an IBF will not always fit within that + size limit. To deal with this, larger IBFs are split into multiple messages. + </t> + <t> + When the initiating peer in the <strong>Expected SE</strong> state decides to use the differential synchronisation mode, it + sends an IBF, which may + consist of several <em><xref target="messages_ibf" format="title" /></em> messages, + to the receiving peer and transitions into the <strong>Passive Decoding</strong> state. + </t> + <t> + The receiving peer in the <strong>Expecting IBF</strong> state receives the + first <em><xref target="messages_ibf" format="title" /></em> message from + the initiating peer, and transitions into the <strong>Expecting IBF Last</strong> state + if the IBF was split into multiple <em><xref target="messages_ibf" format="title" /></em> + messages. If there is just a single <em><xref target="messages_ibf" format="title" /></em> + message, the receiving peer + transitions directly to the <strong>Active Decoding</strong> state. + </t> + <t> + The peer that is in the <strong>Active Decoding</strong>, <strong>Finish Closing</strong> or in the <strong>Expecting IBF Last</strong> + state is called the active peer, and the peer that is in either the <strong>Passive Decoding</strong> or the <strong>Finish Waiting</strong> state + is called the passive peer. + </t> + <t> + A state diagram illustrating the state machine used during differential synchronization + is provided <eref target="https://git.gnunet.org/lsd0003.git/plain/statemachine/differential_state_machine.png">here</eref>. - </t> - <t><strong>The behavior of the participants the different states is described below:</strong></t> - <dl> - <dt><strong>Passive Decoding:</strong></dt> + </t> + <t><strong>The behavior of the participants the different states is described below:</strong></t> + <dl> + <dt><strong>Passive Decoding:</strong></dt> + <dd> + <t> + In the <strong>Passive Decoding</strong> state the passive peer reacts to requests from the active peer. + The action the passive peer executes depends on the message the passive peer receives in the <strong>Passive Decoding</strong> state from the active peer + and is described below on a per message basis. + </t> + + <dl> + <dt><em><xref target="messages_inquiry" format="title" /></em> message:</dt> <dd> - <t> - In the <strong>Passive Decoding</strong> state the passive peer reacts to requests from the active peer. - The action the passive peer executes depends on the message the passive peer receives in the <strong>Passive Decoding</strong> state from the active peer - and is described below on a per message basis. - </t> - - <dl> - <dt><em><xref target="messages_inquiry" format="title" /></em> message:</dt> - <dd> - The <em><xref target="messages_inquiry" format="title" /></em> message - is received if the active peer requests the SHA-512 hash of one or more elements (by sending the 64 bit element ID) - that are missing from the active peer's set. - In this case the passive peer answers with <em><xref target="messages_offer" format="title" /></em> messages - which contain the SHA-512 hash of the requested element. If the passive peer does not have an element with - a matching element ID, it MUST ignore the inquiry (in this case, a bucket was falsely classified as pure, decoding the IBF will eventually fail, and roles will be swapped). <!-- FIXME: should we not remember that this happened and FAIL if the other peer sends DONE instead of an IBF? --> - If multiple elements match the 64 bit element ID, the passive - peer MUST send offers for all of the matching elements. - </dd> - <dt><em><xref target="messages_demand" format="title" /></em> message:</dt> - <dd> - The <em><xref target="messages_demand" format="title" /></em> message - is received if the active peer requests a complete element that is missing in the active peers set in response to an offer. If the requested element is known and has not yet been transmitted - the passive peer answers with an <em><xref target="messages_elements" format="title" /></em> message which contains the full, - application-dependent data of the requested element. If the passive peer receives a demand for a SHA-512 hash for which - it has no corresponding element, a protocol violation is detected and the protocol MUST be aborted. - Implementations MUST also abort when facing demands without previous matching offers or for which the passive peer previously transmitted the element to the active peer. - </dd> - <dt><em><xref target="messages_offer" format="title" /></em> message:</dt> - <dd> - The <em><xref target="messages_offer" format="title" /></em> message - is received if the active peer has decoded an element that is present in the active peers set and is likely be missing in the - set of the passive peer. If the SHA-512 hash of the offer is indeed not a hash of any of the elements from the set of - the passive peer, the passive peer MUST answer with a <em><xref target="messages_demand" format="title" /></em> message - for that SHA-512 hash and remember that it issued this demand. The demand thus needs to be added to a list with unsatisfied demands. - </dd> - <dt><em><xref target="messages_elements" format="title" /></em> message:</dt> - <dd> - When a new <em><xref target="messages_elements" format="title" /></em> message has been received the peer checks if a corresponding - <em><xref target="messages_demand" format="title" /></em> for the element has been sent - and the demand is still unsatisfied. - If the element has been demanded the peer checks the element for validity, removes it from the list - of pending demands and then saves the element to the set. Otherwise the peer - ignores the element. - </dd> - <dt><em><xref target="messages_ibf" format="title" /></em> message:</dt> - <dd> - If an <em><xref target="messages_ibf" format="title" /></em> message is received, this - indicates that decoding of the IBF on the active site has failed and roles will be swapped. - The receiving passive peer transitions into the <strong>Expecting IBF Last</strong> state, - and waits for more <em><xref target="messages_ibf" format="title" /></em> messages. - There, once the final <em><xref target="messages_ibf_last" format="title" /></em> message has been received, it transitions to <strong>Active Decoding</strong>. - </dd> - <dt><em><xref target="messages_ibf_last" format="title" /></em> message:</dt> - <dd> - If an <em><xref target="messages_ibf_last" format="title" /></em> message is received this - indicates that there is just one IBF slice left and a direct state and role transition from - <strong>Passive Decoding</strong> to <strong>Active Decoding</strong> is initiated. - </dd> - <dt><em><xref target="messages_done" format="title" /></em> message:</dt> - <dd> - Receiving the <em><xref target="messages_done" format="title" /></em> message signals - the passive peer that all demands of the active peer have been satisfied. Alas, the - active peer will continue to process demands from the passive peer. - Upon receiving this message, the passive peer transitions into the - <strong>Finish Waiting</strong> state. - </dd> - </dl> + The <em><xref target="messages_inquiry" format="title" /></em> message + is received if the active peer requests the SHA-512 hash of one or more elements (by sending the 64 bit element ID) + that are missing from the active peer's set. + In this case the passive peer answers with <em><xref target="messages_offer" format="title" /></em> messages + which contain the SHA-512 hash of the requested element. If the passive peer does not have an element with + a matching element ID, it MUST ignore the inquiry (in this case, a bucket was falsely classified as pure, decoding the IBF will eventually fail, and roles will be swapped). <!-- FIXME: should we not remember that this happened and FAIL if the other peer sends DONE instead of an IBF? --> + If multiple elements match the 64 bit element ID, the passive + peer MUST send offers for all of the matching elements. </dd> - <dt><strong>Active Decoding:</strong></dt> + <dt><em><xref target="messages_demand" format="title" /></em> message:</dt> <dd> - <t> - In the <strong>Active Decoding</strong> state the active peer decodes the IBFs and evaluates the set difference - between the active and passive peer. Whenever an element ID is obtained by decoding the IBF, the active peer - sends either an offer or an inquiry to the passive peer, depending on which site the decoded element is missing. - </t> - <t> - If the IBF decodes a positive (1) pure bucket, the element is missing on the passive peers site. - Thus, the active peer sends an <em><xref target="messages_offer" format="title" /></em> to the passive peer. - A negative (-1) pure bucket indicates that an element is missing in the active peers set, so the active peer - sends a <em><xref target="messages_inquiry" format="title" /></em> to the passive peer. - </t> - <t> - In case the IBF does not successfully decode anymore, the active peer sends a new IBF computed with a different IBF-salt to the passive peer - and changes into <strong>Passive Decoding</strong> state. This initiates a role swap. - To reduce overhead and prevent double transmission of offers and elements, the new IBF is created - on the local set after updating it with the all of the elements that have been successfully demanded. Note that the active peer MUST NOT wait for all active demands to be satisfied, as demands can fail if a bucket was falsely classified as pure. - </t> - <t> - As soon as the active peer successfully finished decoding the IBF, the active peer sends a - <em><xref target="messages_done" format="title" /></em> message to the passive peer. - </t> - <t> - All other actions taken by the active peer depend on the message the active peer receives from - the passive peer. The actions are described below on a per message basis: - </t> - <dl> - <dt><em><xref target="messages_offer" format="title" /></em> message:</dt> - <dd> - The <em><xref target="messages_offer" format="title" /></em> message indicates that the - passive peer received a <em><xref target="messages_inquiry" format="title" /></em> message from - the active peer. If a inquiry has been sent and - the offered element is missing in the active peers set, - the active peer sends a <em><xref target="messages_demand" format="title" /></em> message to the - passive peer. The demand needs to be added to a list of unsatisfied demands. - In case the received offer is for an element that is already in the set of the peer, the offer MUST BE ignored. - </dd> - <dt><em><xref target="messages_demand" format="title" /></em> message:</dt> - <dd> - The <em><xref target="messages_demand" format="title" /></em> message indicates that the - passive peer received a <em><xref target="messages_offer" format="title" /></em> from - the active peer. The active peer satisfies the demand of the passive peer by sending - <em><xref target="messages_elements" format="title" /></em> message if a offer request - for the element has been sent. - <!-- FIXME: I do not understand. How can this happen, even with impure buckets? => too late, look at tomorrow! --> - In case the demanded element does not exist in the - set, there was probably a bucket decoded that was not pure. Potentially all <em><xref target="messages_offer" format="title" /></em> - and <em><xref target="messages_demand" format="title" /></em> messages sent later are invalid. - In this case a role change active -> passive with a new IBF is easiest. - </dd> - <dt><em><xref target="messages_elements" format="title" /></em> message:</dt> - <dd> - An element that is received is marked in the list of demanded elements as satisfied, validated and - saved and no further action is taken. - Elements that are not demanded or already known are discarded. - </dd> - <dt><em><xref target="messages_done" format="title" /></em> message:</dt> - <dd> - Receiving the message <em><xref target="messages_done" format="title" /></em> indicates - that all demands of the passive peer have been satisfied. The active peer then changes into the - <strong>Finish Closing</strong> state. - If the IBF has not finished decoding and the <em><xref target="messages_done" format="title" /></em> - is received, the other peer is not in compliance with the protocol and the set reconciliation MUST be aborted. - </dd> - </dl> + The <em><xref target="messages_demand" format="title" /></em> message + is received if the active peer requests a complete element that is missing in the active peers set in response to an offer. If the requested element is known and has not yet been transmitted + the passive peer answers with an <em><xref target="messages_elements" format="title" /></em> message which contains the full, + application-dependent data of the requested element. If the passive peer receives a demand for a SHA-512 hash for which + it has no corresponding element, a protocol violation is detected and the protocol MUST be aborted. + Implementations MUST also abort when facing demands without previous matching offers or for which the passive peer previously transmitted the element to the active peer. </dd> - <dt><strong>Expecing IBF Last</strong></dt> + <dt><em><xref target="messages_offer" format="title" /></em> message:</dt> <dd> - <t> - In the <strong>Expecing IBF Last</strong> state the active peer continuously receives <em><xref target="messages_ibf" format="title" /></em> - messages from the passive peer. When the last <em><xref target="messages_ibf_last" format="title" /></em> message is received - the active peer changes into <strong>Active Decoding</strong> state. - </t> + The <em><xref target="messages_offer" format="title" /></em> message + is received if the active peer has decoded an element that is present in the active peers set and is likely be missing in the + set of the passive peer. If the SHA-512 hash of the offer is indeed not a hash of any of the elements from the set of + the passive peer, the passive peer MUST answer with a <em><xref target="messages_demand" format="title" /></em> message + for that SHA-512 hash and remember that it issued this demand. The demand thus needs to be added to a list with unsatisfied demands. </dd> - <dt><strong>Finish Closing</strong> / <strong>Finish Waiting</strong></dt> + <dt><em><xref target="messages_elements" format="title" /></em> message:</dt> <dd> - <t> - In this states the peers are waiting for all demands to be satisfied and for the synchronisation - to be completed. When all demands are satisfied the peer changes into <strong>Finished</strong>state. - </t> + When a new <em><xref target="messages_elements" format="title" /></em> message has been received the peer checks if a corresponding + <em><xref target="messages_demand" format="title" /></em> for the element has been sent + and the demand is still unsatisfied. + If the element has been demanded the peer checks the element for validity, removes it from the list + of pending demands and then saves the element to the set. Otherwise the peer + ignores the element. </dd> - </dl> + <dt><em><xref target="messages_ibf" format="title" /></em> message:</dt> + <dd> + If an <em><xref target="messages_ibf" format="title" /></em> message is received, this + indicates that decoding of the IBF on the active site has failed and roles will be swapped. + The receiving passive peer transitions into the <strong>Expecting IBF Last</strong> state, + and waits for more <em><xref target="messages_ibf" format="title" /></em> messages. + There, once the final <em><xref target="messages_ibf_last" format="title" /></em> message has been received, it transitions to <strong>Active Decoding</strong>. + </dd> + <dt><em><xref target="messages_ibf_last" format="title" /></em> message:</dt> + <dd> + If an <em><xref target="messages_ibf_last" format="title" /></em> message is received this + indicates that there is just one IBF slice left and a direct state and role transition from + <strong>Passive Decoding</strong> to <strong>Active Decoding</strong> is initiated. + </dd> + <dt><em><xref target="messages_done" format="title" /></em> message:</dt> + <dd> + Receiving the <em><xref target="messages_done" format="title" /></em> message signals + the passive peer that all demands of the active peer have been satisfied. Alas, the + active peer will continue to process demands from the passive peer. + Upon receiving this message, the passive peer transitions into the + <strong>Finish Waiting</strong> state. + </dd> + </dl> + </dd> + <dt><strong>Active Decoding:</strong></dt> + <dd> + <t> + In the <strong>Active Decoding</strong> state the active peer decodes the IBFs and evaluates the set difference + between the active and passive peer. Whenever an element ID is obtained by decoding the IBF, the active peer + sends either an offer or an inquiry to the passive peer, depending on which site the decoded element is missing. + </t> + <t> + If the IBF decodes a positive (1) pure bucket, the element is missing on the passive peers site. + Thus, the active peer sends an <em><xref target="messages_offer" format="title" /></em> to the passive peer. + A negative (-1) pure bucket indicates that an element is missing in the active peers set, so the active peer + sends a <em><xref target="messages_inquiry" format="title" /></em> to the passive peer. + </t> + <t> + In case the IBF does not successfully decode anymore, the active peer sends a new IBF computed with a different IBF-salt to the passive peer + and changes into <strong>Passive Decoding</strong> state. This initiates a role swap. + To reduce overhead and prevent double transmission of offers and elements, the new IBF is created + on the local set after updating it with the all of the elements that have been successfully demanded. Note that the active peer MUST NOT wait for all active demands to be satisfied, as demands can fail if a bucket was falsely classified as pure. + </t> + <t> + As soon as the active peer successfully finished decoding the IBF, the active peer sends a + <em><xref target="messages_done" format="title" /></em> message to the passive peer. + </t> + <t> + All other actions taken by the active peer depend on the message the active peer receives from + the passive peer. The actions are described below on a per message basis: + </t> + <dl> + <dt><em><xref target="messages_offer" format="title" /></em> message:</dt> + <dd> + The <em><xref target="messages_offer" format="title" /></em> message indicates that the + passive peer received a <em><xref target="messages_inquiry" format="title" /></em> message from + the active peer. If a inquiry has been sent and + the offered element is missing in the active peers set, + the active peer sends a <em><xref target="messages_demand" format="title" /></em> message to the + passive peer. The demand needs to be added to a list of unsatisfied demands. + In case the received offer is for an element that is already in the set of the peer, the offer MUST BE ignored. <!-- FIXME: what do we do if the offer does not match any demand we ever issued? Is that OK? Worth checking? --> + </dd> + <dt><em><xref target="messages_demand" format="title" /></em> message:</dt> + <dd> + The <em><xref target="messages_demand" format="title" /></em> message indicates that the + passive peer received a <em><xref target="messages_offer" format="title" /></em> from + the active peer. The active peer satisfies the demand of the passive peer by sending an + <em><xref target="messages_elements" format="title" /></em> message if a offer request + for the element was sent earlier. Otherwise, the protocol MUST be aborted, as peers must never send demands for hashes that they have never been offered. + </dd> + <dt><em><xref target="messages_elements" format="title" /></em> message:</dt> + <dd> + If element is received that was not demanded or for which + the application-specific validation logic fails, the protocol + MUST be aborted. Otherwise, the corresponding demand is marked + as satisfied. Note that this applies only to the differential + synchronization mode. In full synchronization, it is perfectly + normal to receive + <xref target="messages_full_element" format="title" /> + messages for elements that were not demanded and that might + even already be known locally. + </dd> + <dt><em><xref target="messages_done" format="title" /></em> message:</dt> + <dd> + Receiving the message <em><xref target="messages_done" format="title" /></em> indicates + that all demands of the passive peer have been satisfied. The active peer then changes into the + <strong>Finish Closing</strong> state. + If the IBF has not finished decoding and the <em><xref target="messages_done" format="title" /></em> + is received, the other peer is not in compliance with the protocol and the protocol MUST be aborted. + </dd> + </dl> + </dd> + <dt><strong>Expecing IBF Last</strong></dt> + <dd> + <t> + In this state the active peer continuously receives <em><xref target="messages_ibf" format="title" /></em> + messages from the passive peer. When the last <em><xref target="messages_ibf_last" format="title" /></em> message is received, + the peer changes into the <strong>Active Decoding</strong> state. + </t> + </dd> + <dt><strong>Finish Closing</strong> / <strong>Finish Waiting</strong></dt> + <dd> + <t> + In this states the peers are waiting for all demands to be satisfied and for the synchronisation + to be completed. When all demands are satisfied the peer changes into <strong>Finished</strong> state. + </t> + </dd> + </dl> </section> <section anchor="modeofoperation_combined-mode" numbered="true" toc="default"> - <name>Combined Mode</name> - <t> - In the combined mode the <xref target="modeofoperation_full-sync" format="title" /> and - the <xref target="modeofoperation_individual-elements" format="title" /> - are combined to minimize resource consumption. - </t> - <t> - The <xref target="modeofoperation_individual-elements" format="title" /> is only efficient on small set - differences or if the byte-size of the elements is large. If the set difference is estimated to be large - the <xref target="modeofoperation_full-sync" format="title" /> is - more efficient. The exact heuristics and parameters on which the protocol decides which mode - MUST be used are described in the <xref target="performance" format="title" /> section of this document. - </t> - <t> - There are two main cases when a <xref target="modeofoperation_full-sync" format="title" /> - is always used. - The first case is when one of the peers announces having an empty set. This is announced by setting - the SETSIZE field in the <em><xref target="messages_se" format="title" /></em> to 0. - The second case is if the application requests full synchronisation explicitly. - This is useful for testing and MUST NOT be used in production. - </t> - <t> - <eref target="https://git.gnunet.org/lsd0003.git/plain/statemachine/full_state_machine.png">Link to statemachine diagram</eref> - </t> + <name>Combined Mode</name> + <t> + In the <em>combined mode</em> the protocol decides between + <xref target="modeofoperation_full-sync" format="title" /> and + the <xref target="modeofoperation_individual-elements" format="title" /> + to minimize resource consumption. Typically, the protocol always runs + in combined mode, but implementations MAY allow applications to force + the use of one of the modes for testing. In this case, applications MUST + ensure that the respective options to force a particular mode are used by + both participants. + </t> + <t> + The <xref target="modeofoperation_individual-elements" format="title" /> is only efficient on small set + differences or if the byte-size of the elements is large. If the set difference is estimated to be large + the <xref target="modeofoperation_full-sync" format="title" /> is + more efficient. The exact heuristics and parameters on which the protocol decides which mode + MUST be used are described in the <xref target="performance" format="title" /> section of this document. + </t> + <t> + There are two main cases when a <xref target="modeofoperation_full-sync" format="title" /> + is always used. + The first case is when one of the peers announces having an empty set. This is announced by setting + the SETSIZE field in the <em><xref target="messages_se" format="title" /></em> to 0. + <!-- FIXME: why not also do this if sending the elements is about as + expensive as sending the SE? Should be a simple calculation. --> + The second case is if the application requests full synchronisation explicitly. + This is useful for testing and MUST NOT be used in production. + </t> + <t> + The state diagram illustrating the combined mode can be found + <eref target="https://git.gnunet.org/lsd0003.git/plain/statemachine/full_state_machine.png">here</eref>. + </t> </section> </section> - <section anchor="messages" numbered="true" toc="default"> <name>Messages</name> - + <t> + This section describes the various message formats used by the protocol. + </t> <section anchor="messages_operation_request" numbered="true" toc="default"> <name>Operation Request</name> @@ -1199,7 +1209,7 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | +-----+-----+-----+-----+-----+-----+-----+-----+ | APX +-----+-----+-----+-----+-----+-----+-----+-----+ / - / / + / APPLICATION DATA / / / ]]></artwork> </figure> @@ -1207,19 +1217,24 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | <dl> <dt>MSG SIZE</dt> <dd> - is a 16-bit unsigned integer in network byte order, which describes the message size in bytes and header included. + is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. </dd> <dt>MSG TYPE</dt> <dd> - the type of SETU_P2P_OPERATION_REQUEST as registered in <xref target="gana" format="title" />, in network byte order. + is the type of SETU_P2P_OPERATION_REQUEST as registered in <xref target="gana" format="title" />, in network byte order. </dd> <dt>ELEMENT COUNT</dt> <dd> - is the number of the elements the requesting party has in its set, as a 32-bit unsigned integer in network byte order. + is the number of the elements the requesting party has in its set, as a 32-bit unsigned integer in network byte order. </dd> <dt>APX</dt> <dd> - is a SHA-512 hash that identifies the application. + is a SHA-512 hash that identifies the application. + </dd> + <dt>APPLICATION DATA</dt> + <dd> + is optional, variable-size application specific data that can be used + by the application to decide if it would like to answer the request. </dd> </dl> </section> @@ -1237,8 +1252,8 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | The <em>IBF</em> message is sent at the start of the protocol from the initiating peer in the transaction between <strong>Expect SE</strong> -> <strong>Expecting IBF Last</strong> or when the IBF does not decode and there is a role change in the transition between <strong>Active Decoding</strong> -> <strong>Expecting IBF Last</strong>. - This message is only sent if there are more than one IBF slice to be sent, in case there is just - one slice the <xref target="messages_ibf_last" format="title" /> message is sent. + This message is only sent if there is more than one IBF slice to be sent. If there is just + one slice, then only the <xref target="messages_ibf_last" format="title" /> message is sent. </t> </section> <section anchor="messages_ibf_structure" numbered="true" toc="default"> @@ -1261,7 +1276,7 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | <dl> <dt>MSG SIZE</dt> <dd> - is a 16-bit unsigned integer in network byte orderwhichdescribes the message size in bytes and header included. + is a 16-bit unsigned integer in network byte orderwhichdescribes the message size in bytes with the header included. </dd> <dt>MSG TYPE</dt> <dd> @@ -1270,52 +1285,49 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | <dt>IBF SIZE</dt> <dd> - is a 32-bit unsigned integer which signals the number of buckets in the IBF. + is a 32-bit unsigned integer which signals the total number of buckets in the IBF. The minimal number of buckets is 79. </dd> - <dt>IMCS</dt> + <dt>IMCS</dt> <!-- FIXME: bad placement, destroys alignment of offset+salt. Move after salt! --> <dd> - IBF max counter size is a 8-bit unsigned integer, which describes the number of bit that - is required to store a single counter. This is used for the unpacking function as described + is a 8-bit unsigned integer, which describes the number of bits that + are required to store a single counter. This is used for the unpacking function as described in the <xref target="performance_counter_variable_size" format="title" /> section. </dd> <dt>OFFSET</dt> <dd> - is a 32-bit unsigned integer which signals the offset to the following ibf slices in the original. + is a 32-bit unsigned integer which signals the offset of the following IBF slices in the original. </dd> - <dt>SALT</dt> + <dt>SALT</dt> <!-- FIXME: do we need a 32-bit field here? Maybe use only 16 bits, and then have some room for IMCS without destroying alignment? --> <dd> - is a 32-bit unsigned integer that contains the salt which was used to create the + is a 32-bit unsigned integer that contains the IBF-salt which was used to create the IBF. </dd> <dt>IBF-SLICE</dt> <dd> - - <t> - are variable numbers of slices in an array. A single slice contains multiple 64-bit IDSUMS, - 32-bit HASHSUMS and 1-64bit COUNTERS of variable size. In the network order the array of IDSUMS is first, followed - by an array of HASHSUMS and ended with an array of COUNTERS (details are described in section - <xref target="performance_counter_variable_size" format="default"/>). Length of the array is - defined by MIN( SIZE - OFFSET, MAX_BUCKETS_PER_MESSAGE). MAX_BUCKETS_PER_MESSAGE is defined as - 32768 divided by the BUCKET_SIZE which is 13-byte (104-bit). The minimal number of buckets in a single - IBF is 79. - </t> - <t> - To get the IDSUM field, all IDs hitting a bucket are added up with a binary XOR operation. - See <xref target="ibf_format_id_generation" format="title" /> details about ID generation. - </t> - <t> - The calculation of the HASHSUM field is done accordingly to the calculation of the IDSUM field: - all HASHes are added up with a binary XOR operation. - The HASH value is calculated as described in detail in section <xref target="ibf_format_HASH_calculation" format="title" />. - </t> - <t> - The algorithm to find the correct bucket in which the ID and the HASH have to be added - is described in detail in section <xref target="ibf_m" format="title" />. - </t> - - <t> - Test vectors for an implementation can be found in the <xref target="test_vectors" format="title" /> section - </t> + <t> + are variable numbers of slices in an array. A single slice contains multiple 64-bit IDSUMS, + 32-bit HASHSUMS and (1-64)-bit COUNTERS of variable size. All values are in the network byte order. The array of IDSUMS is serialized first, followed + by an array of HASHSUMS. Last comes an array of unsigned COUNTERS (details of the COUNTERS encoding are described in section + <xref target="performance_counter_variable_size" format="default"/>). The length of the array is + defined by MIN( SIZE - OFFSET, MAX_BUCKETS_PER_MESSAGE). MAX_BUCKETS_PER_MESSAGE is defined as + 32768 divided by the BUCKET_SIZE which is 13 bytes (104 bits). <!-- FIXME: 13 byte is no longer correct, given variable-size counters! Should probably be computed by properly considering the IMCS value! --> + </t> + <t> + To get the IDSUM field, all IDs (computed with the IBF-salt) hitting a bucket under the map M are added up with a binary XOR operation. + See <xref target="ibf_format_id_generation" format="title" /> details about ID generation. + </t> + <t> + The calculation of the HASHSUM field is done accordingly to the calculation of the IDSUM field: + all HASHes are added up with a binary XOR operation. + The HASH value is calculated as described in detail in section <xref target="ibf_format_HASH_calculation" format="title" />. + </t> + <t> + The algorithm to find the correct bucket in which the ID and the HASH have to be added + is described in detail in section <xref target="ibf_m" format="title" />. + </t> + <t> + Test vectors for an implementation can be found in the <xref target="test_vectors" format="title" /> section + </t> </dd> </dl> <figure anchor="figure_ibf_slice"> @@ -1333,32 +1345,40 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | +-----+-----+-----+-----+-----+-----+-----+-----+ / / / / -* Counter size is variable. In this example the size is 32-bit (4-byte) +* Counter size is variable. In this example the IMCS is 32 (4 bytes). ]]></artwork> </figure> </section> </section> <section anchor="messages_ibf_last" numbered="true" toc="default"> - <name>IBF Last</name> + <name>IBF Last</name> - <section anchor="messages_ibf_last_description" numbered="true" toc="default"> - <name>Description</name> - <t> - This message indicates the remote peer that all slices of the bloom filter have been sent. - The binary structure is exactly the same as the <xref target="messages_ibf_structure" format="title" /> of - the message <xref target="messages_ibf" format="title" /> with a different "MSG TYPE" - which is defined in <xref target="gana" format="title" /> "SETU_P2P_IBF_LAST". - </t> - <t> - Receiving this message initiates the state transmissions - <strong>Expecting IBF Last</strong> -> <strong>Active Decoding</strong>, - <strong>Expecting IBF</strong> -> <strong>Active Decoding</strong> and - <strong>Passive Decoding</strong> -> <strong>Active Decoding</strong>. This message - can initiate a peer the roll change from <strong>Active Decoding</strong> to - <strong>Passive Decoding</strong>. - </t> - </section> + <section anchor="messages_ibf_last_description" numbered="true" toc="default"> + <name>Description</name> + <t> + This message indicates to the remote peer that this is the last slice + of the Bloom filter. The receiving peer MUST check that the sizes and + offsets of all received IBF slices add up to the total IBF SIZE that was + given. + </t> + <t> + Receiving this message initiates the state transmissions + <strong>Expecting IBF Last</strong> -> <strong>Active Decoding</strong>, + <strong>Expecting IBF</strong> -> <strong>Active Decoding</strong> and + <strong>Passive Decoding</strong> -> <strong>Active Decoding</strong>. This message + can initiate a peer the roll change from <strong>Active Decoding</strong> to + <strong>Passive Decoding</strong>. + </t> + </section> + <section anchor="messages_ibf_last_structure" numbered="true" toc="default"> + <name>Structure</name> + <t> + The binary structure is exactly the same as the <xref target="messages_ibf_structure" format="title" /> of + the message <xref target="messages_ibf" format="title" /> with a different "MSG TYPE" + which is defined in <xref target="gana" format="title" /> "SETU_P2P_IBF_LAST". + </t> + </section> </section> <section anchor="messages_elements" numbered="true" toc="default"> @@ -1374,11 +1394,11 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | This message is sent in the state <strong>Active Decoding</strong> and <strong>Passive Decoding</strong> as answer to a <em><xref target="messages_demand" format="title" /></em> message from the remote peer. The <em><xref target="messages_elements" format="title" /></em> message can also be received in the <strong>Finish Closing</strong> or <strong>Finish Waiting</strong> - state after receiving a <em><xref target="messages_done" format="title" /></em> message from the remote peer, in this + state after receiving a <em><xref target="messages_done" format="title" /></em> message from the remote peer. In this case the peer changes to the <strong>Finished</strong> state as soon as all demands for elements have been satisfied. </t> <t> - This message is exclusively sent in the <xref target="modeofoperation_individual-elements" format="title" />. + This message is exclusively used in the <xref target="modeofoperation_individual-elements" format="title" />. </t> </section> <section anchor="messages_elements_structure" numbered="true" toc="default"> @@ -1399,29 +1419,29 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | <dl> <dt>MSG SIZE</dt> <dd> - is a 16-bit unsigned integer in network byte order, which describes the message size in bytes and header included. + is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. </dd> <dt>MSG TYPE</dt> <dd> - the type of SETU_P2P_ELEMENTS as registered in <xref target="gana" format="title" /> in network byte order. + is SETU_P2P_ELEMENTS as registered in <xref target="gana" format="title" /> in network byte order. </dd> <dt>E TYPE</dt> <dd> - element type is a 16-bit unsigned integer which defines the element type for + is a 16-bit unsigned integer which defines the element type for the application. </dd> <dt>PADDING</dt> <dd> - is 16-bit always set to zero + is 16-bit always set to zero. </dd> <dt>E SIZE</dt> <dd> - element size is a 16-bit unsigned integer that signals the size of the elements data part. + is a 16-bit unsigned integer that signals the size of the elements data part. </dd> - <dt>AE TYPE</dt> + <dt>AE TYPE</dt><!-- FIXME: E TYPE vs. AE TYPE remains quite unclear to me. Why do we have both? If we get rid of one, we could also drop PADDING! --> <dd> - application specific element type is a 16-bit unsigned integer that is needed to identify - the type of element that is in the data field + is a 16-bit unsigned integer that is needed to identify + the type of element that is in the data field. </dd> <dt>DATA</dt> <dd> @@ -1470,12 +1490,12 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | </dd> <dt>MSG TYPE</dt> <dd> - the type of SETU_P2P_OFFER as registered in <xref target="gana" format="title" /> in network byte order. + is SETU_P2P_OFFER as registered in <xref target="gana" format="title" /> in network byte order. </dd> <dt>HASH</dt> <dd> is a SHA 512-bit hash of the element that is requested with a <em><xref target="messages_inquiry" format="title" /></em> message. - </dd> + </dd><!-- FIXME: OFFER was defined as a variable-size message that CAN include multiple HASH values in one message. Current code does not use this, but we should continue to define it as such in the protocol! --> </dl> </section> </section> @@ -1511,17 +1531,18 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | <dl> <dt>MSG SIZE</dt> <dd> - is a 16-bit unsigned integer in network byte order, which describes the message size in bytes and header included. + is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. </dd> <dt>MSG TYPE</dt> <dd> - the type of SETU_P2P_INQUIRY as registered in <xref target="gana" format="title" /> in network byte order. + is SETU_P2P_INQUIRY as registered in <xref target="gana" format="title" /> in network byte order. </dd> <dt>IBF KEY</dt> <dd> is a 64-bit unsigned integer that contains the key for which the inquiry is sent. </dd> </dl> + <!-- FIXME: INQUIRY was defined as a variable-size message that CAN include multiple IBF KEY values in one message. I did not check if the current code uses this, but we should continue to define it as such in the protocol - especially as for INQUIRIES this is particularly easy to implement and an important optimization given that the overhead is otherwise rather large per IBF KEY! --> </section> </section> @@ -1568,6 +1589,7 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | is a 512-bit Hash of the element that is demanded. </dd> </dl> + <!-- FIXME: DEMAND was defined as a variable-size message that CAN include multiple HASH values in one message. Current code does not use this, but we should continue to define it as such in the protocol! --> </section> </section> @@ -1578,7 +1600,7 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | <name>Description</name> <t> The <em><xref target="messages_done" format="title" /></em> message is sent when all <em><xref target="messages_demand" format="title" /></em> messages - have been successfully satisfied and the set is complete synchronized. + have been successfully satisfied and from the perspective of the sender the set is completely synchronized. </t> <t> This message is exclusively sent in the <xref target="modeofoperation_individual-elements" format="title" />. @@ -1599,11 +1621,11 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | <dl> <dt>MSG SIZE</dt> <dd> - is a 16-bit unsigned integer in network byte order, which describes the message size in bytes and header included. + is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. The value is always 4 for this message type. </dd> <dt>MSG TYPE</dt> <dd> - the type of SETU_P2P_DONE as registered in <xref target="gana" format="title" /> in network byte order. + is SETU_P2P_DONE as registered in <xref target="gana" format="title" /> in network byte order. </dd> </dl> </section> @@ -1637,7 +1659,7 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | <dl> <dt>MSG SIZE</dt> <dd> - is a 16-bit unsigned integer in network byte order, which describes the message size in bytes and header included. + is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. The value is always 4 for this message type. </dd> <dt>MSG TYPE</dt> <dd> @@ -1680,11 +1702,11 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | <dl> <dt>MSG SIZE</dt> <dd> - is a 16-bit unsigned integer in network byte order, which describes the message size in bytes and header included. + is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. The value is always 16 for this message type. </dd> <dt>MSG TYPE</dt> <dd> - the type of SETU_P2P_REQUEST_FULL as registered in <xref target="gana" format="title" /> in network byte order. + is SETU_P2P_REQUEST_FULL as registered in <xref target="gana" format="title" /> in network byte order. </dd> <dt>REMOTE SET DIFF</dt> <dd> @@ -1739,11 +1761,11 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | <dl> <dt>MSG SIZE</dt> <dd> - is a 16-bit unsigned integer in network byte order, which describes the message size in bytes and header included. + is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. The value is always 16 for this message type. </dd> <dt>MSG TYPE</dt> <dd> - the type of SETU_P2P_REQUEST_FULL as registered in <xref target="gana" format="title" /> in network byte order. + is SETU_P2P_REQUEST_FULL as registered in <xref target="gana" format="title" /> in network byte order. </dd> <dt>REMOTE SET DIFF</dt> <dd> @@ -1775,7 +1797,7 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | <xref target="messages_operation_request" format="title" /> message has been received. </t> <t> - The strata estimator is used to estimate the difference between the two sets as described in section <xref target="se" format="counter" />. + The strata estimator is used to estimate the difference between the two sets as described in section <xref target="se" format="title" />. </t> <t> When the initiating peer receives the strata estimator, the peer decides which <xref target="modeofoperation" format="title" /> to use @@ -1789,6 +1811,7 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | <t> The IBFs in a strata estimator always have 79 buckets. The reason why can be found in Summermatter's work. <xref target="byzantine_fault_tolerant_set_reconciliation" format="default"/> </t> + <!-- Give a more precise reference into the thesis for this, do not cite the whole thesis! --> </section> <section anchor="messages_se_structure" numbered="true" toc="default"> <name>Structure</name> @@ -1808,11 +1831,11 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | <dl> <dt>MSG SIZE</dt> <dd> - is a 16-bit unsigned integer in network byte order, which describes the message size in bytes and header included. + is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. </dd> <dt>MSG TYPE</dt> <dd> - the type of SETU_P2P_SE as registered in <xref target="gana" format="title" /> in network byte order. + is SETU_P2P_SE as registered in <xref target="gana" format="title" /> in network byte order. </dd> <dt>SEC</dt> <dd> @@ -1828,27 +1851,32 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | <dd> is variable in size and contains the same structure as the IBF-SLICES field in the <xref target="messages_ibf" format="title" /> message. </dd> + <!-- FIXME: this is very imprecise, as there are multiple IBFs with that have + been sampled, and then again multiple IBFs with a different IBF-salt. + Please be MUCH more precise here! --> </dl> </section> </section> <section anchor="messages_sec" numbered="true" toc="default"> - <name>Strata Estimator Compressed</name> + <name>Strata Estimator Compressed</name> - <section anchor="messages_sec_description" numbered="true" toc="default"> - <name>Description</name> - <t> - The Strata estimator can be compressed with gzip to improve performance. This can be recognized - by the different message type number from <xref target="gana" format="title" />. - </t> - <t> - Since the content of the message is the same as the uncompressed Strata Estimator, the details - are not repeated here. For details see section <xref target="messages_se" format="counter" />. - </t> - </section> + <section anchor="messages_sec_description" numbered="true" toc="default"> + <name>Description</name> + <t> + The Strata estimator can be compressed with gzip to improve performance. This can be recognized + by the different message type number from <xref target="gana" format="title" />. + </t> + <t> + Since the content of the message is the same as the uncompressed Strata Estimator, the details + are not repeated here. For details see section <xref target="messages_se" format="counter" />. + </t> + <!-- FIXME: keep the 'structure' subsection, and also needs to clarify WHAT is being + compressed, and should cite the RFC on the compression method used! (IIRC gzip deflate + mode); note that the header is not subject to compression, so this must be clarified!) --> + </section> </section> - <section anchor="messages_full_element" numbered="true" toc="default"> <name>Full Element</name> @@ -1870,7 +1898,9 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | </t> </section> <section anchor="messages_full_element_structure" numbered="true" toc="default"> - <name>Structure</name> + <name>Structure</name> + <!-- MAYBE just refer to the "ELEMENT" section on structure and only + note the different MSG TYPE here? --> <figure anchor="figure_full_element"> <artwork name="" type="" align="left" alt=""><![CDATA[ 0 8 16 24 32 40 48 56 @@ -1887,15 +1917,15 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | <dl> <dt>MSG SIZE</dt> <dd> - is a 16-bit unsigned integer in network byte order, which describes the message size in bytes and header included. + is a 16-bit unsigned integer in network byte order, which describes the message size in bytes with the header included. </dd> <dt>MSG TYPE</dt> <dd> - the type of SETU_P2P_REQUEST_FULL_ELEMENT as registered in <xref target="gana" format="title" /> in network byte order. + is SETU_P2P_REQUEST_FULL_ELEMENT as registered in <xref target="gana" format="title" /> in network byte order. </dd> <dt>E TYPE</dt> <dd> - element type is a 16-bit unsigned integer which defines the element type for + is a 16-bit unsigned integer which defines the element type for the application. </dd> <dt>PADDING</dt> @@ -1904,11 +1934,11 @@ hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | </dd> <dt>E SIZE</dt> <dd> - element size is a 16-bit unsigned integer that signals the size of the elements data part. + is a 16-bit unsigned integer that signals the size of the elements data part. </dd> <dt>AE TYPE</dt> <dd> - application specific element type is a 16-bit unsigned integer that is needed to identify + is a 16-bit unsigned integer that is needed to identify the type of element that is in the data field </dd> <dt>DATA</dt>