Messages sent by a member of the group can only be read by other members of the group.
The content of messages sent by any member to the group remain untampered.
If the group chat reports that some message was delivered by a certain user, then this is true.
Any message in the group chat must have been sent by a user inside the group chat. Attackers can't plant unauthorised messages.
If a user's session state is compromised, their past messages in the group remain confidential. This holds even if an attacker has recorded past encrypted messages.
If a user's session state is compromised, security can be regained as secrets are automatically and frequently renewed, preventing an attacker from reading future messages.
Modelling real-life private off-the-record conversations, it should be possible for a user to deny that a message came from them. While this seems counter to authentication, practically what it means is that the recipients of a message know that the message is real and authenticated, but cannot prove it to an outside party.
Deniability can be achieved cryptographically, but there are practical constraints that may be considered, such as the fact that some messaging applications authenticate senders at the server level.
The opposite of deniability, non-repudiation makes it possible to prove to a third party that a message was sent by a certain user, which is usually the case when using external certificate authorities for authentication.
A pair of large prime numbers used for both asymmetric encryption and digital signatures. A public key uniquely matches to only one private key, and due to the maths involved, it is computationally unfeasible to calculate a private key from a public key, given that large enough numbers are used. The private key is kept secret and the public key freely shared.
If used for asymmetric encryption, the sender can encrypt their message with a public key, then only those who have the matching private key will be able to decipher it. For instance, if user A wants to send a secure message to user B, they would encrypt the message with B's public key. B is the only one with the matching private key and thus even if the encrypted message is captured in the network, B is still the only one who can read it.
If used for signatures, the message or document to be signed is hashed, producing a hash that represents the original data (this is to make it shorter and easier to sign). The hash is encrypted with the private key and attached to the original document as a signature. Any other user can decrypt the signature with the public key, and compare this to their own calculated hash of the document. If they are the same, this means that the document has not been changed from when it was signed.
One of the most well-known secure messaging protocols to date, it has all of the required security properties, but only for communication between two parties. Extending this to group chats means creating pairwise communication channels between each member and every other member. This scales in O(n2)! It takes O(n) just to send a single message, since it has to be encrypted separately for everyone. Way too much work!
Used by WhatsApp, the Sender Keys protocol involves a user sending the same symmetric key to all other group participants. (This assumes that secure two-way channels between individual members already exist or can be created temporarily.) Then sending messages is just O(1), encrypt once and let the server pass it on to all the other group members! And adding a new member is O(n), since the new member has to give their key to everyone else.
The bad news is that removing members is O(n2). All keys are deleted and the group essentially starts over, so that the leaving member doesn't have access to any of its secrets. Since everyone has to send their new key to everyone, this is quite inefficient.
Rotating a single member's key, in the case of compromise, is only O(n), but this is only weak post-compromise security, as an attacker is likely to have captured the entire user state, which includes the keys of the other users. It would be safer to do a full refresh, which is O(n2).
A paper from the ACM Conference on Computer and Communications Security which proposed an internal binary tree structure for efficient Diffie-Hellman key exchange within large groups. With O(1) message sending and O(lgn) key rotation and member removal, it formed the basis for a new protocol, MLS... which was codified as an Internet Standard in July 2023.
The secure messaging platform Wire boasts a full implementation of MLS, as one of the co-developers of the protocol itself. Google has also announced that it will be integrating MLS into Google Messages, giving this standard large industry backing.
The rachet tree is the underlying data stucture which makes key exchange so efficient. It's a perfect (complete) binary tree, composed of nodes that are either blank or have a public and private key pair. Clients are represented by leaf nodes, and there may be empty leaf nodes to complete the tree. The private key of any node is only known by clients who are children of said node. Another way to put it is that clients may only know the private keys of nodes on their direct path to the root.
In the above example, the clients know all the public keys associated with each client and each non-blank intermediate node. A and B may additionally know the private keys of X and Y, while C and D may know the private key of Y. However, it's not necessary for a client to know all private keys of nodes above them. If they have missing information we call these leaf nodes "unmerged" with respect to their ancestors. Each non-blank node contains an ordered list of unmerged leaves beneath them. (This is the empty list for any leaf nodes.)
In the above example, client C is unmerged, as it does not know the private key at Y.
The resolution of the node is an ordered list that represents the set of keys required to encrypt to every client that is a descendant of that node. For a non-blank node, it contains itself and any unmerged leaves. For instance, in the second tree shown the resolution of Y is [Y, C]. Any message would have to be encrypted with the key at Y, allowing A, B and D to receive it, then separately encrypted to C, as it does not have the private key of Y. The resolution of the blank node is [C, D], and the resolution of X is just [X], since A and B both have the private key at X.
Direct paths are, intuitively, the sequence of parent nodes between a node and the root. The copath of a node is the node's sibling and any siblings of nodes in its direct path. For example, in the second tree, the copath of D is [C, X], and the copath of A is [B, (the blank node)]. The filtered direct path of a node is its direct path, with any nodes whose corresponding copath child has an empty resolution is removed. Consider if client B did not exist, and there was an empty leaf node at that point. Then the filtered direct path of A would simply be [Y], since encrypting to X would be the same as encrypting to the non-copath child (in this case, A, which does not even need to be encrypted to).
Proposals are messages broadcast to suggest changes to the group composition or its secrets. Add proposals request for a certain client to be added to the group, given their KeyPackage. Update proposals replace the sender's own leaf node to update its keys. Remove proposals request that a member of the group be removed. There are some other proposal types, such as PreSharedKey (PSK), requesting that additional entropy be injected into the next epoch - for instance to prove that all group members were also members of a certain previous epoch, since every epoch has a "resumption PSK" secret. This could be used to restore the group after major compromise, taking the resumption PSK from an epoch before the comproimse. ReInit proposals suggest completely reinitialising the group with new parameters. Finally, ExternalInit allow external users to request joining the group, given that they have been provided with a GroupContext containing public information about the group.
I also add my own implementation-specific proposal type, AdministratorAdd and AdministratorRemove, which respectively make a user an administrator and remove the administrator position from a user. These are explained further in the Administrator section.
Proposals are usually sent by members of the group, but some exceptions exist. For instance, the server provides an inactivity service which is authorised to propose member removal given a certain period of inactivity, set by the original group creator (and able to be updated without advancing the epoch). The inactivity service is registered within the external_senders list of the GroupContext.
Proposals can be sent as either public or private (encrypted) messages. It is preferred to send them as private messages, to reduce leakage about the group composition and other information. However, external proposals must be public, as external senders don't have the keys necessary to encrypt the messages to the group.
The action of proposals are not carried out until a commit containing the proposals is received.
Clients receive proposals, which are ordered into a proposal list. Once a client has observed valid proposals, it cannot send regular messages until those proposals are committed. This ensures that the proposals are swiftly checked and their changes applied, reducing the number of messages sent under a possibly compromised epoch. Proposal lists can only be turned into a valid commit if the individual proposals are all valid. If it contains multiple Update or Remove proposals about the same leaf, only one can be applied at a time. The committer chooses the Remove over any Updates. If there are only multiple Updates, the most recent one is chosen. A commit also cannot remove the committer, nor can a commiter approve an Update that they themselves initiated. If duplicate Add proposals are given, only one is committed.
A client organises the proposals into a valid proposal list and commits the changes, performing any updates needed to their view of the tree and GroupContext. The committer sends all members of the group for the next epoch any additional information they need to similarly update their view. Commits are labelled with the relevant epoch number so that clients can identify whether the commit matches the epoch they are in.
Multiple commit messages may be sent for the same epoch simultaneously. My implementation uses the "eventually consistent" model to resolve commits, which places less trust in the delivery system. When clients receive a commit, they pause sending new messages for a certain amount of time based on recorded network latency, to wait for new commits to arrive. Ties are broken deterministically. Commits which contain Removes are prioritised, and after that, commits which contain AdministratorRemoves. If two commits share part of their proposal list, the one with more proposals is prioritised, unless the extra proposals involve more Removes. Commits which contain ReInit proposals (ReInit proposals are always sent solo within a Commit) are always prioritised last. The other commit is rejected and its proposals may neeed to be proposed again.
In some cases, network partition may occur, either from an extremely laggy network or malicious denial or reordering of commit messages. If a group significantly diverges, then it can be recreated using ReInit proposals in one subgroup towards members of another subgroup. This may involve the resumption PSK to prove membership in the original group.
In my implementation, some users of the group are administrators. There must be at least one administrator in the group. By default, the creator of the group is an administrator. Besides the ratchet tree representing membership in the group, there is a separate administrator rachet tree which contains administrator keys. Every member of the group has a public view of this tree, which is used to verify administrator signatures. All administrators are part of this administrator tree and have private keys corresponding to their node or higher intermediate nodes. To sign a message as an administrator, the highest (most general) signature in the tree that the administrator has knowledge of is used.
Administrator signatures are optional and separate from the signature designating who the sender is. It is padded with zero-bytes if not used. If sent attached to a private message, the signature is within the encryption. A message which does not have a valid administrator signature or zero-byte padding is rejected as malformed, to prevent it being used as a covert channel of communication. A client checks whether the signature matches a private key within the resolution of the administrator tree's root node.
Only administrator users are allowed to commit proposals. When any user receives a commit message, they check whether it is signed by an administrator. If not, then the commit is ignored. Additionally, most proposals are invalid unless proposed by administrator users. The only proposals that regular users are authorised to propose are leaving the group (self-Remove), or updating their own key (self-Update), in which case it is checked whether the target leaf node corresponds to the sender of the proposal.
Administrators may be added and removed from the administrator tree in the same way as the regular membership tree, through proposals initiated by other administrators. Similar to the regular remove, an administrator may not commit a proposal removing themselves as an administrator. If a commit would result in no administrators remaining, then it is rejected. Deterministically, if a client receives two valid commits in which administrators remove each other, the first one will be applied. Changes to the administrator tree do not necessarily create a new epoch secret.
Administrators may add new users by retrieving a key package associated with that user from the server, which contains a public key that can be used to temporarily encrypt messages to that client. Users upload key package to a semi-trusted key server, with different time limits such as one-time use packages, medium term limited reuse packages, and a fallback key package to be used when other key packages are exhausted. It is preferred to use key packages with the shortest lifetime possible, and discard after use, to prevent attackers from being able to crack the encryption or perform replay attacks. Key packages are signed with users' private identity keys whose public counterparts are stored by a semi-trusted credential authority, to verify that the keys are really owned by a given user. The server rejects excessive repeated key package requests from the same source within a small timeframe, to prevent malicious exhaustion of key packages (resulting in the less secure fallback being used). If a user has not been online for an excessive period of time, and their only uploaded key packages are stale, the application issues a warning when attempting to add them.
The key package's public key is used to add the new member to the group, assigning them a previously blank leaf node or extending the tree as needed. The new member is sent a "Welcome" message, encrypted with this public key, which contains the current group context, including the public view of the tree, and the private key to the root node. The new member may not yet know the private keys of intermediate nodes (ie. it is unmerged). It is the responsibility of other nodes in the same sub-trees to inform the new user of these private keys. The new member knows nothing about the previous epoch secret, maintaining forward security, as even if they receive messages from the previous epoch, they cannot decrypt them.
If it is necessary to add new nodes to the ratchet tree, it is preferred to keep the tree left-balanced. A new empty tree is created, the same size as the old ratchet tree. The old tree becomes the left child and the new tree becomes the right child. A new key pair is generated for the new root and distributed to all members. The new node is added as the left most child of the new right subtree.
How do we update the tree? When a member wants to update their key, or if a member is to be removed, then its node and all nodes on its path are blanked.
If a member is performing a key update, they generate a fresh HPKE pair for their leaf, and generate new path secrets for every node on its filtered direct path. The first path secret is randomly generated, while the next secrets are derived in sequence using a derivation function. These secrets are not directly used; instead, they are used to derive intermediate "node secrets", which are then used as material to derive a key pair for the node using the KEM. This is designed so that each secret is only used in one algorithm, making it more difficult for attackers to find out the private key of a node given the key of one of its descendants.
This diagram is taken from the MLS standard documentation:
path_secret[1] ---> node_secret[1] -------> node_priv[1], node_pub[1]
^
|
|
path_secret[0] ---> node_secret[0] -------> node_priv[0], node_pub[0]
^
|
|
leaf_secret ------> leaf_node_secret --+--> leaf_priv, leaf_pub
| |
'-------. .-------'
|
leaf_node
Derivation of Ratchet Tree Keys along a Direct Path
The proposal for a key update contains the new public keys for each updated node, as well as encrypted copies of the path secrets. These path secrets are encrypted with the public keys of children that are in the resolution of nodes to be updated.
In the above tree, if A wishes to update their key, they blank out nodes A, X, Y and Z. A generates a sequence of secret material from which it derives node secrets and then node keys. A attaches the new public keys for all the nodes to their key update proposal. Then, they encrypt path_secret[0], from which can be derived the new private key of X, with the current public key of B. This can be done because A's key update does not affect B's key; even if A's current user state is compromised, B's is independent and safe. path_secret[1], corresponding to Y, is attached twice, one encrypted by C's public key, the other encrypted by D's. Finally, path_secret[2] is also encrypted once, with the public key at V, which both E and F can decrypt.
Each member need only receive a single path secret. From this they can derive the path secrets further up the tree, populating their direct path, but cannot access previous path secrets, maintaining the tree invariant that a node's private key is only known by its direct children.
As another example, if F wished to update their key, they would blank out their node, W, V and Z. F only needs to generate new secrets and keys for [F, W, Z], its filtered direct path - V can remain blank as its other child has an empty resolution, ie. there are no leaf nodes in it. After generating the required keys and secrets, path_secret[0], corresponding to W, would be encrypted with E's public key, path_secret[1], corresponding to Z, would be encrypted two separate times, for node Y and node C (an unmerged leaf).
If a user is to be removed, then new entropy is introduced into the epoch secret by the proposer, which is then distributed to every member besides the leaving member in an efficient way based on the shared subgroup keys. In the tree above, if F was to be removed, the proposal would include a new secret which can be combined with the current epoch secret to derive a new secret. This is used to create a new public and private key pair for the root node. This new secret would be encrypted for Y, C and E, to distribute it to the other members.
If the entire right subtree of the root then contains an empty resolution, it can be removed entirely. This would be the case if both E and F were removed.
We've discussed that every epoch has a new epoch secret. The sequence of epoch secrets is called the Key Schedule. The epoch secret can be used to derive multiple other secrets. These include the resumption PSK for proving membership in this epoch, an epoch_authenticator which clients can compare out-of-band to ensure they are in the same epoch, and, relevant to sending messages, the sender_data_secret and encryption_secret. Respectively, these are used to derive keys for encrypting sender data and message contents.
We're not yet done with trees - there's one more important tree making up the MLS protocol. The secret key has the same structure as the current ratchet tree, with clients represented by leaves at the same position. The initial secrets of the root is derived from the encryption_secret, and each child node derives a secret from its parent. Leaf nodes are used to generate two symmetric ratchets, a handshake ratchet and an application rachet. On each step of the ratchet, a new key/nonce pair is generated, which can be used to encrypt a single message. The next step or "generation" of the ratchet is also derived. Each key/nonce pair MUST only be used once and thereafter discarded. Handshake messages are those that contain MLS Proposals or Commits, while application messages contain regular data or messages - the type of message determines which rachet is used.
Essentially, what the secret tree accomplishes is distributing a symmetric ratchet that represents each sender to every other member of the group. The sender uses values from their ratchet to encrypt messages, and receivers can decrypt messages using their copy of the ratchet. The sender_data_secret is used to protect the sender data from being read by anyone outside the group, because the receiver needs to know which user sent the message to select the correct ratchet for decrypting the message. A key/nonce pair is derived from a combination of the sender_data_secret and a sample of the message's ciphertext. The sender_data_secret is not used directly so that it is not consumed.
After all that setup... how do we send messsages???
There are two distinct access categories of messages: Public and Private messages. Public messages can be read by anyone, while private messages are encrypted so that only those within the group can read them. No matter the type, all messages are signed to authenticate the sender, producing an AuthenticatedContent object. The signature keys are stored at the leaf nodes of the ratchet tree. AuthenticatedContent objects are then wrapped as a public or private messages, then as an MLS message, and sent. (Note that Welcome messages, KeyPackages and GroupInfo are also considered high-level "MLS" messages.)
Public messages are relatively simple - they have two cases, either they were sent by a member or sent from an external source. If it was sent by a member, it must include a membership_tag derived from a combination of the membership_key (itself derived from the epoch secret) and the authenticated data sent in the public message. If it was sent from an external source, it should contain proposals for group operations such as joins or external removes, which may be committed by an administrator member.
Private messages are encrypted. Content is first zero-padded with an amount determined by the sender. A receiver must check whether the decoded bits are entirely zeroed in the padding area; if not, the message must be rejected. This padded content is encrypted using the ciphersuite's AEAD algorithm, with the key and nonce derived from the sender's relevant ratchet treem depending on whether this is a handshake or application message. To prevent accidental reuse of a key/nonce pair, the first four bytes of the nonce are first XOR'd with a fresh random value called a reuse guard. This is given to the receiver within the sender_data part of the message.
The sender_data contains the leaf index of the sender, the generation of the ratchet, and the reuse_guard. The sender_data is encrypted as described in the previous section so that it can only be read by members of the group.
MLS flexibly supports any ciphersuite composed of a combination of HPKE parameters, a Key Encapsulation Mechanism (KEM), an AEAD encryption scheme, a Signature authentication scheme and a mesasge authentication scheme. The aim is to balance wide support for ciphersuites, increasing the number of clients able to use the service, with security.
A group must use the same ciphersuite between all members. Members can have multiple key packages corresponding to different ciphersuites; this "advertises" the ciphersuites they support. My application actively blocks the use of outdated algorithms that are discouraged by IETF standard, such as the SHA-1 hashing algorithm, which has had major cryptographic breakthroughs in recent years, including relatively practical chosen-prefix attacks that would allow signed messages to be forged without possessing the correct signature private key.
MLS negotiates the highest supported ciphersuite between all members, and downgrades or upgrades existing ciphersuites as necessary as users are added and removed from the group. By blocking the use of insecure algorithms, downgrade attacks are prevented, where a malicious joiner tries to significantly decrease group security by only offerring weak ciphersuites
The MLS protocol does not inherently have a concept of "users", only clients, which represent individual devices. The application determines if devices are owned by the same user if their leaf nodes have the same signature key, which corresponds with the user's identity. Note that each device still has its own key pair corresponding to its position in the rachet tree. This makes it so that users, even those who are not administrators, have the authority to remove devices belonging to themselves from the group.
When adding a new device, a user shares their previous signature key, and the public group context, through a mechanism like a QR code, as the devices are likely to be in close proximity. They cannot directly share their view of the group, as this would violate the tree invariant, giving this new client access to keys that it shouldn't. Instead, the new client proposes an ExternalInit with a new KeyPackage, the signature key and group context provided by the previous device, as well as a key proving membership in the group in this epoch, derived from the epoch's membership_key and a fresh random value contained in the proposal. This adds the new device as a separate client within the group.
When removing a device, a user refreshes their signature key pair on all other devices. The secrets of the ratchet tree remain safe.
The attacker is not able to actually decrypt the message without the reuse_guard, which is encrypted behind a secret derived from the sender_data_secret. If the attacker was able to brute force the reuse guard (as it is only four bytes), they would be able to read the single message which was encrypted with this key/nonce pair. As such, confidentiality is broken. However, they would not be able to modify the message and resend it, because they are not able to sign the content with an authenticated signature from within the group, whether it be that of the original sender or a different group member. So, integrity and authentication is preserved.
They would not be able to read any other messages, as new secrets are derived from the ratchet tree, and not directly from used message keys. This preserves post-compromise security. Similarly, the attacker cannot derive past message keys, preserving perfect forward secrecy.
Message injection is not a problem as all messages must be authenticated, with the exception of external joins. However, external joins require having a copy of the group context, which is always sent over secure channels to new joiners. External joins are also always manually verified by administrators before being committed, so if an unexpected user attempts to join, it can be rejected.
Dropped messages can simply result in denial of service which prevents the group from operating. It is difficult to prove that denial of service is occurring without out-of-band comparison. There's no way for an application or protocol to work if the delivery service simply refuses to transmit. Otherwise, the network can deliberately drop some messages so as to cause a partition between the group based on different received commit messages. If discovered in time, through out-of-band comparison of the epoch_authenticator, then the split groups can be reinitialised back together; otherwise, a new group needs to be created with a resumption key from before the group split.
This could occur when an untrustworthy key server provides wrong keys for a member to be added to the group. Impersonation could occur, but this usually requires the impersonator to have sufficient knowledge of the person they are impersonating, if they actively send messages. This is a separate problem altogether out of the scope of the implementation. However, what is within scope is a man-in-the-middle attack where an attacker is added to a group in lieu of the real member, and they pass message back and forth between the real group and the real member.
This is more difficult than man-in-the-middle between two party communication, as the middleman may have to simulate an entire feasible fake group for the unknowing member to be added to, with separate clients masquerading as the real senders from the actual group. However, it is not impossible. To detect this, each epoch has an epoch_authenticator secret that can be compared out-of-band; if it is the same, then these two members know that they at least are in the same group during the same epoch. However, this changes with every epoch, so the comparison would have to be repeated each time, as epoch changes can introduce attackers or secretly split the group (as in the above section).
Currently, this implementation of MLS doesn't include deniability; it instead aims for non-repudiation. It would require exchanging the signatures over deniable channels, so that it is impossible for one user to prove that a certain key really corresponds to a certain user. These are not provided in this implementation; instead, we send over regular asymmetrically encrypted channels.
Additionally, for deniability to be used practically, it must extend beyond Cryptographic deniability, where it can be theoretically proven that a technically savvy adversary could generate false messages without knowledge of another user's private signature key. For deniability to be taken seriously, it needs to be made easier to access; for instance a feature within the application allowing users to change their local message history by sending messages as another person to themselves. This is beyond the scope of my implementation, but it's curious to note the gap between theoretical deniability, and the sort of deniability actually deemed plausible in scenarios such as the court of law.
Less to do with the security of the group itself, and more the security of the application in the context of the wider device. C2 attacks involve malware planted on a device being able to communicate with attackers through covert channels. This can be done through esoteric methods as companies increase their security, such as through printer networks or disguising data in Slack message packets, which are authorised to pass through the network. To prevent this application from being used in such a way, all padding fields are completely zeroed out when set; and any packets received with unexpected non-zero bits are rejected.