WhatsApp has been plagued by numerous issues in their security: easily stolen passwords, unencrypted messages and even a website that can change anyone’s status. But that streak is not yet over.
To be clear: this post is not about using IMEI numbers as your password. That issue has been fixed. Logging in on a new device currently works as follows:
The phone posts its phone number to an HTTPS URL to request an authentication code,
the phone receives an authentication code in a text message,
the authentication code is used, again over HTTPS, to obtain a password.
These passwords are quite long and never visible to the user, making them hard to steal from a phone.
With the password, the client can log in to the not-really-XMPP server that WhatsApp uses. For this it uses the custom SASL mechanism WAUTH-1. To log in with the phone number XXXXXXXXXXXX, the following happens (I’m showing the XML representation of the protocol; this is not what is actually sent):
To respond to the challenge, the client generates a key using PBKDF2 with the user’s password, the challenge data as the salt and SHA1 as the hash function. It only uses 16 iterations of PBKDF2, which is a little low these days, but we know the password is quite long and random so this does not concern me greatly. 20 bytes from the PBKDF2 result are used as a key for RC4, which is used to encrypt and MAC XXXXXXXXXXXX || YYYYYYYYYYYYYYYYYYYY || UNIX timestamp:
From now on, every message is encrypted and MACed (using HMAC-SHA1) using this key.
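The key derivation described above can be sketched in a few lines of Python. This is only an illustration of the stated parameters (PBKDF2, SHA1, 16 iterations, 20 bytes of output, the challenge as salt). The password and challenge values are hypothetical placeholders, and the `mac` helper returns the full HMAC-SHA1 tag; exactly which bytes WhatsApp feeds into the MAC is not modelled here.

```python
import hashlib
import hmac


def derive_session_key(password: bytes, challenge: bytes) -> bytes:
    """PBKDF2-HMAC-SHA1 with the challenge as salt, 16 iterations, 20 bytes out."""
    return hashlib.pbkdf2_hmac("sha1", password, challenge, 16, dklen=20)


def mac(key: bytes, data: bytes) -> bytes:
    """HMAC-SHA1 over `data`, keyed with the same key that is used for RC4."""
    return hmac.new(key, data, hashlib.sha1).digest()


# Hypothetical placeholder values, for illustration only.
password = b"not-a-real-whatsapp-password"
challenge = bytes(range(20))

key = derive_session_key(password, challenge)
print(key.hex())
print(mac(key, b"example payload").hex())
```

Note that a single 20-byte value ends up serving as the RC4 key and the HMAC key, for both directions of the connection. That is the root of both mistakes below.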
Mistake #1: The same encryption key in both directions
Let’s recall how RC4 is supposed to work: RC4 is a PRNG that generates a stream of bytes, which are xored with the plaintext that is to be encrypted. By xoring the ciphertext with the same stream, the plaintext is recovered.
However, recall that:
(A ^ X) ^ (B ^ X) = A ^ B
In other words: if we have two messages encrypted with the same RC4 key, we can cancel the key stream out!
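A quick way to convince yourself of this is to try it. In the sketch below, random bytes from `os.urandom` stand in for the RC4 output, and the two messages are made-up examples; the only point is that the key stream drops out.

```python
import os


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise xor of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


keystream = os.urandom(16)       # stands in for the RC4 key stream X
a = b"attack at dawn!!"          # made-up plaintext A
b = b"retreat at noon!"          # made-up plaintext B

ca = xor(a, keystream)           # A ^ X
cb = xor(b, keystream)           # B ^ X

# The key stream cancels: (A ^ X) ^ (B ^ X) == A ^ B
assert xor(ca, cb) == xor(a, b)

# Knowing one plaintext reveals the other without ever touching the key.
assert xor(xor(ca, cb), a) == b
```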
As WhatsApp uses the same key for the incoming and the outgoing RC4 stream, we know that ciphertext byte i on the incoming stream xored with ciphertext byte i on the outgoing stream will be equal to xoring plaintext byte i on the incoming stream with plaintext byte i of the outgoing stream. By xoring this with either of the plaintext bytes, we can uncover the other byte.
This does not directly reveal all bytes, but in many cases it will work: the first couple of messages exchanged are easy to predict from the information that is sent in plain. All messages still have a common structure, despite the binary encoding: for example, every stanza starts with 0xf8. Even if a byte is not known fully, sometimes it can be known that it must be alphanumeric or an integer in a specific range, which can give some information about the other byte.
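To make this concrete, here is a minimal sketch of the attack with a plain textbook RC4. The session key and both stanzas are hypothetical, and the real wire format is not modelled; the sketch only assumes, as described above, that both directions start the same RC4 key stream from the same key and that the attacker can guess the outgoing plaintext.

```python
def rc4_keystream(key: bytes, n: int) -> bytes:
    """Textbook RC4: key scheduling followed by n bytes of output."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt `data` by xoring it with the RC4 key stream."""
    return bytes(d ^ k for d, k in zip(data, rc4_keystream(key, len(data))))


key = bytes(range(20))                       # hypothetical session key
outgoing = b"\xf8" + b"known-ping-stanza"    # predictable client message
incoming = b"\xf8" + b"secret-reply-here"    # what the attacker wants to read

c_out = rc4(key, outgoing)                   # seen on the wire, client to server
c_in = rc4(key, incoming)                    # seen on the wire, server to client

# The attacker only has c_out, c_in and a guess for `outgoing`.
recovered = bytes(a ^ b ^ g for a, b, g in zip(c_in, c_out, outgoing))
assert recovered == incoming

# Both stanzas start with 0xf8, so the first ciphertext bytes are identical.
assert c_in[0] ^ c_out[0] == 0
```

A wrong guess for an outgoing byte only corrupts the recovered byte at that same position, so guesses can be refined one byte at a time.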
Mistake #2: The same HMAC key in both directions
The purpose of a MAC is to authenticate messages. But a MAC by itself is not enough to detect all forms of tampering: an attacker could drop specific messages, swap them or even transmit them back to the sender. TLS counters this by including a sequence number in the plaintext of every message and by using a different key for the HMAC for messages from the server to the client and for messages from the client to the server. WhatsApp does not use such a sequence counter and it reuses the key used for RC4 for the HMAC.
When an attacker retransmits, swaps or drops messages, the receiver cannot notice it, except for the fact that the decryption of the message is unlikely to be a valid binary-XMPP message. However, transmitting a message back to the sender at exactly the same place in the RC4 stream as it was originally transmitted will make it decrypt properly. Whether this can be exploited in any way, I don’t know.
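The MAC side of this can be sketched as follows. The first half shows that with a single shared key, a tag computed by the client also verifies at the client when the message is reflected. The second half shows the kind of countermeasure mentioned above: one key per direction plus a sequence number under the MAC. The key value, the labels and the way the per-direction keys are derived are assumptions made for this illustration; this is not TLS’s actual construction, and it only covers the MAC check, not the RC4 position.

```python
import hashlib
import hmac
import struct


def tag(key: bytes, msg: bytes) -> bytes:
    """HMAC-SHA1 tag over the message only, with no sequence number."""
    return hmac.new(key, msg, hashlib.sha1).digest()


def tag_seq(key: bytes, seq: int, msg: bytes) -> bytes:
    """HMAC-SHA1 tag over a 64-bit sequence number followed by the message."""
    return hmac.new(key, struct.pack(">Q", seq) + msg, hashlib.sha1).digest()


shared_key = bytes(range(20))            # hypothetical session key
msg = b"client-to-server stanza"         # made-up message

# One key for both directions: a reflected message passes the client's check.
sent_tag = tag(shared_key, msg)
reflected_ok = hmac.compare_digest(tag(shared_key, msg), sent_tag)
assert reflected_ok

# Countermeasure sketch: separate keys per direction (illustrative derivation).
c2s_key = hmac.new(shared_key, b"client->server", hashlib.sha1).digest()
s2c_key = hmac.new(shared_key, b"server->client", hashlib.sha1).digest()

sent_tag = tag_seq(c2s_key, 0, msg)

# Reflected to the client, which verifies with the server-to-client key: rejected.
assert not hmac.compare_digest(tag_seq(s2c_key, 0, msg), sent_tag)

# Replayed to the server, which now expects sequence number 1: rejected.
assert not hmac.compare_digest(tag_seq(c2s_key, 1, msg), sent_tag)
```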
You should assume that anyone who is able to eavesdrop on your WhatsApp connection is capable of decrypting your messages, given enough effort. You should consider all your previous WhatsApp conversations compromised. There is nothing a WhatsApp user can do about this except to stop using it until the developers can update it.
There are many pitfalls when developing a streaming encryption protocol. Considering they don’t know how to use xor correctly, maybe the WhatsApp developers should stop trying to do this themselves and accept a solution that has been reviewed, updated and fixed for more than 15 years, such as TLS.
The following is a Python script which can intercept messages to WhatsApp and which tries to decrypt the incoming messages by guessing all outgoing messages. It uses the WhatsApp library WhatsPoke for the FunXMPP parser.
This does not work for the official WhatsApp client. It assumes the client logs in and sends only pings to the server. This was tested using yowsup-cli -l -d -c config -k from https://github.com/tgalal/yowsup. To use it with the official client, you would need to figure out which outgoing messages the real client sends and deal with the fact that they might contain data which is not as easy to predict, or even come up with something more clever which can decrypt bytes in both streams using the other one.