February 14, 2014

HTTPS Attacks and XMPP 1: BEAST


In the past couple of years, a number of attacks have been found on “TLS”, but often those attacks were only shown with HTTPS. The majority of TLS encrypted traffic is probably HTTPS, but it’s important to understand which of these attacks can be translated to other protocols. I’ll use XMPP, but I’ll try to get the attacks down to the core features the used protocol needs to support to help others determine which other protocols are also vulnerable.

The core difference between HTTPS and XMPP is that connections are long-lived. While an attacker who has tricked you into running their JavaScript code can make multiple HTTPS requests per second, a reconnect on XMPP will be easier to notice and slower.

I shall mostly assume that the attacker knows your full JID and is therefore capable of routing stanzas to you. This may not always be realistic, but targeted attacks do happen.

The Target

First of all, there needs to be something that is going to be extracted. Something secret that was transmitted which the attacker wants to obtain. For HTTPS, this is often cookies: they are transmitted often, an attacker can cause them to be retransmitted whenever they want, and they (often) give full access to your account. Other attacks attempt to extract data from the body of a page, like email addresses.

On XMPP, possible targets include:

  1. The user’s password (assuming we’re using SASL PLAIN)
  2. If we wouldn’t have it yet, the user’s JID
  3. The user’s contact list
  4. Message meta-data
  5. Message content

BEAST

General explanation

BEAST exploits a problem with CBC mode in TLS which leads to predictable IVs. CBC mode is a block cipher mode that xors a plain text block with the previous cipher text block before encrypting it, to ensure equal plain text blocks do not turn into the same cipher text blocks. However, TLS 1.0 also does this when that block has already been transmitted. This means the attacker can observe that block and pick the next block to be encrypted based on that information.

In other words, suppose P[i] are the plain text blocks the client sends and C[i] are the cipher text blocks. Then C[i] := AES(P[i] XOR C[i-1]). Suppose C[a] was the last block of a packet. Then, if the server could somehow convince the client to send P[a+1] = C[a] XOR Q, then the encrypted packet will be C[a] = AES(P[a+1] XOR C[a]) = AES(C[a] XOR Q XOR C[a]). If you recall the properties of XOR, this is equal to AES(Q). So the attacker can obtain the encryption of any block Q!

This is what’s known as an encryption oracle. The attacker can pick any value Q and obtain its encryption, but it can not decrypt anything. But by making clever guesses, the attacker might be able to obtain the decryption of specific blocks, if they guessed correctly. Suppose the attacker is interested in the contents of the bth block, then the attacker must make a guess of what that block was, lets call its guess B, and it will make the client send B XOR C[b-1] (to account for the CBC mode). If AES(B XOR C[b-1]) was equal to C[b], then the attacker knows their guess was correct and that P[b] = B. If the guess was wrong, the attacker will have to try a different value instead of B and continue until the encryption is equal.

Now, guessing even a 10 character cookie or password correctly is going to take a long time, so BEAST uses another trick: it will insert content before the cookie to make sure the first character of the cookie falls in one block, but the rest of the cookie in the next. If the attacker does know the rest of the block, it only needs to guess one character, which can be done quite quickly. For the next connection, the attacker will insert one character less, so two characters of the cookie fall in the first block. It already knows the first one of the two, so it again needs to guess only one character. It can continue this until it has the entire cookie. This means the attacker is able to guess a long cookie in quite a low number of guesses.

On XMPP

Attacking the password on XMPP is much harder, as the attacker can not insert new content before the password. It is quite likely the password spans different blocks, but that still leaves a large number of possibilities. The contact list is likely also too hard to guess without any prior information. Guessing the JID might be possible if the attacker has a list of all users on the server, but if the attacker didn’t know the JID, it would be much harder for them to actually do this attack.

Guessing messages is also hard, but obtaining message meta-data might be possible.

Suppose the attacker already has a list of contacts who are on my contact list. I send a packet and the attacker is interested in finding out whether it was a message to one of my contacts, and if so, to whom. Here’s the plain text of the message, shown split at the block boundraries:
1: <message type='c
2: hat' id='purplea
3: 1074a2d' to='jul
4: iet@im.example.c
5: om'><active xmln
6: s='http://jabber
7: .org/protocol/ch
...

The interesting info is spread over blocks 3, 4 and 5. However, block 3 also contains part of purplea1074a2d. We can guess this (hint: it’s an incrementing counter), but lets suppose we don’t know it, just its length. The attacker can use BEAST to test every person on my contact list to the 4th block, leaving out the first 3 characters and using at most 16 characters of every JID. It might fail (the attacker doesn’t know for sure it’s a message, it could be an <iq/> stanza or groupchat message), but if the message is for one of my contacts, the attacker will find that.

There is, however, one more requirement on BEAST that we need to satisfy: the attacker must be able to fully specify the first block in a new packet. It won’t work if it’s a later block. In general, the client will only start new packet with a new stanza. So the packet will almost always start with <iq, <message or <presence, and anything the client sends must be valid UTF-8 and valid XML. Therefore, making the client send B XOR C[b-1] exactly is very hard.

One way the attacker might try to get the client to use certain contents in its first block is by sending the target an <id/> with the payload in the ‘id’ attribute:

1: <iq id='�u��_$ճH
2: ' type='get' to=
...
The client will then mirror the ‘id’ and send a reply or error back:
1: <iq id='�u��_$ճH
2: ' type='error' t
...

But because of the prefix, the attack can only be successfull when B XOR C[b-1] happens to start with <iq id='. The probabilty of that happening is 2-64, which is neglible.

Conclusion

The main requirement for BEAST is that the attacker must be able to choose the block a client will use as the start of a new packet. Additionally, the secret needs to be either easy to guess or it needs to be movable by the attacker so it can be placed on a block boundrary. I do not know of a way to do the former in XMPP, so it is unlikely XMPP is vulnerable.