January 26, 2016

The Wrong Number Attack

XMPP is federated, similar to email, which means different domains can connect to each other. Back in the early days, when a server initiated a connection to another server, the initiating server could be reasonably sure it had connected to the right place, as it had resolved the DNS records itself (remember, it’s 1999). But the receiving server had no guarantee that the incoming connection actually came from the domain it claimed.


Thus dialback was introduced. The mechanism is simple: the initiating server sends a key (the dialback key) to the receiving server. Then the receiving server connects back to the server that the initiating server claimed to be and sends the key. That server replies whether that key is valid or not.

This mechanism creates quite a barrier for spoofing, phishing and spamming. Of course, it is not secure against attackers that can manipulate DNS results (as the receiving server has to resolve the initiating server’s domain), but that requires an active attacker.

To generate dialback keys, many implementations work as follows: they generate a random string (the dialback secret) and then calculate the key as a hash of the dialback secret, the domains and the stream ID. When verifying an incoming connection, servers often don’t check whether an outgoing connection is actually pending; they simply recompute the key and compare it. The assumption is that everyone who has the dialback secret is authorized to make connections. The stream ID is generated by the receiving server to protect against replay attacks.
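The derivation and the stateless check described above can be sketched roughly as follows. This is a minimal illustration loosely modelled on the HMAC-based construction recommended in XEP-0185, not any server's actual code: the function names, the example domains, the secret value and the choice to key the HMAC with the raw SHA-256 digest of the secret are all assumptions.

```python
import hashlib
import hmac


def dialback_key(secret: str, receiving: str, originating: str, stream_id: str) -> str:
    """Derive a dialback key from the shared secret, both domains and the stream ID."""
    # Assumption: the secret is hashed first and the raw digest keys the HMAC.
    hashed_secret = hashlib.sha256(secret.encode("utf-8")).digest()
    message = f"{receiving} {originating} {stream_id}".encode("utf-8")
    return hmac.new(hashed_secret, message, hashlib.sha256).hexdigest()


def verify_key(secret: str, receiving: str, originating: str,
               stream_id: str, presented_key: str) -> bool:
    """Stateless check: recompute the key instead of looking up a pending connection."""
    expected = dialback_key(secret, receiving, originating, stream_id)
    return hmac.compare_digest(expected, presented_key)


secret = "example-dialback-secret"  # hypothetical value
key = dialback_key(secret, "receiver.example", "sender.example", "stream-1")
print(verify_key(secret, "receiver.example", "sender.example", "stream-1", key))  # True
print(verify_key(secret, "receiver.example", "sender.example", "stream-2", key))  # False
```

Because the check only needs the secret, any cluster node holding it can answer the verification request, which is also why the secrecy of that one value matters so much.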

Generating keys this way makes it easy to support clustering: all the servers for a domain share the dialback secret, so the initiating server and the server doing the validation (called the authoritative server in the XEP) don’t need to be the same. When not using clustering, it would be annoying to require the user to generate a secret, so many servers automatically generate one.



The first place where I found the problem was in Prosody. Prosody uses a randomly generated UUID as the dialback secret. However, Lua itself doesn’t include a cryptographically-secure pseudo-random number generator and neither do any of the required dependencies. So instead they took entropy from the few sources that were available and fed that iteratively into a custom PRNG.

The amount of entropy those sources contain varies between platforms, but in the worst case could be pretty low. The initial seed uses the current time, the amount of CPU time used so far and a pointer to a new Lua table. The moment a server restarts is quite easy to observe (all connections close and Prosody even sends its uptime to those who request it) and the CPU time used just after the server has restarted is typically very low (0.01-0.04). The pointer can take a few values, but not much more than 2^19.

The dialback secret is the very first thing the PRNG generates, at a point when the entropy is still minimal. This means the secret is easy to brute-force. A very naive implementation could do it in around 6 minutes.

All an attacker needs is a real domain to obtain a dialback key from the vulnerable server, which they can then use to brute-force the dialback secret and impersonate the vulnerable server to any other server, without any network interception or modification.
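The shape of such a brute-force search can be sketched as below. This is a toy model under stated assumptions, not Prosody's PRNG or UUID code: `toy_secret` is a hypothetical stand-in that derives a secret purely from guessable inputs (start time, CPU time, table slot), `dialback_key` is an assumed HMAC-based derivation, and the domains and search ranges are made up and far smaller than a real attack would need.

```python
import hashlib
import hmac
from itertools import product


def toy_secret(start_time: int, cpu_centis: int, table_slot: int) -> str:
    """Hypothetical stand-in: a 'secret' derived only from guessable inputs."""
    material = f"{start_time}:{cpu_centis}:{table_slot}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def dialback_key(secret: str, receiving: str, originating: str, stream_id: str) -> str:
    """Assumed key derivation: HMAC over both domains and the stream ID."""
    hashed_secret = hashlib.sha256(secret.encode("utf-8")).digest()
    message = f"{receiving} {originating} {stream_id}".encode("utf-8")
    return hmac.new(hashed_secret, message, hashlib.sha256).hexdigest()


def brute_force_secret(observed_key, receiving, originating, stream_id,
                       times, cpu_values, slots):
    """Try every candidate seed until the recomputed key matches the observed one."""
    for start_time, cpu_centis, table_slot in product(times, cpu_values, slots):
        candidate = toy_secret(start_time, cpu_centis, table_slot)
        guess = dialback_key(candidate, receiving, originating, stream_id)
        if hmac.compare_digest(guess, observed_key):
            return candidate
    return None


# The vulnerable server connects to the attacker's domain and sends a key.
victim_secret = toy_secret(1453800000, 3, 1234)
observed = dialback_key(victim_secret, "attacker.example", "victim.example", "stream-1")

found = brute_force_secret(
    observed, "attacker.example", "victim.example", "stream-1",
    times=range(1453799998, 1453800003),  # a few seconds around the observed restart
    cpu_values=range(1, 5),               # 0.01-0.04 s of CPU time
    slots=range(2048),                    # small stand-in for the pointer values
)
print(found == victim_secret)  # True
```

With the secret recovered, the attacker can compute a valid key for any receiving domain and stream ID.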

This was fixed by using /dev/urandom instead (and dropping support for platforms where /dev/urandom doesn’t exist): https://hg.prosody.im/0.9/rev/c633e1338554 and released in 0.9.9.

More info: https://prosody.im/security/advisory_20160108-2/
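For comparison, generating a secret from the operating system's random source looks roughly like this in Python. It illustrates the general idea behind the fix rather than Prosody's Lua code; the function name and the 32-byte length are assumptions.

```python
import secrets


def generate_dialback_secret(num_bytes: int = 32) -> str:
    """Return a hex-encoded secret drawn from the operating system's CSPRNG."""
    # The secrets module relies on the OS random source rather than a
    # time-seeded userspace generator.
    return secrets.token_hex(num_bytes)


secret = generate_dialback_secret()
print(len(secret))  # 64 hex characters for 32 random bytes
```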


After I reported this, I wondered how it would work in servers that support clustering, so I looked over the ejabberd source code. I noticed ejabberd generates a random number to use as the key and then stores it in a database together with the target domain.

However, this number was also not generated using a CSPRNG: it was generated using random:uniform() [1], just like stream IDs and random resources. Observing 2 stream IDs is enough to unambiguously compute the internal state of the PRNG in less than a second and then compute all numbers it has generated and will generate.

The number is stored in a database with the domain, but not with the stream ID, so it can be used multiple times to authenticate a different stream. The attacker can guess which number was used for this connection and open a second connection to the target server, which will then be authenticated successfully. So the attacker can only impersonate the vulnerable server to the server the key was originally generated for, and only while the original connection is open (as otherwise the entry gets deleted from the database). It will likely take a few guesses (but it’s unlikely anyone would notice), so it’s harder to pull off than the Prosody attack, but not impossible.
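The consequence of storing the key per domain only can be shown with a toy model. This is not ejabberd's code or schema: the class and method names are hypothetical, and the step of predicting the key from the PRNG is skipped by assuming the attacker's guess is already correct.

```python
import hmac
import secrets


class DomainOnlyKeyStore:
    """Toy authoritative-side store that remembers one key per target domain."""

    def __init__(self):
        self._pending = {}

    def issue(self, target_domain: str, stream_id: str) -> str:
        key = secrets.token_hex(16)
        self._pending[target_domain] = key  # the stream ID is not recorded
        return key

    def verify(self, target_domain: str, stream_id: str, key: str) -> bool:
        # The stream ID plays no part in the lookup, so any stream to the
        # same domain is accepted while the entry exists.
        stored = self._pending.get(target_domain)
        return stored is not None and hmac.compare_digest(stored, key)

    def close(self, target_domain: str) -> None:
        self._pending.pop(target_domain, None)


store = DomainOnlyKeyStore()
key = store.issue("target.example", "stream-A")

# A second stream presenting the same (guessed) key is also accepted.
print(store.verify("target.example", "stream-B", key))  # True

# Once the original connection closes, the entry is gone.
store.close("target.example")
print(store.verify("target.example", "stream-B", key))  # False
```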

This was fixed by using crypto:rand_uniform instead: https://github.com/processone/ejabberd/commit/fb8a51136519a190145265736c4243095e2516ec and released in 16.01.


I had reported the issue hoping to get a coordinated release ready. But then I wondered about the other implementations out there, so I checked out Openfire.

There, I discovered that the implementation is very similar to Prosody, using a dialback secret to derive dialback keys. However, dialback secrets were generated using Java’s Random class, which is also not cryptographically-secure.

I did not try to brute-force this one myself, but there is enough documentation to show that it’s easy: http://stackoverflow.com/a/11052736/353187.
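To give a sense of why a generator of this kind is predictable, here is a small Python model of the 48-bit linear congruential generator documented for java.util.Random. It is a sketch only: it is neither Openfire's code nor a reproduction of the linked answer, the class and function names are made up, and outputs are treated as unsigned 32-bit values (a real Java `nextInt()` result would first need masking with `0xFFFFFFFF`).

```python
MULTIPLIER = 0x5DEECE66D
INCREMENT = 0xB
MASK = (1 << 48) - 1


class JavaLikeRandom:
    """Python model of the 48-bit LCG documented for java.util.Random."""

    def __init__(self, seed: int):
        self.state = (seed ^ MULTIPLIER) & MASK

    def next_int(self) -> int:
        """Advance the state and return its top 32 bits as an unsigned value."""
        self.state = (self.state * MULTIPLIER + INCREMENT) & MASK
        return self.state >> 16


def recover_states(first: int, second: int) -> list:
    """Find every internal state that outputs `second` right after `first`."""
    candidates = []
    # Only the 16 low bits of the state are hidden, so try all of them.
    for low_bits in range(1 << 16):
        state = (first << 16) | low_bits
        following = (state * MULTIPLIER + INCREMENT) & MASK
        if following >> 16 == second:
            candidates.append(following)
    return candidates


victim = JavaLikeRandom(seed=12345)
first, second, third = victim.next_int(), victim.next_int(), victim.next_int()

predictions = []
for state in recover_states(first, second):
    clone = JavaLikeRandom(0)
    clone.state = state  # resume from the recovered state
    predictions.append(clone.next_int())

print(third in predictions)  # True
```

Two observed outputs and at most 65,536 trial states are enough to predict what comes next, which is why secrets derived from such a generator offer little protection.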

This was fixed by using SecureRandom instead: https://github.com/igniterealtime/Openfire/commit/ccfee2eac3f45cfcce31acb1b0132e76c122557d and released in 4.0.0.


Then I looked at Tigase, which was the only one that actually did things right, using a CSPRNG and following the recommendations of XEP-0185.


Once an attacker’s connection has been incorrectly validated using dialback, the attacker can send fake messages from any user on the vulnerable domain (or subscription requests, or file transfers, etc.). However, the attacker does not receive stanzas back: only the initiator of a stream can send stanzas, unless XEP-0288 is used.

The three implementations Prosody (including Metronome), ejabberd and Openfire together make up a large part of the network. This means that if you haven’t upgraded in the last month you really should get to it!

  1. This PRNG actually has 96 bits of internal state, which is close to enough. However, after the first number, only ~2^44 states are possible. Though this could be brute-forced, some number theory is enough to compute the internal state from a full output value.