XMPP is federated, similar to email, which means different domains can connect to each other. Back in the early days, when a server initiated a connection to another server, the initiating server could be reasonably sure it had connected to the right place, as it had resolved the DNS records itself (remember, it’s 1999). But the receiving server had no guarantee that the incoming connection actually came from the domain it claimed to be.
Thus dialback was introduced. The mechanism is simple: the initiating server sends a key (the dialback key) to the receiving server. Then the receiving server connects back to the server that the initiating server claimed to be and sends the key. That server replies whether that key is valid or not.
This mechanism creates quite a barrier for spoofing, phishing and spamming. Of course, it is not secure against attackers that can manipulate DNS results (as the receiving server has to resolve the initiating server’s domain), but that requires an active attacker.
To generate dialback keys, many implementations work as follows: they generate a random string (the dialback secret) and then calculate the key as a hash of the dialback secret, the domains and the stream ID. When verifying an incoming connection, servers often don’t check whether an outgoing connection is actually pending; they simply recompute the key and compare it. The assumption is that everyone who has the dialback secret is authorized to make connections. The stream ID is generated by the receiving server to protect against replay attacks.
Generating keys this way makes it easy to support clustering: all the servers for a domain share the dialback secret, so the initiating server and the server doing the validation (called the authoritative server in the XEP) don’t need to be the same. When not using clustering, it would be annoying to require the user to generate a secret, so many servers automatically generate one.
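A minimal sketch of this kind of key derivation, loosely following the shape recommended in XEP-0185 (an HMAC over the two domains and the stream ID, keyed with a hash of the secret). The function names and the exact encoding details are assumptions made for illustration, not any server's actual code:

```python
import hashlib
import hmac
import secrets


def new_dialback_secret() -> bytes:
    """Generate a dialback secret from the operating system's CSPRNG."""
    return secrets.token_bytes(32)


def dialback_key(secret: bytes, receiving: str, originating: str, stream_id: str) -> str:
    """Derive a key from the secret, both domains and the stream ID."""
    # Assumption: the HMAC key is the hex-encoded SHA-256 of the secret.
    hashed_secret = hashlib.sha256(secret).hexdigest().encode("ascii")
    message = f"{receiving} {originating} {stream_id}".encode("utf-8")
    return hmac.new(hashed_secret, message, hashlib.sha256).hexdigest()


def verify_dialback_key(secret: bytes, receiving: str, originating: str,
                        stream_id: str, key: str) -> bool:
    """Recompute the key and compare it in constant time."""
    expected = dialback_key(secret, receiving, originating, stream_id)
    return hmac.compare_digest(expected, key)
```

Because verification only needs the shared secret, any node of a cluster can answer the dialback request. It also means that the secret is the only thing standing between an attacker and a valid key.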
The first place where I found the problem was in Prosody. Prosody uses a randomly generated UUID as the dialback secret. However, Lua itself doesn’t include a cryptographically secure pseudo-random number generator, and neither do any of the required dependencies. So instead they took entropy from the few sources that were available and fed it iteratively into a custom PRNG.
The amount of entropy those sources contain varies between platforms, but in the worst case could be pretty low. The initial seed uses the current time, the amount of CPU time used so far and a pointer to a new Lua table. The moment a server restarts is quite easy to observe (all connections close and Prosody even sends its uptime to those who request it) and the CPU time used just after the server has restarted is typically very low (0.01-0.04). The pointer can take a few values, but not many more than 2^19.
The dialback secret is the very first thing the PRNG generates, at a moment when the collected entropy is still minimal. This means the secret is easy to brute-force. A very naive implementation could do it in around 6 minutes.
All an attacker needs is a real domain to obtain a dialback key from the vulnerable server, which they can then use to brute-force the dialback secret and impersonate the vulnerable server to any other server, without any network interception or modification.
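The shape of such a brute-force search can be sketched as follows. This is a toy model and not Prosody's actual PRNG or key format: `toy_secret`, `toy_key` and the 16-bit seed space are hypothetical stand-ins, chosen only to show that one observed key is enough to test every candidate seed offline.

```python
import hashlib
import hmac
from typing import Optional


def toy_secret(seed: int) -> bytes:
    """Hypothetical secret that depends on nothing but a small seed."""
    return hashlib.sha256(f"seed:{seed}".encode("ascii")).digest()


def toy_key(secret: bytes, receiving: str, originating: str, stream_id: str) -> str:
    """Hypothetical dialback key: an HMAC over the public stream parameters."""
    message = f"{receiving} {originating} {stream_id}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def brute_force_seed(observed_key: str, receiving: str, originating: str,
                     stream_id: str, seed_bits: int) -> Optional[int]:
    """Try every seed until the recomputed key matches the observed one."""
    for seed in range(1 << seed_bits):
        candidate = toy_key(toy_secret(seed), receiving, originating, stream_id)
        if hmac.compare_digest(candidate, observed_key):
            return seed
    return None


# The vulnerable server connects to the attacker's domain and sends a key.
victim_seed = 40_000
observed = toy_key(toy_secret(victim_seed), "attacker.example", "victim.example", "stream-1")
recovered = brute_force_seed(observed, "attacker.example", "victim.example", "stream-1", seed_bits=16)
```

Everything except the seed is known to the attacker, so the cost of the search is determined entirely by how many seeds are plausible.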
This was fixed by using /dev/urandom instead (and dropping support for platforms where /dev/urandom doesn’t exist): https://hg.prosody.im/0.9/rev/c633e1338554 and released in 0.9.9.
More info: https://prosody.im/security/advisory_20160108-2/
After I reported this, I wondered how it would work in servers that support clustering, so I looked over the ejabberd source code. I noticed ejabberd generates a random number to use as the key and then stores it in a database together with the target domain.
However, this number was also not generated using a CSPRNG: it was generated using random:uniform() [1], just like stream IDs and random resources. Observing 2 stream IDs is enough to unambiguously compute the internal state of the PRNG in less than a second and then compute all numbers it has generated and will generate.
The number is stored in a database with the domain, but not with the stream ID, so it can be used multiple times to authenticate a different stream. The attacker can guess which number was used for this connection and open a second connection to the target server, which will then be authenticated successfully. So the attacker can only impersonate the vulnerable server to the server the key was originally generated for, and only while the original connection is open (as otherwise the entry gets deleted from the database). It will likely take a few guesses (but it’s unlikely anyone would notice), so it’s harder to pull off than the Prosody attack, but not impossible.
This was fixed by using crypto:rand_uniform instead: https://github.com/processone/ejabberd/commit/fb8a51136519a190145265736c4243095e2516ec and released in 16.01.
I had reported the issue hoping to get a coordinated release ready. But then I wondered about the other implementations out there, so I checked out Openfire.
There, I discovered that the implementation is very similar to Prosody, using a dialback secret to derive dialback keys. However, dialback secrets were generated using Java’s Random class, which is also not cryptographically secure.
I did not try to brute-force this one myself, but there is enough documentation to show that it’s easy: http://stackoverflow.com/a/11052736/353187.
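To give an idea of why this is easy, here is a sketch of the well-known state recovery for a 48-bit linear congruential generator with the constants that java.util.Random documents. `JavaRandomModel` is a small Python model written for this illustration, and the sketch assumes the attacker sees two consecutive full 32-bit outputs, which is not necessarily how Openfire exposed them:

```python
from typing import List, Tuple

MULTIPLIER = 0x5DEECE66D
INCREMENT = 0xB
MASK = (1 << 48) - 1


def step(state: int) -> Tuple[int, int]:
    """Advance the LCG once; return (new_state, signed 32-bit output)."""
    new_state = (state * MULTIPLIER + INCREMENT) & MASK
    value = new_state >> 16  # the top 32 of the 48 state bits
    if value >= 1 << 31:
        value -= 1 << 32
    return new_state, value


class JavaRandomModel:
    """Python model of the documented java.util.Random algorithm."""

    def __init__(self, seed: int) -> None:
        self.state = (seed ^ MULTIPLIER) & MASK

    def next_int(self) -> int:
        self.state, value = step(self.state)
        return value


def recover_states(first: int, second: int) -> List[int]:
    """Return every state (after `second`) consistent with two outputs."""
    high = (first & 0xFFFFFFFF) << 16
    target = second & 0xFFFFFFFF
    matches = []
    for low in range(1 << 16):  # only 16 state bits are hidden
        new_state, _ = step(high | low)
        if new_state >> 16 == target:
            matches.append(new_state)
    return matches
```

Each output reveals 32 of the 48 state bits, so only 2^16 candidates remain to be checked against the second output. Once the state is known, `step` predicts every later value.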
This was fixed by using SecureRandom instead: https://github.com/igniterealtime/Openfire/commit/ccfee2eac3f45cfcce31acb1b0132e76c122557d and released in 4.0.0.
Then I looked at Tigase, which was the only one that actually did things right, using a CSPRNG and following the recommendations of XEP-0185.
An attacker opening a connection that has incorrectly been validated using dialback means the attacker can send fake messages from any user on the vulnerable domain (or subscription requests, or file transfers, etc.). However, the attacker does not receive stanzas back: only the initiator of a stream can send stanzas, unless XEP-0288 is used.
The three implementations Prosody (including Metronome), ejabberd and Openfire together make up a large part of the network. This means that if you haven’t upgraded in the last month you really should get to it!
[1] This PRNG actually has 96 bits of internal state, which is close to enough. However, after the first number, only ~2^44 states are possible. Though this could be brute-forced, some number theory is enough to compute the internal state from a full output value.
To deal with the Logjam attack, I was looking for a set of all “common” Diffie-Hellman parameters to update xmpp.net, yet I wasn’t able to find those easily. Even searching for the hexadecimal representation of commonly encountered primes often didn’t lead to the document in which they were specified. Here I’m documenting those that I found and where they are from.
Java “helpfully” includes a number of hard-coded default parameters. I’m unsure about how old these are exactly, probably from 1996/1997, maybe earlier.
Prime (512-bit):
FCA682CE8E12CABA26EFCCF7110E526DB078B05EDECBCD1E
B4A208F3AE1617AE01F35B91A47E6DF63413C5E12ED0899B
CD132ACD50D99151BDC43EE737592E17
Generator:
678471B27A9CF44EE91A49C5147DB1A9AAF244F05A434D64
86931D2D14271B9E35030B71FD73DA179069B32E2935630E
1C2062354D0DA20A6C416E50BE794CA4
Prime (768-bit):
E9E642599D355F37C97FFD3567120B8E25C9CD43E927B3A9
670FBEC5D890141922D2C3B3AD2480093799869D1E846AAB
49FAB0AD26D2CE6A22219D470BCE7D777D4A21FBE9C270B5
7F607002F3CEF8393694CF45EE3688C11A8C56AB127A3DAF
Generator:
30470AD5A005FB14CE2D9DCD87E38BC7D1B1C5FACBAECBE9
5F190AA7A31D23C4DBBCBE06174544401A5B2C020965D8C2
BD2171D3668445771F74BA084D2029D83C1C158547F3A9F1
A2715BE23D51AE4D3E5A1F6A7064F316933A346D3F529252
Prime (1024-bit):
FD7F53811D75122952DF4A9C2EECE4E7F611B7523CEF4400
C31E3F80B6512669455D402251FB593D8D58FABFC5F5BA30
F6CB9B556CD7813B801D346FF26660B76B9950A5A49F9FE8
047B1022C24FBBA9D7FEB7C61BF83B57E7C6A8A6150F04FB
83F6D3C51EC3023554135A169132F675F3AE2B61D72AEFF2
2203199DD14801C7
Generator:
F7E1A085D69B3DDECBBCAB5C36B857B97994AFBBFA3AEA82
F9574C0B3D0782675159578EBAD4594FE67107108180B449
167123E84C281613B7CF09328CC8A6E13C167A8B547C8D28
E0A3AE1E2BB3A675916EA37F0BFA213562F1FB627A01243B
CCA4F1BEA8519089A883DFE15AE59F06928B665E807B5525
64014C3BFECF492A
RFC 2409 “The Internet Key Exchange (IKE)”, written in 1998, includes a number of DH parameters in §6.
Prime (768-bit):
FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1
29024E088A67CC74020BBEA63B139B22514A08798E3404DD
EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245
E485B576625E7EC6F44C42E9A63A3620FFFFFFFFFFFFFFFF
Generator:
02
Prime (1024-bit):
FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1
29024E088A67CC74020BBEA63B139B22514A08798E3404DD
EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245
E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED
EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE65381
FFFFFFFFFFFFFFFF
Generator:
02
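A minimal sketch of how such a list can be put to use: parse the hexadecimal blocks into integers, sanity-check them, and look up a prime offered by a server. The 768-bit prime is copied from the RFC 2409 listing above. The helper names and the `KNOWN_GROUPS` table are assumptions for illustration, and the Miller-Rabin test with a fixed set of bases is only a probabilistic check.

```python
from typing import Optional

OAKLEY_GROUP_1 = """
FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1
29024E088A67CC74020BBEA63B139B22514A08798E3404DD
EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245
E485B576625E7EC6F44C42E9A63A3620FFFFFFFFFFFFFFFF
"""

SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)


def parse_hex_block(block: str) -> int:
    """Join a multi-line hexadecimal listing into one integer."""
    return int("".join(block.split()), 16)


def is_probable_prime(n: int) -> bool:
    """Miller-Rabin test using the small primes as bases."""
    if n < 2:
        return False
    for p in SMALL_PRIMES:
        if n % p == 0:
            return n == p
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for a in SMALL_PRIMES:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


KNOWN_GROUPS = {parse_hex_block(OAKLEY_GROUP_1): "RFC 2409, 768-bit"}


def identify_group(prime: int) -> Optional[str]:
    """Return the name of a known group, or None for an unknown prime."""
    return KNOWN_GROUPS.get(prime)
```

Checking that both p and (p - 1) / 2 pass the primality test confirms that a listed value is a safe prime and catches most copy-and-paste errors in the hexadecimal digits.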
RFC 3526 “More Modular Exponential (MODP) Diffie-Hellman groups for Internet Key Exchange (IKE)”, written in 2003, documents a number of DH parameters.
Prime (1536-bit):
FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1
29024E088A67CC74020BBEA63B139B22514A08798E3404DD
EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245
E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED
EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE45B3D
C2007CB8A163BF0598DA48361C55D39A69163FA8FD24CF5F
83655D23DCA3AD961C62F356208552BB9ED529077096966D
670C354E4ABC9804F1746C08CA237327FFFFFFFFFFFFFFFF
Generator:
02
(This is also the group used by OTR.)
Prime (2048-bit):
FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1
29024E088A67CC74020BBEA63B139B22514A08798E3404DD
EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245
E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED
EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE45B3D
C2007CB8A163BF0598DA48361C55D39A69163FA8FD24CF5F
83655D23DCA3AD961C62F356208552BB9ED529077096966D
670C354E4ABC9804F1746C08CA18217C32905E462E36CE3B
E39E772C180E86039B2783A2EC07A28FB5C55DF06F4C52C9
DE2BCBF6955817183995497CEA956AE515D2261898FA0510
15728E5A8AACAA68FFFFFFFFFFFFFFFF
Generator:
02
Prime (3072-bit):
FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1
29024E088A67CC74020BBEA63B139B22514A08798E3404DD
EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245
E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED
EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE45B3D
C2007CB8A163BF0598DA48361C55D39A69163FA8FD24CF5F
83655D23DCA3AD961C62F356208552BB9ED529077096966D
670C354E4ABC9804F1746C08CA18217C32905E462E36CE3B
E39E772C180E86039B2783A2EC07A28FB5C55DF06F4C52C9
DE2BCBF6955817183995497CEA956AE515D2261898FA0510
15728E5A8AAAC42DAD33170D04507A33A85521ABDF1CBA64
ECFB850458DBEF0A8AEA71575D060C7DB3970F85A6E1E4C7
ABF5AE8CDB0933D71E8C94E04A25619DCEE3D2261AD2EE6B
F12FFA06D98A0864D87602733EC86A64521F2B18177B200C
BBE117577A615D6C770988C0BAD946E208E24FA074E5AB31
43DB5BFCE0FD108E4B82D120A93AD2CAFFFFFFFFFFFFFFFF
Generator:
02
Prime (4096-bit):
FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1
29024E088A67CC74020BBEA63B139B22514A08798E3404DD
EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245
E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED
EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE45B3D
C2007CB8A163BF0598DA48361C55D39A69163FA8FD24CF5F
83655D23DCA3AD961C62F356208552BB9ED529077096966D
670C354E4ABC9804F1746C08CA18217C32905E462E36CE3B
E39E772C180E86039B2783A2EC07A28FB5C55DF06F4C52C9
DE2BCBF6955817183995497CEA956AE515D2261898FA0510
15728E5A8AAAC42DAD33170D04507A33A85521ABDF1CBA64
ECFB850458DBEF0A8AEA71575D060C7DB3970F85A6E1E4C7
ABF5AE8CDB0933D71E8C94E04A25619DCEE3D2261AD2EE6B
F12FFA06D98A0864D87602733EC86A64521F2B18177B200C
BBE117577A615D6C770988C0BAD946E208E24FA074E5AB31
43DB5BFCE0FD108E4B82D120A92108011A723C12A787E6D7
88719A10BDBA5B2699C327186AF4E23C1A946834B6150BDA
2583E9CA2AD44CE8DBBBC2DB04DE8EF92E8EFC141FBECAA6
287C59474E6BC05D99B2964FA090C3A2233BA186515BE7ED
1F612970CEE2D7AFB81BDD762170481CD0069127D5B05AA9
93B4EA988D8FDDC186FFB7DC90A6C08F4DF435C934063199
FFFFFFFFFFFFFFFF
Generator:
02
Prime (6144-bit):
FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1
29024E088A67CC74020BBEA63B139B22514A08798E3404DD
EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245
E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED
EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE45B3D
C2007CB8A163BF0598DA48361C55D39A69163FA8FD24CF5F
83655D23DCA3AD961C62F356208552BB9ED529077096966D
670C354E4ABC9804F1746C08CA18217C32905E462E36CE3B
E39E772C180E86039B2783A2EC07A28FB5C55DF06F4C52C9
DE2BCBF6955817183995497CEA956AE515D2261898FA0510
15728E5A8AAAC42DAD33170D04507A33A85521ABDF1CBA64
ECFB850458DBEF0A8AEA71575D060C7DB3970F85A6E1E4C7
ABF5AE8CDB0933D71E8C94E04A25619DCEE3D2261AD2EE6B
F12FFA06D98A0864D87602733EC86A64521F2B18177B200C
BBE117577A615D6C770988C0BAD946E208E24FA074E5AB31
43DB5BFCE0FD108E4B82D120A92108011A723C12A787E6D7
88719A10BDBA5B2699C327186AF4E23C1A946834B6150BDA
2583E9CA2AD44CE8DBBBC2DB04DE8EF92E8EFC141FBECAA6
287C59474E6BC05D99B2964FA090C3A2233BA186515BE7ED
1F612970CEE2D7AFB81BDD762170481CD0069127D5B05AA9
93B4EA988D8FDDC186FFB7DC90A6C08F4DF435C934028492
36C3FAB4D27C7026C1D4DCB2602646DEC9751E763DBA37BD
F8FF9406AD9E530EE5DB382F413001AEB06A53ED9027D831
179727B0865A8918DA3EDBEBCF9B14ED44CE6CBACED4BB1B
DB7F1447E6CC254B332051512BD7AF426FB8F401378CD2BF
5983CA01C64B92ECF032EA15D1721D03F482D7CE6E74FEF6
D55E702F46980C82B5A84031900B1C9E59E7C97FBEC7E8F3
23A97A7E36CC88BE0F1D45B7FF585AC54BD407B22B4154AA
CC8F6D7EBF48E1D814CC5ED20F8037E0A79715EEF29BE328
06A1D58BB7C5DA76F550AA3D8A1FBFF0EB19CCB1A313D55C
DA56C9EC2EF29632387FE8D76E3C0468043E8F663F4860EE
12BF2D5B0B7474D6E694F91E6DCC4024FFFFFFFFFFFFFFFF
Generator:
02
Prime (8192-bit):
FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1
29024E088A67CC74020BBEA63B139B22514A08798E3404DD
EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245
E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED
EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE45B3D
C2007CB8A163BF0598DA48361C55D39A69163FA8FD24CF5F
83655D23DCA3AD961C62F356208552BB9ED529077096966D
670C354E4ABC9804F1746C08CA18217C32905E462E36CE3B
E39E772C180E86039B2783A2EC07A28FB5C55DF06F4C52C9
DE2BCBF6955817183995497CEA956AE515D2261898FA0510
15728E5A8AAAC42DAD33170D04507A33A85521ABDF1CBA64
ECFB850458DBEF0A8AEA71575D060C7DB3970F85A6E1E4C7
ABF5AE8CDB0933D71E8C94E04A25619DCEE3D2261AD2EE6B
F12FFA06D98A0864D87602733EC86A64521F2B18177B200C
BBE117577A615D6C770988C0BAD946E208E24FA074E5AB31
43DB5BFCE0FD108E4B82D120A92108011A723C12A787E6D7
88719A10BDBA5B2699C327186AF4E23C1A946834B6150BDA
2583E9CA2AD44CE8DBBBC2DB04DE8EF92E8EFC141FBECAA6
287C59474E6BC05D99B2964FA090C3A2233BA186515BE7ED
1F612970CEE2D7AFB81BDD762170481CD0069127D5B05AA9
93B4EA988D8FDDC186FFB7DC90A6C08F4DF435C934028492
36C3FAB4D27C7026C1D4DCB2602646DEC9751E763DBA37BD
F8FF9406AD9E530EE5DB382F413001AEB06A53ED9027D831
179727B0865A8918DA3EDBEBCF9B14ED44CE6CBACED4BB1B
DB7F1447E6CC254B332051512BD7AF426FB8F401378CD2BF
5983CA01C64B92ECF032EA15D1721D03F482D7CE6E74FEF6
D55E702F46980C82B5A84031900B1C9E59E7C97FBEC7E8F3
23A97A7E36CC88BE0F1D45B7FF585AC54BD407B22B4154AA
CC8F6D7EBF48E1D814CC5ED20F8037E0A79715EEF29BE328
06A1D58BB7C5DA76F550AA3D8A1FBFF0EB19CCB1A313D55C
DA56C9EC2EF29632387FE8D76E3C0468043E8F663F4860EE
12BF2D5B0B7474D6E694F91E6DBE115974A3926F12FEE5E4
38777CB6A932DF8CD8BEC4D073B931BA3BC832B68D9DD300
741FA7BF8AFC47ED2576F6936BA424663AAB639C5AE4F568
3423B4742BF1C978238F16CBE39D652DE3FDB8BEFC848AD9
22222E04A4037C0713EB57A81A23F0C73473FC646CEA306B
4BCBC8862F8385DDFA9D4B7FA2C087E879683303ED5BDD3A
062B3CF5B3A278A66D2A13F83F44F82DDF310EE074AB6A36
4597E899A0255DC164F31CC50846851DF9AB48195DED7EA1
B1D510BD7EE74D73FAF36BC31ECFA268359046F4EB879F92
4009438B481C6CD7889A002ED5EE382BC9190DA6FC026E47
9558E4475677E9AA9E3050E2765694DFC81F56E880B96E71
60C980DD98EDD3DFFFFFFFFFFFFFFFFF
Generator:
02
RFC 5114 “Additional Diffie-Hellman Groups for Use with IETF Standards”, written in 2008, documents even more DH parameters.
Prime (1024-bit, 160-bit prime order subgroup):
B10B8F96A080E01DDE92DE5EAE5D54EC52C99FBCFB06A3C6
9A6A9DCA52D23B616073E28675A23D189838EF1E2EE652C0
13ECB4AEA906112324975C3CD49B83BFACCBDD7D90C4BD70
98488E9C219A73724EFFD6FAE5644738FAA31A4FF55BCCC0
A151AF5F0DC8B4BD45BF37DF365C1A65E68CFDA76D4DA708
DF1FB2BC2E4A4371
Generator:
A4D1CBD5C3FD34126765A442EFB99905F8104DD258AC507F
D6406CFF14266D31266FEA1E5C41564B777E690F5504F213
160217B4B01B886A5E91547F9E2749F4D7FBD7D3B9A92EE1
909D0D2263F80A76A6A24C087A091F531DBF0A0169B6A28A
D662A4D18E73AFA32D779D5918D08BC8858F4DCEF97C2A24
855E6EEB22B3B2E5
Prime (2048-bit, 224-bit prime order subgroup):
AD107E1E9123A9D0D660FAA79559C51FA20D64E5683B9FD1
B54B1597B61D0A75E6FA141DF95A56DBAF9A3C407BA1DF15
EB3D688A309C180E1DE6B85A1274A0A66D3F8152AD6AC212
9037C9EDEFDA4DF8D91E8FEF55B7394B7AD5B7D0B6C12207
C9F98D11ED34DBF6C6BA0B2C8BBC27BE6A00E0A0B9C49708
B3BF8A317091883681286130BC8985DB1602E714415D9330
278273C7DE31EFDC7310F7121FD5A07415987D9ADC0A486D
CDF93ACC44328387315D75E198C641A480CD86A1B9E587E8
BE60E69CC928B2B9C52172E413042E9B23F10B0E16E79763
C9B53DCF4BA80A29E3FB73C16B8E75B97EF363E2FFA31F71
CF9DE5384E71B81C0AC4DFFE0C10E64F
Generator:
AC4032EF4F2D9AE39DF30B5C8FFDAC506CDEBE7B89998CAF
74866A08CFE4FFE3A6824A4E10B9A6F0DD921F01A70C4AFA
AB739D7700C29F52C57DB17C620A8652BE5E9001A8D66AD7
C17669101999024AF4D027275AC1348BB8A762D0521BC98A
E247150422EA1ED409939D54DA7460CDB5F6C6B250717CBE
F180EB34118E98D119529A45D6F834566E3025E316A330EF
BB77A86F0C1AB15B051AE3D428C8F8ACB70A8137150B8EEB
10E183EDD19963DDD9E263E4770589EF6AA21E7F5F2FF381
B539CCE3409D13CD566AFBB48D6C019181E1BCFE94B30269
EDFE72FE9B6AA4BD7B5A0F1C71CFFF4C19C418E1F6EC0179
81BC087F2A7065B384B890D3191F2BFA
Prime (2048-bit, 256-bit prime order subgroup):
87A8E61DB4B6663CFFBBD19C651959998CEEF608660DD0F2
5D2CEED4435E3B00E00DF8F1D61957D4FAF7DF4561B2AA30
16C3D91134096FAA3BF4296D830E9A7C209E0C6497517ABD
5A8A9D306BCF67ED91F9E6725B4758C022E0B1EF4275BF7B
6C5BFC11D45F9088B941F54EB1E59BB8BC39A0BF12307F5C
4FDB70C581B23F76B63ACAE1CAA6B7902D52526735488A0E
F13C6D9A51BFA4AB3AD8347796524D8EF6A167B5A41825D9
67E144E5140564251CCACB83E6B486F6B3CA3F7971506026
C0B857F689962856DED4010ABD0BE621C3A3960A54E710C3
75F26375D7014103A4B54330C198AF126116D2276E11715F
693877FAD7EF09CADB094AE91E1A1597
Generator:
3FB32C9B73134D0B2E77506660EDBD484CA7B18F21EF2054
07F4793A1A0BA12510DBC15077BE463FFF4FED4AAC0BB555
BE3A6C1B0C6B47B1BC3773BF7E8C6F62901228F8C28CBB18
A55AE31341000A650196F931C77A57F2DDF463E5E9EC144B
777DE62AAAB8A8628AC376D282D6ED3864E67982428EBC83
1D14348F6F2F9193B5045AF2767164E1DFC967C1FB3F2E55
A4BD1BFFE83B9C80D052B985D182EA0ADB2A3B7313D3FE14
C8484B1E052588B9B7D2BBD2DF016199ECD06E1557CD0915
B3353BBB64E0EC377FD028370DF92B52C7891428CDC67EB6
184B523D1DB246C32F63078490F00EF8D647D148D4795451
5E2327CFEF98C582664B4C0F6CC41659
There’s a draft draft-ietf-tls-negotiated-ff-dhe-10 “Negotiated Finite Field Diffie-Hellman Ephemeral Parameters for TLS” that adds some more DH groups specifically for TLS. The current version is from June 2015.
Prime (2048-bit):
FFFFFFFFFFFFFFFFADF85458A2BB4A9AAFDC5620273D3CF1
D8B9C583CE2D3695A9E13641146433FBCC939DCE249B3EF9
7D2FE363630C75D8F681B202AEC4617AD3DF1ED5D5FD6561
2433F51F5F066ED0856365553DED1AF3B557135E7F57C935
984F0C70E0E68B77E2A689DAF3EFE8721DF158A136ADE735
30ACCA4F483A797ABC0AB182B324FB61D108A94BB2C8E3FB
B96ADAB760D7F4681D4F42A3DE394DF4AE56EDE76372BB19
0B07A7C8EE0A6D709E02FCE1CDF7E2ECC03404CD28342F61
9172FE9CE98583FF8E4F1232EEF28183C3FE3B1B4C6FAD73
3BB5FCBC2EC22005C58EF1837D1683B2C6F34A26C1B2EFFA
886B423861285C97FFFFFFFFFFFFFFFF
Generator:
02
Prime (3072-bit):
FFFFFFFFFFFFFFFFADF85458A2BB4A9AAFDC5620273D3CF1
D8B9C583CE2D3695A9E13641146433FBCC939DCE249B3EF9
7D2FE363630C75D8F681B202AEC4617AD3DF1ED5D5FD6561
2433F51F5F066ED0856365553DED1AF3B557135E7F57C935
984F0C70E0E68B77E2A689DAF3EFE8721DF158A136ADE735
30ACCA4F483A797ABC0AB182B324FB61D108A94BB2C8E3FB
B96ADAB760D7F4681D4F42A3DE394DF4AE56EDE76372BB19
0B07A7C8EE0A6D709E02FCE1CDF7E2ECC03404CD28342F61
9172FE9CE98583FF8E4F1232EEF28183C3FE3B1B4C6FAD73
3BB5FCBC2EC22005C58EF1837D1683B2C6F34A26C1B2EFFA
886B4238611FCFDCDE355B3B6519035BBC34F4DEF99C0238
61B46FC9D6E6C9077AD91D2691F7F7EE598CB0FAC186D91C
AEFE130985139270B4130C93BC437944F4FD4452E2D74DD3
64F2E21E71F54BFF5CAE82AB9C9DF69EE86D2BC522363A0D
ABC521979B0DEADA1DBF9A42D5C4484E0ABCD06BFA53DDEF
3C1B20EE3FD59D7C25E41D2B66C62E37FFFFFFFFFFFFFFFF
Generator:
02
Prime (4096-bit):
FFFFFFFFFFFFFFFFADF85458A2BB4A9AAFDC5620273D3CF1
D8B9C583CE2D3695A9E13641146433FBCC939DCE249B3EF9
7D2FE363630C75D8F681B202AEC4617AD3DF1ED5D5FD6561
2433F51F5F066ED0856365553DED1AF3B557135E7F57C935
984F0C70E0E68B77E2A689DAF3EFE8721DF158A136ADE735
30ACCA4F483A797ABC0AB182B324FB61D108A94BB2C8E3FB
B96ADAB760D7F4681D4F42A3DE394DF4AE56EDE76372BB19
0B07A7C8EE0A6D709E02FCE1CDF7E2ECC03404CD28342F61
9172FE9CE98583FF8E4F1232EEF28183C3FE3B1B4C6FAD73
3BB5FCBC2EC22005C58EF1837D1683B2C6F34A26C1B2EFFA
886B4238611FCFDCDE355B3B6519035BBC34F4DEF99C0238
61B46FC9D6E6C9077AD91D2691F7F7EE598CB0FAC186D91C
AEFE130985139270B4130C93BC437944F4FD4452E2D74DD3
64F2E21E71F54BFF5CAE82AB9C9DF69EE86D2BC522363A0D
ABC521979B0DEADA1DBF9A42D5C4484E0ABCD06BFA53DDEF
3C1B20EE3FD59D7C25E41D2B669E1EF16E6F52C3164DF4FB
7930E9E4E58857B6AC7D5F42D69F6D187763CF1D55034004
87F55BA57E31CC7A7135C886EFB4318AED6A1E012D9E6832
A907600A918130C46DC778F971AD0038092999A333CB8B7A
1A1DB93D7140003C2A4ECEA9F98D0ACC0A8291CDCEC97DCF
8EC9B55A7F88A46B4DB5A851F44182E1C68A007E5E655F6A
FFFFFFFFFFFFFFFF
Generator:
02
Prime (6144-bit):
FFFFFFFFFFFFFFFFADF85458A2BB4A9AAFDC5620273D3CF1
D8B9C583CE2D3695A9E13641146433FBCC939DCE249B3EF9
7D2FE363630C75D8F681B202AEC4617AD3DF1ED5D5FD6561
2433F51F5F066ED0856365553DED1AF3B557135E7F57C935
984F0C70E0E68B77E2A689DAF3EFE8721DF158A136ADE735
30ACCA4F483A797ABC0AB182B324FB61D108A94BB2C8E3FB
B96ADAB760D7F4681D4F42A3DE394DF4AE56EDE76372BB19
0B07A7C8EE0A6D709E02FCE1CDF7E2ECC03404CD28342F61
9172FE9CE98583FF8E4F1232EEF28183C3FE3B1B4C6FAD73
3BB5FCBC2EC22005C58EF1837D1683B2C6F34A26C1B2EFFA
886B4238611FCFDCDE355B3B6519035BBC34F4DEF99C0238
61B46FC9D6E6C9077AD91D2691F7F7EE598CB0FAC186D91C
AEFE130985139270B4130C93BC437944F4FD4452E2D74DD3
64F2E21E71F54BFF5CAE82AB9C9DF69EE86D2BC522363A0D
ABC521979B0DEADA1DBF9A42D5C4484E0ABCD06BFA53DDEF
3C1B20EE3FD59D7C25E41D2B669E1EF16E6F52C3164DF4FB
7930E9E4E58857B6AC7D5F42D69F6D187763CF1D55034004
87F55BA57E31CC7A7135C886EFB4318AED6A1E012D9E6832
A907600A918130C46DC778F971AD0038092999A333CB8B7A
1A1DB93D7140003C2A4ECEA9F98D0ACC0A8291CDCEC97DCF
8EC9B55A7F88A46B4DB5A851F44182E1C68A007E5E0DD902
0BFD64B645036C7A4E677D2C38532A3A23BA4442CAF53EA6
3BB454329B7624C8917BDD64B1C0FD4CB38E8C334C701C3A
CDAD0657FCCFEC719B1F5C3E4E46041F388147FB4CFDB477
A52471F7A9A96910B855322EDB6340D8A00EF092350511E3
0ABEC1FFF9E3A26E7FB29F8C183023C3587E38DA0077D9B4
763E4E4B94B2BBC194C6651E77CAF992EEAAC0232A281BF6
B3A739C1226116820AE8DB5847A67CBEF9C9091B462D538C
D72B03746AE77F5E62292C311562A846505DC82DB854338A
E49F5235C95B91178CCF2DD5CACEF403EC9D1810C6272B04
5B3B71F9DC6B80D63FDD4A8E9ADB1E6962A69526D43161C1
A41D570D7938DAD4A40E329CD0E40E65FFFFFFFFFFFFFFFF
Generator:
02
Prime (8192-bit):
FFFFFFFFFFFFFFFFADF85458A2BB4A9AAFDC5620273D3CF1
D8B9C583CE2D3695A9E13641146433FBCC939DCE249B3EF9
7D2FE363630C75D8F681B202AEC4617AD3DF1ED5D5FD6561
2433F51F5F066ED0856365553DED1AF3B557135E7F57C935
984F0C70E0E68B77E2A689DAF3EFE8721DF158A136ADE735
30ACCA4F483A797ABC0AB182B324FB61D108A94BB2C8E3FB
B96ADAB760D7F4681D4F42A3DE394DF4AE56EDE76372BB19
0B07A7C8EE0A6D709E02FCE1CDF7E2ECC03404CD28342F61
9172FE9CE98583FF8E4F1232EEF28183C3FE3B1B4C6FAD73
3BB5FCBC2EC22005C58EF1837D1683B2C6F34A26C1B2EFFA
886B4238611FCFDCDE355B3B6519035BBC34F4DEF99C0238
61B46FC9D6E6C9077AD91D2691F7F7EE598CB0FAC186D91C
AEFE130985139270B4130C93BC437944F4FD4452E2D74DD3
64F2E21E71F54BFF5CAE82AB9C9DF69EE86D2BC522363A0D
ABC521979B0DEADA1DBF9A42D5C4484E0ABCD06BFA53DDEF
3C1B20EE3FD59D7C25E41D2B669E1EF16E6F52C3164DF4FB
7930E9E4E58857B6AC7D5F42D69F6D187763CF1D55034004
87F55BA57E31CC7A7135C886EFB4318AED6A1E012D9E6832
A907600A918130C46DC778F971AD0038092999A333CB8B7A
1A1DB93D7140003C2A4ECEA9F98D0ACC0A8291CDCEC97DCF
8EC9B55A7F88A46B4DB5A851F44182E1C68A007E5E0DD902
0BFD64B645036C7A4E677D2C38532A3A23BA4442CAF53EA6
3BB454329B7624C8917BDD64B1C0FD4CB38E8C334C701C3A
CDAD0657FCCFEC719B1F5C3E4E46041F388147FB4CFDB477
A52471F7A9A96910B855322EDB6340D8A00EF092350511E3
0ABEC1FFF9E3A26E7FB29F8C183023C3587E38DA0077D9B4
763E4E4B94B2BBC194C6651E77CAF992EEAAC0232A281BF6
B3A739C1226116820AE8DB5847A67CBEF9C9091B462D538C
D72B03746AE77F5E62292C311562A846505DC82DB854338A
E49F5235C95B91178CCF2DD5CACEF403EC9D1810C6272B04
5B3B71F9DC6B80D63FDD4A8E9ADB1E6962A69526D43161C1
A41D570D7938DAD4A40E329CCFF46AAA36AD004CF600C838
1E425A31D951AE64FDB23FCEC9509D43687FEB69EDD1CC5E
0B8CC3BDF64B10EF86B63142A3AB8829555B2F747C932665
CB2C0F1CC01BD70229388839D2AF05E454504AC78B758282
2846C0BA35C35F5C59160CC046FD8251541FC68C9C86B022
BB7099876A460E7451A8A93109703FEE1C217E6C3826E52C
51AA691E0E423CFC99E9E31650C1217B624816CDAD9A95F9
D5B8019488D9C0A0A1FE3075A577E23183F81D4A3F2FA457
1EFC8CE0BA8A4FE8B6855DFE72B0A66EDED2FBABFBE58A30
FAFABE1C5D71A87E2F741EF8C1FE86FEA6BBFDE530677F0D
97D11D49F7A8443D0822E506A9F4614E011E2A94838FF88C
D68C8BB7C5C6424CFFFFFFFFFFFFFFFF
Generator:
02
As you can see, the layout of this blog has changed. I was updating this blog so rarely that every time I did, Ruby had broken everything (or so it felt). I don’t like the idea of having to learn Ruby or gem to blog, so I’ve decided to switch to something else.
It’s now powered by Hakyll, which seems to be a lot faster at rebuilding everything too. I’ve made sure the paths to most pages are unchanged. For the theme, I threw Bootstrap at it until I liked it enough (that’s all the web design skills I have).
For those of you using the RSS feed: I know the ids of the posts have changed, so sorry if that has caused a mess for you. I don’t see a clean way to change that.
The hardest part was the “Recent posts” column on the right here. On most pages it is easy to add, but on individual article pages it creates a cyclic dependency in Hakyll that I had to work around.
TLS is now also enabled on the bare domain (thijsalkema.de) again, now using letsencrypt.
Let’s start with a simple example in PHP:
setlocale(LC_ALL, "nl_NL.UTF-8");
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $_GET["url"]);
curl_exec($ch);
This code is broken, can you tell how?
But it’s not just PHP or libcurl, let’s try glibc.
#define _GNU_SOURCE /* AI_IDN is a GNU extension in <netdb.h> */
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <locale.h>
#define BUF_SIZE 500
int
main(int argc, char *argv[])
{
    struct addrinfo hints;
    struct addrinfo *result, *rp;
    int sfd, s, j;
    size_t len;
    ssize_t nread;
    char buf[BUF_SIZE];
    /* Select a UTF-8 locale, so the host name is not converted first */
    setlocale(LC_ALL, "nl_NL.UTF-8");
    if (argc < 3) {
        fprintf(stderr, "Usage: %s host port msg...\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    /* Obtain address(es) matching host/port */
    memset(&hints, 0, sizeof(struct addrinfo));
    hints.ai_family = AF_UNSPEC;    /* Allow IPv4 or IPv6 */
    hints.ai_socktype = SOCK_DGRAM; /* Datagram socket */
    hints.ai_flags = AI_IDN;        /* Convert the host name to IDN format */
    hints.ai_protocol = 0;          /* Any protocol */
    s = getaddrinfo(argv[1], argv[2], &hints, &result);
    if (s != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(s));
        exit(EXIT_FAILURE);
    }
}
This is a slight modification of the example from the man page for getaddrinfo and it is broken in the exact same way.
The common factor is that both use libidn (well, glibc contains an in-tree copy of libidn, but the essence of it is the same). libidn is a library with various Unicode-related functions. For example, it can convert internationalized domain names (IDNs) to punycode. This is what converts яндекс.рф to xn--d1acpjx3f.xn--p1ai, which contains only characters that can be used safely by the DNS.
The idna_to_ascii_8z documentation states:
Convert UTF-8 domain name to ASCII string. The domain name may contain several labels, separated by dots. The output buffer must be deallocated by the caller.
As it turns out, the effect of passing a string that is not valid UTF-8 to any of the libidn functions that expect a UTF-8 string can be disastrous. If the passed-in data ends with an unfinished UTF-8 sequence, then libidn will continue reading past the terminating null byte. There could be unrelated information just past that byte, which then gets copied into the result. This could leak private information from the server!
For example, the UTF-8 encoding of ф is, in hex:
d1 84
In fact, any valid UTF-8 sequence that starts with d1 should always consist of 2 bytes. But if we pass:
d1 00
instead, then libidn will interpret this as if it had been passed:
d1 80
and it continues reading whatever is after our input.
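The over-read can be modelled with a toy decoder. This is not libidn's code: `naive_utf8_decode` is a hypothetical decoder that, like the behaviour described above, trusts the length announced by the lead byte and never checks the continuation bytes, and `memory` stands in for the process memory around a C string.

```python
def naive_utf8_decode(memory: bytes, start: int = 0) -> str:
    """Decode a NUL-terminated string while trusting every lead byte.

    Toy model only: continuation bytes are never validated, so a NUL
    terminator hiding inside a multi-byte sequence is silently consumed.
    """
    code_points = []
    i = start
    while memory[i] != 0:
        lead = memory[i]
        if lead < 0x80:
            length, cp = 1, lead
        elif lead >> 5 == 0b110:
            length, cp = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:
            length, cp = 3, lead & 0x0F
        else:
            length, cp = 4, lead & 0x07
        for k in range(1, length):
            # Missing check: memory[i + k] should look like 10xxxxxx.
            cp = (cp << 6) | (memory[i + k] & 0x3F)
        code_points.append(cp)
        i += length
    return "".join(chr(cp) for cp in code_points)


# The attacker's input is the single byte d1, followed by its terminator.
# Whatever happens to sit behind it in memory is "hunter2" in this model.
memory = b"\xd1\x00" + b"hunter2\x00"
leaked = naive_utf8_decode(memory)
```

In this model the terminator is consumed as the second byte of the sequence, which yields the same code point as d1 80, and decoding then runs on into the neighbouring data until the next null byte.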
Some applications don’t use idna_to_ascii_8z, but idna_to_ascii_lz instead. The documentation for idna_to_ascii_lz states:
Convert domain name in the locale’s encoding to ASCII string. The domain name may contain several labels, separated by dots. The output buffer must be deallocated by the caller.
However, this conversion offers no protection if the locale is already a UTF-8 locale (which is why the examples needed the setlocale calls): if the source and target encodings are identical, no conversion happens, which means the invalid data is not caught.
The effect of the PHP code above when passed a domain name with invalid UTF-8 is that a DNS request is started for a domain which contains extra data.
It is possible that this data contains passwords or fragments of a key. However, it has to keep looking like UTF-8 to libidn, so the leak is unlikely to run on for as long as Heartbleed could (for example, multiple successive null bytes will stop the conversion). But it could easily allow an attacker to bypass ASLR.
The stringprep functions in libidn are affected by the same issue. These are used, for example, to normalize usernames and passwords. Here, it could allow an attacker to reuse parts of the password from a previous login.
Luckily, the AI_IDN flag of glibc is off by default, and I could not find many applications that ever set it.
The libidn developers show little motivation to fix this, pointing the blame to applications instead:
Applications should not pass unvalidated strings to stringprep(), it must be checked to be valid UTF-8 first. If stringprep() receives non-UTF8 inputs, I believe there are other similar serious things that can happen.
But the libcurl and glibc developers can pass on the blame to the layer above just as easily. The man page for getaddrinfo says:
AI_IDN - If this flag is specified, then the node name given in node is converted to IDN format if necessary. The source encoding is that of the current locale.
libcurl’s CURLOPT_URL says nothing about the required encoding.
This is a very messy situation, and so far nobody has shown any motivation to work on fixing it. So the best approach seems to be to fix end-applications so that they always check that strings are valid in the current locale’s encoding before passing them to libraries that require that. How many PHP developers are likely to do that? How many applications are out there that depend on getaddrinfo? Of course it’s unlikely that they will all be fixed, so I hope the glibc/libcurl/libidn developers figure something out.
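For an application that wants to protect itself, the check amounts to a strict decode before any IDN conversion. A minimal sketch in Python, using its built-in strict UTF-8 decoder and `idna` codec as stand-ins for the validation and for libidn; `to_ascii_hostname` is a hypothetical helper name:

```python
def to_ascii_hostname(raw: bytes) -> bytes:
    """Validate untrusted bytes as UTF-8 before converting them to IDNA form."""
    if b"\x00" in raw:
        raise ValueError("embedded null byte in host name")
    try:
        # A strict decode rejects truncated or malformed sequences such as b"\xd1".
        text = raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError("host name is not valid UTF-8") from exc
    return text.encode("idna")
```

With this in place, `to_ascii_hostname("яндекс.рф".encode("utf-8"))` gives the punycode form mentioned earlier, while the single byte d1 is rejected with a `ValueError` instead of being handed to the conversion routine.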
It’s a week until the XMPP Summit in Brussels, which I won’t be able to attend. However, I do have some thoughts about two of the subjects that are on the agenda that I wanted to share.
Carbons/MAM and e2e-encryption seem to be moving in two completely opposite directions: MAM wants to store messages server-side, while e2e protocols like OTR want to ensure those stored messages are useless. But I don’t think they’d have to be mutually exclusive. I think it would be possible to find a solution that unifies the two concepts.
Many chat services nowadays offer synchronizing the most recent messages between all devices. Given the choice between something that is more secure and something that does synchronize, I fear many people will choose the convenience of synchronization.
So far, everything that I’ve looked at fails on at least one of the following:
With Carbons/MAM you get 1, but if you turn on OTR, you only get 2 and 3. I could go into detail about a number of other services, but the bottom line is: nothing currently offers all 3. Yet I still think it is possible to make a protocol that does. TextSecure does seem to be working on it, but as far as I know, it’s not possible yet.
Let’s make it a little bit more clear what it is that I think we should aim to achieve:
I prefer an easy-to-use solution over a maximally secure one: I want users with no idea of what e2e encryption is or why they would want it to be using this opportunistically, without being bothered by it. There are enough ways to achieve good security for people who know what they’re doing; I want to give a (lower) level of security to everyone. For example, manually copying a private key to all your devices is not an acceptable solution. Of course this means the server can tamper with the encrypted chats, but I’d rather have a protocol where the server needs to actively tamper with communications than something where the server can quietly log everything.
The following is just an incomplete suggestion of how this could be implemented:
Why Axolotl, and not OTR?
Axolotl handles sending messages to offline devices better and the cryptographic primitives used are more modern.
Why not draft-miller-xmpp-e2e?
Of course this is still far from a concrete specification, but I am convinced it is possible to do. Instead of half the XMPP world working on history synchronization and the other half working on breaking that, let’s look for a unified solution.
In part 1 I looked at BEAST and concluded that it would not be possible for XMPP. In this part, I’ll look at compression based attacks, similar to CRIME and BREACH for HTTPS.
CRIME: TLS can optionally use compression, which means it compresses the application data before encrypting it. The danger of compression is that it can make the size of the payload change by a variable amount, which depends on how much similarity exists within the content. If an attacker can convince the client or server to compress some secret data (like a cookie) together with some data chosen by the attacker, then the attacker can observe how similar his data was to the secret data. By repeating this process, the attacker can guess the cookie character by character.
BREACH: BREACH uses the same principle, but applies it to HTTP compression. Because HTTP only compresses the content, not the headers, this attack cannot obtain cookies, but other secret data in the page (like email addresses) can be obtained.
Just like HTTPS, we have two ways of compression here: TLS compression and XEP-0138: Stream Compression. They also differ in what data they compress: TLS compression compresses the entire stream after TLS started, XEP-0138 doesn’t kick in until after the user has successfully authenticated (in c2s).
Both TLS compression and XEP-0138 compression can use zlib (which implements the DEFLATE algorithm), and this appears to be the most common compression method used.
The main method used for compression is removing duplicated blocks of data and replacing them with back-references. If the compression algorithm sees </message> in the data, it checks whether it has occurred before. If it has, it replaces it with a reference to that previous point. It’s not required to insert a reference when it is possible, and references are only allowed to data in the last 32 KiB. This means that the decompressor must always buffer the last 32 KiB (the compression window), but the compressor may choose to keep as much as it likes (to save on memory, for example).
The data generated by zlib is no longer byte-aligned. To make sure you have an integer number of bytes, it’s necessary to do a sync flush. This means zlib inserts the required padding to make it fit in a whole number of bytes, so you can write it to a socket, for example. A full flush is a sync flush, but the compressor also throws away the data buffered in its compression window. This means the compressor cannot create back-references to data before the full flush.
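The difference between the two flush modes can be seen with a small sketch using Python's zlib module. The stanza and the helper name second_chunk_size are made up for illustration; the point is only that after a sync flush the second copy of a stanza can be encoded as a back-reference, while after a full flush it has to be compressed from scratch.

```python
import zlib

# Hypothetical stanza, only used to have something repetitive to compress.
STANZA = b'<message to="juliet@example.com" type="chat"><body>Hello there</body></message>'

def second_chunk_size(flush_mode):
    """Compress STANZA twice on one stream; return the size of the second chunk."""
    comp = zlib.compressobj()
    comp.compress(STANZA)
    comp.flush(flush_mode)  # byte-align (and, for a full flush, forget) the first stanza
    second = comp.compress(STANZA) + comp.flush(zlib.Z_SYNC_FLUSH)
    return len(second)

sync_size = second_chunk_size(zlib.Z_SYNC_FLUSH)  # window kept: back-reference possible
full_size = second_chunk_size(zlib.Z_FULL_FLUSH)  # window dropped: no back-reference
print(sync_size, full_size)
```

The second number should be noticeably larger than the first, which is the compression-ratio cost of flushing the window after every stanza.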
With TLS compression, the user’s authentication is included in the compression, which makes it a great target. Assuming, of course, that the authentication mechanism used is PLAIN, because otherwise capturing the data exchanged during the login is worthless. Inserting data to be compressed together with the password is easy: for example, a client will reply to iqs with the same id as the original message had. If the id contained a guess for the user’s password, then the attacker can observe whether their guess was correct by checking the length after compression and encryption.
I managed to get this to work pretty easily: one client logs in using PLAIN authentication repeatedly, while another client sends it iqs and observes the length of the captured TLS encrypted packets. With some extra effort, I made it possible to do multiple guesses per session; around 8 seemed to work reliably. It’s possible to try to guess as long as the password is in the compression window, but you have to ensure the previous guesses don’t affect the compression of your next guess. It is a somewhat simplified scenario, as the modified client I exploited doesn’t do anything after logging in (not even retrieving its roster), so I get more guesses than I would get in a real-world scenario, but I do think you would at least get a couple.
Due to the base64 encoding, 8 guesses per login works out to about 5 logins per character of the password, on average. For a typical user with a strong 8 character password who logs in every day, this means the password can be obtained in 1.5 months. If the attacker starts randomly closing the user’s connection (and the user automatically signs in again), this could be done in less than an hour.
The key to implementing this was to split the data into 3 categories:
Fully compressible data. This will always be replaced by a single back-reference.
Incompressible data. Data that is very unlikely to have occurred before in the stream.
The part of the secret that is known, followed by a single character (the guess).
This looks like (whitespace only included for demonstration):
<iq to="target@example.com/resource" type="get" id=",">
<ping xmlns="urn:xmpp:ping" />
</iq>
<iq to="target@example.com/resource" type="get" id="}a}b}c}e}AHVzZXIAa}f}g}">
<ping xmlns="urn:xmpp:ping" />
</iq>
<iq to="target@example.com/resource" type="get" id="|a|b|c|e|AHVzZXIAb|f|g|">
<ping xmlns="urn:xmpp:ping" />
</iq>
Which the client will reply to with:
<iq to="attacker@nsa.gov/3v1l" type="result" id="," from="target@example.com/resource" />
<iq to="attacker@nsa.gov/3v1l" type="result" id="}a}b}c}e}AHVzZXIAa}f}g}" from="target@example.com/resource" />
<iq to="attacker@nsa.gov/3v1l" type="result" id="|a|b|c|e|AHVzZXIAb|f|g|" from="target@example.com/resource" />
The first ping isn’t a guess, but it ensures <iq to="attacker@nsa.gov/3v1l" type="result" id=" and " from="target@example.com/resource" /> occur in the compression window, and will therefore be fully compressible in the next stanzas.
}a}b}c}e}, }f}g}, |a|b|c|e| and |f|g| are used as incompressible data: } and | are not valid base64 characters, so they won’t match with the password, and they don’t look like anything included in normal XMPP stanzas or anything a user would likely write. They ensure different guesses can’t influence each other, because they separate the compressible data and the guess.
AHVzZXIA is what’s known so far, and a and b are the next guesses.
This method even works when used with block ciphers, where the exact length is unknown: by using the right amount of incompressible data, it’s possible to make the stanza compress to n blocks when the guess was wrong, but n-1 when the guess was correct.
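The length oracle itself can be imitated locally with Python's zlib module, without TLS or a real client. This is only a sketch under simplifying assumptions: the auth stanza is shortened, the credentials (user/password) are invented, every guess is tried on a copy of the compressor so guesses cannot influence each other, and guess_lengths is a hypothetical helper rather than the code used for the experiment above.

```python
import base64
import string
import zlib

BASE64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def guess_lengths(secret_stanza, known_prefix, candidates=BASE64_ALPHABET):
    """Return {candidate: compressed size of a stanza carrying known_prefix + candidate}."""
    victim = zlib.compressobj()
    # The victim's stream already contains the secret (the PLAIN auth stanza).
    victim.compress(secret_stanza)
    victim.flush(zlib.Z_SYNC_FLUSH)
    lengths = {}
    for cand in candidates:
        trial = victim.copy()  # every guess starts from the same compression window
        # '}' and '|' are not base64 characters, so they isolate the guess.
        stanza = b'<iq id="}q}w}' + known_prefix + cand.encode() + b'|x|y|"/>'
        lengths[cand] = len(trial.compress(stanza) + trial.flush(zlib.Z_SYNC_FLUSH))
    return lengths

# Invented credentials; encodes to b"AHVzZXIAcGFzc3dvcmQ=".
credentials = base64.b64encode(b"\x00user\x00password")
auth = b'<auth mechanism="PLAIN">' + credentials + b"</auth>"

lengths = guess_lengths(auth, b"AHVzZXIA")
shortest = min(lengths.values())
print([c for c, n in lengths.items() if n == shortest])
```

The candidate that extends the back-reference by one character (here the next base64 character of the password) should be among the shortest outputs. A real attacker only sees the encrypted length, which is why the padding trick for block ciphers described above matters.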
For XEP-0138, the user’s password is safe. However, there is still other private data an attacker could try to guess. For example:
If you want to use compression: don’t.
If you absolutely have to use compression, disable TLS compression, use XEP-0138 and do a full flush after every stanza. This will be bad for your compression ratio, but it is the only way to be (somewhat) safe from these attacks. Keep in mind that you have to ensure both sides do this.
Today, Apple released a fix to CVE-2014-1361 in SecureTransport. The essence of this bug is this: the TLS record parser would interpret a DTLS record even when using normal TLS, causing a buffer overflow when parsing a record header. I reported this issue to Apple on May 28th.
To summarize, the impact of this bug is small: it can disclose 2 specific bytes of plain text to an attacker. Doing this will also cause the connection to be closed. It can also give an attacker the ability to carry out a replay attack, with a probability of success of 2^-16 (~0.0015%).
DTLS and TLS send their payloads in separate records of up to 2^14 bytes, where each record has a header. For TLS this header is 5 bytes: 1 byte payload type, 2 bytes TLS version number and 2 bytes indicating length of the rest of the record.
(Aside: Why every record includes two extra bytes to include the version is not exactly clear to me. I haven’t ever seen it legitimately change except during the handshake, where the client would initiate with a TLS 1.0 record, but include that it supports up to TLS 1.2, and then switch to TLS 1.2 after the server replies using that version.)
DTLS records are similar, but these are 13 bytes instead: in between the version number and the length it includes a sequence counter. Contrary to TLS, DTLS was designed to use datagrams (like UDP), so it doesn’t require reliable or in-order delivery. To still be able to decrypt records and know their intended order, the sequence counter is included on every record. TLS also uses a sequence counter (to prevent attackers from reordering messages), but it is implicit. Both parties simply count how many messages they have received or sent.
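To make the two header layouts concrete, here is a small Python sketch that parses both. The function name and the returned dictionary are illustrative only; the important design choice is that the layout is selected by what the connection is known to be (the is_dtls argument), never by a field read from the record itself.

```python
import struct

TLS_HEADER_LEN = 5    # type (1) + version (2) + length (2)
DTLS_HEADER_LEN = 13  # type (1) + version (2) + epoch/sequence (8) + length (2)

def parse_record_header(data, is_dtls):
    """Parse a record header; the layout depends on the connection, not on the record."""
    if is_dtls:
        content_type, version, seq, length = struct.unpack("!BHQH", data[:DTLS_HEADER_LEN])
        return {
            "type": content_type,
            "version": version,
            "epoch": seq >> 48,                 # two highest bytes of the 8-byte counter
            "sequence": seq & ((1 << 48) - 1),  # remaining six bytes
            "length": length,
        }
    content_type, version, length = struct.unpack("!BHH", data[:TLS_HEADER_LEN])
    return {"type": content_type, "version": version, "length": length}

# Example headers: application data (type 23), TLS 1.2 (0x0303) and DTLS 1.0 (0xfeff).
tls_header = bytes.fromhex("1703030020")
dtls_header = bytes.fromhex("17feff00010000000000050020")
print(parse_record_header(tls_header, is_dtls=False))
print(parse_record_header(dtls_header, is_dtls=True))
```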
This is how Apple’s code used to parse these records:
static int SSLRecordReadInternal(SSLRecordContextRef ref, SSLRecord *rec)
{   int err;
    size_t len, contentLen;
    uint8_t *charPtr;
    SSLBuffer readData, cipherFragment;
    size_t head=5;
    int skipit=0;
    struct SSLRecordInternalContext *ctx = ref;
    if(ctx->isDTLS)
        head+=8;
    if (!ctx->partialReadBuffer.data || ctx->partialReadBuffer.length < head)
    {   if (ctx->partialReadBuffer.data)
            if ((err = SSLFreeBuffer(&ctx->partialReadBuffer)) != 0)
            {
                return err;
            }
        if ((err = SSLAllocBuffer(&ctx->partialReadBuffer,
                                  DEFAULT_BUFFER_SIZE)) != 0)
        {
            return err;
        }
    }
    if (ctx->negProtocolVersion == SSL_Version_Undetermined) {
        if (ctx->amountRead < 1)
        {   readData.length = 1 - ctx->amountRead;
            readData.data = ctx->partialReadBuffer.data + ctx->amountRead;
            len = readData.length;
            err = sslIoRead(readData, &len, ctx);
            if(err != 0)
            {   if (err == errSSLRecordWouldBlock) {
                    ctx->amountRead += len;
                    return err;
                }
                else {
                    /* abort */
                    err = errSSLRecordClosedAbort;
#if 0 // TODO: revisit this in the transport layer
                    if((ctx->protocolSide == kSSLClientSide) &&
                       (ctx->amountRead == 0) &&
                       (len == 0)) {
                        /*
                         * Detect "server refused to even try to negotiate"
                         * error, when the server drops the connection before
                         * sending a single byte.
                         */
                        switch(ctx->state) {
                            case SSL_HdskStateServerHello:
                                sslHdskStateDebug("Server dropped initial connection\n");
                                err = errSSLConnectionRefused;
                                break;
                            default:
                                break;
                        }
                    }
#endif
                    return err;
                }
            }
            ctx->amountRead += len;
        }
    }
    if (ctx->amountRead < head)
    {   readData.length = head - ctx->amountRead;
        readData.data = ctx->partialReadBuffer.data + ctx->amountRead;
        len = readData.length;
        err = sslIoRead(readData, &len, ctx);
        if(err != 0)
        {
            switch(err) {
                case errSSLRecordWouldBlock:
                    ctx->amountRead += len;
                    break;
#if SSL_ALLOW_UNNOTICED_DISCONNECT
                case errSSLClosedGraceful:
                    /* legal if we're on record boundary and we've gotten past
                     * the handshake */
                    if((ctx->amountRead == 0) && /* nothing pending */
                       (len == 0) && /* nothing new */
                       (ctx->state == SSL_HdskStateClientReady)) { /* handshake done */
                        /*
                         * This means that the server has disconnected without
                         * sending a closure alert notice. This is technically
                         * illegal per the SSL3 spec, but about half of the
                         * servers out there do it, so we report it as a separate
                         * error which most clients - including (currently)
                         * URLAccess - ignore by treating it the same as
                         * a errSSLClosedGraceful error. Paranoid
                         * clients can detect it and handle it however they
                         * want to.
                         */
                        SSLChangeHdskState(ctx, SSL_HdskStateNoNotifyClose);
                        err = errSSLClosedNoNotify;
                        break;
                    }
                    else {
                        /* illegal disconnect */
                        err = errSSLClosedAbort;
                        /* and drop thru to default: fatal alert */
                    }
#endif /* SSL_ALLOW_UNNOTICED_DISCONNECT */
                default:
                    break;
            }
            return err;
        }
        ctx->amountRead += len;
    }
    check(ctx->amountRead >= head);
    charPtr = ctx->partialReadBuffer.data;
    rec->contentType = *charPtr++;
    if (rec->contentType < SSL_RecordTypeV3_Smallest ||
        rec->contentType > SSL_RecordTypeV3_Largest)
        return errSSLRecordProtocol;
    rec->protocolVersion = (SSLProtocolVersion)SSLDecodeInt(charPtr, 2);
    charPtr += 2;
    if(rec->protocolVersion == DTLS_Version_1_0)
    {
        sslUint64 seqNum;
        SSLDecodeUInt64(charPtr, 8, &seqNum);
        charPtr += 8;
        sslLogRecordIo("Read DTLS Record %016llx (seq is: %016llx)",
                       seqNum, ctx->readCipher.sequenceNum);
        /* if the epoch of the record is different of current read cipher, just drop it */
        if((seqNum>>48)!=(ctx->readCipher.sequenceNum>>48)) {
            skipit=1;
        } else {
            ctx->readCipher.sequenceNum=seqNum;
        }
    }
    contentLen = SSLDecodeInt(charPtr, 2);
    charPtr += 2;
    if (contentLen > (16384 + 2048)) /* Maximum legal length of an
                                      * SSLCipherText payload */
    {
        return errSSLRecordRecordOverflow;
    }
    if (ctx->partialReadBuffer.length < head + contentLen)
    {   if ((err = SSLReallocBuffer(&ctx->partialReadBuffer, head + contentLen)) != 0)
        {
            return err;
        }
    }
    if (ctx->amountRead < head + contentLen)
    {   readData.length = head + contentLen - ctx->amountRead;
        readData.data = ctx->partialReadBuffer.data + ctx->amountRead;
        len = readData.length;
        err = sslIoRead(readData, &len, ctx);
        if(err != 0)
        {   if (err == errSSLRecordWouldBlock)
                ctx->amountRead += len;
            return err;
        }
        ctx->amountRead += len;
    }
    check(ctx->amountRead >= head + contentLen);
    cipherFragment.data = ctx->partialReadBuffer.data + head;
    cipherFragment.length = contentLen;
    ctx->amountRead = 0; /* We've used all the data in the cache */
    /* We dont decrypt if we were told to skip this record */
    if(skipit) {
        return errSSLRecordUnexpectedRecord;
    }
    /*
     * Decrypt the payload & check the MAC, modifying the length of the
     * buffer to indicate the amount of plaintext data after adjusting
     * for the block size and removing the MAC */
    check(ctx->sslTslCalls != NULL);
    if ((err = ctx->sslTslCalls->decryptRecord(rec->contentType,
                                               &cipherFragment, ctx)) != 0)
        return err;
    /*
     * We appear to have sucessfully received a record; increment the
     * sequence number
     */
    IncrementUInt64(&ctx->readCipher.sequenceNum);
    /* Allocate a buffer to return the plaintext in and return it */
    if ((err = SSLAllocBuffer(&rec->contents, cipherFragment.length)) != 0)
    {
        return err;
    }
    memcpy(rec->contents.data, cipherFragment.data, cipherFragment.length);
    return 0;
}
head determines how many bytes the header should contain. charPtr points to the current position in the record. rec is a structure describing the record we’re parsing. ctx is the session context.
Line 195 of the original file (the if(ctx->isDTLS) test that enlarges head) correctly uses ctx->isDTLS, but line 309 (the test guarding the DTLS sequence-number parsing) uses rec->protocolVersion, which was parsed on line 306. This is data that just came from the network and has not been validated in any way. There are no checks to make sure rec->protocolVersion == DTLS_Version_1_0 is only true when ctx->isDTLS is set.
This means that an attacker can change the version number on a single record from a TLS version to DTLS 1.0 to make a user execute the if block on line 309, even though they are using a TLS connection. That might make it possible to modify the sequence counter.
The sequence counter in TLS is used to make it impossible for an attacker to remove messages, reorder messages or replay previous messages. The sequence counter is included in the MAC, which means the message will not validate when it isn’t in its original place in the sequence. Due to the bug in the code above, the attacker may be able to modify this sequence counter. What an attacker can do with that is hard to determine: it depends a lot on the exact fragmentation of the payload into records.
In HTTPS, for example, an attacker may try to make some JavaScript execute differently, but if the entire script fits in one record then there’s not much an attacker could do. The most efficient way to send webpages or scripts would be to make as few records as possible, as padding and MAC add overhead per record. This means fragmenting the data every 2^14 bytes = 16 KiB (except for a bit of room for the MAC). By comparison, the current version of jquery is 82 KiB. That would fit in 6 records, giving any attacker very few options to shuffle those fragments around; many of these will probably not even parse as valid JavaScript.
In more real-time protocols like IRC or XMPP (yes, of course I have to bring up XMPP again), the fragmentation is a lot easier to understand: these will include a few complete protocol packets within each record (often just 1). Having a malicious impact here will be a lot easier: an attacker would be able to drop a single chat message, retransmit one, reorder them, etc.
Trying to exploit this, I quickly ran into the following problem: only 5 bytes of the record had been copied from the socket, so the SSLDecodeUInt64 call will read 2 bytes from the record, but 6 bytes past that too. This does make it possible to make sure the epoch matches (the two highest bytes of the sequence number), but the next 6 bytes are “random” data.
Looking a little closer, the next 6 bytes didn’t turn out to be random at all. The buffer that records are read into gets reused (except when a record has too much payload to fit in the current buffer, in which case a new one is allocated), and decryption of the record happens in place in this buffer. So when I tried to exploit this using an HTTPS server which had previously sent a reply starting with HTTP/1.1 200 OK, Safari ended up interpreting HTTP/1 as the sequence number. The length field of the record should follow the sequence number, so it interpreted .1 as its length.
I tried a lot of variations, setting up some plaintext in the buffer first and then trying to reinterpret that as the sequence counter, until finally I realized what I was trying to do wasn’t possible with TLS 1.0: all the ciphers I was trying used more inter-record state than just the sequence counter. CBC mode means the decryption of every record depends on the ciphertext of the previous record, so reordering would never work. RC4 keystreams are also inherently stateful. As TLS uses MAC-then-Encrypt (MtE), these records will decrypt to gibberish and then fail the MAC. If TLS had used Encrypt-then-MAC (EtM) here (which a lot of cryptographers nowadays consider the better choice), the MAC would have succeeded, after which the record would have decrypted to gibberish. That gibberish would’ve been passed to the application, as the TLS layer would not have been able to detect anything wrong with it.
TLS 1.1 and TLS 1.2 don’t have that problem: these add an explicit IV to every record to prevent attacks like BEAST. For compatibility with TLS 1.0, this is usually implemented by prepending a block of random data to the plaintext and including that in the encryption. The IV that is used to encrypt this new first block doesn’t matter: it only influences the plaintext of the first block, which is deleted by the receiver after decryption. It doesn’t even need to be the case that the receiver decrypts the first block to the same thing as the sender used. So here every record can be decrypted independently, even when inserted at a random other position in the sequence. In practice, the IV that is used as the IV for the first block is often still the ciphertext of the last block of the previous record, as that makes it easier to be compatible with TLS 1.0 while not being vulnerable to BEAST.
However, this also meant that the sequence number was no longer the ASCII encoding of HTTP/1 (or the first 6 bytes of whatever record was last), but it is now the decryption of the IV block. As this block gets chosen randomly and the server and client don’t even need to decrypt it to the same thing, trying to influence this block to contain just the sequence number I want turned out to be impossible.
My next thought would be to send a record with a wrong epoch first, which would be used to fill the buffer with the data I need and then send another record with a DTLS header that would be used to overwrite the sequence counter. In DTLS, the epoch is indicated by the two upper bytes of the sequence counter. Records with an epoch different from the epoch of the current sequence counter are skipped (decryption or authentication isn’t attempted).
However, this just moved the problem backwards: the length of this new record is still taken from the data still in the buffer, so the decrypted IV of the previous record. Even though this new record will not be decrypted, SecureTransport must read it completely first, and I don’t know what length it expects. Guessing would have a 1 in 2^16 chance of succeeding, which is large cryptographically speaking, but not quite practical. It might be possible to increase this chance by repeating the inserted record over and over, but then the attacker can only insert one record, as the next copy will fail to decrypt.
I believe AES-GCM would be vulnerable to this, as it uses the sequence number as an implicit IV, though I haven’t checked. While SecureTransport has an (at least partial) implementation of AES-GCM, it wasn’t advertised by Safari, so I’m assuming it’s unfinished.
Another avenue of exploitation would be to try to retrieve some information about the plaintext still in the buffer. As I mentioned in my HTTPS example, .1 from HTTP/1.1 200 OK was interpreted as the length of the next record. The ASCII representation of .1 interpreted as a number gives 11825. This means SecureTransport will try to read 11825 more bytes before starting to decrypt it (which will then fail the MAC, causing it to send an alert and close the connection). We can also do this the other way around: we write bytes one by one until SecureTransport closes the connection and from that we will know the 7th and 8th byte of the plaintext of the previous record!
However, the value of the two bytes has to be less than the maximum record size of 2^14 (while it can be up to 2^16), as otherwise SecureTransport will reject it for being too large. This means that the first character must have an ASCII representation of less than @, which means it can’t be any of the upper- or lowercase letters, but numbers and a few other punctuation characters would work.
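The arithmetic behind this leak is small enough to sketch in Python. The helper names are made up for illustration, and the size limit is the 16384 + 2048 check from the quoted SecureTransport code rather than an independently verified constant.

```python
# Limit taken from the contentLen check in the quoted code above.
MAX_CONTENT_LEN = 16384 + 2048

def misparsed_length(previous_plaintext):
    """Length seen when bytes 7 and 8 of the stale plaintext are read as the length field."""
    return int.from_bytes(previous_plaintext[6:8], "big")

def leaked_bytes(observed_length):
    """Turn the number of bytes accepted before the connection closed back into plaintext."""
    return observed_length.to_bytes(2, "big")

def would_be_accepted(two_bytes):
    """True if the two stale bytes form a length that passes the size check."""
    return int.from_bytes(two_bytes, "big") <= MAX_CONTENT_LEN

length = misparsed_length(b"HTTP/1.1 200 OK")
print(length, leaked_bytes(length))
print(would_be_accepted(b"11"), would_be_accepted(b"a1"))
```

Two digits pass the size check while anything starting with a lowercase letter does not, which matches the restriction described above.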
After Heartbleed, this is another bug that exploits a DTLS code path that should never be used when using TLS. The impact is similar, too: disclosing some contents of the other side’s memory. However, this is limited to 2 bytes, while Heartbleed could retrieve 64 KiB per heartbeat. I guess DTLS has its uses, but maybe implementors should consider whether covering both DTLS and TLS in one library is worth the extra complexity of security-critical code.
A discovery that surprised me is the way SecureTransport deals with its internal buffers. The buffers that records are read into, and in which the result of their decryption is stored, are never erased; there’s only malloc and free. Buffers grow when they need to receive a larger record, but they never decrease in size again for as long as the connection is open. This means long-lived TLS connections waste a lot of memory when they receive a single large record. The plaintext of that record will stay in memory for as long as the connection is open.
The Telegram contest has ended without anyone having claimed the prize. The contest has received a lot of criticism from cryptographers due to not creating a realistic scenario: the contest only gave read access to the communication, with no way to influence the packets sent.
Now that the contest is over and the keys are published, let’s set a more realistic challenge. It turns out we did not just need to break the normal Telegram encryption, but also their optional “secret chat” encryption. Suppose we had been given keys DC 1 up to 5, but not the secret chat key. This would be if the server turned out to be evil, or if the server got hacked and the keys got stolen or if the keys were seized by legal means.
The secret chat key is established using a Diffie-Hellman key exchange with the person you want to chat with, but the server gets to pick the generator and the prime used. When the server informs the client of these, it also includes some “randomness”.
For secret chats, the client might request some entropy (random bytes) from the server while invoking messages.getDhConfig and feed these random bytes into its PRNG (for example, by PRNG_seed if OpenSSL library is used), but never using these “random” bytes by themselves or replacing by them the local PRNG seed.
I think the stupidity of this is a great example of how the Telegram developers do not take the “evil server” threat model seriously.
In the contest, the client used wasn’t one of the official clients, but the command-line client tg, which did not follow this advice. At the time the contest started (and thus the secret chat key was generated), it used the following procedure to generate its part of the Diffie-Hellman key:
https://github.com/vysheng/tg/blob/923845d668b077f1fde41bb31fffde89f5d2033a/queries.c#L1751:
int i;
for (i = 0; i < 64; i++) {
  *(((int *)random) + i) ^= mrand48 ();
}
So it takes the random bytes supplied by the server and xors them with 64 calls to mrand48(). But rand48 is a linear congruential random-number generator. It is absolutely not cryptographically secure. Observing one full value (or a couple of partial values) of the output is enough to predict all future values and calculate all previous ones. It also has only 48 bits of internal state, which is not quite enough to be secure these days.
Brute-forcing the 48-bit internal state would likely be possible for less than $200k, but we can do even better. tg also used rand48 for a lot of other things and seeded it only once during startup. So if we can find some random bits in any packet that were generated using rand48, then we can efficiently brute-force the internal state. From that, we can work backwards or forwards to the bits used for the key generation. See here how it generates ping messages to send to the server:
https://github.com/vysheng/tg/blob/master/net.c#L77:
int x[3];
x[0] = CODE_ping;
*(long long *)(x + 1) = lrand48 () * (1ll << 32) + lrand48 ();
encrypt_send_message (c, x, 3, 0);
lrand48 reveals 31 bits of the internal state, so we only need to brute-force 17 bits more. We can check with the other half of the ping message whether we have found the right value.
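The state recovery described here can be sketched in a few lines of Python. The multiplier and increment are the standard rand48 constants; the helper names and the example state are invented for illustration, and this is not the script used for the contest data.

```python
A = 0x5DEECE66D                # rand48 multiplier
C = 0xB                        # rand48 increment
MASK = (1 << 48) - 1           # the generator works modulo 2^48
A_INV = pow(A, -1, 1 << 48)    # A is odd, so it can be inverted modulo 2^48

def step(state):
    """Advance the generator by one call."""
    return (A * state + C) & MASK

def step_back(state):
    """Undo one call, which is what makes earlier outputs recoverable."""
    return ((state - C) * A_INV) & MASK

def lrand48_output(state):
    """lrand48 returns the top 31 bits of the 48-bit state."""
    return state >> 17

def recover_states(first, second):
    """Try all 2^17 hidden low bits; keep states whose next output matches `second`."""
    matches = []
    for low in range(1 << 17):
        state = (first << 17) | low
        if lrand48_output(step(state)) == second:
            matches.append(state)
    return matches

# Hypothetical generator state standing in for the client's real one.
secret = 0x1234ABCD5678
s1 = step(secret)
s2 = step(s1)
candidates = recover_states(lrand48_output(s1), lrand48_output(s2))
print([hex(s) for s in candidates], hex(s1))
```

From a recovered state, step and step_back walk forwards or backwards to the calls used for the key. Since C does not fix the order in which the two lrand48 calls in the ping expression are evaluated, an attacker may have to try both orderings of the two halves.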
I found a ping message only a couple of packets before the key exchange. Then it was just a matter of finding how many other calls to rand48 happened between the ping and the moment the key was generated (turns out to be 16). So with just 17 modular exponentiations I found the initiator’s private Diffie-Hellman key. One more modular exponentiation (and a xor) and the secret chat key is retrieved.
The very naive and unoptimized Python code I used to do all this can be seen here:
import hashlib
from Crypto.Cipher import AES
import sys
import string
def xor(a, b):
    return "".join(map(chr, [ord(x) ^ ord(y) for (x,y) in zip(a,b)]))
key1 = """5e a6 0d a2 9c 90 8e f4 36 d5 48 fe 76 a3 11 f6 66 13 4e 94 bb 11 32 d6 cf fd b0 2f 7b 77 bb 01 d7 42 a4 22 d3 04 e7 d2 fc 5b 32 48 d6 71 eb 18 51 19 99 76 49 46 1a 43 d8 cf cd 8a e2 fe 42 2c 36 d7 05 8b 0c 5e 00 8a 5a bc 35 4f ec 75 b4 10 e1 84 bb cb af ec e3 d6 fd 59 fd 01 83 ef 8b dd 13 50 24 5b 80 09 75 7e c3 c3 08 ba 59 f4 ec c0 87 71 ba 9f 45 8c 15 df 2a cd a5 bc 81 a9 20 fe 42 e2 65 78 02 77 80 11 0e e4 67 f3 40 cf 72 be fc c2 8d 0b ad d9 9e 6e 1a c3 03 71 39 be b9 dd df 7c 63 a6 27 45 ee 8e 00 5e 12 51 51 6c 6a 10 a6 73 3a 10 5d d8 f3 b6 c5 70 fe 91 c2 64 4b d0 74 2d 47 e7 4e 00 cf d5 d3 65 15 2b 48 9c 75 eb a8 96 aa ce 09 49 9b 5e ea 76 06 19 f3 b3 e7 7b af df 5d 68 5e 80 10 48 ec 00 35 90 d3 e5 96 c6 59 a7 44 d8 20 a8 a2 b6 93 64 4f 98 44 23 8e fd""".replace(" ", "").decode('hex')
key2 = """63 ab 0d b7 98 e1 78 ef 5f 05 9c e4 84 3b 53 b3 4f 6e d1 d3 8a 6d 59 19 32 26 73 60 c2 e2 fe ee d3 2d 74 35 18 08 ba 04 87 cf 7f d9 87 4b 64 d5 80 06 05 f5 01 56 6d c2 66 7e 2d ef f6 a3 82 3d 31 0e ed 6b 46 4c 11 d5 ec 0f 7b be 64 79 26 87 a9 d3 34 27 d8 8b aa b5 36 8b 95 2f a7 c7 2a a6 bf a9 44 51 c5 c8 06 04 78 d2 64 87 e8 13 f3 f0 9b c9 8c bc 29 01 55 a2 80 e1 e8 4e 74 53 7e 05 22 1b 51 3d 1a c5 61 b3 04 98 c2 2f 71 e3 76 2e 31 bd d8 55 15 4b 3e 34 ed 84 b2 56 d0 bd c6 9a 1a 2a 4b 2f fc 68 8e c4 e3 81 23 6f 07 3f 3a 6b 56 f6 ee 31 e6 aa 0d 49 36 6a 51 79 25 bf b6 40 64 8c e2 14 c8 70 37 cb 70 ad a1 83 ed 1f b9 78 b9 93 0c 7c 0c ed 6d c6 aa c2 d0 da 51 ce ae cf 99 8f 65 eb 5a 42 e6 ff 4a 51 a9 97 da 6e ac e5 63 c1 05 a9 fe d0 da da 43 e3 50 14 fc b1 46 ea""".replace(" ", "").decode('hex')
key3 = """16 0c be 58 e1 74 74 f5 f9 8f f8 82 71 ed 57 84 20 49 bd da 17 0e 00 a8 a4 24 71 79 86 1f ef 3e 41 70 31 de c9 c2 19 23 37 fd ec 2f fa 9e 89 29 4f a2 af 69 cf 24 3a 6e 44 5d 89 d2 8b 50 45 26 3c ff e3 d4 4d 7d b4 88 54 8c 87 09 c5 ac 09 5c e6 62 43 73 b5 3e 96 ea f3 62 76 58 1b fd 8a 36 45 65 4a fc 7b ee 7b 13 06 e5 2f 9d 8f cb b9 a7 6f 76 00 f4 9a ab 50 fb 91 e0 2b ce 28 db 95 02 a0 62 33 bb e7 41 13 7b 2c 7e ba 7c d5 87 12 33 de 44 8d 4b 76 af 59 cc 80 42 02 69 56 90 8a 5d 95 0b 3e 8b ef 65 17 fc 79 62 b4 69 1b 21 aa 89 5b 22 f6 33 67 80 d0 22 f7 76 f2 6c 4b af 69 07 0f 2c 3a af 67 6b 74 c0 7f 8c 83 85 85 8e 47 b7 55 42 c1 3d 70 33 9d 87 60 7c f6 8b 99 96 1d af 82 b8 d2 37 c7 a3 fc ac 25 fe 77 f0 29 4d 82 a4 15 89 cb c2 27 ae 4f 16 d6 b8 4c cb de 2a 59 d7""".replace(" ", "").decode('hex')
key4 = """b4 aa dc bc 8e e5 6a f4 9f 7b 65 de cd 1c 28 3d f1 58 f6 03 e1 34 9d 63 54 b0 15 a7 b8 a6 45 4e de dd cd e4 1a 54 d7 9b fe 46 05 c7 62 19 d9 7a c0 00 6b e6 72 83 3d 15 00 99 d9 9b 97 c0 4a a8 85 e7 85 3c 3f a4 2f 6a 57 0b 3c b0 2a 97 65 6f bf 4e 0d 93 f7 55 3b 3a 39 a1 1a 0f db 9d 7a df 5b c6 9b 45 9a ea e4 27 92 8c c3 d2 75 53 66 e4 1c 29 f1 14 fc fd e8 c0 c8 12 47 ee 5a 92 f1 bf 1f 6f 8e 95 a5 90 81 37 d6 5d bd 5c 4c 41 61 29 6e 4f 7e 83 e1 b9 ef 00 00 de 25 33 f4 df 1a 94 f0 e7 1c fd 35 c0 75 65 88 ef c5 aa b5 c9 7d 0e e4 6d b7 9f 10 ca 4b f0 c9 c7 2d 30 20 e9 e1 b8 03 de a2 60 4e 3f 59 dc 36 a2 50 f8 52 5e 32 c8 c1 84 87 84 d6 54 42 dd ab b4 1b d6 fe e4 29 d3 70 4e 3e 48 ba 86 80 39 b7 94 3c 31 18 f8 bd 7b d8 89 6b 32 77 5c 89 4a a1 ca 18 ba 1e 6a 87 6a""".replace(" ", "").decode('hex')
key5 = """4c 76 1f 87 08 53 54 cb 12 fd 01 bd bd e6 42 d2 6b 47 4b d8 0b 6a eb 9f 24 8b ee 77 1f 8b a5 3f f5 f1 c7 80 05 80 2c 20 29 7c 3c 14 59 2b 5e 7f 69 58 3b 7e 07 37 25 67 3d 18 ac f2 28 43 63 8f a5 41 c1 ba 53 dd eb 3d 36 0d 7b d3 14 f7 f9 83 aa 0c 81 20 89 e0 c7 d7 e9 ef 11 aa 43 ca 54 2a 9f 69 0f 1d 99 ef f6 55 14 71 6d a3 1e c2 75 fb 1c 88 f7 c0 21 64 5d 34 db 3f a4 e7 a9 f0 af 9f 9d 14 a4 3a 49 7c 50 e6 45 24 3a cb a4 a6 2a 35 dd 6c 9c ce 87 24 d1 ff 13 19 15 43 89 a4 8c 39 66 a2 22 df 4e 94 76 e1 89 b5 03 7a 2b 34 e7 39 09 f9 22 5d cb 36 4e ce 37 e7 cf 7d ab b5 8b db 81 c6 c8 f4 c7 7c 3a 22 59 fc e6 32 19 aa 46 d2 95 96 61 61 e6 cc 57 f0 0e 87 5c 7d 5b 87 e7 64 28 c6 03 38 3c 0b a6 5e 4a 21 a3 67 af e5 b3 88 cc 9d 03 98 33 ac c6 87 b4 b6 82 42 c4 41 33 39""".replace(" ", "").decode('hex')
keys = dict()
# build a dict for the key_ids
for key in [key1, key2, key3, key4, key5]:
    m = hashlib.sha1()
    m.update(key)
    key_id = m.hexdigest()[-16:]
    keys[key_id] = key
# only these 4 messages are needed
ping = """1387395163.10 73 OUT 116.51.22.2:443 12 25 5d 0b 4a 41 17 51 ee b0 54 b7 a7 62 9d 8b 50 d9 9e 03 ed 77 1f f1 ce 97 60 be 2d 82 2a 41 00 cf 20 c3 58 d7 c3 9a 81 7d 8b 0d db db 65 0b 65 a3 bb 63 d4 c5 be c5 54 85 7b 77 22 40 0e ef 98 3b ad 39 9c fc 8b 8e 7a"""
dh_params = """1387395172.80 604 IN 174.140.142.6:443 7f 96 00 00 8d f7 8b 84 76 9f e9 98 e0 99 b0 cc 6b c3 e5 42 31 c2 98 65 0c c1 b8 13 31 f9 da cd 80 49 87 88 91 98 2e 65 ce 16 98 b4 8c 1b d4 ef 49 13 cc e4 94 55 64 2b 93 65 14 d7 b9 cb f5 a5 8b cb c4 34 c7 77 2f c9 8c 74 5f b2 8b de cc c2 3d b6 9e a6 84 92 ac 55 e9 74 8d 29 77 5e 0e f5 b5 e5 8e b7 bc 08 e1 f5 02 ef a9 44 db 64 53 40 d8 10 62 07 86 f1 95 a7 2b 3f eb 68 ee 11 e0 7b 70 bc c1 ef 46 ae 9c 30 3d ed 4f 2a b9 d6 21 d0 6d c1 66 03 90 72 5c 2d 1c c8 9e 15 bd a3 c2 7b 19 1b f5 41 a3 54 8b d3 77 54 ec 74 5c 7b f3 3d 54 7f 8f cc 08 83 df c1 09 fd 72 9c 9c 94 74 2a 95 df 52 78 33 86 9e 67 fa 78 42 f4 63 43 3d cb e3 bd df 1e d1 c8 c0 41 84 6f 47 c3 6d bf 22 eb c5 2c 83 de af b0 70 4b e8 9b a0 d8 dd 85 d7 87 31 a4 f2 f4 5c d7 65 15 c4 42 7d 61 80 4e 09 03 3d 1f 82 26 3b 9a ed c8 3d 28 4c 57 57 b5 9a 38 ef d2 11 23 7c 55 f1 04 4d f0 c1 c9 43 81 ae 78 62 4f c9 25 b1 a3 02 5a d0 58 c5 79 5e 93 e2 47 31 a3 d1 fb 7c 20 64 d9 1b 9b 44 d1 dc f4 2d 51 fe 36 1f 55 7f d9 f1 0b cc c4 9a 9c 65 b9 dd a2 5e 65 c6 37 32 55 6c e2 c6 1d 88 24 aa 27 05 0f 43 b7 62 50 65 3c 4a 60 a9 ae 00 59 dd c6 b9 23 97 c4 a3 34 18 fb 3b 62 8e ed 0d 93 e9 c3 ec 25 e6 8b d5 f9 5a ec 54 86 76 92 fa 37 2d 8e d6 1c d2 9f b9 1b 97 ed 90 b3 3e 20 d5 7a db bf 03 51 4a b5 ca 6f 2e 08 2b 52 94 9f 57 11 68 d6 74 89 a8 59 4e 1c fa 69 26 18 70 e7 e3 53 e4 dd 0a 01 74 7a 4a 5e ed a0 11 de 9f 95 02 cb f1 15 92 33 11 60 05 4a 8d 00 37 d9 cc 7c 79 80 57 dc f9 e6 15 18 24 a6 17 4c 30 b5 64 d8 bc cb 74 44 07 34 c4 f4 5b 71 a5 10 25 3e 46 64 38 ad 62 1f 72 b1 71 b4 58 e6 6b aa b2 57 0f a9 44 0a 47 cf fd 82 b6 e9 7c e1 9e cc 92 51 bd 73 5f c2 97 fc 99 00 3f 01 f8 76 e5 e3 26 f5 92 72 b6 15 3a b5 60 06 8e 7f 90 e1 73 7e db ff 63 80 bc e3 c3 8d bd ea 19 e5 a4 55 6f c2 c8 9d 3c bf 60 3f ea 8e 36 99 55 10 cf f9 d8 68 8a c6 d8 72 95 5a"""
create_secret_chat = """1387395172.81 345 OUT 174.140.142.6:443 56 8d f7 8b 84 76 9f e9 98 e7 a3 98 6b 82 27 c9 0e e4 65 2f 79 af 8d 7c c6 b4 c8 d4 43 d0 70 a1 66 ae 48 92 d7 cf 4b ef e0 bd be 1a 27 90 a7 92 3e 33 de 74 82 0f ab f3 4a 81 ad de df ff aa e1 0c a0 c2 3b 60 1a 7c 30 e5 2c 00 93 79 2f 6b 99 24 45 76 7d 59 82 90 0b e5 9d 1b 49 99 2a c1 d8 ba a1 36 e0 37 26 43 1d 44 b3 54 63 41 d5 6e 5f ec 49 c1 1d a6 35 07 4a 58 57 8f d4 63 e2 d2 44 54 30 fd 8f 48 b9 38 b1 be 74 0d 44 36 1e b3 45 eb a9 1e 91 24 07 f2 0d d6 97 68 de 00 32 30 2b ac c5 83 0d d0 f6 1b 3b 89 be f4 f1 8b b4 93 39 21 dd 2f 9c 57 4f e8 4a e1 3f 03 9f e3 e7 5e d6 2d fb 44 2c df 28 80 4e d5 33 7b 9f f0 14 19 b8 1c 35 6c 31 f4 12 24 64 19 5e 3b e7 a4 3c 6a b2 6b 46 93 ae 8b b5 0c 2c 36 4c ca 07 7b 82 e2 5a fe 6f d6 1d d4 4b d8 4f 9b bf 9a 69 af ed d9 c0 70 6c 31 01 65 c6 e9 46 70 12 ba 95 1e 7e f2 ae de be 02 12 0c ba d4 a7 1c 7b d2 94 c3 43 29 f5 88 19 81 78 2b 44 04 4c 10 6c a3 75 d9 3d 31 29 f9 38 55 5b 8e ad c3 39 68 fd b4 a0 2c 64 2e 2c 80 37 35 86 fd 30 b2 78 1b 49 4a ca c0 30 c3 31 f9 74 a2 d4 35 16 54 db 29"""
secret_chat_response = """1387395173.70 780 IN 174.140.142.6:443 7f c2 00 00 8d f7 8b 84 76 9f e9 98 7e cf b0 9c c0 14 31 a5 d8 68 cd 35 ac 0d c9 f7 69 61 7c 93 cc b8 4a 4f 18 44 bd 25 c3 c8 ff 43 12 02 b5 54 09 18 bd 00 5e b4 c8 54 92 65 a8 9c 6b a2 43 b5 3a 32 6b 38 ed 92 86 c3 14 1d ee d8 31 b7 8a c7 5f 19 33 88 07 96 e0 a9 de b6 77 8b 3f 4c d5 46 29 d1 55 05 ca ec 30 f9 4c 3a ff 68 fe 7f 53 b6 f8 14 43 84 3c d7 14 65 60 bb 6c 97 01 89 1d 49 a5 c4 89 16 ff e6 9e 0e 71 02 ff d9 43 8a 47 47 45 00 3b 1c 4a 94 1d f8 8b 08 6a 2e 96 3c e3 7b 68 e9 e9 32 a5 e1 6e ab c4 9a 16 a9 89 41 39 69 b9 b0 25 8a 4e 98 26 78 f8 cf ee 48 2a 91 c4 e8 4b e2 25 f0 2a 89 53 cf a9 68 08 49 81 c2 66 be d3 d0 61 54 9a 36 0d 30 b7 f9 17 4d 0d f8 b8 33 fe 5b b1 6c 07 a6 82 cd a0 95 e9 08 3d 06 c1 0d af 76 9c 1a 7b a7 d6 dc e7 09 81 e0 9a 3c 87 82 0c ae 54 08 c0 08 57 c7 36 bf a8 19 de 29 9e ab 93 01 5e 36 a5 0d 34 61 9a c2 fe 68 d1 5d 78 9f 3a 8e 9b 4f d6 42 a7 7d b6 2e b7 cf 89 df 8c 21 f4 82 11 7a b7 3b 52 41 1f e3 ec c6 6e f3 de 8d 91 fe 0e 44 5f c9 93 7b 4b bd c0 12 ef 12 44 49 2e 46 97 d0 90 01 37 cd 04 fb 65 2d 6b 1e 3f f5 0c 43 a1 06 fd d5 48 c4 e2 51 b0 a5 9e 9d fa f3 6b 37 3e d8 81 5c 56 02 b2 a1 a4 63 78 16 02 3c 85 5c 0f 45 ff 57 a7 ca d0 6d 13 5c 2e ec c3 31 b6 44 58 2c cb 01 b1 86 00 0e 5b e4 03 90 a9 d7 ca fc 0a 77 f5 22 fd 77 b1 e8 ed 39 c3 59 7e a1 58 61 b8 e5 1d 86 28 0d f3 3c 3c 40 1b 2a 7c 24 90 c4 85 0c df 68 fd 5c c4 3e 93 ed b6 0b 1c 85 d2 83 30 c1 02 fd b6 0b 6b 05 4d 8f 0c 47 84 54 23 3c 56 5b 85 02 19 24 ff 66 7a 7c 60 5a 36 9d 36 98 78 59 f3 75 8f 2f 94 02 4c cb fc a8 9c 0c 48 08 00 ba 28 8d 8b 2e 14 0e 9b b1 39 71 20 31 6c bb ba ec cc 04 f1 3c ae 1b 54 76 f7 4e 04 dc 8f eb c6 84 2b c5 e9 56 93 dc c6 cc 5b f9 2c da 61 9d 23 2c 64 96 21 d6 f5 63 a5 5a d6 8d 9a 50 c4 e6 24 4a 1c 6e 41 45 79 92 23 db 20 c1 66 e1 cc 8d 9c 4b 0e 48 93 e9 38 05 bf f9 e0 2e cc 86 89 8f 2d 75 c1 fe ae 6c 62 ac da 07 49 2b ae f1 a4 0e a7 6e e1 61 e8 9e 8d 58 66 1f 8e 08 1c 97 07 77 cc 07 ba 
11 b5 d3 31 ec 34 d3 80 41 86 87 10 a7 15 ec da 5c a0 71 f7 f2 70 84 02 ba 5e 1f 1b fb 93 cb 84 37 ca 10 07 90 b3 fe 54 3e c1 7f c2 46 7a 08 27 1f 6f 70 46 69 53 52 a3 1a 97 b8 c5 be 66 21 ee 94 77 17 91 69 23 42 74 d3 e8 bc 45 69 51 d4 bd 53 f1 9d 04 0c a6 75 a6 27 09 44 3c 17 9d b7 23 fc 5a f0 bf f1 02 e4 a7 50 01 76 6a 3f cd 35 b4 2f bb 43 14 b8 bb e7 d0 fb a4 df 8f b6 4b a8 1c 19 15 06 ce 66 05 2b"""
def decrypt(message):
    message = string.split(message, " ", 4)
    if message[2] == "IN":
        x = 8
    else:
        x = 0
    message = message[4].replace(" ", "").replace("\n", "").decode('hex')
    # Skip the length, we don't care
    if ord(message[0]) == 0xef:
        message = message[2:]
    elif ord(message[0]) == 0x7f:
        message = message[4:]
    else:
        message = message[1:]
    auth_key_id = message[0:8].encode('hex')
    if not auth_key_id in keys:
        print("Auth key not found: %s" % auth_key_id)
        return None
    key = keys[auth_key_id]
    msg_key = message[8:24]
    message = message[24:]
    sha1_a = hashlib.sha1(msg_key + key[x:32+x]).digest()
    sha1_b = hashlib.sha1(key[32+x:48+x] + msg_key + key[48+x:64+x]).digest()
    sha1_c = hashlib.sha1(key[64+x:96+x] + msg_key).digest()
    sha1_d = hashlib.sha1(msg_key + key[96+x:128+x]).digest()
    aes_key = sha1_a[0:8] + sha1_b[8:20] + sha1_c[4:16]
    aes_iv = sha1_a[8:20] + sha1_b[0:8] + sha1_c[16:20] + sha1_d[0:8]
    obj = AES.new(aes_key, AES.MODE_ECB)
    dec = xor(aes_iv[0:16], obj.decrypt(xor(aes_iv[16:32], message[0:16])))
    plain = dec
    for i in range(16, 16 * int((len(message) / 16)), 16):
        dec = xor(message[i - 16:i], obj.decrypt(xor(dec, message[i:i + 16])))
        plain = plain + dec
    return plain
ping_d = decrypt(ping)
# Find the rand48 outputs
r1 = ping_d[36:40]
r2 = ping_d[40:44]
# We have the upper 31 bits of the state, brute force the rest
start = int(r2[::-1].encode('hex'), 16) << 17
end = (int(r2[::-1].encode('hex'), 16) + 1) << 17
state = None
expected = int(r2[::-1].encode('hex') + r1[::-1].encode('hex'), 16)
for i in range(start, end):
    first = i
    # Constants from rand48
    second = (first * 0x5deece66d + 0xb) & 0xffffffffffff
    r = (first >> 17) * (1 << 32) + (second >> 17)
    if r == expected:
        state = i
        break
dh_params_d = decrypt(dh_params)
# Find p, assume g = 2 and the server-supplied randomness
p = int(dh_params_d[56:56+256].encode('hex'), 16)
g = 2
random = dh_params_d[320:320+256]
create_secret_chat_d = decrypt(create_secret_chat)
# Find the g_a value that was sent
g_a = int(create_secret_chat_d[60:60+256].encode('hex'), 16)
a = None
for i in range(0, 20):
    pr = ""
    j = state
    # Values from rand48
    state = (state * 0x5deece66d + 0xb) & 0xffffffffffff
    for i in range(0, 64):
        j = (j * 0x5deece66d + 0xb) & 0xffffffffffff
        next = j >> 16
        pr = pr + ("%02x%02x%02x%02x" % ((next >> 24) & 0xff, (next >> 16) & 0xff, (next >> 8) & 0xff, next & 0xff)[::-1])
    a = xor(random, pr.decode('hex')).encode('hex')
    new_g_a = pow(g, int(a, 16), p)
    if new_g_a == g_a:
        break
secret_chat_response_d = decrypt(secret_chat_response)
g_b = int(secret_chat_response_d[80:80+256].encode('hex'), 16)
nonce = secret_chat_response_d[340:340+256]
print(xor(nonce, ("%x" % pow(g_b, int(a, 16), p)).decode('hex')).encode('hex'))
Result:
2e209f9d99c9fc8a3ddcd56d212646c9d81a26f8ecf7f27ec928659552dc1c21bb9560b1d85f945f438d7e8e96fae68991a990398bdfef502206f85280e650206271c4b2f5f8884d83ae66ecfecdec922669725e85f9ea58b0d69f5b1eb76815765b12883d17f662498b1f9d7a1194678dd1eed65550d451c05ea59a1aeb8a8c7c441bdb965afdcd85475bf08a1eb8d674a4c7e7af193fa60affe53b9fc4fcea59aff67260166f40af589598060c2200d73dbe961956540674d26b388dc2a097626de41099b9cfd50f569dd3bb4986d515238603c3526782775e53e9bae86358ed55b0efec6965a0e51de4b66e5a3d5f9b9a2067f5d5c4d76014c6562d121b0a
This only needs the 5 DC keys, 4 of the messages (they are included verbatim, you can look them up in the dump if you’d like) and it manages to find the secret chat key in under a second.
The issue was fixed in tg quite soon after the contest started, but the vulnerable key already existed then and they kept using it.
The best part: here are some of the messages that were sent, decrypted using the key found:
It is telegram42kotobox@gmail.com OK? Unfortunately, random numbers are very difficult to generate. anyway, it's telegram42kotobox@gmail.com. and remember, random numbers are very difficult to generate. Hey, Nick. How are you? It is telegram42kotobox at gmail dot com. No spam please. We thus fall back on pseudorandom numbers. Hey, Nick. How are you? It is telegram42kotobox at gmail dot com. No spam please. We thus fall back on pseudorandom numbers. Today, our random number is 1392559044 Hey, Nick. How are you? It is telegram42kotobox at gmail dot com. No spam please. We thus fall back on pseudorandom numbers. Today, our random number is 1393588044
At least they admit they suck at this…
In the past couple of years, a number of attacks have been found on “TLS”, but often those attacks were only demonstrated against HTTPS. The majority of TLS-encrypted traffic is probably HTTPS, but it’s important to understand which of these attacks translate to other protocols. I’ll use XMPP as the example, but I’ll try to reduce each attack to the core features a protocol needs to support, to help others determine which other protocols are also vulnerable.
The core difference between HTTPS and XMPP is that connections are long-lived. While an attacker who has tricked you into running their JavaScript code can make multiple HTTPS requests per second, a reconnect on XMPP will be easier to notice and slower.
I shall mostly assume that the attacker knows your full JID and is therefore capable of routing stanzas to you. This may not always be realistic, but targeted attacks do happen.
First of all, there needs to be something that is going to be extracted. Something secret that was transmitted which the attacker wants to obtain. For HTTPS, this is often cookies: they are transmitted often, an attacker can cause them to be retransmitted whenever they want, and they (often) give full access to your account. Other attacks attempt to extract data from the body of a page, like email addresses.
On XMPP, possible targets include:
BEAST exploits a problem with CBC mode in TLS which leads to predictable IVs. CBC mode is a block cipher mode that xors a plain text block with the previous cipher text block before encrypting it, to ensure equal plain text blocks do not turn into the same cipher text blocks. However, TLS 1.0 also chains across packets: the first block of a new packet is xored with the last cipher text block of the previous packet, which has already been transmitted. This means the attacker can observe that block and pick the next block to be encrypted based on that information.
In other words, suppose P[i] are the plain text blocks the client sends and C[i] are the cipher text blocks. Then C[i] := AES(P[i] XOR C[i-1]). Suppose C[a] was the last block of a packet. Then, if the server could somehow convince the client to send P[a+1] = C[a] XOR Q, the next encrypted block will be C[a+1] = AES(P[a+1] XOR C[a]) = AES(C[a] XOR Q XOR C[a]). If you recall the properties of XOR, this is equal to AES(Q). So the attacker can obtain the encryption of any block Q!
This is what’s known as an encryption oracle. The attacker can pick any value Q and obtain its encryption, but it cannot decrypt anything. But by making clever guesses, the attacker might be able to obtain the decryption of specific blocks, if they guessed correctly. Suppose the attacker is interested in the contents of the b-th block. The attacker must then make a guess of what that block was, let’s call its guess B, and make the client send B XOR C[b-1] (to account for the CBC mode). If AES(B XOR C[b-1]) is equal to C[b], then the attacker knows their guess was correct and that P[b] = B. If the guess was wrong, the attacker will have to try a different value instead of B and continue until the encryption is equal.
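The guess check can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not the original BEAST code: `E` is a keyed stand-in for AES built from HMAC-SHA256 (chosen only so the example runs with the standard library), and `ChainedCbcSender`, `guess_block` and the candidate strings are hypothetical names made up for illustration.

```python
import hashlib
import hmac
import os

BLOCK = 16  # block size in bytes, as for AES


def xor(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def E(key, block):
    # Stand-in for AES (assumption): a keyed function of one block.
    # It is not invertible, which is fine here because the attacker
    # only ever needs the encryption direction.
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


class ChainedCbcSender:
    """Client that chains CBC across packets, like TLS 1.0."""

    def __init__(self, key):
        self.key = key
        self.last = os.urandom(BLOCK)  # C[a]: visible on the wire

    def send(self, plaintext):
        blocks = []
        for i in range(0, len(plaintext), BLOCK):
            self.last = E(self.key, xor(plaintext[i:i + BLOCK], self.last))
            blocks.append(self.last)
        return blocks


def guess_block(sender, prev_cipher, target_cipher, candidates):
    """Return the candidate B with E(B XOR C[b-1]) == C[b], if any."""
    for guess in candidates:
        # Cancel the chaining value the client is about to use.
        injected = xor(xor(guess, prev_cipher), sender.last)
        if sender.send(injected)[0] == target_cipher:
            return guess
    return None


sender = ChainedCbcSender(os.urandom(16))
secret = b"to='juliet@im.ex"  # P[b], unknown to the attacker
c_prev, c_target = sender.send(b"A" * BLOCK + secret)
candidates = [b"to='romeo@im.exa", b"to='juliet@im.ex"]
print(guess_block(sender, c_prev, c_target, candidates))
```

Note that the sketch assumes the attacker can make the client send an arbitrary 16-byte block at the start of a packet, which is exactly the requirement that turns out to be the hard part on XMPP.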
Now, guessing even a 10 character cookie or password correctly is going to take a long time, so BEAST uses another trick: it will insert content before the cookie to make sure the first character of the cookie falls in one block, but the rest of the cookie in the next. If the attacker does know the rest of the block, it only needs to guess one character, which can be done quite quickly. For the next connection, the attacker will insert one character less, so two characters of the cookie fall in the first block. It already knows the first one of the two, so it again needs to guess only one character. It can continue this until it has the entire cookie. This means the attacker is able to guess a long cookie in quite a low number of guesses.
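The one-character-at-a-time trick can be sketched the same way. Again this is only a sketch under assumptions: `E` is an HMAC-based stand-in for AES, and `Victim`, `request`, `inject` and `recover_secret` are hypothetical names for a client that lets the attacker choose what precedes the secret and what the first block of the next packet is.

```python
import hashlib
import hmac
import os

BLOCK = 16


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def E(key, block):
    # Stand-in for AES (assumption); only encryption is needed here.
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


class Victim:
    def __init__(self, secret):
        self.key = os.urandom(16)
        self.secret = secret
        self.last = os.urandom(BLOCK)  # last cipher text block on the wire

    def _encrypt(self, data):
        data += b"\0" * (-len(data) % BLOCK)  # toy padding, not TLS padding
        blocks = []
        for i in range(0, len(data), BLOCK):
            self.last = E(self.key, xor(data[i:i + BLOCK], self.last))
            blocks.append(self.last)
        return blocks

    def request(self, prefix):
        """Send attacker-chosen filler followed by the secret."""
        iv = self.last
        return [iv] + self._encrypt(prefix + self.secret)

    def inject(self, block):
        """Send one attacker-chosen block as the start of a new packet."""
        return self._encrypt(block)[0]


def recover_secret(victim, length, alphabet):
    known = b""
    for _ in range(length):
        # Pad so the first unknown character is the last byte of a block.
        pad = b"A" * (BLOCK - 1 - len(known) % BLOCK)
        blocks = victim.request(pad)
        idx = (len(pad) + len(known)) // BLOCK
        prev, target = blocks[idx], blocks[idx + 1]
        prefix = (pad + known)[-(BLOCK - 1):]  # the 15 known bytes of that block
        for c in alphabet:
            guess = prefix + bytes([c])
            if victim.inject(xor(xor(guess, prev), victim.last)) == target:
                known += bytes([c])
                break
    return known


victim = Victim(b"s3cret-cookie-value!")
print(recover_secret(victim, 20, range(32, 127)))
```

Each character costs at most one pass over the alphabet, so the total number of guesses grows with the length of the secret times the alphabet size, instead of the alphabet size raised to the length.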
Attacking the password on XMPP is much harder, as the attacker can not insert new content before the password. It is quite likely the password spans different blocks, but that still leaves a large number of possibilities. The contact list is likely also too hard to guess without any prior information. Guessing the JID might be possible if the attacker has a list of all users on the server, but if the attacker didn’t know the JID, it would be much harder for them to actually do this attack.
Guessing messages is also hard, but obtaining message meta-data might be possible.
Suppose the attacker already has a list of contacts who are on my contact list. I send a packet and the attacker is interested in finding out whether it was a message to one of my contacts, and if so, to whom. Here’s the plain text of the message, shown split at the block boundaries:
1: <message type='c
2: hat' id='purplea
3: 1074a2d' to='jul
4: iet@im.example.c
5: om'><active xmln
6: s='http://jabber
7: .org/protocol/ch
...
The interesting info is spread over blocks 3, 4 and 5. However, block 3 also contains part of purplea1074a2d. We can guess this (hint: it’s an incrementing counter), but let’s suppose we don’t know it, just its length. The attacker can use BEAST to test every person on my contact list against the 4th block, leaving out the first 3 characters and using at most 16 characters of every JID. It might fail (the attacker doesn’t know for sure it’s a message, it could be an <iq/> stanza or groupchat message), but if the message is for one of my contacts, the attacker will find that.
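To make the block arithmetic concrete, here is a small sketch that splits the stanza into 16-byte blocks and derives the candidate 4th block for each contact. The helper names and the second contact are made up for illustration, and the comparison is done on plain text only to show which bytes each guess must contain; in the real attack the comparison would go through the encryption oracle.

```python
BLOCK = 16


def split_blocks(data, size=BLOCK):
    """Split a string into consecutive blocks of `size` characters."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def candidate_fourth_block(jid):
    # Assumption: the stanza has exactly the layout shown above, so the
    # first 3 characters of the JID end up in block 3 and the next 16 in
    # block 4. JIDs shorter than 19 characters would also require a guess
    # for the bytes that follow the JID.
    return jid[3:3 + BLOCK]


stanza = ("<message type='chat' id='purplea1074a2d' "
          "to='juliet@im.example.com'><active xmlns='http://jabber.org/protocol/ch")
blocks = split_blocks(stanza)

contacts = ["romeo@im.example.net", "juliet@im.example.com"]  # hypothetical list
matches = [jid for jid in contacts if candidate_fourth_block(jid) == blocks[3]]

for number, block in enumerate(blocks, start=1):
    print("%d: %s" % (number, block))
print(matches)
```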
There is, however, one more requirement on BEAST that we need to satisfy: the attacker must be able to fully specify the first block in a new packet. It won’t work if it’s a later block. In general, the client will only start a new packet with a new stanza. So the packet will almost always start with <iq, <message or <presence, and anything the client sends must be valid UTF-8 and valid XML. Therefore, making the client send B XOR C[b-1] exactly is very hard.
One way the attacker might try to get the client to use certain contents in its first block is by sending the target an <iq/> with the payload in the ‘id’ attribute:
1: <iq id='�u��_$ճH
2: ' type='get' to=
...
The client will then mirror the ‘id’ and send a reply or error back:
1: <iq id='�u��_$ճH
2: ' type='error' t
...
But because of the prefix, the attack can only be successful when B XOR C[b-1] happens to start with <iq id='. That prefix is 8 bytes (64 bits), so the probability of that happening is 2^-64, which is negligible.
The main requirement for BEAST is that the attacker must be able to choose the block a client will use as the start of a new packet. Additionally, the secret needs to be either easy to guess or it needs to be movable by the attacker so it can be placed on a block boundary. I do not know of a way to do the former in XMPP, so it is unlikely XMPP is vulnerable.
Lately, there has been a lot of interest in forward-secrecy, mostly in the context of TLS/HTTPS. Some people seem to think it’s a magic bullet that will thwart all the NSA’s efforts. I am not against forward-secrecy. On the contrary, I think any encrypted communications protocol should use it. But I think it is important that people keep realistic expectations about what forward-secrecy protects them against. The worst security is security that you assume you have, but don’t actually have.
What some people seem to think is that forward-secrecy implies that to break the encryption of n sessions, n times as much computational power is necessary compared to breaking 1 session, because it involves n different session keys.
This is not automatically the case. Not only does forward-secrecy not imply it, common protocols which have forward-secrecy based on Diffie-Hellman key exchanges do not have this property.
To break a number of Diffie-Hellman negotiated keys all using the same Diffie-Hellman group, a number of different attacks are known. Many of these scale pretty well in the number of sessions.
Take for example a naive brute force search. We start generating g, g^2 mod p, g^3 mod p, … until we find the key we want. This requires modular multiplication of a huge number (let’s say at least 1024 bits), so the multiplication step is quite an intensive computation. Comparing numbers (even 1024 bit numbers) is by comparison much easier. So it doesn’t matter much if we need to compare g^x mod p to one, two or n different numbers (if we have a lot of keys, we have other options like sorted tables to optimize this further). If the NSA captured the ciphertext of you logging in 50 different times, then it’s 50 times as easy to obtain your password by decrypting at least one of the sessions compared to if they only captured one login.
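A toy version of this search shows why the number of captured keys barely matters. It is a sketch with assumed parameters: the prime 10007 and generator 5 are tiny values picked so the loop finishes instantly, and `brute_force_dlogs` is a hypothetical helper name.

```python
def brute_force_dlogs(g, p, targets):
    """Walk g, g^2, g^3, ... mod p once and match against all targets."""
    remaining = set(targets)
    found = {}
    value = 1
    for exponent in range(1, p):
        value = (value * g) % p      # the expensive step for real sizes
        if value in remaining:       # cheap lookup, whatever the number of keys
            found[value] = exponent
            remaining.discard(value)
            if not remaining:
                break
    return found


p, g = 10007, 5                      # toy group (assumption)
secrets = [1234, 777, 9001]          # exponents of three captured sessions
public_values = [pow(g, x, p) for x in secrets]

found = brute_force_dlogs(g, p, public_values)
print(found)
```

The loop performs one modular multiplication per step regardless of how many public values are in `remaining`, and the first hit on any of them already breaks one session.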
Many people think 1024-bit RSA keys can be cracked by the NSA. If they did this to a key used for a TLS handshake that didn’t use ECDHE/DHE, every captured TLS session using that key would be trivially decryptable.
The brute force attack on Diffie-Hellman I described under #1 does not break future sessions: it can only break the sessions that have already been captured. When new keys come in, the process has to start all over again.
But using the index-calculus algorithm instead, the attacker first needs to pre-compute lots of data before the actual key is necessary in the computation. The attack has 3 steps: find some powers of g which factorize into a small set of primes (the factor base). Then taking log_g of the equations g^x = p_1^s_1 … p_r^s_r creates linear equations with the logarithms of the primes as unknowns. When enough equations are found, they can be solved using linear algebra mod (p - 1). In the final stage, the key is multiplied with g until a number is found that factors into the factor base. From this, it is easy to compute the discrete log of the key.
The first two steps do not use the key at all, their result can be stored for later use to decrypt future keys. There is a trade-off here, though: the larger the factor base, the slower the first and second stages are, but the faster the third stage is. It’s unlikely that it is worth the effort to make the third stage as efficient as decrypting a session with an RSA private key is, but it’s not impossible.
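The reuse of the pre-computation can be illustrated with a toy sketch. It makes a big simplification, stated loudly: the logarithms of the factor base are found here by brute force instead of by collecting relations and solving them with linear algebra, because the group (p = 10007, g = 5, both assumed toy values) is tiny. Only the third stage follows the description above, and all function names are hypothetical.

```python
def factor_over_base(n, base):
    """Return the exponents of n over `base`, or None if n is not smooth."""
    exps = []
    for q in base:
        e = 0
        while n % q == 0:
            n //= q
            e += 1
        exps.append(e)
    return exps if n == 1 else None


def precompute_base_logs(g, p, base):
    # Stand-in for stages 1 and 2 (assumption): brute force replaces the
    # relation collection and linear algebra. Done once per group.
    logs = {}
    value = 1
    for x in range(1, p):
        value = value * g % p
        if value in base and value not in logs:
            logs[value] = x
    return logs


def dlog_with_table(h, g, p, base, logs):
    """Stage 3: multiply h by g until the result factors over the base."""
    value = h
    for k in range(p):
        exps = factor_over_base(value, base)
        if exps is not None:
            # h * g^k = prod(q^e)  =>  log(h) = sum(e * log(q)) - k
            total = sum(e * logs[q] for q, e in zip(base, exps))
            return (total - k) % (p - 1)
        value = value * g % p
    return None


p, g = 10007, 5
base = [2, 3, 5, 7, 11, 13]
logs = precompute_base_logs(g, p, base)    # reusable for every key in the group

for secret in (1234, 42, 9999):
    h = pow(g, secret, p)
    print(secret, dlog_with_table(h, g, p, base, logs))
```

The table `logs` is computed once and then used for three different keys. Enlarging `base` makes the table more expensive but lets the third stage find a smooth number sooner, which is the trade-off mentioned above.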
We can conclude from #1 and #2 that attacks on Diffie-Hellman groups exist that can reveal as much as an RSA private key.
It is generally assumed that Diffie-Hellman and RSA offer approximately equal security for the same bit length. So breaking 1024-bit RSA would be as hard as breaking 1024-bit Diffie-Hellman.
Therefore, if you’re using a 1024-bit Diffie-Hellman group (if you’re using Java and DHE, you’re using a 1024-bit group; if you’re not using the very latest version of Apache, you’re using a 1024-bit group), you effectively have the same security level as somebody using a 1024-bit RSA key, with one exception: the RSA private key can be stolen (by NSA hackers or court orders). But please keep in mind that this is the only benefit forward-secrecy gives you.
Every OTR conversation uses a DH key exchange with the same 1536 bit prime from RFC 3526. While I have no idea how much it would cost, there is a finite amount of work, somewhere in the order of breaking RSA-1536, the NSA needs to do after which they are capable of decrypting every OTR encrypted session with an hour of work. Is it realistic to hope the NSA never had enough interest/budget against OTR to carry out such an attack?
Remember this news article from September 2013 about the NSA’s capabilities:
“For the past decade, N.S.A. has led an aggressive, multipronged effort to break widely used Internet encryption technologies,” said a 2010 memo describing a briefing about N.S.A. accomplishments for employees of its British counterpart, Government Communications Headquarters, or GCHQ. “Cryptanalytic capabilities are now coming online. Vast amounts of encrypted Internet data which have up till now been discarded are now exploitable.”
I don’t think the amount of OTR traffic is “vast”, but other than that, this seems spot on.