andrewj
01-06-2009, 11:16 AM
As many of you would know, a paper was presented at the 25th Chaos Communication Congress (25C3) in Germany last week (27th to 30th December) providing details on a practical (and demonstrated) use of MD5 collisions to produce false SSL certificates. The details of this attack, and the presentation that was made, can be found here:
http://www.win.tue.nl/hashclash/rogue-ca/
http://www.win.tue.nl/hashclash/rogue-ca/downloads/md5-collisions-1.0.pdf
For those of you who just want the condensed version, summarised details are provided below. This text is copied from the following location:
http://www.withamlabs.com/component/content/article/226-fake-ca-cert-produced-using-md5-collision.html
This new paper is based on the previous research into MD5 collisions, and specifically utilises the research for chosen prefix collisions. A 'chosen prefix' collision is where the start of each pre-image - that is the data to be signed - can be arbitrarily chosen by the attacker, so that two sets of data to be signed can have different beginnings, but still produce the same MD5 hash output.
The production of this identical MD5 output is due to the insertion of a string of 'collision bits' within each of the two data sets, that essentially forces the intermediate MD5 calculations at this point to the same value. Therefore, after the processing of the 'collision bits' within the MD5 algorithm, as long as all other parts of the two data sets are the same, the MD5 hash output will be identical for both data sets.
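To make the 'same tail, same hash' point concrete, here is a minimal sketch of the chaining idea. It is a toy only: the 24-bit chaining value, the 8-byte block size, and the names toy_hash and find_collision_blocks are assumptions made for illustration, and the collision is found by brute force, which is only feasible because the toy state is tiny. It is not the researchers' method and it does not produce real MD5 collisions.

```python
import hashlib
from itertools import count

BLOCK = 8        # toy block size in bytes (real MD5 uses 64)
STATE_BYTES = 3  # toy 24-bit chaining value (real MD5 uses 128 bits)
IV = bytes(STATE_BYTES)


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: truncated MD5 of state || block."""
    return hashlib.md5(state + block).digest()[:STATE_BYTES]


def absorb(state: bytes, data: bytes) -> bytes:
    """Feed block-aligned data through the chaining loop."""
    assert len(data) % BLOCK == 0
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state


def toy_hash(data: bytes) -> str:
    """Pad with 0x80, zeros and the message length, then chain."""
    padded = data + b"\x80"
    padded += bytes(-len(padded) % BLOCK)
    padded += len(data).to_bytes(BLOCK, "big")
    return absorb(IV, padded).hex()


def find_collision_blocks(prefix_a: bytes, prefix_b: bytes):
    """Search for one block per prefix that drives both chains to the
    same intermediate state (the 'collision bits' of the toy)."""
    state_a = absorb(IV, prefix_a)
    state_b = absorb(IV, prefix_b)
    seen_a = {}  # chaining value after prefix_a + block -> block
    for n in count():
        block = n.to_bytes(BLOCK, "big")
        seen_a[compress(state_a, block)] = block
        out_b = compress(state_b, block)
        if out_b in seen_a:
            return seen_a[out_b], block


# Two different, attacker-chosen beginnings (each exactly one toy block).
prefix_a = b"site=ok;"
prefix_b = b"CA=TRUE;"
block_a, block_b = find_collision_blocks(prefix_a, prefix_b)

# Everything after the collision blocks is identical in both messages.
suffix = b"identical trailing data"
msg_a = prefix_a + block_a + suffix
msg_b = prefix_b + block_b + suffix

print("messages differ:", msg_a != msg_b)
print("hashes match:   ", toy_hash(msg_a) == toy_hash(msg_b))
```

Both checks should come out true: once the two chains reach the same intermediate value, any identical data that follows keeps them in step, which is the property the paragraph above describes.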
So, in the context of this research, the researchers chose two certificates with two different beginnings - one for an innocuous end use website, and one for a mid level Certificate Authority (CA) - and then calculated the 'collision bits' that would make these two certificates produce the same MD5 hash.
The final part of the puzzle is that the researchers found a CA that was issuing certificates with incrementing serial numbers - that is, each certificate issued would have a serial number +1 from the last issued certificate. This is important because the certificate serial number is contained at the start of the certificate, and therefore must be known for both certificates before the 'collision bits' can be calculated (ie the serial number is part of the certificate 'prefix').
So, putting it all together, the researchers requested a certificate to be signed from the CA, and noted the serial number 'S'. They then calculated the 'collision bits' for the innocuous certificate (created earlier) with a serial number 'S' + 1000, and the bogus CA certificate (also produced earlier). The calculations took around 2 days on a cluster of 200 PS3 consoles.
Once they had the 'collision bits', these were inserted into the two certificates, so that an MD5 calculation across either certificate would produce the same hash output, even though the two certificates had different contents. The researchers then moved on to obtain a signature across these colliding certificates.
The researchers advanced the CA's serial number counter to 'S' + 999 (ie one less than 'S' + 1000) by issuing repeated certificate signing requests. With the counter at the correct value, the innocuous certificate request was sent to the CA for signing, so that the issued certificate carried the serial number 'S' + 1000. The resulting signature was then transferred onto their bogus CA certificate; because both certificates have the same MD5 hash, the signature is valid for both. This created a fake CA certificate which could be used to validate the signature (created by the researchers) on any other (arbitrary and malicious) certificate.
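The serial number counting and the signature transfer can be sketched as follows. Everything here is hypothetical and for illustration only: SequentialCA, tbs_bytes, the certificate layout, the textbook RSA key (p=61, q=53) and the tiny toy_digest are all assumptions. The rogue certificate is matched to the legitimate one by brute force, which only works because the toy digest is so small; the real attack built both certificates together using a chosen prefix collision.

```python
import hashlib
from itertools import count

N, E, D = 3233, 17, 2753  # textbook RSA toy key (p=61, q=53), NOT secure


def toy_digest(data: bytes) -> int:
    """Tiny stand-in for the certificate hash, reduced below N."""
    return int.from_bytes(hashlib.md5(data).digest()[:2], "big") % N


def verify(tbs: bytes, signature: int) -> bool:
    """A signature covers only the digest, not the bytes themselves."""
    return pow(signature, E, N) == toy_digest(tbs)


def tbs_bytes(serial: int, subject: str, is_ca: bool, pad: int = 0) -> bytes:
    """Hypothetical 'to be signed' layout, with the serial at the start."""
    return f"serial={serial};subject={subject};CA={is_ca};pad={pad}".encode()


class SequentialCA:
    """Hypothetical CA that hands out serial numbers +1 each time."""

    def __init__(self, first_serial: int):
        self.next_serial = first_serial

    def issue(self, subject: str):
        serial = self.next_serial
        self.next_serial += 1
        tbs = tbs_bytes(serial, subject, False)
        return serial, tbs, pow(toy_digest(tbs), D, N)


ca = SequentialCA(first_serial=5000)

# Step 1: request a certificate and note the serial number S.
s, _, _ = ca.issue("probe.test")
target = s + 1000

# Step 2: predict the innocuous certificate and build a rogue one
# with the same toy digest (brute force over a padding field).
legit_tbs = tbs_bytes(target, "example.test", False)
wanted = toy_digest(legit_tbs)
rogue_tbs = next(
    t for t in (tbs_bytes(1, "Rogue CA", True, pad) for pad in count(1))
    if toy_digest(t) == wanted
)

# Step 3: issue 999 filler requests so the counter reaches S + 999 ...
for _ in range(999):
    ca.issue("filler.test")

# Step 4: ... then the next certificate issued gets serial S + 1000.
serial, issued_tbs, signature = ca.issue("example.test")

print("serial as predicted:", serial == target)
print("content as predicted:", issued_tbs == legit_tbs)
print("signature also verifies on rogue cert:", verify(rogue_tbs, signature))
```

The point of the sketch is that the CA only ever sees and signs the innocuous request; the signature carries over to the rogue certificate purely because the verifier compares digests.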
To be clear, the CA certificate produced by the researchers would be accepted by any browser because it is signed by a trusted root CA. Therefore, any end use website certificate signed by this bogus CA certificate would be accepted.
This was demonstrated in practice at the Chaos Communication Congress by having people in the audience connect to any arbitrary SSL website. The researchers had set up a system that would intercept the HTTPS request and return a malicious certificate (signed by their CA), which was accepted by the browser.
This is a textbook example of a Man-In-The-Middle attack. Because the researchers are the Man-In-The-Middle, using their CA to sign malicious certs, the hash algorithm, key length or ciphersuite of the website being connected to does not matter. Ensuring that your own cert does not use MD5 will not protect you against this type of attack.
Now, this is a perfect example of why MD5 is not acceptable for use in secure systems. It is also a perfect example of why PCI does not accept the use of MD5 hashes in any of its standards.
It should be noted that this attack is not a 'pre-image' attack on MD5 - that is, it does not recover the plaintext from an MD5 hash. As it requires the attacker to choose the start of both datasets to create the collision, it also cannot be used against hashed password files to calculate a 'collision password' that would be accepted in place of the one registered by the valid user.