The SHA hash functions are a set of cryptographic hash functions designed by the National Security Agency (NSA) and published by the NIST as a U.S. Federal Information Processing Standard. SHA stands for Secure Hash Algorithm. The three SHA algorithms are structured differently and are distinguished as SHA0, SHA1, and SHA2. The SHA2 family uses an identical algorithm with a variable digest size which is distinguished as SHA224, SHA256, SHA384, and SHA512.
SHA1 is the most widely used of the existing SHA hash functions, and is employed in several widely used security applications and protocols. In 2005, security flaws were identified in SHA1, namely that a mathematical weakness might exist, indicating that a stronger hash function would be desirable.^{[1]} Although no attacks have yet been reported on the SHA2 variants, they are algorithmically similar to SHA1 and so efforts are underway to develop improved alternatives.^{[2]}^{[3]} A new hash standard, SHA3, is currently under development: an ongoing NIST hash function competition is scheduled to end with the selection of a winning function in 2012.
The original specification of the algorithm was published in 1993 as the Secure Hash Standard, FIPS PUB 180, by US government standards agency NIST (National Institute of Standards and Technology). This version is now often referred to as SHA0. It was withdrawn by NSA shortly after publication and was superseded by the revised version, published in 1995 in FIPS PUB 180-1 and commonly referred to as SHA1. SHA1 differs from SHA0 only by a single bitwise rotation in the message schedule of its compression function; this was done, according to NSA, to correct a flaw in the original algorithm which reduced its cryptographic security. However, NSA did not provide any further explanation or identify the flaw that was corrected. Weaknesses have subsequently been reported in both SHA0 and SHA1. SHA1 appears to provide greater resistance to attacks, supporting the NSA’s assertion that the change increased the security.
SHA1 (as well as SHA0) produces a 160-bit digest from a message with a maximum length of (2^{64} − 1) bits. SHA1 is based on principles similar to those used by Ronald L. Rivest of MIT in the design of the MD4 and MD5 message digest algorithms, but has a more conservative design.
NIST published four additional hash functions in the SHA family, named after their digest lengths (in bits): SHA224, SHA256, SHA384, and SHA512. The algorithms are collectively known as SHA2.
The algorithms were first published in 2001 in the draft FIPS PUB 180-2, at which time review and comment were accepted. FIPS PUB 180-2, which also includes SHA1, was released as an official standard in 2002. In February 2004, a change notice was published for FIPS PUB 180-2, specifying an additional variant, SHA224, defined to match the key length of two-key Triple DES. These variants are patented in US patent 6829355. The United States has released the patent under a royalty-free license.^{[4]}
SHA256 and SHA512 are novel hash functions computed with 32- and 64-bit words, respectively. They use different shift amounts and additive constants, but their structures are otherwise virtually identical, differing only in the number of rounds. SHA224 and SHA384 are simply truncated versions of the first two, computed with different initial values.
Unlike SHA1, the SHA2 functions are not widely used, despite their better security. Reasons might include lack of support for SHA2 on systems running Windows XP SP2 or older,^{[5]} a lack of perceived urgency since SHA1 collisions have not yet been found, or a desire to wait until SHA3 is standardized. SHA256 is used to authenticate Debian Linux software packages^{[6]} and in the DKIM message signing standard; SHA512 is part of a system to authenticate archival video from the International Criminal Tribunal of the Rwandan genocide.^{[7]} SHA256 and SHA512 are proposed for use in DNSSEC.^{[8]} Unix and Linux vendors are moving to using 256- and 512-bit SHA2 for secure password hashing.^{[9]} NIST's directive that U.S. government agencies stop most uses of SHA1 after 2010, and the completion of SHA3, may accelerate migration away from SHA1.
Currently, the best public attacks break 41 of the 64 rounds of SHA256 or 46 of the 80 rounds of SHA512, as discussed in the "Cryptanalysis and Validation" section below.^{[10]}
An open competition for a new SHA3 function was formally announced in the Federal Register on November 2, 2007.^{[11]} "NIST is initiating an effort to develop one or more additional hash algorithms through a public competition, similar to the development process for the Advanced Encryption Standard (AES)."^{[12]} Submissions were due October 31, 2008 and the proclamation of a winner and publication of the new standard are scheduled to take place in 2012.
In the table below, internal state means the "internal hash sum" after each compression of a data block; see Merkle–Damgård construction for more details.
| Algorithm and variant | Output size (bits) | Internal state size (bits) | Block size (bits) | Max message size (bits) | Word size (bits) | Rounds | Operations | Collisions found |
|---|---|---|---|---|---|---|---|---|
| SHA0 | 160 | 160 | 512 | 2^{64} − 1 | 32 | 80 | +, and, or, xor, rot | Yes |
| SHA1 | 160 | 160 | 512 | 2^{64} − 1 | 32 | 80 | +, and, or, xor, rot | None (2^{63} attack)^{[13]} |
| SHA2: SHA256/224 | 256/224 | 256 | 512 | 2^{64} − 1 | 32 | 64 | +, and, or, xor, shr, rot | None |
| SHA2: SHA512/384 | 512/384 | 512 | 1024 | 2^{128} − 1 | 64 | 80 | +, and, or, xor, shr, rot | None |
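The output and block sizes in the table can be cross-checked against an existing implementation. The following is a minimal sketch that assumes Python's standard hashlib module is available; the helper name sizes_in_bits is made up for this illustration, and hashlib reports sizes in bytes, so they are converted to bits here.

```python
import hashlib

def sizes_in_bits(name: str) -> tuple:
    """Return (output size, block size) in bits for a hashlib algorithm name."""
    h = hashlib.new(name)
    # hashlib exposes digest_size and block_size in bytes.
    return h.digest_size * 8, h.block_size * 8

for name in ("sha1", "sha224", "sha256", "sha384", "sha512"):
    output_bits, block_bits = sizes_in_bits(name)
    print(f"{name}: output {output_bits} bits, block {block_bits} bits")
```

SHA0 is not included because hashlib does not generally provide it.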
SHA1 is the most widely employed of the SHA family. It forms part of several widely used security applications and protocols, including TLS and SSL, PGP, SSH, S/MIME, and IPsec. Those applications can also use MD5; both MD5 and SHA1 are descended from MD4. SHA1 hashing is also used in distributed revision control systems such as Git, Mercurial, and Monotone to identify revisions, and to detect data corruption or tampering.
SHA1, SHA224, SHA256, SHA384, and SHA512 are the secure hash algorithms required by law for use in certain U.S. Government applications, including use within other cryptographic algorithms and protocols, for the protection of sensitive unclassified information. FIPS PUB 180-1 also encouraged adoption and use of SHA1 by private and commercial organizations. SHA1 is being retired for most government uses; the U.S. National Institute of Standards and Technology says, "Federal agencies should stop using SHA1 for...applications that require collision resistance as soon as practical, and must use the SHA2 family of hash functions for these applications after 2010" (emphasis in original).^{[14]}
A prime motivation for the publication of the Secure Hash Algorithm was the Digital Signature Standard, in which it is incorporated.
The SHA hash functions have been used as the basis for the SHACAL block ciphers.
For a hash function for which L is the number of bits in the message digest, finding a message that corresponds to a given message digest can always be done using a brute-force search in 2^{L} evaluations. This is called a preimage attack and may or may not be practical depending on L and the particular computing environment. The second criterion is collision resistance: finding two different messages that produce the same message digest, known as a collision, requires on average only about 2^{L/2} evaluations using a birthday attack. For the latter reason the strength of a hash function is usually compared to a symmetric cipher of half the message digest length. Thus SHA1 was originally thought to have 80-bit strength.
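The gap between 2^{L} and 2^{L/2} can be observed on a deliberately weakened hash. The sketch below is an illustration only: it truncates SHA1 to L = 24 bits (an assumption chosen so that the search finishes quickly) and runs a simple birthday search over the messages "0", "1", "2", and so on. The function names are hypothetical, and nothing here attacks full SHA1.

```python
import hashlib
from itertools import count

def truncated_sha1(message: bytes, bits: int) -> int:
    """Return the leading `bits` bits of SHA1(message) as an integer."""
    digest = int.from_bytes(hashlib.sha1(message).digest(), "big")
    return digest >> (160 - bits)

def find_collision(bits: int):
    """Birthday search: store every truncated digest until one repeats."""
    seen = {}
    for i in count():
        message = str(i).encode("ascii")
        tag = truncated_sha1(message, bits)
        if tag in seen:
            # Two distinct messages share the same truncated digest.
            return seen[tag], message, i + 1
        seen[tag] = message

m1, m2, evaluations = find_collision(24)
print(m1, m2, "found after", evaluations, "evaluations")
```

With L = 24, a collision is expected after a number of evaluations on the order of 2^{12}, whereas a preimage search would need on the order of 2^{24}.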
Cryptographers have produced collision pairs for SHA0 and have found algorithms that should produce SHA1 collisions in far fewer than the originally expected 2^{80} evaluations.
In terms of practical security, a major concern about these new attacks is that they might pave the way to more efficient ones. Whether this is the case has yet to be seen, but a migration to stronger hashes is believed to be prudent. Some of the applications that use cryptographic hashes, such as password storage, are only minimally affected by a collision attack. Constructing a password that works for a given account requires a preimage attack, as well as access to the hash of the original password (typically in the shadow file) which may or may not be trivial. Reversing password encryption (e.g. to obtain a password to try against a user's account elsewhere) is not made possible by the attacks. (However, even a secure password hash can't prevent brute-force attacks on weak passwords.)
In the case of document signing, an attacker could not simply fake a signature from an existing document—the attacker would have to produce a pair of documents, one innocuous and one damaging, and get the private key holder to sign the innocuous document. There are practical circumstances in which this is possible; until the end of 2008, it was possible to create forged SSL certificates using an MD5 collision.^{[15]}
At CRYPTO 98, two French researchers, Florent Chabaud and Antoine Joux, presented an attack on SHA0 (Chabaud and Joux, 1998): collisions can be found with complexity 2^{61}, fewer than the 2^{80} for an ideal hash function of the same size.
In 2004, Biham and Chen found near-collisions for SHA0: two messages that hash to nearly the same value; in this case, 142 out of the 160 bits are equal. They also found full collisions of SHA0 reduced to 62 out of its 80 rounds.
Subsequently, on 12 August 2004, a collision for the full SHA0 algorithm was announced by Joux, Carribault, Lemuet, and Jalby. This was done by using a generalization of the Chabaud and Joux attack. Finding the collision had complexity 2^{51} and took about 80,000 CPU hours on a supercomputer with 256 Itanium 2 processors (equivalent to 13 days of full-time use of the computer).
On 17 August 2004, at the Rump Session of CRYPTO 2004, preliminary results were announced by Wang, Feng, Lai, and Yu, about an attack on MD5, SHA0 and other hash functions. The complexity of their attack on SHA0 is 2^{40}, significantly better than the attack by Joux et al.^{[16]}^{[17]}
In February 2005, an attack by Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu was announced which could find collisions in SHA0 in 2^{39} operations.^{[18]}^{[19]}
In light of the results for SHA0, some experts suggested that plans for the use of SHA1 in new cryptosystems should be reconsidered. After the CRYPTO 2004 results were published, NIST announced that they planned to phase out the use of SHA1 by 2010 in favor of the SHA2 variants.^{[20]}
In early 2005, Rijmen and Oswald published an attack on a reduced version of SHA1—53 out of 80 rounds—which finds collisions with a computational effort of fewer than 2^{80} operations.^{[21]}
In February 2005, an attack by Xiaoyun Wang, Yiqun Lisa Yin, Bayarjargal, and Hongbo Yu was announced.^{[18]} The attacks can find collisions in the full version of SHA1, requiring fewer than 2^{69} operations. (A brute-force search would require 2^{80} operations.)
The authors write: "In particular, our analysis is built upon the original differential attack on SHA0 [sic], the near collision attack on SHA0, the multiblock collision techniques, as well as the message modification techniques used in the collision search attack on MD5. Breaking SHA1 would not be possible without these powerful analytical techniques."^{[22]} The authors have presented a collision for 58-round SHA1, found with 2^{33} hash operations. The paper with the full attack description was published in August 2005 at the CRYPTO conference.
In an interview, Yin states that, "Roughly, we exploit the following two weaknesses: One is that the file preprocessing step is not complicated enough; another is that certain math operations in the first 20 rounds have unexpected security problems."^{[23]}
On 17 August 2005, an improvement on the SHA1 attack was announced on behalf of Xiaoyun Wang, Andrew Yao and Frances Yao at the CRYPTO 2005 rump session, lowering the complexity required for finding a collision in SHA1 to 2^{63}.^{[24]} On 18 December 2007 the details of this result were explained and verified by Martin Cochran.^{[25]}
Christophe De Cannière and Christian Rechberger further improved the attack on SHA1 in "Finding SHA1 Characteristics: General Results and Applications,"^{[26]} receiving the Best Paper Award at ASIACRYPT 2006. A two-block collision for 64-round SHA1 was presented, found using unoptimized methods with 2^{35} compression function evaluations. As this attack requires the equivalent of about 2^{35} evaluations, it is considered to be a significant theoretical break.^{[27]} In order to find an actual collision in the full 80 rounds of the hash function, however, massive amounts of computer time are required. To that end, a collision search for SHA1 using the distributed computing platform BOINC began August 8, 2007, organized by the Graz University of Technology. The effort was abandoned May 12, 2009 due to lack of progress.^{[28]}
At the Rump Session of CRYPTO 2006, Christian Rechberger and Christophe De Cannière claimed to have discovered a collision attack on SHA1 that would allow an attacker to select at least parts of the message.^{[29]}^{[30]}
Cameron McDonald, Philip Hawkes and Josef Pieprzyk presented a hash collision attack with claimed complexity 2^{52} at the Rump session of Eurocrypt 2009.^{[31]} However, the accompanying paper, "Differential Path for SHA1 with complexity O(2^{52})" has been withdrawn due to the authors' discovery that their estimate was incorrect.^{[32]}
There are two meet-in-the-middle preimage attacks against SHA2 with a reduced number of rounds. The first one attacks 41-round SHA256 out of 64 rounds with time complexity of 2^{253.5} and space complexity of 2^{16}, and 46-round SHA512 out of 80 rounds with time 2^{511.5} and space 2^{3}. The second one attacks 42-round SHA256 with time complexity of 2^{251.7} and space complexity of 2^{12}, and 42-round SHA512 with time 2^{502} and space 2^{22}.
Implementations of all FIPS-approved security functions can be officially validated through the CMVP program, jointly run by the National Institute of Standards and Technology (NIST) and the Communications Security Establishment (CSE). For informal verification, a package to generate a high number of test vectors is made available for download on the NIST site; the resulting verification, however, does not in any way replace the formal CMVP validation, which is required by law for certain applications.
As of October 2006, there are more than 500 validated implementations of SHA1, with fewer than ten of them capable of handling messages with a length in bits not a multiple of eight (see SHS Validation List).
The following is an example of SHA1 digests. ASCII encoding is assumed for all messages.
SHA1("The quick brown fox jumps over the lazy dog") = 2fd4e1c6 7a2d28fc ed849ee1 bb76e739 1b93eb12
Even a small change in the message will, with overwhelming probability, result in a completely different hash due to the avalanche effect. For example, changing dog to cog produces a hash with different values for 81 of the 160 bits:
SHA1("The quick brown fox jumps over the lazy cog") = de9f2c7f d25e1b3a fad3e85a 0bd17d9b 100db4b3
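The two digests above can be reproduced with any SHA1 implementation. The following sketch assumes Python's standard hashlib module; the helper names sha1_hex and differing_bits are made up for this example. It also counts the bit positions in which the two digests differ, so the avalanche effect can be checked independently.

```python
import hashlib

def sha1_hex(text: str) -> str:
    """Hex SHA1 digest of an ASCII-encoded message."""
    return hashlib.sha1(text.encode("ascii")).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Number of bit positions in which two hex strings differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

dog = sha1_hex("The quick brown fox jumps over the lazy dog")
cog = sha1_hex("The quick brown fox jumps over the lazy cog")
print(dog)
print(cog)
print(differing_bits(dog, cog), "of 160 bits differ")
```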
Pseudocode for the SHA1 algorithm follows:
    Note 1: All variables are unsigned 32 bits and wrap modulo 2^{32} when calculating
    Note 2: All constants in this pseudo code are in big endian.
            Within each word, the most significant byte is stored in the leftmost byte position

    Initialize variables:
    h0 = 0x67452301
    h1 = 0xEFCDAB89
    h2 = 0x98BADCFE
    h3 = 0x10325476
    h4 = 0xC3D2E1F0

    Preprocessing:
    append the bit '1' to the message
    append 0 ≤ k < 512 bits '0', so that the resulting message length (in bits)
        is congruent to 448 ≡ −64 (mod 512)
    append length of message (before preprocessing), in bits, as 64-bit big-endian integer

    Process the message in successive 512-bit chunks:
    break message into 512-bit chunks
    for each chunk
        break chunk into sixteen 32-bit big-endian words w[i], 0 ≤ i ≤ 15

        Extend the sixteen 32-bit words into eighty 32-bit words:
        for i from 16 to 79
            w[i] = (w[i-3] xor w[i-8] xor w[i-14] xor w[i-16]) leftrotate 1

        Initialize hash value for this chunk:
        a = h0
        b = h1
        c = h2
        d = h3
        e = h4

        Main loop:
        for i from 0 to 79
            if 0 ≤ i ≤ 19 then
                f = (b and c) or ((not b) and d)
                k = 0x5A827999
            else if 20 ≤ i ≤ 39
                f = b xor c xor d
                k = 0x6ED9EBA1
            else if 40 ≤ i ≤ 59
                f = (b and c) or (b and d) or (c and d)
                k = 0x8F1BBCDC
            else if 60 ≤ i ≤ 79
                f = b xor c xor d
                k = 0xCA62C1D6

            temp = (a leftrotate 5) + f + e + k + w[i]
            e = d
            d = c
            c = b leftrotate 30
            b = a
            a = temp

        Add this chunk's hash to result so far:
        h0 = h0 + a
        h1 = h1 + b
        h2 = h2 + c
        h3 = h3 + d
        h4 = h4 + e

    Produce the final hash value (big-endian):
    digest = hash = h0 append h1 append h2 append h3 append h4
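The SHA1 pseudocode above translates almost line for line into Python. The following is an unoptimized sketch for study purposes only, not a replacement for a validated implementation; it handles byte-aligned messages only, and the names rotl and sha1 are chosen for this example. Python integers are unbounded, so every addition and rotation is masked to 32 bits explicitly.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF  # keep every intermediate value within 32 bits

def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

def sha1(message: bytes) -> str:
    h0, h1, h2, h3, h4 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0
    bit_length = len(message) * 8
    # Preprocessing: the byte 0x80 is the '1' bit followed by seven '0' bits.
    message += b"\x80"
    message += b"\x00" * ((56 - len(message)) % 64)
    message += struct.pack(">Q", bit_length)
    for offset in range(0, len(message), 64):
        # Sixteen big-endian words, extended to eighty.
        w = list(struct.unpack(">16I", message[offset:offset + 64]))
        for i in range(16, 80):
            w.append(rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
        a, b, c, d, e = h0, h1, h2, h3, h4
        for i in range(80):
            if i < 20:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif i < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif i < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            temp = (rotl(a, 5) + f + e + k + w[i]) & MASK
            e, d, c, b, a = d, c, rotl(b, 30), a, temp
        # Add this chunk's hash to the result so far.
        h0 = (h0 + a) & MASK
        h1 = (h1 + b) & MASK
        h2 = (h2 + c) & MASK
        h3 = (h3 + d) & MASK
        h4 = (h4 + e) & MASK
    return "%08x%08x%08x%08x%08x" % (h0, h1, h2, h3, h4)

print(sha1(b"The quick brown fox jumps over the lazy dog"))
```

Comparing the result with hashlib.sha1 on a few inputs, including lengths around the 55 and 56 byte padding boundary, is a quick sanity check.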
The constant values used are chosen as nothing-up-my-sleeve numbers: the four round constants k are 2^{30} times the square roots of 2, 3, 5 and 10. The first four starting values for h0 through h3 are the same as in the MD5 algorithm, and the fifth (for h4) is similar.
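The origin of the round constants can be checked with exact integer arithmetic. This sketch assumes Python 3.8 or later for math.isqrt and takes the integer part of 2^{30} times each square root; the function name sha1_round_constant is made up for the example.

```python
import math

def sha1_round_constant(n: int) -> int:
    """floor(sqrt(n) * 2**30), computed exactly as isqrt(n * 2**60)."""
    return math.isqrt(n << 60)

for n in (2, 3, 5, 10):
    print(n, hex(sha1_round_constant(n)))
```

Using an integer square root avoids any dependence on floating-point rounding.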
Instead of the formulation from the original FIPS PUB 180-1 shown, the following equivalent expressions may be used to compute f in the main loop above:
    (0 ≤ i ≤ 19): f = d xor (b and (c xor d))                   (alternative 1)
    (0 ≤ i ≤ 19): f = (b and c) xor ((not b) and d)             (alternative 2)
    (0 ≤ i ≤ 19): f = (b and c) + ((not b) and d)               (alternative 3)
    (0 ≤ i ≤ 19): f = vec_sel(d, c, b)                          (alternative 4)

    (40 ≤ i ≤ 59): f = (b and c) or (d and (b or c))            (alternative 1)
    (40 ≤ i ≤ 59): f = (b and c) or (d and (b xor c))           (alternative 2)
    (40 ≤ i ≤ 59): f = (b and c) + (d and (b xor c))            (alternative 3)
    (40 ≤ i ≤ 59): f = (b and c) xor (b and d) xor (c and d)    (alternative 4)
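That these expressions agree can be checked mechanically. The sketch below compares the original formulation with the alternatives on 32-bit words; the function names are made up for this example, and the vec_sel variant is left out because it is a vector intrinsic rather than a plain bitwise expression. The additive variants work because their two terms never have a set bit in the same position, so no carry occurs.

```python
import random
from itertools import product

MASK = 0xFFFFFFFF  # all values are treated as unsigned 32-bit words

def choice_variants(b, c, d):
    """Rounds 0..19: the original expression, then alternatives 1 to 3."""
    return [
        (b & c) | (~b & d),
        d ^ (b & (c ^ d)),
        (b & c) ^ (~b & d),
        ((b & c) + (~b & d)) & MASK,
    ]

def majority_variants(b, c, d):
    """Rounds 40..59: the original expression, then alternatives 1 to 4."""
    return [
        (b & c) | (b & d) | (c & d),
        (b & c) | (d & (b | c)),
        (b & c) | (d & (b ^ c)),
        ((b & c) + (d & (b ^ c))) & MASK,
        (b & c) ^ (b & d) ^ (c & d),
    ]

def variants_agree(b, c, d) -> bool:
    return (len(set(choice_variants(b, c, d))) == 1
            and len(set(majority_variants(b, c, d))) == 1)

# All-zero and all-one words cover every per-bit input combination.
exhaustive = all(variants_agree(b, c, d) for b, c, d in product((0, MASK), repeat=3))

# Random words as an additional spot check (fixed seed for repeatability).
rng = random.Random(0)
sampled = all(
    variants_agree(rng.getrandbits(32), rng.getrandbits(32), rng.getrandbits(32))
    for _ in range(1000)
)
print(exhaustive, sampled)
```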
Pseudocode for the SHA256 algorithm follows. Note the great increase in mixing between bits of the w[16..63] words compared to SHA1.
    Note 1: All variables are unsigned 32 bits and wrap modulo 2^{32} when calculating
    Note 2: All constants in this pseudo code are in big endian

    Initialize variables
    (first 32 bits of the fractional parts of the square roots of the first 8 primes 2..19):
    h0 := 0x6a09e667
    h1 := 0xbb67ae85
    h2 := 0x3c6ef372
    h3 := 0xa54ff53a
    h4 := 0x510e527f
    h5 := 0x9b05688c
    h6 := 0x1f83d9ab
    h7 := 0x5be0cd19

    Initialize table of round constants
    (first 32 bits of the fractional parts of the cube roots of the first 64 primes 2..311):
    k[0..63] :=
        0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
        0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
        0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
        0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
        0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
        0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
        0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
        0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2

    Preprocessing:
    append the bit '1' to the message
    append k bits '0', where k is the minimum number >= 0 such that the resulting message
        length (in bits) is congruent to 448 (mod 512)
    append length of message (before preprocessing), in bits, as 64-bit big-endian integer

    Process the message in successive 512-bit chunks:
    break message into 512-bit chunks
    for each chunk
        break chunk into sixteen 32-bit big-endian words w[0..15]

        Extend the sixteen 32-bit words into sixty-four 32-bit words:
        for i from 16 to 63
            s0 := (w[i-15] rightrotate 7) xor (w[i-15] rightrotate 18) xor (w[i-15] rightshift 3)
            s1 := (w[i-2] rightrotate 17) xor (w[i-2] rightrotate 19) xor (w[i-2] rightshift 10)
            w[i] := w[i-16] + s0 + w[i-7] + s1

        Initialize hash value for this chunk:
        a := h0
        b := h1
        c := h2
        d := h3
        e := h4
        f := h5
        g := h6
        h := h7

        Main loop:
        for i from 0 to 63
            s0 := (a rightrotate 2) xor (a rightrotate 13) xor (a rightrotate 22)
            maj := (a and b) xor (a and c) xor (b and c)
            t2 := s0 + maj
            s1 := (e rightrotate 6) xor (e rightrotate 11) xor (e rightrotate 25)
            ch := (e and f) xor ((not e) and g)
            t1 := h + s1 + ch + k[i] + w[i]

            h := g
            g := f
            f := e
            e := d + t1
            d := c
            c := b
            b := a
            a := t1 + t2

        Add this chunk's hash to result so far:
        h0 := h0 + a
        h1 := h1 + b
        h2 := h2 + c
        h3 := h3 + d
        h4 := h4 + e
        h5 := h5 + f
        h6 := h6 + g
        h7 := h7 + h

    Produce the final hash value (big-endian):
    digest = hash = h0 append h1 append h2 append h3 append h4 append h5 append h6 append h7
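As with SHA1, the SHA256 pseudocode can be followed step by step in Python. The sketch below is unoptimized, handles byte-aligned messages only, and is meant for study rather than production use. One liberty is taken as an assumption of this example: instead of copying the constant tables, it derives them from the square roots and cube roots of the primes as described in the pseudocode comments. The helper names primes, icbrt, rotr and sha256 are made up for the illustration.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF  # keep every intermediate value within 32 bits

def primes(n: int) -> list:
    """First n primes by trial division (fine for n = 64)."""
    found, candidate = [], 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found

def icbrt(n: int) -> int:
    """Floor of the cube root of n, by binary search on integers."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

# First 32 fractional bits of the square roots / cube roots of the primes.
H0 = [math.isqrt(p << 64) & MASK for p in primes(8)]
K = [icbrt(p << 96) & MASK for p in primes(64)]

def rotr(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK

def sha256(message: bytes) -> str:
    h = list(H0)
    bit_length = len(message) * 8
    # Preprocessing: '1' bit, zero padding, 64-bit big-endian length.
    message += b"\x80"
    message += b"\x00" * ((56 - len(message)) % 64)
    message += struct.pack(">Q", bit_length)
    for offset in range(0, len(message), 64):
        w = list(struct.unpack(">16I", message[offset:offset + 64]))
        for i in range(16, 64):
            s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
            s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
            w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
        a, b, c, d, e, f, g, hh = h
        for i in range(64):
            s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
            maj = (a & b) ^ (a & c) ^ (b & c)
            t2 = s0 + maj
            s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
            ch = (e & f) ^ (~e & g)
            t1 = hh + s1 + ch + K[i] + w[i]
            hh, g, f, e, d, c, b, a = g, f, e, (d + t1) & MASK, c, b, a, (t1 + t2) & MASK
        # Add this chunk's hash to the result so far.
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e, f, g, hh))]
    return "".join("%08x" % x for x in h)

print(sha256(b"abc"))
```

Deriving the tables keeps the block short and doubles as a check of the stated origin of the constants; a practical implementation would normally hard-code them.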
The ch and maj functions can be optimized the same way as described for SHA1.
SHA224 is identical to SHA256, except that the initial values h0 through h7 are different, and the output is truncated by omitting h7.
Here are the initial values for the variables (in big endian), taken from the second 32 bits of the fractional parts of the square roots of the 9th through 16th primes 23..53:
    h0 := 0xc1059ed8
    h1 := 0x367cd507
    h2 := 0x3070dd17
    h3 := 0xf70e5939
    h4 := 0xffc00b31
    h5 := 0x68581511
    h6 := 0x64f98fa7
    h7 := 0xbefa4fa4
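The stated origin of the SHA224 initial values can be checked in the same spirit. This sketch assumes Python 3.8 or later for math.isqrt; the names PRIMES_9_TO_16 and sha224_initial_values are made up for the example. It computes 64 fractional bits of each square root exactly and keeps the low 32 of them, which are the "second 32 bits".

```python
import math

# The 9th through 16th primes.
PRIMES_9_TO_16 = (23, 29, 31, 37, 41, 43, 47, 53)

def sha224_initial_values() -> list:
    """Second 32 bits of the fractional part of sqrt(p) for each prime p."""
    # isqrt(p * 2**128) equals floor(sqrt(p) * 2**64); its low 32 bits are
    # fractional bits 33 through 64.
    return [math.isqrt(p << 128) & 0xFFFFFFFF for p in PRIMES_9_TO_16]

for index, value in enumerate(sha224_initial_values()):
    print(f"h{index} := 0x{value:08x}")
```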
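SHA512 is identical in structure, but it is computed with 64-bit words, processes 1024-bit blocks, runs 80 rounds instead of 64, and uses different shift amounts and additive constants.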
SHA384 is identical to SHA512, except that the initial values h0 through h7 are different (taken from the 9th through 16th primes), and the output is truncated by omitting h6 and h7.
The crypto library includes free, open-source implementations of SHA1, SHA224, SHA256, SHA384, and SHA512.

