Hashing is the practice of using an algorithm to map data of any size to a fixed-length output. This output is called a hash value (or hash code, or hash sum). Whereas encryption is a two-way function, hashing is a one-way function. While it is technically possible to reverse a hash, the computing power needed makes it infeasible. Hashing is one-way.
Now, whereas encryption is meant to protect data in transit, hashing is meant to check that a file or piece of data has not been altered – that it is authentic. In other words, it serves as a checksum.
Here is how it works: each hashing algorithm produces output of a fixed (specific) length. So, for instance, when you hear about SHA-256, that means the algorithm outputs a hash value that is 256 bits long, usually represented as a 64-character hexadecimal string.
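Here is a minimal sketch of that fixed-length property, using Python's standard hashlib module (the inputs are arbitrary):

    import hashlib

    # Two inputs of wildly different sizes...
    short_digest = hashlib.sha256(b"hi").hexdigest()
    long_digest = hashlib.sha256(b"x" * 1_000_000).hexdigest()

    # ...both map to a 256-bit value: a 64-character hex string.
    print(len(short_digest), len(long_digest))  # 64 64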
Ideally, every hash value is unique. If two different files produce the same hash value, this is known as a collision, and it makes the algorithm essentially useless. In 2017, Google produced a collision with the SHA-1 hashing algorithm to demonstrate that it is vulnerable.
SHA-1 was officially phased out in favour of SHA-2 in 2016. But Google had a point to make, so it devoted two years' worth of funds, man-hours and talent, in a partnership with a lab in Amsterdam, to turn what had until then been more of an abstraction into a reality. That's a long way to go to prove a point. But Google went there.
Anyway, here is an example of hashing: let us say you want to digitally sign a piece of software and make it available for download on your website. To do this, you compute a hash of the script or executable you are signing, then after adding your digital signature you hash that, too. Following this, the whole thing is encrypted so it can be downloaded.
When a customer downloads the software, their web browser decrypts the file, then inspects the two hash values. The browser then runs the same hash function, using the same algorithm, and hashes both the file and the signature again. If it produces the same hash values, it knows that both the signature and the file are authentic – they have not been altered. If not, the browser issues a warning.
That is, roughly, how code signing works. And remember, no two files should produce the same hash value, so any alteration – even the tiniest tweak – will produce a different value.
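Here is a quick sketch of that integrity check using Python's standard hashlib; the file contents and digest are stand-ins for illustration:

    import hashlib

    original = b"example installer contents"
    published_digest = hashlib.sha256(original).hexdigest()

    # Even a one-character tweak yields a completely different hash,
    # so the mismatch is detected immediately.
    downloaded = b"Example installer contents"
    if hashlib.sha256(downloaded).hexdigest() != published_digest:
        print("Warning: file does not match the published hash!")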
On the other hand, encryption is the practice of scrambling information in such a way that only specific people with a corresponding key can unscramble and read it. Encryption is a two-way function. When you encrypt something, you are doing it with the intention of decrypting it later.
This is a key distinction between encryption and hashing.
To encrypt data you use something called a cypher, which is an algorithm – a series of well-defined steps that can be followed procedurally – to encrypt and decrypt information. The cypher is used together with an encryption key, which determines its output.
I understand the word algorithm has kind of a daunting connotation because of the scars we all still bear from high school and college calculus. But as you'll see, an algorithm is really nothing more than a set of rules – and they can actually be pretty simple.
Encryption has a long, storied history. It dates back to at least 1900 BC, following the discovery of a tomb wall with non-standard hieroglyphs chiselled into it. Since then there have been countless historical examples.
The early Egyptians used a simple form of encryption. As did Caesar, whose cypher stands as one of the most essential examples of encryption in history. Caesar used a primitive shift cypher that changed each letter of a message by counting forward a set number of places in the alphabet. It was extraordinarily useful though, making any information intercepted by Caesar's opponents practically useless.
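Caesar's shift cypher is simple enough to write in a few lines; this Python sketch uses his traditional shift of three places:

    def caesar(text, shift):
        # Shift each letter forward by `shift` places, wrapping past Z.
        out = []
        for ch in text.upper():
            if ch.isalpha():
                out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
            else:
                out.append(ch)
        return "".join(out)

    ciphertext = caesar("ATTACK AT DAWN", 3)   # 'DWWDFN DW GDZQ'
    plaintext = caesar(ciphertext, -3)         # shift back to decrypt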
More than a thousand years later, a nomenclator cypher – a type of substitution cypher that swaps symbols for common words in an attempt to defeat a decryption technique called frequency analysis – got Mary Queen of Scots beheaded and a bunch of conspirators killed when their messages were intercepted and deciphered by a literal man (or men) in the middle.
MD5
MD5 was introduced in 1991 to replace the earlier hash function MD4, which was believed to be weak. It is still popularly used for the protection of fairly insensitive information. 1996 was a very damaging year for MD5, however: flaws were discovered in its design, and other hashing functions were suggested in its place. The hash is 128 bits, small enough to permit a birthday attack.
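For illustration, Python's standard hashlib still ships MD5; note the short 32-hex-character (128-bit) output:

    import hashlib

    digest = hashlib.md5(b"hello world").hexdigest()
    print(digest)       # 5eb63bbbe01eeed093cb22bb8f5acdc3
    print(len(digest))  # 32 hex characters = 128 bits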
SHA
The SHA series of algorithms stands for "Secure Hash Algorithm". Thanks to the avalanche effect, even a minor change in the data to be hashed will probably result in a very different hash string. Because the SHA algorithms show signs of the avalanche effect, they are believed to have quite good randomization characteristics. The early SHA algorithms were based on the MD4 and MD5 algorithms designed by Ron Rivest. SHA was designed by the National Security Agency (NSA) and published by NIST as a US government standard.
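A small sketch of the avalanche effect with SHA-256 from Python's standard hashlib: the inputs below differ by a single bit, yet roughly half of the 256 output bits flip:

    import hashlib

    # 'data' and 'date' differ in exactly one bit ('a' = 0x61, 'e' = 0x65).
    a = int.from_bytes(hashlib.sha256(b"data").digest(), "big")
    b = int.from_bytes(hashlib.sha256(b"date").digest(), "big")

    # XOR the digests and count the differing bits.
    print(bin(a ^ b).count("1"))  # roughly 128 of 256 bits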
Bcrypt
Bcrypt was created by Niels Provos and David Mazières based on the Blowfish cypher: "b" for Blowfish and "crypt" for the name of the hashing function used by the UNIX password system.
The original crypt is a prime example of failure to adapt to technological change. According to USENIX, in 1976 crypt could hash fewer than 4 passwords per second. Since attackers need to find the pre-image of a hash in order to invert it, this made the UNIX team feel very relaxed about the strength of crypt. Twenty years later, however, a fast computer with optimized software and hardware was capable of hashing 200,000 passwords per second using that function!
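Here is a sketch using the third-party bcrypt package (pip install bcrypt); the tunable cost factor is how bcrypt avoids crypt's fate, since each increment doubles the work:

    import bcrypt

    # rounds=12 is a common default; raise it as hardware gets faster.
    hashed = bcrypt.hashpw(b"correct horse battery staple", bcrypt.gensalt(rounds=12))

    print(bcrypt.checkpw(b"correct horse battery staple", hashed))  # True
    print(bcrypt.checkpw(b"wrong guess", hashed))                   # False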
RIPEMD-160
RIPEMD (RIPE Message Digest) is a family of cryptographic hash functions developed in 1992 (the original RIPEMD) and 1996 (the other variants). There are five functions in the family: RIPEMD, RIPEMD-128, RIPEMD-160, RIPEMD-256, and RIPEMD-320, of which RIPEMD-160 is the most popular.
RIPEMD-160 is a cryptographic hash function based on the Merkle–Damgård construction. It is a strengthened version of the original RIPEMD algorithm, which produces a 128-bit digest, whereas RIPEMD-160 produces a 160-bit output.
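Python can compute RIPEMD-160 through hashlib.new, but only where the underlying OpenSSL build still provides it (recent OpenSSL releases moved it to the legacy provider, so this may raise a ValueError):

    import hashlib

    h = hashlib.new("ripemd160", b"message digest")
    print(h.hexdigest())      # 160 bits = 40 hex characters
    print(h.digest_size * 8)  # 160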
Whirlpool
Whirlpool is quite a young hash algorithm; it was first released in 2000, and a few revisions have taken place since. Whirlpool's designers have promised never to patent it; instead, it is free for everybody who wants to use it. Whirlpool hashes are usually displayed as a 128-digit hexadecimal string. Whirlpool-0 is the first version, Whirlpool-T is the second, and Whirlpool is the most recent version of the algorithm. Whirlpool is based on a modified version of the Advanced Encryption Standard (AES).
BLAKE3
BLAKE3 is the most recent version of the BLAKE cryptographic hash function. Created by Jack O'Connor, Jean-Philippe Aumasson, Samuel Neves, and Zooko Wilcox-O'Hearn, BLAKE3 combines the general-purpose cryptographic tree hash Bao with BLAKE2 in order to provide a great performance improvement over SHA-1, SHA-2, SHA-3, and BLAKE2.
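BLAKE3 is not in Python's standard library, but here is a sketch with the third-party blake3 package (pip install blake3):

    import blake3

    hasher = blake3.blake3(b"some data")
    print(hasher.hexdigest())  # 256-bit digest by default

    # BLAKE3 is an extendable-output function, so you can request more bytes.
    print(hasher.digest(length=64).hex())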
HMAC
In cryptography, an HMAC (sometimes expanded as either keyed-hash message authentication code or hash-based message authentication code) is a specific kind of message authentication code (MAC) involving a cryptographic hash function and a secret cryptographic key. As with any MAC, it may be used to simultaneously verify both the data integrity and the authenticity of a message.
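Python's standard hmac module shows the idea; the key and message here are placeholders:

    import hashlib
    import hmac

    key = b"secret key shared by both parties"
    message = b"amount=100&to=alice"

    tag = hmac.new(key, message, hashlib.sha256).hexdigest()

    # The receiver recomputes the tag with the shared key and compares
    # in constant time to avoid timing side channels.
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    print(hmac.compare_digest(tag, expected))  # True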
MAC
In cryptography, a message authentication code (MAC), also known as a tag, is a short piece of information used to authenticate a message – in other words, to confirm that the message came from the stated sender (its authenticity) and has not been altered. The MAC value protects both a message's data integrity and its authenticity by letting verifiers detect any changes to the message content.
N-Hash
In cryptography, N-Hash is a cryptographic hash function based on the FEAL round function, and is nowadays considered insecure. It was proposed in 1990 by Miyaguchi et al.; weaknesses were published the following year.
N-Hash has a 128-bit hash size. A message is separated into 128-bit blocks, and each block is combined with the hash value computed so far using the compression function g. The function g contains eight rounds, each of which uses an F function similar to the one used by FEAL.
Base64
Base64 is a collection of binary-to-text encoding schemes that represents binary data in an ASCII string format by translating it into a radix-64 representation. Each Base64 digit represents exactly 6 bits of data. Three 8-bit bytes (a total of 24 bits) can therefore be represented by four 6-bit Base64 digits.
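That three-bytes-to-four-digits arithmetic is easy to verify with Python's standard base64 module:

    import base64

    # Three 8-bit bytes (24 bits) become exactly four 6-bit Base64 digits.
    encoded = base64.b64encode(b"Man")
    print(encoded)                    # b'TWFu'
    print(base64.b64decode(encoded))  # b'Man'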
Like all binary-to-text encoding schemes, Base64 is designed to carry data stored in binary formats across channels that only reliably support text content. Base64 is especially prevalent on the web, where its uses include the ability to embed image files or other binary assets inside textual assets like HTML and CSS files.
Mcrypt
Mcrypt is a replacement for the popular Unix crypt command; the latter was a file encryption tool that used an algorithm very close to the World War II Enigma cypher. Mcrypt offers the same functionality but uses several modern algorithms such as AES. Libmcrypt, Mcrypt's companion, is a library of code which contains the actual encryption functions and offers an easy method of use. The last update to libmcrypt was in 2007, despite years of unmerged patches. These points have led security experts to declare Mcrypt abandonware and discourage its use in new development. Maintained alternatives include ccrypt, LibreSSL, and more.
AES Encrypt
The Advanced Encryption Standard (AES) is a symmetric block cypher chosen by the US government to protect classified information. AES is implemented in software and hardware throughout the world to encrypt sensitive data. It is essential for government computer security, cybersecurity and electronic data protection.
NIST stated that the newer, advanced encryption algorithm would be unclassified and must be "capable of protecting sensitive government information well into the [21st] century." It was intended to be easy to implement in hardware and software, as well as in restricted environments – such as a smart card – and offer decent defences against various attack techniques.
AES was created for the US government with additional voluntary, free use in public or private, commercial or noncommercial programs that provide encryption services. However, non-governmental organizations choosing to use AES are subject to limitations created by U.S. export control.
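Here is a sketch of AES in an authenticated mode (AES-256-GCM) using the third-party cryptography package (pip install cryptography); the plaintext is a placeholder:

    import os

    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = AESGCM.generate_key(bit_length=256)  # AES-256
    aesgcm = AESGCM(key)
    nonce = os.urandom(12)  # never reuse a nonce with the same key

    ciphertext = aesgcm.encrypt(nonce, b"sensitive data", None)
    plaintext = aesgcm.decrypt(nonce, ciphertext, None)  # raises if tampered with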
RSA
RSA (Rivest–Shamir–Adleman) is an algorithm used by modern computers to encrypt and decrypt messages. It is an asymmetric cryptographic algorithm. Asymmetric means that there are two different keys. This is also known as public-key cryptography, because one of the keys can be given to anyone. The other key must be kept private. The algorithm is based on the fact that finding the factors of a large composite number is difficult: when the factors are prime numbers, the problem is called prime factorization.
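A sketch of that public/private split, again with the third-party cryptography package: anyone holding the public key can encrypt, but only the private key holder can decrypt:

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding, rsa

    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()

    # OAEP padding is the modern choice for RSA encryption.
    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    ciphertext = public_key.encrypt(b"meet at dawn", oaep)  # anyone can do this
    plaintext = private_key.decrypt(ciphertext, oaep)       # only the key holder can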
Blowfish
Blowfish is a symmetric block cypher that can be used as a drop-in replacement for DES or IDEA. It takes a variable-length key, from 32 bits to 448 bits, making it great for both domestic and exportable use. Blowfish was designed in 1993 by Bruce Schneier as a fast, free alternative to existing encryption algorithms. Since then it has been analyzed considerably, and it is slowly gaining acceptance as a strong encryption algorithm. Blowfish is unpatented and license-free, and is available free for all uses.
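Here is a sketch with the third-party PyCryptodome package (pip install pycryptodome), whose Crypto.Cipher.Blowfish accepts the variable key lengths described above:

    from Crypto.Cipher import Blowfish
    from Crypto.Random import get_random_bytes
    from Crypto.Util.Padding import pad, unpad

    key = get_random_bytes(16)  # anything from 4 to 56 bytes (32 to 448 bits)
    cipher = Blowfish.new(key, Blowfish.MODE_CBC)

    # Blowfish has a 64-bit (8-byte) block, so the plaintext must be padded.
    ciphertext = cipher.encrypt(pad(b"attack at dawn", Blowfish.block_size))

    decipher = Blowfish.new(key, Blowfish.MODE_CBC, iv=cipher.iv)
    plaintext = unpad(decipher.decrypt(ciphertext), Blowfish.block_size)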
Twofish
Twofish is a symmetric block cypher; a single key is used for encryption and decryption. Twofish has a block size of 128 bits and accepts a key of any length up to 256 bits. (NIST needed the algorithm to accept 128-, 192-, and 256-bit keys.) Twofish is fast on both 32-bit and 8-bit CPUs (smart cards, embedded chips, and the like), and in hardware. And it is flexible; it can be used in network applications where keys are changed frequently and in apps where there is little or no RAM and ROM available.
CRC-64
A cyclic redundancy check (CRC) is an error-detecting code used to detect data corruption. When sending data, a short checksum is generated from the content of the data and sent along with it. When receiving data, the checksum is generated again and compared with the checksum that was sent. If the two are equal, the data was not corrupted. The CRC-64 algorithm itself converts a variable-length string into a 64-bit value, usually rendered as a 16-character hexadecimal string.
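Python's standard library only exposes the 32-bit variant, but zlib.crc32 is enough to sketch the send-and-compare pattern (CRC-64 works the same way with a longer checksum):

    import zlib

    payload = b"some data worth protecting in transit"
    sent_checksum = zlib.crc32(payload)

    # The receiver recomputes the checksum over what actually arrived.
    received = payload  # pretend this came off the wire
    if zlib.crc32(received) == sent_checksum:
        print("no corruption detected")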
SWIFFT
In cryptography, SWIFFT is a group of provably secure hash functions. SWIFFT is not the first hash function based on the fast Fourier transform (FFT), but it sets itself apart by providing a mathematical proof of its security. It also uses the LLL basis reduction algorithm. It can be shown that finding collisions in SWIFFT is at least as difficult as finding short vectors in cyclic/ideal lattices in the worst case. By giving a security reduction to the worst case of a difficult mathematical problem, SWIFFT gives a much stronger security guarantee than other cryptographic hash functions.
KMAC
The KECCAK Message Authentication Code (KMAC) algorithm is a PRF and keyed hash function based on KECCAK. It offers variable-length output, and unlike SHAKE and cSHAKE, altering the requested output length generates new, unrelated output. KMAC has two variants, KMAC128 and KMAC256, built from cSHAKE128 and cSHAKE256, respectively. The two variants differ somewhat in their technical security properties. Nonetheless, for most applications, both variants can support any security strength up to 256 bits of security, provided that a long enough key is used.
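Here is a sketch using the third-party PyCryptodome package, which implements the two variants as Crypto.Hash.KMAC128 and Crypto.Hash.KMAC256:

    from Crypto.Hash import KMAC128

    key = b"a secret key of at least 16 bytes"
    mac = KMAC128.new(key=key, data=b"message to authenticate", mac_len=32)
    tag = mac.hexdigest()

    # The receiver recomputes the tag and verifies it; hexverify raises
    # ValueError on a mismatch.
    check = KMAC128.new(key=key, data=b"message to authenticate", mac_len=32)
    check.hexverify(tag)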
FSB
In cryptography, the fast syndrome-based hash functions (FSB) are a family of cryptographic hash functions introduced in 2003 by Daniel Augot, Matthieu Finiasz, and Nicolas Sendrier. Unlike most other cryptographic hash functions in use today, FSB can to a certain extent be proven to be secure. More exactly, it can be proven that breaking FSB is at least as difficult as solving a certain NP-complete problem known as regular syndrome decoding so FSB is provably secure. Though it is not known whether NP-complete problems are solvable in polynomial time, it is often assumed that they are not.
Gost
Gost was a set of Soviet standards applied to everything from electronics to chemicals. It standardized products across the USSR, meaning almost everything was interchangeable because it was compatible. Gost 28147-89 is the actual cypher, developed as a Soviet and Russian standard; it is typically referred to simply as Gost in cryptology circles. It is based quite closely on the US DES standard. The main concern with Gost is that its avalanche effect is not very quick to occur.
HAVAL
HAVAL is a widely popular hash function. It differs from many other hash functions in that it can generate hash values of different lengths: 128 bits, 160 bits, 192 bits, 224 bits or 256 bits. It was designed in 1992. This hashing function exhibits the avalanche effect, and so even a minor change in the string is likely to result in a very different hash value. Recent research, mostly by Xiaoyun Wang, has indicated that HAVAL has a number of weaknesses, perhaps putting its use on hold.