In the previous post, you saw why authentication and authorization matter for security. However, they are not enough to guarantee secure communication between two entities.
The post left an unanswered question:
How does the destination know if a message was modified after the origin delivers it?
Hashing addresses that concern.
TRY IT YOURSELF: You can find the source code of this post here.
Web Security Series
- Part 1: Authentication and Authorization
- Part 2: Hashing – Digest (You are here)
- Part 3: Symmetric and Asymmetric Cryptography
- Part 4: Digital Signatures
- Part 5: Digital Certificates and HTTPs
- Part 6: OAuth2 and OIDC
The Problem
In the previous post, you faced an attacker who was neither authenticated nor authorized but still managed to change the message content, as the John and Mary relationship example shows:

First, Mary authenticates John successfully, so she accepts the “Love you” message. Second, John asks if he can meet her parents; however, the toxic ex-boyfriend captures the message and changes it to “Hate you”. As Mary has already authenticated John, she thinks the second message is from him, so she accepts it.
In secure communication, it is unacceptable for an external entity to modify a message, so you need a way to check whether the message was modified.
Let’s talk about hashing.
Hashing Algorithms
A hash is a fixed-size string that summarizes a plain message. Hashing algorithms take a plain message (a sequence of bytes) as input, apply transformations, and output a hash string, as the following diagram shows:

A good hashing algorithm usually has the following features:
- Fixed size: Each hashing algorithm generates a hash of a fixed size, no matter how long the input message is, so you know exactly what the summary will look like. In addition, the longer the hash, the harder it is to attack.
- The generated hash is unique in a defined context: If you hash two different messages, the probability that both produce the same hash (this is called a collision) is very low in a defined context. In practice, that means each message has its own hash, so you can tell one message from another by comparing their hashes.
- Difficult to deduce the original message from the hash: Once the hash is created, getting back from it to the original message requires an enormous amount of computation, so it is very difficult to achieve.
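The first two features can be observed with a few lines of Java. The following is a minimal sketch, not code from the post's repository: the class name `HashProperties` and the helper `sha256Hex` are hypothetical, and SHA-256 is picked only as an example algorithm.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class, used only for illustration.
class HashProperties {

    // Hashes the UTF-8 bytes of a message with SHA-256 and returns lowercase hex.
    static String sha256Hex(String message) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(message.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    public static void main(String[] args) {
        // Build a much longer input by repeating the short message 1000 times.
        StringBuilder longMessage = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            longMessage.append("Love you");
        }

        String shortHash = sha256Hex("Love you");
        String longHash = sha256Hex(longMessage.toString());
        String changedHash = sha256Hex("Love you!");

        // Fixed size: both hashes are 64 hex characters (256 bits).
        System.out.println(shortHash.length() + " " + longHash.length()); // prints "64 64"
        // Different messages: one extra character gives a different hash.
        System.out.println(shortHash.equals(changedHash)); // prints "false"
    }
}
```

The third feature cannot be shown by running code: the point is precisely that there is no practical method that takes the hash and returns the message.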
You can find several hashing algorithms; the best-known ones are MD5 and the SHA family:
- MD5: Message Digest algorithm; creates a 128-bit hash. This algorithm is already compromised and shouldn’t be used to secure communication.
- SHA family: Secure Hash Algorithm; defines several levels of hashing, where the longest one generates a 512-bit hash.
Now, let’s see how hashing helps John and Mary communicate more safely.
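To check the hash sizes yourself, you can ask Java's `MessageDigest` for the digest length of each algorithm. This is a small sketch; the class name `DigestSizes` and the helper `hashBits` are hypothetical, and it assumes your Java runtime ships these algorithms (common JDKs do).

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class, used only for illustration.
class DigestSizes {

    // Returns the hash size in bits for a standard algorithm name.
    static int hashBits(String algorithm) {
        try {
            // getDigestLength() reports the size in bytes.
            return MessageDigest.getInstance(algorithm).getDigestLength() * 8;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Algorithm not available: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        for (String algorithm : new String[] {"MD5", "SHA-1", "SHA-256", "SHA-512"}) {
            System.out.println(algorithm + ": " + hashBits(algorithm) + " bits");
        }
    }
}
```

With a standard JDK, this should report 128 bits for MD5 and 512 bits for SHA-512, matching the sizes listed above.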
John and Mary Communication With Hashing
John and Mary change how they communicate by adding hashing algorithms. First, John builds the message differently:

John generates the hash of the message he wants to send and then adds that hash as part of the message.
Next, John sends the compound message to Mary, as follows:

Mary uses the hash and the message to check whether someone modified it, as you can see here:

Mary extracts the hash John sent and the message itself. She applies the same hashing algorithm John used to generate her own hash of the message and, finally, compares John’s hash with hers. If both are equal, the message didn’t change after John sent it.
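The exchange between John and Mary could be sketched in Java as follows. This is an illustration under assumptions, not the post's own code: the names `IntegrityCheck`, `HashedMessage`, `build`, and `isIntact` are hypothetical, and SHA-256 stands in for whatever algorithm the two agree on.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class, used only for illustration.
class IntegrityCheck {

    // The compound message: the plain text plus the hash of that text.
    static final class HashedMessage {
        final String text;
        final byte[] hash;

        HashedMessage(String text, byte[] hash) {
            this.text = text;
            this.hash = hash;
        }
    }

    static byte[] sha256(String text) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // John's side: hash the text and attach the hash to the message.
    static HashedMessage build(String text) {
        return new HashedMessage(text, sha256(text));
    }

    // Mary's side: recompute the hash of the text and compare it with the received hash.
    static boolean isIntact(HashedMessage received) {
        return MessageDigest.isEqual(sha256(received.text), received.hash);
    }

    public static void main(String[] args) {
        HashedMessage fromJohn = build("Can I meet your parents?");
        System.out.println(isIntact(fromJohn)); // prints "true"
    }
}
```

`MessageDigest.isEqual` compares the two byte arrays; both sides must use the same algorithm and the same text encoding, otherwise the hashes will never match.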
Toxic Ex-Boyfriend in the Middle
Mary’s toxic ex-boyfriend doesn’t know that John and Mary added hashing algorithms to their communication, so he tries to intercept the message between them and change it, as follows:

Mary receives the message altered by the ex-boyfriend and processes it as follows:

In this case, Mary realizes the message was changed somewhere along the way, as the hash the message carries doesn’t match the hash of the message itself, so she rejects it. Mary checked the integrity of the message, and the check failed.
Changing the Message and the Hash
The ex-boyfriend is not stupid: he realizes John and Mary are using hashing algorithms, so he improves his attack as follows:

He changes not only the message, but the hash too. He creates the hash of “Hate you” using the same algorithm John uses, and adds that hash to the message.
When the message reaches Mary, she sees the following:

Mary accepts the message “Hate you”, as the hash validation passes.
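Both attacks can be reproduced in a few lines. The sketch below is illustrative only: `TamperingDemo` and `maryAccepts` are hypothetical names, and SHA-256 is an assumed choice of algorithm.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class, used only for illustration.
class TamperingDemo {

    static byte[] sha256(String text) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // Mary's check: does the received hash match the hash of the received text?
    static boolean maryAccepts(String text, byte[] receivedHash) {
        return MessageDigest.isEqual(sha256(text), receivedHash);
    }

    public static void main(String[] args) {
        byte[] johnsHash = sha256("Can I meet your parents?");

        // Attack 1: only the text is replaced, so John's hash no longer matches.
        System.out.println(maryAccepts("Hate you", johnsHash)); // prints "false"

        // Attack 2: the text and the hash are both replaced, so the check passes.
        System.out.println(maryAccepts("Hate you", sha256("Hate you"))); // prints "true"
    }
}
```

The second attack works because anyone can compute the hash: the algorithm is public and needs no secret.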
Hashing helps to validate the integrity of a message (that it has not changed), but alone it is not enough to secure communication. This leaves you with a question:
How does the destination (Mary) know if a message and its hash were modified after the origin (John) delivers it?
Java Example: Saving Passwords
Saving passwords is a difficult task. You certainly shouldn’t save passwords in plain text: if someone steals your password database, they will see everybody’s passwords.
Hashing is one approach to saving passwords, as follows:

NOTE: Think twice before you save passwords yourself. There are a lot of considerations to take into account, so prefer to delegate that feature to an already mature application.
NOTE2: This post uses hashing to save passwords as an example, but it is not intended to be a definitive guide on how to save passwords.
John sends his password in plain text to an application. The application doesn’t save the password in plain text; instead, it generates the hash of the password and saves it in the database as John’s password hash.
When John tries to log in, the application does the same: it converts his password from plain text to a hash and compares it with the saved hash. If they are equal, John sent the right password.
As a hash is very difficult to revert to its original message, if the password database is leaked, the thief won’t be able to read the passwords directly.
The following code shows an example of using hashing algorithms in Java:
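```java
// Assumed test dependencies: JUnit 4 and AssertJ (the original snippet shows no imports).
import static org.assertj.core.api.Assertions.assertThat;

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import javax.xml.bind.DatatypeConverter;
import org.junit.Test;

public class HashingExamples {

    @Test
    public void md5() throws NoSuchAlgorithmException {
        String password = "123456";
        MessageDigest messageDigest = MessageDigest.getInstance("MD5");
        // Hash the UTF-8 bytes so the result does not depend on the platform charset.
        byte[] digest = messageDigest.digest(password.getBytes(StandardCharsets.UTF_8));
        // printHexBinary returns the digest as uppercase hexadecimal.
        assertThat(DatatypeConverter.printHexBinary(digest))
                .isEqualTo("E10ADC3949BA59ABBE56E057F20F883E");
    }

    @Test
    public void sha() throws NoSuchAlgorithmException {
        String password = "123456";
        MessageDigest messageDigest = MessageDigest.getInstance("SHA-512");
        byte[] digest = messageDigest.digest(password.getBytes(StandardCharsets.UTF_8));
        assertThat(DatatypeConverter.printHexBinary(digest))
                .isEqualTo("BA3253876AED6BC22D4A6FF53D8406C6AD864195ED144AB5C87621B6C233B548BAEAE6956DF346EC8C17F5EA10F35EE3CBC514797ED7DDD3145464E2A0BAB413");
    }
}
```
Here, you see how MessageDigest handles the hashing algorithms. The example uses MD5 and SHA-512.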
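The register and login flow described earlier could be sketched like this. It is an illustration only, in line with the notes above: `PasswordStore`, `register`, and `login` are hypothetical names, an in-memory map stands in for the database, and a plain SHA-512 hash is used just to mirror the text, not as a recommendation for real systems.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, used only for illustration.
class PasswordStore {

    // Stands in for the database table: user name -> password hash.
    private final Map<String, byte[]> hashesByUser = new HashMap<>();

    private static byte[] sha512(String password) {
        try {
            return MessageDigest.getInstance("SHA-512")
                    .digest(password.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 is not available", e);
        }
    }

    // Sign-up: only the hash is stored, never the plain-text password.
    void register(String user, String password) {
        hashesByUser.put(user, sha512(password));
    }

    // Login: hash the submitted password and compare it with the stored hash.
    boolean login(String user, String password) {
        byte[] stored = hashesByUser.get(user);
        return stored != null && MessageDigest.isEqual(stored, sha512(password));
    }

    public static void main(String[] args) {
        PasswordStore store = new PasswordStore();
        store.register("john", "123456");
        System.out.println(store.login("john", "123456")); // prints "true"
        System.out.println(store.login("john", "654321")); // prints "false"
    }
}
```

Note that the application never needs the original password again after sign-up; comparing hashes is enough to decide whether the login succeeds.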
Final Thought
In this post, you answered the following question using hashing algorithms:
How does the destination know if a message was modified after the origin delivers it? With hashing algorithms.
Hashing helps you check the integrity of a message, so you will know if the message changed after it was sent. However, hashing by itself is not enough to create secure communication between two entities.
In the next post, you will learn how cryptography adds another security layer to the communication.
If you liked this post and are interested in hearing more about my journey as a Software Engineer, you can follow me on Twitter so we can travel together.