When downloading something from a website, you've probably seen some obscure things like checksums, MD5 or SHA, SHASUMS beside or around the DOWNLOAD button. To the layman, those things are strange, aren't they?
But actually, they are your good friends! Those are known as checksums or checksum files. They are meant to help you verify that your downloaded file is bit-for-bit identical to the file published by the website, and not a broken download.
It's actually very simple to verify a file checksum. Simply use an app to calculate your file's checksum, and check that it matches the checksum published by the website.
In this guide, you will learn how to do this. I'm going to avoid using jargon, so you can understand. >.<
Three simple things to know
For beginners, a checksum is simply a number.
Checksum:
- A checksum is a short hexadecimal number calculated from a given file's content, using a checksum algorithm.
- A checksum is assumed to be unique to a given file:
  - Two identical files have the same checksum
  - Two different files have different checksums
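To make the two rules above concrete, here is a minimal Python sketch using the standard hashlib module. The helper name sha256_hex and the sample byte strings are made up for illustration; they are not part of any tool used in this guide.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA256 checksum of some bytes as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


original = b"hello world"
same_copy = b"hello world"
one_char_changed = b"hello World"  # only the "W" differs

print(sha256_hex(original))                                  # a 64-digit hexadecimal number
print(sha256_hex(same_copy) == sha256_hex(original))         # True: same content, same checksum
print(sha256_hex(one_char_changed) == sha256_hex(original))  # False: different content
```

Even a one-character change produces a completely different checksum, which is what makes the comparison useful.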
Checksum algorithm:
- A checksum algorithm is a mathematical formula.
- Some common examples: MD5, SHA1, SHA256, SHA512, etc.
- Presently, MD5 and SHA1 are broken algorithms because researchers have shown how to produce the same checksum for two different files. That means a received file whose checksum matches may be in two possible states: 1) intact, or 2) altered in a way that our checksum verification does not detect. That is bad! Therefore, it is recommended to use SHA256 or above.
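As a rough illustration of how these algorithms differ, the Python sketch below runs the same input through each of them and reports how many hexadecimal digits each checksum has. The input bytes and the variable names are arbitrary choices for this example.

```python
import hashlib

data = b"the same input for every algorithm"

# Map each algorithm name to the number of hexadecimal digits in its checksum.
lengths = {}
for name in ("md5", "sha1", "sha256", "sha512"):
    checksum = hashlib.new(name, data).hexdigest()
    lengths[name] = len(checksum)
    print(f"{name:>6}: {len(checksum):3d} hex digits, starts with {checksum[:16]}")
```

MD5 produces 32 hexadecimal digits, SHA1 produces 40, SHA256 produces 64, and SHA512 produces 128. A longer checksum is one reason (though not the only one) the newer algorithms are harder to fool.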
Hexadecimal:
- Humans use decimal, which is a base-10 numeric system (0 to 9). Hexadecimal is a base-16 numeric system (0 to 9, and A to F characters). For example, starting from 0:
  - Decimal and hexadecimal start from 0, and go up to 9
  - Decimal 10 is hexadecimal A
  - Decimal 11 is hexadecimal B
  - ...
  - Decimal 15 is hexadecimal F
  - Finally, decimal 16 is hexadecimal 10 (a second hexadecimal digit is needed)
  - As you can see, A to F are really just digits in hexadecimal
- You have probably seen HTML color codes like color: #ffa0c2. Now you know what they are! They are just 3 hexadecimal numbers: ff is decimal 255 (red), a0 is decimal 160 (green), and c2 is decimal 194 (blue). The same color code can also be written as color: rgb(255, 160, 194).
- Hexadecimal is case-insensitive. For example, a is the same as A, f is the same as F, and a0b1c2d3e4f5 is the same as A0B1C2D3E4F5.
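If you want to check these conversions yourself, the short Python sketch below does so with the built-in hex() and int() functions. The helper color_to_rgb is a made-up name for this example.

```python
# Decimal to hexadecimal (Python prefixes hexadecimal numbers with "0x").
print(hex(10))        # 0xa
print(hex(16))        # 0x10

# Hexadecimal to decimal.
print(int("F", 16))   # 15
print(int("10", 16))  # 16


def color_to_rgb(code: str) -> tuple[int, int, int]:
    """Split an HTML color code like '#ffa0c2' into decimal (red, green, blue)."""
    digits = code.lstrip("#")
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return (red, green, blue)


print(color_to_rgb("#ffa0c2"))  # (255, 160, 194)

# Upper and lower case are the same number.
print(int("a0b1c2d3e4f5", 16) == int("A0B1C2D3E4F5", 16))  # True
```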
So... A checksum is simply a hexadecimal number that we need to verify.
Using QuickHash-GUI
I personally like QuickHash-GUI, a cross-platform (Windows, Mac, and Linux) freeware tool that can verify file checksums, among many other things. We're going to use Windows for this demo, but the steps are the same on other operating systems (Mac and Linux).
Step 1: Download QuickHash-GUI
First, download QuickHash-GUI v3.x.x (for Windows, Mac, and Linux).
Save the file into your Downloads folder.
Extract the file, e.g. to C:\Users\user\Downloads\QuickHash-GUI-Windows-v3.3.1.
Navigate into the 64-bit folder, and launch Quickhash-GUI_x64.exe.
Now we are going to download a file to practice verifying its checksum.
Step 2: Download a file
Visit Wireshark, and under Download Wireshark, click on Windows PortableApps (64bit) to download.
Save the file to your Downloads folder.
Step 3: Verify the downloaded file
Now in Explorer, drag-and-drop the downloaded file (in my case, WiresharkPortable64_4.0.4.paf.exe) into the QuickHash-GUI window.
In QuickHash-GUI, on the left side under Algorithm, select SHA256. See that QuickHash-GUI shows the calculated SHA256 checksum (in my case, it is D72789CE7CA3715C044AC0913BA0603DF89699EBB6F3839547D64AC1FD9A1518).
Now we only need to verify that our file's checksum matches Wireshark's published checksum.
Now return to the Wireshark website, scroll down to the Verify Downloads section, and click on signatures file.
You will see the checksums of all files published by Wireshark. Find the file you downloaded (in my case, WiresharkPortable64_4.0.4.paf.exe), and copy its SHA256 checksum.
Now go back to QuickHash-GUI, click the Clear Hash Field button, then paste the clipboard into the Expected Hash value box. You should see Expected hash MATCHES the computed file hash!, meaning the checksums match and the file is verified!
You may click OK to finish.
Using the Command Line
Instead of using a Graphical User Interface (GUI) tool like QuickHash-GUI in Step 1, we may also verify checksums using the command line.
Assuming we've completed Step 2, we just need to verify that our downloaded file's checksum is D72789CE7CA3715C044AC0913BA0603DF89699EBB6F3839547D64AC1FD9A1518.
Windows
For Windows, we can use Windows PowerShell.
Click Start and open PowerShell.
Now type the following, pressing Enter after each line:
While typing, you can press TAB to autocomplete file names, parameters, and parameter values. Try it, it greatly speeds up typing.
cd Downloads
Get-FileHash WiresharkPortable64_4.0.4.paf.exe -Algorithm SHA256
You will see an output showing the SHA256 checksum. See that it matches the expected value.
If you can't trust your eyes, run the following to verify programmatically. You will see the same output if the checksum matches, and no output if it does not:
cd Downloads
Get-FileHash WiresharkPortable64_4.0.4.paf.exe -Algorithm SHA256 | Where-Object {
$_.Hash -eq 'D72789CE7CA3715C044AC0913BA0603DF89699EBB6F3839547D64AC1FD9A1518'
}
Congratulations, the downloaded file is verified.
Mac
For Mac, we can use zsh or bash.
Open Terminal, and type the following:
While typing, you can press TAB to autocomplete and list files. Try it, it greatly speeds up typing.
cd Downloads
shasum -a 256 WiresharkPortable64_4.0.4.paf.exe
You will see an output showing the SHA256 checksum. See that it matches the expected value.
If you can't trust your eyes, run the following to verify programmatically. You should see the message WiresharkPortable64_4.0.4.paf.exe: OK
Note the two spaces between the checksum and the file name.
cd Downloads
echo 'd72789ce7ca3715c044ac0913ba0603df89699ebb6f3839547d64ac1fd9a1518  WiresharkPortable64_4.0.4.paf.exe' | shasum -a 256 -c -
Congratulations, the downloaded file is verified.
Linux
For Linux (e.g. Ubuntu), we can use sh or bash.
Open Terminal, and type the following:
While typing, you can press TAB to autocomplete and list files. Try it, it greatly speeds up typing.
cd Downloads
sha256sum WiresharkPortable64_4.0.4.paf.exe
You will see an output showing the SHA256 checksum. See that it matches the expected value.
If you can't trust your eyes, run the following to verify programmatically. You should see the message WiresharkPortable64_4.0.4.paf.exe: OK
Note the two spaces between the checksum and the file name.
cd Downloads
echo 'd72789ce7ca3715c044ac0913ba0603df89699ebb6f3839547d64ac1fd9a1518  WiresharkPortable64_4.0.4.paf.exe' | sha256sum -c -
Congratulations, the downloaded file is verified.
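If you would rather have one script that behaves the same on Windows, Mac, and Linux, the comparison can also be sketched in Python. This is only an illustration under a few assumptions: the function names file_sha256 and verify_file are made up, and the demo hashes a small temporary file instead of the real Wireshark download so that it runs anywhere.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Compute a file's SHA256 checksum, reading it in chunks to save memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: Path, expected: str) -> bool:
    """Compare the computed checksum with the published one, ignoring case."""
    return file_sha256(path) == expected.strip().lower()


# Demo with a stand-in file, so the example is self-contained.
content = b"pretend this is a downloaded file"
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(content)
    published = hashlib.sha256(content).hexdigest().upper()  # websites often show upper case
    ok = verify_file(sample, published)
    tampered = verify_file(sample, "0" * 64)  # a checksum that cannot match

print(ok, tampered)  # True False
```

For the file in this guide, you would call verify_file with the path to WiresharkPortable64_4.0.4.paf.exe in your Downloads folder and the checksum copied from the Wireshark website.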
Cheat sheet
We now know how to verify a file.
Here's a quick cheat sheet you can copy and paste into your notebook.
- Download file from website
- Find the file's checksum from website
  - Try to look for SHA256 and above. If not, use MD5 or SHA1.
  - If website provides checksum directly, copy it to clipboard, e.g. d72789ce7ca3715c044ac0913ba0603df89699ebb6f3839547d64ac1fd9a1518.
  - If website provides checksum files, e.g. MD5SUMS, SHASUMS, SHA256SUMS, SHA512SUMS, .md5, .sha1, .sha256, .sha512, download it and open it in a text editor, look for the downloaded file's checksum, and copy it to clipboard.
- Verify checksum
QuickHash-GUI:
- Drag and drop file into QuickHash-GUI
- In QuickHash-GUI, on the left side under Algorithm, select the algorithm, e.g. MD5, SHA-1, SHA256, or SHA512.
- Paste the website's provided checksum in the Expected Hash value box to verify.
Windows PowerShell:
Get-FileHash .\path\to\file.zip -Algorithm MD5
Get-FileHash .\path\to\file.zip -Algorithm SHA1
Get-FileHash .\path\to\file.zip -Algorithm SHA256
Get-FileHash .\path\to\file.zip -Algorithm SHA512
MacOS Terminal:
md5 ./path/to/file.zip
shasum -a 1 ./path/to/file.zip
shasum -a 256 ./path/to/file.zip
shasum -a 512 ./path/to/file.zip
Linux Terminal:
md5sum ./path/to/file.zip
sha1sum ./path/to/file.zip
sha256sum ./path/to/file.zip
sha512sum ./path/to/file.zip
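When a website provides a checksum file rather than a single checksum, looking up your file by eye can be tedious. The Python sketch below shows one way to do the lookup, assuming the file uses the "checksum, two spaces, file name" layout that sha256sum prints. Not every website uses that layout, so treat this as an illustration only. The function name parse_checksum_file and the second entry (empty.txt) are made up for the example.

```python
def parse_checksum_file(text: str) -> dict[str, str]:
    """Map each file name to its checksum, for 'checksum  filename' lines."""
    checksums = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or unrecognised lines
        checksum, name = parts
        # Some tools prefix the file name with "*" (binary mode); drop it.
        checksums[name.lstrip("*")] = checksum.lower()
    return checksums


# Example contents of a SHA256SUMS-style file (second entry is hypothetical).
sums_text = """\
d72789ce7ca3715c044ac0913ba0603df89699ebb6f3839547d64ac1fd9a1518  WiresharkPortable64_4.0.4.paf.exe
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  empty.txt
"""

checksums = parse_checksum_file(sums_text)
print(checksums["WiresharkPortable64_4.0.4.paf.exe"])
```

The value printed is the checksum to paste into QuickHash-GUI or to compare against your command line output.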
Regarding trust
If you followed through, you have learnt how to verify a file's integrity. This means we know that our file is bit-for-bit the same as the original file on the website or server.
However, verifying a file's integrity (correct data) does not verify its authenticity (correct sender). In layman's words, just because you received an exact copy of a file doesn't mean you received it from the correct source. If the website is a fake (spoofed) website serving a fake (spoofed) file along with a matching checksum, we are not protected.
So do we trust the website most of the time? It will take another article to explain things, but in simple terms, if your browser shows a Green Lock icon and the web address shows https:// instead of http://, you can assume data authenticity: the website belongs to its rightful owner, because only the owner can get a Certificate for it. The file transfer is encrypted using this Certificate and the file is transferred without modification. Think of Certificates as a centralized identity, like your passport is a public identity that we trust because we first trust the passport issuer.
And in case you are interested, there's the mess of GPG and PGP. Think of this as a decentralized identity, where everybody can publish their own identity, and the more people trust an identity, the more likely it can be trusted. This paradigm has not been very successful because it's difficult for a layman to know how to even "get started" trusting anyone, compared to the centralized identity where the layman just has to trust the passport issuer.
Final thoughts
It's important to understand how to verify files because, even if one trusts a secure website (e.g. one that starts with https://) with a Green Lock on the browser, there is still no certainty that the file arrives intact, and in the very worst case, a trojan on the computer might have modified the downloaded file just before one opens it. To be very sure that a file is identical to the original, always verify its checksum. It is good practice, and it's a very simple thing to do.
Advanced users might already know everything in this article. But a ton of normal people don't.