Many of us have heard how a hacking group managed to break into a website or service and steal customer data, which often leads to panic about how secure our passwords are. After all, if the hackers stole our user data, wouldn’t that mean that they also have our most secret passwords? Cue the cold sweat and the rush to update all of our passwords from Fluffy2018 to Fluffy2019, right?
The truth is that if a service provider is using standard industry practices in password storage, most of the time you do not have anything to worry about. Again, as long as the provider is using standard industry practices.
At the heart of those practices is hashing, which has been around for some time but is still growing and changing. But what is hashing?
Hashing is the process of taking a given set of data, running a given calculation on that data, and getting a result that looks nothing like the original data and is, for practical purposes, unique to it. Additionally, this result cannot be reversed and translated back into the original data. You may be wondering what good this is if you can’t reverse the process, but it turns out to be very useful in the computer world. For example, hashing can be used to verify data integrity by creating ‘signature’ values, or to store and look up frequently used data by creating ‘key’ values.
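To make that less abstract, here is a minimal sketch using Python's standard hashlib module. SHA-256 is simply one widely used algorithm picked for illustration, and the helper name sha256_hex is made up for this example. It shows that the same input always gives the same result, and that a one-character change gives a completely different result of the same length.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as a string of 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

first = sha256_hex("Fluffy2018")
second = sha256_hex("Fluffy2019")

print(first)
print(second)
print(len(first), len(second))             # the length never changes
print(first == sha256_hex("Fluffy2018"))   # same input, same result
print(first == second)                     # one character changed, different result
```

Nothing in the printed values hints at the word Fluffy, which is the whole point.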
The process of hashing is based on a mathematical algorithm with one goal: ‘map’ as many inputs (data) to outputs (keys or hash results) with as few ‘collisions’ as possible. Since a hashing algorithm can take an entire sentence and produce a fixed-length result of as few as 8 characters, you can imagine that at some point another sentence will produce the same sequence of characters. This is referred to as a collision. The art in making a good hashing algorithm is to write one that maps the largest domain of data to a compact range of results, with the fewest collisions possible.
The easiest way to imagine how a hashing algorithm works is to think of the following scenario:
Picture the DMV in your city, also known as the 8th circle of hell. Let’s pretend that your city is very small, with about 100 people. Also, let’s assume that none of these 100 people are related and that this city is therefore actually diverse… ahem. Now imagine that inside the office there is a separate line for each last name. Chances are all 100 people could come in and get in their line without finding someone else standing there already. Even so, you can see where a problem could arise: as you increase the scope from this city to, say, the entire world, you are guaranteed to see multiple collisions. After all, you are the only John Doe in the world, right?
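The DMV picture can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the office has 100 lines, the helper names line_for and count_collisions are invented, and the residents are placeholder names. Each last name is hashed and the result is squeezed into one of the 100 lines, so once more than 100 different names show up, somebody has to share.

```python
import hashlib

LINES = 100  # assumed number of lines in our imaginary DMV office

def line_for(last_name: str) -> int:
    """Map a last name to one of the LINES lines using a hash."""
    digest = hashlib.sha256(last_name.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % LINES

def count_collisions(names) -> int:
    """Count how many people arrive at a line that is already occupied."""
    occupied = set()
    collisions = 0
    for name in names:
        line = line_for(name)
        if line in occupied:
            collisions += 1
        occupied.add(line)
    return collisions

# Placeholder residents: 1000 distinct names competing for 100 lines.
residents = [f"Resident{i}" for i in range(1000)]
print(count_collisions(residents))  # at least 900, since only 100 lines exist
```

Real hashing algorithms have vastly more than 100 possible results, which is why collisions are rare in practice, but the principle is the same.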
So, how is hashing used for saving passwords?
If you’ve made it to this point, you are probably wondering why service providers would want to use a hashing algorithm to store your password if they can’t reverse the result to check that your password matches. The answer is that they don’t want to see your password, or to be able to reverse the result to see it.
In fact, service providers find this arrangement most convenient, as the last thing they want is to be held responsible when someone claims that their password was exposed in a compromise. By hashing your password and storing only the result, the provider simply needs to do the same thing every time you log in: hash what you typed and compare the result to their stored result. Any would-be thief who steals all the hashed results would not be able to see what your original password was. This does, however, leave another way for the thief to break into your account.
If a hacker manages to steal a website’s user login table, the hacker could try (either through random generation or through iteration) to determine what your original password was by hashing different passwords until they get the same hash result. The catch is that 1) the hacker needs to know which hashing algorithm was used to generate the result, and 2) the password they come up with may not be your actual password, but rather some value that happens to map to the same hash result. So let’s say your password, Fluffy2019, just happens to map to the same hash result as AAABB123; the hacker can then simply log in using AAABB123 in place of your password. With modern algorithms such accidental matches are extremely rare, so in practice the guess that works is usually your real password, which is why short and common passwords fall first.
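Here is a rough sketch of that sign-up and login flow. The users dictionary stands in for the provider's login table, and the function names are invented for this example. A bare SHA-256 hash is used only to mirror the description so far; real systems layer more protection on top.

```python
import hashlib
import hmac

users = {}  # hypothetical login table: username -> stored hash

def hash_password(password: str) -> str:
    """Hash a password; only this result is ever stored."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def register(username: str, password: str) -> None:
    """Store the hash of the password, never the password itself."""
    users[username] = hash_password(password)

def log_in(username: str, password: str) -> bool:
    """Hash the attempt and compare it to the stored hash."""
    stored = users.get(username)
    if stored is None:
        return False
    # compare_digest compares in constant time to avoid leaking timing hints
    return hmac.compare_digest(stored, hash_password(password))

register("john", "D0dger$")
print(log_in("john", "D0dger$"))   # True
print(log_in("john", "Dodgers"))   # False
print(users)                       # only the hash is in the table
```

The provider never needs to know what the password was, only whether the two hashes match.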
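The guessing attack described above might look something like this in its simplest form. It assumes the thief already knows that plain SHA-256 was used, and the word list is a tiny made-up stand-in for the huge lists of leaked and common passwords that real attackers work from.

```python
import hashlib

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def guess_password(stolen_hash: str, candidates):
    """Hash each candidate until one matches the stolen hash."""
    for candidate in candidates:
        if sha256_hex(candidate) == stolen_hash:
            return candidate
    return None  # nothing in the list matched

# Pretend this value came out of a stolen login table.
stolen = sha256_hex("Fluffy2019")

# Made-up word list: a pet name followed by a year.
wordlist = [f"Fluffy{year}" for year in range(2000, 2030)]

print(guess_password(stolen, wordlist))  # Fluffy2019
```

Thirty guesses is all it took, which says more about the password than about the hashing.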
Let's add some flavor
So you’re probably wondering, how do I defend against someone trying to guess my password by matching the hash results? The answer is salt. Or salting, rather.
Service providers are aware of the shortcomings of hashing, and that as much as they try to ensure that your data is secure, there will always be bad actors out there who are going to find ways to break in through some kind of unforeseen method and steal their/your data. The only solution then is to make sure that the compromised data is as useless as possible to an outside attacker.
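What programmers realized is that if you take someone’s password, add some extra value to it (the ‘salt’), and then hash that combination, the result will be different from the hash of the password alone. In common practice the salt is a random value generated for each user and stored right next to the hash. This means that two users with the same password end up with different hash results, and an attacker cannot rely on a precomputed list of hashes for common passwords; every guess has to be recomputed for every single user. The salt does not have to be a secret to do this job. Some providers additionally mix in a secret value that is kept away from the login table (often called a ‘pepper’), and as long as that value stays secret, a stolen list of hashes becomes far harder to attack. Keep in mind that salting protects the stored data, not a weak password: if an attacker does manage to guess your password, it is still your real password.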
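Below is a minimal sketch of salting, under a few assumptions: it uses PBKDF2 from Python's standard library (one of several algorithms designed for passwords), a random 16-byte salt per user, and an iteration count of 100,000 picked only to keep the example quick. The function names are invented, and a real provider would follow current guidance for the algorithm and its settings.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed value for this example; real systems tune this

def hash_with_salt(password: str, salt: bytes = None):
    """Return (salt, hash). A fresh random salt is made if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest

def verify(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare the results."""
    _, digest = hash_with_salt(password, salt)
    return hmac.compare_digest(digest, expected)

# Two users pick the same password but get different stored hashes.
salt_a, hash_a = hash_with_salt("Fluffy2019")
salt_b, hash_b = hash_with_salt("Fluffy2019")
print(hash_a == hash_b)                        # False
print(verify("Fluffy2019", salt_a, hash_a))    # True
print(verify("Fluffy2018", salt_a, hash_a))    # False
```

In this sketch the salt is stored alongside the hash so the provider can repeat the calculation at login. Its job is to make every stored hash different, so one round of guessing cannot be reused across a whole stolen table.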
Phew, guess I'm safe then...
Well, actually, no.
Hackers are always trying to find new ways to obtain passwords or compromise a secure system. In fact, most of the time, these hackers use something called an ‘exploit’ or they try to ‘phish’ for your password to break in. An exploit takes advantage of a flaw in a system that was overlooked and that, if used correctly, can allow an attacker to gain access. It’s equivalent to a building that someone spent millions on securing with barbed-wire fencing, vicious attack dogs and security guards (yes, they are vicious too), cameras, access cards, retinal scanners, lasers, guns, and rabid pigeons, only to forget to lock the back door in the dark alley behind the building.
‘Phishing’ is the act of trying to trick a user into providing their login credentials. Hackers will send out fraudulent emails that look like they are from a company or site that you use. The email will contain a link that, when clicked, takes you to a fraudulent website designed to mimic the real site. This fake site, however, serves only one purpose: to obtain your login. Typically, after a user enters their login information, the fake site will redirect them to the actual site in the hope that the user will simply think that they must have entered their login incorrectly.
For their part, service providers use many methods to keep their systems secure. You have probably noticed that your bank occasionally asks you to verify your identity when logging in because they ‘did not recognize the computer you are using.’ Or maybe you have noticed how websites lock you out after you guess your password incorrectly too many times. These are just a handful of the methods providers use to limit the ways that hackers can break in.
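The lockout idea is simple enough to sketch. This is only an illustration: the class name LoginGuard, the threshold of 5, and the in-memory counter are all assumptions, and real sites typically add timed unlocks, alerts, and other checks on top.

```python
class LoginGuard:
    """Tracks consecutive failed logins per username (in memory only)."""

    def __init__(self, max_failed: int = 5):
        self.max_failed = max_failed  # assumed threshold
        self.failed_counts = {}

    def is_locked(self, username: str) -> bool:
        return self.failed_counts.get(username, 0) >= self.max_failed

    def record_attempt(self, username: str, password_ok: bool) -> bool:
        """Return True only if this login should be allowed."""
        if self.is_locked(username):
            return False  # even a correct password is refused while locked
        if password_ok:
            self.failed_counts.pop(username, None)  # reset the counter
            return True
        self.failed_counts[username] = self.failed_counts.get(username, 0) + 1
        return False

guard = LoginGuard(max_failed=3)
for _ in range(3):
    guard.record_attempt("john", password_ok=False)
print(guard.is_locked("john"))                        # True
print(guard.record_attempt("john", password_ok=True)) # False, account is locked
```

A limit like this turns millions of online guesses into a handful, although it does nothing for a login table that has already been stolen.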
In the end, the most an end user can do to protect themselves and their account is to use a good, strong password and store it safely. Many users complain that it is hard to come up with a memorable, complex password, but here are just a few ways you can make it easier (from least to most effective):
- (Good, but not great) Take something you like. For example, John loves the Dodgers (please don’t judge John). He could use either Dodgers or Baseball as the base of his new password. He can then spice it up by looking at the word and seeing what he can replace with special characters and numbers. So Dodgers could be changed to D0dger$, where a zero replaces the O and a dollar sign replaces the S. This is a better password than if John had simply chosen Baseball as his base, since Baseball is a much more generic concept. The idea is to have a range of characters: not just A-Z, but also 0-9 and special characters (that are accepted by the provider). Also, the more unique the base, the better.
- (Better) Use a password generator. Password generators are becoming very common and are already present in most modern operating systems. Web browsers and password manager applications also allow you to generate random, complex passwords. If you are looking for a memorable complex password, some of these generators let you select that option and will attempt to generate one that is somewhat memorable. Here is a link to a free password generator: https://xkpasswd.net/s/
- (Best) Use a password manager and 2FA (Two-Factor Authentication). The best protection a user can get is to ensure that their account is secured with two of the following three: something you know (a password), something you are (a fingerprint or other biometrics), and something you have (a phone, key fob, access card, etc.). The combination most commonly used online is something you know and something you have, which is why banks and other secure sites try to text you or call you with a verification number.
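If you are curious what a password generator does under the hood, here is a small sketch using Python's secrets module, which is meant for security-sensitive randomness. The function names and the tiny word list are placeholders for this example; real generators draw from word lists with thousands of entries, and some providers reject certain special characters.

```python
import secrets
import string

def random_password(length: int = 16) -> str:
    """Build a password from random letters, digits, and special characters."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))

# Placeholder word list; a real one would be far longer.
WORDS = ["fluffy", "dodgers", "pigeon", "laser", "alley", "salt", "fence", "bank"]

def passphrase(words, count: int = 4, separator: str = "-") -> str:
    """Join randomly chosen words into a more memorable passphrase."""
    return separator.join(secrets.choice(words) for _ in range(count))

print(random_password())      # different on every run
print(passphrase(WORDS))      # e.g. four words joined by dashes
```

The random version is what a password manager would store for you, while the passphrase style is closer to the ‘memorable’ option that some generators offer.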