Hashed Passwords Are Personal Information Under U.S. Law
On January 22, 2021, a bad actor group known as Shiny Hunters, notorious for exfiltrating large databases of customer information from a clothing retailer, claimed to have cracked more than 150,000 passwords hashed with SHA-256. This activity raised legal questions that go beyond the exfiltration of the data itself.
So, what is SHA-256, you may ask? To understand the legal implications of cracking hashed passwords and the potential impact on litigation, it helps to have some technical background on hashing and why it is used.
Encryption vs Hashing
Encryption algorithms take an input and a secret key and generate seemingly random output called ciphertext. This operation is reversible: anyone who knows or obtains the secret key can decrypt the ciphertext and read the original input.
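To make the reversibility concrete, here is a minimal sketch of encryption and decryption with a shared secret key. It uses a simple XOR one-time pad purely as an illustration; the helper name `xor_bytes` and the sample plaintext are assumptions made for this example, and real systems would rely on a vetted cipher rather than this toy construction.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the matching byte of key."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = b"my secret password"
key = secrets.token_bytes(len(plaintext))  # the secret key (random bytes)

ciphertext = xor_bytes(plaintext, key)     # encrypt: output looks random
recovered = xor_bytes(ciphertext, key)     # decrypt: same key reverses it

print(recovered == plaintext)  # True
```

Whoever holds `key` can turn `ciphertext` back into the original input, which is exactly the property that hashing does not have.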
Hashing functions are not reversible. Hashing performs a one-way transformation on a password, turning the password into another string, called the hashed password. “One-way” means that it is practically impossible to go the other way – to turn the hashed password back into the original password. In authentication systems, when users create a new account and input their chosen password, the application code passes that password through a hashing function and stores the result in the database. When the user wants to authenticate later, the process is repeated, and the result is compared to the value from the database. If it is a match, the user provided the right password.
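The account creation and login steps described above can be sketched as follows. This is an illustrative outline only: the in-memory `user_db` dictionary and the function names `register` and `authenticate` are hypothetical, and plain SHA-256 is used here simply because it is the algorithm this article discusses.

```python
import hashlib
import hmac

# Hypothetical stand-in for the application's database table.
user_db = {}


def hash_password(password: str) -> str:
    """One-way transformation of a password into a hex digest."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    """Store only the hashed password, never the plaintext."""
    user_db[username] = hash_password(password)


def authenticate(username: str, password: str) -> bool:
    """Re-hash the submitted password and compare it to the stored hash."""
    stored = user_db.get(username)
    if stored is None:
        return False
    # compare_digest compares the two strings in constant time.
    return hmac.compare_digest(stored, hash_password(password))


register("alice", "correct horse battery staple")
print(authenticate("alice", "correct horse battery staple"))  # True
print(authenticate("alice", "wrong guess"))                   # False
```

Note that the database never holds the password itself, only the value produced by the hashing function.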
Hashing is a cryptographic process that can be used to validate the authenticity and integrity of various types of input. It is widely used in authentication systems to avoid storing plaintext passwords in databases, but it is also used to validate files, documents, and other types of data. Incorrect use of hashing functions can nevertheless lead to serious data breaches. For example, developers can make the implementation error of choosing a hashing function that is known to be insecure and vulnerable to brute-force cracking attacks. Not hashing sensitive data in the first place is even worse. Hashing is almost always preferable to encryption when storing passwords in databases because, in the event of a compromise, attackers will not get direct access to the plaintext passwords.
To briefly explain a very complicated topic, SHA-256 stands for Secure Hash Algorithm 256-bit. It is used for cryptographic security, including password hashing. (For reference, a SHA-256 hash is 256 bits long, so there are 2^256 possible hash values.) Here is an example of what a SHA-256 hash of the string “password” looks like:
5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8
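The SHA-256 hash of the string “password” can be reproduced with a few lines of Python using the standard `hashlib` module. This is a small sketch for illustration; the variable names are arbitrary.

```python
import hashlib

# Hash the string "password" and render the result as hexadecimal text.
digest = hashlib.sha256(b"password").hexdigest()

print(digest)
# 5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8

# 64 hexadecimal characters at 4 bits each gives the 256 bits in the name.
print(len(digest) * 4)     # 256

# 2^256 possible hash values is a number with 78 decimal digits.
print(len(str(2 ** 256)))  # 78
```

The same input always produces the same hash, which is why a hash of a common password is easy to recognize once someone has seen it before.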
There are several hashing algorithms, but some have been deprecated and should not be used. While most industry-standard hashing algorithms make it effectively impossible, under current computational limits, to work backward from a hashed password to the original password, other techniques can still make hashed passwords insecure. If a bad actor obtains a set of hashed passwords, one simple attack vector is to generate a table of possible passwords, run each one through the hashing algorithm to produce its hash value, and then compare those values against the stolen hashes to look for matches. Given the processing power available, bad actors continue to generate enormous tables that contain anything from every possible combination of values for shorter passwords to lists of variants of common and known passwords. Simply stated, if someone uses a common password (such as password, 123456, or qwerty) or a word found in the dictionary, the likelihood that the bad actor can determine the original password is higher.
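The table-based attack described above can be sketched in a few lines. Everything in this example is hypothetical: the "leaked" hashes are generated on the spot from made-up passwords, and the candidate list is a tiny stand-in for the enormous tables real attackers use.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical stolen hashes: two common passwords and one strong one.
leaked_hashes = {
    sha256_hex("qwerty"),
    sha256_hex("123456"),
    sha256_hex("T7#vLq9!xZ2p"),
}

# Attacker's table: candidate passwords mapped from their hash values.
candidates = ["password", "123456", "qwerty", "letmein"]
lookup_table = {sha256_hex(c): c for c in candidates}

# Compare the leaked hashes against the table to find matches.
cracked = {h: lookup_table[h] for h in leaked_hashes if h in lookup_table}

print(sorted(cracked.values()))  # ['123456', 'qwerty']
```

The two common passwords are recovered immediately, while the strong password stays unknown because it never appears in the attacker's table.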
As previously mentioned, a hash cannot be decrypted. However, with enough time and computing power, it can be cracked through brute-force attempts or, more likely, by comparing it against the hashes of known strings.
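A brute-force attempt simply tries every possible combination until a hash matches. The sketch below is illustrative only; the function name `brute_force`, the lowercase-only alphabet, and the four-character limit are assumptions chosen so the example finishes quickly.

```python
import hashlib
import itertools
import string


def brute_force(target_hash, alphabet=string.ascii_lowercase, max_length=4):
    """Try every string up to max_length; return the match or None."""
    for length in range(1, max_length + 1):
        for combo in itertools.product(alphabet, repeat=length):
            guess = "".join(combo)
            guess_hash = hashlib.sha256(guess.encode("utf-8")).hexdigest()
            if guess_hash == target_hash:
                return guess
    return None


# Hypothetical target: the hash of a very short, weak password.
target = hashlib.sha256(b"abc").hexdigest()
print(brute_force(target))  # abc

# The search space grows quickly with password length.
print(26 ** 4)  # 456976 four-letter lowercase passwords
print(26 ** 8)  # 208827064576 eight-letter lowercase passwords
```

Short passwords fall quickly, while each additional character multiplies the work, which is why time and computing power are the limiting factors.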
The Legal Aspects
Whether hashed values fall within the definition of personally identifiable information (PII) is critically important. Under some breach notification statutes, such as the California Consumer Privacy Act (CCPA) or the California Privacy Rights Act (CPRA), whether the hashed information permits access to an online account can indicate whether it is PII.
Under the European General Data Protection Regulation (GDPR), hashed passwords are categorized as personal information. GDPR and the National Institute of Standards and Technology (NIST) advise against using some hash algorithms.
Conversely, U.S. law recognizes hashed data as a secure, de-identified data point. De-identification means that the data “cannot reasonably identify, relate to, describe, be capable of being associated with, or be linked, directly or indirectly, to a particular consumer.” In the context of the CCPA, information is not “personal information” if it has been “de-identified.” By comparison, the EU position under the GDPR is that “while the technique of…hashing data reduces the likelihood of deriving the input value…calculating the original attribute value hidden behind the result of a hash function may still be feasible within reasonable means,” and, therefore, the hashed output should be considered pseudonymized data that remains subject to the GDPR.
Whether a hashed password “permits access” to an online account is a multifaceted question that has not yet been fully addressed from a legal standpoint. There are two competing arguments as to whether a hashed password could be considered PII:
- Even if the actual password might be determined from the hashed value, the hash itself does not permit access as it always requires some additional hacking effort to determine the plain text password.
- If the password can be determined for even one of the hashed passwords, then the whole set permits access.
Most likely, if this issue is fully litigated, courts will land somewhere between these two positions, but that remains to be seen. Another confounding factor is that advances in computing power will continue to move the needle on the effectiveness of existing attacks.
Companies that suffer security breaches often misuse the term “encryption” in their public disclosures and advise customers that their passwords are secure because they were encrypted. This is probably because the general audience is not very familiar with the meaning of hashing, so public relations departments want to avoid confusion. However, using the terms “encryption” and “hashing” interchangeably makes it hard for outside observers to assess the risks associated with a breach: if the passwords were truly encrypted, they can be recovered by anyone who obtains the key, so the risk is higher than if they were hashed.
As cyberattacks continue to grow at a dramatic rate, coupled with what appears to be a renewed interest among cybercriminals in cracking hashed passwords, it seems likely that hashing will come under increased scrutiny both in the courts and in the minds of the public at large. It will be important for companies not only to review their hashing techniques and employ best practices but also to pay attention to the technical advances available to both the company and the hackers. Security awareness training that covers the importance of strong passwords should not be overlooked either.
Agio shares information meant not only to make you aware of key changes in prominent cybersecurity and privacy standards, laws, and frameworks, but also to provide context and clarity on these rapidly changing and emerging topics.