One Breach to Crack ‘Em All! Insights from Password Breaches
A few years ago, I got the idea to analyze breach data. These datasets hold a wealth of fascinating information, but the most intriguing aspect, in my opinion, is the passwords. About six years ago, I uploaded a combined Finnish and Swedish password list to my GitHub (yikes, time really flies!).
Recently, I gathered a large collection of password breach data from various sources. The total size of these breaches amounted to several terabytes. This breach data is publicly available, often shared as torrents or through phishy download links on suspicious sites, so gathering it was... a journey. Using some advanced tools like cat, grep and sed, plus some filtering, I extracted Finnish and Swedish passwords from the data.
Initially, my plan was to create a heatmap overlaid on a keyboard image to reveal patterns in how the passwords were likely typed. By analyzing hand positions, this could have shown whether the passwords were typed by humans. However, as I delved deeper, I realized I could conduct a more detailed analysis. I started categorizing the passwords into groups and examining which characters were most commonly used in specific positions within the passwords. This allowed me to generate heatmaps that make it easy to see the most common characters in any position.
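The per-position character counting described above can be sketched roughly as follows. This is a minimal illustration, not the actual analysis script: the function name position_frequencies and the sample passwords are made up for the example.

```python
from collections import Counter


def position_frequencies(passwords):
    """Return one Counter per character position (index 0 = first character)."""
    counters = []
    for pw in passwords:
        for i, ch in enumerate(pw):
            # Grow the list the first time a longer password is seen.
            if i == len(counters):
                counters.append(Counter())
            counters[i][ch] += 1
    return counters


# Made-up sample data, only to show the shape of the result.
sample = ["salasana1", "sommar12", "kissa123"]
freqs = position_frequencies(sample)
for position, counter in enumerate(freqs):
    print(position, counter.most_common(3))
```

Each Counter can then be fed into whatever plotting code draws the heatmap, one column per position.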
I also developed scripts to analyze the characters used in passwords based on their length. From this data, I created Hashcat masks to optimize password cracking. Additionally, I included a series of masks that could crack password hashes in 10% intervals. This helped to create an efficient cracking strategy tailored to the example dataset (the analyzed passwords).
Part of my analysis involved measuring how quickly passwords could be cracked with an Nvidia RTX 4070 Ti Super GPU using the generated Hashcat masks. This made it easy to see how long it would take to crack a password hash locally on that GPU. For comparison, I also tested how long it would take to brute-force a login password for a given user.
Without further ado, let's dive into the data. I’ll present the more interesting results here.
Results
A Python program was developed to analyze a list of passwords based on their length and the types of characters used in each. The analysis included generating Hashcat command output for various custom Hashcat masks, each designed to crack a specific portion of the passwords. Hashcat masks are patterns that tell Hashcat which types of characters (letters, numbers, or symbols) to try in specific positions, which makes cracking significantly faster than attempting every possible character combination. To speed up the guessing process, masks were grouped into simple percentage-based categories in 20% increments, though the success of this approach depends on whether the target password falls within one of these groups.
Although Hashcat allows only four custom character sets at a time (-1 through -4), this limitation was addressed by programming the tool to combine Hashcat's built-in character sets with custom ones, enabling more efficient mask generation. Additionally, the tool converted the existing passwords into various hash types so that the cracking results could be verified. These results were then tested manually and checked against the original passwords to ensure accuracy.
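To make the benefit of a narrow mask concrete, here is a small sketch that counts how many candidates a mask produces. It assumes the standard built-in character set sizes (?l and ?u 26, ?d 10, ?s 33, ?a 95) and treats ?1 to ?4 as custom sets passed in a dictionary. The function name mask_candidates is made up for this example, and this is a plain candidate count, not necessarily the number Hashcat itself reports.

```python
# Sizes of Hashcat's built-in character sets used in this sketch.
BUILTIN_SIZES = {"l": 26, "u": 26, "d": 10, "s": 33, "a": 95}


def mask_candidates(mask, custom=None):
    """Count the candidates a mask generates.

    custom maps "1".."4" to the characters of that custom set.
    """
    custom = custom or {}
    total = 1
    i = 0
    while i < len(mask):
        if mask[i] == "?" and i + 1 < len(mask):
            key = mask[i + 1]
            if key in custom:
                total *= len(set(custom[key]))
            elif key in BUILTIN_SIZES:
                total *= BUILTIN_SIZES[key]
            elif key != "?":  # "??" is a literal question mark
                raise ValueError(f"unknown placeholder ?{key}")
            i += 2
        else:
            i += 1  # a literal character contributes a single choice
    return total


narrow = mask_candidates("?1" * 8, {"1": "0123469adeiklmnorstu"})
full = mask_candidates("?a" * 8)
print(f"narrow mask: {narrow:,} candidates")
print(f"?a x 8:      {full:,} candidates")
print(f"reduction:   {full / narrow:,.0f}x")
```

The 20-character set used here is the 10% threshold set for eight-character passwords from the results in this post.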
The cracking speeds of the RTX 4070 Ti Super were obtained using the hashcat --benchmark command. Based on these speeds, the time needed to try the possible combinations was calculated. The following GIF illustrates a heatmap showing how password hash resolution times vary with password lengths. The heatmap uses intervals of 20/40/60/80/100 and highlights the impact of custom Hashcat masks on cracking times. Higher percentages of resolved passwords and longer passwords require more time. The greener the heatmap, the better the security (safer password). Below the GIF, individual charts are provided for further detail.
The final chart presents a table showing the time required to crack password hashes locally. This table clearly illustrates that shorter passwords are much easier to crack. For example, the hash of a six-character password can be cracked in just 6.6 seconds. This represents the maximum time needed to exhaustively loop through all possible combinations, though the password is often found much faster. In contrast, for a password with 10 characters, looping through all possibilities could take over 26 years!
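The arithmetic behind a table like this is simply the number of candidates divided by the hash rate. The sketch below shows the idea under stated assumptions: the hash rate is a placeholder (the real value depends on the hash type and comes from your own benchmark run), the character set size of 95 is plain printable ASCII, and the function names are made up for the example. It is not meant to reproduce the exact figures in the chart.

```python
def worst_case_seconds(charset_size, length, hashes_per_second):
    """Time to try every candidate of the given length once."""
    return charset_size ** length / hashes_per_second


def human_duration(seconds):
    """Format a duration using the largest unit that fits."""
    units = [
        ("years", 365 * 24 * 3600),
        ("days", 24 * 3600),
        ("hours", 3600),
        ("minutes", 60),
    ]
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:.1f} {name}"
    return f"{seconds:.1f} seconds"


# Hypothetical rate in hashes per second; replace with your own benchmark result.
ASSUMED_RATE = 100e9

for length in range(6, 11):
    seconds = worst_case_seconds(95, length, ASSUMED_RATE)
    print(f"{length} characters: {human_duration(seconds)}")
```

Because the candidate count grows exponentially with length, each extra character multiplies the worst-case time by the size of the character set.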
Note: Quantum computers and other technological advancements have the potential to significantly alter these charts in the future, possibly within the next decade.
The masks used to crack a given percentage of the password hashes were generated with a Python script. For example, the results for eight-character passwords are as follows:
Length group: 8 characters
10% threshold: 0123469adeiklmnorstu
Mask for 10%: ?1?1?1?1?1?1?1?1 -1 0123469adeiklmnorstu
20% threshold: 0123456789adeiklmnoprstu
Mask for 20%: ?1?1?1?1?1?1?1?1 -1 0123456789adeiklmnoprstu
30% threshold: 0123456789abdeghiklmnoprstu
Mask for 30%: ?1?1?1?1?1?1?1?1 -1 0123456789abdeghiklmnoprstu
40% threshold: 0123456789abcdefghijklmnoprstu
Mask for 40%: ?1?1?1?1?1?1?1?1 -1 0123456789abcdefghijklmnoprstu
50% threshold: 0123456789abcdefghijklmnoprstuvy
Mask for 50%: ?1?1?1?1?1?1?1?1 -1 0123456789abcdefghijklmnoprstuvy
60% threshold: 0123456789Sabcdefghijklmnoprstuvwxyz
Mask for 60%: ?1?1?1?1?1?1?1?1 -1 0123456789Sabcdefghijklmnoprstuvwxyz
70% threshold: 0123456789ABLMSTabcdefghijklmnopqrstuvwxyz
Mask for 70%: ?1?1?1?1?1?1?1?1 -1 0123456789ABLMSTabcdefghijklmnopqrstuvwxyz
80% threshold: 0123456789ABDEHJKLMPRSTabcdefghijklmnopqrstuvwxyz
Mask for 80%: ?1?1?1?1?1?1?1?1 -1 0123456789ABDEHJKLMPRSTabcdefghijklmnopqrstuvwxyz
90% threshold: !0123456789ABCDEFGHJKLMNPRSTUVWXabcdefghijklmnopqrstuvwxyz
Mask for 90%: ?1?1?1?1?1?1?1?1 -1 !0123456789ABCDEFGHJKLMNPRSTUVWXabcdefghijklmnopqrstuvwxyz
100% threshold: !"#$%&'()+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_abcdefghijklmnopqrstuvwxyz{|~¤Ãäå
Mask for 100%: ?1?2?3?4?3?2?1?1 -1 !"#$%&()+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_abcdefghijklmnopqrstuvwxyz{~ -2 !"#$%&'()+,-./0123456789:;=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ]_abcdefghijklmnopqrstuvwxyz|Ãå -3 !#$%&()+,-./0123456789:=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[^_abcdefghijklmnopqrstuvwxyz{~¤ -4 !#$%&'*+,-.0123456789:<=?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[_abcdefghijklmnopqrstuvwxyz{ä
All Hashcat masks can be found here.
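One way threshold character sets like the ones above could be derived is sketched below. This is an assumption about the approach, not the actual script: characters are ranked by overall frequency, and the set grows in that order until the requested fraction of passwords consists only of characters in the set. The names charset_for_coverage and mask_line and the toy data are made up for the example.

```python
import math
from collections import Counter


def charset_for_coverage(passwords, fraction):
    """Smallest frequency-ranked prefix of characters covering `fraction` of passwords."""
    counts = Counter(ch for pw in passwords for ch in pw)
    ranked = [ch for ch, _ in counts.most_common()]
    rank = {ch: i for i, ch in enumerate(ranked)}
    # A password is covered once its rarest character is in the set,
    # so only the worst (highest) rank per password matters.
    worst = sorted(max(rank[ch] for ch in pw) for pw in passwords if pw)
    needed = math.ceil(fraction * len(worst))
    if needed == 0:
        return ""
    cutoff = worst[needed - 1]
    return "".join(sorted(ranked[: cutoff + 1]))


def mask_line(charset, length):
    """Format a mask in the same style as the listing above."""
    return "?1" * length + " -1 " + charset


toy = ["aaaa", "aaab", "abcd"]  # made-up data
for fraction in (0.3, 0.6, 1.0):
    charset = charset_for_coverage(toy, fraction)
    print(f"{fraction:.0%} threshold: {charset}")
    print(f"Mask: {mask_line(charset, 4)}")
```

A frequency-ranked prefix is not guaranteed to be the smallest possible set for a given coverage, but it is cheap to compute and easy to reason about.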
A web login brute-force attack was tested by creating a simple web login form using Python and Flask. The login functionality was evaluated with a Python script that utilized concurrent.futures with 50 worker threads. This script executed 1,000 login attempts and measured the median response time, which was approximately 0.3189 seconds per request.
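The measurement itself can be sketched with ThreadPoolExecutor from concurrent.futures. To keep the example self-contained, the login attempt is a stand-in function (fake_login) that just sleeps. In a real test against your own server it would be replaced by an HTTP POST to the login form. The helper names are made up, and this is a sketch of the idea rather than the script used here.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(attempt, arg):
    """Run one attempt and return how long it took in seconds."""
    start = time.perf_counter()
    attempt(arg)
    return time.perf_counter() - start


def median_response_time(attempt, args, workers=50):
    """Run all attempts on a thread pool and return the median duration."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        durations = list(pool.map(lambda a: timed_call(attempt, a), args))
    return statistics.median(durations)


def fake_login(password):
    """Stand-in for a request to a test login form (assumption)."""
    time.sleep(0.01)
    return False


guesses = [f"guess{i}" for i in range(100)]
median = median_response_time(fake_login, guesses, workers=50)
print(f"median response time: {median:.4f} s")
```

The median is less sensitive than the mean to the occasional slow request, which makes it a reasonable basis for a rough estimate.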
The median response time was then used to estimate the time required to brute force a user’s login password, assuming all possible printable characters were included. This character set comprised lowercase (a–z), uppercase (A–Z), digits (0–9), Scandinavian characters, and other printable special characters. The test results are displayed in the accompanying image.
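As a rough illustration of that estimate, the sketch below multiplies the number of candidates by the measured median response time. The character set size is an assumption (95 printable ASCII characters plus the six letters åäöÅÄÖ), as is the parallel_requests parameter, which simply divides the total time by the number of requests assumed to be in flight at once. The figures it prints are not the ones shown in the image.

```python
MEDIAN_SECONDS = 0.3189        # median response time measured above
CHARSET_SIZE = 95 + 6          # assumption: printable ASCII plus åäöÅÄÖ
SECONDS_PER_YEAR = 365 * 24 * 3600


def online_worst_case_years(length, parallel_requests=1):
    """Years needed to try every password of the given length over the web."""
    guesses = CHARSET_SIZE ** length
    return guesses * MEDIAN_SECONDS / parallel_requests / SECONDS_PER_YEAR


for length in range(4, 9):
    years = online_worst_case_years(length)
    print(f"{length} characters: {years:,.0f} years")
```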
While tools like “ffuf” could potentially accelerate brute-forcing, the overall speed heavily depends on the target server’s performance and network conditions. Additionally, if rate limiting is implemented, brute-forcing becomes significantly slower.
The results indicate that brute-forcing a login password on the web is impractical without tailored wordlists or prior knowledge of the password. A pure brute-force approach would take an infeasibly long time. However, there is always a slim chance of randomly guessing the correct password within the first few minutes of an attack by pure luck.
I also created keyboard overlays to visually analyze whether seemingly random passwords revealed the positioning of hands during typing. This analysis suggested that many of the passwords in the source data appear to be human-generated. This could be influenced by the fact that the most commonly used letters are easily accessible on a keyboard, reflecting a form of cognitive bias. While it is not conclusive that most passwords in the source data were created by humans, the patterns strongly suggest it.
The heatmaps were generated using a Python program that creates an SVG image of a keyboard and overlays the data on top. The intensity of the red color indicates how frequently each character appears in the passwords, with deeper red representing higher frequency.
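A stripped-down version of such a generator might look like the sketch below. The key layout is a simplified assumption (digits and letters of a Nordic keyboard only), uppercase letters are folded into their lowercase key, and the function name and sizes are made up. The real program is more detailed.

```python
from collections import Counter

# Simplified Nordic layout: digits and letter keys only (assumption).
ROWS = ["1234567890", "qwertyuiopå", "asdfghjklöä", "zxcvbnm"]
KEY = 40  # key size in pixels


def keyboard_heatmap_svg(passwords):
    """Return an SVG string where redder keys are used more often."""
    counts = Counter(ch for pw in passwords for ch in pw.lower())
    peak = max(counts[ch] for row in ROWS for ch in row) or 1
    parts = []
    for r, row in enumerate(ROWS):
        for c, ch in enumerate(row):
            x = c * KEY + r * KEY // 2  # stagger the rows like a real keyboard
            y = r * KEY
            opacity = counts[ch] / peak
            parts.append(
                f'<rect x="{x}" y="{y}" width="{KEY - 4}" height="{KEY - 4}" '
                f'fill="red" fill-opacity="{opacity:.2f}" stroke="black"/>'
            )
            parts.append(
                f'<text x="{x + KEY // 2 - 2}" y="{y + KEY // 2 + 3}" '
                f'text-anchor="middle" font-size="14">{ch}</text>'
            )
    width = max(len(row) * KEY + r * KEY // 2 for r, row in enumerate(ROWS))
    height = len(ROWS) * KEY
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        + "".join(parts)
        + "</svg>"
    )


svg = keyboard_heatmap_svg(["salasana1", "sommar12"])  # made-up passwords
with open("keyboard_heatmap.svg", "w", encoding="utf-8") as handle:
    handle.write(svg)
```

Scaling the opacity against the most frequent key keeps the colors comparable within one image, though not between images built from different datasets.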
This small study is based on data from existing breaches, and current password trends may differ slightly from the analyzed data. For an attacker to crack password hashes, they first need a way to obtain them, which typically means exploiting a vulnerability that leads to a data breach. While brute-forcing passwords against web logins is time-consuming, some targeted guessing can reduce the number of possibilities. For example, insights from past data breaches can reveal patterns in an organization’s password policies or individual users’ habits, such as a tendency to create specific types of passwords. The heatmaps and character position analysis reveal interesting trends, such as the frequency of certain characters and patterns associated with keyboard layouts, showing the connection between human-generated passwords and their inherent vulnerabilities.
I think the development of tailored Hashcat masks is a nice achievement, presenting a practical and efficient approach to password cracking. Of course, they are based on the breach data, and their effectiveness may vary against other targets. By categorizing passwords into success rate intervals and estimating cracking times from GPU benchmarks, this work demonstrates the practical feasibility of cracking certain types of passwords while also showcasing the limitations of brute-force methods. The time-to-crack estimates provide useful benchmarks for understanding password strength, enabling security practitioners to refine their approaches to password security. Of course, while quantum computers and technological advancements could significantly change the landscape, attackers would still need access to the password hashes before attempting to crack them.
Web login brute-force testing further emphasizes the importance of server-side defenses, such as rate limiting, tar pitting, and multifactor authentication, to stop malicious attackers. The research also highlights the impracticality of pure brute-force attacks on common web logins while demonstrating the importance of tailored wordlists and efficient cracking strategies in scenarios where vulnerabilities may exist, e.g. in the form of a very simple password that is found in some common password lists. These findings contribute to a broader understanding of web security challenges and potential countermeasures.
While long and highly complex passwords are safer, they are often harder to remember—especially if you’re using several different ones across multiple sites. Some people might rely on a base password and modify it slightly by adding special characters at the beginning or end for different sites. Sound familiar? 🙂 Password sentences could be an alternative, but they might still be guessable in certain situations unless you add a unique twist.
Overall, this research should give some insights and provide data for improving password security. It not only advances the understanding of how passwords are created and cracked but also offers actionable recommendations.
Note: This small-ish study covers only part of the results. I will be releasing more insights into passwords and cracking at some point. 🙂