Conducting Password Splicing Attacks With oclHashcat-plus

2012-01-19 00:01:12 by chort

A coworker once told me he imagined immigration officials handing Chinese immigrants two bags with slips of paper, asking them to pick a paper from each bag and put them together to form the name of their restaurant. This is how he imagined names like "Green Dragon," or "Golden Lotus," or "China Garden" got created. While it might not be a very accurate way to describe culinary establishment marketing, it is similar to how many users choose passwords. I'm calling this method the "Chinese Take-out Attack."

Due to password complexity policies that require at least one of (fill in the blank), many users choose passwords by adding a pattern to the end (or less frequently, the beginning) of a word. A typical thought process seems to be:

I think about Carlos a lot, so I'll start my password with 'carlos.'
Now I need a number, so I'll choose the year we met (2006), but that's long to type so I'll shorten it to '06.'
I like him a lot, so I'll add an exclamation!
My password will be 'carlos06!'

You might call this three sections, but the entropy of such suffixes is so low that 2-5 character password suffixes collide constantly, so I like to think of it as two parts: the base word and the padding. The same holds for prefixes: people either reverse the order (padding, then base word) or choose from a predictably small pool of base words (names of people, places, teams, schools, years, zip codes, keyboard-walking patterns, etc.). This is essentially the same process as picking two halves of a name from different bags and connecting them.
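The two-bag model is easy to sketch in shell: join a handful of base words with a handful of padding strings in every combination. The words and pads below are invented purely for illustration.

```shell
# Hypothetical "bags": a few base words and a few padding strings.
bases='carlos tigers boston'
pads='06! 123 2006.'
# Connect every base to every pad, Chinese Take-out style.
for b in $bases; do
  for p in $pads; do
    printf '%s%s\n' "$b" "$p"
  done
done
```

Six entries produce nine candidates; the candidate count grows multiplicatively while the human effort of choosing stays tiny.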

You might think a normal list of cracked or leaked passwords would turn up a lot of these "Chinese Take-out" passwords, especially after rules have been applied. Remember, though, that even a small number of patterns can be connected in a great many ways, so it's unlikely that even the majority of common variations will appear in password lists. Rules are much better at finding character substitutions and other minor variations than they are at finding words swapped to different positions in a string.

I'd been analyzing a list of hashes recently and had reached the point where every attack feasible with my hardware was returning meager results. I had run through all the high-yield attacks, such as fully brute-forcing short passwords, using wordlists + rules, using wordlists + masks, etc. I was down to trying different full-mask attacks (a subset of brute-forcing) and using wordlists with thousands of randomly generated rules. I was still finding passwords, but only at a rate of several per hour. Clearly I was not going to find the last 100,000 passwords this way. Enter cutb.

The 0.6 release of hashcat-utils will include a new program called cutb. It's essentially a binary executable for the common string-library operations: returning the first n characters, the last n characters, or some other sub-section of a string. When you recall that password-cracking wordlists are giant files with millions and millions of strings, you may get an idea of how this is useful. To spell it out: you can lop off the prefix and/or suffix of every entry in a password list, creating new files of fixed-length prefixes and suffixes. You can then splice those sub-strings together (pick one from each bag) to create new wordlists, or simply use two lists simultaneously with something like the oclHashcat-plus combinator attack. The combinator attack fuses two strings together on the fly, repeating the process until all combinations of the two wordlists are exhausted.
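If you don't have the 0.6 hashcat-utils build handy, the slicing itself can be approximated with standard tools. This is only a sketch: it assumes single-byte characters (rev works per byte) and a made-up words.txt, and it will be far slower than cutb over tens of millions of lines.

```shell
printf '1VANhalen0\n107Travel.\n' > words.txt
# First 4 characters of each entry (roughly what cutb.bin 0 4 emits).
cut -c1-4 words.txt
# Last 4 characters of each entry (roughly cutb.bin -4).
rev words.txt | cut -c1-4 | rev
```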

Now you may understand this at a conceptual level, but if you're anything like me you'll need to visualize it to really understand the power. I'll show some real examples of plaintext passwords I recovered using Chinese Take-out Attacks on hashlists that had resisted all other previous attacks.

a26armorer = a26a + rmorer
1VANhalen0 = 1VAN + halen0
107Travel. = 107T + ravel.
Dubai007. = Dubai + 007.
g0tsm0k3d = g0tsm + 0k3d

Some of these were probably obvious, but some are not. Who would have thought a password starting with '107T' would contain 'Travel' as the base word? Moreover, this is a password unlikely to be found even by tens of thousands of custom and randomly generated rules (it survived exactly such attacks), because it has both prefix and suffix padding.

In case it's not obvious, here's how I set up the attack:

chort@hydra:~/fun/tmp$ ./cutb.bin 0 3 < combine.txt | sort -u > 3-first.txt
chort@hydra:~/fun/tmp$ ./cutb.bin 0 4 < combine.txt | sort -u > 4-first.txt
chort@hydra:~/fun/tmp$ ./cutb.bin 0 5 < combine.txt | sort -u > 5-first.txt
chort@hydra:~/fun/tmp$ ./cutb.bin 0 6 < combine.txt | sort -u > 6-first.txt
chort@hydra:~/fun/tmp$ ./cutb.bin -3 < combine.txt | sort -u > 3-last.txt
chort@hydra:~/fun/tmp$ ./cutb.bin -4 < combine.txt | sort -u > 4-last.txt
chort@hydra:~/fun/tmp$ ./cutb.bin -5 < combine.txt | sort -u > 5-last.txt
chort@hydra:~/fun/tmp$ ./cutb.bin -6 < combine.txt | sort -u > 6-last.txt

This gave me all the unique 3-6 character prefix & suffix strings from combine.txt (my main wordlist). To see how bad human entropy really is, compare the number of lines in combine.txt to the number of unique prefixes and suffixes.

chort@hydra:~/fun/tmp$ wc -l combine.txt 
76485167 combine.txt    # That's 76.5 MILLION entries, mostly recovered passwords

chort@hydra:~/fun/tmp$ for i in ?-first.txt ; do echo $i ; wc -l $i ; done
297231 3-first.txt           # That's 300,000 unique, out of 857,375 possible in ASCII
3188252 4-first.txt         # 3.2 million out of 81.5 million possible in ASCII
10947267 5-first.txt       # 11 million out of 7.74 BILLION possible in ASCII
24121023 6-first.txt       # 24 million out of 735.09 BILLION possible in ASCII

chort@hydra:~/fun/tmp$ for i in ?-last.txt ; do echo $i ; wc -l $i ; done
314039 3-last.txt            # 314,000
2989287 4-last.txt          # 3 million
9395507 5-last.txt          # 9.4 million
20610518 6-last.txt        # 21 million

Putting it into action:

chort@hydra:~$ ./oclhp64 -m 0 -n 160 --gpu-loops=1024 -d 2 -c 128 \
 -o /fun/out/cracked.out2 -a 1 /fun/hash/hashlist2.md5 \
 /fun/tmp/6-first.txt /fun/tmp/4-last.txt
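To see what -a 1 enumerates, here's a toy CPU-side equivalent over two tiny files (oclHashcat-plus does the fusing on the GPU without ever materializing the combined list; the file contents here are just the example splices from above):

```shell
printf '1VAN\n107T\n' > first.txt
printf 'halen0\nravel.\n' > last.txt
# Fuse every left entry with every right entry, as combinator does.
while read -r a; do
  while read -r b; do
    printf '%s%s\n' "$a" "$b"
  done < last.txt
done < first.txt
```

Two two-line files yield four candidates, including 1VANhalen0 and 107Travel. from the examples earlier.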

This is just barely scratching the surface of what's possible using cutb output. You can also use the resulting wordlists, one at a time, combined with a mask to form a hybrid attack. In essence, this welds a limited-character-set brute-force onto existing words, read from a file, as either a prefix or a suffix.

chort@hydra:~$ ./oclhp64 -m 0 -n 160 --gpu-loops=1024 -d 1 -c 128 \
 -o /fun/out/cracked.out1 -a 7 /fun/hash/hashlist1.md5 \
 -1 ?l?u?d -2 ?l?d?s ?1?2?2?2 /fun/tmp/4-last.txt
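Conceptually, hybrid mode expands the mask and welds each candidate onto every word from the file. Here's a toy version where a plain two-digit prefix stands in for the real ?1?2?2?2 mask, over an invented two-word file:

```shell
printf 'halen0\nravel.\n' > last.txt
# Weld every two-digit prefix onto each word, then count candidates.
for d1 in 0 1 2 3 4 5 6 7 8 9; do
  for d2 in 0 1 2 3 4 5 6 7 8 9; do
    while read -r w; do
      printf '%s%s%s\n' "$d1" "$d2" "$w"
    done < last.txt
  done
done | wc -l
```

100 prefixes times 2 words gives 200 candidates; the real mask above expands the same way, just with much larger charsets.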

Another method I intend to explore is piping the raw output (i.e. without sort -u) through sort | uniq -c to get a frequency count of each prefix or suffix. Extremely common prefixes and suffixes could be incorporated into rules, suitable for use in multi-rule attacks. The idea behind multi-rules is very similar to combinator, but rather than combining two wordlists in every possible way, it combines two sets of rules in every possible way, and applies them to a wordlist.
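That frequency pipeline would look something like the following; the rev/cut pair stands in for cutb's last-3 mode, and the sample entries are invented:

```shell
printf 'carlos06!\nmaria06!\njohn123!\nanna123\n' > sample.txt
# Last 3 characters of each entry, counted and ranked by frequency.
rev sample.txt | cut -c1-3 | rev | sort | uniq -c | sort -rn
```

The most common suffixes bubble to the top, ready to be turned into append rules.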

So is it really worth the effort of processing your wordlists with cutb, taking the time to sort -u the results, and running through all the permutations? Absolutely! In less than a day I recovered an additional 4,000 passwords from a hashlist that had been yielding, at best, a few hundred a day via mask attacks and random rules. On another hashlist, in which I hadn't cracked a single hash in roughly 8 days, I got two within the first 12 hours using the Chinese Take-out Attack. I intend to exploit it to the fullest extent.

This article was only possible thanks to mountains of prior research. Here are some of my sources of inspiration:
On the evolving security of password schemes (Rootlabs)
More on the evolution of password security (Rootlabs)
Salt The Fries: Some Notes On Password Complexity (Kaminsky)
Crack Me If You Can 2010 - Team hashcat (atom)
Ob-security (d3ad0ne)
Pipal (digininja)
PACK (iphelix)
The Password Project (arex1337)
Question Defense Tutorials (purehate_)


at 2012-01-20 04:08:21, Artis wrote in to say...

The same can be accomplished with John The Ripper rules; take a look at this john.conf:

at 2012-01-20 06:27:25, Rich Gautier wrote in to say...

Relevant - The xkcd password generator [and comic] -

at 2012-01-21 15:14:44, atom wrote in to say...

@Rich Gautier: WOW! I just checked this site and it's dangerous. Their English dictionary contains only 1949 base words. This makes ONLY 14429369557201 possible combinations. If the password is stored in MD5, on which I can do 10B/s with a single hd6990, I can crack every possible combination in less than 30 minutes. Shut down this site!

at 2012-01-21 15:31:58, chort wrote in to say...

@Artis: Unless I'm missing something, it looks like those Kore rules do common prefix/suffix padding and some common substitutions. Those are useful rules, but they don't do what I'm talking about with password splicing. The splicing attack doesn't rely on modifying a single base word; it connects actual prefixes and suffixes observed in the wild, without pre-judging where the base word will be found. At the time I ran cutb output against this list, I had already gone through all the Kore rules that work with oclHashcat.

at 2012-02-20 00:51:11, Artis wrote in to say...

@chort: I see. Thank you for taking time to explain the difference; I'll add your method to my toolbox.
