Friday, May 15, 2009

Cacheebr, the MS Cache password brute forcer

As requested, I built an MS Cache brute forcer. The MS Cache hashes are a little harder to optimize. They are salted and need 2x MD4. This is how you built an MS Cache hash:

* Built NTLM hash for the password: MD4(Unicode(password))
* Append Unicode&lowercase username to the NTLM hash
* MD4 that

So in short: MD4( MD4(Unicode(password)) + Unicode(tolower(username)) )

Because of this, you need the calculate the full MD4 hash for every plaintext. Because of the unknown first 16 bytes of the input for the final MD4 (the NTLM hash), you cannot really reverse steps. I only reversed partial last steps.

I've been a little lazy, this version only supports usernames with a maximum length of 19 characters. You would need to do an additional MD4 for longer usernames.

I interlaced SSE2 three times, getting to something like 72 Mhashes/s on my system.

The download links:

You might also need to install the Microsoft Visual C++ 2008 Redistributable Package.


  1. Hi Daniel,

    Thank You very much for this. I didn't check Your blog few days and now this, wow.

    I will test and compare with John the Riper and report here.

    Thank You one more time.


  2. What does it mean to 'interlace' SSE2? I am (relatively) familiar with SSE2, and heard the word 'interlace' used before, but don't know how it pertains.

  3. I can understand why you could not reverse the inner MD4, but couldn't you still reverse the outer MD4 as much as any flat MD4 computation? Treat (MD4(Unicode(password))+Unicode(tolower(username)))
    as a plaintext and compute all but last round, which has been reversed from mscache hash?

  4. hi dzhugashvili, sorry for my late answer (vacation :)).

    interlacing SSE2 means that you arrange the sse instructions in such a way that your cpu performs multiple instructions per clock cycle. It is important that these instructions do not depend on each other (like a=b+c; d=a*2, where you cannot calculate 'd' before you calculate a). By interlacing in brute forcers, you just calculate the hashes of multiple plaintexts, as these do are independent calculations.

  5. the reason that you cannot reverse the outer MD4 loop with ms cache is because while reversing you keep a part of the input (plaintext) fixed and go change another part of the input. So with regular plaintexts, you reverse the hash for example for the following group of plaintexts:

    you keep part 2 fixed and calculate all the possible plaintexts for the first part (aaaa-zzzz), then you change the 2nd part and re-reverse the hash for this group:

    With the outer MD4 loop, all the inputs are a 128 bits MD4 hash. You cannot efficiently fixate a certain part of this hash and go change the rest (because that input hash actually depends on the plaintext input of the inner MD4 calculation).

    I hope this clears things up for you, feel free to ask for a better explanation :)

  6. I see. Thanks! Hey you may be interested in this:

    I benchmarked your programs against several others I found on the internet - yours were pretty high up there! I was especially impressed by cacheebr!

  7. Interesting things there.. but is it me, or did you somehow remove all the actual links on your site? i assume that at least the red marked text are supposed to be links?

  8. Yeah, the website was still under construction. The links should be up now.

  9. Hello. I was wondering if you might have access to or be able to program a simple example of the mscache algorithm? Something that would just take 'char username[]' and 'char password[]' and produce an mscache hash, using the OpenSSL libraries if, necessary. I have been scouring the net looking for a simple example of mscache, but no luck. Cacheebr's code was a little to complicated for me to follow, having insufficient understanding of the algorithm to start with. I tried Alain Espinosa's mscache demonstration (, but it doesn't work. He confirmed that it 'had a few bugs' but he didn't have time to find them. I tried to fix it, but can't.

    If you could make a simple program, or point out or fix the problem in Espinosa's code, it would be greatly appreciated. I am collecting examples of such algorithms on my website ( and would also like to include mscache algorithm support in the next release of Havok.

    I understand if you don't have time, just thought it would be worth a try. Thanks!

  10. i had some short code lying around using an easy MD4 implementation (see sourcecode from rcracki_mt for that code), you could replace it with the appropriate MD4 functions from OpenSSL:

    // >>>>>>>
    unsigned char pHash1[16];
    unsigned char pHash2[16];

    string sPass = "hello";
    char u_pass[32];
    memset(u_pass, 0, 32);

    // Convert password to unicode
    int i;
    for (i = 0; i < sPass.size(); i++)
    u_pass[i*2] = sPass[i];
    u_pass[i*2+1] = 0x00;

    // make NTLM hash, result in pHash1
    fast_MD4((unsigned char*)u_pass, (sPass.size()*2), pHash1);

    string sUsername = "administrator";
    char u_username[40];
    memset(u_username, 0, 40);

    // Convert username to unicode and lowercase it
    for (i = 0; i < sUsername.size() && i < 19; i++)
    u_username[i*2] = tolower(sUsername[i]);
    u_username[i*2+1] = 0x00;

    // u_temp will be the plaintext for the second MD4 calculation
    char u_temp[56];
    // set it to 0 first
    memset(u_temp, 0, 56);
    // pHash contains result of first MD4 calculation = NTLM hash
    memcpy(u_temp, pHash1, 16);
    // NTLM hash + username
    memcpy(u_temp+16, u_username, (sUsername.size()*2));

    // Hash it, pHash2 contains ms cache hash for password 'hello' with username 'administrator'
    fast_MD4((unsigned char*)u_temp, (sUsername.size()*2+16), pHash2);

    // print it
    for (i = 0; i < 16; i++) printf("%02x", pHash2[i]);

    // <<<<<<<

    haven't tested it, but i think this should work, or at least make things clear enough to implement it yourself.

  11. One interesting thing about your code. I got it to work using OpenSSL's MD4, but then I tried to upgrade to the reference implementation MD4_NEW() you used in rcracki_mt. I was able to use the MD4_NEW function for the first MD4 (the one that produces the NTLM hash), but not the second one. I also noticed that in rcracki_mt, you also used openssl for the mscache algorithm, so evidently you ran into the same problem? Quite unfortunate, as the second MD4 is really the essential one for a multihash brute forcer. The first MD4 is run once per candidate, but the second is run once per hash per candidate. Have you find out why this is? If you happen to find out, please tell me.

  12. the issue is probably found in the length of your username. both MD4_NEW and my own used md4 code do not implement inputs larger then 56 bytes, that is why cacheebr doesn't support usernames longer then 19 characters. it shouldn't be too much of a problem to fix that if you want to.

  13. Ah. I see. Yeah, the MD4_NEW actually only supports input lengths <32 bytes :( I will try to modify this - unless you already have another reference implementation that supports 56 bytes or more? I used MD4_NEW because I could not find the fast_MD4 function you referred to in rcracki_mt. I will look around.

  14. check out this code by jci:

  15. Daniel is your code still around the link you provided is dead?

  16. hi Daxter, i'll be moving tbhost, so it'll be back.

  17. I try compile it on Linux, but:

    $ g++ *.cpp
    Cacheebr.cpp: In function 'int main(int, char**)':
    Cacheebr.cpp:185: error: 'memcpy' was not declared in this scope
    Cacheebr.cpp:270: error: 'ETIMEDOUT' was not declared in this scope
    crackThread.cpp: In member function 'void crackThread::run()':
    crackThread.cpp:96: error: 'align' was not declared in this scope
    crackThread.cpp:96: error: '__declspec' was not declared in this scope
    crackThread.cpp:96: error: expected `;' before 'unsigned'
    crackThread.cpp:98: error: 'pMuteMe1' was not declared in this scope
    crackThread.cpp:98: error: 'memset' was not declared in this scope

  18. i didn't port Cacheebr to Linux... it shouldn't be too hard to fix it yourself. I have no plans to fix this stuff anytime soon, sorry.

  19. hello daniel, the speed of this tool is great. can u please implement such functions like: last password that after one attack no need to start again from begin? i'm interested in the question, this tool use also already gpu-power? there are some nice projects about this like pyrit - a python-app or the cuda-multiforcer. it would be great if u can transfer ur knowledge with this tools. it's also possible to implement dict-attacks? there is a lot of potential. with the best regards. u have knowledge about reversengenering? and u know, how how to organize a gpu-farm?

  20. I have written bruteforcer MSCashe on GPU.
    Something is certainly crookedly written, but works. Gives out 215 Mp/s for one hash on mine 8800GT.
    But well still a code for CPU to add.

  21. Would be nice a restore point like rracki :D
    But anyway, thanks for the tool

  22. On depositfiles have removed. Has loaded on

  23. Sources for GNU/Linux users :

    Great job and good continuation ;)

  24. Hi there, looks like is down, could you please upload the Cacheebr files somewhere else?

    Thanks and regards :)