A little later then I planned (NTLM tables fixed and currently uploading), but here we go with version 0.4 of EmDebr, my SSE2 enabled MD5 password cracker. The main improvement is the plaintext generation on passwords > 4 characters. There are now two types of cracking threads where one, the old one version, is used for lengths <= 4. So what exactly is different now?
EmDebr generates MD5 hashes for 12 plaintexts simultaneously. In the old plaintext generation I used to continue to generate 12 plaintexts by just moving on to the next plaintext 12 times every round. With the new generation I split up the generation in 2 parts, 'first 4 characters' and 'character 5 and beyond'. The second part is now different for all 12 plaintexts, so the first part can be equal for all plaintexts. This way we only have to change 1 plaintext every round until we ran through all combinations on the first part (26*26*26*26=456976 rounds). At that point we move the second part to the next plaintext 12 times and start over again.
I added another speed up, using some sort of 'character couple' cache. I got the initial idea by looking at brute forcing code by Sascha Pfaller. His implementation generates a fairly large bunch of plaintexts at once before starting to hash them. This implementation didn't really work well with me, but it did give me an idea. So now I'm using an array of shorts (2 bytes) and store all combinations of 2 characters of the charset in it. So with loweralpha it contains aa-zz. So instead of 'generating' a plaintext every round, I just 'set' a plaintext every round.
EmDebr now went from like 135 Mhashes/s to like 176 Mhashes/s on my system!
I might clean up and release a new version of EnTibr as well soon, that one had an even better speed up.
p.s. Haven't tested this code on Linux anymore.