Algorithm optimizations (reversing steps)

Moderator: BarsMonster

Post Reply
brotabstrich
Posts: 3
Joined: Thu Oct 23, 2008 5:40 pm

Algorithm optimizations (reversing steps)

Post by brotabstrich » Mon Nov 03, 2008 7:20 am

And now for something completely different:

Can anybody show - or even better explain - me, how the "reversing steps" optimization for the MD4 or MD5 algorithm is done, because I'm a very curious guy without a clue.

Regards,
brotabstrich

User avatar
BarsMonster
Site Admin
Posts: 1118
Joined: Wed Oct 01, 2008 7:58 pm

Re: Algorithm optimizations (reversing steps)

Post by BarsMonster » Mon Nov 03, 2008 7:37 am

Well, that is quite simple, but might be hard to explain :-)
Look at the last step of MD5 algorithm:

#define I(x, y, z) ((y) ^ ((x) | (~z)))

(a) += (x) + I ((b), (c), (d)) + (unsigned long int)(ac);\
(a) = ROTATE_LEFT ((a), (s)); \
(a) += (b); \

You know all values AFTER this step, therefore you may guess values before this step by doing operations in reverse order.
You may do this because data is also known (password length is small, so last data elements are 0s).
So instead of comparing to target hash value you comparing to calculated a,b,c,d BEFORE this step.

So, for MD5 you may reverse almost 25% of the last steps if password length is small enough.

I hope this helps.

Sc00bz
Posts: 136
Joined: Fri Oct 03, 2008 8:28 am
Contact:

Re: Algorithm optimizations (reversing steps)

Post by Sc00bz » Mon Nov 03, 2008 11:28 pm

That's only part of it. You can reverse the whole last round and a few partial steps in the 3 round. All you do is keep everything but the first 4 bytes in the password constant. Then just do all the steps in reverse. For each reverse step do REV_II:
#define REV_II(a,b,c,d,data,shift,constant) \
a = ROTATE_RIGHT((a - b), shift) - data - constant - (c ^ (b | (~d)));
All the way to the first step in the last round. The first step in last round you need to do REV_II(a,b,c,d,0,6,0xf4292244) not REV_II(a,b,c,d,data[0],6,0xf4292244) because data[0] is not constant.

Now start MD5 as normal and stop at the end of round 3 then just add data[0] to A and if it matches exactly with the partial reversed hash you found the password. The way you check passwords is like this: reverse the MD5 with "????aaa" then check "aaaaaaa" then modify the first four characters "aaabaaa" and re-check then "aaacaaa" .... "zzzzaaa" then you need to re-reverse the MD5 with the new data "????baa" then check "aaaabaa" and so on.

BTW you can skip the last 3 steps in round 3 and just check A+data[0] with revA if that matches check do one more round and check D with revD, same with C, and same with B.

You can also partially reverse steps in round 3 (this saves you about the time of one step so about 2.27% faster).

End of reversing code:

Code: Select all

...
REV_II(revB,revC,revD,revA,data[ 5],21,0xfc93a039);
REV_II(revC,revD,revA,revB,data[14],15,0xab9423a7);
REV_II(revD,revA,revB,revC,data[ 7],10,0x432aff97);
REV_II(revA,revB,revC,revD,       0, 6,0xf4292244);
revB = ROTATE_RIGHT((revB - revC), 23) - data[2] - 0xc4ac5665;
revC2 = ROTATE_RIGHT((revC - revD), 16) - 0x1fa27cf8;
End of checking code:

Code: Select all

...
HH(b,c,d,a,data[10],23,0xbebfbc70)
HH(a,b,c,d,data[13], 4,0x289b7ec6)
HH(d,a,b,c,data[ 0],11,0xeaa127fa)
HH(c,d,a,b,data[ 3],16,0xd4ef3085)
revA2 = revA - data[0];
revB2 = revB - (revA2 ^ revC ^ revD); // "revC ^ revD" can be precomputed in the reversing code.
if (revC2 - (revA2 ^ revB2 ^ revD) == c)
{
	HH(b,c,d,a,data[ 6],23,0x04881d05)
	if (revB2 == b)
	{
		HH(a,b,c,d,data[ 9], 4,0xd9d4d039)
		if (revA2 == a)
		{
			HH(d,a,b,c,data[12],11,0xe6db99e5)
			if (revD == d)
			{
				// MD5 is cracked!
			}
		}
	}
}
Please note you can make it even faster by not adding in data[12] and others that are just zero. Or you can pre-add data[1 to 14] to the constants so that you only need to add data[0]. Oh right there might be a bug/typo somewhere in the code because I just wrote it and haven't tested it.

brotabstrich
Posts: 3
Joined: Thu Oct 23, 2008 5:40 pm

Re: Algorithm optimizations (reversing steps)

Post by brotabstrich » Tue Nov 04, 2008 7:05 am

BarsMonster and Sc00bz, many thanks for these explanations.

I think I have to try this at home. :D

Regards,
brotabstrich

Corni
Posts: 23
Joined: Sun Nov 23, 2008 7:56 pm

Re: Algorithm optimizations (reversing steps)

Post by Corni » Sun Jan 11, 2009 2:06 pm

Sorry for reviving, but how would one do this for md4/mscash? All is constant from data, except data[0]-data[3]...
Is it there possible to reverse much more then the last few steps but one round?
Corni

User avatar
BarsMonster
Site Admin
Posts: 1118
Joined: Wed Oct 01, 2008 7:58 pm

Re: Algorithm optimizations (reversing steps)

Post by BarsMonster » Sun Jan 11, 2009 5:55 pm

For MSCACHE I was not able to reverce much.
That much reversing of MD5 is a random bad design which does not happen often :-)

neinbrucke
Posts: 82
Joined: Sun Nov 02, 2008 8:53 pm

Re: Algorithm optimizations (reversing steps)

Post by neinbrucke » Sun May 03, 2009 2:43 pm

let's revive this one again ;)

Thanks for all the explanations about reversing, I have built an MD5 and an NTLM (MD4) brute forcer using them :)

http://blog.distracted.nl/2009/05/entib ... orcer.html
http://blog.distracted.nl/2009/05/emdeb ... orcer.html

Not nearly as fast as BarsWF, but they are open source, so feel free to help and close the gap ;)

Bitweasil
Posts: 110
Joined: Fri Nov 07, 2008 6:50 am

Re: Algorithm optimizations (reversing steps)

Post by Bitweasil » Sun May 03, 2009 3:01 pm

Is there an effective way to reverse things when dealing with large hashlists?

It looks like 1 or 2 steps can possibly be reversed with MD5 (the null input bits past the length), but doing more significant algorithm reversal involves doing work for each step, which could get to be a lot of work for 100k+ hashes each time the reversal needs to be recalculated.

I'm guessing there's a "crossover point" in terms of "amount of work needed to make reversing worthwhile" - but I'm not sure if the code complexity needed to calculate that on the fly is worth it.

Corni
Posts: 23
Joined: Sun Nov 23, 2008 7:56 pm

Re: Algorithm optimizations (reversing steps)

Post by Corni » Mon May 04, 2009 6:37 pm

You'd have a fixed initial cost when reading a hash the first time - You could write already-reversed hashes to a cache file, as usual sorted so you can directly memcpy() it. You need a file somewhere anyways if you want to implement a restart option. You also could implement an option to supress this caching, and just use it for eg >10000 hashes...

neinbrucke
Posts: 82
Joined: Sun Nov 02, 2008 8:53 pm

Re: Algorithm optimizations (reversing steps)

Post by neinbrucke » Mon May 04, 2009 7:42 pm

the thing is that you are constantly re-reversing the hashes, so you do have a crossover point.

let's take MD5 as an example and do a rough estimation. A full MD5 hash takes 4 rounds, when reversing you do 3 rounds + 1 round for every re-reverse.
with MD5, you need to re-reverse after every charset_length^4 (everytime you change a character after the 4th one (so 5 and higher)). With NTLM (unicode MD4) you re-reverse every charset_length^2.

Let's stick with MD5. If you have the charset of loweralpha, you have to re-reverse after doing 26^4 = 456.976 x partial (3 rounds) MD5. So in theory this would have saved you 456.976-1 = 456.975 rounds to calculate.
The reversing gives some overhead, so it's not really like that. Let's use some random number, let's say we actually gain 50%, rest going to overhead (might be better, might be worse though). This gives us 456.976/2 = 228.488 rounds.
It will very much depend on the actual program and how efficient it is. Now when you search for 2 hashes at the same time, you have to re-reverse twice everytime, where it will still be very profitable to do so.

If you have to search for 200.000 hashes at once however, you might be getting close to the limit of where it doesn't pay to reverse. You will 'only' save 228.488-200.000 = 28.488 rounds. Now if you have a larger charset, let's say mixalpha-numeric, you only have to re-reverse after 62^4 = 14.776.336 hashes. You could also choose to only reverse half of the last round, so you only have re-reverse when you change the 9th characters or higher. That should bring enough profit from reversing :)

With NTLM (unicode MD4) it's really something different, because you are already at the 5th input byte when you start changing the 3rd character. If we do the same calculations:

charset loweralpha, 26^2 = 676 rounds less. effective save when searching for 1 hash: 676/2 - 1 = 337 rounds. (so already at 338 rounds it's not useful to reverse anymore)
charset mixalpha-numeric, 62^2 = 3844 rounds less. effective save when searching for 1 hash: 3844/2 - 1 = 1921 rounds.

With MD4 you can also chose to just reverse half of the last round. Or maybe (haven't tried) you can reverse up to like 75% of the last round with plaintext lengths > 4 and keep first 4 chars (8 bytes) steady.

Anyway, I might have made a mistake or forgot an important factor somewhere... :P

Sc00bz
Posts: 136
Joined: Fri Oct 03, 2008 8:28 am
Contact:

Re: Algorithm optimizations (reversing steps)

Post by Sc00bz » Tue May 05, 2009 12:29 am

You forgot that with large lists you'll need to resort every time you reverse.

Also with reversing you don't need to re-reverse the whole way you can cheat again. With MD5 you can reverse 2 steps save that point, reverse 7 more steps save that point, and then reverse 7 more steps save that point. Then every charset_length^4 you use the 2nd save point and recalculate the 3rd save point. Then every charset_length^8 you use the 1st save point and recalculate the 2nd and 3rd save points... but this only speeds it up by like 0.01% or you could just over clock you computer by like a 1 MHz and get the same results :D.

Bitweasil
Posts: 110
Joined: Fri Nov 07, 2008 6:50 am

Re: Algorithm optimizations (reversing steps)

Post by Bitweasil » Tue May 05, 2009 1:03 am

Sc00bz wrote:You forgot that with large lists you'll need to resort every time you reverse.
That was the thing I was trying to remember that made things horribly inefficient. Sorting on a GPU isn't particularly quick. Sorting a 5M element list isn't quick, period. I guess one could do the calculations to figure out if it's worth it or not, but that was what made me look at it, scratch my head, and leave it for smarter people than I on large hash lists.

neinbrucke
Posts: 82
Joined: Sun Nov 02, 2008 8:53 pm

Re: Algorithm optimizations (reversing steps)

Post by neinbrucke » Tue May 05, 2009 7:57 am

hmm good point about the sorting... haven't written a multi hash brute forcer yet, so no experience with sorting large lists :)

Corni
Posts: 23
Joined: Sun Nov 23, 2008 7:56 pm

Re: Algorithm optimizations (reversing steps)

Post by Corni » Tue May 05, 2009 6:05 pm

I somehow don't get your point - what do you mean by 're-reverse'?
*imho*, you'd have to reverse (reverse ~1 round, in case of MD5) for every hash you want your input to be found. After doing this, you've to sort all reversed hashes. This is the initial cost. Now you just compute 3/4 MD5's and compare that output to the (sorted) list of reversed hashes. Could anyone please point out my mistake?

neinbrucke
Posts: 82
Joined: Sun Nov 02, 2008 8:53 pm

Re: Algorithm optimizations (reversing steps)

Post by neinbrucke » Tue May 05, 2009 6:14 pm

read this one: http://www.freerainbowtables.com/phpBB3 ... =715#p5763

and note the paragraph about brute forcing:

"Brute forcing:
data[09] in round 3: Last full step you need to do then check variable a (1 in 4 billion chance will need to do the next 3 steps). You hold all the data chunks constant and change only data[00]. Therefore you can reverse almost all of round 4 because all the data besides data[00] is constant. You are left with a = a + data[00]. So at each guess you do a - data[00] which gives you the value of a after the step that uses data[09] in round 3. When you have finish changing data[00] you then need to modify the other data chunk(s) and re-reverse round 4 with the new constant data."

You can reverse so much because you keep part of the input constant (everything except for the first 4 bytes), as soon as you change a byte in this constant part, you have to reverse again, using the new values.

maybe this one helps as well:
http://blog.distracted.nl/2009/04/rever ... -hash.html

Corni
Posts: 23
Joined: Sun Nov 23, 2008 7:56 pm

Re: Algorithm optimizations (reversing steps)

Post by Corni » Tue May 05, 2009 8:26 pm

Ah ok,
I thought about reversing a hash just being a static operation, like subtracting the initial constants from the.

dzhugashvili
Posts: 2
Joined: Tue Jun 16, 2009 9:12 pm

Re: Algorithm optimizations (reversing steps)

Post by dzhugashvili » Tue Jun 16, 2009 9:20 pm

So what is the conclusion? Is reversal practical to be implemented in a cracker. Sounds like this is an interesting concept, but from what I hear on this forum, it looks like crossover point may make reversal disadvantageous at only a few hashes in hash list. True? So not practical for anything more than proof of concept or a single hash cracker. I am writing a multi-hash cracker for several different algorithms. Sounds like MD5 reversal is pretty advantageous, assuming hashes < 200 000. But MD4 is hardly practical for multiple hashes? :?:

has anybody actually done benchmarks of reversal on multiple hashes?

does anybody have any benchmarks at all? No reversal vs. reversal?

in regards to a crossover point - how could reversal NOT be beneficial? Even if you re-reverse all hashes quite frequently,
wouldn't it still be faster than computing the entire hash algorithm for every plaintext? Re-calculation of the last round of
MD4 every 26^2 tries should still be less expensive than computation of the last round of MD4 every try? Unless each hash re-reversal needs to be computed separately and there are over 26^2 elements in the list. Correct?

_haxxor_
Posts: 52
Joined: Mon Oct 27, 2008 7:57 pm

Re: Algorithm optimizations (reversing steps)

Post by _haxxor_ » Tue Jun 16, 2009 10:59 pm

@dzhugashvili : did you even bother reading all the material provided ? neinbrucke's blog ? freerainbowtables.com ?

dzhugashvili
Posts: 2
Joined: Tue Jun 16, 2009 9:12 pm

Re: Algorithm optimizations (reversing steps)

Post by dzhugashvili » Wed Jun 17, 2009 7:56 pm

I did. Still processing it. But from what I have seen, nobody has demonstrated this in a multi-hash cracker yet using reversal.

Bitweasil
Posts: 110
Joined: Fri Nov 07, 2008 6:50 am

Re: Algorithm optimizations (reversing steps)

Post by Bitweasil » Wed Jun 17, 2009 8:59 pm

dzhugashvili wrote:in regards to a crossover point - how could reversal NOT be beneficial? Even if you re-reverse all hashes quite frequently,
wouldn't it still be faster than computing the entire hash algorithm for every plaintext? Re-calculation of the last round of
MD4 every 26^2 tries should still be less expensive than computation of the last round of MD4 every try? Unless each hash re-reversal needs to be computed separately and there are over 26^2 elements in the list. Correct?
The issue is that each "reversal" step is more involved than simply reversing the hashes.

The fastest method I've found for dealing with multiple hashes involves using a sorted list of hashes, plus a bitmap based on the first bits of the hash - this is quite fast on reasonable numbers of hashes. However, if I were to reverse, in addition to the overhead of the reversal for each hash, I would need to resort the list and recalculate the bitmap for each reversal.

This may be faster than my current system, but I haven't put the work in to test it. Feel free to write a multi-hash brute forcer, or modify an existing one, and prove that it's faster.

I'm not yet convinced that the significant added complexity is worth it.

One approach I've considered is the use of the CPU to reverse hashes & compute the bitmap/sort the list, with each "workunit" taking one or more kernel invocations to complete. As long as the CPU can keep up with the GPU, this will be faster, at least on higher end GPUs that can handle async memcopies and have enough RAM to handle multiple full hash lists + bitmaps in RAM.

Right now, I'm trying to finish my network brute forcer framework that will provide a common package for everyone to develop hash algorithms for, instead of having lots of people rewriting basic brute forcer code. Feel free to join #cryptohaze on irc.freenode.net to discuss this.

Post Reply

Who is online

Users browsing this forum: No registered users and 1 guest