Hey guys,
Just thought I'd ask your opinions on whether this idea would work. The concept is to gather statistics on hashes and their plaintext equivalents. So for MD5:
aa = 4124bc0a9335c27f086f24ba207a4912
A program can be made (which I have already done, by the way) that will discover any 'patterns' and other statistics in the hashes of a list of plaintext 'words'. So for the above example, my program checks for things like:
>A basic search: "How many a's are there in the hash? How many b's? etc.", and then adds the statistics to a scoring array
>A more complex search, taking into account the position of the characters, i.e. "How many a's in position 0 of 31? a's in position 1 of 31? etc."
>Another more complex search, using "patterns": "Where is the pattern '412' found? What is the most common plaintext character in hashes that have this pattern?"
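The three searches above could be sketched roughly as follows. This is not the original program, just a minimal Python illustration; the function names (`md5_hex`, `basic_counts`, `positional_counts`, `pattern_plaintext_counts`) are made up for this example, and it assumes the words are hashed as UTF-8 with no salt.

```python
import hashlib
from collections import Counter


def md5_hex(word: str) -> str:
    """Return the 32-character lowercase hex MD5 digest of a word."""
    return hashlib.md5(word.encode("utf-8")).hexdigest()


def basic_counts(words):
    """Basic search: total occurrences of each hex character over all hashes."""
    counts = Counter()
    for w in words:
        counts.update(md5_hex(w))
    return counts


def positional_counts(words):
    """Positional search: one Counter per hash position (0 to 31)."""
    per_position = [Counter() for _ in range(32)]
    for w in words:
        for i, ch in enumerate(md5_hex(w)):
            per_position[i][ch] += 1
    return per_position


def pattern_plaintext_counts(words, pattern):
    """Pattern search: count plaintext characters of the words whose hash
    contains the given pattern (e.g. '412')."""
    counts = Counter()
    for w in words:
        if pattern in md5_hex(w):
            counts.update(w)
    return counts
```

For the example above, `md5_hex("aa")` gives `4124bc0a9335c27f086f24ba207a4912`, so `pattern_plaintext_counts(["aa"], "412")` counts two a's.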
So that's my idea. I've done some research and test trials using my program, and it seems to work.
It generates a list of statistics (a character set), which my other program can then use to crack hashes in the least amount of time it can. Here is what my results *look* like:
hash_character:most_common_plaintext_characters->least_common_plaintext_characters
a:qwertyuiopasdfhg3579@$^*(...
b:mnbvcxz';lkjhgfds?":><{}...
etc
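One way such a table might be built is sketched below. The scoring rule is an assumption, not necessarily what the original program does: each plaintext character of a word is credited once to every hex character that appears in that word's hash. `build_charset_table` and `format_table` are hypothetical names.

```python
import hashlib
from collections import Counter, defaultdict


def build_charset_table(words):
    """Map each hash character to a Counter of plaintext characters
    seen in words whose MD5 hash contains that hash character."""
    table = defaultdict(Counter)
    for w in words:
        digest = hashlib.md5(w.encode("utf-8")).hexdigest()
        # Assumed scoring rule: credit the word's characters once per
        # distinct hex character in its digest.
        for h in set(digest):
            table[h].update(w)
    return table


def format_table(table):
    """Render lines as hash_character:most_common->least_common."""
    lines = []
    for h in sorted(table):
        ranked = "".join(ch for ch, _ in table[h].most_common())
        lines.append(f"{h}:{ranked}")
    return lines
```

Calling `format_table(build_charset_table(words))` on a word list gives one line per hash character, in the same layout as the results shown above.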
So, any thoughts? Have I missed something? Anything else I need to consider? Is this a hopeless cause? (I know, rainbow tables are really good, but these statistics take up hardly any room at all)
[edit]
The principle behind it is this: for a dictionary containing an equal number of each character (i.e., 200 a's, 200 b's, 200 c's, which is created by a plaintext word generator), there is an *uneven* distribution of hash characters (e.g., 253 a's, 198 b's, 444 c's, etc.).
By leveraging this minor flaw, one can estimate the probability of each plaintext character behind each hash character.
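The principle can be checked with a short experiment, sketched here in Python under a few assumptions: the balanced dictionary is every two-letter lowercase word (so each letter appears equally often), and `balanced_words` and `hash_char_distribution` are hypothetical helpers standing in for the word generator. Printing each count next to the uniform expectation (total hex characters divided by 16) gives a rough sense of how uneven the distribution is. Some spread would appear from sampling alone, so the size of the deviation matters.

```python
import hashlib
import string
from collections import Counter
from itertools import product


def balanced_words(alphabet=string.ascii_lowercase, length=2):
    """Every word of the given length, so each character occurs equally often."""
    return ["".join(p) for p in product(alphabet, repeat=length)]


def hash_char_distribution(words):
    """Count each hex character over the MD5 hashes of all words."""
    counts = Counter()
    for w in words:
        counts.update(hashlib.md5(w.encode("utf-8")).hexdigest())
    return counts


words = balanced_words()
plain = Counter("".join(words))        # plaintext character counts (all equal)
hashc = hash_char_distribution(words)  # hash character counts
expected = sum(hashc.values()) / 16    # count per hex character if uniform

for ch in "0123456789abcdef":
    # hash character, observed count, deviation from the uniform expectation
    print(ch, hashc[ch], round(hashc[ch] - expected, 1))
```

Running the same comparison on a larger dictionary (for example `length=3`) shows whether the deviations persist or shrink relative to the expected count.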


