[A]s we enjoy great advantages from the inventions of others, we should be glad of an opportunity to serve others by any invention of ours; and this we should do freely and generously. ~ ben franklin
The name Cryptography comes from the greek words Crypto, which means hidden, and graphy, which means writing. Cryptography is defined in the dictionary as, “the art of writing or solving codes.” In this sense Cryptography can be seen as encrypting (writing codes) and decrypting (solving codes). When Cryptographers refer to plaintext, they are referring to the original data set or a decrypted data set. When they refer to ciphertext, they are referring to the encrypted message. Through out the many years of Cryptography’s existence, many mathematicians have written more secure and efficient algorithms for encrypting and decrypting data. The goal that all Cryptographer’s want to accomplish in their algorithms, is the occurrence of complete randomness in the ciphertext.
The algorithm used to encrypt a set of data is called a cipher. One cipher used early in Cryptography’s history was the mono-alphabetic substitution cipher. This cipher simply took a set of data and substituted another set of data into each original character. For example, if I were to encrypt “hello” with a substitution cipher, it might look something like this, “fjuuy.” When looking at this encryption, from the untrained eye, it almost looks completely random. There is a couple of problems though. For one, there is a double “u.” A Cryptanalyst will know that there is a double letter occurrence in the plaintext. From this knowledge statistics can be used to find the most probable occurrences of a double letter. This use of statistics is an attack known as Frequency Analysis. Frequency Analysis analyzes probabilities of letters in a specific language. For instance, the letter “e” has the highest probability of occurring in the English language. When a data set is analyzed using Frequency Analysis, a program will search to find the most frequent letter and assume it is an “e.” In the example above (“hello” --> “fjuuy”), is a trivial example for Frequency Analysis because of the limitation of the sample size. Frequency Analysis yields better results in increase sample sizes. This method made it extremely easy to crack mono-alphabetic substitution ciphers.
Frequency Analysis is also used to break some simple transposition ciphers. A simple transposition cipher shifts letters up or down the alphabet (or another defined sequence). Frequency Analysis can be applied to this situation like the substitution cipher. Probabilities are found and the highest probability letter will be defined as an “e.” There is an extra step in cracking a transposition cipher. You will then look what the offset, or transposition, is on the letter. For example, if I had a plaintext data sequence of “apple, ” and a ciphertext data sequence of “crrng,” we would use Frequency Analysis (of course we would need a larger sample of data) and count how many spaces the letter we found that was most frequent, relative to the letter “e.” The result would then give you the transposition number, which can then be used on each character to decrypt the data.
Another cipher was developed using the same core of a substitution cipher, but it used multiple alphabets. These ciphers were named poly-alphabetic substitution ciphers. This limited the use of Frequency Analysis. In poly-alphabetic ciphers the same plaintext letter could be encrypted different ways in different parts of the data. For example, the plaintext “hello world” could be encrypted as “jokht yefiq.” The letter “o” was encrypted as “t” in the first instance and “e” in the second instance. Poly-alphabetic ciphers are not completely immunized from Frequency Analysis. If you can figure out the length between rotating alphabets, you can use Frequency Analysis on subdivisions of the ciphertext. For example, one alphabet might be used for the first four letters and then it would switch to another alphabet for the next four letters. One would use frequency analysis on each of the subdivisions of size four. It is not extremely reliable to attack poly-alphabetic ciphers with Frequency Analysis though. Many of the Cryptographers that write the ciphers understand that Frequency Analysis relies on a large sample size. They deal with this by shifting through alphabets at relatively short lengths. The example of a four letter size subdivision would be too small to use frequency analysis on.
These three ciphers (substitution, transposition, and poly-alphabetic substitution) are known as traditional ciphers. They are not frequently used today by themselves. Some of the traditional ciphers are used in a part of a modern algorithm, but they are not used alone. Modern ciphers are broken into two categories involving information about the key. The first is private key cryptography (symmetric key). The name sums up the meaning pretty well. The key is kept private. You can think of Private Key Cryptography as passwords. The password would be the key, and it should be kept secret.The next category of modern cryptography is Public Key Cryptography. Public Key Cryptography (asymmetric key) is also self explanatory, the key can be kept public, or shared. The reason why the key can be shared is because the system uses multiple, different private keys and one public key.
In private key cryptography the same key is used to encrypt as it is to decrypt. This type of algorithm is usually much less computationally rigorous compared to a public key algorithm. The disadvantage of a private key algorithm is the key exchange handling. The key must be kept secret to ensure security. Two branches within private key cryptographic algorithms are stream ciphers and block ciphers. Stream ciphers encrypt each character in the plaintext. Block ciphers encrypt a block of data at a time.
The most recent type of Cryptography is Public Key Cryptography. Two keys are needed, one to encrypt and another to decrypt. The algorithm's computation time is usually 100 to 1,000 times slower than private key algorithms. The algorithm has two keys which enables some sense of heightened security. If one key is compromised a person could still be able to encrypt or decrypt (mutually exclusive if one is cracked). But if someone did crack the encryption key, they would be able to send seemingly authentic messages. If someone cracked the decryption key, they would be able to read encrypted messages. These flaws brought up another system in place for public key algorithms. Digital Signatures are used to combat these security concerns. A message is signed with the writer’s private key to use to authenticate the message. Public key cryptography is the next step in the cryptographic evolution. It utilizes both private key and the new public key ideas to make the most advanced cryptographic systems.
Another type of algorithm used to encrypt a data set is a hash algorithm. Hash algorithms are not able to be decrypted. This concept of not being able to decrypt a data set is abstract to most. It serves a great purpose though. Windows, for example, uses a hash algorithm to encrypt all of the user account passwords for a system. When someone attempts to log on to a user account, the password they enter is hashed and checked with the hash that is stored in the system. If the system wasn’t setup like this, hackers could simply locate the file that the passwords are stored in. Hackers can still locate the file the passwords are stored in, but now the passwords are encrypted within the file.
Hash’s have some limitations. In the Windows example, the two hashes must be cross referenced and checked. For this to happen, the algorithm can not be truly random. Nevertheless, hashes provide a pretty strong system of security. Rainbow tables provide a reliable crack for hash codes. Rainbow tables work like two trees, each branching down. The table starts with the least reduced function for the algorithm and reduces it down the tree. The reduction starts with the maximum constraints on the hash algorithm, and then finds collisions within hashes. There are commonly available hash tables already generated by supercomputers that are available for download. The file sizes range from 400mb - 1Tb. If a simple code is being cracked, a 400mb rainbow table might be sufficient. If a high security code is being cracked, a 1Tb rainbow table would have a better shot. Rainbow tables are just huge lists, this means that time is necessary. 1Tb tables could run for days or months before yielding a result.
Many different methods of cracking cryptographic codes have been developed. Statistical analysis has already been discussed. This method is very reliable for transposition and mono-alphabetic substitution ciphers. Another attack method is a dictionary attack. This type of attack is in a category of brute force. A dictionary attack is just a list of a bunch of common words (by a bunch, I mean millions). Brute force attacks take a lot of time and are only somewhat reliable in the feasibility of cracking large codes. Rainbow tables (used to crack hashes) are also half classified as a brute force attack because it does run large lists. The reason it is not fully classified as brute force is because some logic of reduction of the function is used to optimize results and the time it takes to obtain results.
Another attack method is the use of a keystroke logger. This method is difficult because hardware or a “bug” must be implemented on a person that has knowledge of the plaintext. The keystroke logger simply records every keystroke that a user makes. The file can then be analyzed to see what the person was doing and when exactly the password or code was entered. There are many hardware keystroke loggers available. It’s also possible to program a background program that uses key listeners that record data files. If a software keystroke logger is used, then it is possible to mount an attack from afar by hacking into a workstation or installing the software on the computer.
Cryptography’s main security lies on infeasible mathematical solutions. A major crack would be to write an algorithm that would factor extremely large prime numbers in a feasible period of time or be able to solve discrete logarithm problems. Quantum computers are able to perform these operations fairly easily. Once quantum computers become available, our modern cryptographic systems will need to be upgraded to problems that are infeasible to quantum operations. Cryptographic systems will always be advanced and can withstand any advance in technology. When quantum computers become available, people will begin creating cryptographic systems with quantum computers. These algorithms will be more advanced than traditional computing algorithms. Therefore the codes will, again, be “infeasible.”
Cast your vote on this article 10 - Highest, 1 - Lowest
Of course. That segment of my comment has been addressed to gregorian. I meant for an introduction it could be a bit more. Of course it's still pretty nice, and if you're planning to make a part 2, a more specific one, well, I'm looking forward to get through it.
nice coverage of block ciphers, they aren't incredibly important or anything. also, hash algorithms aren't used to "encrypt a data set." you forgot to mention the birthday attack, avalanche effect, etc etc
HackThisSite is is the collective work of the HackThisSite staff, licensed under a CC BY-NC license.
We ask that you inform us upon sharing or distributing.
Very good informal guide!