Exaelia wrote: If I put this under the wrong topic, I'm really sorry. I saw a topic that said 'how to reverse-engineer.' I don't really understand what that is. I'm a newbie, so I know absolutely nothing. Well, there are new people who went to college... Uh.. never mind. But I have no idea about anything.
Alright, I'll only answer this because I love Reverse Engineering... and you seem like a pretty cool guy/gal. Anyway, seeing as you are saying you know nothing, I'll start from nothing.
1. Programming languages:
So, we have these really cool things called programming languages, and they're needed to get almost anything technological working: phones, computers, the computers in cars, rocket ships, calculators, and so on. On a computer, that covers web browsers (Firefox, Chrome, IE, Safari, Opera, etc.), text editors (Microsoft Word, Notepad, Gedit, Vim, Emacs, etc.), operating systems (Windows, Mac OS X, Linux, etc.), and more. These programming languages vary in how hard they are to understand, how hard they are to learn, what they look like, and how fast the programs written in them run (yes, I know, I'm leaving something out). Some programming languages look almost identical to English, except with bad grammar and some funky symbols mixed in (all forms of BASIC). Others, to the untrained eye, look like someone took a shit made out of symbols, numbers, and letters (Perl, C/C++, BrainFuck, ASM). These are the two extremes, though; there are plenty of languages in between that get the best of both worlds. Here's an example of BASIC, and of ASM, so you can see the difference between the two extremes (once again, to the untrained eye):
BASIC:
5 LET S = 0
10 MAT INPUT V
20 LET N = NUM
30 IF N = 0 THEN 99
40 FOR I = 1 TO N
45 LET S = S + V(I)
50 NEXT I
60 PRINT S/N
70 GO TO 5
99 END
Pretty easy to understand, no?
ASM:
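section .text
org 0x100                ; DOS .COM programs are loaded at offset 0x100
mov ah, 0x9              ; DOS function 9: print a '$'-terminated string
mov dx, hello            ; DX = address of the string to print
int 0x21                 ; call DOS
mov ax, 0x4c00           ; DOS function 0x4C: exit, with return code 0
int 0x21                 ; call DOS
section .data
hello: db 'Hello, world!', 13, 10, '$'   ; the text, CR, LF, then the '$' terminator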
And that one is kind of a complete mess to look at...
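For one of the "in between" languages, here's a rough sketch of the averaging part of the BASIC listing above, written in Python. The function name average is just something I made up for the example, and it takes a ready-made list instead of reading numbers from the keyboard like MAT INPUT does.

```python
def average(values):
    """Return the mean of a list of numbers, or None if the list is empty."""
    if len(values) == 0:          # IF N = 0 THEN 99 (nothing to average)
        return None
    total = 0                     # LET S = 0
    for v in values:              # FOR I = 1 TO N
        total = total + v         # LET S = S + V(I)
    return total / len(values)    # PRINT S/N (returned here instead of printed)


print(average([2, 4, 9]))  # 5.0
```

Same idea as the BASIC version, still readable, and no line numbers to keep track of.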
2. Making these programming languages work:
Alright, so far we've covered that (almost) everything technological has to be programmed in order to work, and that these programming languages are behind the scenes, carefully making sure that when you hit that button, it does what it is supposed to do. We've also covered that these languages vary greatly from one another, though they can all do pretty much the same thing. So, how exactly do these programming languages make things work?
Well, my dear, it's actually not that difficult to understand. However, you must first understand something really fancy called an "abstraction layer". Basically, abstraction means making a hard thing easier by putting a "layer" (so to speak) on top of it that takes whatever simple thing you put in and turns it into the really hard stuff. For instance, instead of having to turn your radio on by connecting the wires together and holding them every time you want to listen to music, you just push a button that does that for you. That button is an abstraction layer: it abstracts away the difficult task of holding the wires together, making it easier for you to turn a radio on and keep it on. Got it? Good.
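If it helps, here's the radio idea as a tiny Python sketch. The Radio class and its method names are completely made up for illustration; the point is only that press_power_button is the easy layer and the wire-handling methods are the hard stuff hidden underneath.

```python
class Radio:
    """Hypothetical radio, used only to illustrate an abstraction layer."""

    def __init__(self):
        self.wires_connected = False

    def _connect_wires(self):
        # The "hard stuff": the low-level step nobody wants to do by hand.
        self.wires_connected = True

    def _disconnect_wires(self):
        self.wires_connected = False

    def press_power_button(self):
        # The abstraction layer: one simple action that hides the wiring.
        if self.wires_connected:
            self._disconnect_wires()
        else:
            self._connect_wires()
        return self.wires_connected


radio = Radio()
print(radio.press_power_button())  # True: the radio is now on
print(radio.press_power_button())  # False: pressing again turns it off
```

Whoever uses the radio only ever touches the button. They never need to know the wires exist.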
Alrighty, so, in programming we have many, many, many abstraction layers. At the very top of the stack, things are pretty easy to use and very human friendly, because all of the hard stuff is "abstracted" (catching on yet?). At the bottom, where there are no abstraction layers left, we find binary. Yes, the cliché 0's and 1's that actually make a computer tick. One level up from raw binary is machine code: the particular patterns of 0's and 1's that a processor understands as instructions (strictly speaking, machine code is still binary, just binary with a meaning attached). Every machine understands machine code, though each kind of processor has its own version of it. Writing it by hand is easier than flipping bits one at a time, but it's still pretty tough. One layer up from machine code is assembly code, otherwise known as ASM. It abstracts the machine code by giving each instruction a short name, making it EVEN easier to program (though it's still a pain in the ass). Above ASM are all the more well-known languages, such as C, that make programming a much, much easier task (and nowadays, even more languages are abstracted above C). Got it? Cool.
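To make the bottom couple of layers a bit more concrete, here's a small Python sketch that prints the same four bytes two ways: as raw 0's and 1's, and as the hex numbers people usually write machine code in. The bytes are the 16-bit x86 encodings of two instructions from the ASM example (mov ah, 0x9 and int 0x21); the helper names as_binary and as_hex are my own.

```python
# Machine code for two instructions, as raw bytes (16-bit x86):
#   B4 09  ->  mov ah, 0x9
#   CD 21  ->  int 0x21
MACHINE_CODE = bytes([0xB4, 0x09, 0xCD, 0x21])


def as_binary(code):
    """Show each byte as eight 0's and 1's."""
    return " ".join(format(b, "08b") for b in code)


def as_hex(code):
    """Show each byte as two hex digits."""
    return " ".join(format(b, "02X") for b in code)


print(as_binary(MACHINE_CODE))  # 10110100 00001001 11001101 00100001
print(as_hex(MACHINE_CODE))     # B4 09 CD 21
```

It's the exact same data on both lines. Each layer up just gives you a friendlier way of looking at it.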
So, now that we get the abstraction layers in programming (somewhat), it's much easier to move on to how they all work together. We already know that at the bottom of the pile is binary (1's and 0's), and that binary is what makes a computer tick. Essentially, any program you're running right now is actually running in binary at the very bottom level of it all. But how do we get from the top layer down? Well, between each layer there is some translating down. (Imagine trying to talk to someone who speaks a different language: you would need a translator, right? In our case, each layer is translated into the less abstracted, harder-to-program-in code below it.) Imagine the abstraction layer stack we made collapsing: BASIC (the easy language) translates into C (the slightly more difficult language), which translates into assembly (the even harder language), which translates into machine code (the bitch of a language), which is the binary (the 1's and 0's) the computer runs. Most of this is done by a special program called a compiler. It translates all of this mess for us into assembly code (pretty close to the bottom of the layers). A real compiler usually goes straight from your language down to assembly rather than stopping at every layer in between, but the idea is the same. Then a program called an assembler turns that assembly code into machine code, which the computer can actually run. (Complicated, right?)
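Here's a toy version of that last step, the assembler, sketched in Python. It is nowhere near a real assembler: it only knows "int" with a number and "mov" of a number into a register, it assumes 16-bit x86 encodings, and it can't handle labels (so I left out the mov dx, hello line from the earlier example). All the names in it are made up for the sketch.

```python
# Register numbers as used in the 16-bit x86 "mov register, number" encodings.
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}


def assemble_line(line):
    """Translate one line of (very limited) assembly into machine code bytes."""
    mnemonic, _, operands = line.strip().partition(" ")
    args = [a.strip() for a in operands.split(",")]
    if mnemonic == "int":
        # CD ib: software interrupt with an 8-bit number
        return bytes([0xCD, int(args[0], 0)])
    if mnemonic == "mov":
        reg, value = args[0], int(args[1], 0)
        if reg in REG8:
            # B0+r ib: load an 8-bit number into an 8-bit register
            return bytes([0xB0 + REG8[reg], value & 0xFF])
        if reg in REG16:
            # B8+r iw: load a 16-bit number, low byte stored first
            return bytes([0xB8 + REG16[reg], value & 0xFF, (value >> 8) & 0xFF])
    raise ValueError("toy assembler does not understand: " + line)


def assemble(source):
    """Assemble every non-empty line and glue the bytes together."""
    out = b""
    for line in source.splitlines():
        if line.strip():
            out += assemble_line(line)
    return out


program = """
mov ah, 0x9
int 0x21
mov ax, 0x4c00
int 0x21
"""
print(assemble(program).hex(" "))  # b4 09 cd 21 b8 00 4c cd 21
```

Each line of assembly turns into two or three bytes. That lookup-and-translate job is basically all an assembler does, just for a few hundred instructions instead of two.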
3. Reverse engineering:
Now that we understand how each language is translated down through the abstraction layers (or stack), we can see the problem: if we want to look at the code of a program that has already been compiled (translated down the abstraction layers), and we don't have the source code (the original copy of the code before it was translated down), we're going to need another special program. This is where OllyDbg and IDA Pro step in. These programs take apart the compiled program (even while it's running) and show us the raw data of it. However, instead of showing us the binary (because fuck binary), they make it as user friendly as possible and show us the assembly code. Going any higher than that would mean turning it back into a real language; there are programs that can sort of do this, but since we don't know what language the program was originally written in, it usually doesn't work well. By using these programs, we can change the assembly code and reassemble the program (run it back through the assembler, the thing that turns assembly code into machine code), thus changing how the program works.
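To give a feel for what those tools do under the hood, here's a toy disassembler sketched in Python. It is nothing like OllyDbg or IDA Pro in scope: it only recognizes three 16-bit x86 instruction shapes (int with a number, and mov of a number into an 8-bit or 16-bit register) and prints anything else as a raw byte. The function name and the sample bytes are my own choices for the example.

```python
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]


def disassemble(code):
    """Turn machine code bytes back into (very limited) assembly text."""
    lines = []
    i = 0
    while i < len(code):
        op = code[i]
        if op == 0xCD and i + 1 < len(code):
            # CD ib: int with an 8-bit number
            lines.append("int 0x%x" % code[i + 1])
            i += 2
        elif 0xB0 <= op <= 0xB7 and i + 1 < len(code):
            # B0+r ib: mov 8-bit register, number
            lines.append("mov %s, 0x%x" % (REG8[op - 0xB0], code[i + 1]))
            i += 2
        elif 0xB8 <= op <= 0xBF and i + 2 < len(code):
            # B8+r iw: mov 16-bit register, number (low byte stored first)
            value = code[i + 1] | (code[i + 2] << 8)
            lines.append("mov %s, 0x%x" % (REG16[op - 0xB8], value))
            i += 3
        else:
            # Anything this toy doesn't know is shown as a raw data byte.
            lines.append("db 0x%02x" % op)
            i += 1
    return lines


code = bytearray.fromhex("b4 09 cd 21 b8 00 4c cd 21")
print(disassemble(code))
# ['mov ah, 0x9', 'int 0x21', 'mov ax, 0x4c00', 'int 0x21']

# Patch one byte: the low byte of that 16-bit number ends up in AL,
# which DOS uses as the program's exit code.
code[5] = 0x01
print(disassemble(code)[2])  # mov ax, 0x4c01
```

That last bit is the "change the program" part in miniature: flip one byte, and the program now exits with code 1 instead of 0, no source code needed.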
What is the use of this? By reverse engineering, people can crack programs, thus getting them for free. On a more legal note, they can see why a program is crashing: they can watch the assembly code as the program runs, find the bug, and squash its little head. Or they can reverse engineer viruses and other malware to see how they work, so that we can patch the security holes and stop them from attacking our systems again.
Shit that's a long post... Hope you understand it!