-
- The Idea Of IDA
- (A Small Primer For IDA Newbies)
-
- By
-
- Gij
-
- Hi there, gij here. I'm guessing most of you are reading this because you've
- heard about IDA and are thinking "why is it better than wdasm?", or you have
- already gotten IDA, but found it too complicated to use.
- This is being written to help you out with your first steps in using IDA.
- Hope it helps...
-
- The Disassembling Challenge
- ---------------------------
-
- Most software today is written using high level languages: C, C++, VB,
- JAVA, Delphi, etc.
- These are ( generally speaking ) compiled languages: a compiler turns the
- high-level code into the low-level form that the computer understands, machine
- code, of which Assembler is the human-readable notation.
- We need to make a distinction between a Decompiler and a Disassembler.
- A decompiler takes a binary file generated by a compiler and tries to reverse it
- into the high level language the file was compiled from.
- So for example a C++ decompiler would take an .exe made by Visual C++ or any
- other compiler and turn out a C++ source file. A good decompiler is very hard
- to make.
- A Disassembler, on the other hand, will take a binary file that could have been
- written in almost any language, and disassemble it into an Assembler source file.
-
- A good disassembler needs to be able to fairly accurately distinguish between
- data and code.
- A good decompiler needs to do that AND be able to understand what code
- construct in the original high-level language generated this code.
-
- So what is IDA? Neither. It's a hybrid between the two types of programs.
- IDA stands for "Interactive Disassembler", and it is. But it also has some of
- the characteristics of a decompiler, namely its FLIRT feature.
-
- Those of you who have programmed in C, or indeed ever tried to debug a windows
- program, know that every program uses some functions supplied by the compiler
- or as part of the Win32 API.
- "printf()" is one example: all C programs that call printf and are compiled
- by the same compiler have the same piece of code inside them.
- During the build, the code for the "printf()" function is linked in from the
- compiler's included libraries.
-
- This opens up an opportunity for a disassembler to recognize the code pattern
- of a particular function and pin a useful name on it.
- This also saves us from the embarrassment of tracing through a function for 20
- minutes only to discover it's some compiler's variant of "fseek()".
- It also saves us time: seeing which functions a program calls makes it much
- easier to understand what the program is doing.
-
- This is exactly what FLIRT does. IDA comes with FLIRT libraries that hold many
- signatures of functions from various compilers, not only for C but also
- for Pascal, Delphi and others.
- In a way you get the best of both worlds: IDA lets you reverse any binary file
- from any language, while taking advantage of the shared nature of compiler
- libraries.
- You can get more information on FLIRT at the IDA home page.
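-
- To make the idea of signature matching a bit more concrete, here is a small
- Python sketch. It is NOT how FLIRT is actually implemented: the byte patterns
- are made up for illustration, and the names SIGNATURES, matches() and
- identify() are my own. None stands for "any byte", which is needed because
- addresses inside a library function change from program to program.
-

```python
from typing import Optional, Sequence

# Hypothetical signature table (byte values invented for this example).
# None is a wildcard standing for a byte that varies between programs.
SIGNATURES = {
    "_printf": [0x55, 0x8B, 0xEC, 0x68, None, None, None, None, 0xE8],
    "_fseek": [0x55, 0x8B, 0xEC, 0x83, 0xEC, 0x08, 0x53],
}


def matches(code: bytes, pattern: Sequence[Optional[int]]) -> bool:
    """Return True if code starts with pattern, treating None as a wildcard."""
    if len(code) < len(pattern):
        return False
    return all(p is None or p == b for p, b in zip(pattern, code))


def identify(code: bytes) -> Optional[str]:
    """Return the name of the first signature that matches the start of code."""
    for name, pattern in SIGNATURES.items():
        if matches(code, pattern):
            return name
    return None
```

- Calling identify() on the first bytes of an unknown function either gives
- back a library name to pin on it, or None if nothing matches.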
-
- IDA vs. Wdasm
- -------------
-
- Those of you who have been using Wdasm up till now will need to make a slight
- switch in attitude when moving over to IDA.
- Wdasm takes in a file and gives out a disassembly. That's it.
-
- IDA is INTERACTIVE. This means the disassembly you get is very much editable:
- you can change code to be marked as data, and the other way around.
- You can add comments, see cross-references ( very useful, we'll get to it
- later ), and probably a whole lot more.
- That's probably why most people consider it more complex, or heavier than Wdasm.
- In IDA you have to do more work, but you can accomplish much more.
- It's possible to completely reverse an application inside IDA, generate a source
- and have it compile to a byte-identical exe file ( I'm not saying it's easy
- though ).
-
- IDA by itself adds comments to some API calls or INTs, and using some of the
- tools available, you can add your own comments to the databases.
-
- The Use of IDA
- --------------
-
- There are two versions of IDA you can use, idaw.exe and idax.exe.
- idaw.exe is a windows exe, while idax.exe is a dos file.
- I've heard it said that it's better to use idaw.exe because as a windows program
- it has less trouble with memory.
- I personally use idax.exe because I've found that with international versions
- of windows idaw.exe tends to have problems when typing in text.
-
- One Important thing to remember ( and thank you to whatever kind soul on
- #cracking helped me understand this when I was starting out with IDA )
- is that once you feed IDA an exe, dll, or whatever file, upon exit it saves
- a Disassembly database with the extension .IDB. When you wish to continue
- work on the disassembly, you do not reload the binary file, but tell IDA
- to load the .IDB file. Once you've generated the .IDB file for a program
- you no longer need the original EXE: all changes to the disassembly are
- made and stored in the .IDB file.
-
- OK, let's get down to some real-life use of IDA.
-
- Actual Use Of IDA
- -----------------
-
- You can load a program into IDA in 2 ways:
- 1) on the command line: "idax.exe c:\target\program.exe"
- 2) through the file dialog which appears when starting IDA without
- a filename argument.
-
- Once you load the program into IDA, it will show you a panel of options to
- choose from. The default options are chosen for you according to the file
- format you loaded.
- The other options are either advanced ( meaning, if you need them, you know
- how to use them ), or self-explanatory.
-
- IDA then goes through 2 phases:
- 1) Actual Disassembly, including separation of the program into code and data
- areas. This is not fool-proof; IDA does make mistakes sometimes, so don't
- expect to run a program through IDA and have a 100% accurate
- disassembly. Still, IDA comes as close as anything I've ever seen.
- At this phase it also marks functions and analyses their stack arguments,
- assigns label names to jump and call destinations, separates the file
- into segments if needed, and probably more.
-
- 2) If IDA recognizes the file as compiled by a supported compiler, it will
- load a signature file, and try to assign names to as many functions as
- possible.
-
- After IDA has finished doing its work, you can see some of the results of
- the disassembly. Besides the main screen, where the disassembly is shown,
- there is also the status window, where information about what IDA is doing
- is shown, and other screens you can activate to see information about the
- disassembly.
- To switch between the windows you can use the "Windows" menu, or F6 and
- Shift-F6 to go back and forth between windows.
-
- The Disassembly window can be divided into 3 parts:
- 1) segment:offset ( to the left )
- 2) code/data ( Taking up most of the width of the screen ).
- 3) Comments. These are actually part of the disassembly, but sometimes
- contain Important information like Cross-references and API call info.
-
- All in all, the disassembly screen should look familiar if you've ever written
- asm code.
-
- An IDA Disassembly is divided into 3 types of areas/entries:
- 1) Code.
- 2) Data.
- 3) Unexplored.
-
- You can distinguish between code and data just like you would in a normal asm
- file: data is preceded by some sort of data specifier: db,dw,dd,dq,dt....
- You can detect unexplored areas by looking at the left part of the screen:
- unexplored areas are marked by a "greyed out" segment:offset.
- Unexplored areas are the areas to which no reference is made, either in code
- ( they are not jumped to or called, or incorporated into the code flow of
- the program in any other way ) or as data ( they are not read or written
- by code ).
-
- These places usually ( always? ) contain 0's, and are usually not relevant
- to the code, unless IDA missed marking as code some area which references them.
- This illustrates another point: with any change you make to the
- disassembly, IDA checks if it can deduce anything else about the rest of the
- program from the changes you've made. So for example, if you mark a data
- area as code, IDA will look at it, and if it sees a jump to some unexplored
- space, it will mark that area as code. Same with data: if the new code you've
- marked reads or writes a previously unexplored address, IDA will mark that
- area as data.
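-
- The "deduce more from every change" behaviour can be pictured with a toy
- model in Python. This is only a sketch under heavy assumptions: the PROGRAM
- table, its one-instruction-per-address layout and the mark_code() helper are
- all invented here, and IDA's real analysis is far more involved.
-

```python
# Toy "binary": address -> (mnemonic, operand). Purely hypothetical.
PROGRAM = {
    0: ("mov", 5),     # reads the value stored at address 5
    1: ("jz", 3),      # conditional jump to address 3
    2: ("ret", None),
    3: ("jmp", 2),     # unconditional jump back to address 2
    4: ("nop", None),  # nothing ever refers to this address
    5: ("nop", None),  # only ever referenced as data
}


def mark_code(program, start, marks):
    """Mark start as code, then follow jumps, calls and data references."""
    worklist = [start]
    while worklist:
        addr = worklist.pop()
        if addr not in program or marks.get(addr) == "code":
            continue
        marks[addr] = "code"
        mnemonic, operand = program[addr]
        if mnemonic in ("jmp", "jz", "call"):
            worklist.append(operand)           # the target must be code too
        elif mnemonic == "mov" and operand is not None:
            marks.setdefault(operand, "data")  # the referenced address is data
        if mnemonic not in ("jmp", "ret"):
            worklist.append(addr + 1)          # execution falls through
    return marks


marks = mark_code(PROGRAM, 0, {})
```

- After the call, addresses 0 to 3 are marked as code, address 5 as data, and
- address 4 has no mark at all, which is what IDA would call unexplored.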
-
- To change the marking of an area, you select it using the mouse, then press
- one of 3 keys:
-
- 1) 'c', marks that area as code.
- 2) 'd', marks that area as data. If you do not select an area, but place the
- cursor at a line and press 'd', you will cycle through the data specifiers
- available. If you mark an area, you can use 'a' to declare it a string,
- which will make IDA automatically give the string a name, and show it as
- "123" instead of "db 31h,32h,33h", for example.
- You can also use '*' to make it an array of data, which will pop up a dialog
- where you can set various options concerning the array ( this is the same as
- marking an area and pressing 'd' ).
-
- 3) 'u', marks that area as unexplored.
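-
- The difference between the 'd' and 'a' views of the same bytes is only one of
- presentation. A small Python sketch shows the two renderings side by side; the
- helper names are my own and the output only roughly imitates what IDA prints.
-

```python
def hex_literal(value: int) -> str:
    """Format a byte the way assemblers expect, e.g. 31h or 0FFh."""
    text = f"{value:02X}h"
    # Assembler hex literals must start with a digit, so prefix a 0 if needed.
    return "0" + text if text[0].isalpha() else text


def as_db(data: bytes) -> str:
    """Render the bytes as plain data, one hex value per byte."""
    return "db " + ",".join(hex_literal(b) for b in data)


def as_string(data: bytes) -> str:
    """Render the same bytes as a quoted ASCII string."""
    return 'db "' + data.decode("ascii") + '"'
```

- as_db(b"123") gives the raw byte view, as_string(b"123") the readable one.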
-
- Names
- -----
-
- Names are the most Important part of a disassembly. It's the difference between
- 'loc_0_200' and 'Show_Splash_Screen' that makes it possible to understand a
- program bit by bit.
- To change the name of a label, or create a new one, position the cursor at the
- desired line and press 'n'. A dialog will appear which will let you enter or
- modify the label name.
-
-
- Cross-references And Information Screens
- ----------------------------------------
-
- I've already mentioned Cross-references in this article a few times, so it's
- only fair that I explain what they are to those who do not know.
- A reference is created when a certain piece of code uses another area in some
- way: this could be a call ( one piece of code calls another ), a jump, or a
- read or write operation on a data location.
- IDA keeps track of these references, and maintains a table of cross-references
- for every label in the disassembly. For example:
-
- seg000:0200
- seg000:0200 loc_0_200: ; CODE XREF: _main+1EAjump tablesj
- seg000:0200 cmp word_1B90_14C, 0
- seg000:0205 jz loc_0_20A
- seg000:0207 jmp loc_0_2F2
-
- This means that the label loc_0_200 is referenced as CODE ( jump or call, thus
- "CODE XREF" ) by another location in the program. To see the list of places
- that reference this location, position the cursor at the line of the label,
- and select "Cross references" from the "View" Menu. A window should appear with
- a list of locations; you can jump to one of them by selecting it in the window
- and pressing ENTER.
- IDA also keeps track of your little "field-trips" inside the code, so after
- you've traced 12 functions deep into the code you can press ESC to get back to
- the place you were at before your present location.
- Data Cross References are much the same, no need to explain them separately.
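-
- The ESC "go back" behaviour is easiest to picture as a stack of visited
- locations. Here is a minimal Python sketch of that idea; the History class is
- my own invention and I'm only assuming IDA works along these lines.
-

```python
class History:
    """Remembers visited locations so that back() returns to the previous one."""

    def __init__(self, start):
        self.current = start
        self._stack = []

    def jump(self, target):
        """Follow a reference: remember where we were, then move."""
        self._stack.append(self.current)
        self.current = target

    def back(self):
        """Return to the previous location, staying put if there is none."""
        if self._stack:
            self.current = self._stack.pop()
        return self.current


history = History("_main")
history.jump("loc_0_200")
history.jump("loc_0_2F2")
first_back = history.back()   # like pressing ESC once
second_back = history.back()  # and once more
```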
-
- Cross references are very Important because they give us a little info about
- a piece of code or data by telling us it is related to another piece of code.
- So potentially, if you've understood one piece of code, you can use it as an
- "anchor" into other locations in the program.
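-
- A cross-reference table is, at heart, just a map from each target to the
- places that refer to it. The Python sketch below builds one from the listing
- shown earlier; the REFERENCES list and build_xrefs() are hypothetical and say
- nothing about how IDA stores this internally.
-

```python
from collections import defaultdict

# Hypothetical references taken from the example listing:
# (source location, kind of reference, target label)
REFERENCES = [
    ("_main+1EA", "jump", "loc_0_200"),
    ("loc_0_200", "read", "word_1B90_14C"),
    ("loc_0_200+5", "jump", "loc_0_20A"),
    ("loc_0_200+7", "jump", "loc_0_2F2"),
]


def build_xrefs(references):
    """Map each target to the list of (source, 'CODE' or 'DATA') referring to it."""
    xrefs = defaultdict(list)
    for source, kind, target in references:
        # Jumps and calls are code references, reads and writes are data ones.
        category = "CODE" if kind in ("jump", "call") else "DATA"
        xrefs[target].append((source, category))
    return dict(xrefs)


xrefs = build_xrefs(REFERENCES)
```

- Looking up xrefs["loc_0_200"] then answers the question "who gets here?",
- which is exactly what the "Cross references" window is for.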
-
- I should take a minute to tell you about the other items in the view menu,
- even though they are fairly self-explanatory:
-
- 1) Disassembly: shows you the Disassembly screen for the current file.
-
- 2) Functions: shows you a list of locations marked as function entry points.
-
- 3) Names: shows you a complete list of FLIRT-recognized functions, and
- locations marked as strings.
-
- 4) Signatures: shows you the FLIRT signature files currently loaded and
- applied to the file, you can manually add other signature
- files here.
-
- 5) Segments: shows you the created segments
-
- 6) Segment registers:
- it seems to show you the changes in, and the value of, each
- segment register between any range of locations in the file.
-
- 7) Selectors: this is Pmode related, and won't show anything normally; perhaps
- you need to add them manually. Probably used when disassembling
- extenders, or something else that's fairly dodgy.
-
- 8) Cross references: you know.
-
- 9) Structures: You can define structures and assign them to data areas; this
- will show the currently defined structs. ( see how advanced IDA
- is? )
-
- 10) Enumerations:
- Enums are used to give Names to numbers ( constants that appear as
- operands ). I've never had to use them; normally, neither would you.
-
-
- Comments
- --------
-
- There are two types of comments: repeatable and regular.
-
- 1) Repeatable: made by pressing ';'. When you add this sort of comment
- to a function, the comment will appear next to any call to that function
- anywhere in the file.
-
- 2) Regular: made by pressing ':'. This sort of comment will appear only once,
- at the location you've entered it.
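-
- To show what "repeatable" means in practice, here is a toy Python sketch
- that copies a function's comment next to every call to it. The listing, the
- comment text and the annotate() helper are all made up for the example.
-

```python
def annotate(listing, repeatable):
    """Append a function's repeatable comment to every line that calls it."""
    result = []
    for line in listing:
        parts = line.split()
        if len(parts) == 2 and parts[0] == "call" and parts[1] in repeatable:
            # Pad the instruction so the comments line up in one column.
            line = f"{line:<24}; {repeatable[parts[1]]}"
        result.append(line)
    return result


listing = ["call Show_Splash_Screen", "mov ax, 1", "call Show_Splash_Screen"]
comments = {"Show_Splash_Screen": "draws the logo"}
annotated = annotate(listing, comments)
```

- Both call lines end up carrying the comment, while the mov line is left
- alone. A regular comment would simply stay on the one line it was typed on.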
-
-
- Marking Positions
- -----------------
-
- Sometimes after disassembling for a while, you want to mark a location. It
- could be the beginning of the data strings section, an Important function,
- your lucky number as an offset, whichever.
- IDA has a feature that lets you mark positions, and give the mark a name.
- To plant a mark, go to the location you wish to mark, and then press Alt-M, or
- select "Mark Location" from the "Navigate" Menu. This will pop up a window;
- press enter on an empty line ( if there are no marked locations, they will
- all be blank ). A dialog will appear asking what name you wish to give to that
- mark. This is NOT like generating a label for a location: only the marked
- positions table will show that name.
- To go to a marked position, select "Jump To/Marked Position" out of the
- "Navigate" Menu. A list of marked positions will show up; choose the one you
- want to jump to and press ENTER.
-
- Getting Around
- --------------
-
- Sometimes you need to get around IDA, and you just can't be bothered to do it by
- pressing PgUp a 3-digit number of times.
- That's when you can use the "Navigate" menu. The sub menus you need are:
-
- 1) Jump to: You can go to a specific address, function, string, segment,
- entry point, cross reference, marked position, etc.
-
- 2) Search for: allows you to search for text, specific operands to an
- instruction, next code/data/unexplored, etc.
-
- Exporting
- ---------
-
- Those of us that write articles ( yeah, me too ), use IDA to rip code out
- and paste it into our text as examples.
- To do this you need to export the code you want to a file, and that is done
- by using the "File/Produce Output File" menu item.
- There are 3 options we care about right now:
-
- 1) Produce ASM file: this will put out an .ASM file that you should be able
- to feed to TASM or MASM.
-
- 2) Produce LST file: same, but will also put segment:offset pairs into the
- file. Use this for articles.
-
- 3) Produce DIF file: IDA can also serve as a patcher, look at
- "EDIT/Patch Program". If you modify the program and use this option
- you will get a file similar to the output of the dos program "fc".
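-
- For those who have never seen a byte-level diff, here is a short Python
- sketch that produces "offset: old new" lines in the spirit of "fc /b". The
- function name byte_diff() is my own, and I'm not claiming this reproduces the
- exact layout of the file IDA writes.
-

```python
def byte_diff(original: bytes, patched: bytes):
    """Return one 'offset: old new' line for every byte that was changed."""
    if len(original) != len(patched):
        raise ValueError("a patch must not change the size of the file")
    return [
        f"{offset:08X}: {old:02X} {new:02X}"
        for offset, (old, new) in enumerate(zip(original, patched))
        if old != new
    ]


# Example: a conditional jump (74h) patched into an unconditional one (EBh).
changes = byte_diff(bytes([0x74, 0x05, 0x90]), bytes([0xEB, 0x05, 0x90]))
```

- A list like this is all a patcher needs: go to each offset, check that the
- old byte is there, and write the new one.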
-
-
- Final Notes
- -----------
-
- There is more to write about IDA. You should know that it has a scripting
- language, which lets you automate some rather mundane disassembling procedures.
- If you would like to learn more about IDA scripting, check the help file for
- syntax and a list of functions; you can also get some very nice scripts
- at mammon's place.
-
- For real-life examples of cracking with IDA I direct you to the normal place
- ( you know ), plenty of articles out there.
-
- I appreciate input on my articles, comments and criticism. You can reach me
- on EFNET's #cracking4newbies, or at my email at gij <at sign> iname.com. I
- can't promise a reply, but I do read all my mail.
-
- GREETZ
- ------
-
- All The Guys (and girl) On #c4n: Never has so much newbiness been in the hands
- of so few people.
-
- Gij.
- yep.
-
-