PE header
Context
Information related to the PE header I collected here and there. My favorite resources on the subject, however, are Goppit and Iczelion’s tutorials (both available on tuts4you), but it’s just a personal preference. Pictures are made with Pencil and are freely reusable.
Reaching the Data Directories
In a classical PE file, we find at offset 0x3C the field e_lfanew. The value it contains is the offset to the NtHeaders structure (IMAGE_NT_HEADERS in the MSDN), where we find the famous PE\x00\x00 signature. At offset PE+0x78 starts the Data Directories (Optional Header Data Directories in the MSDN), which is an array of IMAGE_DATA_DIRECTORY structures:
1
2
3
4
typedef struct _IMAGE_DATA_DIRECTORY {
DWORD VirtualAddress;
DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
Note that VirtualAddress is in fact a RVA.
The first entry allows getting the RVA and size of the export table (Export Directory Table in the MSDN), while the second allows getting the RVA and size of the import table (Import Directory Table in the MSDN). In short:
- PE+0x78 = export table RVA
- PE+0x7C = export table Size
- PE+0x80 = import table RVA
- PE+0x84 = import table Size
Export table
The export table lists the functions a given PE file (often a DLL) makes available to other PE files. For example, if the dynamic library lib1.dll exports the function funct1, any executable code can call funct1. The figure below depicts an overview of the export table:
- Focusing on the red part of the figure, we see the IMAGE_EXPORT_DIRECTORY structure has 11 fields, but only a part will be of interest here (this structure can be accessed with the RVA at
PE+0x78, see above):- NameRVA is the RVA of the name of the module (here the string “KERNEL32.DLL”).
- NumberOfFunctions is the total count of functions exported by the module.
- NumberOfNames is the total count of named functions exported by the module.
- The next important field is AddressofFunctions: this is the RVA of the ExportAddressTable (green part of the figure). Each entry in the ExportAddressTable is also an RVA, and it can points either to (i) code or (ii) a string:
- if it points to code, this is the code executed when calling a given API. In the figure above, the RVA at ExportAddresstable[2] allows reaching the code of the function AcquireSRWLockExclusive.
- if an entry in the ExportAddressTable points to a string, we’re dealing with a forwarded export. On my laptop, for example, the entry corresponding to the API HeapAlloc in kernel32.dll points to the string “NTDLL.RtlAllocateHeap”. This means any call to HeapAlloc in a given program will be forwarded to the API RtlAllocateHeap exported by ntdll.dll (see also the MSDN). This feature can be abused by malwares.
- The last two important fields are AddressOfNames and AddressOfNamesOrdinal: both are RVA, pointing to the arrays ExportNamePointerTable and ExportOrdinalTable, respectively (blue part of the figure).
- ExportNamePointerTable is an array of RVAs pointing to strings. These strings are the public names we use when calling functions by their name and are located in a table called ExportNameTable.
- ExportOrdinalTable: an array of words, where each word is the index of an RVA inside the ExportAddressTable. Often, we parse the ExportOrdinalTable and the ExportNamePointerTable in parallel.
Let’s illustrate this by finding the API AquireSRWLockExclusive (follow with the figure):
- parse the ExportNamePointerTable and the ExportOrdinalTable in parallel;
- for each entry in the ExportNamePointerTable, do a string comparison with the name of the API we want;
- when the correct entry is found, get its index: on the figure its ExportNamePointerTable[0];
- use this index to retrieve the correct word in the ExportOrdinalTable: ExportOrdinalTable[0] = 0x0002
- use this word as an index in the ExportAddressTable: ExportAddressTable[2] = RVA of AquireSRWLockExclusive.
Import table
Import table of an executable lists external functions a PE needs to run. The figure below depicts an overview of the import table:
Information is split in 3 parts (red, blue and green on the figure):
Looking at the red part, we have the IMAGE_IMPORT_DESCRIPTOR structure that can be accessed from the RVA at
PE+0x80(see above). In a classical PE, there are as many IMAGE_IMPORT_DESCRIPTOR structures as DLL this PE depends on. For example, if a PE uses APIs exported by kernel32.dll, ws2_32.dll, and msvcrt.dll, the RVA atPE+0x80will points to an array of 3 IMAGE_IMPORT_DESCRIPTOR structures.Three fields are of interest here:
- Name1: the RVA of the name of the DLL exporting the required APIs
- OriginalFirstThunk: an array of RVAs (in fact an array of IMAGE_THUNK_DATA, but let’s take a shortcut), each pointing to an IMAGE_IMPORT_BY_NAME structure
- FirstThunk: before imports resolution (when the PE is on disk), it’s a copy of the OriginalFirstThunk; after imports resolution (after a PE has been loaded in memory), it’s an array of virtual addresses pointing to the entrypoints of APIs.
Both OriginalFirstThunk and FirstThunk should be considered before (blue part) and after (green part) imports resolution.
- The blue part shows OriginalFirstThunk and FirstThunk before imports resolution: both are arrays of RVAs pointing to IMAGE_IMPORT_BY_NAME structures:
1 2 3 4
typedef struct _IMAGE_IMPORT_BY_NAME { WORD Hint; BYTE Name[1]; } IMAGE_IMPORT_BY_NAME, *PIMAGE_IMPORT_BY_NAME;
This field Name in this structure is the name of the API to import (let’s ignore Hint for the moment).
- After imports resolution (green part), the RVAs of the FirstThink are replaced by virtual addresses of the APIs.
TODO:
IMAGE_THUNK_DATA, ordinal versus addressofdata, hint, msb
# File and section alignments
todo
RVA to offset
use the sections table and find which section should contain the value of the RVA: if section.start <= RVA < section.end, bingo. then black magic:
1
(RVA - section_virtual_address) + section_raw_address
Offset to RVA
todo
not by hand
todo
EOF

