Simple XOR Decoding With Ghidra
References
- Talk by Dr. Josh Stroschein, “Using Ghidra to Statically XOR Obfuscated Shellcode” (https://www.youtube.com/watch?v=DgaPPixn9k4)
- The author’s blog: https://0xevilc0de.com/
- “Repository that provides sample codes mimicking real malware” (https://github.com/jstrosch/learning-malware-analysis)
Steps
- Create a new project
- Import the shellcode (shellcode.bin, provided by the author)
- Set the language to x86 (Visual Studio)
- The analysis result is displayed
- Go to the first offset (00000000)
- (at this point, the code looks like a series of random bytes)
- Press ‘D’ or choose ‘Disassemble’
- (now the unpacking code shows up)
Code
- The first jump (at 00000000) goes to 00000016, which holds:
CALL FUN_00000002
- Looking at the code from 00000002, there is a function that works as an XOR decoder:
- 0x186 and 0x97 appear to be the size of the packed code and the encoding key, respectively
- From the code above:
00000009 XOR byte ptr DS:[EAX + ECX*1],0x97
0000000e TEST ECX,ECX
00000010 JNZ LAB_00000008
- EAX is the location of the encoded shellcode. This value was popped from the stack at
00000002 POP EAX
after being pushed as the return address when the function was called at
00000016 CALL FUN_00000002
So the value must point to offset 0000001b (the next instruction).
- ECX is a counter. If ECX is non-zero, execution jumps to 00000008 (it loops back and continues the process). Otherwise (when it has reached the end), it calls EAX (again, EAX holds the base address of the shellcode).
- At this point, all of the encoded code has been XOR decoded.
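The decoder loop described above can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: the listing in these notes does not show the instruction at LAB_00000008, so the code assumes it decrements ECX before each XOR. The function name `run_decoder_stub` and the toy buffer are made up for illustration and are not part of the sample.

```python
KEY = 0x97      # XOR key taken from the notes above
SIZE = 0x186    # number of encoded bytes, also from the notes


def run_decoder_stub(memory: bytearray, eax: int, ecx: int = SIZE, key: int = KEY) -> None:
    """Mimic the loop at 00000008..00000010, decoding `memory` in place.

    `eax` is the base offset of the encoded bytes and `ecx` the counter.
    `ecx` must be at least 1, as it is in the sample.
    """
    while True:
        ecx -= 1                      # assumption: DEC ECX at LAB_00000008
        memory[eax + ecx] ^= key      # XOR byte ptr [EAX + ECX*1], 0x97
        if ecx == 0:                  # TEST ECX,ECX / JNZ LAB_00000008
            break
    # Falling out of the loop corresponds to the CALL EAX in the stub.


# Toy demonstration on made-up bytes (not the real sample).
plain = bytes(range(16))
encoded = bytearray(b ^ KEY for b in plain)
run_decoder_stub(encoded, eax=0, ecx=len(encoded))
assert bytes(encoded) == plain
```

Under this assumption the loop walks backwards from the last encoded byte down to the byte at EAX itself, so exactly `ecx` bytes are decoded.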
Insights?
- Before decoding, the bytes below 0000001b look like nonsensical operations, or just random data.
- Also, within the encoded data section, 0x97 appears a lot. This suggests that the NULL bytes in the original code are encoded to the same value as the XOR key (since
0 XOR key == key
)
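The observation about NULL bytes can be turned into a quick heuristic for guessing a single-byte key: count byte frequencies and look at the most common value. The sketch below uses made-up bytes rather than the real sample, and the helper name `most_common_byte` is hypothetical. The guess only works when NULL is the most frequent byte in the original code.

```python
from collections import Counter


def most_common_byte(data: bytes) -> tuple:
    """Return (byte value, count) for the most frequent byte in `data`."""
    value, count = Counter(data).most_common(1)[0]
    return value, count


# Made-up plaintext with many NULL bytes, standing in for real shellcode.
plain = bytes([0x68, 0x00, 0x00, 0x00, 0x00, 0xE8, 0x10, 0x00, 0x00, 0x00])
encoded = bytes(b ^ 0x97 for b in plain)

candidate_key, hits = most_common_byte(encoded)
assert candidate_key == 0x97   # every NULL byte now shows up as the key
```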
How can we obtain the decoded code?
Two options
- Run the code in a debugger with a breakpoint set.
- The breakpoint can be set at 0000001b (where the shellcode begins). HOWEVER, a software breakpoint there can be overwritten, because that address lies inside the region the decoder rewrites.
- A better position for the breakpoint is 00000012, just before the shellcode is called, when the code is already decoded and can be extracted.
- A static way
- The shellcode is located between
- the start of the shellcode (offset 0000001b)
- and offset 000001a1 (0000001b + loop size (0x186))
- Select the encoded bytes, then right-click and choose “Clear Code Bytes”
- Click the “play” icon in the toolbar (Display Script Manager)
- When the “Script Manager” window opens, filter with “xor”; “XorMemoryScript.java” appears
- Click the “play” icon again to run the script
- Ghidra then asks for the XorValue; enter 0x97
- The decoded code is then displayed on the screen
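The same static decoding can also be sketched outside Ghidra. The Python below is not XorMemoryScript, just a minimal equivalent using the start offset, size, and key from these notes. It treats 000001a1 as the exclusive end of the region, which is an assumption. The helper name `xor_region` and the output file name are hypothetical.

```python
from pathlib import Path

START = 0x1B     # first encoded byte, per the notes
LENGTH = 0x186   # loop size, per the notes
KEY = 0x97       # XOR key, per the notes


def xor_region(data: bytes, start: int, length: int, key: int) -> bytes:
    """Return a copy of `data` with data[start:start+length] XORed by `key`."""
    end = start + length
    if end > len(data):
        raise ValueError("region extends past the end of the data")
    out = bytearray(data)
    for i in range(start, end):
        out[i] ^= key
    return bytes(out)


assert START + LENGTH == 0x1A1   # matches the end offset quoted in the notes

if __name__ == "__main__":
    src = Path("shellcode.bin")   # the sample named in the steps
    if src.exists():
        decoded = xor_region(src.read_bytes(), START, LENGTH, KEY)
        # Hypothetical output name; the decoder stub is left untouched.
        Path("shellcode.decoded.bin").write_bytes(decoded)
```

Because XOR is its own inverse, running `xor_region` a second time with the same arguments restores the encoded bytes.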