[Repost] A First Look at Decoding an Encrypted ELF File (great approach)

hackme: Deconstructing an ELF File
Reposted from: http://www.manoharvanga.com/hackme/
The article's approach is excellent; see the original for more.
A friend recently asked me to find the password for a little hard-to-hack program he had written and I agreed to it. The short journey of a few hours that led me to its password was extremely interesting, and this article describes the process as well as some of the new techniques I learnt along the way.
A few minutes after accepting his challenge, I received a binary called "hackme" in an e-mail and I got started! Those interested in giving it a shot can download the binary file and get back to this article later. Do let me know if you find anything interesting along the way that I did not think of or missed! Comments can be sent to manohar dot vanga at gmail dot com with the title [hackme]. UPDATE: You can post comments to the Hacker News discussion.
Test Run

I ran the binary and tried a couple of random passwords. As expected, they failed, providing me with an extremely helpful message:

$ ./hackme
Password, please? password
Oops..

Interestingly, running the binary under GDB gave a specially crafted welcome message:
$ gdb ./hackme
Reading symbols from /tmp/hack/hackme...(no debugging symbols found)...done.
(gdb) r
Starting program: ./hackme
Fuck off! no debuggers!

Program exited with code 0364.
(gdb)

Same with strace:
$ strace ./hackme
execve("./hackme", ["./hackme"], [/* 41 vars */]) = 0
brk(0)                                  = 0x9016000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
... snip ...
ptrace(PTRACE_TRACEME, 0, 0, 0)         = -1 EPERM (Operation not permitted)
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb783e000
write(1, "Fuck off! no debuggers!\n", 24Fuck off! no debuggers!
) = 24
_exit(2543604)                          = ?

The Obvious Stuff

Even though the probability of a password being visible in plain sight was near zero, I decided to give it a shot anyway.
First, I checked if the binary was stripped or not:
$ file hackme
hackme: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.27, stripped

It was stripped. Dead end there. GDB would not be much help with a stripped binary in terms of deconstructing the logic. I decided to try looking for a literal string comparison with the password within the binary just in case:
$ strings hackme
/lib/ld-linux.so.2
libdl.so.2
__gmon_start__
_Jv_RegisterClasses
dlopen
dlsym
libc.so.6
_IO_stdin_used
__libc_start_main
random
GLIBC_2.1
GLIBC_2.0
PTRh
QVhE
[^Ph
[^_]
8%':!06!
%!'460
&64; 3
%'<; !3
UWVS
[^_]
Fuck off! no debuggers!
Password, please?
Congratulations!
Oops..

Just in case, I try all the strings as passwords, but they all fail. Not too surprising. The output does, however, give me the message that gets printed on success: "Congratulations!". It also seems to contain the string "libc.so.6". Fishy. A quick ltrace explains what the binary is doing:
$ ltrace ./hackme
__libc_start_main(0x8048645, 1, 0xbfb48a04, 0x80486b0, 0x8048720
dlopen("/lib/libc.so.6", 2)                      = 0xb7757ae0
dlsym(0xb7757ae0, "ptrace")                      = 0x00eddf40
dlsym(0xb7757ae0, "scanf")                       = 0x00e621a0
dlsym(0xb7757ae0, "printf")                      = 0x00e5baa0
Fuck off! no debuggers!
+++ exited (status 244) +++

While this gives us the same welcoming message, we can see what is happening here. The libc shared library is being opened dynamically and the addresses of ptrace, scanf and printf are being retrieved with dlsym! Sneaky trick!
More irritatingly, however, the strings output shows that the binary is using the random() function. Since the program is reproducible (that is, the password works every time), the call to random() is probably not being seeded. We will worry about this later.
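Why an unseeded random() is harmless here can be sketched in a few lines. POSIX specifies that random() without a prior srandom() call behaves as if it had been seeded with 1, so the sequence of offsets is the same on every run with a given libc. The helper name next_offsets is hypothetical and only serves the illustration:

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>

/* Hypothetical helper: fill out[0..n-1] with random() % 19, the same
 * reduction the binary applies to pick a password offset. */
void next_offsets(int *out, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = (int)(random() % 19);
}
```

Calling next_offsets() first thing in a program, then calling srandom(1) and next_offsets() again, should yield two identical arrays. The actual values depend on the libc's implementation of random().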
The strace and ltrace output also explains how the binary is detecting the debugging environment. Calling ptrace(PTRACE_TRACEME) from a process that is already being traced (e.g. under strace, ltrace or gdb) fails and returns -1.
Getting over this debugger hurdle, however, is quite easy using the LD_PRELOAD environment variable. The LD_PRELOAD environment variable can contain a list of custom shared objects which are then loaded prior to all others when running an executable. It is an easy route to blocking a process from calling unwanted functions. I quickly wrote a dummy function in a new file:
/* fake ptrace() */
#include <stdio.h>

long ptrace(int x, int y, int z)
{
    printf("B-)\n");
    return 0;
}
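The check itself probably looks something like the following. This is only a guess at the shape of the code, not the original source: the function name debugger_attached is made up, and it calls ptrace directly rather than through dlsym as the binary does. It is Linux-specific.

```c
#include <stddef.h>
#include <sys/ptrace.h>

/* Hypothetical helper: returns 1 if the PTRACE_TRACEME request fails
 * (typically because a tracer such as gdb or strace is already attached),
 * 0 otherwise. Side effect: on success the process marks itself as
 * traceable by its parent. */
int debugger_attached(void)
{
    return ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1;
}
```

A program would call this once at startup and bail out when it returns 1, which matches the ptrace(PTRACE_TRACEME, 0, 0, 0) = -1 EPERM line in the strace output.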

...and compiled it:
gcc -shared -fPIC -o fake.so fake.c

Now running under strace with the LD_PRELOAD variable set to our fake shared object gives us the full scoop!
$ strace -E LD_PRELOAD=./fake.so ./hackme
execve("./hackme", ["./hackme"], [/* 24 vars */]) = 0
brk(0)                                  = 0x9727000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb78a6000
open("./fake", O_RDONLY)                = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\240\3\0\0004\0\0\0"..., 512) = 512
... snip ...
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb78a1000
write(1, "Password, please? ", 18Password, please? )      = 18
read(0, password
"password\n", 1024)             = 9
write(1, "Oops..\n", 7Oops..
)                 = 7
exit_group(7)                           = ?

Looks like the password buffer is 1024 bytes long. I could try overflowing that but coupled with stack randomization (which can be turned off if I remember correctly), it would be hard to do on a lazy Friday. More importantly, my goal is not to break the program but to get the password!

It is starting to seem like my only option left is to sit and reverse engineer the code for this binary; something I did not want to get into on a Friday afternoon. A geek challenge, however, trumps laziness any day of the week, so I started with the task of disassembling the binary.

Disassembly

I start with an objdump output (go ahead and open it in a new tab if you want to follow along):
$ objdump -D ./hackme > out.asm

The assembly is a mess as expected of a stripped binary. I quickly need to find the part that pertains to the password encryption logic. From the test run, I know that the logic is somewhere between where it prints the "Password, please?" message and the "Oops.." message. I will start by locating these strings in the assembly output and backtracing to where they are being used. The "Pa" of the "Password, please?" string is 50 followed by 61 in hexadecimal. A quick search locates the string in the assembly output:
$ grep "50 61" out.asm
 8048798: 00 50 61                add    %dl,0x61(%eax)

The address of the string therefore is 0x8048799 (since the first byte needs to be skipped). Searching the file for this address leads me to this code:
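The same address arithmetic can be written down as a small sketch: find the byte pattern inside a chunk of the file and add its offset to the address at which that chunk is loaded. The function name locate and its parameters are invented for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the virtual address at which `needle` starts,
 * given that bytes[0] lives at address `base`. Returns 0 if not found. */
unsigned long locate(const unsigned char *bytes, size_t len,
                     unsigned long base, const char *needle)
{
    size_t n = strlen(needle);

    for (size_t i = 0; n > 0 && i + n <= len; i++)
        if (memcmp(bytes + i, needle, n) == 0)
            return base + i;
    return 0;
}
```

For the grep hit above, the bytes 00 50 61 sit at 0x8048798, "Pa" starts one byte in, and the result is 0x8048799.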
 804859d: 68 99 87 04 08          push   $0x8048799
 80485a2: ff 15 94 99 04 08       call   *0x8049994

Great! It is pushing the address of the string (as a literal) on to the stack and calling an address. I assume that this is the variable storing the address returned by dlsym for printf.
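Note how the address shows up in the instruction bytes: 68 is the opcode for "push imm32" and the four bytes after it hold the address in little-endian order, which is why 0x8048799 appears as 99 87 04 08. A minimal sketch of that decoding, with a made-up function name:

```c
#include <stdint.h>

/* Hypothetical helper: decode the 4-byte little-endian immediate that
 * follows a one-byte opcode such as 0x68 (push imm32). */
uint32_t imm32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}
```

This also tells you what to search for when hunting for references to a string: to find uses of 0x80487c1, look for the byte sequence c1 87 04 08.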
Now I need to find the "Oops.." message. I repeat the same procedure to get to the string and find the code:
 8048633: 68 c1 87 04 08          push   $0x80487c1
 8048638: ff d0                   call   *%eax

While I am at it, I pick out the "Congratulations!" message as well and find the location. Finally the code looks much more comprehensible:
# The "Password, please?" message is being printed here
 804859d: 68 99 87 04 08          push   $0x8048799
 80485a2: ff 15 94 99 04 08       call   *0x8049994
 80485a8: 8d 45 84                lea    -0x7c(%ebp),%eax
... snip ...
 8048626: 83 ec 0c                sub    $0xc,%esp
# The "Congratulations!" message is being printed here
 8048629: 68 af 87 04 08          push   $0x80487af
 804862e: eb 08                   jmp    8048638
 8048630: 83 ec 0c                sub    $0xc,%esp
# The "Oops.." message is being printed here
 8048633: 68 c1 87 04 08          push   $0x80487c1
 8048638: ff d0                   call   *%eax

I quickly annotate the assembly (bottom posting of comments) so I remember what I figured out:
 804859d: 68 99 87 04 08          push   $0x8048799
 80485a2: ff 15 94 99 04 08       call   *0x8049994
# The "Password, please?" message is being printed here

 80485a8: 8d 45 84                lea    -0x7c(%ebp),%eax
# This is probably the address of the password buffer.

 80485ab: 5b                      pop    %ebx
 80485ac: 5e                      pop    %esi
 80485ad: 50                      push   %eax
 80485ae: 68 ac 87 04 08          push   $0x80487ac
 80485b3: ff 15 90 99 04 08       call   *0x8049990
 80485b9: 83 c4 10                add    $0x10,%esp
# Push the password buffer and the string "%s" onto the stack and call scanf

 80485bc: 31 c0                   xor    %eax,%eax
# Clear EAX.

 80485be: eb 01                   jmp    80485c1
 80485c0: 40                      inc    %eax
 80485c1: 80 7c 05 84 00          cmpb   $0x0,-0x7c(%ebp,%eax,1)
 80485c6: 75 f8                   jne    80485c0
# Find the string length of the password we entered. Return value in EAX.

 80485c8: 31 db                   xor    %ebx,%ebx
 80485ca: 83 f8 13                cmp    $0x13,%eax
 80485cd: 0f 94 c3                sete   %bl
# Hmm! If strlen(buf) == 0x13 (19), BL is set to 1! We have our first hint!

 80485d0: be 0a 00 00 00          mov    $0xa,%esi
# Move integer 10 into ESI. This is the start of a loop that runs 10 times.

 80485d5: e8 b6 fd ff ff          call   8048390
# Call random(). Return value in EAX

 80485da: b9 13 00 00 00          mov    $0x13,%ecx
 80485df: 99                      cltd
 80485e0: f7 f9                   idiv   %ecx
# Divide the random number in EAX by 19. EAX is quotient, EDX is remainder.

 80485e2: 31 c0                   xor    %eax,%eax
# Throw away quotient.

 80485e4: 8a 8a 9c 86 04 08       mov    0x804869c(%edx),%cl
# Hmm. That address looks like a lookup table of some sort.
# The operation is basically doing "CL = table[remainder]".
# Since the remainder can't be more than 18, I dump the first 19 bytes of this
# address:
#   0x6a, 0xfb, 0x4c, 0x8d, 0x58, 0x0f, 0xd4, 0xe8, 0x94, 0x98,
#   0xee, 0x6b, 0x18, 0x30, 0xe0, 0x55, 0xc5, 0x28, 0x0e

 80485ea: 0f b6 7c 15 84          movzbl -0x7c(%ebp,%edx,1),%edi
# This basically does EDI = password[remainder]

 80485ef: 42                      inc    %edx
 80485f0: 89 95 74 ff ff ff       mov    %edx,-0x8c(%ebp)
# Increment the remainder and store it in another variable

 80485f6: 31 d2                   xor    %edx,%edx
 80485f8: eb 0c                   jmp    8048606
 80485fa: 69 c0 8d 78 01 6d       imul   $0x6d01788d,%eax,%eax
 8048600: 42                      inc    %edx
 8048601: 05 39 30 00 00          add    $0x3039,%eax
 8048606: 3b 95 74 ff ff ff       cmp    -0x8c(%ebp),%edx
 804860c: 7c ec                   jl     80485fa
# This is a weird loop. It seems to be a pseudorandom generator.
# The loop runs while a counter is less than the incremented remainder above.
# Inside, it's doing the following (remember eax was cleared above to 0):
#   eax = eax * 0x6d01788d  // This is a prime number according to Wolfram Alpha
#   eax += 0x3039           // 12345 in decimal
# That is an unseeded (or seeded to 0) pseudorandom generator! Nice but
# pointless as it is unseeded.

 804860e: 31 f8                   xor    %edi,%eax
# XOR the pseudorandom value above with password[remainder] as stored above

 8048610: 38 c1                   cmp    %al,%cl
# Compare the lower byte of the XOR'ed result with the lookup table entry stored in CL

 8048612: b8 00 00 00 00          mov    $0x0,%eax
 8048617: 0f 45 d8                cmovne %eax,%ebx
# If the lower byte of the XOR is not equal to the lookup table entry set EBX=0

 804861a: 4e                      dec    %esi
 804861b: 75 b8                   jne    80485d5
# Decrement the main loop counter (the one that runs 10 times) and jump
# if more iterations are left

 804861d: 85 db                   test   %ebx,%ebx
 804861f: a1 94 99 04 08          mov    0x8049994,%eax
 8048624: 74 0a                   je     8048630
# At last! Jump to the failure message (past the congratulations) if EBX is 0!
# EBX should be non-zero in order to print the congratulations message!

 8048626: 83 ec 0c                sub    $0xc,%esp
# The "Congratulations!" message is being printed here
 8048629: 68 af 87 04 08          push   $0x80487af
 804862e: eb 08                   jmp    8048638
 8048630: 83 ec 0c                sub    $0xc,%esp
# The "Oops.." message is being printed here
 8048633: 68 c1 87 04 08          push   $0x80487c1
 8048638: ff d0                   call   *%eax
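Read forwards, the annotated assembly boils down to roughly the following check. This is my reading of the listing, not the original source. The names check_password, table and offsets are invented, and instead of calling random() the sketch hard-codes the ten offsets seen in my run (the printed rem values minus one), so that it does not depend on the libc in use. It also returns early on a wrong length, whereas the binary keeps looping with EBX already 0.

```c
#include <stdint.h>
#include <string.h>

/* Lookup table found at 0x804869c (19 bytes). */
static const unsigned char table[19] = {
    0x6a, 0xfb, 0x4c, 0x8d, 0x58, 0x0f, 0xd4, 0xe8, 0x94, 0x98,
    0xee, 0x6b, 0x18, 0x30, 0xe0, 0x55, 0xc5, 0x28, 0x0e
};

/* Assumption: the values of random() % 19 observed in the article's run. */
static const int offsets[10] = {2, 15, 3, 3, 10, 8, 10, 11, 2, 7};

/* Hypothetical reconstruction: returns 1 if pw would be accepted, else 0. */
int check_password(const char *pw)
{
    if (strlen(pw) != 19)            /* cmp $0x13,%eax ; sete %bl */
        return 0;

    int ok = 1;
    for (int i = 0; i < 10; i++) {   /* main loop, ESI counts down from 10 */
        int rem = offsets[i];
        uint32_t magic = 0;

        /* Inner generator loop, executed rem + 1 times starting from 0. */
        for (int j = 0; j <= rem; j++)
            magic = magic * 0x6d01788du + 0x3039u;

        /* Low byte of (magic ^ password[rem]) must equal table[rem]. */
        if (((magic ^ (unsigned char)pw[rem]) & 0xff) != table[rem])
            ok = 0;                  /* cmovne %eax,%ebx */
    }
    return ok;
}
```

Only the characters at the sampled offsets are ever compared, so any 19-character string that agrees with the real password at those positions passes.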

Phew! That was not as bad as I expected! Converting that logic to C code that dumps out the password took me a while with testing and the final result looked like the following:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int i, j, rem;
    unsigned int magic; /* unsigned, so the multiply wraps like the 32-bit imul */
    char out[20];
    /* Lookup table dumped from 0x804869c */
    unsigned char arr[] = {0x6a, 0xfb, 0x4c, 0x8d, 0x58, 0x0f, 0xd4, 0xe8, 0x94, 0x98,
                           0xee, 0x6b, 0x18, 0x30, 0xe0, 0x55, 0xc5, 0x28, 0x0e};

    /* Characters the binary never checks stay as 'x' */
    for (i = 0; i < 19; i++)
        out[i] = 'x';
    out[i] = '\0';

    for (i = 10; i > 0; i--) {
        /* Unseeded on purpose, so the offsets match the ones in the binary */
        rem = random() % 19;
        printf("rem = %d\n", rem + 1); /* the incremented remainder */

        /* Replay the generator rem + 1 times, starting from 0 */
        magic = 0;
        for (j = 0; j <= rem; j++)
            magic = magic * 1828812941u + 12345u;

        /* table[rem] == (magic ^ password[rem]) & 0xff, so solve for password[rem] */
        out[rem] = (magic & 0xff) ^ arr[rem];
    }

    printf("Password: %s\n", out);
    return 0;
}

Now for the fun; the output:
$ ./decompiled
rem = 3
rem = 16
rem = 4
rem = 4
rem = 11
rem = 9
rem = 11
rem = 12
rem = 3
rem = 8
Password: xxsaxxxpexYoxxxexxx

The loop in the binary runs only 10 times and it has repeating checks for some offsets of the password. The only characters of the password that matter are the ones NOT marked with an 'x' in the output (I make the program set these as it runs).
Now the sweet part! I run the password against the original program:
$ ./hackme
Password, please? xxsaxxxpexYoxxxexxx
Congratulations!

HA! That was fun!
Conclusions

So what did I learn?

Know Thy Tools

I knew how to get around many of the hurdles I faced from past knowledge and experiences with the various tools I used. The better you know your tools, the more you can think about the real issues at hand (in this case, reversing the program logic).

Test the Waters

I knew I would not find an easy way to reverse this program but I tried all the easy routes anyway. Even though it did not provide me with much information, I gained confidence after having eliminated some of the options. It cleared the way forward.

Know Thy Assembly

The machine instructions were a little hard to decompile and I found myself referencing the Intel manuals from time to time to figure out what was going on. More than the assembly itself though, I really recommend learning the GNU Assembler syntax. I was familiar with the Intel syntax (e.g. NASM) but not completely proficient with GAS syntax (AT&T syntax). I found this article and this article very useful in quickly brushing this up.

Some thoughts on the program itself follow:
  • Checking only a few parts of the password was ineffective, although it would not have made life much more difficult if every character was checked. (Note: The original author told me he put the 10 in the main loop for debugging purposes and forgot to change it)
  • The random number was a nice way to scare me a little but in the end, it had to be unseeded to be deterministic and consequently, not very effective. If I had a different version of libc with a different random(), the program would probably have failed with the original password.
  • The actual password was "SesameOpenYourself!"! I came up with a few nonsensical variations that worked as well, e.g. "NasaJeeperYouShelby".
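Since the generator always starts from 0 and the table is known, every character can be recovered without involving random() at all: just walk all 19 offsets instead of the 10 sampled ones. A minimal sketch, assuming the table dumped from 0x804869c is complete; the function name recover_password is made up.

```c
#include <stdint.h>

/* Lookup table found at 0x804869c (19 bytes). */
static const unsigned char table[19] = {
    0x6a, 0xfb, 0x4c, 0x8d, 0x58, 0x0f, 0xd4, 0xe8, 0x94, 0x98,
    0xee, 0x6b, 0x18, 0x30, 0xe0, 0x55, 0xc5, 0x28, 0x0e
};

/* Hypothetical helper: writes the 19 password characters plus a
 * terminating NUL into out. */
void recover_password(char out[20])
{
    for (int i = 0; i < 19; i++) {
        uint32_t magic = 0;

        /* Same generator as the binary, run i + 1 times from 0. */
        for (int j = 0; j <= i; j++)
            magic = magic * 0x6d01788du + 0x3039u;

        /* password[i] = low byte of magic XOR table[i] */
        out[i] = (char)((magic & 0xff) ^ table[i]);
    }
    out[19] = '\0';
}
```

Printing the buffer after calling recover_password() gives the full password rather than one padded with 'x' characters.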
All in all, a good Friday afternoon! Again, comments can be mailed to manohar dot vanga at gmail dot com with a title of [hackme].
Download: all files
