Drawing the Stack-CFANZ编程社区

Intro

I’m new to everything demonstrated below, so be forewarned of potential errors. This technique of drawing the stack was something a professor hammered into the class this year. Let’s dive in!

Author Assigned Level: Newbie

Community Assigned Level:

Required Skills

Since C and Assembly are used, C and Assembly would be useful to know before reading. The examples, however, should be simple enough for any beginner to read and understand from context.

C

Nothing fancy here:

#include <stdio.h>

int callMe(int num1, int;

int main(){
int a = 0;
int b = 2;
int c = callMe(a, b);
printf(“%d\n”, c);
return 0;
}

int callMe(int num1, int{
int ans = num1+num2;
return ans;
}

Assembly

; main()
   <+0>: push   rbp
   <+1>: mov    rbp,rsp
   <+4>: sub    rsp,0x10
   <+8>: mov    DWORD PTR [rbp-0xc],0x0
   <+15>:   mov    DWORD PTR [rbp-0x8],0x2
   <+22>:   mov    edx,DWORD PTR [rbp-0x8]
   <+25>:   mov    eax,DWORD PTR [rbp-0xc]
   <+28>:   mov    esi,edx
   <+30>:   mov    edi,eax
   <+32>:   call   0x68f <callMe>
   <+37>:   mov    DWORD PTR [rbp-0x4],eax
   <+40>:   mov    eax,DWORD PTR [rbp-0x4]
   <+43>:   mov    esi,eax
   <+45>:   lea    rdi,[rip+0xb6]        # 0x734
   <+52>:   mov    eax,0x0
   <+57>:   call   0x530 <printf@plt>
   <+62>:   mov    eax,0x0
   <+67>:   leave  
   <+68>:   ret

; callMe()
<+0>: push rbp
<+1>: mov rbp,rsp
<+4>: mov DWORD PTR [rbp-0x14],edi
<+7>: mov DWORD PTR [rbp-0x18],esi
<+10>: mov edx,DWORD PTR [rbp-0x14]
<+13>: mov eax,DWORD PTR [rbp-0x18]
<+16>: add eax,edx
<+18>: mov DWORD PTR [rbp-0x4],eax
<+21>: mov eax,DWORD PTR [rbp-0x4]
<+24>: pop rbp
<+25>: ret

In short, main() passes a and b to callMe() through edi and esi, which returns a + b in eax.

The stack

Now that we can see the assembly, we can draw out exactly how the stack looks and have a better understanding of what our code is doing. A practical use for drawing the stack is for debugging, but it’s also a fun exercise (which is what spawned this topic)!

The stack grows from high addresses to low addresses, top to bottom. Typically, we reason about the stack such that the low addresses are on top; in this sense, we will be using an “inverted stack” in our depictions.

main(): lines <+0> and <+1>

<+0>: push   rbp
   <+1>: mov    rbp,rsp

The very first stack modification is push rbp, from main(). rbp is a 64 bit register, so it takes up 4 rows on the stack (each row is word-sized, which is 2 bytes):

< high addresses >
|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP and RSP (main)
|------|
|      |
< low addresses >

rbp is used as a helper register for performing stack functions. It is used as a “base” when traversing up or down through the stack. Using rbp in this manner is also a way to setup and use local variables, which we will talk about later on.

A common analogy for this technique is “dropping an anchor”. rsp changes constantly, as the stack grows and shrinks; copying its current value to rbp “anchors” it for however you want to use it, while still allowing you to push and pop.

Moving on!

main(): line <+4>

<+4>: sub    rsp,0x10

This instruction makes space on the stack to store local variables, in our case int a and int b, both 32-bits. Now I know I said we shouldn’t mess with rsp, but moving the stack pointer is how you can create local variables.

Since a and b are both integers, we need to allocate a minimum of 64 bits on the stack. On this line, we can see that 16 bytes have been allocated, by subtracting 0x10 from rsp's current value. Here is what our stack looks like after this instruction:

|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP (main)
|------|
|      | -0x2
|------|
|      | -0x4
|------|
|      | -0x6
|------|
|      | -0x8
|------|
|      | -0xa
|------|
|      | -0xc
|------|
|      | -0xe
|------|
|      | <-- RSP (main)
|------|
|      |

Now that we have this space allocated, rsp is once again “fixed” to it’s current location; we can use rbp to access the spaces allocated by rsp.

As we will see, this is done by subtracting some number of bytes, 0xn, such that [rsp-0xn] points to the last byte. How far up to read is determined by another factor, which we will see below.

main(): lines <+8> - <+25>

<+8>: mov    DWORD PTR [rbp-0xc],0x0
   <+15>:   mov    DWORD PTR [rbp-0x8],0x2
   <+22>:   mov    edx,DWORD PTR [rbp-0x8]
   <+25>:   mov    eax,DWORD PTR [rbp-0xc]

Within the first two lines, we are initializing our local variables, a and b, to 0 and 2, respectively.

When accessing the stack, the assembler cannot infer the size of the object we want to retrieve (or set); so we have to specify it ourselves. DWORD PTR [rbp-0xc] tells the assembler “hey, we want you to look at the value at [rbp-0xc] and read 4 bytes (2 words, specified by DWORD)”.

When we are accessing the stack, I mentioned that [rsp-0xn], for some hex n, points to the “last byte”: this is specifically the last byte of the size you specify. So for DWORD PTR [rbp-0xc], the assembler goes to [rbp-0xc] on the stack, and then reads “up” 4 bytes.

With that said, here’s how our stack looks now that some values have been initialized:

|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP (main)
|------|
|      | -0x2
|------|
|      | -0x4
|------|
| 0000 | -0x6  \
|------|        | int b = 2;
| 0002 | -0x8  /
|------|
| 0000 | -0xa  \
|------|        | int a = 0;
| 0000 | -0xc  /
|------|
|      | -0xe
|------|
|      | <-- RSP (main)
|------|
|      |

Note: for the purposes of this tutorial, we will not take into account endianness

As you can clearly see, we have initialized some local variables on the stack without modifying rsp or rbp.

The other two lines do not modify the stack, but simply retrieve values from it. This notation is of the same manner that adding things to the stack is, except the memory operand is the 2nd operand of the instruction (and not the 1st).

The important thing to note is that we have copied a into eax and b into edx.

Onward!

main(): lines <+28> - <+30>

<+28>:   mov    esi,edx
   <+30>:   mov    edi,eax
   <+32>:   call   0x68f <callMe>

These first two lines are nothing special: we are moving a into edi and b into esi. These two registers are used to pass the arguments to the callMe() function, which is what happens next.

When the call instruction is executed, it pushes the address of the next instruction onto the stack (rip); it then copies the address of the called procedure into rip. This modifies our stack:

|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP (main)
|------|
|      | -0x2
|------|
|      | -0x4
|------|
| 0000 | -0x6  \
|------|        | int b = 2;
| 0002 | -0x8  /
|------|
| **** |          * int a was overwritten,
|------|            as it is never used again
| **** |
|------|
| **** |
|------|
| **** | <-- RIP (main)
|------|
|      |

At this point, we are now in callMe().

callMe(): lines <+0> and <+1>

<+0>: push   rbp
   <+1>: mov    rbp,rsp

This should look familiar…

We are once again dropping an anchor, this time saving “old” rbp's value (since we use it again after this function returns). Here’s how our stack looks now:

|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
|      | 
|------|
|      | 
|------|
| 0000 | 
|------|        
| 0002 | 
|------|
| **** |         
|------|  
| **** |
|------|
| **** |
|------|
| **** | <-- RIP (main)
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP (callMe)
|------|
|      |

One important thing to notice here is that we didn’t modify rsp. This is because we want the stack to “look the same” when we enter and leave a function. This will be covered in a bit more detail later on.

callMe(): lines <+4> - <+13>

<+4>: mov    DWORD PTR [rbp-0x14],edi
   <+7>: mov    DWORD PTR [rbp-0x18],esi
   <+10>:   mov    edx,DWORD PTR [rbp-0x14]
   <+13>:   mov    eax,DWORD PTR [rbp-0x18]

These instructions should also look familiar, as they were executed earlier almost verbatim. The important difference is that in the first two lines here, we are initializing int num1 and int num2 with the values inside edi and esi, respectively.

Here is how our stack looks after these instructions:

|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
|      | 
|------|
|      | 
|------|
| 0000 | 
|------|        
| 0002 | 
|------|
| **** |         
|------|  
| **** |
|------|
| **** |
|------|
| **** | <-- RIP (main)
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP, RSP (callMe)
|------|
|      | -0x2
|------|
|      | -0x4
|------|
|      | -0x6     
|------|
|      | -0x8
|------|
|      | -0xA     
|------|
|      | -0xC
|------|
|      | -0xE     
|------|
|      | -0x10
|------|
| 0000 | -0x12    
|------|
| 0000 | -0x14
|------|
| 0000 | -0x16    
|------|
| 0002 | -0x18
|------|
|      |

You will notice that assembly provided us with way more stack space than necessary; I am not sure why that much padding is there, but the best answer I can find is here 14. Please join in on the discussion in the comments if you can explain this further.

Now to the guts of this function!

callMe(): lines <+16> - <+21>

<+16>:   add    eax,edx
   <+18>:   mov    DWORD PTR [rbp-0x4],eax
   <+21>:   mov    eax,DWORD PTR [rbp-0x4]

These instructions are pretty straight forward: add b and a and store that value on the stack:

|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
|      | 
|------|
|      | 
|------|
| 0000 | 
|------|        
| 0002 | 
|------|
| **** |         
|------|  
| **** |
|------|
| **** |
|------|
| **** | <-- RIP (main)
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP, RSP (callMe)
|------|
| 0000 | -0x2
|------|
| 0002 | -0x4
|------|
|      | -0x6     
|------|
|      | -0x8
|------|
|      | -0xA     
|------|
|      | -0xC
|------|
|      | -0xE     
|------|
|      | -0x10
|------|
| 0000 | -0x12    
|------|
| 0000 | -0x14
|------|
| 0000 | -0x16    
|------|
| 0002 | -0x18
|------|
|      |

We then pull that value back into eax and move on.

callMe(): lines <+24> and <+25>

<+24>:   pop    rbp
   <+25>:   ret

This first line is what I mentioned earlier: we want the stack to look the same when we leave a function as it did when we entered. Because we never modified rsp within callMe(), it is still pointing to rbp's old value from when we used it in main().

The other “cool” thing about not shifting rsp on this go around has to do with ret. When ret is executed, it pops the top of the stack into rip. After popping rbp, rsp points to main()'s return address. This means that those variables we just stored on the stack within callMe() will get “wiped out”.

Now I put “wiped out” in quotes because what’s actually happening is that rsp changes and we lose access to those locals. They don’t get erased though (until rsp overwrites them), which raises some fun vulnerabilities outside the scope of this tutorial.

As you can see, popping rbp and returning significantly changes our stack:

|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP (main)
|------|
|      | 
|------|
|      | 
|------|
| 0000 | 
|------|        
| 0002 | <-- RSP (main)
|------|
|      |

rbp has been restored to its original (to main()) value, and rsp is pointing to the first byte preceding rip (before it was popped).

Onto the final instructions!

main(): line <+37>

<+37>:  mov    DWORD PTR [rbp-0x4],eax

callMe() passed our answer (2) back to main() inside eax; this instruction simply stores that value on the stack:

|      |
|------|
| **** |
|------|
| **** |
|------|
| **** |
|------|
| **** | <-- RBP (main)
|------|
| 0000 | -0x2
|------|
| 0002 | -0x4
|------|
| 0000 | 
|------|        
| 0002 | <-- RSP (main)
|------|
|      |

BOOM!

At this point, we have saved a+b on the stack and, for the purposes of this tutorial, we are finished. The next set of instructions is setting up to printf() our calculated value, but we have covered all the fun stuff now.

Conclusions

Drawing the stack can help you debug, explore vulnerabilities, or kill some time; it assists in giving you an empirical understanding of exactly what a program is doing.

This example was simple (yet still required a long-winded post), but you can do this for anything you want.

Carry on!