Master Command Line Arguments in x86-64 Assembly with YASM

Want to master command line arguments in pure x86-64 YASM assembly on Linux? This video breaks down how to access argc and argv directly from the stack, without GCC libraries. Follow along as we write a program to loop through and print user-provided arguments, complete with a visual stack explanation. Whether you’re new to assembly or leveling up, this tutorial offers clear, practical insights for coding low-level programs. Check out my other videos for more assembly tips, and subscribe for more coding goodness!

Introduction to Command Line Arguments 00:00:00
Target Architecture and Assumptions 00:00:04
Recap of Command Line Arguments 00:00:56
Accessing Arguments in Pure Assembly 00:01:47
Program Structure and Data Section 00:02:19
Stack Pointer and Argument Count 00:04:16
Accessing Argument Vector (argv) 00:07:14
Looping Through Arguments 00:12:16
Incrementing to Next Argument 00:16:45
Visualizing the Stack 00:19:24
Running the Program with Arguments 00:21:56
Conclusion and Call to Subscribe 00:23:48

Thanks for watching!

Find us on other social media here:

  • https://www.NeuralLantern.com/social

Please help support us!

  • Subscribing + Sharing on Social Media
  • Leaving a comment or suggestion
  • Subscribing to our Blog
  • Watching the main “pinned” video of this channel for offers and extras

Hey there! Let’s talk about accepting command line arguments in a pure assembly program written in x86-64 (aka AMD64) YASM assembly on Linux.

There are going to be a lot of topics that I mention in this video that are not actually covered in this video. So you might feel yourself getting lost, for example if you don’t know how to program in assembly in the first place.

If you don’t know what command line arguments are,

if you don’t know the terminal, things like that,

you probably want to check out my previous videos

because they will be explained in detail.

For now, I’m just going to assume you know how to do all this basic stuff.

And the only thing that I need to show you

is how to write a program in Yasm assembly that can

grab your command line arguments.

So, you know, just to do a quick recap of what I’m even talking about.

Take echo. We can launch it with no arguments and it doesn’t do much (it just prints an empty line). But we can give it some arguments: we can say hello and then goodbye, and that’s two arguments that I’m giving to echo. Echo will grab on to the arguments it was given. It’s actually going to receive three arguments: the index zero argument being the name of its own program, and then the next two arguments being the stuff that I typed after the program name. You can see that it somehow figured out how to grab those two arguments as strings and print them, right?

So imagine you have an assembly program, and you want the user to be able to launch it, add arguments at the end of the command line, and have your program behave in a certain way according to what the user wants. The question then becomes: how do you access those arguments? It’s a little bit easier with GCC; look at my other video on this topic for GCC-linked assembly programs.

You pretty much just look at RDI and RSI to get your argc and your character pointer array.

When you’re dealing with pure assembly though, you don’t have the GCC libraries bundling up the command

line arguments in a really convenient way for you.

You’ve got to look for them elsewhere.

It’s honestly not that much harder.

It’s just a little, feels a little weird.

I think at least for me, I was hoping to see the arguments in a register and I did not.

But anyway, so here’s my program.

Again, this is not an assembly programming video. If you need to know how to program assembly, see my other videos. For now, I’m just going to quickly go over it.

I’ve got my data section in YASM, where I have a couple of strings defined. I’m saying, like, the module has started, and I’m going to begin printing arguments. I have some system call codes so that I can write to standard output and so that I can exit the program. This is pure assembly, so we don’t actually return from a main function; we just get jumped into, and then we call exit normally. Then here’s the standard output pipe (file descriptor 1; again, another video covers that), and then I’ve got the exit success code (again, another video covers that). Basically, exiting with zero is usually what you do for success.

Then I’ve got the text section, which holds all my assembly instructions, and this is my entry point. We don’t really need to push and pop these registers, because we’re just going to exit the program when we’re finished, so it’s totally fine. We would need to preserve callee-saved registers if we were jumping into a main function, or if this was a different function that was being jumped into, but we’re not, so we don’t have to.

Anyway, here’s my entry point, the underscore start (_start) for a pure assembly program, and then I’m going to grab the incoming arguments. When you’re linking with GCC, which we’re not doing in this video, the arguments come in very easily: argc comes in as RDI, and the character pointer array comes in as RSI. So you can imagine in C this would be int main(int argc, char *argv[]). Hopefully you’ve done this in a higher level language, so you have a better idea of what I’m talking about. If not, I guess that’s okay; this approach will still allow you to grab the arguments even if you don’t understand this in C++.

So argc is actually sitting right where the stack pointer points. The stack pointer register, RSP, always tells you where the top of the stack is, meaning the last piece of data that we actually have sitting on the stack. When our program starts, that’s going to be the count of arguments. At entry, argc is always going to be at the address held in the stack pointer.

So if you dereference the stack pointer register, what you’re doing is this: the stack pointer register points to a memory location within the stack, so if the memory location that it’s looking at is where argc is stored, then you dereference it with the brackets and you’ll actually get argc. Right away, I’m just going to steal argc off the top of the stack. This is a good idea to do right away, because if you start doing other stuff you might end up modifying the stack, especially if you start calling functions and things like that, and these pieces of information will be lost or they’ll be a lot harder to find. So right away I’m just going to say argc goes directly into R12, and I have a little comment reminder for myself at the top saying R12 is now argc.

Same thing for R13: I’m going to say that’s a pointer to my character pointer array. I talked about this in depth in my other video, but basically, if you look at argv, it’s not just a pointer and it’s not just an array; it’s a pointer to an array. And if you recall, in higher level languages, or I guess just anything, an array symbol is essentially a pointer, because the symbol argv can only point to one thing, so it’s going to point to the very beginning of the array. But the beginning of your array is not going to be one string, like one argument that the user gave you, because the user could give you many arguments. Instead, it’s an array of pointers. So it’s a character pointer array, meaning argv is a pointer to a pointer.

The first item in any array is what is pointed to by the symbol. Just for the sake of argument, imagine we had an integer array, say 500 integers, in a C++ program, and call it a. The symbol a is really a pointer that points to the first integer in that array. Later, of course, you can dereference that pointer with some sort of an index to get the index 5 integer or any other. So the symbol is, I guess you could say, both a pointer to the array and a pointer to the first item, because it’s kind of the same thing.

So that means if we grab argv, we’ll be grabbing a pointer. If we then dereference that pointer, it would tell us where another pointer is. And then that other pointer would point to a string representing the first argument.

We’ll explain that more as I do this loop; I’m going to show you some code that actually loops through all the arguments. Anyway, the thing about where to find argv is that it’s just the next item that the stack holds. Remember, I told you that when our code gets jumped into, RSP is a pointer to whatever memory location the top of the stack is. So if we dereference that, we get argc.
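To make the argc read concrete, here is a minimal sketch of a standalone program that loads argc from the top of the stack and hands it back as the exit status. The file name and build commands in the comments are assumptions, not taken from the video; only the `[rsp]` read into R12 mirrors what the transcript describes.

```nasm
; argc_status.asm (hypothetical file name)
; Assumed build steps:
;   yasm -f elf64 argc_status.asm -o argc_status.o
;   ld argc_status.o -o argc_status

section .text
global _start

_start:
    mov     r12, [rsp]      ; at entry, [rsp] holds argc as an 8-byte integer
    mov     rdi, r12        ; use argc as the exit status so it can be inspected
    mov     rax, 60         ; 60 = exit system call number on x86-64 Linux
    syscall
```

Running `./argc_status hello goodbye` and then `echo $?` in the shell should report 3: the program name plus the two typed arguments.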

Well, we just need to go find the next item in the stack so that we can get argv, you know, the pointer to the first pointer in your array of string pointers.

Why am I doing plus eight?

This might actually make sense as is, but it’s important to understand that when you add stuff to the stack, you’re actually decreasing the memory location that you are pointing to. This is not a stack video; there are other videos that I have for that. If we add something to the top of a stack, you imagine the stack growing visually in a vertical direction, like it’s growing up, right? But inside of the computer, the memory locations are actually going down.

If we’re looking at the stack pointer and we’re adding eight, what it’s saying is that

we’re looking back into memory that was already added for the stack.

So that’s like if we grew the stack, that would be subtracting from the memory location.

So like looking further up in the stack would be subtracting from the memory location because

the stack grows upward in the abstract, but downward in memory locations.

So that means if we add eight, we’re going in the other direction.

We’re looking downward into the stack.

So what that means is that the top two items on the stack

are first argc and then second, that first pointer,

you know, the argv argument.

And that’s how we access it with plus eight.

Why is it a plus eight instead of plus something else?

Because all pointers on 64-bit systems are 64-bit integers,

and 64-bit integers are eight bytes, if we’re talking about eight bits per byte.

So this is just how you access it.

What about this other thing going on here?

What’s this LEA instruction?

Usually you see a move instruction, right?

So we basically want to look at the stack pointer,

but look one level lower.

And each item on the stack is also going to be eight bytes.

We want to look one item lower on the stack,

and then we want to dereference it.

But we don’t want to dereference and store the dereferenced value; we still want to grab the actual pointer. The thing is, I shouldn’t have said dereference in the first place. These are dereferencing brackets, right? Up here on line 46, when you put brackets around RSP, it dereferences whatever value RSP holds. So when you put brackets around RSP plus eight, you kind of expect that you’re going to be dereferencing too. But we don’t want to dereference; we just want the address itself.

So the problem is that when we say RSP plus eight, we can’t actually write a mathematical formula unless we’re inside of brackets, which means we can’t do it in a move instruction without accidentally dereferencing and losing our pointer to the actual array of pointers. If we dereferenced like this, we would just end up with a pointer to one string.

The way around this is that instead of using the move instruction, we use the LEA instruction. LEA (load effective address) allows us to put a formula inside of brackets so that the assembler won’t get confused, but then it won’t put the dereferenced value into R13. It’ll just put the computed value of that formula, meaning it’ll give us the memory location of the item sitting one under the top of the stack. At this point, R13 is the memory location of the item sitting one underneath the top of the stack, and we can dereference that later in order to look at all of our arguments.
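The difference between MOV and LEA with the same bracketed formula can be sketched as follows. This is an illustrative standalone program, not the code from the video; the use of RBX and the exit-status check are assumptions added so the result is observable.

```nasm
; lea_vs_mov.asm (hypothetical file name)
; Stack at _start (higher addresses are "lower" in the stack):
;   [rsp]      argc
;   [rsp+8]    pointer to the argv[0] string (program name)
;   [rsp+16]   pointer to the argv[1] string, and so on

section .text
global _start

_start:
    mov     r12, [rsp]      ; argc: the value stored at the top of the stack
    lea     r13, [rsp+8]    ; r13 = ADDRESS of the slot one below the top
    mov     rbx, [rsp+8]    ; rbx = CONTENTS of that slot (pointer to one string)

    xor     edi, edi        ; exit status 0 means the two views agree
    cmp     rbx, [r13]      ; dereferencing r13 yields the same string pointer
    je      .exit
    mov     edi, 1          ; only reached if the layout were different
.exit:
    mov     eax, 60         ; exit system call
    syscall
```

The point of keeping the address in R13 rather than the contents is that the address can later be advanced by eight to reach the next argument’s pointer, while the contents alone only ever lead to the first string.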

So the next part of our code is just going to print an intro message. There’s this intro string up here, the hello message, and I’m basically going to call a custom function that I made to help with printing. Don’t worry about this code right here.

This is not the point of the video.

I just made a custom function that just kind of helped me print.

Don’t worry about that.
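For readers who want a rough picture anyway, the data section and intro print described earlier might look something like this sketch. Every label name and the message text are assumptions, since the actual source is not shown in the transcript, and it uses a raw write system call in place of the custom print function.

```nasm
; intro_message.asm (hypothetical file name)

section .data
    MSG_INTRO       db  "Module started; printing arguments:", 10  ; illustrative text
    MSG_INTRO_LEN   equ $ - MSG_INTRO   ; length computed by the assembler
    SYS_WRITE       equ 1               ; write system call number
    SYS_EXIT        equ 60              ; exit system call number
    FD_STDOUT       equ 1               ; standard output file descriptor
    EXIT_SUCCESS    equ 0

section .text
global _start

_start:
    mov     rax, SYS_WRITE      ; write(FD_STDOUT, MSG_INTRO, MSG_INTRO_LEN)
    mov     rdi, FD_STDOUT
    mov     rsi, MSG_INTRO
    mov     rdx, MSG_INTRO_LEN
    syscall

    mov     rax, SYS_EXIT       ; pure assembly: exit instead of returning
    mov     rdi, EXIT_SUCCESS
    syscall
```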

The real meat of this video is that we’re going to loop through all arguments

and just print every single argument that the user provided.

So you can imagine this is maybe the top of a while loop here.

Notice how, oh, I forgot to replace main with start.

I had another version of this program that used main.

Just pretend this is called the main loop instead of the main function’s loop. Anyway, main loop initialize: the first thing I’m going to do is initialize the loop by saying we’re looking at index zero. We want to look at all the arguments, you know, index zero and index one and index two, and just keep going until we’re out of arguments. Remember that we also have the number of arguments, coming from argc, now sitting in R12. So if we have a counter in R14 that starts at zero, and we know how many arguments there are from R12, we should be able to know when the loop stops.

Just a reminder that the arrays here are zero-based, which means, for example, that if we have three arguments, the array of arguments is going to have indexes zero, one, and two, so the last valid index is going to be two, or the size minus one. Think about that: if you have five arguments, the last index that is valid is going to be four, size minus one, because it’s zero-based. We can use that logic to figure out if we’re done.

So we start off with an index of zero here, and then at the top of the loop I’m just going to quickly ask: are we done? How do we know if we’re done? We compare the current index we’re looking at with the count, and if the current index is greater than or equal to the count, then we know we’re already done. We don’t need to look at any more indexes. Why am I asking whether the index is greater than or equal to the count? Because, remember, if we have five arguments then the last valid index is four. Therefore, if we find ourselves looking at index five, we know we’re already done. Five is equal to or greater than five, but four is not equal to or greater than five.

So that’s why I’m using this logic here.
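The index-versus-count check can be sketched on its own, without any printing. In this illustrative program the loop body only advances the index, and the final index is returned as the exit status so the number of iterations is visible. The label names and the choice of `jge` are assumptions consistent with the "greater than or equal" logic described above.

```nasm
; loop_skeleton.asm (hypothetical file name)

section .text
global _start

_start:
    mov     r12, [rsp]      ; r12 = argc (the count)
    xor     r14, r14        ; r14 = current index, starting at 0

loop_top:
    cmp     r14, r12        ; compare index with count
    jge     loop_done       ; index >= count: no valid indexes remain

loop_body:
    inc     r14             ; real work would go here; then move to next index
    jmp     loop_top

loop_done:
    mov     rdi, r14        ; exit status = number of iterations performed
    mov     rax, 60         ; exit system call
    syscall
```

With five arguments in total the body runs for indexes 0 through 4 and stops when the index reaches 5, so the exit status matches argc.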

I’m saying if we’re done, then jump to the done label.

Basically what’s the done label?

It’s main loop done.

All it really does is it says goodbye and then it just exits the program with a system

call.

Again, I have other videos that explain system calls and all that stuff.

So if we’re done, we jump down to the done area. If not, execution will fall through to this next label. We don’t really need a label there, but I like to put it there just to help myself remember what this looks like in terms of a while loop. Here’s the top of the while loop, here’s the little comparison part of the while loop, here’s the body, like the opening brace of the while loop, and then here’s sort of like the closing brace. I just like to do that.

Anyway, after we’ve decided that we are not done and we drop through to the next actual instruction, we’re just going to print the next argument. So what’s the next argument? We will dereference R13. Remember, up here we took the address of the second item, the one sitting right under the top of the stack, and we stuck it into R13. So R13 is now the address of the second item on the stack. If we dereference it, that’s going to give us a pointer to the string of the first argument.

Let me say this one more time just to make sure that I’m clear: the item right under the top of the stack is a pointer to the first argument string. So if we simply dereference R13 into RDI, we’re basically telling RDI, here’s a pointer to the string that we want to print.

Okay, so then I just call my helper function to print that string to standard output, no problem.
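Here is a minimal sketch of that step: dereference R13 into RDI and call a print helper. The helper `print_cstring` is hypothetical (the video’s own helper is not shown); this version counts bytes up to the null terminator and then issues a single write system call.

```nasm
; print_first.asm (hypothetical file name)

section .data
    NEWLINE db 10

section .text
global _start

_start:
    lea     r13, [rsp+8]        ; r13 = address of the slot holding argv[0]'s pointer
    mov     rdi, [r13]          ; rdi = pointer to the first argument string
    call    print_cstring       ; hypothetical helper, defined below

    mov     rax, 1              ; write a trailing newline to stdout
    mov     rdi, 1
    mov     rsi, NEWLINE
    mov     rdx, 1
    syscall

    mov     rax, 60             ; exit(0)
    xor     rdi, rdi
    syscall

; print_cstring: write the null-terminated string at rdi to standard output.
; Clobbers rax, rcx, rdx, rsi, rdi, r11 (rcx and r11 are overwritten by syscall).
print_cstring:
    xor     rdx, rdx            ; rdx = length so far
.count:
    cmp     byte [rdi+rdx], 0   ; reached the terminating zero byte?
    je      .write
    inc     rdx
    jmp     .count
.write:
    mov     rsi, rdi            ; buffer = start of the string
    mov     rax, 1              ; write system call
    mov     rdi, 1              ; file descriptor 1 = stdout
    syscall
    ret
```

Run on its own, this prints only the index zero argument, which is the program’s name as it was typed on the command line.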

And then we have to figure out like, how do we increment to go to the next string?

This might have been unclear earlier, because GCC does it in a different way, but on the stack every item underneath the top of the stack is another pointer. It’s part of the original argv array; it’s another pointer to a different string. So basically, if I increase R14 here, I’m increasing the index counter, and that’ll help us eventually terminate. But notice how here I’m also adding eight to R13. Remember, R13 holds the address of the second item, the item right under the top of the stack, and if I dereference that, I have a pointer to the first argument string. If I want to go one lower into the stack, I just add eight to that register’s value. Because again, remember, the stack grows downward in memory, so if I increase the value, I’m sort of going through previous items that were put onto the stack.

So imagine the stack here, it’s got like,

you know, argc sitting on top

and then underneath argc,

it’s got a pointer to the first argument.

And then under that in the stack,

it’s got a pointer to the second argument.

And then under that in the stack,

it’s got a pointer to the third argument.

Then under that, it’s got a pointer

to the next argument and so forth.

So every time we want to go to the next argument’s string pointer,

we just add eight to that R13 register,

which was originally pointing at the first string’s pointer.

So again, why eight? Because pointers are 64-bit integers, therefore they are eight bytes. So I’m just literally increasing the index by one, and then I’m moving that register R13 to point to the next pointer. That way, next time I dereference it up here on line 70, I’ll be getting the next string. And these strings don’t need to be contiguous; they could be located anywhere. It’s the array of pointers on the stack that is contiguous, and it’s just full of pointers to strings. If we dereference R13 at any point, we get the address of a string, that is, a pointer to a string. This double pointer stuff, sometimes I get tongue-tied.

Okay, so when we’re done printing and incrementing, we just jump to the top of the loop.

And then finally, when we’re done, well, we’re done.
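Putting the pieces together, the whole program described so far could be sketched roughly like this. The register roles (R12 = argc, R13 = address of the current pointer slot, R14 = index) follow the transcript; the label names, message text, build commands, and the `print_cstring` helper are assumptions standing in for code the transcript does not show.

```nasm
; print_args.asm (hypothetical file name)
; Assumed build steps:
;   yasm -f elf64 print_args.asm -o print_args.o
;   ld print_args.o -o print_args

section .data
    MSG_INTRO       db  "Printing arguments:", 10, 0    ; illustrative text
    MSG_GOODBYE     db  "Goodbye!", 10, 0
    NEWLINE         db  10, 0
    SYS_WRITE       equ 1
    SYS_EXIT        equ 60
    FD_STDOUT       equ 1
    EXIT_SUCCESS    equ 0

section .text
global _start

_start:
    mov     r12, [rsp]          ; r12 = argc, grabbed before the stack changes
    lea     r13, [rsp+8]        ; r13 = address of the first argv pointer slot

    mov     rdi, MSG_INTRO
    call    print_cstring

    xor     r14, r14            ; r14 = index, starting at 0
main_loop_top:
    cmp     r14, r12            ; index >= argc means we are done
    jge     main_loop_done
main_loop_body:
    mov     rdi, [r13]          ; dereference: pointer to the current string
    call    print_cstring
    mov     rdi, NEWLINE
    call    print_cstring
    inc     r14                 ; next index
    add     r13, 8              ; next pointer slot (pointers are 8 bytes)
    jmp     main_loop_top
main_loop_done:
    mov     rdi, MSG_GOODBYE
    call    print_cstring
    mov     rax, SYS_EXIT
    mov     rdi, EXIT_SUCCESS
    syscall

; Hypothetical helper: write the null-terminated string at rdi to stdout.
; Leaves r12, r13, r14 untouched; clobbers rax, rcx, rdx, rsi, rdi, r11.
print_cstring:
    xor     rdx, rdx
.count:
    cmp     byte [rdi+rdx], 0
    je      .write
    inc     rdx
    jmp     .count
.write:
    mov     rsi, rdi
    mov     rax, SYS_WRITE
    mov     rdi, FD_STDOUT
    syscall
    ret
```

Launched with a few words after the program name, this should print the intro line, then each argument on its own line starting with the program’s own name, then the goodbye line.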

Let me draw this out for you, because I think the way I’m explaining it might be

a little bit unclear, so I just want to make sure that I’m being totally clear on

this, okay, so I’ve got my little annotator here and you can imagine here’s a stack

and it visually grows up.

We can imagine that every time we add an item to the stack, it grows up.

Like if you wanted to take a five, stick it on top of the stack.

Well, then it would end up on the top of the stack, right?

No problem.

Oh, maybe this needs to be bigger because of my pen size.

So let’s do it like that.

What the heck happened?

I lost the whole desktop.

There we go.

So we have this and this and this and this.

And so there’s probably some kind of a value sitting on top of the stack. When we started the program, RSP is a register that just has the memory location of the top of the stack, you know, the most recently added item. We know that this was actually argc. So if we dereference RSP, that’ll go to wherever argc is and get it. We can basically say that argc is sitting on the top of the stack: we dereference RSP, which holds the memory location of this place in memory, where we just have an integer stored, an eight byte integer. Okay, cool.

So then the next lower item (at the next higher address) is argv at index zero, and its address is the same thing as just saying argv. Remember, a pointer to the array is really a pointer to the first item. The next lower item is argv at index one, and so forth. We can just keep doing this, going lower and lower on the stack by adding eight to R13, because remember, R13 originally pointed to this item right here. R13 starts off pointing to the first argument’s pointer because that’s the way we have it set up. So every time the loop iterates, if we dereference R13 we’re getting a pointer to an argument string, and when we add eight bytes to R13 we’re really just moving it down to the next pointer that we can dereference.

Hopefully that was a little bit more clear than what I said before, or maybe you’re just a visual learner. You know, it’s a good idea to try to explain things in many different ways. But long story short, we’re going down, down, down further in the stack to get more arguments, and we’re just printing them.

So now that we’ve explained it all, we should be able to just run clear and make run. Under the hood, let me just show you real fast what I’ve got inside of my makefile when we do make run. You don’t need to know about makefiles to understand this video (I have a video that explains all of this, don’t worry), but you probably do need to know them eventually. Notice how when I call the executable, I give it some arguments.

I give it just some strings.

First arg, second arg, third arg, fourth arg.

So that’s the same thing as if I typed echo first arg,

second arg, third arg, fourth arg, right?

That’s what echo is doing.

So in the make file, I’m just giving it those arguments.

And then our program now is looping

through all of the arguments.

It knows when to stop because we grabbed argc, and it knows where those strings are located because the pointers to those strings are just sitting on the stack. So if we deref twice, we’re doing a double dereference: a pointer to a pointer. At least, we could say R13 is a pointer to a pointer, but the actual values sitting inside the stack are just pointers to strings. So let’s just do this in a slightly different way.

Clear, and then I’ll say main, because that’s the name of the program that I compiled (you can imagine it could be named something else), and I’ll just put: hello you are super cool now. So I’m giving it one, two, three, four, five, six arguments in addition to main, so it should print seven things: main hello you are super cool now. So notice how it prints main hello you are super cool now.

All right, that’s pretty much everything that I wanted to show you for the pure assembly

version of grabbing command line arguments.

Thank you so much for listening.

It’s a little late for me.

The sun’s starting to come up.

I’ve got to go and eat a bunch of cookies.

So I hope you learned some stuff and I hope you had a little bit of fun.

Thank you so much for watching this video.

I will see you in the next one.

Happy studying and coding and all that stuff.

Hey everybody!

Thanks for watching this video again from the bottom of my heart.

I really appreciate it.

I do hope you did learn something and have some fun.

If you could do me a small little favor,

could you please subscribe and follow this channel or these videos

or whatever it is you do on the current social media website

that you’re looking at right now.

It would really mean the world to me and it’ll help make more videos

and grow this community.

So we’ll be able to do more videos, longer videos, better videos,

or I’ll just be able to keep making videos in general. So please do me a kindness and subscribe. You know, sometimes I’m sleeping in the middle of the night and I just wake up because I know somebody subscribed or followed. It just wakes me up and I get filled with joy. That’s exactly what happens every single time. So you could do it as a nice favor to me, or you could troll me if you want to just wake me up in the middle of the night: just subscribe and then I’ll just wake up. I promise that’s what will happen.

Also, if you look at the middle of the screen right now, you should see a QR code which you can scan in order to go to the website, which I think is also named somewhere at the bottom of this video. It’ll take you to my main website, where you can see all the videos I’ve published and the services and tutorials and things that I offer, and all that good stuff.

If you have a suggestion for clarifications or errata, or just future videos that you want to see, please leave a comment. Or if you just want to say what’s up, what’s going on, you know, just send me a comment, whatever. I also wake up for those in the middle of the night. I wake up in a cold sweat and I’m like, it would really mean the world to me. I would really appreciate it. So again, thank you so much for watching this video, and enjoy the cool music as I fade into the darkness, which is coming for us all.

Thank you.
