Master File Copying with System Calls in x86-64 YASM Assembly

Join me as I break down a simple yet powerful x86-64 YASM assembly program to copy files using system calls! Learn how to open input files, create output files, and use a looping buffer for efficient data transfer. I’ll demo the code, explain file handles, permissions, and error handling, and even verify the copy with MD5 checksums. Perfect for intermediate assembly programmers or anyone curious about low-level file operations. Check out my other videos for more assembly tips, and don’t forget to subscribe!

Introduction to File Copy Program 00:00:00
System Calls in YASM Assembly 00:00:06
Opening Input File 00:00:17
Creating Output File 00:00:22
Using Looping Buffer 00:00:28
Assembly Program Prerequisites 00:00:37
Overview of Source File 00:00:53
Data Section and Strings 00:01:06
Copy Buffer Length 00:01:21
File Permissions Explanation 00:02:05
System Call Codes 00:02:57
File Descriptors and Exit Codes 00:03:24
Text Section and Entry Point 00:03:45
Welcome Message Function 00:04:01
Print Null Terminated String 00:04:16
Running Initial Program 00:06:26
MD5 Checksum Explanation 00:07:00
File Tests Function Introduction 00:07:28
Open File Read Function 00:09:52
Checking File Handle 00:14:00
File Handle Concept 00:14:25
Error Handling Importance 00:15:24
Testing File Open 00:23:00
Create File Function 00:31:16
Testing File Creation 00:32:36
Copy File Function 00:36:14
Stack Buffer Creation 00:38:16
While Loop for Copying 00:40:14
Read System Call 00:40:47
Write System Call 00:45:03
Checking Read/Write Operations 00:46:51
Final Program Run 00:49:00
Verifying File Copy with MD5 00:49:22
Testing with Larger Input 00:51:20
Optimizing Buffer Size 00:53:41
Conclusion and Call to Subscribe 00:54:41

Thanks for watching!

Find us on other social media here:

https://www.NeuralLantern.com/social


Hey everybody. In this video I’m going to show and explain a simple program that copies a file

using system calls in an x86-64 YASM assembly program.

We’re going to use system calls to open an input file and read characters from it.

We’re going to use another system call to create a destination file and write characters to it.

In the middle, we’re going to use a looping buffer, which should be kind of fun.

And I’m just going to do my best to explain as much as I can.

I should say, though, that before you watch this video,

you probably already need to know how to program in assembly,

at least at a basic level, in YASM.

And so if you don’t know how to do that yet,

this video is probably going to be confusing for you.

You probably need to look at my other videos dealing with Yasm assembly and

system calls and so forth.

So I’m just going to show you real fast what’s actually going on in this source file so far.

So this is my assembly program.

It’s not finished.

I’m going to write it on screen for you.

For the most part, you can see I’ve got a data section here and then I just have a

bunch of null terminated strings.

So I’ve got a string saying, hey, the module started.

We’re about to open the file.

We’re about to create the file.

We failed to do something.

We’re done copying.

You know, we terminated the program, you know, whatever.

Right. So I just have some strings.

No big deal.

Down here, I have something called the copy buffer length, which is just the size of the buffer

that I’m going to use between the input file and the output file.

A two-byte buffer is really inefficient; it’s too small. But I made it two bytes just to show you

that our looping read area is actually going to work, because if I make a buffer

that’s bigger than the file, then we won’t actually know

if the buffer loop works or not. So I’m just gonna put two here. We could change

that later if we wanted to. We’re gonna be reading from a file called input

and writing to a file called output, so that’s no big deal. In order to open a file with

a system call for read mode, we’re just going to use a zero as the flags. That just means read mode.

And then for the creation of the file, we’re going to use some standard permissions. This is not a

permissions video. I’m probably going to release a permissions video at some later point in time.

Not too important right now, but basically this 640 is the heart of

what permissions we’re actually looking at.

The q at the end marks the number as octal.

And six just means that the owner can read

and write to the file.

Four means that anyone who is in the same group

as the file has been assigned to,

which is usually the owner,

can just read it but not write to it.

And then everybody else,

people who are not the owner

and who are not in the right group,

they have no access to it.

Just a simple security feature of Linux

for file permissions.

You can go a lot further than this in the terminal,

but for this assembly program,

we’re just gonna use basics.

We’re not gonna use ACLs.

I don’t even know how to do that in assembly.

System call codes.

So again, if you don’t understand system calls,

you gotta watch my other videos first,

but we have a code of zero to read from a file,

one to write to a file, two to open a file,

three to close an open file.

Probably should have put that one before create,

but I don’t really care.

85 is the code to create a file for writing,

and then 60 is the code to exit from the program.

So then we have file descriptors.

Descriptor number one is a standard output.

Descriptor number two is standard error.

And it’s always one and two, no matter what program you’re dealing with.

Unless you have some kind of really crazy non-standard thing going on.

And then for the exit codes, we’re just going to say exit zero for success and exit one

for failure.
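As a rough sketch, the constants described so far could be laid out like this. The numeric values (call codes, descriptors, exit codes, the two-byte buffer, the read flag, the 640 permissions) come from the walkthrough; every identifier and the message wording are placeholders of mine, not the names in the on-screen source.

```nasm
; Sketch only: identifiers and message text are hypothetical.
section .data

; Null-terminated strings (note the trailing 0 on each).
MSG_WELCOME         db  "File copy module started.", 0
FILE_NAME_READ      db  "input.txt", 0
FILE_NAME_WRITE     db  "output.txt", 0

; Copy buffer size; deliberately tiny so the copy loop has to iterate.
COPY_BUFFER_LEN     equ 2

; Flags for opening in read mode, and permissions for the created file.
FILE_FLAGS_READ     equ 0           ; 0 = read only
FILE_PERMS          equ 0640q       ; octal: owner rw, group r, others none

; System call codes (x86-64 Linux).
SYS_READ            equ 0
SYS_WRITE           equ 1
SYS_OPEN            equ 2
SYS_CLOSE           equ 3
SYS_EXIT            equ 60
SYS_CREATE          equ 85

; Standard file descriptors.
FD_STDOUT           equ 1
FD_STDERR           equ 2

; Exit codes.
EXIT_SUCCESS        equ 0
EXIT_FAIL           equ 1
```

Using `equ` keeps the magic numbers in one place, which is the point made above about defines versus hard-coded values.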

Now to the actual text section, where our instructions are in the assembly program.

So you can see section text here.

We’ve got a global entry point called underscore start.

So this is not a GCC program that requires a main entry point.

This is a pure assembly program that requires

an underscore start entry point.

Again, see my other videos for more assembly explanation.

I’m gonna call on a method,

well, not a method, a regular function called welcome.

Just to print a little welcome message,

you can see right here, that’s all that’s doing.

It’s just loading, you know, a string,

sending it to standard output, and calling on a helper function that I made called print null

terminated string. When it’s done with that, it just uses the call code to exit the program.
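A minimal sketch of the entry point as described might look like the following. The label names and message text are assumptions, and the print helper is declared `extern` here only so the fragment assembles on its own; in a single-file program it would simply be defined further down and the `extern` line dropped.

```nasm
; Sketch only: names and message wording are hypothetical.
section .data
MSG_WELCOME     db  "File copy module started.", 0

SYS_EXIT        equ 60
FD_STDOUT       equ 1
EXIT_SUCCESS    equ 0

section .text
extern print_null_terminated_string    ; assumed helper: (char *s, long fd)

global _start
_start:
    call    welcome
    ; call  file_tests                 ; left commented out until it is written

    mov     rax, SYS_EXIT              ; call code for exit
    mov     rdi, EXIT_SUCCESS          ; exit status
    syscall

; void welcome();
welcome:
    mov     rdi, MSG_WELCOME           ; first argument: pointer to the string
    mov     rsi, FD_STDOUT             ; second argument: where to print it
    call    print_null_terminated_string
    ret
```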

No big deal. Nothing too advanced so far. I have the tests function commented out because I want

to write that in front of you. And if you look at what I have inside of print null terminated string,

this is not the point of the video, so I’m just going to skim it. It’s not super important.

It’s a function called print null terminated string that takes in a pointer to a character array

and a file handle for where you want to print it. What it’ll do is it’ll come

in and grab the incoming arguments. It’s only got two arguments, so it uses R12 and R13

to store those arguments, and then R14 is the result of calling another helper

called string length. You can imagine my string length function just kind of

scans the string until it finds a null terminator, aka the number zero, not the

character zero but an actual zero byte, in order to figure out how long the string is. And then it can use a

regular system call to print the string to the right file descriptor. We’re going to be doing

mostly this exact same thing when we copy the file, so I’m definitely not

going to explain it. If you don’t know pushes and pops and epilogues and prologues, see my other

videos. CRLF is just going to print a carriage return, new line, or sorry,

carriage return line feed.

So that’s just like making the cursor go to the next line.

Not a big deal.

We just print a string basically.
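The two helpers skimmed above could be sketched roughly as follows. The register roles (R12 and R13 for the arguments, R14 for the length) follow the description; the function and label names are my guesses, and the real source may differ in detail.

```nasm
; Sketch only: function and label names are hypothetical.
SYS_WRITE   equ 1

section .text
global print_null_terminated_string

; long string_length(char *s);
; Counts bytes until the first zero byte (the null terminator).
string_length:
    xor     rax, rax                    ; length so far = 0
string_length_loop:
    cmp     byte [rdi + rax], 0         ; reached the terminator?
    je      string_length_done
    inc     rax
    jmp     string_length_loop
string_length_done:
    ret

; void print_null_terminated_string(char *s, long file_handle);
print_null_terminated_string:
    push    r12                         ; prologue: preserve callee-saved registers
    push    r13
    push    r14

    mov     r12, rdi                    ; r12 = pointer to the string
    mov     r13, rsi                    ; r13 = file handle to print to

    mov     rdi, r12
    call    string_length
    mov     r14, rax                    ; r14 = string length

    mov     rax, SYS_WRITE              ; call code for write
    mov     rdi, r13                    ; where to write
    mov     rsi, r12                    ; what to write
    mov     rdx, r14                    ; how many bytes
    syscall

    pop     r14                         ; epilogue: restore in reverse order
    pop     r13
    pop     r12
    ret
```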

And then I have a custom die function

that allows us to die with a failure code of one.

Oh, I forgot to mark that.

Let’s see, 206 exit fail.

I’m just gonna put that down here.

Yeah, instead of hard coding values,

it’s a lot better to use variables if you can or defines.

So all that’s gonna do is just kind of print

an error message and then exit

with, you know, the appropriate

exit code, which is just going to be one because this is a simple program.
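One way such a die function could look is sketched below. To keep the fragment self-contained it writes a fixed-length message directly with the write call instead of going through the print helper, which is a simplification on my part; the names and the message text are also assumptions.

```nasm
; Sketch only: names and message wording are hypothetical.
SYS_WRITE   equ 1
SYS_EXIT    equ 60
FD_STDERR   equ 2
EXIT_FAIL   equ 1

section .data
MSG_DIE     db  "Terminating program after failure.", 13, 10
MSG_DIE_LEN equ $ - MSG_DIE             ; length computed by the assembler

section .text

; void die();  (never returns)
die:
    mov     rax, SYS_WRITE              ; complain on standard error
    mov     rdi, FD_STDERR
    mov     rsi, MSG_DIE
    mov     rdx, MSG_DIE_LEN
    syscall

    mov     rax, SYS_EXIT               ; then exit with the failure code
    mov     rdi, EXIT_FAIL
    syscall
```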

So CRLF, print null terminated string, die, string length, a welcome message, all this fun, pretty stuff

that doesn’t really do anything except make the program more fun to look at. So I’m going to run it here and make sure that actually works.

Okay

Nice, okay. So if I run it, you can see the makefile system. Again,

this is not a makefile video, so I’m not going to show you

my makefile. See my other videos if you want to learn more about makefiles. But

you can see that the assembly program says that it started and then it just

exited. Then the makefile continues to run these extra commands that I have

set up. So this is not part of the assembly program. This is just taking

an MD5 checksum of the input file and then an MD5 checksum of the output file.

If you don’t know what MD5 checksums are, I’ll probably release a video sometime in the future, if you’re

interested, on all my platforms, just talking about why you would use a checksum. But for now,

just imagine this is a fingerprint. If the fingerprints don’t match, that means the files

don’t match. Right now the output file doesn’t even exist, so it’s just an error, and that’s why the

makefile thinks there’s an error. So that’s the basics

of getting started with this bare-bones program.

Now let’s start looking at running the file tests. So file tests, we’ve got to make a new

method. I’m going to stick it down here. Let’s see. I’ve got a solution up so that I don’t have to

spend too much time typing. I’m going to try my best to balance between copy pasting and just

typing very quickly. But let’s see, where’s the file tests? So we’ve got this.

I’m feeling pretty lazy.

I’m just going to copy paste the whole thing.

So now this method right here is going to get called the file tests.

I keep saying method because I teach C++ a lot too.

File tests.

If we go down further, this function is called file tests.

Here’s the signature.

It doesn’t take any arguments.

It just kind of does stuff.

It doesn’t return anything.

And here are the registers that I’m going to use:

the input file handle, the output file handle, and then the count of bytes read from the input

file at any given time inside of our looping buffer reader section. So the first thing I’ve

got to do is open a file to read. The second thing I’ve got to do is create a file to

write. And then I’m going to copy the input file to the output file. And then I’m going to print

a message saying, hey, everything was successful. And then I’m going to close both files. Notice how

I’m not really doing anything here myself; I’m just sort of calling other functions to do the work for me. Again, if you don’t know the

prologue and epilogue stuff, or calling functions and returning, you should see my other videos.
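The outline above (open, create, copy, report, close both) might translate into something like this sketch. The helper names mirror the spoken ones, but their exact spellings, the `copy_file` signature, and the message text are assumptions; the helpers are declared `extern` only so the fragment assembles by itself.

```nasm
; Sketch only: names, message text, and the copy_file signature are hypothetical.
FD_STDOUT           equ 1
FILE_FLAGS_READ     equ 0
FILE_PERMS          equ 0640q

section .data
FILE_NAME_READ      db  "input.txt", 0
FILE_NAME_WRITE     db  "output.txt", 0
MSG_DONE            db  "Done copying.", 13, 10, 0

section .text
extern open_file_read                   ; long open_file_read(char *name, long flags)
extern create_file                      ; long create_file(char *name, long perms)
extern copy_file                        ; assumed: void copy_file(long in, long out)
extern close_file                       ; void close_file(long handle)
extern print_null_terminated_string     ; (char *s, long fd)

; void file_tests();
; r12 = input file handle, r13 = output file handle
file_tests:
    push    r12
    push    r13

    mov     rdi, FILE_NAME_READ         ; open the input file for reading
    mov     rsi, FILE_FLAGS_READ
    call    open_file_read
    mov     r12, rax

    mov     rdi, FILE_NAME_WRITE        ; create the output file
    mov     rsi, FILE_PERMS
    call    create_file
    mov     r13, rax

    mov     rdi, r12                    ; copy input to output
    mov     rsi, r13
    call    copy_file

    mov     rdi, MSG_DONE               ; report success
    mov     rsi, FD_STDOUT
    call    print_null_terminated_string

    mov     rdi, r12                    ; close both files
    call    close_file
    mov     rdi, r13
    call    close_file

    pop     r13
    pop     r12
    ret
```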

But I just like to use helper functions. Assembly is so unwieldy, right? It just gets out

of control so quick and so confusing so fast. So anytime you can take a chunk out

of your assembly code and move it somewhere, to another module or to another function,

it’s going to make your life a lot easier and make debugging a lot easier. People who try to write

their entire program in just one gigantic function, those are the people who usually

end up spending 10 times longer debugging for no reason at all. So I believe in the power of modular

thinking. So anyway, what are we going to do inside of the open file read function? It’s not too bad,

to be honest. Let’s see if I can find it real quick. Open file read. I’m going to copy paste it because,

again, I’m pretty lazy.

I’ll just explain what it is though.

So down here we need a function called open file read.

So I’m just going to do that.

And you can see the signature that I’ve chosen for this. I like to

write all my functions as C++-style prototypes so that I can have a

better understanding of what the assembly is actually supposed to do.

So you can see the function is called open file read.

I want it to take two arguments.

The first argument should be a pointer to a string that represents the file

name that I want to open, and it should be null terminated. Again, if you look back at

my strings up here, they all have little zeros at the end, so they are all null terminated. Anyway,

that’s the first argument. That’s going to show up as RDI in assembly. And then the second argument

is the flags for opening the file. I think probably

that was redundant. If I name this open file read, then I think it should be obvious that the

flags are just going to be the read flags only, so I probably didn’t even need to provide this.

That was bad design on my part. You could write a better one on your own where it’s just one argument,

and if the name says read, then just use the read flags. Or if you want to leave the

flags in there, you could just call it open file. It’s up to you. Anyway, it attempts to open a

file. If it’s successful, it’ll return the file handle. That’s the long return type right here,

long file handle. If it fails, then it will just basically complain and exit

the program. You probably want a more sophisticated way of handling errors in

your program. I just decided to complain and exit the entire program because this

is not supposed to be super complicated. I just want to show you how

to copy files. And then for me, I like to leave comments that just sort of explain

what each register is used for. And once my functions get so complicated that I actually run out of registers,

then that tells me I probably need to just make another function,

you know, split the work up in some way.

So far, so good.

I only end up reusing the same registers for multiple purposes occasionally.

I’m not a hardcore assembly programmer.

I’m just, you know, I’m like medium.

So I’m going to use R12 and 13 and 14.

That’s why we have the prologue pushing all those registers to preserve them,

because they’re callee saved, and then the

epilogue that just kind of pops them. I’ve got a label here for the function. Remember, a function

is just a label and then a return statement. As long as you obey and respect the ABI, like preserving

certain registers, then you should be okay. First thing I’m going to do is grab the incoming

arguments. So I’m going to grab the file name, which is a pointer, which means it’s just a 64-bit

integer I can stick into a register, and then the flags, same thing. So I’m going to grab

RDI and RSI, the incoming arguments, and stick them into R12 and R13, so the character

pointer and the flags. And then I’m going to attempt to open the file with a system call.

The system call is right there, line 168. If you look up the table for system calls in my favorite book

that I usually recommend, or any table that knows the system call codes for x86-64 assembly,

the call code to open a file is some number. I’ve assigned that to sys open. If we look up here,

open is code 2. I don’t want to remember the numbers, and it’s bad to hard

code numbers, so I just put it as a define. And then the first argument that it wants in RDI

is the name of the file, so I just gave it R12. I guess I probably could have just used RDI

directly in the system call, but reusing the argument registers tends to make me nervous.

I like to have them somewhere where they’re not going to be destroyed.

If I like, let’s say I accidentally added some code here on line 163.

If I wanted to reuse RDI and I accidentally added some code there,

then RDI would have been destroyed.

Would have cost me a bunch of time debugging my program.

Although I admit it’s not super efficient to do it that way.

Then we’re going to do the file status

flags as the second argument, and then we just do a system call right away.

Notice how we use RAX to send in the call code.

And then the system call sends us back its return result, also in the RAX register.

You can see up here, it’s just the file handle.

So we can assume maybe at this point that we have a file handle sitting in R14.

You know, what is a file handle? When you ask the system to open a file for you,

the operating system under the hood is just going to do a

bunch of stuff to actually open the file. It’s going to take the string

that you sent it and parse it and figure out, you know, how do I interpret

where that actually is on disk? I’m going to look at the file system and the path that you

provided, I’m going to look at the mount points, and I’m going to figure out where exactly on disk

that file starts. And then the operating system stores that. It stores the file

name, it stores where you’re currently looking in the file, it stores

a bunch of stuff that you don’t want to have to remember. It creates data structures under

the hood, and all it’ll give you back in return is a file handle, for simplicity. Then later

you can use that file handle to just sort of say, I would like to write some bytes to a file, or I

would like to read some bytes from an open file; here’s the handle you gave me previously. And then

it’ll just work, assuming you have a valid handle. So the handle is kind of the most important part.

So you want to check whether the open failed, because, and I’ve said this in other videos, it’s a mistake that new programmers

or lazy programmers make. Let’s suppose, for the sake of argument, a file open sys call.

Pretend we’re in C++ and there is some sort of an API function that we can call, either

directly to the system or in some person’s library, and it’ll open a file for us. So

let’s just say we call this, right? I’m going to do a little comment. If inside of your program you just

call it with some path and then you assume that what you have is a valid

handle, or maybe this is not a function that returns a handle and you

just assume that the call succeeded, your program is probably going to have errors when you least

expect it. And it’s not going to look good. Especially if you release a function like this

to the public, or if you have a professor who’s grading your code and testing

to see if you’re checking return codes and stuff like that. That’s not a smart idea, right?

You shouldn’t be

just proceeding as if everything went according to plan.

You want to use an if statement, right?

You want to say, if the handle has some value that seems to be valid,

like, for example, more than zero in the case of opening a file.

I would say probably more than two, because all of our programs

always have automatically assigned file handles of zero and one and two

to represent standard input, standard output, and standard error.

But I think usually people just say,

if it’s greater than zero, then it’s fine.

You know, for me, I might put greater than two, but more than zero is fine.

The point is, check to make sure that it actually succeeded.

Do I have spell check on this thing? Oh god, you’re all going to see my true spelling. Let me see if I

can get it on real fast. Plug-ins... what, how come this... oh, okay, now it’s highlighted.

If we’ve succeeded based on some kind of a comparison of the return result,

then we’ll proceed in one way.

Otherwise we’ll respond to the error by, you know, doing something else.

Somehow, like writing a log file, sending an email, complaining to the user,

doing any number of things where you can actually respond to the error.

Maybe you probably want to change your execution path.

Like if the file successfully opened, then go ahead and start writing to it

or reading from it or whatever.

But if the file did not successfully open,

you want to do something else in the program and not just start trying to read from the file. So anyway,

super, super good idea, and that’s why I’m going to implement that in assembly too. Instead

of just immediately using the file handle, I’m going to check it. I’m going to say, let’s compare

it to the number zero. If we succeeded, then let’s go to another label called read success.

Personally, when I’m doing branching logic, I like to say open file read is the name of the function,

and I’ll just append some kind of a suffix

to the original function name. That way it’s easier to avoid collisions when you have lots of

functions and lots of labels and things like that. So I’m basically saying, if we succeeded, I’m going

to jump to this label, which is down here. This video also is not about branching logic

and how to implement those instructions. You can probably infer

it by looking at my code, but see another video in the future for that topic. Anyway, if we

succeeded in opening the file handle, then we’re just going to say, oh, we were successful, and then

we’re going to print the name of the file that we just successfully

opened, and then we’re going to send the file handle into RAX so that this function has a return value.

So when you open a file to read successfully, the caller will receive the file handle.

You might be wondering yourself, wait a minute, wait a minute, wait a minute.

It just gave us the file handle in RAX.

Remember, we got to respect the ABI.

Anytime we jump anywhere or call another function or call another syscall,

which a lot of these things do,

we will probably lose the value of registers that are not callee preserved.

And that’s definitely RAX.

I mean, just doing any system call, like this print null terminated string function does,

that’s going to destroy RAX. So that’s why I saved it away

first in R14. Then at the very end of the function, right before I return, I’m just going to grab R14

and send it into RAX. Again, respect the ABI. Do not return data as a return value in R14 or any other

register; it goes in RAX. Anyway, that’s the gist of that. Let me go back up for a second. So notice how

we’re sort of comparing... oh gosh, I just reconfigured my annotator and I bought a

new drawpad. I wonder if this is going to work. Ah, it works. There’s a bunch of stuff I added too.

So notice how we try to open the file and then we compare the file handle to see if we succeeded, and if so we jump

off to the success area down here. Oh, my green’s not working. Oh, it’s tricky, it’s tricky. I gotta

hit it in a certain way. There we go. If it succeeds, we go down here to the success label, so

at that point we totally ignore all of the fail code, right? We’re just branching

around it. On the other hand, if it fails, then execution falls through, because it’s not going to jump

if R14 is not greater than or equal to zero. So if we fail, maybe I should put this in

red, if we fail, then it’s going to fall through to the next label and then the next instructions.

Some people like to say, let’s jump to the success label, and then if not, let’s jump to

the fail label. That can buy you a little bit more, I don’t know, jump length if you have a giant

program, but in this case I’m just going to let it fall through. It saves us one jump instruction.

And so if it fails, then it’s going to say, first off, let’s print a message that we failed

to standard error. If you want a refresher on standard input, standard output, and standard error,

see my other videos. And then it’s going to print the name of the file that failed. So that’s this

part right here. It’s going to say, hey, we failed to print, or sorry, we failed to open this file name

for reading, and it’s going to print a new line there with the CRLF thing.

And then it’s actually going to exit the program at that point. It’s going to say, all right, we

failed, and so the whole program just quits. That’s my die function that I showed you

earlier. It’s just going to call the system call code for quitting, and it’s going to give a return

code of one to indicate to anyone automating our program, including GNU make, that, well, our program

failed. Okay, so we’ve got all that, and then we’ve got the success, and so now you kind

of know the idea behind opening a file. Let’s do that real fast, just for fun. So file tests.

Oh, I know what to do. I’ll just comment out these instructions so that we’re only

doing the file open. Yeah, that’s pretty good. And then we won’t close anything just yet.
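Putting the pieces of the open file read discussion together, a sketch of the function could look like this. The flow (stash arguments in R12 and R13, syscall, stash the result in R14, compare against zero, fall through to the failure path, return the handle in RAX) follows the walkthrough. The label and message names are assumptions, and `crlf` is assumed here to take the descriptor in RDI, which the video does not confirm.

```nasm
; Sketch only: names, message text, and the crlf signature are hypothetical.
SYS_OPEN    equ 2
FD_STDOUT   equ 1
FD_STDERR   equ 2

section .data
MSG_OPEN_OK     db  "Successfully opened file: ", 0
MSG_OPEN_FAIL   db  "Failed to open file: ", 0

section .text
extern print_null_terminated_string     ; (char *s, long fd)
extern crlf                             ; assumed: void crlf(long fd)
extern die                              ; prints a message, exits with code 1

; long open_file_read(char *file_name, long flags);
; r12 = file name pointer, r13 = flags, r14 = file handle
open_file_read:
    push    r12
    push    r13
    push    r14

    mov     r12, rdi                    ; grab the incoming arguments
    mov     r13, rsi

    mov     rax, SYS_OPEN               ; call code for open
    mov     rdi, r12                    ; null-terminated file name
    mov     rsi, r13                    ; file status flags
    syscall
    mov     r14, rax                    ; stash the result before anything clobbers rax

    cmp     r14, 0
    jge     open_file_read_success      ; zero or above: treat as a valid handle

open_file_read_fail:                    ; execution falls through to here on failure
    mov     rdi, MSG_OPEN_FAIL
    mov     rsi, FD_STDERR
    call    print_null_terminated_string
    mov     rdi, r12                    ; name of the file that failed
    mov     rsi, FD_STDERR
    call    print_null_terminated_string
    mov     rdi, FD_STDERR
    call    crlf
    call    die                         ; never returns

open_file_read_success:
    mov     rdi, MSG_OPEN_OK
    mov     rsi, FD_STDOUT
    call    print_null_terminated_string
    mov     rdi, r12                    ; name of the file that opened
    mov     rsi, FD_STDOUT
    call    print_null_terminated_string
    mov     rdi, FD_STDOUT
    call    crlf

    mov     rax, r14                    ; return value goes back in rax

    pop     r14
    pop     r13
    pop     r12
    ret
```

Because the raw Linux system call reports failure as a negative number in RAX, the signed `jge` against zero separates the two cases.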

So now let me run this in the terminal real fast. Make run.

It says it successfully opened the file input.txt

and then the program exited.

No problem, that error code one,

that’s because the output file doesn’t actually exist.

So don’t worry about that.

And let’s change the name of the input file

just to show you that it can fail and we can detect it.

I’m gonna put a two there

so that the program will try to open an input file

that doesn’t actually exist.

And then I’ll run it again.

Notice how it says fail to open file input2.txt.

And then it says terminating program after failure to open file.

And then this time, notice how the make file never got far enough

so that it tried to print the MD5 sums of the input and output files.

The assembly program just failed.

And so GNU make said, I’m not going to proceed any further.

Kind of useful when your program gives good exit codes, right?

Because then other programs know when to stop

or maybe what to do depending on what’s happening.

So I’m just going to fix this real fast.

Down here, maybe the next thing I want to add is closing the files. So we’ve got a function to

open a file for reading, and then we have a function down here that is not implemented yet

for closing the files. So I’m going to do, let’s see, what is R12? R12 is the input file, so I’m

going to close R12. So basically this function is going to take one argument, a file

handle. So that’s why I’m giving the file handle of the input file as the first argument. And then

I’m just going to call it. So let’s copy paste close file underneath this. I’m going to do

some more space. And let me go get this from my solution. Close file.

And it’s going to be a pretty simple one. Really nothing much to it. I’m just going to write my

C++ prototype saying, well, it takes in one argument and it’s a handle. It doesn’t return

anything. It attempts to close a

file, and I’m just gonna use R12 to hold the incoming argument. Your programs at

home should probably be a little bit better than mine. You should check to see

if the file successfully closed or not and respond in some kind of a way. But

for me, I’m just saying I don’t really care. I already showed you that we can

check for a return value, so I’m allowed to be lazy now and just sort of try to

close it and then just assume it all went according to plan.

So, grabbing the incoming arguments here.

That’s why I’m using R12.

That’s why I’m doing a push-pop pair.

And then the system call code to close is pretty easy.

You just say, here’s the code to close.

Stick that in RAX.

Let me go up real fast.

Notice how sysclose is the call code number three

on line 43 there.

So I send it the call code three to say, let’s close a file.

It only wants one argument, which is just,

what is the file handle that you want me to close?

Remember, before, the operating system created a

bunch of stuff under the hood and gave you a file handle. You can then use the file handle

to close a file, read a file, write to the file, whatever. So it’s pretty easy. Once

I’ve set up the incoming arguments to the system, then I actually use the system call instruction,

syscall, and then I can assume it’s probably closed at that point. Then a return statement at the end.
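The close function described here is short enough to sketch in full. The names are assumptions; the structure (one argument held in R12, call code 3, no check of the result) follows the description.

```nasm
; Sketch only: names are hypothetical.
SYS_CLOSE   equ 3

section .text

; void close_file(long file_handle);
; r12 = file handle to close
close_file:
    push    r12                         ; preserve the callee-saved register

    mov     r12, rdi                    ; grab the incoming argument

    mov     rax, SYS_CLOSE              ; call code for close
    mov     rdi, r12                    ; the handle to close
    syscall                             ; result in rax is ignored in this sketch

    pop     r12
    ret
```

A more careful version would compare RAX against zero after the `syscall` and react to a failure, as suggested above.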

If I run it now, we shouldn’t see any additional thing happening.

It should just be, it said it successfully opened the file

and then it just exited.

Okay, now we’re ready to add a little bit more.

Let’s create a file to write.

So this is going to be the same thing basically

as opening a file to read,

except it’s going to be a different call code

and we’ll give it initial file permissions

rather than a read mode flag.

But then we’ll just get a file handle in return.

And I’ve also stuck this into another function, of course.

So the file name to write line 109.

If you just scroll up real fast here, or if I scroll up real fast,

it’s on line 27 here, just output.txt.

And then the file permissions that we want to use.

I’m going to scroll up here.

That’s the second argument to the system call code.

That’s just the stuff that I talked about a little while ago where it’s like,

we want the user to be able to read and write to the file.

We want people in the same group to be able to read only,

and we want everyone else to not be able to do anything. Basic Linux permissions, not a big deal.

You should probably note that this q at the end, I think I might have said this

before, marks the number as octal, and these zeros are always going to be the same, so really it’s just

three numbers representing file permissions. I’ll go over that in more detail in some other video.

So for now, we know that we’re going to open a file

for writing. It doesn’t have to exist yet.

We’ll give it some default permissions.

And those are the incoming arguments to the function that we’re going to call now, called

create file.

When that function comes back, assuming it didn’t decide to exit the program

because we failed to open the file, we will receive the file handle in RAX, per usual.

And so I’m just going to stash that away real fast into R13.

Remember R12 has the handle to the input file.

R13 has the handle to the output file.

And then, you know, for me personally,

I put that in comments to help myself remember

what I’m even doing,

because things can get confusing really, really fast.

And while we’re at it,

before we even write the create file function,

I might as well just uncomment these things at the bottom,

just to say, let’s close both files properly.

You always want to do that.

Let’s copy paste the create file function. Let’s see, I didn’t do that already, right? Yeah, okay.

So let’s do that, maybe right here. I’m going to go grab it from my solution real fast. Create

file. Okay, it’s about as complicated as opening a file, just because I put in some checking logic to

see if it successfully created the file, which is a good idea. So here’s my function, create file. It takes two

arguments. Oh shoot, file creation hand. Let’s see, am I using R13? Yeah. Oh, I mislabeled that.

Instead of saying flags, it should be permissions. So let me just do perms here. Long perms,

file creation perms, perm. I wish I could get a perm, if you know what I mean. Anyway, so,

you know, we just have like the file name that comes in and the permissions that come in,

And it’s going to return a handle just like we did with the read file for opening function.

But this time we’re going to do something slightly different.

So the system call is going to be the code for creating a file: SYS create, which is not real unless you define it. So if we just look up, I define that as... where is it? Create, 85. So right here, code 85 for system create. Again, it’s not a smart idea to hardcode numbers. Defines are way better.

So then the first argument that it wants is the name of the file. That’s the incoming argument that I took into R12. It has to be a null-terminated string. It’s going to be output.txt. Then the second argument is the file’s permissions. So that’s R13, which I took here from the second argument, long perms.

Once I’ve set up those things, I can do the system call right away. The system, again, will try to create the file and set up some data structures under the hood. If it succeeds, it’ll give me a valid file handle in RAX, the return register. If not, then things have failed and I need to respond to that. So I’m just going to stash the file handle in R14 right away. That’s why I’m also preserving R14 in the push/pop pair that I have up at the top and the bottom.

So now we’ve got to check whether or not we successfully created the file. Again, the logic is just like opening a file for reading. I’m going to compare R14 to zero. If it’s greater than or equal to zero, I’m going to assume it’s a valid file handle, so jump if greater than or equal to that label, create file success.

I’m still using my appending naming convention: the name of the function is create file, so the success area is create file with underscore success appended. And then I’m just going to print a cute message saying, hey, we successfully created the file. Yay. Same thing that we looked at before, basically.

And then I’m going to return the file handle in RAX.

If we fail, same thing that we did before when we were opening the read file: I’m just going to complain to the user and then call on my little die function to properly exit the assembly program with an exit code of one.

I could enhance that to exit with different codes. Maybe exit code one means the read file didn’t work, exit code two means the write file didn’t work, and exit code three means we failed for some reason while we were copying the data. Whatever you want.

I’m keeping it kind of simple.
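The create-file steps described above can be sketched roughly as follows. This is not the video’s source: the symbol names (`SYS_CREATE`, `EXIT_FAIL_CREATE`, the `create_file_*` labels) are assumptions, and the failure path exits directly with a distinct code instead of printing a message and calling a die helper.

```nasm
; Sketch only: names and the exit code value are assumptions.
SYS_CREATE          equ 85          ; Linux x86-64 call code for creat
SYS_EXIT            equ 60          ; Linux x86-64 call code for exit
EXIT_FAIL_CREATE    equ 2           ; hypothetical "could not create output" code

section .text

; long create_file(char *name, long perms)
;   rdi = null-terminated file name, rsi = permission bits
;   returns the new file handle in rax, or exits the program on failure
create_file:
    push    r12                     ; preserve the callee-saved registers we use
    push    r13
    push    r14

    mov     r12, rdi                ; r12 = file name
    mov     r13, rsi                ; r13 = permissions

    mov     rax, SYS_CREATE         ; call code
    mov     rdi, r12                ; arg 1: file name
    mov     rsi, r13                ; arg 2: permissions
    syscall
    mov     r14, rax                ; stash the handle (or negative error code)

    cmp     r14, 0
    jge     create_file_success     ; zero or positive means a usable handle

create_file_fail:
    mov     rax, SYS_EXIT           ; negative result: give up with a distinct code
    mov     rdi, EXIT_FAIL_CREATE
    syscall                         ; does not return

create_file_success:
    mov     rax, r14                ; return the handle
    pop     r14                     ; restore in reverse order
    pop     r13
    pop     r12
    ret
```

Using a different exit code per failure site lets a script that runs the program tell the failures apart by checking `$?`.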

So we got all that.

Again, this is not much different than reading. Pretty much just the call codes and the arguments change, but the idea is the same.

So now we’re ready to uncomment. I think we actually already did that. We’re ready to let the program try to create the file, and the only thing we need to add after that is the copying portion and the successful message. So let’s see if this works. Do clear and make run.

So you can see that the program starts running here, and then it says, okay, the module started, and then it says we successfully opened the read file, and then we successfully created the output file. Notice how the makefile doesn’t fail when it tries to compute the MD5 checksum of the output file, because now it actually exists. If we list the directory here, you’ll see that the out file has actually been created. It just has a length of zero because there’s nothing inside of it.

Notice also that the permissions match what we intended: the owning user can read and write, the group can only read, and everybody else can do nothing. To heck with you.

We could change that real fast. Just to show you, I can get this r to turn into an rw just by modifying the permissions up here. Giving permissions to a group is a nice way of allowing multiple users to have a shared file location: add them all to the same group, then set that group onto the file, and then set the group permissions to allow people to do whatever you want them to do, read, or read and write, or whatever. So let’s see, where are the perms? I’m going to change this.

So now the group people should be able to read and write.

I can’t remember if this will work, because the file already exists. Let me give it a try.

Okay. I ran it one more time and it looks like it did not create the file because it already existed. So let me just remove output.txt.

There should be another system call code you could use to just check whether a file exists. Or, if you wanted to be kind of hacky, you could just try to open the file, see if it succeeded, and then close it right away. I wonder if there is a call code for just exists only.

I don’t remember off the top of my head.
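For reference, Linux does have a call for this: `access` (call code 21 on x86-64), which with the `F_OK` mode (value 0) only tests whether the path exists. It is also worth knowing that the permissions passed when creating a file only apply if the file is newly created; an existing file is truncated and keeps its old mode, which matches what was observed here. A minimal sketch, with `file_exists` as a hypothetical helper name:

```nasm
; Sketch only: file_exists is a hypothetical helper, not part of the video's program.
SYS_ACCESS  equ 21                  ; Linux x86-64 call code for access
F_OK        equ 0                   ; mode: test for existence only

section .text

; long file_exists(char *name)
;   rdi = null-terminated path
;   returns 1 in rax if the path exists, 0 otherwise
;   (any error, including a permission problem on a parent directory,
;    is reported as "does not exist")
file_exists:
    mov     rax, SYS_ACCESS
    mov     rsi, F_OK               ; rdi already holds the path
    syscall                         ; rax = 0 on success, negative error otherwise
    cmp     rax, 0
    sete    al                      ; al = 1 only when the call returned 0
    movzx   eax, al                 ; zero-extend into the full return register
    ret
```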

So I’m going to remove it and then run the program again. And then we should see now... yeah, now that the file didn’t exist and was newly created, those new permissions that we added are reflected. So anybody who’s in the group can read and write. And obviously the group is just the same as my user by default. But again, you could be more complicated than that if you were running a multi-user system and you wanted to share folders or whatever.

Okay, so I’m going to remove output real fast, and I’m just going to revert the permissions to 640 so that the group really can’t do very much except just read it.
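The two permission settings tried above could be written as defines along these lines. The symbol names are made up for illustration. In NASM-style syntax, which YASM accepts, a trailing `q` on a numeric literal marks it as octal, so each of the three digits maps to one of owner, group, and other.

```nasm
; Sketch only: symbol names are assumptions.
; Each octal digit is read (4) + write (2) + execute (1),
; in the order owner, group, other.
PERMS_GROUP_READ_ONLY   equ 640q    ; rw- r-- ---  owner read/write, group read
PERMS_GROUP_READ_WRITE  equ 660q    ; rw- rw- ---  owner and group read/write

; Passed as the second argument when creating the file, for example:
;   mov     rsi, PERMS_GROUP_READ_ONLY
```

The kernel also applies the process umask to the requested mode, so on a system with a stricter umask the file can end up with fewer permissions than requested.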

And then if I... oh shoot. ls. Okay, yeah. So now you can see it reverted.

Okay, so we got the out file. I’m starting to get lost here. What am I doing? I’m supposed to copy the data, I think. So, file tests, and we are copying the file. I think I just uncommented that. Really? Oh, no, no, no. I’m looking at my solution repo. Okay. That’s what I’m doing wrong. I was like, what? How did that program run if I already uncommented the copy file call? And we’ll uncomment this message for now... no, let’s leave it until we actually finish everything.

So now we’ve got to do the copy file function. So where’s that? Okay, I’m going to copy-paste this whole thing. You know what, this would have been like a five-hour video if I actually had to type this by hand. I can’t even remember how long it took me to write this program off the top of my head. So this would be a nightmare to type on video, even if I kind of already know how to do it now.

So let’s see: string length, print null-terminated string. This is close file. This is create file. Okay, so copy file is going to be right before close file and right after create file. Okay, so copy file.

Notice the signature that I’ve chosen for this one. It doesn’t return anything. So that’s not great design. I need to check whether the copying operations inside of my loop actually succeeded, and maybe return something to indicate success or failure, or at least exit the program. But I’m not doing that right now. I’m just keeping it a little bit more simple.

It takes two arguments. The first is the input handle and the second is the output handle. Conveniently, we have both of those at this point.

And then here’s my register usage. I’m using R12 through R15: R12 and R13 grab the incoming arguments, and then R14 grabs the beginning of the temporary buffer, which I’m going to make on the stack, because I think I’m so cool, instead of making it a global in the BSS section. And then R15 is going to hold the result from the copy operation, or I guess the write operation. Sorry, the read operation only. Oh, I think I’m only checking the read bytes instead of the read and the write bytes. I’ll talk about that in a second. So I could upgrade my program a little bit more if I wanted to.

But basically R15 is going to be my temporary variable that looks at the return value to say, hey, did we read anything? How much did we read?

So, copy file. This is not a video about making local variables, so just trust me on this. I preserve the base pointer so that I can use it to save the location of the stack pointer.

Then I’m going to make a copy buffer on the stack just by subtracting from the stack pointer, because the stack grows downward in memory addresses. The base pointer is going to help me remember where the stack was when I started.

The new stack pointer I’m just going to treat as the first byte in my buffer, and I’m going to move that into R14.
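A minimal sketch of that stack-buffer setup, shown as a standalone routine. The names `COPY_BUFFER_LEN` and `stack_buffer_demo` are assumptions, and the two-byte length just mirrors the small test size mentioned in the video.

```nasm
; Sketch only: names are assumptions, not the video's source.
COPY_BUFFER_LEN equ 2               ; tiny on purpose, to exercise the loop

section .text

stack_buffer_demo:
    push    rbp                     ; preserve the caller's base pointer
    push    r14                     ; preserve r14, which will hold the buffer address
    mov     rbp, rsp                ; remember where the stack was before the buffer

    sub     rsp, COPY_BUFFER_LEN    ; the stack grows downward, so this reserves space
    mov     r14, rsp                ; lowest address of the new space = first buffer byte

    ; The buffer now occupies addresses r14 .. r14 + COPY_BUFFER_LEN - 1.
    ; Filling it moves upward in memory, back toward rbp, never past it.

    mov     rsp, rbp                ; release the buffer in one step
    pop     r14
    pop     rbp
    ret
```

A length that is not a multiple of 16 leaves the stack pointer misaligned. That is harmless for raw system calls, but it would matter if this routine called other functions that expect the usual 16-byte alignment.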

And let’s see, what did I do? Think about the stack. Oh, did I write a good program or a bad program? I feel like I should have actually saved the base pointer and not the stack pointer there. Let me see if this runs. If it does run, I probably have a naughty program that might self-corrupt sometimes, and it might be a good idea to use the base pointer as the first byte.

This is not a stack video, but basically the stack grows downward. That means you’re sort of extending its reach; you’re allocating free space. If I take where the stack...

Nope, nope, nope, I got it right. I’m sorry. If I take where the stack ends up after I extend it, then I have a lower address, right? Because it grows downward in memory. If I then say that the new tip, not where it grew from but where it grew to, is the first byte in a buffer, then when you’re actually using the buffer, you increase memory addresses as you fill it up. So that’s going to grow back towards where the stack started. So it should be fine.

If I did go ahead and reverse this, if I used RBP instead of RSP there, then I think I would be going in the wrong direction and just corrupting memory. So, just so you know.

This is not a stack video, but just so you know. So here’s a little label. Again, I’m just kind of using an iteration label, and at the bottom I’m just jumping back up to that label. So this is a loop. You can imagine this is a while loop. Maybe I should draw that for fun. I’ve got to find more excuses to use my drawing pad.

So this is kind of like a while loop. I’ll say while true, maybe, and here’s the body, and then here’s maybe the end of the body. And so this jump statement just kind of goes back up to the top. This thing’s going to frustrate me.

So the first thing that happens is we read a portion of the input file into the buffer. So what are we doing? We’re just using another system call. We’re saying, let’s use the call code for read. I can’t remember what that was. It might have been... whoops, oh, I cleared the dang drawing. Oh, hang on, I have undo, I think. Okay, I’m not going to mess with it, because this will actually terminate my whole program. One of these keys, I forgot which one, will just kill the whole annotator.

So if we go up to SYS read in the defines, you can see that it is just call code zero.

So that means I’m telling the system I would like to read something.

The first argument it wants is a file descriptor for the input file. In a different video, I showed you how to use exactly this sort of thing to read standard input from the user or from another program that launched your program. But in this case R12 is not going to be file handle zero for standard input. It’s going to be the actual file handle of the file that you’re trying to read from, whatever the OS gave you.

And then here’s the address of where to store our characters. If you look up again, we decided to remember where the first byte in our buffer on the stack was. It’s like a local array. Oh shoot, let’s do it again, let’s do more drawings. You have a function in C, and it’s like we declared a local variable, maybe not an int, maybe a character array. We’ll call it b for buffer, and then we just gave it, let’s say, eight bytes or something for the buffer. I think I still have it set to two bytes. Obviously you want to use more bytes for efficiency, but just to prove that the loop actually works, I’m keeping it small, like I said before. Whoa, what’s all that? You see that it’s smearing? Okay, that’s not good.

So anyway, we remembered where the buffer starts, and it’s going to be R14. And then the copy buffer length is being used here. So we want to read from the input file, we want to read into our temporary buffer, and we want to read at most this many characters. And then we say system call, and it does all the work for us to read that many characters.

And then we want to remember how many bytes were actually read, because that could be different from the number of bytes we requested. Maybe we’re at the end of the file. Maybe the system is having some kind of a buffer issue or something. So we just want to remember how many bytes were read, and I put this little token up there to remind myself the temporary bytes read should go there, in R15.

Notice also that when we created the stack buffer, and this is not a stack video again, I made it equal to the length of the buffer that I actually wanted to use. So I use the same symbol in both places: copy buffer length and copy buffer length.

So then it’s going to try to figure out, okay, how many bytes did we actually read? Let me do that while loop thing again, because I’m feeling it. Did I really have to write the word true, especially with my bad penmanship?

So we’re doing while true, and then I’ll just say read, you know, do some kind of a call to read. Maybe we’ll say n equals read, like a long. The R15 register is the number of bytes that we actually read. And then we do a comparison here: if n is equal to zero, then we’re going to jump to the position where we’re done, which is actually past the while loop. If you scroll down you’ll see it. So I’m just going to say break. Basically, if we read zero bytes, then we’re done reading bytes, we’re just finished, so we break the while loop. And that’s what’s going to happen here when we jump past the end of the body. You’ll see that in a second.

Otherwise, if we’re not done, execution is not going to jump. It’s going to just fall through to the next statement here, and that’s going to be another system call, this time to write to the output file. It’s going to be very similar. We just load it up with the system call code for write, which, let me just double check, is code one. And then we’re going to give it the target for output, which is going to be a file handle. So R13 right here, that’s a file handle, that’s where we want to write.

And then the next argument that it wants is the buffer, and that’s going to be R14, which is where we just... what’s going on? Green, green, there we go. I’m having issues. That’s the buffer that we just read into, right? So if you look here, R14... green, oh my gosh. Okay, so, green: we read into the buffer pointed to by R14, and then we are using that same buffer pointed to by R14 in order to grab the bytes to write.

R15 says how many bytes... green, I’m having such problems. R15 says how many bytes you want to write to the file. That should be our return value from before, right? Because we want to read a certain number of bytes from the file and then write exactly that many bytes. If we wrote more, we would probably be writing some junk data. And if we wrote less, then we would be missing data in the output file that was originally in the input file.

So you want to do it exactly.

And of course I’m just doing a system call here, but like I said before, your program should probably be a little bit smarter and check RAX after you write to the file, just to make sure that you know how many bytes were actually written. Like, what if you read a hundred bytes but then you wrote only 90? That would be a bad situation. So you’d want to do some branching logic there so that if you read 100 and you wrote 90, you backtrack the position of the input file by 10, just to make sure that you can actually get all of the bytes into the output file. Seeking backwards 10 is just another system call. It’s not in this program, but it’s not too bad. You just make another system call, give it the right call code, tell it how far back you want to go. No problem.

So then by the time we make it here, we’re going to jump back up to the iterate label. Let’s see, where’s that? Yeah, right there. So basically this is how the while loop continues looping. So let me clear that real fast.

Eventually we’re out of data to read: if we try to read and we end up with zero bytes, that means we’re done.

We’ll break the while loop.
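Putting the pieces together, the copy loop described above might look roughly like this. It is a sketch under assumptions, not the video’s source: the symbol and label names are guesses based on the spoken description, and it uses `jle` rather than a strict equal-to-zero test so that a negative error code from read also ends the loop. Like the version in the video, it ignores the return value of write.

```nasm
; Sketch only: names are assumptions; jle is a deliberate deviation (see above).
SYS_READ        equ 0               ; Linux x86-64 call code for read
SYS_WRITE       equ 1               ; Linux x86-64 call code for write
COPY_BUFFER_LEN equ 2               ; tiny on purpose, to exercise the loop

section .text

; void copy_file(long input_handle, long output_handle)
;   r12 = input handle, r13 = output handle,
;   r14 = first byte of the stack buffer, r15 = bytes read this pass
copy_file:
    push    rbp
    push    r12
    push    r13
    push    r14
    push    r15
    mov     rbp, rsp                ; remember the stack position

    mov     r12, rdi                ; grab the incoming arguments
    mov     r13, rsi
    sub     rsp, COPY_BUFFER_LEN    ; reserve the buffer on the stack
    mov     r14, rsp                ; its lowest address is the first byte

copy_file_iterate:                  ; while (true) {
    mov     rax, SYS_READ
    mov     rdi, r12                ;   read from the input handle
    mov     rsi, r14                ;   into the buffer
    mov     rdx, COPY_BUFFER_LEN    ;   at most this many bytes
    syscall
    mov     r15, rax                ;   n = bytes actually read

    cmp     r15, 0
    jle     copy_file_done          ;   if (n <= 0) break;

    mov     rax, SYS_WRITE
    mov     rdi, r13                ;   write to the output handle
    mov     rsi, r14                ;   from the same buffer
    mov     rdx, r15                ;   exactly n bytes
    syscall                         ;   result ignored here, as in the video

    jmp     copy_file_iterate       ; }

copy_file_done:
    mov     rsp, rbp                ; release the buffer
    pop     r15
    pop     r14
    pop     r13
    pop     r12
    pop     rbp
    ret
```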

Then we’re going to go down here. Notice how it says copy file done. That’s what I was talking about before, just sort of jumping past the end of the body. So if we are done, we jump to copy file done, and all it does is restore a bunch of registers for us and then return to the caller. So copy file, we have that handled now.

If I go back up here to file tests, I guess I can, and I feel bad about this, uncomment the success message. But we probably should have done more checking on the reads and the writes, just to make sure that we actually wrote the correct number of characters and everything succeeded every time, and if not, exit the program, and only print a successful message if everything went well.

At this point, the way this program is written, it could totally fail and it’ll still say that it was successful. So just keep that in mind. That’s bad for your users.
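One way to add the missing write check is sketched below. Instead of seeking the input file backwards after a short write, this alternative keeps retrying the unwritten tail of the buffer until everything is out, which avoids the extra seek call. The helper name `write_all` and its return convention are assumptions made for this example.

```nasm
; Sketch only: write_all is a hypothetical helper, not part of the video's program.
SYS_WRITE   equ 1                   ; Linux x86-64 call code for write

section .text

; long write_all(long handle, char *buffer, long count)
;   rdi = output handle, rsi = first byte to write, rdx = number of bytes
;   returns 0 in rax if every byte was written, -1 otherwise
write_all:
    push    r12
    push    r13
    push    r14
    mov     r12, rdi                ; r12 = output handle
    mov     r13, rsi                ; r13 = next unwritten byte
    mov     r14, rdx                ; r14 = bytes still to write

write_all_iterate:
    cmp     r14, 0
    jle     write_all_success       ; nothing left: finished

    mov     rax, SYS_WRITE
    mov     rdi, r12
    mov     rsi, r13
    mov     rdx, r14
    syscall                         ; rax = bytes written, or a negative error

    cmp     rax, 0
    jle     write_all_fail          ; error, or no progress at all

    add     r13, rax                ; skip past what was written
    sub     r14, rax                ; and shrink the remaining count
    jmp     write_all_iterate       ; retry the tail after a short write

write_all_fail:
    mov     rax, -1
    jmp     write_all_done

write_all_success:
    xor     eax, eax                ; return 0

write_all_done:
    pop     r14
    pop     r13
    pop     r12
    ret
```

A copy loop could call this in place of the bare write system call and exit with its own error code when the result is nonzero, so that the success message only prints after every write has gone through.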

Okay, I think we have everything that we need. Now I can probably just run the program. Notice now... oh shoot, I didn’t even show this to you before. Let me just emphasize this one more time real fast. I’m going to comment out the part where we actually copy the file, and I’m going to remove the output file. Then I’m going to run the whole program again. Notice how the two MD5 sums are different.

Remember, before I said that MD5 sums are basically fingerprints. They’re not actually considered secure in the modern era. I just use them for fun. If you are interested in security, you probably want to use a modern hashing algorithm. So don’t use MD5. But I am. But don’t. But I am.

And it’s basically saying the fingerprint here of the input is different than the fingerprint here of the output, quickly indicating to me that the files are different. If I list the contents of the directory, obviously that’s true, because the output file is still empty. But if I uncomment this part here where I’m actually copying the file, then I should see that the fingerprints match, and if I look at what’s inside of the output file, it should match what’s inside of the input file.

Really fast, let me open up a terminal and I’ll just cat the input file. So this is all I added, you know: why hello there, added some stuff. This is definitely more than two characters, so we can be sure that our buffer loop is actually working. And so I’m going to do clear and make run.

And now the output file, notice how it has a matching file size.

It’s got 38 bytes in there.

Same thing for the input.

And then the thumbprints, or the signatures, the hashes, whatever you want to call them, match exactly, indicating that we probably have two identical files. Even with MD5, even though it’s old and not considered secure, the chance of two files having random differences, not hacked differences but just random differences, and still having the same fingerprint is astronomically small.

So if I do cat input.txt, that’s what’s in there, like we showed before. And if I cat the output, now it is the same thing: why hello there, added some stuff, for both. I could make this as big as I wanted to, just for fun. Let’s do a nano on the input file.

I’ll just add, I don’t even know what I’m doing, I’m just going to type a bunch of stuff. Oh wait, what is this? Remember that thing that people got taught a long time ago? It was like, the quick brown fox jumped over the lazy dog, and this was supposed to have all the letters of the alphabet.

There’s another one that I just heard about and I don’t remember it exactly. Look it up on the internet, it’s pretty cool. I think I need to memorize this and stop using the lazy dog. It was something like, sphinx of quartz, hear my vow, or something like that. You know what, I’m going to look it up for you right now. I don’t want to do the wrong thing. It’s really cool. I’ll type brown fox and then, uh, hear my vow.

Oh, sphinx of black quartz, judge my vow. That’s what it is.

Okay, so I’m going to go back to my little VM here. Sphinx of black quartz, judge my... I didn’t even write vow correctly. Judge my vow.

And I think that has all the letters of the alphabet too, probably with fewer spaces. I mean, that’s what the internet says. If this is true, and it has all the letters of the alphabet, that’s going to be awesome. I’m going to memorize that for sure. Sphinx of black quartz, judge my vow.

Anyway, so I’m just kind of adding stuff into this file. Let me save that here. I’ll do a clear, and then I’ll cat the input file here, and then cat the output file. So you can see they’re different. Again, let me remove the output file just in case. I can’t remember if I’m supposed to remove it manually or if I put that into the program. We’ll just try it like this. So now they match. And then if I cat the output file again, notice how it’s a perfect copy of the input file. Nice.

So I think that’s pretty much everything that I wanted to show you. Well, maybe we can use a more efficient buffer now that it’s done. The copy buffer, we could change this to eight kilobytes or something. We should end up with the same result. Let me run this just as is and see if it ends up being the same thing without erasing the file first. Yeah, it looks good. Let me remove the output file and then run it one more time. So make run, and then cat the output file. Yeah, okay, so it still works.

Before, when we were just using a two-byte buffer, there was very little chance, except maybe at the end of the file, that we would request more data than the file had, and the return value of the read operation always told us exactly how much was read by the operating system. On the other hand, if we have a giant buffer, we could request way more bytes than the file could ever have, because that file is way less than eight kilobytes. So again, we still want to look at the return value to make sure we know how many bytes should be sent into the write file.

So I guess that’s everything that I wanted to tell you

about reading and writing files using system calls.

I hope you’ve learned a lot of stuff

and you enjoyed this video and had a little bit of fun.

Thank you so much for watching.

I’m going to cut the video.

I’ll see you, whoops.

I’ll see you in the next video.

Hey everybody.

Thanks for watching this video again

from the bottom of my heart.

I really appreciate it.

I do hope you did learn something and have some fun.

If you could do me a small little favor, could you please subscribe and follow this channel or these videos, or whatever it is you do on the current social media website that you’re looking at right now? It would really mean the world to me, and it’ll help make more videos and grow this community. So we’ll be able to do more videos, longer videos, better videos, or just, I’ll be able to keep making videos in general. So please do me a kindness and subscribe.

You know, sometimes I’m sleeping in the middle of the night and I just wake up because I know somebody subscribed or followed. It just wakes me up and I get filled with joy. That’s exactly what happens every single time. So you could do it as a nice favor to me, or you could troll me if you want to just wake me up in the middle of the night. Just subscribe and then I’ll just wake up. I promise that’s what will happen.

Also, if you look at the middle of the screen right now, you should see a QR code which you can scan in order to go to the website, which I think is also named somewhere at the bottom of this video. It’ll take you to my main website, where you can see all the videos I published and the services and tutorials and things that I offer, and all that good stuff.

If you have a suggestion for clarifications or errata or just future videos that you want to see, please leave a comment. Or if you just want to say, hey, what’s up, what’s going on, just send me a comment, whatever. I also wake up for those in the middle of the night. I wake up in a cold sweat and I’m like, it would really mean the world to me.

I would really appreciate it.

So again, thank you so much for watching this video

and enjoy the cool music as I fade into the darkness,

which is coming for us all.

Thank you.
