hello friends, today you will learn what actually happens when you type the command ls *.c
in your terminal.
Terminology
Terminal
Way back when, computers were huge multi-user systems owned by universities and corporations. The machine itself was located in a secure room that ordinary users didn’t visit. Instead, they interacted with it via a terminal. In the early days, the terminal would have been a printer (a teletype, hence TTY). Input was via punch cards and output via a printer. Later, a terminal was a display with a keyboard.
Today’s terminals are software representations of the old physical terminals, often running on a GUI. It provides an interface into which users can type commands and that can print text. When you SSH into your Linux server, the program that you run on your local computer and type commands into is a terminal.
Shell
The broadest definition of a shell is a program that runs other programs, but when you hear shell in the Linux world, it almost certainly refers to a command line shell — the program that creates and manages the command line interface into which users type commands.
For example, when you type “ls” into a terminal connected by SSH to your Linux server, you are asking the shell to run the “ls” program and to print out a list of files in the current directory to your terminal. The shell also provides a programming language of sorts, shell script, that can be used to tie together multiple commands.
The most common shell is Bash, the Bourne Again Shell, but there are several variants; Ubuntu uses the Dash shell, and some Linux users prefer the Fish or ZSH shells.
System call
A system call is a mechanism that provides the interface between a process and the operating system. It is a programmatic method in which a computer program requests a service from the kernel of the OS.
Typing ls
Split input
The first step is tokenization. The shell has to interpret the single string ‘ls -l’ which we then pass to a function that splits up the string into separate ‘tokens’. In this example, ‘ls -l’ is composed of two tokens, ‘ls’, and ‘-l.’ To do this, we use strtok()
, a function that parses strings by a specific delimiter, i.e. a space, or a colon. In this case, we split the tokens by using the space between ls and -l as a delimiter. This function replaces those found delimiters with a special character called null terminators which separates a single string into separate strings (tokens), allowing them to be easily pushed into a matrix, or an array of strings. So, in our example, the original string, ls -l
, now is a matrix pointing to the first string ‘ls’ and a second string ‘-l’, followed by NULL terminator signifying the end of the matrix.
Check for aliases
The shell will run through each alias it has on file and if the first argument matches the name of that alias, it replaces it with the alias. It will run through one more time to check for another alias, rinse, and repeat until there are no more aliases. aliases for BASH can be found in a file called bashrc
in the home directory.
Replace variables
This is pretty much the same process as alias replacement. It replaces variables it finds, such as $PATH, $?, *, etc, with the corresponding values. The * is unique in that it converts to any file. But, it can be inside of some other word, like m*.c, which would make it match only files that start with ‘m’ and end with ‘.c’
Replace Builtins
Builtins are command that are hard coded into the shell Once the arguments have been converted properly, the shell compares the first argument with a list of strings hard coded into the shell. If it matches any of these strings, it runs a function that corresponds with that string. This is how builtins are works.
execution
And now we get to the fun part: execution. To talk about executing files and running commands, we need to talk about this thing called the PATH. The path is a colon-separated list of directories inside of your computer. The path is just one string in a matrix of strings stored in a special environment variable The environment variable contains data and configuration information that could be useful to an application. One of the strings in this environment contains the PATH. The PATH is useful because it contains the directories which the shell should look in to find executables. In our example, ls
is actually just one of the many programs that is in the /bin/
directory. When the user inputs a command or program to execute, it will look to see if is in one of the PATH directories. Without the PATH, the user would have to type the absolute path to the program to execute, in this case /bin/ls
, in order to execute the ls
program. However, since the shell automatically checks the see of the program to execute is in one of the PATH directories, all the user needs to do is type ls
, and the shell will still be able to execute the proper program. So the next job of the shell is to search through the entire matrix of strings (the environment) until it finds the PATH. So, the shell finds a match! It finds the word ‘PATH’ within the environment and now it needs to attach the command we passed it, ‘ls’ to every file it finds in the path until it finds a match. There are a couple different ways to go about checking a file’s existence, and further, checking it’s accessibility and if it’s executable. No matter how the programmer goes about it, what ends up happening is ‘ls’ is attached to directories in the path until it finds the executable file it was looking for. For example, ‘/usr/bin/ls’ is an executable file so it would be the final string we pass on to the next function. This finally gets to the part we’ve been waiting for. The function runs the shell witll fork another child process and run a system call execve with the program path and it’s argument. why fork ?? imagine if the program exits do you really want your shell to exit too. The shell has done it’s primary job. So now what? Well, now that the ‘ls’ program has run, and the file’s contents are printed out onto the prompt using PS1 (), the shell resets itself and is ready to run again. How the shell works on a high level is that it’s constantly stuck in an infinite while loop, persay. That’s why we can continue to run command after command after command. It will continue waiting on commands unless an EOF, or a close signal is struck.
Conclusion
This is a 101 about how the shell works of course it’s still doing a lot of other things related to memory, access verifications, file descriptors and many more.