How Linux Works by Brian Ward¶
Author: Brian Ward
Timeline: 2020-10-25 to date.
I have been using Linux since 2007, but I have not dived into the depths of the topic. I would like in-depth Linux Kernel knowledge, so that I can both appreciate the system better and be a more powerful developer.
However, I would like objective goals:
Be able to analyse system performance.
Be able to compile kernels from source and boot them.
Be able to edit the grub bootloader config.
Be able to debug a running process using syslogs.
The fact that I do not yet know what I would like to do is proof that I lack definitive knowledge of the system.
Chapter 1 The Big Picture¶
The Linux Kernel provides libraries and interfaces with which programs can interact with devices. As the book details later, what the Kernel considers to be a device is quite interesting.
Linux Process Management involves the starting, pausing, resuming and termination of processes.
Linux processes share compute time by performing operations within allocated time slices. Each slice is extremely short by human standards, yet long enough for a process to do some significant computation.
Because the slices are so short, this time-sharing is perceived as multitasking. The act of the kernel handing the CPU from one process to another is called a context switch.
Questions I have on the topic:
Can kernel time slice lengths be configured?
How does a kernel record the status of a process before context-switching?
And when ready to context-switch, how does the kernel know which to pick next?
What is CPU user mode?
How does a process get interrupted?
What is kernel mode?
The Kernel runs in between processes during a context switch.
The Commandments for Memory Management:
The Kernel Keepeth its own private, inaccessible memory space.
Unto each process is its own section of memory.
Thou shalt not covet thy neighbour’s memory space.
If thou chooseth, some sections of thine memory may not be altered.
The System may useth the disks as though they were like unto memory, should it need to.
Device Drivers and Management¶
A device is typically accessible only in Kernel mode, so as to prevent improper access.
So can I think of kernel mode as what the CPU does in between time slices so that it can execute kernel commands? For example, when a process asks to access a disk, does the kernel have to respond during the context switch?
System Calls and Support¶
System calls (syscalls) perform specific tasks that a user process alone cannot do well or at all. Opening files and reading or writing files all qualify.
fork and exec are very important.
fork instructs the kernel to create a nearly identical copy of the process.
exec instructs the kernel to start the program that is passed to exec, replacing the current process.
All user processes, except init, start through fork. You run exec to start a new program instead of running a copy of an existing process.
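Interestingly, this must be why sourcing a virtualenv inside a shell script does not affect the calling shell: the script runs in a forked copy of the shell.
When you run ls in a terminal, the shell calls fork to create a copy of the current shell, and the copy calls exec to start ls, which then runs and exits.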
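The fork-then-exec sequence described above can be sketched in Python, whose os module wraps the same system calls. This is a minimal sketch rather than how any real shell is implemented; the function name run_like_a_shell is made up for illustration, and the exit status 127 follows the common shell convention for "command not found".

```python
import os
import sys

def run_like_a_shell(argv):
    """Fork a child, exec argv in it, and return the child's exit status."""
    sys.stdout.flush()                 # avoid duplicating buffered output in the child
    pid = os.fork()                    # the kernel creates a near-identical copy
    if pid == 0:
        # Child: replace this copy of the process with the requested program.
        try:
            os.execvp(argv[0], argv)   # only comes back (by raising) on failure
        except OSError:
            os._exit(127)              # conventional "command not found" status
    # Parent: wait for the child, as a shell does for a foreground command.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1                          # the child was ended by a signal

exit_code = run_like_a_shell(["ls", "/"])
```

Note that the parent never runs ls itself; it only waits, which is why changes made inside the child (its environment, its working directory) never leak back.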
The user space, or userland, is the main memory that the kernel allocates for user processes.
A user is an entity that can run processes and own files. A user is associated with a username, but the kernel uses userids to identify a user. User space processes are owned by a user, and processes are run as a user.
root is a privileged user which can terminate any other user's processes and read/write any file on the file system.
root might be powerful, but it still runs in user mode, not kernel mode.
Groups are sets of users.
Basic Commands and Directory Hierarchy¶
The Bourne Shell
A shell is a program that runs commands. The Bourne Shell was developed at Bell Labs. Linux uses the Bourne-Again shell, commonly known as bash. Use chsh to change the default shell on a Linux system.
Using the Shell¶
echo does not need quotes.
cat performs concatenations on a list of files or input streams.
Standard Input and Standard Output¶
Unix processes use I/O streams to read and write data. Streams are very flexible: their source can be a file, a device, a terminal, or even the output stream from another process.
cat without an argument puts you into STDIN mode, where cat will echo back everything you type into it. When you type, you are sending input to STDIN; cat reads this and writes it to STDOUT (press CTRL-D to exit).
CTRL-D ends STDIN input on a terminal and, depending on the program, terminates it. CTRL-C terminates a program, irrespective of its input or output.
Each process gets an STDOUT stream to write to. cat writes to STDOUT. STDERR is covered later. Both of these can be redirected.
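The behaviour of cat without arguments can be sketched as a loop that copies one stream to another. This is an illustration only, not how cat is actually written; the in-memory streams below are stand-ins for the terminal so that the example runs unattended.

```python
import io
import sys

def cat(source=sys.stdin, sink=sys.stdout):
    """Copy every line from source to sink, like cat with no arguments."""
    for line in source:      # the loop ends at end-of-file (CTRL-D on a terminal)
        sink.write(line)

# In-memory streams stand in for the keyboard and the screen.
fake_stdin = io.StringIO("hello\nworld\n")
fake_stdout = io.StringIO()
cat(fake_stdin, fake_stdout)
echoed = fake_stdout.getvalue()
```

Because the function only sees "a stream", it does not care whether the source is a file, a terminal, or another process, which is the flexibility the notes describe.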
The chapter covers standard commands you should already know: ls, cp, mv, touch, rm and echo.
Use grep to find a string within a file or directory. Note: ripgrep (rg) and the_silver_searcher (ag) are much, much faster.
grep <find what> <find where>
Note: If an argument contains shell expansion (globbing) characters, the shell expands them before grep ever runs, which might not be what you want. Quote the pattern to prevent this.
less provides a scrolling view of its input. less supports searching forward with /.
pwd prints the current working directory. Use pwd -P to resolve symbolic links as well.
diff is used to spot the differences between two files.
diff -u produces the unified output format, which other programs (patch, for example) can consume.
file can be used to guess the file type of a given file.
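find <directory> -name <filename> -print can be used to find a certain file in a directory tree. Remember, if you must use glob characters such as * in the file name, enclose the pattern in single quotes so that the shell does not expand it first.
find <directory> -iname <filename> will turn off case-sensitivity.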
locate searches a prebuilt index of file names instead of walking the file system, and is faster for this reason. However, if the file is newer than the index, locate won't find it.
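head and tail return the top and bottom n lines of a stream, respectively.
head -<n> will show the first <n> lines.
tail -n +<n> will print everything from line number <n> onwards (the book's tail +<n> is an older spelling that not every implementation accepts).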
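What these commands do to a stream of lines can be sketched with the standard library. The function names below are made up for illustration, and the real tools are far more careful about large files and bytes versus lines.

```python
from collections import deque
from itertools import islice

def head(lines, n=10):
    """Return the first n lines of an iterable of lines."""
    return list(islice(lines, n))

def tail(lines, n=10):
    """Return the last n lines; the deque keeps only n lines in memory."""
    return list(deque(lines, maxlen=n))

def tail_from(lines, start):
    """Return everything from line number start (1-based) onwards."""
    return list(islice(lines, start - 1, None))

sample = [f"line {i}" for i in range(1, 8)]   # "line 1" .. "line 7"
first_two = head(sample, 2)
last_two = tail(sample, 2)
from_sixth = tail_from(sample, 6)
```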
Changing Your Password and Shell¶
passwd can change the password, and chsh can change the default shell.
Files and directories whose names begin with a . (dot files) are hidden from ls by default, and are typically configuration files.
Linux programs use text-based files for configuration.
Environment and Shell Variables¶
STUFF=blah is how you assign a value to a variable in the shell. Note the absence of spaces around the =. A shell variable is local to the current process. An environment variable, however, is also passed to every process spawned by this process.
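The difference can be sketched from Python by launching a child process with and without the variable in its environment. This is an analogy, not shell semantics: the plain Python variable plays the role of a shell variable, the env mapping plays the role of the environment, and child_sees is a name invented for this sketch.

```python
import os
import subprocess
import sys

# Child program: report whether it can see the variable STUFF.
child_code = "import os; print(os.environ.get('STUFF', 'unset'))"

def child_sees(env):
    """Run a child Python process with the given environment; return its output."""
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

STUFF = "blah"   # a plain variable: local to this process, invisible to children
base_env = {k: v for k, v in os.environ.items() if k != "STUFF"}

without_env = child_sees(base_env)                     # child cannot see STUFF
with_env = child_sees({**base_env, "STUFF": STUFF})    # child inherits STUFF
```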
Child Processes and Inherited Environments¶
This creates interesting problems, such as needing to start a Python 2 process from a Python 3 environment. If you activate a virtualenv and use subprocess.check_output to run python, the default Python will be the same as the parent process (Python 3 here), because the child inherits the parent's PATH. If you have a weird use case where you need this, don't source the virtualenv; instead, run the parent Python script using the absolute path to the virtual environment's python executable (found in <envdir>/bin/python). Again, note that this is not the Python executable that was used to make the virtualenv.
The Command Path¶
PATH is a very important environment variable. It contains a colon-separated list of directories where the current shell will search for commands.
This is an interesting scenario. Carefully, try clearing the value of PATH in a throwaway shell. Now try running the commands you've learnt so far. If they still execute, they are shell builtins (cd, echo and pwd, for example). If they do not, they are separate binaries that the shell could only find by searching the directories that were listed in PATH.
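The lookup a shell performs over PATH can be sketched as a loop over directories. This is a simplified sketch (real shells also cache results and handle builtins first); find_command is a hypothetical helper, and /nonexistent-directory is just a placeholder path assumed not to exist.

```python
import os

def find_command(name, path):
    """Return the first executable called name in the colon-separated path, or None."""
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory or ".", name)   # empty entry means "here"
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None

current_path = os.environ.get("PATH", "")
found = find_command("ls", current_path)                # usually something like /bin/ls
missing = find_command("ls", "/nonexistent-directory")  # nothing to search: None
```

The first match wins, which is why the order of directories in PATH matters.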
When appending to the path, use
The default is
emacs mode. Sacrilege. Turn on
“[vi] plays a bit like a video game.” LOL.
Getting Online Help¶
man -k <keyword> can be used when you want to search for a manual page by a keyword.
man <section> <command> opens the page for a command in a specific section of the manual. Some of the sections:
3: Higher-level Unix programming library documentation
4: Device interface and driver information
5: File descriptions (system configuration files)
7: File formats, conventions, and encodings (ASCII, suffixes, and so on)
8: System commands and servers
info is a more detailed format for online manuals, adopted by the GNU Project.
Documentation can be sometimes found in
info do not read these.
Shell Input and Output¶
command > file sends the output of a command to a file, clobbering (erasing) the file's existing contents.
set -C can prevent clobbering in bash.
command >> file can append the output to the file.
command1 | command2 streams the STDOUT of command1 into the STDIN of command2.
STDERR is an error stream. To redirect it, you need to use its stream ID, 2: command 2> file.
To send it to wherever STDOUT currently points, use command 2>&1, which merges both streams.
command > file 2>&1 will send both streams to a file.
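The same plumbing can be sketched with Python's subprocess module, which sets up the child's streams before it starts. This is a rough equivalent of the shell syntax, not what bash does internally; the file name both.log and the child program are made up for the example.

```python
import os
import subprocess
import sys
import tempfile

# A child that writes one line to each stream.
child_code = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"

with tempfile.TemporaryDirectory() as workdir:
    out_path = os.path.join(workdir, "both.log")

    # Roughly equivalent to: command > both.log 2>&1
    with open(out_path, "w") as log:
        subprocess.run(
            [sys.executable, "-c", child_code],
            stdout=log,                  # like "> both.log"
            stderr=subprocess.STDOUT,    # like "2>&1"
            check=True,
        )

    with open(out_path) as log:
        captured = log.read()
```

Both lines end up in the file, although their relative order depends on how the child buffers each stream.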
Standard Input Redirection¶
command < input will send the contents of the file input to the STDIN of command.
Understanding Error Messages¶
Anatomy of a UNIX Error Message¶
Protip: Address errors on a first-come, first-served basis.
Errors will have the following components:
The program name
The file name
The error itself
$ ls /asdkl
ls: cannot access /asdkl: No such file or directory
No such file or directory
Not a directory, Is a directory
No space left on device
Operation not permitted
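These messages are the text form of error numbers that the kernel hands back to a failing system call. A small sketch shows the mapping from Python; describe_failure is a hypothetical helper, and it assumes /asdkl does not exist on the machine running it.

```python
import errno
import os
import sys

def describe_failure(path):
    """Try to list path; return (error name, message) if the call fails."""
    try:
        os.listdir(path)
    except OSError as exc:
        return errno.errorcode[exc.errno], os.strerror(exc.errno)
    return None

missing = describe_failure("/asdkl")            # path assumed not to exist
not_a_dir = describe_failure(sys.executable)    # a regular file, not a directory
```

The program name and the file name in a shell error come from the program itself; only the final part comes from the error number.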
Listing and Manipulating Processes¶
A process is a running program.
ps is used to list processes.
ps x: show all of your running processes.
ps ax: show all processes on the system.
ps u: include more detailed information.
ps w: show full command names.
A process can be terminated by sending it a signal.
Use the kill command to send signals. A signal is a message that the kernel delivers to a process.
kill <pid> sends the TERM signal to a process. This signal tells the process that it needs to quit, and gives it time for any cleanup, if needed.
kill -STOP <pid> freezes a process. This way, the process stays in memory and can be resumed later.
kill -CONT <pid> continues/resumes a frozen process.
kill -KILL <pid> is the most brutal way to kill a process. This
will end the process without waiting for any cleanup.
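The signals above can be tried out safely on a throwaway child process. This is a minimal sketch assuming a Unix-like system; the sleeping child exists only for the demonstration.

```python
import signal
import subprocess
import sys
import time

# A child that would sleep for a minute if left alone.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

child.send_signal(signal.SIGSTOP)   # like: kill -STOP <pid>
time.sleep(0.1)
frozen_status = child.poll()        # None: the child is frozen, but still alive

child.send_signal(signal.SIGCONT)   # like: kill -CONT <pid>
child.send_signal(signal.SIGTERM)   # like: kill <pid>
exit_status = child.wait(timeout=10)
```

Python reports a child ended by a signal as the negative signal number, so exit_status records which signal ended it.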
Shells also support Job Control, a way to send TSTP (similar to STOP) to a running process in the foreground using CTRL-Z, and then to resume it with the fg (bring to foreground) or bg (continue in background) commands.
Terminal multiplexers such as tmux are good choices for keeping noninteractive programs running in the background.
Send any command directly to the background by suffixing the command with an & before running it.
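Note that when sending a process to the background, it is best to make sure that its standard streams, stdin included, are redirected, so that it neither waits on the terminal for input nor writes over your prompt.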
File Modes and Permissions¶
Every Unix file has a set of permissions that determine whether a user can read, write, or run that file. Use ls -l to view this information.
-rw-r--r-- 1 juser somegroup 7041 Mar 26 19:34 endnotes.html
The mode, the first string, represents the file’s permissions and some extra information.
The first character of the mode is the file type.
- indicates a regular file.
d indicates a directory. There are other types which will come up later.
The rest of the mode can be read as three groups of three characters: r (read), w (write) and x (execute), with - meaning the permission is not granted. The permission bits indicate what rights the user(s) in question have. The first group is for the owner, the second for the group members, and the third for everyone else.
chmod a+r <file> will give everyone read permissions on the file.
chmod o+r <file> will give other users (not owner and not group members)
read permissions on the file.
chmod u+rwx <file> will give the user (owner) all permissions on the file.
chmod g+rx <file> will give the group members read and execute permissions on the file.
chmod g-x <file> will remove execute access from group members.
chmod o-r <file> will remove read access from other users.
Although you can also set permissions directly with octal numbers (chmod 644 <file>, for example), the symbolic form is much easier to read.
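The relationship between the mode string, the octal numbers and the symbolic chmod forms can be sketched with Python's os and stat modules. This is an illustrative sketch on a temporary file; mode_string is a made-up helper, and the file name is borrowed from the ls -l example.

```python
import os
import stat
import tempfile

def mode_string(path):
    """Return the ls -l style mode string for path."""
    return stat.filemode(os.stat(path).st_mode)

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "endnotes.html")
    open(path, "w").close()

    os.chmod(path, 0o644)                        # octal form: rw-r--r--
    before = mode_string(path)

    # Like chmod g+x: add the group-execute bit to the existing mode.
    current = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, current | stat.S_IXGRP)
    after_add = mode_string(path)

    # Like chmod o-r: clear the other-read bit.
    current = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, current & ~stat.S_IROTH)
    after_remove = mode_string(path)

    directory_mode = mode_string(workdir)        # starts with d for a directory
```

The symbolic forms only flip the named bits, whereas an octal number replaces all nine at once.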