This chapter introduces some general UNIX and GNU/Linux concepts. It is a good idea to read it thoroughly if you have no UNIX or GNU/Linux experience, because many of the concepts covered here are used throughout this book and in GNU/Linux itself.
One of Linux's strengths is multi-tasking. Multi-tasking means that multiple processes can run at the same time. You might wonder why this is important to you, because most people use one application at a time. Besides the obvious convenience of browsing while a word processor runs in the background, multi-tasking is a bare necessity for Unix-like systems. Even if you have not launched any applications, a number of processes are running in the background. Some processes provide network services, others show a login prompt on other consoles, and there is even a process that executes scheduled tasks. Processes that run in the background like this are often called daemons (not to be confused with the word demon; a daemon is a benevolent, protective spirit). At a later stage we are going to look at how you can move processes to the background yourself (see Chapter 8).
Note: On single-processor systems, processes do not really run simultaneously. Instead, a scheduler in the kernel divides CPU time between processes, giving the illusion that they are running at the same time.
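As a small illustration of multi-tasking, the following sketch starts two short jobs at the same time and waits for both. It uses the shell's & operator only for demonstration; the function name run_two_jobs is made up for this example, and the order in which the two messages appear is not guaranteed.

```shell
#!/bin/sh
# Start two short-lived jobs in the background. The shell does not wait
# for the first job to finish before it launches the second one.
run_two_jobs() {
    (sleep 1; echo "job A done") &
    pid_a=$!
    (sleep 1; echo "job B done") &
    pid_b=$!
    # Block until both background jobs have finished.
    wait "$pid_a" "$pid_b"
}

run_two_jobs
```

Both jobs sleep for one second, yet the whole function takes roughly one second rather than two, because the kernel schedules the two processes side by side.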
Operating systems store data in filesystems. A filesystem is basically a collection of directories that hold files, such as the operating system, user programs and user data. GNU/Linux has only one filesystem hierarchy: unlike Windows, it does not have multiple drive letters (e.g. A:, C:, D:). The filesystem looks like a tree, with a root directory (/) that has no parent directory, branches (directories with subdirectories), and leaves (directories with no subdirectories). The directory names in a path are separated by the "/" character.
Figure 4-1 shows the structure of a filesystem. You can see that the root directory (/) has two child directories: bin and home. The home directory has two child directories, joe and jack. The diagram shows the full pathname of each directory. The same notation is used for files. Suppose that there is a file named memo.txt in the /home/jack directory; the full path of the file is then /home/jack/memo.txt.
Each directory has two special entries, "." and "..". The "." entry refers to the directory itself, and ".." refers to its parent directory. These entries can be used to build relative paths. Suppose that you are working in the jack directory. From this directory you can refer to the joe directory as ../joe.
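The special entries can be tried out safely on a scratch copy of the example tree. The sketch below builds the bin, home/joe and home/jack directories under a temporary directory and resolves a few relative paths from jack. The names demo_root and resolve_from are made up for this example.

```shell
#!/bin/sh
# Build a throwaway copy of the example tree under a temporary directory.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/bin" "$demo_root/home/joe" "$demo_root/home/jack"

# Resolve a relative path from a starting directory and print the
# resulting absolute path. $1 = starting directory, $2 = relative path.
resolve_from() {
    (cd "$1" && cd "$2" && pwd -P)
}

resolve_from "$demo_root/home/jack" "."       # the jack directory itself
resolve_from "$demo_root/home/jack" ".."      # its parent directory, home
resolve_from "$demo_root/home/jack" "../joe"  # the sibling directory joe

# Remove the scratch tree when you are finished: rm -r "$demo_root"
```

The function changes directory inside a subshell, so the working directory of the calling shell is left untouched.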
You might have started to wonder how it is possible to access devices or partitions other than the hard disk partition that holds the root filesystem. Linux uses the same approach as UNIX for accessing other media: the system administrator can connect a device to any directory in the filesystem structure. This process is named "mounting". For example, one could mount the CD-ROM drive on the /cdrom directory. If the mount succeeded, the files on the CD-ROM can be accessed through this directory. The mounting process is described in detail in Section 7.5.
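Mounting itself requires administrator privileges, but looking at what is currently mounted does not. The following sketch assumes a Linux system that provides /proc/mounts and prints each mounted device with the directory it is connected to. The function name list_mounts is made up for this example.

```shell
#!/bin/sh
# Print "device -> mount point (type)" for every mounted filesystem.
# Assumes a Linux system where /proc/mounts is available.
list_mounts() {
    while read -r device mount_point fs_type _rest; do
        printf '%s -> %s (%s)\n' "$device" "$mount_point" "$fs_type"
    done < /proc/mounts
}

list_mounts

# Mounting is done as root with "mount DEVICE DIRECTORY", for example
# (illustrative only, the device name depends on your system):
#   mount /dev/hdc /cdrom
```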
The Filesystem Hierarchy Standard Group has attempted to create a standard that describes which directories should be available on a GNU/Linux system. Nowadays most major distributions use the Filesystem Hierarchy Standard as a guideline. This section describes some mandatory directories on GNU/Linux systems.
Please note that GNU/Linux does not have a separate directory for each application (like Windows). Instead, files are ordered by function and type. For example, the binaries for most common user programs are stored in /usr/bin, and their libraries in /usr/lib. This is a short overview of important directories:
/bin: essential user binaries that should still be available when /usr is not mounted.
/dev: device files. These are special files used to access certain devices.
/etc: system-wide configuration files.
/home: contains home directories for individual users.
/lib: essential system libraries (like glibc), and kernel modules.
/root: home directory for the root user.
/sbin: essential binaries that are used for system administration.
/tmp: a world-writable directory for temporary files.
X11R6: the X Window System.
/usr/bin: stores the majority of the user binaries.
/usr/lib: libraries that are not essential for the system to boot.
/usr/sbin: non-essential system administration binaries.
/var: variable data files, like logs.
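To see how a given system matches this overview, a short loop can report which of the directories exist. This is only a sketch; the function name report_dirs is made up for this example, and some distributions implement a few of these directories as symbolic links.

```shell
#!/bin/sh
# Report which of the given directories exist on this system.
report_dirs() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf '%-10s present\n' "$dir"
        else
            printf '%-10s missing\n' "$dir"
        fi
    done
}

report_dirs /bin /dev /etc /home /lib /root /sbin /tmp \
            /usr/bin /usr/lib /usr/sbin /var
```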
In UNIX and Linux almost everything is represented as a file, including devices. Each GNU/Linux system has a directory with special files, named /dev. Each file in the /dev directory represents a device. You might wonder how this is done: a device file is special because it carries two numbers, the major and the minor number. The kernel uses these numbers to determine which device a device file represents. The following example shows these numbers for a device:
$ ls -l /dev/zero
crw-rw-rw- 1 root root 1, 5 Apr 22 2003 /dev/zero
The ls command lists files and information about files. In this example, information about the /dev/zero device is listed. This particular device has 1 as its major device number and 5 as its minor device number.
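The same two numbers can also be read without parsing the ls output. The sketch below assumes GNU stat, whose %t and %T formats print the major and minor numbers in hexadecimal, and converts them to decimal. The function name device_numbers is made up for this example.

```shell
#!/bin/sh
# Print the major and minor device numbers of a device file in decimal.
# Assumes GNU stat: %t is the major number and %T the minor number,
# both printed in hexadecimal.
device_numbers() {
    major_hex=$(stat -c '%t' "$1") || return 1
    minor_hex=$(stat -c '%T' "$1") || return 1
    printf '%d, %d\n' "0x$major_hex" "0x$minor_hex"
}

device_numbers /dev/zero   # on Linux this prints: 1, 5
```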
Note: If you have the kernel sources unpacked after installing Slackware, you can find a comprehensive list of devices with their major and minor numbers in /usr/src/linux/Documentation/devices.txt. An up-to-date list is also available online at: ftp://ftp.kernel.org/pub/linux/docs/device-list/
For the Linux kernel there are two types of devices: character devices and block devices. Character devices can be read byte by byte; block devices cannot. Block devices are read one block at a time (for example, 4096 bytes at a time). Whether a device is a character or a block device is determined by the nature of the device. For example, most storage media are block devices, and most input devices are character devices. Block devices have one distinctive advantage, namely that they can be cached. This means that commonly read or written blocks are kept in a special area of the system memory, named the cache. Memory is much faster than most storage media, so performing read and write operations on commonly used blocks in memory is a huge performance benefit. Of course, changes eventually have to be written to the storage medium.
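The type of a device file can be checked with the shell's file tests, where -c is true for character devices and -b for block devices. The following sketch classifies a path and then reads a few bytes one at a time from a character device. The function name device_type is made up for this example.

```shell
#!/bin/sh
# Classify a path using the shell's file tests:
# -c is true for character devices, -b for block devices.
device_type() {
    if [ -c "$1" ]; then
        echo "character"
    elif [ -b "$1" ]; then
        echo "block"
    else
        echo "not a device"
    fi
}

device_type /dev/zero   # character
device_type /           # not a device

# Read four bytes, one byte at a time, from a character device and
# count how many bytes arrived.
dd if=/dev/zero bs=1 count=4 2>/dev/null | wc -c
```

In the output of ls -l the same distinction shows up as the first character of the mode field: c for character devices and b for block devices.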
There are two categories of devices that we are going to look into in detail, because understanding how these devices are named can be crucial for partitioning a hard disk, or for mounting. Many computers use ATA hard disks and CD-ROM drives. With the traditional Linux ATA (IDE) driver these devices are named in the following way (newer kernels that use the libata driver present ATA disks under the SCSI-style names described further on):
/dev/hda - master device on the first ATA channel
/dev/hdb - slave device on the first ATA channel
/dev/hdc - master device on the second ATA channel
/dev/hdd - slave device on the second ATA channel
On most computers the hard disk is the master device on the first ATA channel (/dev/hda), and the CD-ROM the master device on the second ATA channel. Hard disk partitions have the device name plus a number. For example, /dev/hda1 is the first partition on the /dev/hda disk.
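The naming scheme is regular enough to decode mechanically. The sketch below turns one of the names hda to hdd, optionally followed by a partition number, into a description. The function name describe_ata is made up for this example, and the function does not check whether the device actually exists.

```shell
#!/bin/sh
# Describe a traditional ATA device name such as /dev/hda or /dev/hdc1.
describe_ata() {
    name=${1#/dev/}                 # strip the leading /dev/
    case $name in
        hda*) position="master device on the first ATA channel" ;;
        hdb*) position="slave device on the first ATA channel" ;;
        hdc*) position="master device on the second ATA channel" ;;
        hdd*) position="slave device on the second ATA channel" ;;
        *)    echo "$1: not one of hda..hdd"; return 1 ;;
    esac
    partition=${name#hd?}           # whatever follows the drive letter
    if [ -n "$partition" ]; then
        echo "partition $partition on the $position"
    else
        echo "$position"
    fi
}

describe_ata /dev/hda1
describe_ata /dev/hdc
```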
Most PCs do not have SCSI hard disks or CD-ROM drives, but the kernel's SCSI layer is often used for USB storage devices, which therefore get SCSI device names. For SCSI drives the following notation is used:
/dev/sda - First SCSI disk
/dev/sdb - Second SCSI disk
/dev/sdc - Third SCSI disk
/dev/scd0 - First CD-ROM
/dev/scd1 - Second CD-ROM
/dev/scd2 - Third CD-ROM
Partitions are named in the same way as on ATA disks; /dev/sda1 is the first partition on the first SCSI disk.
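Because partition names are simply the disk name followed by a number, a name can be split into its two parts with a few lines of shell. The sketch below handles hdX and sdX names only (not the scd CD-ROM names, whose trailing number identifies the drive). The function name split_partition is made up for this example.

```shell
#!/bin/sh
# Split a disk or partition name such as /dev/sda1 into disk and number.
split_partition() {
    device=$1
    case $device in
        /dev/sd[a-z]*|/dev/hd[a-z]*) ;;
        *) echo "$device: not an hdX or sdX name"; return 1 ;;
    esac
    # Remove the trailing run of digits to get the disk name.
    disk=$(printf '%s\n' "$device" | sed 's/[0-9]*$//')
    # Whatever remains after the disk name is the partition number.
    number=${device#"$disk"}
    if [ -z "$number" ]; then
        echo "$device is a whole disk"
    else
        echo "$device is partition $number on $disk"
    fi
}

split_partition /dev/sda1
split_partition /dev/sdb
```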