Hidden Strengths of Unix
2010-12-11
Introduction
GNU/Linux, popularly referred to simply as Linux, has its roots in Unix. As such, it has a very different design and philosophy from Microsoft Windows, the system it is most often compared with. Many of the comparisons between Linux and Windows focus on how Linux deals with typical tasks one would perform on a Windows system. What they don't tell you is that its Unix heritage provides Linux with a lot of functionality not found in Windows. Most people come to appreciate this only after extensive Linux or Unix usage, but it makes Unix systems immensely powerful and flexible, which even casual users can benefit from. This essay explains some of this Unix functionality and how it can benefit you.
The Shell
It all starts at the shell. The shell accepts typed commands from the user and executes them. Back in the days when Unix was created, this was the way people interacted with computers. Many of the things that make Unix so powerful can be traced back to this.
A fundamental concept in Unix is that of the file. A file is basically any sort of data, typically stored on a disk. Documents are files. Images are files. All programs on your computer are files. Files are organized in directories, which are themselves a kind of file. This means that directories can contain other directories. This leads to a hierarchical organization, where the location of a file (called the path to that file) is given by a list of directory names, and finally the file name. An example path is /usr/bin/gzip, which indicates starting at the root of the filesystem, then traversing the usr directory, followed by the bin directory, finally reaching the file gzip.
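As an illustration, here is a small shell session that builds a miniature hierarchy (the names are made up for the example) and refers to a file by its path:

```shell
# Build a tiny directory hierarchy; each component of the path below
# names one directory traversed on the way to the file.
mkdir -p demo/usr/bin
echo 'hello' > demo/usr/bin/gzip-example

# demo/usr/bin/gzip-example is the path to the file we just created.
cat demo/usr/bin/gzip-example
```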
One of the fundamental influences on the development of Unix, called the Unix philosophy, is to do one thing, and do it well. Accordingly, a Unix system contains hundreds of little programs that all perform very specific tasks. There is a program called ls, which lists files. There is a program called cp, which copies files. There is a program called grep, which searches the contents of files. Etcetera. These are the commands you invoke from the shell. Thus, the vocabulary that the shell understands can be extended by adding more of these little programs.
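To get a quick taste of these little programs, here is a short session using invented file names; each command does exactly one job:

```shell
# Create a file to work with.
echo 'the quick brown fox' > notes.txt

ls notes.txt               # ls lists files
cp notes.txt backup.txt    # cp copies files
grep quick backup.txt      # grep searches file contents for a pattern
```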
One of the major innovations of Unix is the invention of the pipe, which can be used to feed the output of one program as input to another. This works hand in hand with the many little programs that each perform a single task. Given a program that can find files that have not been accessed in the last 30 days (find), a program that removes the files you specify (rm), and a program that takes a list of things and passes them as arguments to another command (xargs), you can use pipes to efficiently remove all files that haven't been accessed recently:
find . -atime +30 -print | xargs rm
Replace rm by a command that moves files to a different location (mv), and you can move old files to an archive, reusing the commands for finding these files and passing their names to a command.
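As a sketch of that variant, the following moves stale files into an archive directory instead of deleting them. The file name is invented, -maxdepth is a GNU/BSD extension used here to keep the example to the current directory, and -I{} tells xargs where to substitute each file name:

```shell
mkdir -p archive
touch -a -t 202001010000 report.old   # fake an old access time for the demo

# Find files not accessed in over 30 days and move them into archive/.
find . -maxdepth 1 -atime +30 -type f -print | xargs -I{} mv {} archive/
```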
The combination of little programs with pipes is pretty unique to Unix systems. Other systems often use a more monolithic approach, where a single program is responsible for one or more complex tasks. The Unix way has numerous advantages, though. First of all, the system is more flexible. By combining the programs in new ways, you can perform tasks that the authors of the programs may not have thought of. Secondly, because the individual programs are simple, they are easier to write correctly and it's easier to verify their correctness than would be the case with larger programs.
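To see this flexibility in action, here is a word-frequency counter assembled entirely from existing programs, a combination none of their authors had to plan for (the file name is invented):

```shell
echo 'to be or not to be' > speech.txt

# tr splits the line into one word per line, sort groups duplicates,
# uniq -c counts them, and sort -rn ranks them by count.
tr ' ' '\n' < speech.txt | sort | uniq -c | sort -rn
```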
Scripting Languages
Suppose you have some complex operation, requiring multiple steps, that you perform frequently. You could type in the commands (or click the right places, if you're using a graphical user interface) to execute this operation each time. However, that quickly becomes tedious. Unix offers you a way to automate these steps, called scripting.
Just like a script for a movie or stage play tells the actors what to do, a script in Unix parlance tells the computer what to do. The (conceptually) simplest kind of script is the shell script. Suppose the operation you wanted to perform is the deletion of old files, discussed previously. Now, you probably don't want to delete all files that you haven't recently used, so say you want to delete them only from the directories foo, bar, baz, and the directories contained in them. If you were to do this by hand, you could use the following commands:
find foo -atime +30 -print | xargs rm
find bar -atime +30 -print | xargs rm
find baz -atime +30 -print | xargs rm
You can very easily turn this into a shell script. Simply put the commands into a file (e.g. rm-old-files), and make the very first line of the file read
#! /bin/sh
This line will tell Unix that the file is a script, to be interpreted by /bin/sh, the standard shell. Finally, you need to make the script executable, which can be done by the command chmod +x rm-old-files. Now, you can use rm-old-files like you would any other program.
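Put together, the whole rm-old-files script might look like this. A loop avoids repeating the find command for each directory, and xargs -r (a GNU extension) keeps rm from running when nothing matches; foo, bar, and baz are the placeholder directories from the example:

```shell
mkdir -p foo bar baz                 # the example directories
cat > rm-old-files <<'EOF'
#! /bin/sh
for dir in foo bar baz; do
    find "$dir" -atime +30 -print | xargs -r rm
done
EOF
chmod +x rm-old-files

touch -a -t 202001010000 foo/stale   # fake a stale file for the demo
./rm-old-files                       # foo/stale is removed
```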
The fact that you have to specify the interpreter suggests that you can use other interpreters instead of /bin/sh. Indeed, you can use other languages than shell script by specifying different interpreters. Many scripting languages have been developed for Unix, a few popular ones being Perl, Ruby, Python, and Tcl. Where shell script is a good choice for automating tasks that you would normally accomplish by typing commands in the shell, these languages are a better choice when you want to write more complex programs, anything from new shell commands to full applications with graphical user interfaces.
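For example, the same shebang mechanism works for any interpreter. This sketch writes and runs a one-line Python script (it assumes a python3 interpreter is installed, and the script name is made up):

```shell
cat > hello-script <<'EOF'
#!/usr/bin/env python3
print("hello from a Python script")
EOF
chmod +x hello-script
./hello-script
```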
Although writing complex applications with scripting languages is probably beyond the grasp of many beginning users, writing simple shell scripts to simplify common tasks is easy once you know your way around the shell. Thus, scripting has a low entry barrier, but scales all the way up to full-blown applications. Scripting allows you to customize your environment to your needs, and thus forms an integral part of the lives of most Unix users. It's sad that scripting is almost completely ignored on other platforms.
More information on scripting in Unix:
- A quick guide to writing scripts using the bash shell
- UNIX Shell Scripting Awk and Sed
- Programming Ruby: The Pragmatic Programmer's Guide
- Perl Tutorials
Networking
One of the things distinguishing Unix from many other operating systems is its network-friendliness. Unix was built at a time when computers were very expensive, and multiple people would share a single computer, often by logging in from a terminal in a different location. The terminal provided only a display and a keyboard, whereas all the actual processing was done in the computer. The same mechanisms that were used back then are still used now, although the modern equivalent of a terminal is usually a PC connecting to the Unix system using SSH or telnet. Even if you use the system with a keyboard and monitor physically attached to it, the system pretends you're using some kind of terminal.
Other than the latency your connection might have, it doesn't matter whether you're using a Unix system via the attached keyboard and monitor, via a different computer in the same building, or via a computer on the other side of Earth, or even a different planet. You can execute all the same commands you're used to, use the same scripting languages, etc.
Besides allowing you access to your system from pretty much any location, Unix systems also provide a number of commands to interact with other computers over the network. In fact, many common technologies in use on the Internet were originally developed on Unix systems. Of course, there is support for well-known services such as the World Wide Web, email and FTP, but also lesser known services like Usenet and IRC. In addition to that, there are many network services that are specific to Unix, such as SSH (which lets you log in to a remote system securely), NFS (which lets you access files on a remote system as if they were on the local system), talk (which lets you chat with other people on the Unix system), and many more.
The X Window System
So far, the focus has been mostly on the command line. This is because the command line is where most of the strengths of Unix are found. It's difficult to apply many of these concepts to graphical user interfaces. For example, while it's easy to see how the text one command outputs can be fed as input to another command for further processing, it's much less obvious how to do the same in a GUI environment, with windows and buttons and such. This goes a long way in explaining why operating systems which were designed to be used with a GUI, such as Windows and the old Mac OS, do not offer many of the abilities that Unix offers. They may offer some scripting facilities, but these don't appear to be used nearly as much as the scripting facilities of Unix.
The graphical user interface that most Unix systems use is the X Window System, introduced in 1985. It incorporates certain aspects of the Unix philosophy, which has led it to have features that other GUIs still lack.
One such feature is network transparency. The X Window System uses the client-server model, where the applications are clients connecting to a server which does the actual displaying of windows etc. Clients and servers can be run on different computers, which means you can have very flexible setups. For example, you can have one expensive and powerful computer to do the actual work, and have lots of cheap terminals running only X servers for people to interact with the central computer. That way, the bulk of administration work has to be done only on the central computer, you may only have to pay license fees for one computer, etc.
Another feature of the X Window System is its modular architecture. The X server only controls the hardware: the display, the keyboard, and the mouse. The actual drawing of windows is handled by a separate program called the window manager. Support for icons and taskbars goes in yet other programs. Neither the X server nor the window manager provides things like buttons and scrollbars; instead, applications use toolkits which implement that functionality. This setup offers great flexibility. If you don't like how a certain window manager manages your windows, you can exchange it for a different one. When the Motif toolkit was found too expensive to use, people wrote different ones.
By choosing your toolkits and window manager, you have a great deal of control over how your system looks and works. To give an idea of how great this flexibility is, think about how things work in Windows, Mac OS, or Mac OS X. You have your desktop, with icons on it, windows with buttons and such, and a taskbar or dock which you can use to switch to open windows or to start applications. Windows can be resized, moved over each other, minimized, and made to fill the whole screen. The whole system takes a while to load, often in the range of a few tens of seconds. Using the X Window System, you can get this, too. You can also get a setup that makes every window fill the entire screen, does not allow you to move or resize windows, does not provide any dock or taskbar (you switch windows using the keyboard), has no concept of a desktop on which to put icons, and takes less than a second to start even on older computers. You can also have a setup that extensively uses 3D, where you can not only move and resize windows, but also rotate them and move them closer and further away (making them bigger and smaller).
To allow the clients to communicate with the X server, a certain protocol has to be used. The X Window System uses an extensible protocol, meaning that new functionality can be added without having to change what's already there. This has been used to incorporate features which weren't present in the original incarnation, such as support for efficient decoding and displaying of video, 3D support, translucency, scalable fonts, and more. Despite the introduction of all these new features, the core protocol is still the same, meaning an application that was developed 20 years ago can work with current X servers without any difficulty.
Stability
Despite all its flexibility, Unix is also a very stable system. By stable, I don't mean that it doesn't crash (it should go without saying that the system doesn't crash!), but rather that it doesn't change too much. The system calls that were at the basis of UNIX System 2 in 1972 are still at the basis today.
Stability is a great advantage. The more stable the system is, the more software that was once written for it will continue to work. This means developers can focus on improving their software, rather than tracking the changes in the system. It also means that there's a good chance that if you somehow wanted to run some really old and abandoned software, it's probably easy to do so.
Portability
Another strength of Unix is the wide variety of hardware it runs on. Unix was the first operating system specifically designed to be portable to different computers. The computer it was developed on was an older, relatively underpowered one. As technology has advanced to the point where computers are now smaller, cheaper, and more powerful, even watches can run a form of Unix these days. At the same time, Unix has been used on large and powerful systems, and many of the world's most powerful supercomputers run variants of Unix.
Part of the success of Unix on all kinds of hardware is the simplicity of its parts. It's much easier to adapt Unix to the constraints of a specific environment than a system that is designed from the ground up to be used as a desktop operating system with a graphical user interface. Also, Unix is well understood by the industry, which means it's easier to find people who know how to customize Unix than people who know how to customize most other systems.
The portability of Unix also ties in with the stability mentioned in the previous section. Since Unix works largely the same way, no matter if you run it on a PC, a Macintosh, an old Amiga, or a Sun Workstation, concepts learned and programs written on one can be easily applied and used on the others. This is an advantage to users, and also contributes to the longevity of Unix and the variety of software available for it.
Open Source
The flexibility of Unix doesn't stop at being able to mix and match programs and writing your own. Many programs that run on Unix systems, and even some Unix systems themselves (like GNU/Linux, FreeBSD, NetBSD, and OpenBSD) are open source. ‘Source’ refers to what the programmer typed when he wrote the software, and can be understood by humans. By contrast, ‘binary’ is what the software looks like once it has been compiled, and is understood by computers, but not humans.
When a program is open source, it means that you can look at how the software was written, change how it works, and distribute your changed version to whomever is interested. If you want to make a small change to how a program works, or add a feature to it, you don't have to figure out how it does everything it does, write a program that duplicates that behavior, and then make your changes. You can just modify the original program.
Open source has more benefits besides flexibility. If the original developers of the software lose interest (for example, they may want to focus on a new product, or they went bankrupt or were bought out), that doesn't mean you have to stop using the software. As long as somebody is interested in maintaining the software, it will continue to work, get security patches, and may even receive new features. That somebody could be you, if you have the technical skills. Or you could pay someone to do it for you.
Having access to the source code of a program, rather than just the program itself, also has the benefit that it can be made to work on other platforms. The binary code understood by a computer depends on the type of computer and the operating system it is running. Thus, a program will typically only run on a specific combination of operating system and computer architecture. A binary program that works with PCs running Linux would not work with PCs running FreeBSD, nor with Macintoshes running Linux. The combination of having the source code available and the similarity between Unix systems means you can probably get the program to work on different platforms with little effort. You may not even have to modify the source, often just compiling it for the new platform is enough.
While open source certainly isn't unique to Unix, there is an interesting synergy between the two. The original Unix was developed in an atmosphere of open sharing of source code. Closed source Unix systems have since been developed, but open source versions still exist; some newly written, some survivors from the old times. The original open culture surrounding Unix, as well as the decision of the Free Software Foundation to develop an open source Unix system, have led to a lot of open source software being created for Unix. This software is very slow to spread to other operating systems, such as Windows, because such systems do not expose the same functions that all Unix systems do, which means that software would have to be extensively modified to work on them.
Conclusion
There is much more to Unix than meets the eye if you just take a casual look at it or read a review or comparison to Windows. While many of its features take effort to learn and use, they offer great flexibility and power once mastered. You don't know the true power of Unix until you've used it yourself! And the best part is that, these days, you can get it all for free.