Shell tutorial part 1

From LXF Wiki

LINUX SHELL PROGRAMMING

(Original version written by Marco Fioretti for LXF issue 65.)


Shell secrets


PART 1 Marco Fioretti explains how to unleash all the power of the command line.


RESOURCES

Get your command line news here

For a general introduction to the command line, try shelldorado.com (http://www.shelldorado.com), a comprehensive source of example scripts, coding style advice, useful tricks, tutorials and much more. Another portal, linuxcommand.org (http://www.linuxcommand.org), aims to become your one-stop command line shop. Back in the realm of printed paper, a very good tutorial is O’Reilly’s Learning the Bash Shell by C Newham and B Rosenblatt (now in its second edition, ISBN: 1-56592-347-2).

The Linux platform is becoming a stronger desktop solution day by day, and part of the reason for this is the commitment by distribution authors to provide an exclusively graphical user interface, from installation to upgrade. We shouldn’t forget, though, that the command line interface still exists. It may not be as pretty as a GUI but this alternative interface has flexibility, and there are many cases where it can save you a lot of time. For example, you can use the command line to save and rerun long sequences of commands, or create new ones to be executed en masse. The great thing about the command line is that you don’t have to be a guru to use it. Even end users can use it, and will find it useful and even fun to write commands this way. And why take the trouble? Well, unlike in a GUI, where you can only click the buttons that somebody else thought were needed, you can create your own commands and tailor your system. For example, have you ever wanted to issue multiple commands with as little clicking as possible? If so, welcome to the shell prompt. Or consider the boot process, which involves a chain of text commands executed in a well-defined order and can be greatly customised if you know how.

In-out.png (http://linuxformat.co.uk/wiki/images/4/4a/In-out.png)

In fact, there’s nothing to stop you from getting the best of both worlds. Once a script works on the command line, it can be easily bound to a GUI’s desktop icon or menu entry. Then in the event of an emergency you can call on your command line knowledge to stay in operation. What about disaster recovery with a GUI - where do you point and click if you need to reinstall the X server?


Inside the shell


All command-line typing happens inside what’s known in Unixland as a shell. This is both a programming language and a command interpreter, providing an interface to the operating system. Every shell has control flow structures, manipulates variables and can be modified to suit the environment in which programs run - but often they do this in slightly different ways. There are hundreds of shells available for Linux, but you might only have a handful of them installed. To find which ones are installed in your system, type the following command:

$ cat /etc/shells

Bash (Bourne-Again Shell, an adaptation of the Bourne shell originally written by Stephen Bourne at AT&T) is the most common choice on Linux systems. Another popular one is csh (C Shell), whose syntax is closer to the C programming language; tcsh is an enhanced, backward-compatible version of csh. The reason why there is more than one solution for (apparently) the same problem is the usual one: different shells may have different licences, and each one is optimised for slightly different uses. Find out more online at http://consult.cern.ch/writeup/shellchoice/main.ps, an in-depth discussion of these and other shell programs that was written before bash claimed dominance.

Programs written in a shell language, or any other interpreted one, are normally called scripts. They are just plain text files containing sequences of commands. Scripts are loaded and executed by the interpreter line by line, just as if you were typing the same sequence of instructions at the prompt. More explicitly, there isn’t anything that you can place in a shell program that you can’t type at the shell prompt, and vice versa.

Are scripts better or worse than normal, binary-compiled programs? No, just different. Binary programs are faster but take much more time to develop and test. Scripts are much quicker to write but normally run much slower. What is important, and should be evaluated case by case, is that the total time spent writing and running the program is minimised. In practice, shell scripts are usually the best match for the custom programming skills and needs of home and small-office Linux users.


Programs and keywords


Software programs are specific binary files physically stored on the hard disk. As a command line-based programming language, every shell can launch them directly. Shells also have, however, a set of built-in keywords, or commands, that do not correspond to any separate program. This can cause confusion the first time you study a shell script, so keep this distinction in mind. The executable programs visible to the shell are those stored in the directories listed in the PATH environment variable. On my machine the PATH value is:

 [marco@polaris marco]$ echo $PATH
 /usr/kerberos/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/usr/lib/jre/bin:/home/marco/bin:/usr/lib/jre/bin

Here echo is a built-in keyword: the shell performs the corresponding action, printing the content of the PATH variable to the terminal, all by itself, without launching any external program.
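
If you are ever unsure whether a name is a built-in or a program found through PATH, you can ask the shell itself. What follows is only a small sketch, assuming a Bourne-style shell such as bash; the exact wording of the answers differs from shell to shell, and the names echo, cd and ls are just convenient examples.

```shell
# Ask the shell how it would interpret each name:
# a built-in keyword, or a program stored somewhere in PATH.
for name in echo cd ls; do
    type "$name"
done

# command -v prints the full path of an external program
# (for a built-in it prints just the name).
ls_path=$(command -v ls)
echo "ls lives at: $ls_path"

# Show the PATH directories one per line, in the order they are searched.
echo "$PATH" | tr ':' '\n'
```

Run at an interactive prompt, the answer for ls may mention an alias instead of a path, since many distributions define one by default.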

To save the user’s time, many shells implement command completion. To see it in action, just type at a Linux prompt the string “finge” and then hit the Tab key. The shell will scan all the executable programs in the $PATH, discover that only the finger executable matches the string you entered, and complete it on the command line. The same happens when you enter a partial directory or file name.

Fig2.png (http://linuxformat.co.uk/wiki/images/c/cd/Fig2.png)
Fig 2: The sequence of commands in a sample pipe.

Ports and pipes


Now that we know why we’re in a shell, the next thing to understand is how data flows in, out and through it. Each program running at the prompt can be thought of as a black box with three default ports or streams: standard input (STDIN), standard output (STDOUT) and standard error (STDERR) (Fig 1).

STDIN is where all the input comes from: this is, for example, where the kernel forwards the keys you press on your keyboard.

STDOUT is where the program sends all the bytes it produces: reports, calculations and so on. STDERR is, as if you didn’t know, the emergency line reserved for error messages. What makes this architecture extremely powerful is the fact that the ports of different programs can be easily connected to each other or to files on your hard disk, creating on the fly a single virtual program with impressive capabilities.
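
Connecting a port to a file takes a single character. The sketch below assumes a Bourne-style shell; the file names out.txt, listing.txt and errors.txt and the path /nonexistent-example-path are made up for the illustration, and everything happens in a temporary directory so nothing of yours is touched.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# '>' connects STDOUT to a file instead of the terminal.
echo "hello from stdout" > out.txt

# '2>' connects STDERR to a file. ls has nothing to list here,
# so listing.txt stays empty and the complaint lands in errors.txt.
ls /nonexistent-example-path > listing.txt 2> errors.txt ||
    echo "ls failed, as expected"

# '<' connects STDIN to a file: wc reads out.txt as if you had typed it.
word_count=$(wc -w < out.txt)
echo "out.txt contains $word_count words"
```

Keeping STDOUT and STDERR apart like this is what lets you save the useful output of a command while still seeing, or separately logging, its error messages.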

Let’s demonstrate this by pretending our hard disk is full. To make room, we’ll find the 50 biggest files in the home directory, listing them in the terminal window and saving the list in a local file. By examining the list we would then be able to decide which files can be removed. The command for doing this would look something like:

find . -type f -exec ls -s {} \; | sort -n -r | head -50 | cat -n |
tee /tmp/bigfiles.list

Quite a handful, isn’t it? Don’t worry, it’s nothing to be scared of - it makes perfect sense when you understand it!

The first thing to understand is the role of the ‘|’ character, known in this context as the ‘pipe’ operator. What it does is connect – just like a pipe – the STDOUT port of the previous program straight to the STDIN input of the next one. To understand what’s going on, we’ll have to introduce some utilities – choosing ones that will be extremely handy for a Linux user. To follow the action, test the commands above on your machine, adding one section at a time. Start by typing everything up to (but not including) the first pipe sign. Hit enter and look at the result. Then retype (or recall with the up arrow key), add everything up to the second pipe sign, hit enter and note what else happened. Repeat till the end of the line.

The find program finds all the files matching certain criteria and, if requested, exec(utes) on each of them the action written between the -exec string and the escaped semi-colon (\;); the curly braces stand for the name of the file currently being processed. The full stop after the find command represents the current directory, but could be replaced by any folder on your drive, or a combination of them. ‘-type f’ means: consider only objects of type file. ls is short for list: the -s option tells it to print the file name preceded by the disk space the file occupies, measured in blocks (1KB units by default with GNU ls) rather than bytes. Consequently, this first piece of the command will produce an unordered list of file sizes and names, one per line, no matter how deeply the files are nested in the directory tree.
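
To watch find at work without wading through your whole home directory, you can point it at a tiny directory tree built for the purpose. This is only a sketch: the directory layout and the file names top.txt and deep.txt are invented for the example, and the sizes printed by ls -s will depend on your filesystem.

```shell
# Build a small tree in a temporary directory.
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
echo "top" > "$demo/top.txt"
echo "nested" > "$demo/a/b/deep.txt"

# Run ls -s once per regular file; {} is replaced by each file name.
# The directories a and a/b are skipped because of -type f.
find "$demo" -type f -exec ls -s {} \;

# Count how many files were found, however deeply nested.
file_count=$(find "$demo" -type f | wc -l)
echo "found $file_count files"
```

Both files are reported, even though one of them sits two levels down.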

The sort command will rearrange the lines in numerical (“-n”), reverse (“-r”) order, so the biggest files come first. “head -50” will pass on only the first 50 lines of its input stream, discarding all the others, and “cat -n” will add a line number to everything it receives. In the midst of all this piping frenzy, tee will create a branch: its standard input will be both printed to the terminal (since it’s the last command) and saved to the file /tmp/bigfiles.list. The overall data flow is shown in Fig 2.
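
The second half of the pipe can also be tried on harmless, made-up input, which makes it easier to see what each stage contributes. In this sketch the four size-and-name lines are invented, the list is cut to three entries instead of 50, and head -n 3 is used as the standard spelling of the head -3 shorthand.

```shell
# A temporary file to play the role of /tmp/bigfiles.list.
list_file=$(mktemp)

# Four fake "size name" lines stand in for the output of find.
printf '%s\n' '12 notes.txt' '4800 video.avi' '96 photo.jpg' '640 music.ogg' |
    sort -n -r |       # biggest number first
    head -n 3 |        # keep the top three, drop the rest
    cat -n |           # number the surviving lines
    tee "$list_file"   # show them and save them at the same time

echo "Saved $(wc -l < "$list_file") lines to $list_file"
```

The list appears on the terminal and in the file, with video.avi on top and notes.txt, the smallest entry, discarded by head.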


Quote me happy


You’ll recognise variables in shell scripts by the dollar sign prepended to their name. These can be read, or given values, in a very creative way thanks to all those funny quote keys scattered around your keyboard. The different quoting styles are very important in the shell, because each of them is interpreted differently. The single quotes (‘these ones’) are taken literally. Their whole content is used just as it is, as a static chunk of text. Double quotes (“here they are”) are used to perform so-called substitutions. Before using their content, the shell will scan it for variables, recognisable by the dollar sign, and substitute their current value into the quoted string.
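
The difference between the two styles is easiest to see side by side. In this minimal sketch the variable name fruit and its value are invented for the illustration; note that the dollar sign is used when reading the variable, not when assigning it.

```shell
# Give a value to a variable (no dollar sign, no spaces around '=').
fruit="apple"

# Single quotes: the text is kept exactly as written.
single='I like $fruit'

# Double quotes: $fruit is replaced by its current value.
double="I like $fruit"

echo "$single"   # prints: I like $fruit
echo "$double"   # prints: I like apple
```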

The last type of quotes (`inverse ones`, also known as backquotes) are the most powerful variety - and thus should be used with caution. Their content is considered a command to be executed. The output of that command is then used in place of the original string - try running ls $HOME in the three types of quote!
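
Here is a small sketch of backquotes in action, using date +%Y (which prints the current year) as a harmless command to run. The variable names are made up; the $(...) form shown alongside is the equivalent notation in POSIX shells such as bash, and many people find it easier to read and to nest.

```shell
# Backquotes: the command is run and its output is stored.
this_year=`date +%Y`

# The same substitution written with $(...).
same_year=$(date +%Y)

# Single quotes: nothing is run, the text is stored as it is.
literal='date +%Y'

echo "Backquotes ran the command: $this_year"
echo "So did the dollar-parentheses form: $same_year"
echo "Single quotes kept the text: $literal"
```

Because whatever sits between the backquotes really is executed, it pays to be sure what the command does before wrapping it this way.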