Passing runtime data to AWK

Shell script and AWK are very complementary languages. AWK was designed from its very beginnings at Bell Labs as a pattern-action language for short programs, ideally one or two lines long. It was intended to be used on the Unix shell interactive command line, or in shell scripts. Its feature set filled out some functionality that shell script at the time lacked, and often still lacks, as is the case with floating point numbers; it thereby (indirectly) brings much of the C language’s expressive power to the shell.

It’s therefore both common and reasonable to see AWK one-liners in shell scripts for data processing where doing the same in shell is unwieldy or impossible, especially when floating point operations or data delimiting are involved. While AWK’s full power is in general tragically underused, most shell script users and developers know about one of its most useful properties: selecting a single column from whitespace-delimited data. Sometimes, cut(1) doesn’t, uh, cut it.
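
For example, picking out just the process ID column from ps(1) output takes only a few characters of AWK, where the variable-width columns would make cut(1) awkward; this particular pipeline is only for illustration:

$ ps aux | awk 'NR > 1 {print $2}'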

In order for one language to cooperate with another usefully via embedded programs in this way, data of some sort needs to be passed between them at runtime, and here there are a few traps with syntax that may catch out unwary shell programmers. We’ll go through a simple example showing the problems, and demonstrate a few potential solutions.

Easy: Fixed data

Embedded AWK programs in shell scripts work great when you already know before runtime what you want your patterns for the pattern-action pairs to be. Suppose our company has a vendor-supplied program that returns temperature sensor data for the server room, and we want to run some commands for any and all rows registering over a certain threshold temperature. The output for the existing server-room-temps command might look like this:

$ server-room-temps
ID  Location    Temperature_C
1   hot_aisle_1 27.9
2   hot_aisle_2 30.3
3   cold_aisle_1    26.0
4   cold_aisle_2    25.2
5   outer       23.9

The task for the monitoring script is simple: get a list of all the locations where the temperature is above 28°C. If there are any such locations, we need to email the administrator the full list. Easy! It looks like every introductory AWK example you’ve ever seen—it could be straight out of the book. Let’s type it up on the shell to test it:

$ server-room-temps | awk 'NR > 1 && $3 > 28 {print $2}'
hot_aisle_2

That looks good. The script might end up looking something like this:

#!/bin/sh
alerts=/var/cache/temps/alerts
server-room-temps |
    awk 'NR > 1 && $3 > 28 {print $2}' > "$alerts" || exit
if [ -s "$alerts" ] ; then
    mail -s 'Temperature alert' sysadmin < "$alerts"
fi

So, after writing the alerts data file, we test it with [ -s ... ] to see whether it’s got any data in it. If it does, we send it all to the administrator with mail(1). Done!

We set that running every few minutes with cron(8) or systemd.timer(5), and we have a nice stop-gap solution until the lazy systems administrator gets around to fixing the Nagios server. He’s probably just off playing ADOM again…
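
A crontab(5) entry for that might look something like the below, assuming we saved the script as /usr/local/bin/temp-alert; the name and the five-minute schedule are only examples:

*/5 * * * *    /usr/local/bin/temp-alert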

Hard: runtime data

A few weeks later, our sysadmin still hasn’t got the Nagios server running, because his high elf wizard is about to hit level 50, and there’s a new request from the boss: can we adjust the script so that it accepts the cutoff temperature data as an argument, and other departments can use it? Sure, why not. Let’s mock that up, with a threshold of, let’s say, 25.5°C.

$ server-room-temps > test-data
$ threshold=25.5
$ awk 'NR > 1 && $3 > $threshold {print $2}' test-data
hot_aisle_1
hot_aisle_2

Wait, that’s not right. There are three lines with temperatures over 25.5°C, not two. Where’s cold_aisle_1?

Looking at the code more carefully, you realize that you assumed your shell variable would be accessible from within the AWK program, when of course, it isn’t; AWK’s variables are independent of shell variables. You don’t know why the hell it’s showing those two rows, though…

Maybe we need double quotes?

$ awk "NR > 1 && $3 > $threshold {print $2}" test-data
awk: cmd. line:1: NR > 1 &&  > 25.5 {print}
awk: cmd. line:1:            ^ syntax error

Hmm. Nope. Maybe we need to expand the variable inside the quotes?

$ awk 'NR > 1 && $3 > "$threshold" {print $2}' test-data
hot_aisle_1
hot_aisle_2
cold_aisle_1
cold_aisle_2
outer

That’s not right, either. It seems to have printed all the locations, as if it didn’t test the threshold at all.

Maybe it should be outside the single quotes?

$ awk 'NR > 1 && $3 > '$threshold' {print $2}' test-data
hot_aisle_1
hot_aisle_2
cold_aisle_1

The results look right now … ah, but wait, we still need to quote the expansion so that spaces in the value can’t split the command apart:

$ awk 'NR > 1 && $3 > '"$threshold"' {print $2}' test-data
hot_aisle_1
hot_aisle_2
cold_aisle_1

Cool, that works. Let’s submit it to the security team and go to lunch.

Caught out

To your surprise, the script is rejected. The security officer says you have an unescaped variable that allows arbitrary code execution. What? Where? It’s just AWK, not SQL…!

To your horror, the security officer demonstrates:

$ threshold='0;{system("echo rm -fr /*");exit}'
$ echo 'NR > 1 && $3 > '"$threshold"' {print $2}'
NR > 1 && $3 > 0;{system("echo rm -fr /*");exit} {print $2}
$ awk 'NR > 1 && $3 > '"$threshold"' {print $2}' test-data
rm -fr /bin /boot /dev /etc /home /initrd.img ...

Oh, hell… if that were installed, and someone were able to set threshold to an arbitrary value, they could execute any AWK code, and thereby shell script, that they wanted to. It’s AWK injection! How embarrassing—good thing that was never going to run as root (…right?) Back to the drawing board …

Validating the data

One approach that might come readily to mind is to ensure that no unexpected characters appear in the value. We could use a case statement before interpolating the variable into the AWK program, to check that it contains no characters outside digits and a decimal point:

case $threshold in
    *[!0-9.]*) exit 2 ;;
esac

That works just fine, and it’s appropriate to do some data validation at the opening of the script, anyway. It’s certainly better than leaving it as it was. But we learned this lesson with PHP in the 90s; you don’t just filter on characters, or slap in some backslashes—that’s missing the point. Ideally, we need to safely pass the data into the AWK process without ever parsing it as AWK code, sanitized or nay, so the situation doesn’t arise in the first place.

Environment variables

The shell and your embedded AWK program may not share the shell’s local variables, but they do share environment variables, accessible in AWK’s ENVIRON array. So, passing the threshold in as an environment variable works:

$ THRESHOLD=25.5
$ export THRESHOLD
$ awk 'NR > 1 && $3 > ENVIRON["THRESHOLD"] {print $2}' test-data
hot_aisle_1
hot_aisle_2
cold_aisle_1

Or, to be a little cleaner:

$ THRESHOLD=25.5 \
    awk 'NR > 1 && $3 > ENVIRON["THRESHOLD"] {print $2}' test-data
hot_aisle_1
hot_aisle_2
cold_aisle_1

This is already much better. AWK will parse our data only as a variable, and won’t try to execute anything within it. The only snag with this method is picking a name; make sure that you don’t overwrite another, more important environment variable, like PATH or LANG.

Another argument

Passing the data as another argument and then reading it out of the ARGV array works, too:

$ awk 'BEGIN{ARGC--} NR > 1 && $3 > ARGV[2] {print $2}' test-data 25.5

This method is also safe from arbitrary code execution, but it’s still somewhat awkward because it requires us to decrease the argument count ARGC by one so that AWK doesn’t try to process a file named “25.5” and end up upset when it’s not there. AWK arguments can mean whatever you need them to mean, but unless told otherwise, AWK generally assumes they are filenames, and will attempt to iterate through them for lines of data to chew on.

Here’s another way that’s very similar; we read the threshold from the second argument, and then blank it out in the ARGV array:

$ awk 'BEGIN{threshold=ARGV[2];ARGV[2]=""}
    NR > 1 && $3 > threshold {print $2}' test-data 25.5

AWK won’t treat the second argument as a filename, because it’s blank by the time it processes it.

Pre-assigned variables

There are two lesser-known syntaxes for passing data into AWK that allow you safely to assign variables at runtime. The first is to use the -v option:

$ awk -v threshold="$threshold" \
    'NR > 1 && $3 > threshold {print $2}' \
    test-data

Another, perhaps even more obscure, is to set them as arguments before the filename data, using the var=value syntax:

$ awk 'NR > 1 && $3 > threshold {print $2}' \
    threshold="$threshold" test-data

Note that in both cases, we still quote the $threshold expansion; this is because the shell is expanding the value before we pass it in.

The difference between these two syntaxes is when the variable assignment occurs. With -v, the assignment happens straight away, before reading any data from the input sources, as if it were in the BEGIN block of the program. With the argument form, it happens when the program’s data processing reaches that argument. The upshot of that is that you could test several files with several different temperatures in one hit, if you wanted to:

$ awk 'NR > 1 && $3 > threshold {print $2}' \
    threshold=25.5 test-data-1 threshold=26.0 test-data-2

Both of these assignment syntaxes are standardized in POSIX awk.

These are my preferred methods for passing runtime data; they require no argument count munging, avoid the possibility of trampling on existing environment variables, use AWK’s own variable and expression syntax, and most importantly, the chances of anyone reading the script being able to grasp what’s going on are higher. You can thereby avoid a mess of quoting and back-ticking that often plagues these sorts of embedded programs.
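
Putting one of these to work, the monitoring script from the start of this article could be reworked along the lines of the sketch below, reading its threshold from the first argument; the default of 28 and the argument handling are assumptions for the sake of the example, and the validation from the earlier case statement could be added too:

#!/bin/sh
# Report server room locations reading over a threshold temperature,
# given as the first argument (default 28)
threshold=${1:-28}
alerts=/var/cache/temps/alerts
server-room-temps |
    awk -v threshold="$threshold" \
        'NR > 1 && $3 > threshold {print $2}' > "$alerts" || exit
if [ -s "$alerts" ] ; then
    mail -s 'Temperature alert' sysadmin < "$alerts"
fi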

Safety not guaranteed

If you take away only one thing from this post, it might be: don’t interpolate shell variables in AWK programs, because it has the same fundamental problems as interpolating data into query strings in PHP. Pass the data in safely instead, using either environment variables, arguments, or AWK variable assignments. Keeping this principle in mind will serve you well for other embedded programs, too; stop thinking in terms of escaping and character whitelists, and start thinking in terms of passing the data safely in the first place.

Bash hostname completion

As part of its programmable completion suite, Bash includes hostname completion. This completion mode reads hostnames from a file in hosts(5) format to find possible completions matching the current word. On Unix-like operating systems, it defaults to reading the file in its usual path at /etc/hosts.

For example, given the following hosts(5) file in place at /etc/hosts:

127.0.0.1      localhost
192.0.2.1      web.example.com www
198.51.100.10  mail.example.com mx
203.0.113.52   radius.example.com rad

An appropriate call to compgen would yield this output:

$ compgen -A hostname
localhost
web.example.com
www
mail.example.com
mx
radius.example.com
rad

We could then use this to complete hostnames for network diagnostic tools like ping(8):

$ complete -A hostname ping

Typing ping we and then pressing Tab would then complete to ping web.example.com. If the shopt option hostcomplete is on, which it is by default, Bash will also attempt host completion if completing any word with an @ character in it. This can be useful for email address completion or for SSH username@hostname completion.

We could also trigger hostname completion in any other Bash command line (regardless of complete settings) with the Readline shortcut Alt+@ (i.e. Alt+Shift+2). This works even if hostcomplete is turned off.

However, with DNS so widely deployed, and with system /etc/hosts files normally so brief on internet-connected systems, this may not seem terribly useful; you’d just end up completing localhost, and (somewhat erroneously) a few IPv6 addresses that don’t begin with a digit. It may seem even less useful if you have your own set of hosts in which you’re interested, since they may not correspond to the hosts in the system’s /etc/hosts file, and you probably really do want them looked up via DNS each time, rather than maintaining static addresses for them.

There’s a simple way to make host completion much more useful by defining the HOSTFILE variable in ~/.bashrc to point to any other file containing a list of hostnames. You could, for example, create a simple file ~/.hosts in your home directory, and then include this in your ~/.bashrc:

# Use a private mock hosts(5) file for completion
HOSTFILE=$HOME/.hosts

You could then populate the ~/.hosts file with a list of hostnames in which you’re interested, which will allow you to influence hostname completion usefully without messing with your system’s DNS resolution process at all. Because of the way the Bash HOSTFILE parsing works, you don’t even have to fake an IP address as the first field; it simply scans the file for any word that doesn’t start with a digit:

# Comments with leading hashes will be excluded
external.example.com
router.example.com router
github.com
google.com
...

You can even include other files from it with an $include directive!

$include /home/tom/.hosts.home
$include /home/tom/.hosts.work

This really surprised me when reading the source, because I don’t think /etc/hosts files generally support that for their usual name resolution function. I would love to know if any systems out there actually do support this.

The behaviour of the HOSTFILE variable is a bit weird; all of the hosts from the HOSTFILE are appended to the in-memory list of completion hosts each time the HOSTFILE variable is set (not even just changed), and host completion is attempted, even if the hostnames were already in the list. It’s probably sufficient just to set the file once in ~/.bashrc.

This setup allows you to set hostname completion as the default method for all sorts of network-poking tools, falling back on the usual filename completion if nothing matches with -o default:

$ complete -A hostname -o default curl dig host netcat ping telnet

You could also use hostname completions for ssh(1), but to account for hostname aliases and other ssh_config(5) tricks, I prefer to read Host directive values from ~/.ssh/config for that.
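
As a rough sketch of that approach, a completion function along these lines could work, collecting the names given to Host directives while skipping wildcard patterns; the function name _ssh_hosts is arbitrary, and this doesn’t handle Include files or Match blocks:

# Complete ssh(1) hostnames from Host directives in ~/.ssh/config,
# skipping wildcard patterns
_ssh_hosts() {
    local hosts
    hosts=$(awk '$1 == "Host" { for (i = 2; i <= NF; i++)
        if ($i !~ /[*?]/) print $i }' "$HOME"/.ssh/config 2>/dev/null)
    COMPREPLY=($(compgen -W "$hosts" -- "${COMP_WORDS[COMP_CWORD]}"))
}
complete -F _ssh_hosts ssh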

If you have machine-readable access to the complete zone data for your home or work domain, it may even be worth periodically enumerating all of the hostnames into that file, perhaps using rndc dumpdb -zones for a BIND9 setup, or using an AXFR request. If you have a locally caching recursive nameserver, you could even periodically examine the contents of its cache for new and interesting hosts to add to the file.

Custom commands

As users grow more familiar with the feature set available to them on UNIX-like operating systems, and grow more comfortable using the command line, they will find more often that they develop their own routines for solving problems using their preferred tools, often repeatedly solving the same problem in the same way. You can usually tell if you’ve entered this stage if one or more of the below applies:

  • You repeatedly search the web for the same long commands to copy-paste.
  • You type a particular long command so often it’s gone into muscle memory, and you type it without thinking.
  • You have a text file somewhere with a list of useful commands to solve some frequently recurring problem or task, and you copy-paste from it a lot.
  • You’re keeping large amounts of history so you can search back through commands you ran weeks or months ago with ^R, to find the last time an instance of a problem came up, and getting angry when you realize it’s fallen off the end of your history file.
  • You’ve found that you prefer to run a tool like ls(1) more often with a non-default flag than without it; -l is a common example.

You can definitely accomplish a lot of work quickly by shoving the output of some monolithic program through a terse one-liner to get the information you want, or by developing muscle memory for your chosen toolbox and oft-repeated commands, but if you want to apply more discipline and automation to managing these sorts of tasks, it may be useful for you to explore more rigorously defining your own commands for use during your shell sessions, or for automation purposes.

This is consistent with the original idea of the Unix shell as a programming environment; the tools provided by the base system are intentionally very general, not prescribing how they’re used, an approach which allows the user to build and customize their own command set as appropriate for their system’s needs, even on a per-user basis.

What this all means is that you need not treat the tools available to you as holy writ. To leverage the Unix philosophy’s real power, you should consider customizing and extending the command set in ways that are useful to you, refining them as you go, and sharing those extensions and tweaks if they may be useful to others. We’ll discuss here a few methods for implementing custom commands, and where and how to apply them.

Aliases

The first step users take toward customizing the behaviour of their shell tools is often to define shell aliases in their shell’s startup file, usually specifically for interactive sessions; for Bash, this is usually ~/.bashrc.

Some aliases are so common that they’re included as commented-out suggestions in the default ~/.bashrc file for new users. For example, on Debian systems, the following alias is defined by default if the dircolors(1) tool is available for coloring ls(1) output by filetype:

alias ls='ls --color=auto'

With this defined at startup, invoking ls, with or without other arguments, will expand to run ls --color=auto, including any given arguments on the end as well.

In the same block of that file, but commented out, are suggestions for other aliases to enable coloured output for GNU versions of the dir and grep tools:

#alias dir='dir --color=auto'
#alias vdir='vdir --color=auto'

#alias grep='grep --color=auto'
#alias fgrep='fgrep --color=auto'
#alias egrep='egrep --color=auto'

Further down still, there are some suggestions for different methods of invoking ls:

#alias ll='ls -l'
#alias la='ls -A'
#alias l='ls -CF'

Uncommenting these would make ll, la, and l work as commands during an interactive session, with the appropriate options added to the call.

You can check the aliases defined in your current shell session by typing alias with no arguments:

$ alias
alias ls='ls --color=auto'

Aliases are convenient ways to add options to commands, and are very common features of ~/.bashrc files shared on the web. They also work in POSIX-conforming shells besides Bash. However, for general use, they aren’t very sophisticated. For one thing, you can’t process arguments with them:

# An attempt to write an alias that searches for a given pattern in a fixed
# file; doesn't work because aliases don't expand parameters
alias grepvim='grep "$1" ~/.vimrc'

They also don’t work for defining new commands within scripts for certain shells:

#!/bin/bash
alias ll='ls -l'
ll

When saved in a file as test, made executable, and run, this script fails:

./test: line 3: ll: command not found

So, once you understand how aliases work so you can read them when others define them in startup files, my suggestion is there’s no point writing any. Aside from some very niche evaluation tricks, they have no functional advantages over shell functions and scripts.

Functions

A more flexible method for defining custom commands for an interactive shell (or within a script) is to use a shell function. We could declare our ll function in a Bash startup file as a function instead of an alias like so:

# Shortcut to call ls(1) with the -l flag
ll() {
    command ls -l "$@"
}

Note the use of the command builtin here to specify that the ll function should invoke the program named ls, and not any function named ls. This is particularly important when writing a function wrapper around a command, to stop an infinite loop where the function calls itself indefinitely:

# Always add -q to invocations of gdb(1)
gdb() {
    command gdb -q "$@"
}

In both examples, note also the use of the "$@" expansion, to add to the final command line any arguments given to the function. We wrap it in double quotes to stop spaces and other shell metacharacters in the arguments causing problems. This means that the ll command will work correctly if you were to pass it further options and/or one or more directories as arguments:

$ ll -a
$ ll ~/.config

Shell functions declared in this way are specified by POSIX for Bourne-style shells, so they should work in your shell of choice, including Bash, dash, Korn shell, and Zsh. They can also be used within scripts, allowing you to abstract away multiple instances of similar commands to improve the clarity of your script, in much the same way the basics of functions work in general-purpose programming languages.

Functions are a good and portable way to approach adding features to your interactive shell; written carefully, they even allow you to port features you might like from other shells into your shell of choice. I’m fond of taking commands I like from Korn shell or Zsh and implementing them in Bash or POSIX shell functions, such as Zsh’s vared or its two-argument cd features.
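
As an example of the sort of thing I mean, a very rough Bash approximation of Zsh’s two-argument cd, which substitutes one part of the current path for another, might look like the sketch below; it only replaces the first match, and leaves all the error checking to cd itself:

# Zsh-style two-argument cd: "cd foo bar" changes to the current
# directory's path with the first "foo" replaced by "bar"
cd() {
    if (($# == 2)); then
        builtin cd -- "${PWD/$1/$2}"
    else
        builtin cd "$@"
    fi
}

With that defined, typing cd work play in ~/projects/work/src would move you to ~/projects/play/src, assuming that directory exists.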

If you end up writing a lot of shell functions, you should consider putting them into separate configuration subfiles to keep your shell’s primary startup file from becoming unmanageably large.

Examples from the author

You can take a look at some of the shell functions I have defined that are useful to me in general shell usage; a lot of these amount to implementing convenience features that I wish my shell had, especially for quick directory navigation, or adding options to commands.

Variables in shell functions

You can manipulate variables within shell functions, too:

# Print the filename of a path, stripping off its leading path and
# extension
fn() {
    name=$1
    name=${name##*/}
    name=${name%.*}
    printf '%s\n' "$name"
}

This works fine, but the catch is that after the function is done, the value for name will still be defined in the shell, and will overwrite whatever was in there previously:

$ printf '%s\n' "$name"
foobar
$ fn /home/you/Task_List.doc
Task_List
$ printf '%s\n' "$name"
Task_List

This may be desirable if you actually want the function to change some aspect of your current shell session, such as managing variables or changing the working directory. If you don’t want that, you will probably want to find some means of avoiding name collisions in your variables.

If your function is only for use with a shell that provides the local (Bash) or typeset (Ksh) features, you can declare the variable as local to the function to remove its global scope, to prevent this happening:

# Bash-like
fn() {
    local name
    name=$1
    name=${name##*/}
    name=${name%.*}
    printf '%s\n' "$name"
}

# Ksh-like
# Note different syntax for first line
function fn {
    typeset name
    name=$1
    name=${name##*/}
    name=${name%.*}
    printf '%s\n' "$name"
}

If you’re using a shell that lacks these features, or you want to aim for POSIX compatibility, things are a little trickier, since local function variables aren’t specified by the standard. One option is to use a subshell, so that the variables are only defined for the duration of the function:

# POSIX; note we're using plain parentheses rather than curly brackets, for
# a subshell
fn() (
    name=$1
    name=${name##*/}
    name=${name%.*}
    printf '%s\n' "$name"
)

# POSIX; alternative approach using command substitution:
fn() {
    printf '%s\n' "$(
        name=$1
        name=${name##*/}
        name=${name%.*}
        printf %s "$name"
    )"
}

This subshell method also allows you to change directory with cd within a function without changing the working directory of the user’s interactive shell, or to change shell options with set or Bash options with shopt only temporarily for the purposes of the function.
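
For example, a function like this sketch reports on a directory from inside it, but because the body is a subshell, the caller’s working directory is left alone; the function name is arbitrary:

# Show the full path and a long listing of a directory (default: the
# current one), without changing the caller's working directory
dirreport() (
    cd -- "${1:-.}" || exit
    printf '%s:\n' "$PWD"
    ls -l
)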

Another method to deal with variables is to manipulate the positional parameters directly ($1, $2 … ) with set, since they are local to the function call too:

# POSIX; using positional parameters
fn() {
    set -- "${1##*/}"
    set -- "${1%.*}"
    printf '%s\n' "$1"
}

These methods work well, and can sometimes even be combined, but they’re awkward to write, and harder to read than the modern shell versions. If you only need your functions to work with your modern shell, I recommend just using local or typeset. The Bash Guide on Greg’s Wiki has a very thorough breakdown of functions in Bash, if you want to read about this and other aspects of functions in more detail.

Keeping functions for later

As you get comfortable with defining and using functions during an interactive session, you might define them in ad-hoc ways on the command line for calling in a loop or some other similar circumstance, just to solve a task in that moment.

As an example, I recently made an ad-hoc function called monit to run a set of commands for its hostname argument that together established different types of monitoring system checks, using an existing script called nmfs:

$ monit() { nmfs "$1" Ping Y ; nmfs "$1" HTTP Y ; nmfs "$1" SNMP Y ; }
$ for host in webhost{1..10} ; do
> monit "$host"
> done

After that task was done, I realized I was likely to use the monit command interactively again, so I decided to keep it. Shell functions only last as long as the current shell, so if you want to make them permanent, you need to store their definitions somewhere in your startup files. If you’re using Bash, and you’re content to just add things to the end of your ~/.bashrc file, you could just do something like this:

$ declare -f monit >> ~/.bashrc

That would append the existing definition of monit in parseable form to your ~/.bashrc file, and the monit function would then be loaded and available to you for future interactive sessions. Later on, I ended up converting monit into a shell script, as its use wasn’t limited to just an interactive shell.
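
For what it’s worth, the script version might have looked something like this sketch, extended to take any number of hostnames; beyond the three nmfs calls from the original function, the details are invented:

#!/bin/sh
# Set up Ping, HTTP, and SNMP monitoring checks for each given host
for host in "$@" ; do
    nmfs "$host" Ping Y
    nmfs "$host" HTTP Y
    nmfs "$host" SNMP Y
done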

If you want a more robust approach to keeping functions like this for Bash permanently, I wrote a tool called Bashkeep, which allows you to quickly store functions and variables defined in your current shell into separate and appropriately-named files, including viewing and managing the list of names conveniently:

$ keep monit
$ keep
monit
$ ls ~/.bashkeep.d
monit.bash
$ keep -d monit

Scripts

Shell functions are a great way to portably customize behaviour you want for your interactive shell, but if a task isn’t specific only to an interactive shell context, you should instead consider putting it into its own script whether written in shell or not, to be invoked somewhere from your PATH. This makes the script useable in contexts besides an interactive shell with your personal configuration loaded, for example from within another script, by another user, or by an X11 session called by something like dmenu.

Even if your set of commands is only a few lines long, if you need to call it often, especially with reference to other scripts and in varying contexts, making it into a generally-available shell script has many advantages.

/usr/local/bin

Users making their own scripts often start by putting them in /usr/local/bin and making them executable with sudo chmod +x, since many Unix systems include this directory in the system PATH. If you want a script to be generally available to all users on a system, this is a reasonable approach. However, if the script is just something for your own personal use, or if you don’t have the permissions necessary to write to this system path, it may be preferable to have your own directory for logical binaries, including scripts.
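
For the system-wide case, installing a script is usually just a copy and a mode change; the script name here is only an example:

$ sudo cp temp-alert /usr/local/bin/temp-alert
$ sudo chmod +x /usr/local/bin/temp-alert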

Private bindir

Unix-like users who do this seem to vary in where they choose to put their private logical binaries directory. I’ve seen each of the below used or recommended:

  • ~/bin
  • ~/.bin
  • ~/.local/bin
  • ~/Scripts

I personally favour ~/.local/bin, but you can put your scripts wherever they best fit into your HOME directory layout. You may want to choose something that fits in well with the XDG standard, or whatever existing standard or system your distribution chooses for filesystem layout in $HOME.

In order to make this work, you will want to customize your login shell startup to include the directory in your PATH environment variable. It’s better to put this into ~/.profile or whichever file your shell runs on login, so that it’s only run once. That should be all that’s necessary, as PATH is typically exported as an environment variable for all the shell’s child processes. A line like this at the end of one of those scripts works well to extend the system PATH for our login shell:

PATH=$HOME/.local/bin:$PATH

Note that we specifically put our new path at the front of the PATH variable’s value, so that it’s the first directory searched for programs. This allows you to implement or install your own versions of programs with the same name as those in the system; this is useful, for example, if you like to experiment with building software in $HOME.

If you’re using a systemd-based GNU/Linux, and particularly if you’re using a display manager like GDM rather than a TTY login and startx for your X11 environment, you may find it more robust to instead set this variable with the appropriate systemd configuration file. Another option you may prefer on systems using PAM is to set it with pam_env(8).

After logging in, we first verify the directory is in place in the PATH variable:

$ printf '%s\n' "$PATH"
/home/tom/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games

We can test this is working correctly by placing a test script into the directory, including the #!/bin/sh shebang, and making it executable by the current user with chmod(1):

$ cat >~/.local/bin/test-private-bindir
#!/bin/sh
printf 'Working!\n'
^D
$ chmod u+x ~/.local/bin/test-private-bindir
$ test-private-bindir
Working!

Examples from the author

I publish the more generic scripts I keep in ~/.local/bin, which I keep up-to-date on my personal systems in version control using Git, along with my configuration files. Many of the scripts are very short, and are intended mostly as building blocks for other scripts in the same directory. A few examples:

  • gscr(1df): Run a set of commands on a Git repository to minimize its size.
  • fgscr(1df): Find all Git repositories in a directory tree and run gscr(1df) over them.
  • hurl(1df): Extract URLs from links in an HTML document.
  • maybe(1df): Exit with success or failure with a given probability.
  • rfcr(1df): Download and read a given Request for Comments document.
  • tot(1df): Add up a list of numbers.

For such scripts, I try to write them as much as possible to use tools specified by POSIX, so that there’s a decent chance of them working on whatever Unix-like system I need them to.

On systems I use or manage, I might specify commands to do things relevant specifically to that system, such as:

  • Filter out uninteresting lines in an Apache HTTPD logfile with awk.
  • Check whether mail has been delivered to system users in /var/mail.
  • Upgrade the Adobe Flash player in a private Firefox instance.

The tasks you need to solve both generally and specifically will almost certainly be different; this is where you can get creative with your automation and abstraction.

X windows scripts

An additional advantage worth mentioning of using scripts rather than shell functions where possible is that they can be called from environments besides shells, such as in X11 or by other scripts. You can combine this method with X11-based utilities such as dmenu(1), libnotify’s notify-send(1), or ImageMagick’s import(1) to implement custom interactive behaviour for your X windows session, without having to write your own X11-interfacing code.
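
As a sketch of the sort of thing that’s possible, a script like the one below would pop up a dmenu(1) menu of hostnames read from a file with one name per line, and open an SSH session to the chosen host in a new xterm(1) window; the filename and the terminal emulator are just assumptions for the example:

#!/bin/sh
# Pick a host with dmenu(1) and connect to it with ssh(1) in a new xterm
host=$(dmenu < "$HOME"/.dmenu-hosts) || exit
exec xterm -e ssh "$host"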

Other languages

Of course, you’re not limited to just shell scripts with this system; it might suit you to write a script completely in a language like awk(1), or even sed(1). If portability isn’t a concern for the particular script, you should use your favourite scripting language. Notably, don’t fall into the trap of implementing a script in shell for no reason …

#!/bin/sh
awk 'NF>2 && /foobar/ {print $1}' "$@"

… when you can instead write the whole script in the main language used, and save a fork(2) syscall and a layer of quoting:

#!/usr/bin/awk -f
NF>2 && /foobar/ {print $1}

Versioning and sharing

Finally, if you end up writing more than a couple of useful shell functions and scripts, you should consider versioning them with Git or a similar version control system. This also eases implementing your shell setup and scripts on other systems, and sharing them with others via publishing on GitHub. You might even go so far as to write a Makefile to install them, or manual pages for quick reference as documentation … if you’re just a little bit crazy …

Shell config subfiles

Large shell startup scripts (.bashrc, .profile) over about fifty lines or so with a lot of options, aliases, custom functions, and similar tweaks can get cumbersome to manage over time, and if you keep your dotfiles under version control it’s not terribly helpful to see large sets of commits just editing the one file when it could be more instructive if broken up into files by section.

Given that shell configuration is just shell code, we can apply the source builtin (or the . builtin for POSIX sh) to load several files at the end of a .bashrc, for example:

source ~/.bashrc.options
source ~/.bashrc.aliases
source ~/.bashrc.functions

This is a better approach, but it still binds us into using those filenames; we still have to edit the ~/.bashrc file if we want to rename them, or remove them, or add new ones.

Fortunately, UNIX-like systems have a common convention for this, the .d directory suffix, in which sections of configuration can be stored to be read by a main configuration file dynamically. In our case, we can create a new directory ~/.bashrc.d:

$ ls ~/.bashrc.d
options.bash
aliases.bash
functions.bash

With a slightly more advanced snippet at the end of ~/.bashrc, we can then load every file with the suffix .bash in this directory:

# Load any supplementary scripts
for config in "$HOME"/.bashrc.d/*.bash ; do
    source "$config"
done
unset -v config

Note that we unset the config variable after we’re done, otherwise it’ll be in the namespace of our shell where we don’t need it. You may also wish to check for the existence of the ~/.bashrc.d directory, check there’s at least one matching file inside it, or check that the file is readable before attempting to source it, depending on your preference.
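
For example, a slightly more defensive version of the same loop, checking that the directory exists and that each file is readable before sourcing it, might look like this:

# Load any supplementary scripts, if present and readable
if [ -d "$HOME"/.bashrc.d ] ; then
    for config in "$HOME"/.bashrc.d/*.bash ; do
        [ -r "$config" ] && source "$config"
    done
    unset -v config
fi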

The same method can be applied with .profile to load all scripts with the suffix .sh in ~/.profile.d, if we want to write in POSIX sh, with some slightly different syntax:

# Load any supplementary scripts
for config in "$HOME"/.profile.d/*.sh ; do
    . "$config"
done
unset -v config

Another advantage of this method is that if you have your dotfiles under version control, you can arrange to add extra snippets on a per-machine basis unversioned, without having to update your .bashrc file.

I use an implementation of the above method myself, for both .bashrc and .profile.

Prompt directory shortening

The common default of some variant of \h:\w\$ for a Bash prompt PS1 string includes the \w escape character, so that the user’s current working directory appears in the prompt, but with $HOME shortened to a tilde:

tom@sanctum:~$
tom@sanctum:~/Documents$
tom@sanctum:/usr/local/nagios$

This is normally very helpful, particularly if you leave your shell for a time and forget where you are, though of course you can always call the pwd shell builtin. However it can get annoying for very deep directory hierarchies, particularly if you’re using a smaller terminal window:

tom@sanctum:/chroot/apache/usr/local/app-library/lib/App/Library/Class$

If you’re using Bash version 4.0 or above (bash --version), you can save a bit of terminal space by setting the PROMPT_DIRTRIM variable for the shell. This limits the length of the tail end of the \w and \W expansions to that number of path elements:

tom@sanctum:/chroot/apache/usr/local/app-library/lib/App/Library/Class$ PROMPT_DIRTRIM=3
tom@sanctum:.../App/Library/Class$

This is a good thing to include in your ~/.bashrc file if you often find yourself deep in directory trees where the upper end of the hierarchy isn’t of immediate interest to you. You can remove the effect again by unsetting the variable:

tom@sanctum:.../App/Library/Class$ unset PROMPT_DIRTRIM
tom@sanctum:/chroot/apache/usr/local/app-library/lib/App/Library/Class$

Testing exit values in Bash

In Bash scripting (and shell scripting in general), we often want to check the exit value of a command to decide an action to take after it completes, likely for the purpose of error handling. For example, to determine whether a particular regular expression regex was present somewhere in a file options, we might apply grep(1) with its POSIX -q option to suppress output and just use the exit value:

grep -q regex options

An approach sometimes taken is then to test the exit value with the $? parameter, using if to check if it’s non-zero, which is not very elegant and a bit hard to read:

# Bad practice
grep -q regex options
if (($? > 0)); then
    printf '%s\n' 'myscript: Pattern not found!' >&2
    exit 1
fi

Because the if construct by design tests the exit value of commands, it’s better to test the command directly, making the expansion of $? unnecessary:

# Better
if grep -q regex options; then
    # Do nothing
    :
else
    printf '%s\n' 'myscript: Pattern not found!' >&2
    exit 1
fi

We can precede the command to be tested with ! to negate the test, which saves us from having to write an else branch at all:

# Best
if ! grep -q regex options; then
    printf '%s\n' 'myscript: Pattern not found!' >&2
    exit 1
fi

An alternative syntax is to use && and || to perform if and else tests with grouped commands between braces, but these tend to be harder to read:

# Alternative
grep -q regex options || {
    printf '%s\n' 'myscript: Pattern not found!' >&2
    exit 1
}

With this syntax, the two commands in the block are only executed if the grep(1) call exits with a non-zero status. We can apply && instead to execute commands if it does exit with zero.

That syntax can be convenient for quickly short-circuiting failures in scripts, for example due to nonexistent commands, particularly if the command being tested already outputs its own error message. This therefore cuts the script off if the given command fails, likely due to ffmpeg(1) being unavailable on the system:

hash ffmpeg || exit 1

Note that the braces for a grouped command are not needed here, as there’s only one command to be run in case of failure, the exit call.

Calls to cd are another good use case here, as running a script in the wrong directory if a call to cd fails could have really nasty effects:

cd wherever || exit 1

In general, you’ll probably only want to test $? when you have specific non-zero error conditions to catch. For example, if we were using the --max-delete option for rsync(1), we could check a call’s return value to see whether rsync(1) hit the threshold for deleted file count and write a message to a logfile appropriately:

rsync --archive --delete --max-delete=5 source destination
if (($? == 25)); then
    printf '%s\n' 'Deletion limit was reached' >"$logfile"
fi

It may be tempting to use the errexit feature in the hopes of stopping a script as soon as it encounters any error, but there are some problems with its usage that make it a bit error-prone. It’s generally more straightforward to simply write your own error handling using the methods above.

For a really thorough breakdown of dealing with conditionals in Bash, take a look at the relevant chapter of the Bash Guide.

Bash history expansion

Setting the Bash option histexpand allows some convenient typing shortcuts using Bash history expansion. The option can be set with either of these:

$ set -H
$ set -o histexpand

It’s likely that this option is already set for all interactive shells, as it’s on by default. The manual, man bash, describes these features as follows:

-H  Enable ! style history substitution. This option is on
    by default when the shell is interactive.

You may have come across this before, perhaps to your annoyance, in the following error message that comes up whenever ! is used in a double-quoted string, or without being escaped with a backslash:

$ echo "Hi, this is Tom!"
bash: !": event not found

If you don’t want the feature and thereby make ! into a normal character, it can be disabled with either of these:

$ set +H
$ set +o histexpand

History expansion is actually a very old feature of shells, having been available in csh before Bash usage became common.

This article is a good followup to Better Bash history, which among other things explains how to include dates and times in history output, as these examples do.

Basic history expansion

Perhaps the best known and most useful of these expansions is using !! to refer to the previous command. This allows repeating commands quickly, perhaps to monitor the progress of a long process, such as disk space being freed while deleting a large file:

$ rm big_file &
[1] 23608
$ du -sh .
3.9G    .
$ !!
du -sh .
3.3G    .

It can also be useful to specify the full filesystem path to programs that aren’t in your $PATH:

$ hdparm
-bash: hdparm: command not found
$ /sbin/!!
/sbin/hdparm

In each case, note that the command itself is printed as expanded, and then run to print the output on the following line.

History by absolute index

However, !! is actually a specific example of a more general form of history expansion. For example, you can supply the history item number of a specific command to repeat it, after looking it up with history:

$ history | grep expand
 3951  2012-08-16 15:58:53  set -o histexpand
$ !3951
set -o histexpand

You needn’t enter the !3951 on a line by itself; it can be included as any part of the command, for example to add a prefix like sudo:

$ sudo !3850

If you include the escape string \! as part of your Bash prompt, the history number of each command is printed in its prompt, making repeating commands by index a lot easier as long as they’re still visible on the screen.

History by relative index

It’s also possible to refer to commands relative to the current command. To substitute the second-to-last command, we can type !-2. For example, to check whether truncating a file with ed(1) worked correctly:

$ wc -l bigfile.txt
267 bigfile.txt
$ printf '%s\n' '11,$d' w | ed -s bigfile.txt
$ !-2
wc -l bigfile.txt
10 bigfile.txt

This works further back into history, with !-3, !-4, and so on.

Expanding for historical arguments

In each of the above cases, we’re substituting for the whole command line. There are also ways to get specific tokens, or words, from the command if we want that. To get the first argument of a particular command in the history, use the !^ token:

$ touch a.txt b.txt c.txt
$ ls !^
ls a.txt
a.txt

To get the last argument, use !$:

$ touch a.txt b.txt c.txt
$ ls !$
ls c.txt
c.txt

To get all arguments (but not the command itself), use !*:

$ touch a.txt b.txt c.txt
$ ls !*
ls a.txt b.txt c.txt
a.txt  b.txt  c.txt

This last one is particularly handy when performing several operations on a group of files; we could run du and wc over them to get their size and character count, and then perhaps decide to delete them based on the output:

$ du a.txt b.txt c.txt
4164    a.txt
5184    b.txt
8356    c.txt
$ wc !*
wc a.txt b.txt c.txt
16689    94038  4250112 a.txt
20749   117100  5294592 b.txt
33190   188557  8539136 c.txt
70628   399695 18083840 total
$ rm !*
rm a.txt b.txt c.txt

These work not just for the preceding command in history, but also absolute and relative command numbers:

$ history 3
 3989  2012-08-16 16:30:59  wc -l b.txt
 3990  2012-08-16 16:31:05  du -sh c.txt
 3991  2012-08-16 16:31:12  history 3
$ echo !3989^
echo -l
-l
$ echo !3990$
echo c.txt
c.txt
$ echo !-1*
echo c.txt
c.txt

More generally, you can use the syntax !n:w to refer to any specific argument in a history item by number. In this case, the first word, usually a command or builtin, is word 0:

$ history | grep bash
 4073  2012-08-16 20:24:53  man bash
$ !4073:0
man
What manual page do you want?
$ !4073:1
bash

You can even select ranges of words by separating their indices with a hyphen:

$ history | grep apt-get
 3663  2012-08-15 17:01:30  sudo apt-get install gnome
$ !3663:0-1 purge !3663:3
sudo apt-get purge gnome

You can include ^ and $ as start and endpoints for these ranges, too. 3* is a shorthand for 3-$, meaning “all arguments from the third to the last.”

Expanding history by string

You can also refer to a previous command in the history that starts with a specific string with the syntax !string:

$ !echo
echo c.txt
c.txt
$ !history
history 3
 4011  2012-08-16 16:38:28  rm a.txt b.txt c.txt
 4012  2012-08-16 16:42:48  echo c.txt
 4013  2012-08-16 16:42:51  history 3

If you want to match any part of the command line, not just the start, you can use !?string?:

$ !?bash?
man bash

Be careful when using these, if you use them at all. By default it will run the most recent command matching the string immediately, with no prompting, so it might be a problem if it doesn’t match the command you expect.

Checking history expansions before running

If you’re paranoid about this, Bash allows you to audit the command as expanded before you enter it, with the histverify option:

$ shopt -s histverify
$ !rm
$ rm a.txt b.txt c.txt

This option works for any history expansion, and may be a good choice for more cautious administrators. It’s a good thing to add to one’s .bashrc if so.

If you don’t need this set all the time, but you do have reservations at some point about running a history command, you can arrange to print the command without running it by adding a :p suffix:

$ !rm:p
rm important-file

In this instance, the command was expanded, but thankfully not actually run.

Substituting strings in history expansions

To get really in-depth, you can also perform substitutions on arbitrary commands from the history with !!:gs/pattern/replacement/. This is getting pretty baroque even for Bash, but it’s possible you may find it useful at some point:

$ !!:gs/txt/mp3/
rm a.mp3 b.mp3 c.mp3

If you only want to replace the first occurrence, you can omit the g:

$ !!:s/txt/mp3/
rm a.mp3 b.txt c.txt

Stripping leading directories or trailing files

If you want to chop a filename off a long argument to work with the directory, you can do this by adding an :h suffix, kind of like a dirname call in Perl:

$ du -sh /home/tom/work/doc.txt
$ cd !$:h
cd /home/tom/work

To do the opposite, like a basename call in Perl, use :t:

$ ls /home/tom/work/doc.txt
$ document=!$:t
document=doc.txt

Stripping extensions or base names

A bit more esoteric, but still possibly useful; to strip a file’s extension, use :r:

$ vi /home/tom/work/doc.txt
$ stripext=!$:r
stripext=/home/tom/work/doc

To do the opposite, to get only the extension, use :e:

$ vi /home/tom/work/doc.txt
$ extonly=!$:e
extonly=.txt

Quoting history

If you’re performing substitution not to execute a command or fragment but to use it as a string, it’s likely you’ll want to quote it. For example, if you’ve just found through experiment and trial and error an ideal ffmpeg command line to accomplish some task, you might want to save it for later use by writing it to a script:

$ ffmpeg -f alsa -ac 2 -i hw:0,0 -f x11grab -r 30 -s 1600x900 \
> -i :0.0+1600,0 -acodec pcm_s16le -vcodec libx264 -preset ultrafast \
> -crf 0 -threads 0 "$(date +%Y%m%d%H%M%S)".mkv 

To make sure all the escaping is done correctly, you can write the command into the file with the :q modifier:

$ echo '#!/usr/bin/env bash' >ffmpeg.sh
$ echo !ffmpeg:q >>ffmpeg.sh

In this case, this will prevent Bash from executing the command substitution "$(date ... )", instead writing it literally to the file as desired. If you build a lot of complex commands interactively that you later write to scripts once completed, this feature is really helpful and saves a lot of cutting and pasting.

Thanks to commenter Mihai Maruseac for pointing out a bug in the examples.

Avoiding the arrow keys

For shell users, moving to the arrow keys on the keyboard is something of an antipattern, moving away from the home row so central to touch typists. It therefore helps to find ways to avoid using the arrow keys in order to maintain your flow.

Bash

The arrow keys in Bash are used to move back and forth on the current command line (left and right), and up and down through the command history (up and down). This leads to the old shell user’s maxim:

“Those that don’t remember history are doomed to press Up repeatedly.”

There are alternatives to both functions. To move left or right by one character on the command line without deleting characters already placed, we can use Ctrl-B and Ctrl-F.

However, to make things a bit faster, there are four other key combinations to move back and forth on a line that are worth learning too:

  • Alt-B — Move back a word
  • Alt-F — Move forward a word
  • Ctrl-A — Move to the start of the line
  • Ctrl-E — Move to the end of the line

Similarly, to move up and down through history, we can use Ctrl-P and Ctrl-N respectively. Don’t forget that rather than keep tapping one of these, you can search backward or forward through history with Ctrl-R and Ctrl-S.

Whoops, I think I just taught you some Emacs.

Vim

To avoid the arrow keys in normal mode in Vim, use h, j, k, and l instead. This can take a little getting used to, but the benefits in terms of comfort for your hands and speed is easily worth it; using the arrow keys is one of the Vim anti-patterns.

If you’re asking “How do I avoid the arrow keys to move in insert mode?”, the answer is that you shouldn’t be moving in insert mode at all; it’s inefficient. When in insert mode you should be inserting text, and maybe using backspace or delete every now and then. To move through text, switching back to normal mode is vastly more efficient.

Moving in command-line mode is similar. If you need to move around on the command line, in most cases you should pull up the command-line window (bound to Ctrl-F by default) so that you can edit the command in normal mode, and then just tap Enter to run it.

In general, when you start thinking about moving through any kind of text, you should reflexively hit Esc or Ctrl-[ to go into normal mode, in order to take advantage of a whole keyboard’s worth of navigation shortcuts.

Bash prompts

You can tell a lot about a shell user by looking at their prompt. Most shell users will use whatever the system’s default prompt is for their entire career. Under many GNU/Linux distributions, this prompt includes the username, the hostname, and the current working directory, along with a $ sigil for regular users, and a # for root.

tom@sanctum:~$

This format works well for most users, and it’s so common that it’s often used in tutorials to represent the user’s prompt in lieu of the single-character standard $ and #. It shows up as the basis for many of the custom prompt setups you’ll find online.

Some users may have made the modest steps of perhaps adding some brackets around the prompt, or adding the time or date:

[10:11][tom@sanctum:~]$

Still others make the prompt into a grand multi-line report of their shell’s state, or indeed their entire system, full of sound and fury and signifying way too much:

-(23:38:57)-(4 users)-(0.00, 0.02, 0.05)-
-(Linux sanctum 3.2.0-3-amd64 x86_64 GNU/Linux)-
[tom@sanctum:/dev/pts/2:/home/tom]{2}!?[MAIL]$

Then there are the BSD users and other minimalist contrarians who can’t abide anything more than a single character cluttering their screens:

$

Getting your prompt to work right isn’t simply a cosmetic thing, however. If done right and in a style consistent with how you work, it can provide valuable feedback on how the system is reacting to what you’re doing with it, and also provide you with visual cues that could save you a lot of confusion — or even prevent your making costly mistakes.

I’ll be working in Bash. Many of these principles work well for Zsh as well, but Zsh users might want to read Steve Losh’s article on his prompt setup first to get an idea of the syntactical differences.

Why is this important?

One of the primary differences between terminals and windowed applications is that there tends to be much less emphasis on showing metadata about the running program and its environment. For the most part, to get information about what’s going on, you need to request it with calls like pwd, whoami, readlink, env, jobs, and echo $?.

The habit of keeping metadata out of the terminal like this unless requested may appeal to a sense of minimalism, but like a lot of things in the command line world, it’s partly the olden days holding us back. When mainframes and other dinosaurs roamed the earth, terminal screens were small, and a long or complex prompt amounted to a waste of space and cycles (or even paper). In our futuristic 64-bit world, where tiny CRTs are a thing of the past and we think nothing of throwing gigabytes of RAM at a single process, this is much less of a problem.

Customising the prompt doesn’t have many “best practices” to it like a lot of things in the shell; there isn’t really a right way to do it. Some users will insist it’s valuable to have the path to the terminal device before every prompt, which I think is crazy as I almost never need that information, and when I do, I can get it with a call to tty. Others will suggest that including the working directory makes the length of the prompt too variable and distracting, whereas I would struggle to work without it. It’s worth the effort to design a prompt that reflects your working style, and includes the information that’s consistently important to you on the system.

So, don’t feel like you’re wasting time setting up your prompt to get it just right, and don’t be so readily impressed by people with gargantuan prompts in illegible rainbow colors that they copied off some no-name wiki. If you use Bash a lot, you’ll be staring at your prompt for several hours a day. You may as well spend an hour or two to make it useful, or at least easier on the eyes.

The basics

The primary Bash prompt is stored in the variable PS1. Its main function is to signal to the user that the shell is done with its latest foreground task, and is ready to accept input. PS1 takes the form of a string that can contain escape sequences for certain common prompt elements like usernames and hostnames, and which is evaluated at printing time for escape sequences, variables, and command or function calls within it.

You can inspect your current prompt definition with echo. If you’re using GNU/Linux, it will probably look similar to this:

tom@sanctum:~$ echo $PS1
\u@\h:\w\$

PS1 works like any other Bash variable; you can assign it a new value, append to it, and apply substitutions to it. To add the text bash to the start of my prompt, I could do this:

tom@sanctum:~$ PS1=bash-"$PS1"
bash-tom@sanctum:~$

There are other prompts — PS2 is used for “continued line” prompts, such as when you start writing a for loop and continue another part of it on a new line, or terminate your previous line with a backslash to denote the next line continues on from it. By default this prompt is often a right angle bracket >:

$ for f in a b c; do
>   echo "$f"
> done

There are a couple more prompt strings which are much more rarely seen. PS3 is used by the select structure in shell scripts; PS4 is used while tracing output with set -x. Ramesh Natarajan breaks this down very well in his article on the Bash prompt. Personally, I only ever modify PS1, because I see the other prompts so rarely that the defaults work fine for me.
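
If you’re curious about PS3, a toy select loop shows where it appears; the filenames here are only illustrative:

$ PS3='Pick a file: '
$ select f in *.txt ; do printf '%s\n' "$f" ; break ; done
1) a.txt
2) b.txt
3) c.txt
Pick a file: 2
b.txt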

As PS1 is a property of interactive shells, it makes the most sense to put its definitions in $HOME/.bashrc on GNU/Linux or BSD systems. It’s likely there’s already code in there doing just that. For Mac OS X, you’ll likely need to put it into $HOME/.bash_profile instead.

Escapes

Bash will perform some substitutions for you in the first pass of its evaluation of your prompt string, replacing escape sequences with appropriate elements of information that are often useful in prompts. Below are the escape sequences as taken from the Bash man page:

\a     an ASCII bell character (07)
\d     the date in "Weekday Month Date" format (e.g., "Tue May 26")
\D{format}
       the format is passed to strftime(3) and the result is inserted into
       the prompt string; an empty format results in a locale-specific time
       representation. The braces are required
\e     an ASCII escape character (033)
\h     the hostname up to the first `.'
\H     the hostname
\j     the number of jobs currently managed by the shell
\l     the basename of the shell's terminal device name
\n     newline
\r     carriage return
\s     the name of the shell, the basename of $0 (the portion following
       the final slash)
\t     the current time in 24-hour HH:MM:SS format
\T     the current time in 12-hour HH:MM:SS format
\@     the current time in 12-hour am/pm format
\A     the current time in 24-hour HH:MM format
\u     the username of the current user
\v     the version of bash (e.g., 2.00)
\V     the release of bash, version + patch level (e.g., 2.00.0)
\w     the current working directory, with $HOME abbreviated with a tilde
       (uses the value of the PROMPT_DIRTRIM variable)
\W     the basename of the current working directory, with $HOME
       abbreviated with a tilde
\!     the history number of this command
\#     the command number of this command
\$     if the effective UID is 0, a #, otherwise a $
\nnn   the character corresponding to the octal number nnn
\\     a backslash
\[     begin a sequence of non-printing characters, which could be used to
       embed a terminal control sequence into the prompt
\]     end a sequence of non-printing characters

As an example, the default PS1 definition on many GNU/Linux systems uses four of these codes:

\u@\h:\w\$

The \! escape sequence, which expands to the history item number of the current command, is worth a look in particular. Including this in your prompt allows you to quickly refer to other commands you may have entered in your current session by history number with a ! prefix, without having to actually consult history:

$ PS1="(\!)\$"
(6705)$ echo "test"
test
(6706)$ !6705
echo "test"
test

Keep in mind that you can refer to history commands relatively rather than by absolute number. The previous command can be referenced by using !! or !-1, and the one before that by !-2, and so on, provided that you have left the histexpand option on.
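
Continuing the session above, !! recalls and re-runs whatever the previous command was:

(6707)$ echo "again"
again
(6708)$ !!
echo "again"
again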

Also of particular interest are the delimiters \[ and \]. You can use these to mark off sections of “non-printing characters”, typically terminal control sequences that move the cursor, change the text color, or alter properties of the terminal emulator such as its window title. Placing these sequences within the delimiters stops Bash from counting them as printed characters when it calculates the length of the prompt, which keeps things like deleting characters on the command line working correctly.

Colors

There are two schools of thought on the use of color in prompts, very much a matter of opinion:

  • Focus should be on the output of commands, not on the prompt, and using color in the prompt makes it distracting, particularly when output from some programs may also be in color.
  • The prompt standing out from actual program output is a good thing, and judicious use of color conveys useful information rather than being purely decorative.

If you agree more with the first opinion than the second, you can probably just skip to the next section of this article.

Printing text in color in terminals is done with escape sequences, prefixed with the \e character as above. Within one escape sequence, the foreground, background, and any styles for the characters that follow can be set.

For example, to include only your username in red text as your prompt, you might use the following PS1 definition:

PS1='\[\e[31m\]\u\[\e[0m\]'

Broken down, the opening sequence works as follows:

  • \[ — Open a series of non-printing characters, in this case, an escape sequence.
  • \e — An ASCII escape character, used to confer special meaning to the characters that follow it, normally to control the terminal rather than to enter any characters.
  • [31m — Defines a display attribute, corresponding to foreground red, for the text to follow; opened with a [ character and terminated with an m.
  • \] — Close the series of non-printing characters.

The terminating sequence is very similar, except it uses the 0 display attribute to reset the text that follows back to the terminal defaults.

The valid display attributes for 16-color terminals are:

  • Style
    • 0 — Reset all attributes
    • 1 — Bright
    • 2 — Dim
    • 4 — Underscore
    • 5 — Blink
    • 7 — Reverse
    • 8 — Hidden
  • Foreground
    • 30 — Black
    • 31 — Red
    • 32 — Green
    • 33 — Yellow
    • 34 — Blue
    • 35 — Magenta
    • 36 — Cyan
    • 37 — White
  • Background
    • 40 — Black
    • 41 — Red
    • 42 — Green
    • 43 — Yellow
    • 44 — Blue
    • 45 — Magenta
    • 46 — Cyan
    • 47 — White

Note that blink may not work on a lot of terminals, often by design, as it tends to be unduly distracting. It’ll work on the Linux console, though. Also keep in mind that for a lot of terminal emulators, by default “bright” colors are printed in bold.

More than one of these display attributes can be applied in one hit by separating them with semicolons. The below will give you underlined white text on a blue background, after first resetting any existing attributes:

\[\e[0;4;37;44m\]

It’s helpful to think of these awkward syntaxes in terms of opening and closing, rather like HTML tags: you set the attributes before a section of styled text, and reset them after it. This usually amounts to printing \[\e[0m\] after each such passage.

These sequences are annoying to read and to type, so it helps to put commonly used styles into variables:

color_red='\[\e[31m\]'
color_blue='\[\e[34m\]'
color_reset='\[\e[0m\]'

PS1=${color_red}'\u'${color_reset}'@'${color_blue}'\h'${color_reset}

A more robust method is to use the tput program to print the codes for you, and put the results into variables:

color_red=$(tput setaf 1)
color_blue=$(tput setaf 4)

This may be preferable if you use some esoteric terminal types, as it queries the terminfo or termcap databases to generate the appropriate escapes for your terminal.
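
Either way, the codes still need to be wrapped in \[ and \] when they go into the prompt string. A minimal sketch:

color_red=$(tput setaf 1)
color_reset=$(tput sgr0)

# Double quotes bake the tput output into PS1 now; the trailing \$ stays in
# single quotes so Bash still treats it as a prompt escape
PS1="\[$color_red\]\u\[$color_reset\]@\h:\w"'\$ '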

The set of eight colors — or sixteen, depending on how you look at it — can be extended to 256 colors in most modern terminal emulators with a little extra setup. The escape sequences are not very different, but their range is vastly increased to accommodate the larger palette.
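
For example, foreground colors from the extended palette use a 38;5;N sequence, and tput will generate them too if the terminal type advertises 256 colors; the color number 208 here is just an arbitrary orange:

PS1='\[\e[38;5;208m\]\u\[\e[0m\]@\h:\w\$ '

color_orange=$(tput setaf 208)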

A useful application for prompt color can be a quick visual indication of the privileges available to the current user. The following code changes the prompt to green text for a normal user, red for root, and yellow for any other user attained via sudo:

color_root=$(tput setaf 1)
color_user=$(tput setaf 2)
color_sudo=$(tput setaf 3)
color_reset=$(tput sgr0)

if (( EUID == 0 )); then
    PS1="\\[$color_root\\]$PS1\\[$color_reset\\]"
elif [[ $SUDO_USER ]]; then
    PS1="\\[$color_sudo\\]$PS1\\[$color_reset\\]"
else
    PS1="\\[$color_user\\]$PS1\\[$color_reset\\]"
fi

Otherwise, you could simply use color to make different regions of your prompt more or less visually prominent.

Variables

Provided the promptvars option is enabled, you can reference Bash variables like $? to get the return value of the previous command in your prompt string. Some people find this valuable to show error codes in the prompt when the previous command given fails:

$ shopt -s promptvars
$ PS1='$? \$'
0 $ echo "test"
test
0 $ fail
-bash: fail: command not found
127 $

The usual environment variables like USER and HOME will work too, though it’s preferable to use the escape sequences above for those particular variables. Note that it’s important to use single quotes for the assignment, so that the references are stored literally in PS1 and evaluated each time the prompt is printed, rather than once during the assignment itself.
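
To see why the quoting matters, compare the two; the double-quoted version bakes in whatever $? happened to be at assignment time:

$ PS1="$? \$ "
0 $ false
0 $ PS1='$? \$ '
0 $ false
1 $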

Commands

If the information you want in your prompt isn’t covered by any of the escape codes, you can include the output of commands in your prompt as well, using command substitution with the syntax $(command):

$ PS1='[$(uptime) ]\$ '
[ 01:21:59 up 7 days, 13:04,  7 users, load average: 0.30, 0.40, 0.36 ]$

This requires the promptvars option to be set, just as variables in the prompt do. Again, note that you need to use single quotes so that the command is run when the prompt is being formed, and not immediately as part of the assignment.

This can include pipes to format the data with tools like awk or cut, if you only need a certain part of the output:

$ PS1='[$(uptime | cut -d: -f5) ]\$ '
[ 0.36, 0.34, 0.34 ]$

For clarity, during prompt setup in .bashrc, it makes sense to use functions instead for reasonably complex commands:

prompt_load() {
    uptime | cut -d: -f5
}
PS1='[$(prompt_load) ]'$PS1

Note that as a normal part of command substitution, trailing newlines are stripped from the output of the command, so here the output of uptime appears on a single line.

A common usage of this pattern is showing metadata about the current directory, particularly if it happens to be a version control repository or working copy; you can use this syntax with functions to show the type of version control system, the current branch, and whether any changes require committing. Working with Git, Mercurial, and Subversion most often, I include the relevant logic as part of my prompt function.

When appended to my PS1 string, $(prompt_vcs) gives me prompts that look like the following when I’m in directories running under the appropriate VCS. The exclamation marks denote that there are uncommitted changes in the repositories.

[tom@conan:~/.dotfiles](git:master!)$
[tom@conan:~/Build/tmux](svn:trunk)$
[tom@conan:~/Build/vim](hg:default!)$
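
The full logic isn’t reproduced here, but a stripped-down sketch of such a function, handling only the Git case and producing output in the format shown above, might look something like this:

prompt_vcs() {
    local branch

    # Git only, as an illustration; Mercurial and Subversion need their own checks
    branch=$(git symbolic-ref --short HEAD 2>/dev/null) || return

    if [[ -n $(git status --porcelain 2>/dev/null) ]]; then
        printf '(git:%s!)' "$branch"
    else
        printf '(git:%s)' "$branch"
    fi
}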

In general, where this really shines is adding pieces to your prompt conditionally, to make them collapsible. Certain sections of your prompt therefore only show up if they’re relevant. This snippet, for example, prints the number of jobs running in the background of the current interactive shell in curly brackets, but only if the count is non-zero:

prompt_jobs() {
    local jobc
    while read -r _; do
        ((jobc++))
    done < <(jobs -p)
    if ((jobc > 0)); then
        printf '{%d}' "$jobc"
    fi
}
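
As with the other command substitutions, the call needs to stay in single quotes when it’s added to the prompt string, so that it’s re-run for every prompt; for example, to prepend it to an existing prompt:

PS1='$(prompt_jobs)'"$PS1"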

It’s important to make sure that none of what your prompt does takes too long to run; an unresponsive prompt can make your terminal sessions feel very clunky.

Note that you can also arrange to run a set of commands before the prompt is evaluated and printed, using the PROMPT_COMMAND variable. This tends to be a good place to put commands that don’t actually print anything, but that do need to be run before or after each command, such as operations working with history:

PROMPT_COMMAND='history -a'
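
For more than one command, a small function keeps things readable. This sketch appends new history lines and sets an xterm-style window title before each prompt; the title format is just an example:

prompt_command() {
    # Write any new history lines out to the history file straight away
    history -a
    # Set the window title to user@host:dir on xterm-compatible terminals
    printf '\033]0;%s@%s:%s\007' "$USER" "${HOSTNAME%%.*}" "$PWD"
}
PROMPT_COMMAND=prompt_command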

Switching

If you have a very elaborate or perhaps even computationally expensive prompt, it may occasionally be necessary to turn it off to revert to a simpler one. I like to handle this by using functions, one of which sets up my usual prompt and is run by default, and another which changes the prompt to the minimal $ or # character so often used in terminal demonstrations. Something like the below works well:

prompt_on() {
    PS1="$color_prompt"'\u@\h:\w'"$(prompt_jobs)"'\$'"${color_reset}"' '
}
prompt_off() {
    PS1='\$'
}
prompt_on

You can then switch your prompt whenever you need to by typing prompt_on and prompt_off.

This can also be very useful if you want to copy text from your terminal into documentation, or into an IM or email message; it removes your distracting prompt from the text, where it would otherwise almost certainly differ from that of the user following your instructions. This is also occasionally helpful if your prompt does not work on a particular machine, or the machine is suffering a very high load average that means your prompt is too slow to load.

Further reading

Predefined prompt strings are all over the web, but the above will hopefully enable you to dissect what they’re actually doing more easily and design your own. To take a look at some examples, the relevant page on the Arch Wiki is a great start. Ramesh Natarajan over at The Geek Stuff has a great article with some examples as well, with the curious theme of making your prompt as well-equipped as Angelina Jolie.

Finally, please feel free to share your prompt setup in the comments (whether you’re a Bash user or not). It would be particularly welcome if you explain why certain sections of your prompt are so useful to you for your particular work.

Default grep options

When you’re searching a set of version-controlled files for a string with grep, particularly if it’s a recursive search, it can get very annoying to be presented with swathes of results from the internals of the hidden version control directories like .svn or .git, or include metadata you’re unlikely to have wanted in files like .gitmodules.

GNU grep uses an environment variable named GREP_OPTIONS to define a set of options that are always applied to every call to grep. This comes in handy when exported in your .bashrc file to set a “standard” grep environment for your interactive shell. Here’s an example of a definition of GREP_OPTIONS that excludes a lot of patterns which you’d very rarely if ever want to search with grep:

GREP_OPTIONS=
for pattern in .cvs .git .hg .svn; do
    GREP_OPTIONS="$GREP_OPTIONS --exclude-dir=$pattern
done
export GREP_OPTIONS

Note that --exclude-dir is a relatively recent addition to the options for GNU grep, but it should only be missing on very legacy GNU/Linux machines by now. If you want to keep your .bashrc file compatible, you could apply a little extra hackery to make sure the option is available before you set it up to be used:

GREP_OPTIONS=
if grep --help | grep -- --exclude-dir &>/dev/null; then
    for pattern in .cvs .git .hg .svn; do
        GREP_OPTIONS="$GREP_OPTIONS --exclude-dir=$pattern"
    done
fi
export GREP_OPTIONS

Similarly, you can ignore single files with --exclude. There’s also --exclude-from=FILE if your list of excluded patterns starts getting too long.
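
For example, you might tack on something like the following; the ~/.grepignore filename is just an invented convention, not anything grep looks for by itself:

GREP_OPTIONS="$GREP_OPTIONS --exclude=*.min.js --exclude-from=$HOME/.grepignore"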

Other useful options available in GNU grep that you might wish to add to this environment variable include:

  • --color — On appropriate terminal types, highlight the pattern matches in output, among other color changes that make results more readable
  • -s — Suppresses error messages about files not existing or being unreadable; helps if you find this behaviour more annoying than useful.
  • -E, -F, or -P — Pick a favourite “mode” for grep; devotees of PCRE may find that adding -P for grep’s PCRE support makes grep behave in a much more pleasing way, even though the manual describes that support as experimental and incomplete

If you don’t want to use GREP_OPTIONS, you could instead simply set up an alias:

alias grep='grep --exclude-dir=.git'

You may actually prefer this method as it’s essentially functionally equivalent, but if you do it this way, when you want to call grep without your standard set of options, you only have to prepend a backslash to its call:

$ \grep pattern file

Commenter Andy Pearce also points out that using this method can avoid some build problems where GREP_OPTIONS would interfere.

Of course, you could solve a lot of these problems simply by using ack … but that’s another post.