Since version 3 (circa 2004), bash has had a built-in regular expression comparison operator, represented by =~. Many scripting tricks that rely on grep or sed can now be handled by bash expressions, and those expressions might just give you scripts that are easier to read and maintain. As with other comparison operators (e.g., -lt or ==), bash returns a zero if a test like [[ $digit =~ [0-9] ]] shows that the variable on the left matches the expression on the right, and a one otherwise. Leave the pattern unquoted: in bash 3.2 and later, a quoted pattern is matched as a literal string rather than as a regular expression. This example test asks whether the value of $digit contains a digit. Because the pattern is not anchored, a value such as a7 would match as well.
if [[ $digit =~ [0-9] ]]; then
    echo "$digit contains a digit"
else
    echo "oops"
fi
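To see the return values for yourself, here is a minimal sketch that wraps the test in two small functions and prints the exit status of each. The function names are made up for this illustration, and the second function assumes bash 3.2 or later, where a quoted pattern is treated as a literal string.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; neither is a bash built-in.

# Succeeds (status 0) if the argument contains at least one digit.
has_digit() {
    [[ $1 =~ [0-9] ]]
}

# The pattern is quoted here, so bash 3.2+ looks for the literal
# five-character string [0-9] instead of treating it as a regex.
has_literal_brackets() {
    [[ $1 =~ "[0-9]" ]]
}

# Print the exit status of a command: 0 for a match, 1 for no match.
status_of() {
    "$@"
    echo $?
}

status_of has_digit 7                 # prints 0
status_of has_digit abc               # prints 1
status_of has_literal_brackets 7      # prints 1
status_of has_literal_brackets '[0-9]' # prints 0
```

The third call is the one that tends to surprise people: the value is a digit, but the quoted pattern no longer means "any digit".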
If you're wondering what is meant by "regular expression", a brief explanation is in order. A regular expression is a sequence of characters that represents a pattern. For example, the [0-9] in the example above will match any single digit, while [A-Z] would match any single capital letter. [A-Z]+ would match any sequence of one or more capital letters. The expression ^[A-Z]+$, on the other hand, would match only a string that consists entirely of capital letters. Got that? ^ = the beginning of a string, $ = the end of a string and + = one or more of the preceding item.
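The difference between these patterns is easier to see when you print the text that actually matched. The sketch below uses bash's BASH_REMATCH array, whose element 0 holds the matched portion of the string. The helper name show_match is made up for this example, and LC_ALL=C is set on the assumption that you want [A-Z] to mean only the ASCII capitals.

```shell
#!/usr/bin/env bash
LC_ALL=C   # keep [A-Z] limited to ASCII capitals regardless of locale

# Hypothetical helper: print the part of the string that the pattern
# matched, or "no match" if the test fails.
show_match() {
    local string=$1 pattern=$2
    if [[ $string =~ $pattern ]]; then
        echo "${BASH_REMATCH[0]}"
    else
        echo "no match"
    fi
}

show_match "abcDEFghi" '[A-Z]'      # D        (one capital letter)
show_match "abcDEFghi" '[A-Z]+'     # DEF      (a run of capitals)
show_match "abcDEFghi" '^[A-Z]+$'   # no match (string has lowercase too)
show_match "DEF"       '^[A-Z]+$'   # DEF      (capitals from start to end)
```

Storing the pattern in a variable and using it unquoted on the right of =~ keeps it a regular expression.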
You can also check whether a reply to a prompt is numeric with similar syntax:
echo -n "Your answer> "
read REPLY
if [[ $REPLY =~ ^[0-9]+$ ]]; then
    echo Numeric
else
    echo Non-numeric
fi
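If you want to try the numeric test without typing answers at a prompt, one option is to move it into a function. This is only a sketch; classify is a name chosen for the example, and the sample values are arbitrary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply the ^[0-9]+$ test to the first argument.
classify() {
    if [[ $1 =~ ^[0-9]+$ ]]; then
        echo Numeric
    else
        echo Non-numeric
    fi
}

classify 42      # Numeric
classify 4.2     # Non-numeric (the dot is not a digit)
classify -7      # Non-numeric (the minus sign is not a digit)
classify ""      # Non-numeric (+ requires at least one digit)
```

Note that "numeric" here means an unsigned whole number. Decimals and negative values need a broader pattern.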
Bash regular expressions can get fairly complicated. In the test below, we're asking whether the value of our $email variable looks like an email address. Notice that the first expression (the account name) can contain letters, digits and some special characters. The + to the right of the first ] means that we need one or more of such characters. We then see the @ sign sitting between the username and the email domain, and a literal dot (\.) between the primary part of the domain name and the "com", "net", "gov", etc. part. The {2,4} at the end limits that final part to between two and four letters, so longer suffixes will be rejected. The comparison is then enclosed in double brackets.
#!/bin/bash

read -p "Enter email: " email
if [[ "$email" =~ ^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}$ ]]
then
    echo "This email address looks fine: $email"
else
    echo "This email address is flawed: $email"
fi
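Long patterns like this one are often easier to manage when stored in a variable. Here is a rough sketch of the same test run against a few sample addresses. The names email_re and check_email, and the addresses themselves, are invented for the illustration.

```shell
#!/usr/bin/env bash
# The same pattern as in the script above, kept in a variable.
# Single quotes preserve the backslash; the variable is then used
# unquoted so that bash treats it as a regular expression.
email_re='^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}$'

# Hypothetical helper: report whether the argument passes the test.
check_email() {
    if [[ $1 =~ $email_re ]]; then
        echo "looks fine"
    else
        echo "flawed"
    fi
}

check_email "jdoe@example.com"      # looks fine
check_email "jdoe@example"          # flawed (no dot and suffix)
check_email "jdoe@example.museum"   # flawed (suffix longer than 4 letters)
check_email "j doe@example.com"     # flawed (space in the account name)
```

The third case shows the limit of the {2,4} rule: a test like this checks the shape of an address, not whether it is deliverable.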
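Similarly, you can construct tests that determine whether the value of a variable is in the proper format for an IP address. These tests look only at the format; they do not confirm, for example, that each part of an IPv4 address is 255 or less.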
#!/bin/bash

if [ $# != 1 ]; then
    echo "Usage: $0 address"
    exit 1
else
    ip=$1
fi

if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
    echo "Looks like an IPv4 IP address"
elif [[ $ip =~ ^[A-Fa-f0-9:]+$ ]]; then
    echo "Could be an IPv6 IP address"
else
    echo "oops"
fi
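The IPv4 pattern above would accept something like 999.1.1.1, since it only counts digits. One possible way to tighten it, sketched here under the assumption that you want each part to be 255 or less, is to capture the four parts with parentheses and check them numerically. The function name valid_ipv4 is made up; BASH_REMATCH is the array bash fills with the captured groups.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the argument has four
# dot-separated numeric parts and each part is no larger than 255.
valid_ipv4() {
    local ip=$1 octet
    local re='^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$'

    [[ $ip =~ $re ]] || return 1

    # BASH_REMATCH[1] through [4] hold the four captured parts.
    for octet in "${BASH_REMATCH[@]:1}"; do
        # 10# forces base 10 so a leading zero (e.g. 010) is not
        # interpreted as octal.
        (( 10#$octet <= 255 )) || return 1
    done
    return 0
}

for candidate in 192.168.0.1 999.1.1.1 10.0.0; do
    if valid_ipv4 "$candidate"; then
        echo "$candidate: plausible IPv4 address"
    else
        echo "$candidate: not a valid IPv4 address"
    fi
done
```

Only the first of the three sample addresses passes; the second fails the range check and the third fails the format test.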
Bash also provides for some simplified looping. Want to loop 100 times? Just do something like this:
for n in {1..100}
do
    echo $n
done
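Here is a small sketch that puts such a loop to work by adding up the numbers 1 through 100. It also shows one alternative for the case where the upper limit lives in a variable: brace expansion happens before variable expansion, so {1..$limit} does not produce a sequence, and a C-style for loop is a common substitute. Both function names are invented for the example.

```shell
#!/usr/bin/env bash
# Sum 1 through 100 using a brace-expansion loop.
sum_1_to_100() {
    local total=0 n
    for n in {1..100}; do
        total=$(( total + n ))
    done
    echo "$total"
}

# Sum 1 through a limit passed as an argument. A C-style loop is used
# because {1..$limit} would not be expanded into a sequence.
sum_to() {
    local limit=$1 total=0 n
    for (( n = 1; n <= limit; n++ )); do
        total=$(( total + n ))
    done
    echo "$total"
}

sum_1_to_100   # prints 5050
sum_to 10      # prints 55
```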
And you can loop through letters or through various ranges of letters or numbers using expressions such as these. You don't have to start with 1 or a and you can move backwards through the list.
{a..z} {z..a} {c..f} {5..25} {10..-10}
Want to see how these ranges work? You can also just try expanding them with the echo command.
$ echo {a..z}
a b c d e f g h i j k l m n o p q r s t u v w x y z
$ echo {5..-1}
5 4 3 2 1 0 -1
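Because the shell expands these ranges into ordinary words before the command runs, you can use them anywhere a list of words is allowed. The sketch below stores two ranges in arrays and attaches a prefix and suffix to a third. The array names and the file names are arbitrary choices for the illustration.

```shell
#!/usr/bin/env bash
# Capture the expanded ranges in arrays.
letters=( {a..z} )
countdown=( {5..-1} )

echo "${#letters[@]}"     # 26 (number of elements)
echo "${countdown[*]}"    # 5 4 3 2 1 0 -1
echo "${#countdown[@]}"   # 7

# Text adjacent to the braces is attached to every generated item.
echo file{1..3}.txt       # file1.txt file2.txt file3.txt
```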
What a swell shell!
Read more of Sandra Henry-Stocker's Unix as a Second Language blog and follow the latest IT news at ITworld, Twitter and Facebook.