Random identity generation in Linux

If you need to generate a list of names and addresses to test an application or a script that you're working on, Linux can make that surprisingly easy. There's a command called "rig" that will create name, address and phone number listings. As far as I can tell, out of the box, it only generates U.S. addresses and area codes, though you might be able to work around that limitation.

To use the rig command, you can just type "rig" on the command line, and a single name and address will be generated. You will see something like this:

$ rig
Mavis English
1015 Tulip St
Anderson, IN  46018
(317) xxx-xxxx

To generate a list with many addresses, use the -c option and specify the number of addresses that you want to see.

$ rig -c 3
Curt Rhodes
750 Orrand Dr
Kinston, NC  28501
(919) xxx-xxxx

Glenna Sheppard
531 Buncaneer Dr
Seattle, WA  98109
(206) xxx-xxxx

Georgina Burke
840 Plinfate St
Orlando, FL  32802
(407) xxx-xxxx

You've probably noticed that the phone numbers in these identity records have an area code, but only a series of x's in place of the remaining digits. Later in this post, I'll demonstrate one way that you can get beyond this.

If, for some reason, you need only male or female names in your generated list, you can use the -m (male) or -f (female) option.

$ rig -c 3 -m                       $ rig -f -c 3
Eduardo Mathis                      Alicia Lara
183 Kennel Ln                       853 Willow Rd
Appleton, WI  54911                 Roanoke, VA  24022
(414) xxx-xxxx                      (703) xxx-xxxx

Tristan Mckee                       Mindy Romero
608 Lake Dr                         846 Burnet Dr
Miami, FL  33152                    Emporia, KS  66801
(305) xxx-xxxx                      (316) xxx-xxxx

Randy Chavez                        Ina Morris
654 Bourg St                        556 Cedarwood Ln
Spokane, WA  99210                  Passadena, CA  91109 <== oops!
(509) xxx-xxxx                      (818) xxx-xxxx

It's easy to redirect the output to a file to save it for your intended use.

$ rig -c 100 > IDs

Putting your rig command into a script might make it a little easier to use, though it doesn't add much to the command. This gen_random_IDs script prompts the user for the number of identity records to generate (unless a number is supplied as an argument) and redirects the output into a file. It appends the shell's process ID ($$) to the file name (e.g., IDs.3255) to lessen the likelihood that a file with the same name already exists.

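#!/bin/bash

# take the record count from the first argument or prompt for it
if [ $# -eq 0 ]; then
    read -r -p "number of records to generate> " num
else
    num=$1
fi

# $$ expands to this shell's process ID
rig -c "$num" > "IDs.$$"
echo "$num identity records are in the IDs.$$ file"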
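The script passes whatever the user types straight to rig, so it can be worth checking that the value is a positive whole number first. Here's a minimal sketch of one way to do that. The valid_count function is a hypothetical helper of my own naming, not something that rig or bash provides.

```shell
#!/bin/bash

# Hypothetical helper: succeed only if $1 is a positive whole number.
valid_count() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -gt 0 ]                 # digits only, but reject zero
}

# Try a few candidate values
for candidate in 25 0 abc; do
    if valid_count "$candidate"; then
        echo "'$candidate' accepted"
    else
        echo "'$candidate' rejected"
    fi
done
```

In the script, a call like valid_count "$num" could sit just before the rig command, with the script exiting on failure.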

You could also turn your rig commands into an easy bash alias:

alias genIDs='rig -c 1000 > IDs'

Adding phone numbers

If you would prefer seeing phone numbers in place of all those xxx-xxxx strings, you can do a little more work to make that happen. You can create random fictitious phone numbers to go along with your fictitious identities. In this next script, I use a built-in bash variable called RANDOM, which returns a different pseudorandom integer each time it is referenced, to create the digits needed to replace the xxx-xxxx strings that rig provides. The arithmetic applied to it is meant to ensure that we get numeric strings with exactly 3 and 4 digits.
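To see how that kind of arithmetic pins down the number of digits, here's a minimal sketch. RANDOM yields an integer from 0 to 32767, so min + RANDOM % (max - min + 1) lands somewhere from min to max inclusive. The rand_between function is a hypothetical helper named for illustration only, and because 32768 isn't an even multiple of most ranges, the results are slightly skewed toward lower values. That's fine for test data.

```shell
#!/bin/bash

# Hypothetical helper: print a random integer from min to max, inclusive.
# RANDOM yields 0..32767, so (max - min + 1) must not exceed 32768.
rand_between() {
    local min=$1 max=$2
    echo $(( min + RANDOM % (max - min + 1) ))
}

# A three-digit group (100-999) and a four-digit group (1000-9999)
echo "$(rand_between 100 999)-$(rand_between 1000 9999)"
```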

The script generates the list of identities using the rig command and then runs back through the list to replace the xxx-xxxx strings with the generated phone numbers.

#!/bin/bash

# take the record count from the first argument or prompt for it
if [ $# -eq 0 ]; then
    read -r -p "number of IDs to generate> " num
else
    num=$1
fi

# start with a fresh output file
rm -f IDs

rig -c "$num" > "IDs.$$"

while IFS= read -r line
do
  if [[ $line == *"xxx-xxxx" ]]; then
    # keep the area code, e.g. "(317)"
    areacode=${line:0:5}
    echo -n "$areacode " >> IDs
    # 100-999 for the first group, 1000-9999 for the last four digits
    echo "$((100 + RANDOM % 900))-$((1000 + RANDOM % 9000))" >> IDs
  else
    echo "$line" >> IDs
  fi
done < "IDs.$$"

# remove temp file
rm "IDs.$$"

echo "Your generated identities are in the IDs file"

In this second version of the gen_random_IDs script, the rig output is written to the IDs.$$ file, and the revised (final) identity records are written to the IDs file. Any file by that name that exists when the script is started is simply removed. You are, of course, welcome to change any of this behavior to adjust the script to your preferences.
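If you'd rather skip the temporary file altogether, the same replacement can be written as a filter that sits in a pipeline. This is a sketch of an alternative approach, not a rewrite of the script above. The fill_phones name is hypothetical, and the sample record is typed in by hand to stand in for real rig output.

```shell
#!/bin/bash

# Hypothetical filter: read rig-style records on stdin and write them to
# stdout with each trailing "xxx-xxxx" replaced by random digits.
fill_phones() {
    local line
    while IFS= read -r line; do
        case "$line" in
            *xxx-xxxx)
                # keep everything before the placeholder, e.g. "(317) "
                printf '%s%d-%d\n' "${line%xxx-xxxx}" \
                    "$((100 + RANDOM % 900))" "$((1000 + RANDOM % 9000))"
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done
}

# Sample record standing in for rig output
printf '%s\n' 'Mavis English' '1015 Tulip St' \
    'Anderson, IN  46018' '(317) xxx-xxxx' | fill_phones
```

With rig installed, a command like rig -c 100 | fill_phones > IDs should produce the finished file in one step.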

Output from that last script will look like this. Keep in mind that the phone numbers are completely random and are unlikely to resemble real phone numbers in the cities shown, though the area codes are likely OK.

$ cat IDs
Silvia Frederick
163 Shalton Dr
Beloit, WI  53511
(608) 776-7085

Mildred Joyner
116 Spring County Blvd
Albany, NY  12212
(518) 491-5250

Going international

The rig command gets the information that it provides from files in /usr/share/rig. If you want it to generate names and addresses that resemble those in another country, you might get away with replacing the content of these files. Your success will probably depend on how closely the new addresses match the format of the current content. The rig command doesn't seem to deal well with city names that have more than one word in them like "San Francisco" or "New York". It likely won't deal well with area codes that have more than one component either.

Adding data

The data files that rig uses have as many as 1,000 entries for some of the fields. The counts on my system show:

$ cd  /usr/share/rig
$ wc -l *
 1000 fnames.idx	<== 1,000 first names for women
 1000 lnames.idx	<== 1,000 last names
   61 locdata.idx	<== 61 cities and states
 1000 mnames.idx	<== 1,000 first names for men
   60 street.idx	<== 60 street names
 3121 total

That means it can generate as many as 2 million different names (2,000 first names combined with 1,000 last names). There's no reason you can't add more if you're so inclined. Just follow the format.
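If you do add entries, the arithmetic behind that figure is easy to redo from the line counts. Here's a minimal sketch, assuming each line in the name files holds exactly one name. The name_combinations function is a hypothetical helper, and the result is an upper bound, since a name that appears in both first-name files would be counted twice.

```shell
#!/bin/bash

# Hypothetical helper: estimate distinct first/last name pairs from the
# line counts of rig's name files. Assumes one name per line.
name_combinations() {
    local dir=${1:-/usr/share/rig}
    local f m l
    f=$(wc -l < "$dir/fnames.idx") || return 1
    m=$(wc -l < "$dir/mnames.idx") || return 1
    l=$(wc -l < "$dir/lnames.idx") || return 1
    echo $(( (f + m) * l ))
}

# With the counts shown above: (1000 + 1000) * 1000
echo $(( (1000 + 1000) * 1000 ))   # prints 2000000
```

Running name_combinations with no argument reads the files in /usr/share/rig, so it only works where rig's data is installed in that location.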

Copyright © 2021 IDG Communications, Inc.