Unix: How random is random?

Credit: Vladimer Shioshvili

On Unix systems, random numbers are generated in a number of ways and random data can serve many purposes. From simple commands to fairly complex processes, the question “How random is random?” is worth asking.

EZ random numbers

If all you need is a casual list of random numbers, the RANDOM variable (provided by bash and similar shells) is an easy choice. Type "echo $RANDOM" and you'll get a number between 0 and 32,767 (the largest value that a signed 16-bit integer can hold).

$ echo $RANDOM
29366
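If you need a number in a smaller range, such as a die roll, the usual trick is to take the remainder of RANDOM. The sketch below assumes bash; rand_range is a hypothetical helper name chosen for illustration, not a standard command.

```shell
#!/usr/bin/env bash
# rand_range is a hypothetical helper: print an integer from lo to hi, inclusive.
# RANDOM yields 0..32767, so the remainder maps it onto the requested range.
rand_range() {
    local lo=$1 hi=$2
    echo $(( lo + RANDOM % (hi - lo + 1) ))
}

roll=$(rand_range 1 6)
echo "Rolled a $roll"
```

Because 32,768 is not an exact multiple of 6, the faces 1 and 2 turn up very slightly more often than the others. That is fine for casual use, but it is one more reason not to rely on RANDOM for anything serious.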

Of course, this process is actually providing a "pseudo-random" number. As anyone who thinks about random numbers very often might tell you, numbers generated by a program have a limitation. Programs follow carefully crafted steps, and those steps aren’t even close to being truly random. You can choose where RANDOM's sequence starts by seeding it (i.e., setting the variable to some initial value). Some just use the current process ID (via $$) for that. Seeding does not make the values any more random, though. For any particular starting point, the subsequent values that $RANDOM provides are entirely predictable.

$ RANDOM=$$;echo $RANDOM; echo $RANDOM;echo $RANDOM
7424
28301
30566
$ RANDOM=$$;echo $RANDOM; echo $RANDOM;echo $RANDOM
7424
28301
30566
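The same check can be scripted. This is a minimal sketch, assuming bash; seeded_triplet is a hypothetical helper name, and the seed 42 is arbitrary.

```shell
#!/usr/bin/env bash
# seeded_triplet is a hypothetical helper: seed RANDOM with the given
# value, then print the next three numbers on one line.
seeded_triplet() {
    RANDOM=$1
    echo "$RANDOM $RANDOM $RANDOM"
}

first=$(seeded_triplet 42)
second=$(seeded_triplet 42)

if [ "$first" = "$second" ]; then
    echo "Same seed, same sequence: $first"
else
    echo "Sequences differ: $first vs. $second"
fi
```

The actual numbers depend on the bash version, so none are shown here. What matters is that both runs agree.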

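Because the process ID stays the same for the life of a shell, seeding with $$ gives you the same sequence every time. If you need random numbers fairly frequently, a seed that changes would work better. Here we're using the number of seconds since the Unix epoch.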

$ RANDOM=`date +%s`;echo $RANDOM;echo $RANDOM;echo $RANDOM
32077
1397
32029
$ RANDOM=`date +%s`;echo $RANDOM;echo $RANDOM;echo $RANDOM
16116
16487
11588

You can also use the shuf command to generate pseudo-random numbers. In the command below, we're generating 10 numbers between 0 and 32,767. Because shuf picks from the range without repeating a value, the 10 numbers will all be different. The shuf command should also start each sequence with a different number (no need for seeding).

$ shuf -i 0-32767 -n 10
32157
16611
24087
28301
9088
4662
12780
30518
7549
12830
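To see the difference between picking without repeats and picking with them, here is a short sketch. It assumes GNU shuf from a coreutils release recent enough to support the -r (repeat) option.

```shell
#!/usr/bin/env bash
# Without -r, shuf samples without replacement: asking for 10 values
# from a range of 10 returns every value exactly once, in random order.
distinct=$(shuf -i 0-9 -n 10 | sort -u | wc -l)
echo "distinct values: $distinct"

# With -r, each pick is independent, so values may repeat (like dice rolls).
shuf -r -i 1-6 -n 5
```

The first command always reports 10 distinct values. The second prints five die rolls, and repeats among them are normal.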

More complex random data

For more serious requirements, such as encryption, data that is closer to truly random comes into play. The /dev/random and /dev/urandom files get beyond the predictability of programming by making use of environmental noise gathered from device drivers and other system sources and storing it in an “entropy pool”.

Pseudo-random number generators (often referred to as “PRNGs”) on Unix systems make use of these two files. From the command line, these files look like this:

crw-rw-rw- 1 root root 1, 8 Jun 18 13:24 random
crw-rw-rw- 1 root root 1, 9 Jun 18 13:24 urandom

Like most, if not all, of the files in /dev, these are special files rather than regular ones. Both are character devices that report a length of zero and, like /dev/null, provide a special service that isn’t obvious from a file listing. The /dev/random and /dev/urandom files can be used to generate numbers that at least approximate random values. Random numbers are key to encrypting content so that the result is not predictable.

Using the stat command, you can get a more descriptive listing than ls provides for either of these files. Here’s the listing for /dev/urandom. Notice the zero length, the “character special file” type, and the date stamps. The device file is created each time the system boots.

$ stat /dev/urandom
  File: /dev/urandom
  Size: 0               Blocks: 0          IO Block: 4096   character special file
Device: 6h/6d   Inode: 1056        Links: 1     Device type: 1,9
Access: (0666/crw-rw-rw-)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-07-15 13:24:49.719736172 -0400
Modify: 2017-07-15 13:24:49.719736172 -0400
Change: 2017-07-15 13:24:49.719736172 -0400
 Birth: -
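A script can confirm the file type with the shell's -c test, which is true only for character special files. This is a small sketch using nothing beyond the standard test command.

```shell
#!/usr/bin/env bash
# Report whether each random device exists and is a character special file.
for dev in /dev/random /dev/urandom; do
    if [ -c "$dev" ]; then
        echo "$dev is a character device"
    else
        echo "$dev is missing or is not a character device"
    fi
done
```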

Examining entropy

To get an idea how much entropy is available on a system, you can look at the special file named “entropy_avail” (/proc/sys/kernel/random/entropy_avail, to be more precise). Note that this file lives in the /proc file system, which is not a file system like those we generally work in but one that exposes information about the kernel and running processes. The entropy_avail file will look like it’s empty, but displaying its contents tells you what you need to know.

-r--r--r-- 1 root root 0 Jul 15 16:01 entropy_avail

To see how much entropy is available in your entropy pool, you can run this command:

$ cat /proc/sys/kernel/random/entropy_avail
2684

The number shown is the kernel's estimate of the number of bits of entropy that have been collected. Even 2,684 might not seem like much in a world in which we routinely speak in terms of terabytes, but numbers above 100 are said to be a good sign. In addition, the number will change frequently. Check three times in a row, and you might see something like this.

$ cat /proc/sys/kernel/random/entropy_avail
2683
$ cat /proc/sys/kernel/random/entropy_avail
2684
$ cat /proc/sys/kernel/random/entropy_avail
2493
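If you want to check the pool from a script, something like the following would do. It is a sketch, not a monitoring tool: entropy_bits is a hypothetical helper name, the 100-bit threshold is simply the rule of thumb mentioned above, and the fallback covers systems that don't provide this /proc file.

```shell
#!/usr/bin/env bash
entropy_file=/proc/sys/kernel/random/entropy_avail

# entropy_bits is a hypothetical helper: print the kernel's entropy
# estimate in bits, or "unknown" if the file can't be read.
entropy_bits() {
    if [ -r "$entropy_file" ]; then
        cat "$entropy_file"
    else
        echo "unknown"
    fi
}

bits=$(entropy_bits)
if [ "$bits" = "unknown" ]; then
    echo "entropy estimate not available on this system"
elif [ "$bits" -lt 100 ]; then
    echo "entropy looks low: $bits bits"
else
    echo "entropy looks fine: $bits bits"
fi
```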

The two files, /dev/random and /dev/urandom, draw on the same entropy pool and work nearly the same except for one important distinction. /dev/random will block when the pool's entropy estimate runs out, and might halt a process, while /dev/urandom will never block but may keep returning data when little fresh entropy is available. (On Linux kernels 5.6 and later, /dev/random blocks only until the pool has been initialized after boot.) The /dev/urandom file appears to be the more reliable choice today.

Randomness vs. entropy

Now that the word “entropy” has entered the discussion, let’s consider the relationship of the words randomness and entropy. While tightly related, they don’t mean exactly the same thing. Entropy is a measure of the uncertainty of an outcome, usually counted in bits: a fair coin toss carries one bit of entropy, while a coin that always lands heads carries none. Randomness describes the process itself, meaning that its outcomes follow a probability distribution rather than a predictable pattern. For computer folk, the terms are often used as if they mean exactly the same thing.
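To put a number on the coin-toss idea, the sketch below computes the Shannon entropy of a single toss, H = -p*log2(p) - (1-p)*log2(1-p), where p is the probability of heads. It uses awk for the arithmetic, and coin_entropy is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# coin_entropy is a hypothetical helper: print the entropy, in bits,
# of one coin toss whose probability of heads is p (0 <= p <= 1).
coin_entropy() {
    awk -v p="$1" 'BEGIN {
        h = 0
        # Skip terms with zero probability; they contribute no entropy.
        if (p > 0) h -= p * log(p) / log(2)
        if (p < 1) h -= (1 - p) * log(1 - p) / log(2)
        printf "%.3f\n", h
    }'
}

coin_entropy 0.5   # fair coin
coin_entropy 0.9   # coin that lands heads 90% of the time
```

The fair coin comes out at 1.000 bit and the biased coin at 0.469 bits. Both coins are "random", but the biased one is easier to guess, so each toss carries less entropy.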

Generating files with random data

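You can create a file of pseudo-random data if you need one. In this command, we create a 1 GiB (1,073,741,824-byte) file called “myfile” and then examine the first line with an od command just to get a feel for what was created. Here a “line” is everything up to the first newline byte, which in this case arrived after 20 bytes. Note that the 1G size suffix is a GNU head feature.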

Creating the file:

$ head -c 1G < /dev/urandom > myfile

Looking at the file:

$ ls -l myfile
-rw-rw-r-- 1 shs shs 1073741824 Jul 14 15:10 myfile
$ head -1 myfile | od -bc
0000000 210 365 102 233 332 203 075 262 302 064 255 110 265 372 365 176
        210 365   B 233 332 203   = 262 302   4 255   H 265 372 365   ~
0000020 274 243 116 012
        274 243   N  \n
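If you just want to try this out without filling a gigabyte of disk, a scaled-down version works the same way. The sketch below writes 1,024 bytes to a temporary file created with mktemp, confirms the size, and dumps the first 16 bytes in hex.

```shell
#!/usr/bin/env bash
# Create a temporary file and fill it with 1024 bytes from /dev/urandom.
tmpfile=$(mktemp)
head -c 1024 /dev/urandom > "$tmpfile"

# Confirm the size, then show the first 16 bytes as hex.
size=$(wc -c < "$tmpfile")
echo "wrote $size bytes"
od -An -tx1 -N16 "$tmpfile"

rm -f "$tmpfile"
```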

Generating random numbers

You can use /dev/urandom to generate pseudo-random numbers on the command line like this.

$ od -vAn -N4 -tu < /dev/urandom
 2760998497

Commands like this that pull data from /dev/urandom and use od to process it can generate nearly random numbers. Run the same command numerous times and you’ll see that you get a range of numbers.

$ od -vAn -N4 -tu < /dev/urandom
  184254494
$ od -vAn -N4 -tu < /dev/urandom
 3081534763

Two of the options used with these od commands are particularly interesting. The -tu option prints the data as unsigned decimal integers. The -N option limits how many bytes od reads. So, -N4 means the number is built from four bytes. This doesn’t mean the number can’t be a small one like 12, just that four bytes are used to produce it. The largest numbers you will see will have ten digits (the maximum is 4,294,967,295). Switch to -N5 and you’ll get two numbers, one using 4 bytes and one using 1. Omit the -N option, and you’ll get a continuous stream of numbers, at least until you get tired of looking at them and hit ^C.

$ od -vAn -N5 -tu < /dev/urandom
  823515068        196
$ od -vAn -tu < /dev/urandom
 2860283906 3419549082 3207848245 2737687912
 1333710913  933348251 2572772980 1418852288
 3788708580  870673152 4083922259 1506538622
 3772099425 3296232922  692742105  818767715
 3576300418 2497391372 3756319951 1357979412
 1588018330  740469378 3140770678  958473449
  187769983  168320294  393843609 3925659647
 1320592631 3858359323 1435946222 2841928818
   99971705 1732928020 2292358742 2367929537
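The four-byte read can be wrapped in a function to get a number in whatever range you need. This is a sketch that assumes bash; urandom_between is a hypothetical helper name, and -tu4 simply makes the four-byte integer size explicit.

```shell
#!/usr/bin/env bash
# urandom_between is a hypothetical helper: print an integer from lo to hi,
# inclusive, derived from four bytes of /dev/urandom.
urandom_between() {
    local lo=$1 hi=$2 raw
    # Read 4 bytes as one unsigned 32-bit decimal number; strip od's padding.
    raw=$(od -vAn -N4 -tu4 < /dev/urandom | tr -d ' ')
    echo $(( lo + raw % (hi - lo + 1) ))
}

urandom_between 1 100
```

As with any remainder-based approach, there is a tiny bias toward the low end when the size of the range does not divide 4,294,967,296 evenly. For small ranges it is negligible.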

Do this same thing with /dev/random and you’re likely to run out of steam fairly quickly.

$ od -vAn -tu < /dev/random
  897129786 2714319998 1496103441 4272099144
   99601145 3584433910 2759928205  817917225
 1692688250 3711124362  787695563 2107932582
  427417199 3136902189 1527656210 2881971698
 3895588188 1111869233 1024834659 3486503580
 4184363003 3255228299  634631930 2477891792
^Z
[1]+  Stopped                 od -vAn -tu < /dev/random

Note that the ^Z was used to suspend the process when it hung on the command line. Also keep in mind that the entropy pool gets used up and regenerates.
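In a script, nobody is around to press ^Z, so it can help to put a time limit on reads from /dev/random. The sketch below assumes the timeout command from GNU coreutils. The 2-second limit and 16-byte request are arbitrary values for illustration.

```shell
#!/usr/bin/env bash
# Try to read 16 bytes from /dev/random, giving up after 2 seconds.
got=$(timeout 2 head -c 16 /dev/random | wc -c)

if [ "$got" -eq 16 ]; then
    echo "read all 16 bytes"
else
    echo "gave up after 2 seconds with only $got bytes"
fi
```

On a system where /dev/random does not block, this finishes immediately with all 16 bytes.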

Beyond /dev/urandom

Given the limitations of /dev/random and /dev/urandom, there are also some interesting alternatives. More than a dozen hardware random number generators (also known as “true random number generators”, often referred to by the acronym “TRNG”) are available today. In addition, “Entropy as a Service” (EaaS) is available as an option that could dramatically change the nature of randomness on systems close to you. More on this soon!

To get a quick introduction to EaaS, check out NIST’s introduction—and stay tuned for some additional insights here on NetworkWorld.
