The beauty of links on Unix servers

Symbolic and hard links provide a way to avoid duplicating data on Unix/Linux systems, but the uses and restrictions vary depending on which kind of link you choose to use. Let's look at how links can be most useful, how you can find and identify them, and what you need to consider when setting them up.

Hard vs soft links

Don't let the names fool you. The difference isn't one of malleability but of how each type of link is implemented in the file system. A soft or "symbolic" link is simply a file that points to another file. In a long listing produced by the ls command, a symbolic link is easy to spot: the first character is an "l", and the name is followed by an arrow and the path that the link points to.

lrwxrwxrwx 1 lguy lguy  5 Dec 8 2015 maybe-not -> maybe

How soft (symbolic) links work is easy to determine. If you were to look at the contents of a symbolic link, you would see very clearly that all it contains is the file system path (absolute or relative) to some other file on the system -- usually in another directory, though this won't always be the case. And, to be honest, a symbolic link could actually be set up to point to itself, though there would be little value in such a link.

Looking at the symlink listed above, note that we have to use a special command -- readlink -- to see its contents as the OS will assume we want to see the contents of the file that it refers to -- not the link itself.

$ cat maybe-not
$ od -bc maybe-not
$ readlink maybe-not

And here's another. This one points to a file in an altogether different directory. Notice that the link in this case is using an absolute pathname where the one shown earlier was relative (in the current directory).

$ ls -l AU
lrwxrwxrwx 1 ec2-user ec2-user 30 Apr 25 21:10 AU -> /opt/apps/maps/world/australia
$ readlink AU
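The difference between reading through a link and reading the link itself can be sketched as follows. This is only an illustration: the scratch directory and the file content are assumptions, and the names maybe and maybe-not are borrowed from the earlier listing.

```shell
# Hypothetical scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "some content" > maybe          # the real file
ln -s maybe maybe-not                # the link's content is the path "maybe"

via_cat=$(cat maybe-not)             # cat follows the link and reads the target
via_readlink=$(readlink maybe-not)   # readlink reports the path stored in the link

echo "cat sees:      $via_cat"
echo "readlink sees: $via_readlink"
```

The size shown for a symbolic link in a long listing is the length of the stored path, which is why maybe-not has a size of 5 and AU has a size of 30.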

Note that the order of files in the ln command is important. The command ln -s orig mycopy will create a file named mycopy that points to a file named orig.

$ ln -s orig mycopy

If we create a symbolic link that points to itself, the OS can never finish resolving it. Rather than looping forever, it gives up after following a fixed number of links and reports a "Too many levels of symbolic links" error. You'd also see this if two links pointed at each other or if a series of symbolic links eventually looped around to the first in the list.

$ ln -s onelink otherlink
$ ln -s otherlink onelink
$ ls -l *link
lrwxrwxrwx 1 lguy lguy 9 Apr 26 08:46 onelink -> otherlink
lrwxrwxrwx 1 lguy lguy 7 Apr 26 08:46 otherlink -> onelink
$ cat onelink
cat: onelink: Too many levels of symbolic links
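A minimal sketch of the same loop, run in a scratch directory (an assumption made for safety), shows that only commands that follow the link fail. readlink, which reads the link itself, still works.

```shell
# Hypothetical scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

ln -s onelink otherlink   # otherlink -> onelink (which does not exist yet)
ln -s otherlink onelink   # onelink -> otherlink, closing the loop

# Resolving either name never reaches a real file, so cat exits nonzero.
if cat onelink 2>/dev/null; then
    loop_status="resolved"
else
    loop_status="failed"
fi

# readlink does not follow the link, so it still reports the stored path.
one_target=$(readlink onelink)

echo "cat onelink: $loop_status"
echo "onelink points to: $one_target"
```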

While it might not be apparent in the example below, the file myself did not exist before the ln command was used. Symbolic links do not require that the file being pointed to exists on the system.

$ ln -s myself myself
$ cat myself
cat: myself: Too many levels of symbolic links
$ readlink myself
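The point that a symbolic link's target need not exist can be checked with the shell's file tests. In this sketch the names dangling and not-yet-here are hypothetical, as is the scratch directory.

```shell
# Hypothetical scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

ln -s not-yet-here dangling          # target does not exist; ln -s succeeds anyway

# -L tests the link itself; -e follows it, so it fails while the target is missing.
[ -L dangling ] && is_link=yes || is_link=no
[ -e dangling ] && resolves_before=yes || resolves_before=no

echo "late arrival" > not-yet-here   # create the target afterwards
[ -e dangling ] && resolves_after=yes || resolves_after=no

echo "is a link: $is_link, resolves before: $resolves_before, after: $resolves_after"
```

Once the target appears, the link starts working without being touched, because the path is only looked up when the link is used.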

Hard links, described next, come with a restriction of their own: they can generally be used only for files, not for directories. This is largely a way to prevent loops from forming in the directory tree.

Hard links are an altogether different kind of file. In fact, having both types of files called "links" can be somewhat misleading. Unlike symbolic links, hard links are not so easily identified on the command line. There's no special character that tells you that a file is a hard link. And, if you look at the content of a hard link, you won't see anything that tells you that the file is any different than most of the files you work with on a daily basis. And, frankly, it really isn't.

While a symbolic link simply points to another file, a hard link and the file that it was created to connect to are really no different from each other. When you create a hard link, you simply create a second reference to file content that resides somewhere in a file system. There is no difference between the two (or more) files that refer to the same data content other than their names or locations (or both). Think of a symbolic link as being a "pointer" to a file. Think of a hard link as file content having a presence in multiple locations or through different names.

In the command shown below, we use the -i option with ls so that the inode number is included in the listing. The inode allows the OS to find the file contents and holds the file's metadata (e.g., permissions, owner and group), though not its name, which is stored in the directory. Note that the local file (a hard link) and the file in the /opt/apps directory that it was set up to refer to both use the same inode, 1725, and both show a link count of 2.

$ ls -ldi /opt/apps/maps/world/australia/mapdata
1725 -rw-r--r-- 2 root root 3712689 Apr 25 21:13 /opt/apps/maps/world/australia/mapdata
$ ls -ldi ./mapdata
1725 -rw-r--r-- 2 root root 3712689 Apr 25 21:13 ./mapdata

Disclaimer: The example above works only if the current directory is in the same file system as /opt/apps, which is often not the case. A hard link can only be created if it refers to a file in the same file system. This is because inodes are unique to a file system. Links that cross file system boundaries must be symbolic links.
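A small sketch makes the shared inode easy to verify. It keeps both names in one scratch directory (an assumption that guarantees a single file system), and the name mapdata-orig is hypothetical.

```shell
# Hypothetical scratch directory; both names live in the same file system.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "map data" > mapdata-orig
ln mapdata-orig mapdata              # no -s: a second name for the same inode

# ls -i prints the inode number first; awk extracts that field.
inode_orig=$(ls -i mapdata-orig | awk '{print $1}')
inode_link=$(ls -i mapdata | awk '{print $1}')

# The second field of a long listing is the link count.
link_count=$(ls -l mapdata | awk '{print $2}')

echo "inodes: $inode_orig $inode_link, link count: $link_count"
```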

Creating links

Whether you are creating a symbolic or hard link on a Unix or Linux system, you use the ln command. However, to create a symbolic link, you use the -s option. To create a hard link, you don't need an option.

A command to create a symbolic link might look like this:

$ ln -s /opt/apps/maps/world/australia AU
$ ls -l AU
lrwxrwxrwx 1 map-makr map-team 30 Apr 25 21:10 AU -> /opt/apps/maps/world/australia

Whatever content is available at /opt/apps/maps/world/australia will be available through AU in the local directory as well.

A command to create a hard link might look like this:

$ ln /opt/apps/maps/world/australia/mapdata .
$ ls -l mapdata
-rw-r--r-- 2 root root 3712689 Apr 25 21:13 mapdata
$ pwd

Below the surface

Because a hard link refers to the same inode as the original file, the owner and permissions of the original file will be the same for the hard link. After all, a hard link doesn't have its own inode, so there's no place to store different information. A symbolic link, which does have its own inode, will belong to the user who creates it.
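One way to see the shared metadata is to change permissions through one name and read them back through the other. The names orig, hardlink and softlink, and the scratch directory, are assumptions made for this sketch.

```shell
# Hypothetical scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "data" > orig
ln orig hardlink
ln -s orig softlink

chmod 600 hardlink                   # change the mode through one name...

# ...and the other name reports it too, since both share one inode.
mode_orig=$(ls -l orig | cut -c1-10)
mode_hard=$(ls -l hardlink | cut -c1-10)

# ls -l lists the symlink itself, so its type character is "l".
type_soft=$(ls -l softlink | cut -c1)

echo "orig: $mode_orig  hardlink: $mode_hard  softlink type: $type_soft"
```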

Symlinks can refer to files anywhere on the system (regardless of which file system actually contains them) because they contain only a path name. Hard links, on the other hand, refer to files by their inode numbers and these are unique to each file system, so they can't reach outside their file system's limits.

Removing a hard link doesn't remove the file as long as some other name still refers to the inode. So, creating a hard link and then removing the original file basically does the same thing as moving the original file.
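The sketch below removes an original file and checks what survives. A symbolic link is included only for contrast; all names and the scratch directory are hypothetical.

```shell
# Hypothetical scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "still here" > orig
ln orig hardlink
ln -s orig softlink

rm orig                              # drops one name; the inode still has another

hard_content=$(cat hardlink)         # content is intact under the remaining name
if [ -e softlink ]; then soft_state="resolves"; else soft_state="dangling"; fi

echo "hardlink: $hard_content"
echo "softlink: $soft_state"
```

The hard link keeps the content alive, while the symbolic link is left pointing at a path that no longer exists.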

You can tell that a file has more than one presence (i.e., has had a hard link set up) by looking at the links field in a long file listing.

-rw-rw-r-- 2 lguy lguy 3712689 Sep  6  2016 copy

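Interestingly, some versions of the find command include options for finding all the names that link to the same content (i.e., all the hard links). GNU find, for example, provides -samefile and -inum. The results below are listed in the style of find's -ls output, with the shared inode number in the first column.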

414290 3628 -rw-rw-r--   2 lguy lguy  3712689 Sep  6  2016 ./copy
414290 3628 -rw-rw-r--   2 lguy lguy  3712689 Sep  6  2016 ./orig
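A minimal sketch of such a search, assuming a find that supports -samefile and -inum (GNU and BSD find do). The file named unrelated is a hypothetical addition that shows identical content is not enough to match; the names orig and copy follow the listing above.

```shell
# Hypothetical scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "payload" > orig
ln orig copy                         # hard link: second name, same inode
echo "payload" > unrelated           # same content, but its own inode

# -samefile matches every name that shares orig's inode.
same=$(find . -samefile orig | sort)

# Equivalent lookup by inode number, taken from the first field of ls -i.
inode=$(ls -i orig | awk '{print $1}')
by_inum=$(find . -inum "$inode" | sort)

echo "$same"
```

Searching by inode number is only meaningful within a single file system, since the same number can be reused on another one.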

When links are most useful

Symbolic links are most useful for avoiding complicated paths. Don't want to have to remember a path that is 73 characters long? No problem, just create a symbolic link with a short name and let it remember the path for you.

Symbolic links make it easy to make some set of files appear to exist in multiple locations without having to make separate copies.

Hard links are most useful for keeping file content in a single location -- avoiding duplication of what might be a very large amount of data.

Copyright © 2017 IDG Communications, Inc.