Infrequently Asked Questions: CD ripping, awk, and file permissions

Set up zero-click CD ripping, get reports on user accounts with awk, and get started with the fine points of directory permissions.

If you browse around any Linux forum, you'll come upon a question which should not need to be asked. For those of us who have been around a while, the question is as unanswerable as it is set. And yet still it comes.

Which Linux distribution is the best?

On the face of it, this is an easy question to answer. Then the realization hits you: what is "best"? This is where the question becomes far too open-ended, and you need to start a game of 20 Questions.

"What do you need it for?", "what hardware do you have?" and so on. It soon becomes obvious that the question really is meaningless. It would help if the questioners would mention some of this up front—and don't we all wish that they had read Eric S Raymond and Rick Moen's "How to ask questions the smart way"? To be brutally honest, it would be better if the question, without context, were not asked at all.

Let's take the example of the average desktop user, who needs email, Internet browsing, maybe a word processor and possibly the opportunity to install a game or two. Immediately we can see that it really doesn't matter which distribution is installed; they will all do this.

And that's the great thing about Linux. It can do whatever you want it to. Just install the correct software and the differences between distros come down to cosmetics: package management, hardware detection, icons. The distros themselves, with some special exceptions, are largely the same.

What would be more useful is if they could say how easily they wish to install it. Normally, at this point, all of the following will turn up: "I use Gentoo and it's easy"/"I use Slackware and it's easy"/"After installing LFS 50 times, I have no problems with it". And a holy tug-o-war with a 50-ended rope begins amongst the followers of any distribution present at the time.

It's up to you, the user, to make the choice of which one you prefer. So go to the net, search for distros (or go to Distrowatch) and start downloading. Install one, try it out for a week and then install a different one. Repeat as many times as it takes for you to settle on one you like. Then, and only then, will you really know "which Linux distro is the best".

In the end it all boils down to these questions: what are my personal preferences - do I want to be involved in what goes on on my machine, or leave the whole set-up and inner workings to someone else, with the odd little change that I need done? It's about thinking for yourself, trying something new and coming to your own conclusion.

-- XavierP

Editor's note: If you're planning to use community sources of help, such as mailing lists and web forums, you're probably going to get the most relevant, easiest-to-understand help from a user who is running the same distribution you are. So sign up for and "lurk" on your local Linux user group mailing list and on the mailing lists devoted to the major programs you're planning on installing. You'll probably soon see one or more helpful posters who are using a certain distribution. Try that one.

Save the Music

How can I rip my extensive CD collection to my hard drive so I don't have to keep changing CDs, especially while traveling?

The obvious answer is to rip the CDs to your hard drive.

Ripping CDs is a trivial task for Linux users. There are many GUI-based rippers (Grip, KAudioCreator, SoundJuicer), and there are ways to do this via the command line, which is what we will be doing.

Firstly, you need to be sure that you have the codecs you require. Tip: if you are planning to copy the tracks to your portable music device, check to see which formats it supports before ripping and encoding the tracks.

Once you have installed the required codecs, if you don't already have them (the MPlayer homepage is a good place to get them), it's then a matter of ripping and encoding the tracks. If you are planning to do this via a GUI-based tool, you may as well stop reading now; these tools are very simple to use and tend to have clearly titled options and switches.

We, however, are going to use the command line for our ripping and encoding. For copying the CD and then encoding it, you need cdparanoia and lame, plus the assorted dependencies described on their homepages. Your distro will either have these installed from the start or have them in packages suited to the distro. Otherwise, go to their websites and install from source.

As always, the man page (man cdparanoia) will give you the greatest amount of information, but for the purposes of this article it's too extensive. To begin with, run the command cdparanoia -B. This will batch copy the tracks and separate them so you don't just get one long track. The tracks will be ripped with the .wav extension. For many people, this will be enough and the tracks (plus the codecs we grabbed earlier) will be playable on the PC or laptop.

Incidentally, the cdparanoia -B command will rip every track on the CD by default. If you don't want this, you will need to specify the tracks. For example, cdparanoia -B 5 will rip track 5 and cdparanoia -B 3-6 will rip tracks 3 to 6.

As with cdparanoia, lame has a large number of switches that can be used. The man page (man lame), again, will detail these for you. For the purposes of this article, we will convert the .wav to .mp3 format. This is because it is one of the more popular formats and will also play on most portable devices.

Lame can do this conversion very easily. The command is lame -b 192 track1.wav track1.mp3. This tells lame to convert the file using a bitrate of 192 kbps. You can experiment with different bitrates to improve the sound, but 192 should suffice in most cases.
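To encode a whole CD rather than one track, the lame command can be wrapped in a loop. The following is only a sketch: encode_tracks is a made-up helper name, and it assumes the .wav files from cdparanoia -B are in the current directory (cdparanoia normally names them along the lines of track01.cdda.wav, which the track*.wav pattern also matches).

```shell
# Sketch: encode every ripped WAV file in the current directory to MP3.
# encode_tracks is a hypothetical helper; the bitrate defaults to 192 kbps
# only because that is the value used in the text.
encode_tracks() {
    bitrate=${1:-192}
    for wav in track*.wav; do
        # If nothing matched, the shell hands back the pattern itself; skip it.
        [ -e "$wav" ] || continue
        # Strip the trailing .wav and add .mp3 for the output name.
        lame -b "$bitrate" "$wav" "${wav%.wav}.mp3" || return 1
    done
}

# Typical use, with an audio CD in the drive:
#   cdparanoia -B && encode_tracks 192
```

Stopping at the first failed track (the || return 1) is a design choice; drop it if you would rather carry on with the remaining tracks.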

If you want to take this a little further and really replicate the GUI experience on the CLI, you could install libcddb and run the cddb_query tool. This gets the CDDB information for your CD (if it is known to the database).

Your tracks are now ready to be played or copied to your music device.

-- XavierP

Save the Music, Part 2

I haven't used my server's CDROM drive since I installed Linux. Can I set it up to rip CDs unattended?

Sure. Here's a shell function that you can add to your .bashrc—just ssh to your server and type rip-cds, and it will rip CDs until you log out. Since it ejects each CD when done, just walk by the server every once in a while to put in a new CD and close the drawer.

  rip-cds() {
    while true; do
        if setcd -i | grep 'CD tray is open' > /dev/null ; then
          echo "Please insert the next CD"
          sleep 1
        else  # tray is closed: rip, encode and eject
          abcde -xN
        fi
    done
  }

This shell function uses "abcde", a handy wrapper for cdparanoia and your encoding tool of choice that also searches the Internet for the CD and track titles. The -x option ejects the CD when you're done, and the -N option puts abcde into non-interactive mode.

Thanks to Nick Moffitt for the script this is based on. See the man page for abcde for more hints on using it to organize your music library just the way you want.
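As one example of the kind of tuning the man page describes, abcde reads its settings from a configuration file written as shell variable assignments. The fragment below is a sketch of a per-user ~/.abcde.conf, not a recommended setup: the output directory and the naming layout are assumptions chosen for illustration, so check each variable against the man page for your version.

```shell
# Hypothetical ~/.abcde.conf; values are illustrative assumptions.
OUTPUTTYPE=mp3                  # encode to MP3 (needs lame installed)
OUTPUTDIR="$HOME/music"         # assumed location of the music library
# One directory per artist and album, files named by track number and title.
# Single quotes leave the ${...} placeholders for abcde to fill in per track.
OUTPUTFORMAT='${ARTISTFILE}/${ALBUMFILE}/${TRACKNUM}.${TRACKFILE}'
PADTRACKS=y                     # 01, 02, ... so the files sort in order
EJECTCD=y                       # eject when done, like -x
INTERACTIVE=n                   # never prompt, like -N
```

With EJECTCD and INTERACTIVE set this way, a plain abcde should behave like abcde -xN.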

-- Don Marti

Intro to awk

Which command can I use if I want to see a list of all users in Linux? The command "who" tells who is online now, logged in, but I need the command which will show all registered users.

Sometimes we require text processing that is slightly too complex to be practical using grep, sed and cut, and not quite troublesome enough to go and try Perl (nothing wrong with Perl, honest!).

Enter awk, a small and quick tool: it has less memory overhead than Perl or several processes chained together via pipes, and it is faster to load from disk. Perl can do everything awk does; the makers of Perl knew awk, and originally wrote Perl for the situations where awk didn't quite cut it. Awk is valuable nonetheless.

An example of plain usage where it replaces a combination of grep and cut (plus some numeric comparison) is in this thread. It's used to process the /etc/passwd file there, which is a very plain thing to use it for. This command prints the username and the user's full name for each user whose uid is greater than 99. (On the questioner's distribution, uids 99 and below are reserved for the system.)

awk -F: '$3 >99 {print $1" "$5}' /etc/passwd
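To try the one-liner without touching the real /etc/passwd, it can be run against a small sample. The entries below are invented, and list_users is a hypothetical wrapper that makes the file and the lowest "real user" uid adjustable, since the cut-off differs between distributions.

```shell
# Made-up sample in /etc/passwd format:
# name:password:uid:gid:full name:home:shell
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:Super User:/root:/bin/bash
daemon:x:2:2:System Daemon:/sbin:/sbin/nologin
tink:x:1000:100:Tink Example:/home/tink:/bin/bash
guest:x:1001:100:Guest Account:/home/guest:/bin/bash
EOF

# list_users FILE [MIN_UID]: print login name and full name for every
# account whose uid (field 3) is at least MIN_UID (default 100, which is
# the same test as "$3 > 99").
list_users() {
    awk -F: -v min="${2:-100}" '$3 >= min {print $1" "$5}' "$1"
}

list_users "$sample"
rm -f "$sample"
```

On the sample this prints the tink and guest entries and skips root and daemon.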

This thread covers a more complex query: delete lines in file A if they have a corresponding entry in file B.

Basically what happens is that the names in file B are put into an array, and then we check the content of that array against every line of file A and print it if it's not in the array.
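The thread's own code is not reproduced here, so the following is only a sketch of the array approach just described. It assumes that lines "correspond" when their first whitespace-separated field is the same; not_in_b is a made-up name.

```shell
# not_in_b FILE_A FILE_B: print the lines of FILE_A whose first field does
# not occur as a first field anywhere in FILE_B.
not_in_b() {
    awk '
        # While reading the first file named (FILE_B), NR equals FNR:
        # remember each name as an array key and move on.
        NR == FNR { seen[$1]; next }
        # For the second file (FILE_A): print lines whose name is unknown.
        !($1 in seen)
    ' "$2" "$1"
}

# Example with throwaway files:
a=$(mktemp); b=$(mktemp)
printf '%s\n' 'alice 1' 'bob 2' 'carol 3' > "$a"
printf '%s\n' 'bob' > "$b"
not_in_b "$a" "$b"
rm -f "$a" "$b"
```

One caveat of the NR == FNR idiom: if file B is empty, awk cannot tell the two files apart and nothing is printed.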

Something else that people who make good use of their .bash_history do is remove duplicate lines from time to time. (And no, I don't mean the HISTCONTROL=ignoredups type of thing, or stuff you NEVER want in your history, like ls, which HISTIGNORE="*PROMPT*:cd*:ls *" will take care of.) I mean when you do the same thing five or six times, interrupted by something else. If you don't use bash's timestamp feature, this little thing will do just that, without destroying the order in which things appeared (unlike sort -u). It prints the cleaned-up history to standard output.

awk '!x[$0]++' ~/.bash_history

What's happening? $0 refers to the current line, and x[$0] looks the line up in an associative array. The value stored there is the number of times the line has been seen so far: 0 (an unset element) the first time, a positive number after that. ! turns that number into its logical complement, so the expression is true only on the first occurrence, and awk's default action is executed (the record gets printed). The ++ then bumps the count for that line. Quick and easy.
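Spelled out the long way, the same logic looks like this. The sketch uses a hypothetical function name, dedupe_keep_order, and a few invented history lines; it is meant to show the mechanics, not to replace the one-liner.

```shell
# Long-hand equivalent of awk '!x[$0]++'.
dedupe_keep_order() {
    awk '{
        # x[$0] is how many times this exact line has been seen before now.
        if (x[$0] == 0) print   # first sighting: print it
        x[$0]++                 # then bump the count
    }' "$@"
}

printf '%s\n' 'make' 'ls -l' 'make' 'vi notes.txt' 'ls -l' | dedupe_keep_order
```

The three distinct commands come out in the order they first appeared: make, ls -l, vi notes.txt.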

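If you happen to be using the bash time-stamps, a little modification will make it work for you, too. Note that this version relies on a regular-expression record separator, which GNU awk supports but POSIX awk does not require, and that the timestamp lines themselves are left out of the output.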

awk 'BEGIN{RS="#[0-9]{10}"}!x[$0]++' ~/.bash_history

Another nice use is to "grep" for more than one pattern. With the real grep, this takes several invocations joined by pipes if the order of the patterns isn't fixed and a single regex can't catch them. With awk it becomes something like this, as seen in this thread:

awk '/pattern1/ && /pattern2/' file

This is actually enough, since the default action for awk if none is explicitly given, is to print if the condition evaluates to true.
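A quick way to see the order-independence is to run it on a few invented log lines. both_patterns below is a hypothetical wrapper that takes the two patterns as arguments; they are passed with -v, so backslashes in a pattern would need doubling.

```shell
# both_patterns REGEX1 REGEX2 FILE: print lines matching both regexes,
# whichever comes first on the line.
both_patterns() {
    awk -v a="$1" -v b="$2" '$0 ~ a && $0 ~ b' "$3"
}

log=$(mktemp)
printf '%s\n' 'disk error on sda' 'sda mounted' 'error: disk sdb full' > "$log"
both_patterns 'error' 'disk' "$log"
rm -f "$log"
```

The first and third lines are printed, even though "error" and "disk" appear in opposite orders; grep error file | grep disk would give the same result with two processes.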

-- Tinkster

What does "execute" permission mean for a directory?

Unix/Linux file permissions are relatively simple, but there are some traps and pitfalls for the newcomer to systems administration. True, more and more Linux machines aren't actually multi-user systems but desktop machines that serve only one user, so there's not THAT much to worry about anymore. Or is there?

The permissions consist of three basic attributes: read, write and execute. (We will cover three others, sticky, setuid and setgid, later.) Their meaning differs depending on whether they are applied to a file or a directory.

Write access to a directory means that you get the right to create a new file in that directory; in that sense a directory works like a file. But what about execute? For a directory, the x attribute allows you to change into that directory, as with the cd command, and to reach the files inside it. Read access lets you list the names it contains. So how is listing its content different from reading it? The information about the files, such as ownership, size, timestamp and the like, depends on the 'x' bit.

Let's get this straight: if the permissions on a directory called "forbidden" are 'drw-------', the owner can't cd into it, but can still get the names of the files via 'ls -l forbidden'; ls will tell you which files you're not allowed to look at. If the permissions are 'd-wx------', you can cd into it, but you get "Permission denied" from ls, and that's that.

$ chmod u-r forbidden
$ ls -ld forbidden/
d-wxr--r--  2 tink   users 384 2006-08-16 10:39 forbidden/
$ ls -l forbidden/
ls: forbidden/: Permission denied

$ chmod u+r,u-x forbidden/
$ ls -ld forbidden/
drw-r--r--  2 tink   users 384 2006-08-16 10:39 forbidden/
$ ls -l forbidden/
ls: forbidden/ Permission denied
ls: forbidden/maillog: Permission denied
ls: forbidden/test.awk: Permission denied
ls: forbidden/ Permission denied
ls: forbidden/ls.php: Permission denied
total 0

One implication of this is that if you have execute but not read rights on a directory, you can still read the content of a file in that directory if you know its name.
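This is easy to try in a scratch directory. The sketch below uses invented names (search_only_demo, known.txt) and works in a temporary directory, so it does not touch any real data. Run it as an ordinary user: root bypasses these permission checks.

```shell
search_only_demo() {
    demo=$(mktemp -d)
    mkdir "$demo/forbidden"
    echo "hello" > "$demo/forbidden/known.txt"
    chmod 300 "$demo/forbidden"               # owner keeps w and x, loses r

    ls -ld "$demo/forbidden" | cut -c1-10     # mode string: d-wx------
    cat "$demo/forbidden/known.txt"           # works, because the name is known
    # Listing needs r: an ordinary user gets "Permission denied" here.
    # (root bypasses these checks and would see the listing.)
    ls "$demo/forbidden" >/dev/null 2>&1 || echo "cannot list forbidden/"

    chmod 700 "$demo/forbidden"               # restore access so cleanup works
    rm -rf "$demo"
}

search_only_demo
```

The file's own permissions still apply, of course; the directory's x bit only gets you as far as the file.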

It's also important to know that permissions are evaluated in the order of user, then group, then other, and that the first one that matches will be used. So if you see

d---rwx---  2 tink   users 384 2006-08-16 10:39 forbidden/

and you happen to be tink AND a member of the group users, you won't be able to use that directory even though you own it (until, of course, you chmod it, which you can since you're the owner).

Then there's the sticky bit and the SUID and SGID bits. Sticky is only relevant for directories; it means that only the owner of a file (plus the owner of the directory and root, of course) has the right to delete it.

$ ls -ld /tmp/ 
drwxrwxrwt 122 root root 10368 2006-08-16 11:03 /tmp/

Even though /tmp is world-writable, not everyone can delete other people's files. Note the 't' in the access rights on /tmp? That's the sticky bit. Setuid and setgid on an executable mean that it is run as the user or group that owns the file, so use them with caution! For directories the setuid bit means nothing, but setgid means that files created in that directory will get the directory's owning group as their group owner, rather than the default group of the user who creates them.
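For reference, here is how these bits are set with chmod and how they show up in a listing. This is a sketch in a scratch directory; the directory names and the show_mode helper are made up, and the mode strings in the comments are what a typical Linux system prints.

```shell
# show_mode PATH: print just the ten-character mode string from ls -ld.
show_mode() { ls -ld "$1" | cut -c1-10; }

work=$(mktemp -d)
mkdir "$work/shared" "$work/dropbox"

chmod 2775 "$work/shared"     # leading 2 = setgid, then rwxrwxr-x
chmod 1777 "$work/dropbox"    # leading 1 = sticky, then rwxrwxrwx (like /tmp)

show_mode "$work/shared"      # drwxrwsr-x  (s in the group triplet)
show_mode "$work/dropbox"     # drwxrwxrwt  (t at the very end)

rm -rf "$work"
```

The symbolic forms chmod g+s and chmod +t set the same bits without touching the rest of the mode.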

Now, why do we keep harping on about these things when your machine is only being used by ONE user anyway? Because of the single most dangerous habit of newcomers: using the system as root at all times. It can't be emphasized enough that not running as root protects you not only from your own fat fingers but also from an easy exploit via buggy software you may be using.

-- Tinkster

Learn more about this topic

abcde CD ripping tool


The GNU Awk User's Guide

This story, "Infrequently Asked Questions: CD ripping, awk, and file permissions" was originally published by LinuxWorld-(US).

Copyright © 2006 IDG Communications, Inc.