Unix: How to select every 1,000th line from a file

Log files on Unix systems can easily grow to hundreds of thousands or even millions of lines. Here's a simple way to pluck out every Nth line.

Head and tail are great commands when you want to look only at the beginning or the end of a file. Getting a feel for how the lines in a file change over time, on the other hand, can take a lot of time if you've got to scan through thousands of lines. What if you could look at every 100th, 1,000th or 10,000th line? That's surprisingly easy with one particular sed command, and you can modify the command to change the sampling interval. The command for picking out every 1,000th line is sed -n '0~1000p'. Note that the first~step address form used here is a GNU sed extension, so it works on most Linux systems but may not be available in other versions of sed. Changing "1000" to a smaller number displays lines more often; changing it to a larger number displays them less often.

$ sed -n '0~1000p' /var/log/syslog

This command will display every 1,000th line. The -n tells sed not to display every line it encounters. In other words, it suppresses the automatic display of lines. The '0~1000p' argument tells it to select every 1,000th line from the target file and print it (that's what the trailing p does). The 0 tells it to start with line 0 which, of course, doesn't exist, so the first line printed is line 1000. The 1000 is the step: how many lines to advance before printing again. Using a sed command like this, you can also display every 10th, 100th, etc. line of piped input. The last command piped to sed, shown below, will display every 250th login record from the /var/log/wtmp file.

$ last | sed -n '0~250p'
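If you want to see exactly which lines a first~step address picks before pointing it at a real log, one option is to try it on generated input. The sketch below assumes GNU sed and uses seq (not part of the original commands) to produce 50 lines whose content matches their line numbers.

```shell
# Each generated line contains its own line number, so the output
# shows directly which line numbers the address selects.

# Start at (nonexistent) line 0, step 10: prints 10 20 30 40 50
every_tenth=$(seq 1 50 | sed -n '0~10p')
echo $every_tenth

# Start at line 1, step 10: prints 1 11 21 31 41
from_first=$(seq 1 50 | sed -n '1~10p')
echo $from_first
```

The difference between the two runs shows why the examples here start at 0: it makes the selected lines fall on even multiples of the step.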

You don't have to start at the beginning of the file if you don't want to. In the command below, you would start with line 500 and then print every 25th line from that point on:

$ sed -n '500~25p' /var/log/syslog
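On systems where sed doesn't support the first~step address, awk can do the same selection with a bit of arithmetic on NR, its current line number. This is a substitute technique, not the sed command described here, and the function name every_nth is made up for illustration. It assumes a step of 1 or more.

```shell
# every_nth START STEP: print line START, then every STEP-th line after it.
# With START=0 this prints lines STEP, 2*STEP, ... like sed -n '0~STEPp'.
every_nth() {
    awk -v start="$1" -v step="$2" \
        'NR >= start && (NR - start) % step == 0'
}

# Same selection as sed -n '0~10p': prints 10 20 30 40 50
seq 1 50 | every_nth 0 10

# Same selection as sed -n '20~15p': prints 20 35 50
seq 1 50 | every_nth 20 15
```

For the command above, the equivalent call would be every_nth 500 25 < /var/log/syslog.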

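The numbers you select are up to you. Note that sed doesn't add line numbers on its own. If you number the lines first (for example, by piping the input through cat -n before it reaches sed), each selected line will be displayed with its line number. You can use those numbers to verify that your command is working as expected before you put it into use, and to give you an idea of where you are in the file or command output you are examining.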

   500  shs      pts/4        2013-08-03 12:24 (
   525  shs      pts/4        2013-08-21 12:37 (pool-123-45-67-890.bltmmd.fios.verizon.net)
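Output in this numbered form can be produced by numbering the lines before sampling them. The sketch below is one way to do that, assuming GNU sed and cat -n; it is not necessarily how the listing above was generated. It uses a six-line sample so the result is easy to check by eye.

```shell
# Number the lines first, then sample. Each number is part of the line
# by the time sed sees it, so it survives the selection.
sample=$(printf 'a\nb\nc\nd\ne\nf\n' | cat -n | sed -n '0~2p')
echo "$sample"
# Lines 2, 4 and 6 are shown, each prefixed with its line number.
```

Applied to the login example, the same idea would look like last | cat -n | sed -n '500~25p'.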

This is a useful sed command for scanning files at whatever granularity works for you.

Read more of Sandra Henry-Stocker's Unix as a Second Language blog and follow the latest IT news at ITworld, Twitter and Facebook.
