The T-Files


Wed, 19 May 2004

The thousand and one reasons to love Perl: [6] Command line mode

Another great thing about Perl is that it can be used for a wide range of programming tasks. On one end of the spectrum, you can write complex applications with thousands of lines of code spread among dozens of files. On the opposite end, Perl can also be used for small shell scripts, or even directly from the command line (for ad hoc tasks). Here is a real-world example.

The first rule of content management: If you have more than five pages to maintain, use some sort of content management system.

Suppose you have a web site with sixty static HTML pages. You chose to ignore the first rule of content management, and the only way to edit the pages is to, well, edit the pages (by opening them one by one in a text editor). Now you need to change some common part consistently on all pages, let us say to update the copyright message. Enter the Perl.

First, we need to write Perl code to find the string we want to replace (© 2003) and substitute it with the updated version (© 2003-2004):

s/© 2003/© 2003-2004/

That was trivial, but we already know that Perl is great for working with text. Today's lesson is that Perl also works well as a command-line tool, which means your work is already done. Everything else that needs to happen (opening every HTML file in the directory, reading it line by line, applying the above substitution, and writing the modified file back to disk) can be handled with command-line switches:

perl -pe 's/© 2003/© 2003-2004/' -i *.html

So what do these options do?

-e
This switch is the heart of all Perl one-liners and lets you specify the program to be run (the one-line substitution snippet) on the command line (rather than reading it from a source file).
-p
This tells Perl to loop over all the specified input files, executing the program for every line (the line is placed in the special variable $_) and printing the (possibly changed) line afterwards. An alternative is -n, which does the same but omits the printing.
-i
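This specifies in-place editing, so that what we print is written back to the original file (instead of going to stdout). This rather dangerous switch can optionally make a backup first: give it an extension, as in -i.bak, and each original file is kept under that name (for example, index.html is saved as index.html.bak).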
*.html
On a Unix shell, this pattern is expanded to the matching file names before Perl even starts. Any files you name after all other options will be read and processed one by one.

See also the Perl manual (the perlrun page) for all the other interesting command-line options.